1 Preface

2 Internet Infrastructure

In this chapter I will give a short overview of today's internet infrastructure and the processes involved in establishing a connection between two hosts over the internet. Important points I will mention:

- backbones, Internet Service Providers, end-user connections
- establishing a connection: TCP/IP, DNS, HTTP
- static versus dynamic content
- internet bandwidth today
- identifying the bottlenecks

3 Caching techniques

In this chapter I will discuss different caching techniques used for static and dynamic content. For static content I will concentrate on proxy caching and on a special case of proxy caching: content distribution networks. Dynamic content is harder to cache than static content, because the cache must always check whether its copy is outdated before serving it. Dynamic content caching can be implemented at the stage of content generation (e.g. database caching, caching on the application side), at the stage of content delivery or even on the client side (e.g. browser caching).

3.1 Proxy Servers

Proxy servers act as intermediaries between users who request content and servers that serve it. A user does not send a request directly to the server that serves the content but to the proxy server, which fetches the requested content and delivers it to the user. In most cases proxy servers cache the content they fetch. When a user requests content that the proxy server already holds in its cache, the proxy server delivers it directly from the cache. This technique can save bandwidth and reduce response times. Proxy servers can go further and prefetch popular content, so that the response time is also reduced for the first user who requests it.

In this chapter I will describe how a proxy server (e.g. Squid) works and how prefetching can be realised, and I will present an approach to caching dynamic content.

3.2 Content Delivery Networks (CDNs)

A content delivery network is a network of servers that cache mainly static content. The servers are geographically distributed, and the nearest or least loaded server delivers the content to the user. Because no single server has to handle all requests and the answering server is usually located near the user, response times are kept low and bandwidth is saved. CDNs can be compared with distributor/retailer warehouses [HoKiSm04], except that they store and distribute content instead of physical goods.

In this chapter I will describe the different architectural approaches to content delivery networks (overlay approach versus network approach) and some optimization models for CDNs.

4 Peer-to-Peer networks (P2P)

Peer-to-Peer networks changed the storage mode of the internet from a "content located in the center" mode to a "content located at the edge" mode [Shir02]. In almost all P2P network models (the Napster model is an exception) the network does not depend on centrally located servers, which could be single points of failure or potential bottlenecks, but on its peers. This leads to high scalability and high redundancy. In this chapter I will discuss different architectural approaches to P2P networks.

4.1 Definition

In this section I will give a definition of P2P networks, as they are used not only to share files but also to share computing capacity (e.g. ...).

4.2 P2P network architecture

In this section I will describe the most common P2P network architectures. I will outline the main differences and show the improvements in newer architectures.
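The difference between a network that relies on a centrally located server and one that relies only on its peers can be illustrated with a small sketch. It is not taken from any of the cited works: the class `CentralIndex`, the function `flood_search`, the peer names and the example topology are all hypothetical, and the flooding search is a simplified hop-limited broadcast of the kind used in Gnutella-style networks.

```python
from collections import deque


class CentralIndex:
    """Centralised lookup: one server maps file names to the peers holding them."""

    def __init__(self):
        self._index = {}

    def register(self, peer, files):
        for name in files:
            self._index.setdefault(name, set()).add(peer)

    def lookup(self, name):
        # A single request to the index answers the query.
        return sorted(self._index.get(name, set()))


def flood_search(neighbours, files, start, name, ttl):
    """Decentralised lookup: forward the query to all neighbours for at most ttl hops.

    Returns the peers found to hold the file and the number of query
    messages that were sent (assumption: one message per forwarded query).
    """
    seen = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        peer, hops_left = queue.popleft()
        if name in files.get(peer, ()):
            hits.append(peer)
        if hops_left == 0:
            continue
        for nxt in neighbours.get(peer, ()):
            messages += 1  # the query is sent even if nxt has already seen it
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops_left - 1))
    return hits, messages


# Hypothetical five-peer overlay; peers "b" and "e" hold the file.
neighbours = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
files = {"b": {"song.mp3"}, "e": {"song.mp3"}}

index = CentralIndex()
for peer, held in files.items():
    index.register(peer, held)

central_hits = index.lookup("song.mp3")
near_hits, near_messages = flood_search(neighbours, files, "a", "song.mp3", ttl=2)
far_hits, far_messages = flood_search(neighbours, files, "a", "song.mp3", ttl=3)
```

In this toy topology the central index finds both holders with one lookup, whereas the flooding search needs a larger hop limit, and more messages, to reach the more distant holder. The central index, on the other hand, is a single point of failure.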

4.2.1 The Napster model

The Napster model is one of the first P2P network models. It needs centrally located infrastructure to organize its nodes and is therefore not as reliable as other P2P network architectures that do not need any central server.

4.2.2 The Gnutella model

The Gnutella model does not need any central infrastructure; it is based entirely on its nodes. Its biggest shortcoming is its poor search performance.

4.2.3 HAN (hierarchical network architecture) model

The HAN introduces super nodes into the Gnutella model to achieve significant search performance gains.

4.2.4 The BitTorrent model

4.3 P2P network traffic analysis

This section will contain an analysis of traffic patterns on P2P networks.

4.4 P2P networks to handle flash crowds

In this section I would like to introduce a model according to [RuSaSt04].

5 Other techniques to improve internet performance

In this chapter I want to give a short overview of server-side techniques to improve WWW performance.

Load balancers use different policies to distribute user requests across several back-end servers. If possible I would like to describe the load balancer of the LPIS (Lehrveranstaltungs- und Prüfungsinformationssystem, the course and examination information system of the Wirtschaftsuniversität Wien).

Compression can also lead to significant performance improvements. Most of the pictures used on web sites already use a compressed format such as JPEG, PNG or GIF, but most of the plain text information travels to the user uncompressed. Software such as the Apache module mod_gzip compresses the output of the web server before it is delivered to the user. This technique could significantly reduce the traffic caused by web pages.
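The effect of output compression can be sketched with Python's standard `gzip` module. This is only an illustration of the idea, not the way mod_gzip is implemented: the function `compress_body`, its `min_size` threshold and the sample page are assumptions made for the example, and the parsing of the `Accept-Encoding` header is simplified (quality values such as `q=0` are ignored).

```python
import gzip


def compress_body(body: bytes, accept_encoding: str, min_size: int = 256):
    """Return (payload, headers) for a response body.

    The body is gzip-compressed only if the client lists "gzip" in its
    Accept-Encoding header, the body is at least min_size bytes long,
    and compression actually makes it smaller.
    """
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in offered and len(body) >= min_size:
        payload = gzip.compress(body)
        if len(payload) < len(body):
            headers = {
                "Content-Encoding": "gzip",
                "Content-Length": str(len(payload)),
                "Vary": "Accept-Encoding",
            }
            return payload, headers
    # Fall back to the uncompressed body.
    return body, {"Content-Length": str(len(body))}


# Hypothetical, highly repetitive HTML page used as sample input.
page = b"<tr><td>course</td><td>exam</td></tr>\n" * 200

compressed, compressed_headers = compress_body(page, "gzip, deflate")
plain, plain_headers = compress_body(page, "identity")

saving = 1 - len(compressed) / len(page)
print(f"{len(page)} bytes -> {len(compressed)} bytes ({saving:.0%} saved)")
```

Markup and other plain text usually contain many repeated strings, which is why such content tends to compress well, while already compressed image formats gain little.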
