
© 2003 James Hall

This technical report is based on a dissertation submitted April 2003 by the author for the degree of Doctor of Philosophy to the University of Cambridge, King's College. Some figures in this document are best viewed in colour. If you received a black-and-white copy, please consult the online version if necessary.

Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet:

Series editor: Markus Kuhn

ISSN

Abstract

Passive network monitoring offers the possibility of gathering a wealth of data about the traffic traversing the network and the communicating processes generating that traffic. Significant advantages include the non-intrusive nature of data capture and the range and diversity of the traffic and driving applications which may be observed. Conversely, there are also associated practical difficulties which have restricted the usefulness of the technique: increasing network bandwidths can challenge the capacity of monitors to keep pace with passing traffic without data loss, and the bulk of data recorded may become unmanageable.

Much research based upon passive monitoring has in consequence been limited to that using a sub-set of the data potentially available, typically TCP/IP packet headers gathered using Tcpdump or similar monitoring tools. The bulk of data collected is thereby minimised and, with the possible exception of packet filtering, the monitor's processing power is available for the task of collection and storage. As the data available for analysis is drawn from only a small section of the network protocol stack, detailed study is largely confined to the associated functionality and dynamics in isolation from activity at other levels. Such lack of context severely restricts examination of the interaction between protocols, which may in turn lead to inaccurate or erroneous conclusions.

The work described in this dissertation attempts to address some of these limitations. A new passive monitoring architecture, Nprobe, is presented; it is based upon off-the-shelf components and, by using clusters of probes, is scalable to keep pace with current high bandwidth networks without data loss. Monitored packets are fully captured, but are subject to the minimum processing in real time needed to identify and associate data of interest across the target set of protocols. Only this data is extracted and stored. The data reduction ratio thus achieved allows examination of a wider range of encapsulated protocols without straining the probe's storage capacity.

Full analysis of the data harvested from the network is performed off-line. The activity of interest within each protocol is examined and is integrated across the range of protocols, allowing their interaction to be studied. The activity at higher levels informs study of the lower levels, and that at lower levels infers detail of the higher. A technique for dynamically modelling TCP connections is presented which, by using data from both the transport and higher levels of the protocol stack, differentiates between the effects of network and end-process activity.

The balance of the dissertation presents a study of Web traffic using Nprobe. Data collected from the IP, TCP, HTTP and HTML levels of the stack is integrated to identify the patterns of network activity involved in downloading whole Web pages: by using the links contained in HTML documents observed by the monitor, together with data extracted from the HTML headers of downloaded contained objects, the set of TCP connections used, and the way in which browsers use them, are studied as a whole. An analysis of the degree and distribution of delay is presented and contributes to the understanding of performance as perceived by the user. The effects of packet loss on whole page download times are examined, particularly those losses occurring early in the lifetime of connections before reliable estimations of round trip times are established. The implications of such early packet losses for page downloads using persistent connections are also examined by simulations using the detailed data available.


Acknowledgements

I must firstly acknowledge the very great contribution made by Micromuse Ltd., without whose financial support for three years it would not have been possible for me to undertake the research leading to this dissertation, and to the Computer Laboratory whose further support allowed me to complete it.

Considerable thanks are due to my supervisor, Ian Leslie, for procuring my funding, for his advice and encouragement, his reading and re-reading of the many following chapters, his suggestions for improvement and, perhaps above all, his gift for identifying central issues on the occasions when I have become bogged down in the minutiae. Thanks also to Ian Pratt, who is responsible for many of the concepts underlying the Nprobe monitor, and who has been an invaluable source of day-to-day and practical advice and guidance.

Although the work described here has been largely an individual undertaking, it could not have proceeded satisfactorily without the background of expertise, discussion, imaginative ideas and support provided by my past and present colleagues in the Systems Research Group. Special mention must be made of Derek McAuley who initially employed me, of Simon Cosby, who encouraged me to make the transition from research assistant to PhD student, and of James Bulpin for his ever-willing help in the area of system administration: to them, to Steve Hand, Tim Harris, Keir Fraser, Jon Crowcroft, Dickon Reed, Andrew Moore, and to all the others, I am greatly indebted.

Many others in the wider community of the Computer Laboratory and University Computing Service have also made a valuable contribution: to Margaret Levitt for her encouragement over the years, to Chris Cheney and Phil Cross for their assistance in providing monitoring facilities, to Martyn Johnson and Piete Brookes for accommodating unusual demands upon the Laboratory's systems, and to many others, I offer my thanks.

Thank you to those who have rendered assistance in proof-reading the final draft of this dissertation: to Anna Hamilton, and to some of those mentioned above (you know who you are).

Very special love and thanks go to my wife, Jenny, whose continual encouragement and support have underpinned all of my efforts. She has taken on a vastly disproportionate share of our domestic and child-care tasks, despite her own work, and has endured much undeserved neglect to allow me to concentrate on my work. My love and a big thank-you also go to Charlie and Sam, who have seen very much less of their daddy than they have deserved, and to Jo and Luke, my boys, with whom I have recently sunk fewer pints than I should.

Finally, the rôle of an unlikely player deserves note. In the early 1980s I was unwillingly conscripted into the ranks of Maggie's army: the hundreds of thousands made unemployed in one of the cruelest and most regressive pieces of social engineering ever to be inflicted upon this country. For many it spelled unemployment for the remainder of their working lives. With time on my hands I embarked upon an Open University degree, the first step on a path which led to the Computer Laboratory, the Diploma in Computer Science, and eventually to the writing of this dissertation. I was one of the fortunate ones.

Contents

An Assessment of Python as the Analysis Coding Language Python and Object Orientation Difficulties Associated with the Use of Python Summary Modelling and Simulating TCP Connections Analytical Models The Limitations of Steady State Modelling Characterisation of TCP Implementations and Behaviour Analysis of Individual TCP Connections Modelling the Activity of Individual TCP Connections The Model as an Event Driven Progress Packet Transmission and Causality Monitor Position and Interpretation of Packet Timings Unidirectional and Bidirectional Data Flows Identifying and Incorporating Packet Loss Packet Re-Ordering and Duplication Slow Start and Congestion Avoidance Construction of the Activity Model Enhancement of the Basic Data-Liberation Model The Model Output Simulation of Individual TCP Connections Assessment of the Impact of Packet Loss Construction of the Simulation Simulation Output Visualisation Validation of Activity Models and Simulations Summary and Discussion Non-Intrusive Estimation of Web Server Delays Using Nprobe Motivation and Experimental Method Use of the Activity Model Assessment Using an Artificially Loaded Server Client and Local Server with Small Round Trip Times Client and Remote Server with Larger Round Trip Times HTTP/1.1 Persistent Connections Discussion of the Experimental Results Observation of Real Web Traffic to a Single Site Summary Observation and Reconstruction of World Wide Web Page Downloads The Desirability of Reconstructing Page Downloads The Anatomy of a Page Download Reconstruction of Page Download Activity Limitations of Packet-Header Traces Reconstruction Using Data from Multiple Protocol Levels Constructing the Reference Tree

Referrers, Links and Redirection Multiple Links, Repeated Downloads and Browser Caching Aborted Downloads Relative References, Reference Scopes and Name Caching Self-Refreshing Objects Unresolved Connections and Transactions Multiple Browser Instances and Web Cache Requests Timing Data Integration with Name Service Requests Visualisation Web Traffic Characterisation Summary Page Download Times and Delay Factors Contributing to Overall Download Times The Contribution of Object Delay to Whole Page Download Times Differentiating Single Object and Page Delays Assessing the Degree of Delay The Distribution of Object Delay within a Page Download The Effect of Early Packet Loss on Page Download Time Delay in Downloading from a Site with Massive Early Packet Loss Comparison with a General Traffic Sample Implications of Early Packet Loss for Persistent Connections Summary Conclusions and Scope for Future Work Summary Assessment Conclusion The Original Contribution of this Dissertation Scope for Future Work Further Analysis of Web Activity Nprobe Development

A Class Generation with SWIG
A.1 Interface Definition
A.2 Tailoring Data Representation in the Interface
B An Example of Trace File Data Retrieval
B.1 A Typical Data-Reading Function
B.2 Use of Retrieval Interface Classes
B.3 Using the FileRec Class to Minimize Memory Use
C Inter-Arrival Times Observed During Web Server Latency Tests
C.1 Interpretation of Request and Response Inter-Arrival Times
C.2 Inter-Arrival Times for Client and Local Server with Small Round Trip Times
C.3 Inter-Arrival Times for Client and Distant Server with Larger Round Trip Times

List of Figures

The contribution of single object delays to overall page download time
News server: whole page download times
News server: server latency and prtts
News server: components of early-loss delay
The contribution of early delay to page downloads using persistent connections
Page download times for the pages downloaded from the news site using simulated persistent connections
A.1 SWIG Trace file interface generation
C.1 Local server: inter-arrival times
C.2 Distant server: inter-arrival times

Chapter 1

Introduction

Computer networks in general, whether serving the local, medium or wide area, and in particular, the globally spanning aggregation known as the Internet (henceforth referred to simply as the network) remain in a period of unprecedented and exponential growth. This growth is not only in terms of the number of communicating hosts, links and switching or routing nodes, but also encompasses new technologies and modes of use, new demands and expectations by users, and new protocols to support and integrate the functioning of the whole.

Heavy demands are made by new overlaying technologies (e.g. streamed audio and video) which carry with them additional requirements for timely delivery and guaranteed loss rates, and the concept of Quality of Service (QOS) becomes significant; the growth of distributed computing makes new demands in terms of reliability. Sheer growth in the size of the network and volume of traffic carried places strain on the existing infrastructure and drives new physical technologies, routing paradigms and management mechanisms.

Within this context the ability to observe the dynamic functioning of the network becomes critical. The complexity of interlocking components at both physical and abstract levels has outstripped our capacity to properly understand how they inter-operate, or to exactly predict the outcome of changes to existing, or the introduction of new, components.

1.1 Introduction

The thesis of this dissertation may be summarised thus: observation of the network and the study of its functioning have relied upon tools which, largely for practical reasons, are of limited capability. Research relying upon these tools may in consequence be restricted in its scope or accuracy, or even determined, by the bounded set of data which they can make available. New tools and techniques are needed which, by providing a richer set of data, will contribute to enhanced understanding of application performance and the system as a whole, its constituent components, and in particular the interaction of the sub-systems represented by the network protocol stack.

The hypothesis follows that such improved tools are feasible, and is tested by the design and implementation of a new monitoring architecture, Nprobe, which is then used in two case studies which would not have been possible without such a tool.

1.2 Motivation

This section outlines the motivation underlying passive network monitoring and the development of more capable tools by which it can be carried out. The following sections (1.3 and 1.4) introduce the desirability of monitoring a wider range of protocols and suggest the attributes desirable in a monitor which would have this capability.

The Value of Network Monitoring

The term network monitoring describes a range of techniques by which it is sought to observe and quantify exactly what is happening in the network, both on the microcosmic and macrocosmic time scales. Data gathered using these techniques provides an essential input towards:

- Performance tuning: identifying and reducing bottlenecks, balancing resource use, improving QOS and optimising global performance.
- Troubleshooting: identifying, diagnosing and rectifying faults.
- Planning: predicting the scale and nature of necessary additional resources.
- Development and design of new technologies: understanding of current operations and trends motivates and directs the development of new technologies.
- Characterisation of activity to provide data for modelling and simulation in design and research.
- Understanding and controlling complexity: to understand the interaction between components of the network and to confirm that functioning, innovation, and new technologies perform as predicted and required. The introduction of persistent HTTP connections, for instance, was found in some cases to reduce overall performance [Heidemann97a].
- Identification and correction of pathological behaviour.

Passive and Active Monitoring

The mechanisms employed to gather data from the network are classified as passive or active, although both may be used in conjunction.

Passive Monitoring

Passively monitored data is collected from some element of a network in a non-intrusive manner. Data may be directly gathered from links (on-line monitoring) using probes¹ (e.g. tcpdump) to observe passing traffic, or from attached hosts (usually routers/switches or servers, e.g. netflow [Cisco-Netflow], Hypertext Transfer Protocol (HTTP) server logs). In the first case the data may be collected in raw form (i.e. the unmodified total or part content of passing packets) or may be summarised (e.g. as protocol packet headers). All or a statistical sample of the traffic may be monitored. In the second case the data is most commonly a summary of some aspect of the network traffic seen, or server activity (e.g. traffic flows [netflow], Web objects served, or proportion of cache hits).

Data thus collected is typically stored for later analysis, but may be the subject of real-time analysis, or forwarded to collection/analysis servers for further processing.

The strength of passive monitoring is that no intrusion is made into the traffic/events being monitored and, further, in the case of on-line monitoring by probes attached directly to network links, that the entire set of data concerning the network's traffic and functioning is potentially available. The weakness, particularly as network bandwidths and the volume of traffic carried increase, is that it becomes difficult to keep up with the passing traffic (in the processing power required both to collect the data and to carry out any contemporaneous processing) and that the volume of data collected becomes unmanageable.

Active Monitoring

Active network monitoring, on the other hand, is usually concerned with investigating some aspect of the network's performance or functioning by means of observing the effects of injecting traffic into the network. Injected traffic takes the form appropriate to the subject of investigation (e.g. Internet Control Message Protocol (ICMP) ping packets to establish reachability, HTTP requests to monitor server response times). The data gathered is largely pertinent only to the subject of investigation, and may be discarded at the time of collection, or may be stored for global analysis. Specificity of injected traffic and results may limit the further usefulness of the data gathered. There is always the risk that the injected traffic, being obtrusive, may in itself colour results or that incorrect or inappropriate framing may produce misleading data.

Challenges in Passive Monitoring

When considering passive monitoring, issues of maximising potential information yield and minimising data loss arise.

¹ Here and elsewhere in this dissertation the term probe is used to describe a data capture software system, and the computer on which it runs, connected to a network for monitoring purposes.
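As a rough illustration of the header-only summarisation mentioned under Passive Monitoring, the following Python sketch reduces a raw IPv4/TCP packet to a handful of header fields and discards the payload. It is not Nprobe's method, nor a description of how tcpdump works internally; the function names summarise_packet and build_example_packet, and the synthetic packet itself, are assumptions introduced purely for this example. IP options are skipped and TCP options are ignored.

```python
import struct


def summarise_packet(packet: bytes) -> dict:
    """Reduce a raw IPv4/TCP packet to a small header summary (hypothetical helper)."""
    # Fixed 20-byte IPv4 header: version/IHL, TOS, total length, id,
    # flags/fragment offset, TTL, protocol, checksum, source, destination.
    (ver_ihl, _tos, total_len, _ident, _frag, _ttl,
     proto, _csum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    if ver_ihl >> 4 != 4 or proto != 6:
        raise ValueError("not an IPv4/TCP packet")
    ip_hlen = (ver_ihl & 0x0F) * 4  # IHL is counted in 32-bit words

    # First 14 bytes of the TCP header: ports, sequence and acknowledgement
    # numbers, then data offset and flags packed into one 16-bit field.
    sport, dport, seq, ack, off_flags = struct.unpack(
        "!HHIIH", packet[ip_hlen:ip_hlen + 14])
    tcp_hlen = (off_flags >> 12) * 4

    return {
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
        "sport": sport,
        "dport": dport,
        "seq": seq,
        "ack": ack,
        "flags": off_flags & 0x01FF,
        # Only the payload length is kept; the payload bytes are dropped.
        "payload_len": total_len - ip_hlen - tcp_hlen,
    }


def build_example_packet(payload: bytes) -> bytes:
    """Build a synthetic IPv4/TCP packet for the example (checksums left as zero)."""
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload), 1, 0, 64, 6, 0,
                     bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    # Data offset 5 (20-byte header), flags PSH+ACK (0x18).
    tcp = struct.pack("!HHIIHHHH", 1234, 80, 1000, 0, (5 << 12) | 0x18, 65535, 0, 0)
    return ip + tcp + payload


if __name__ == "__main__":
    pkt = build_example_packet(b"GET / HTTP/1.1\r\n\r\n")
    print(len(pkt), "bytes captured ->", summarise_packet(pkt))
```

The summary keeps enough to study transport-level dynamics, but everything carried above TCP (here, the HTTP request line) is lost, which is the lack of context that the text attributes to header-only traces.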
