We investigate the correlation between network response times and Web
retrieval response times.
The network response times to Web servers are estimated using the response
times reported by the well-known ping utility.
The Web retrievals use the standard
Hypertext Transfer Protocol (HTTP [HTTP]) GET request
(henceforth referred to as a GET).
Such information is needed to identify how Web responsiveness may be affected
by the network.

Methodology

Selection of URLs to GET

We obtained a list of Web Uniform Resource Locators
(URLs) from the National Laboratory for Applied Network
Research (NLANR) Web Cache [KC] database. To decrease the chance that the
URLs from a particular cache might not be representative,
we used URLs selected from two of NLANR's caches,
henceforth referred to as the BO and IT caches.
The BO cache is located at NCAR in Boulder, Colorado,
and the IT cache at the Cornell Theory Center in Ithaca,
New York. The BO cache list was for December 10, 1996,
and the IT cache list was for December 13, 1996.

We used URLs which
resulted in GET response sizes between 10 bytes
(to allow space for the ping sequence and time information in
the ping payload) and 8000 bytes
(8000 is roughly the maximum "packetsize" supported by the
AIX operating system).
For the URLs we obtained from the selected cache lists,
roughly 70-75% of the GET responses contained less than
8000 bytes.

We used only one Web path from each server host name in the list
of URLs (multiple Web paths
may appear for a given server host name
in the caches). No constraint was placed on the number of
different hosts being sampled from a given network
domain. The URL path names were restricted to include only
characters from the set of alphanumerics plus period
(.), hyphen (-), underscore (_), tilde (~), slash (/),
backslash (\), colon (:) and percent (%). All other characters were
regarded as "invalid" for the current purpose.
This restriction reduced the risk of untoward Unix shell expansions causing
problems. Because it excludes the
characters ampersand (&), question mark (?), plus
sign (+) and equal sign (=), it also reduced the chance of the GET response
coming from a CGI [CGI] script+ invoked from a form.
In a further attempt to avoid CGI scripts, URLs whose path names
contained the text string "cgi", in any combination
of upper or lower case, were also ignored.
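The selection rules above can be sketched in code. This is a minimal illustration under stated assumptions, not the program used for the measurements: the function name select_urls and the example host names in the usage note are hypothetical, and a host is marked as used as soon as one of its URLs passes the filters, whereas the measurement program treated a host as a duplicate only once it had been successfully measured.

```python
import re

# Characters the text allows in a path name: alphanumerics plus . - _ ~ / \ : %
ALLOWED_PATH = re.compile(r"[A-Za-z0-9._~/\\:%-]*")


def select_urls(urls):
    """Return the first acceptable http URL for each host, in input order."""
    seen_hosts = set()
    selected = []
    for url in urls:
        scheme, sep, rest = url.partition("://")
        if not sep or scheme.lower() != "http":
            continue  # only the http scheme is measured
        hostport, _, path = rest.partition("/")
        host = hostport.split(":", 1)[0].lower()
        if not host or host in seen_hosts:
            continue  # one path per server host name
        if not ALLOWED_PATH.fullmatch(path):
            continue  # rejects &, ?, +, = and other "invalid" characters
        if "cgi" in path.lower():
            continue  # path probably refers to a CGI script
        seen_hosts.add(host)
        selected.append(url)
    return selected
```

For instance, given http://a.example/index.html, http://a.example/other.html and http://b.example/cgi-bin/x, only the first URL would be kept: the second repeats a host and the third contains "cgi".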

Measurement

The hosts which ran the measurement program (the monitoring hosts) to
sample the Web server hosts (the sampled hosts) and gather the data were
lightly loaded IBM RS/6000 320Hs and an IBM RS/6000 250, all
running AIX 3.2.5, and a Sun 4/50 running SunOS version 4.1. All
the monitoring hosts were
located at the Stanford Linear Accelerator Center (SLAC).
Measurement runs were made both over
weekends and holidays when we might expect the
Internet and other components involved in the response
times to be lightly loaded and midweek
during the daytime (at SLAC) when higher loading
would be expected. For each run, only one
pass through the list of URLs was made, the run being
terminated when a few thousand successful
measurements (samples) were completed (the lists of URLs from the
caches contained hundreds of thousands of URLs and so
were not exhausted).

For each of the first n URLs in the caches
satisfying the above selection criteria, we first used a function
called xchkaccess
with a timeout of 20 seconds to do a GET for the
URL. Special attention was paid in xchkaccess to randomizing
the time between measurements in order to avoid synchronizing with
the TCP delayed ACK timer [ST] when making repeated measurements.
Xchkaccess reported the success or failure of the GET (possible failure
codes include TCP connection rejected, timeout, host name invalid or
unknown) plus a timestamp, the size of the GET response
transferred, the
response time (excluding the domain name service lookup time)
and the number of packets. If the GET succeeded
and the GET response size was acceptable (see above), then a further
9 GETs were done for the same URL for a total of 10 GETs/URL.
The host was then pinged
(using the standard AIX ping utility with a timeout
of 20 seconds and a minimum
of 1 second between pings) 10 times with a payload of the
same size as the previous GET response. Pathological ping responses
(e.g. duplicate ping responses received) and pings with 100%
packet loss
were rejected and were not included in the successful samples recorded
for further analysis.

The measurement program applied successive
filters to restrict the URLs
used, the measurements made and the samples recorded
for further analysis. An example of how the filters progressively
restricted the input passed to the next filter is seen in
Table 1 below. The information is provided here to help
clarify the measurement process, and to give an idea of the
frequency of some of the failures seen on the Internet.

Table 1: Effect of filters in the
measurement program. The input
URLs were selected from the first 131069 of the 214591 URLs in
the IT cache.
The program ran between January 26 and February 1, 1997.
Each filter receives its input from the previous filter and passes its
filtered output to the next filter. Other filters in the
program (e.g. path name contains "htbin")
are not reported in this table since they resulted in no rejections
for this run of the program.

Stage                          Filter                                                   Inputs to Filter  Rejected by Filter  % Rejected
Check Cache URLs               URL contains "invalid characters"                        131069            2772                2.11%
Check Cache URLs               URL scheme is not "http" protocol                        128297            1505                1.17%
Check Cache URLs               Path name contains "cgi"                                 126792            691                 0.54%
Check Cache URLs               Duplicate host, i.e. host already successfully measured  126101            110927              87.97%
Xchkaccess (GET)               Host Name Invalid or Unknown                             15174             630                 4.15%
Xchkaccess (GET)               TCP Connection Rejected                                  14544             207                 1.42%
Xchkaccess (GET)               Time Out (20 seconds)                                    14337             1447                10.09%
Xchkaccess (GET)               Error in read response from server                       12890             2                   0.02%
GET Response Size              Response has 0 bytes                                     12888             101                 0.78%
GET Response Size              Response size outside threshold (> 8KB)                  12787             3618                28.29%
Ping                           100% packet loss                                         9169              393                 4.29%
Ping                           Pathological Pings (e.g. duplicate responses)            8776              17                  0.19%
Httpq                          httpq fails                                              8759              16                  0.18%
Successfully Measured Samples  Analysis Program                                         8744
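The "% Rejected" column of Table 1 can be recomputed from the input and rejection counts, and the chaining of the filters can be checked the same way. The sketch below transcribes the thirteen filter rows of Table 1; the names TABLE_1, rejection_rates and chain_is_consistent are introduced here for illustration only.

```python
# (stage, filter, inputs, rejected) rows transcribed from Table 1.
TABLE_1 = [
    ("Check Cache URLs", "invalid characters", 131069, 2772),
    ("Check Cache URLs", "scheme is not http", 128297, 1505),
    ("Check Cache URLs", "path contains cgi", 126792, 691),
    ("Check Cache URLs", "duplicate host", 126101, 110927),
    ("Xchkaccess (GET)", "host name invalid or unknown", 15174, 630),
    ("Xchkaccess (GET)", "TCP connection rejected", 14544, 207),
    ("Xchkaccess (GET)", "time out (20 seconds)", 14337, 1447),
    ("Xchkaccess (GET)", "error in read response", 12890, 2),
    ("GET Response Size", "response has 0 bytes", 12888, 101),
    ("GET Response Size", "response size > 8KB", 12787, 3618),
    ("Ping", "100% packet loss", 9169, 393),
    ("Ping", "pathological pings", 8776, 17),
    ("Httpq", "httpq fails", 8759, 16),
]


def rejection_rates(rows):
    """Return (filter, percent rejected) pairs, rounded to two decimals."""
    return [(name, round(100.0 * rejected / inputs, 2))
            for _stage, name, inputs, rejected in rows]


def chain_is_consistent(rows):
    """True if every filter's input is the previous input minus its rejections."""
    return all(rows[i + 1][2] == rows[i][2] - rows[i][3]
               for i in range(len(rows) - 1))
```

Each percentage is relative to the input of that particular filter, not to the original 131069 URLs, which is why the duplicate-host filter dominates the table.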

A host was removed from further consideration if a successful
measurement was obtained, or the host failed an
xchkaccess or ping filter.

For each successful
measurement, we recorded one record containing the
timestamp, host name, port, path
name, GET response size, server type and HTTP status code,
together with relevant information from the
cache list (e.g. the cache list's measure of the GET response size
and transfer time).
The record also contains: for each set of
10 GETs, the median, the
25th percentile, the 75th percentile, the
inter-quartile range and the first GET response; and for each set of
10 pings,
the average, minimum and maximum*
responses, plus
the percentage of ping packets
lost. The granularity of the clock as reported was
1 millisecond for both ping and xchkaccess.
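The per-host summary statistics described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the percentile convention (Python's "inclusive" method) is an assumption, since the text does not say how the quartiles were interpolated.

```python
import statistics


def summarize_gets(times_ms):
    """Summarize one host's GET response times (msec), in measurement order."""
    q1, median, q3 = statistics.quantiles(times_ms, n=4, method="inclusive")
    return {
        "first": times_ms[0],   # the first GET of the set
        "median": median,
        "p25": q1,
        "p75": q3,
        "iqr": q3 - q1,         # inter-quartile range
    }


def summarize_pings(times_ms, sent):
    """Summarize the ping replies received (msec) out of `sent` pings."""
    return {
        "min": min(times_ms),
        "avg": statistics.fmean(times_ms),
        "max": max(times_ms),
        "loss_pct": 100.0 * (sent - len(times_ms)) / sent,
    }
```

A host with 100% packet loss has no replies to summarize; in the measurements such hosts were rejected before this step, so the sketch does not handle an empty reply list.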

At a later time (~24 hours) we ran traceroute
on the same monitoring hosts to measure the number of
hops to the sampled hosts. This data was recorded in files together
with the GET and ping data for further
analysis. The median hop count was 12 and the average was 13.3.
About 10% of the traceroute measurements
returned hop counts of 30, the default maximum hop count.

Analysis

The data files were read into Microsoft Excel [MS] and scatter
plotted to try to reveal correlations between the metrics
- various response times measures, GET response size,
packet loss, hops, period of measurement. We also
used Excel to provide linear regression fits to the data,
Correlation Coefficient R
[MS], and other statistical information.

Results

Characteristics of web objects, servers etc.

Most (~96%) of the hosts sampled using the IT
cache list were in the .com (~60%), .edu (~13%), .org
(~5%) and .net (~4%) domains. About 1.6% of the hosts had
IP addresses only, and there were hosts (~15%) in about
25 country domains (excluding .us). Less than 1% of the
GETs were to other than the default Web server port 80.
For the BO cache list, about 96% of
the hosts sampled were in the .edu (~43%), .net (~20%),
.org (~14%), .com (~13%), .gov (~1.5%) and .mil (~1%)
domains. About 0.7% of the hosts had IP addresses only.
There were hosts in about 13 country domains (excluding
.us).

For the URLs successfully retrieved using the IT cache list,
about 45% had the suffix .gif,
~35% had .htm, .html or .shtml, ~7% had .jpg, ~10% had no suffix, and
the other main suffixes observed were .asp, .class, .js, .exe, .txt
and .xbm. About 70% of the GET responses had HTTP status codes of 200 (OK,
the request was fulfilled), about 18% had code 404 (server could not
find given resource), about
10% had code 302 (suggestion for the client to try another location), and
about 1% had code 401 (client is not authorized to access data). The
remaining 1% was mainly composed of codes 301, 403, and 400. The top 5
identified WWW servers were from Apache (41%), Netscape (18%), NCSA (15%),
WebSite (4%) and CERN (3%). For a survey on Web server software usage see
The Netcraft Web Server Survey.

A typical GET response size distribution is
seen in Figure 1. The sharp peaks in the GET response size
distribution are associated with specific responses, such as the
server reporting that it is unable to find a requested object.
For example the peak at 207 bytes is largely composed of status
code 404 responses from a particular brand of server.

Figure 1 shows the frequency histogram of the sizes of web objects
in the IT cache list.
Figure 2 shows a typical hop count distribution. About 10%
of the traceroute hop measurements returned hop counts of 30, the
default maximum hop count.

The distributions of the minimum (of 10) ping responses for
different bin sizes (10 msec and about 120 msec) are seen in Figures 3 and 4.
The ping response is
plotted on a logarithmic scale to show
more clearly the distribution for the short responses. A definite
bimodal behavior is seen in the first plot (narrower bins),
with peaks at roughly 50 msec and 100 msec.
Note that the ping payload sizes for the distributions in Figures 3 and 4
follow the distribution shown in Figure 1.
A similar distribution is obtained
for pings with a fixed payload of 1000 bytes, so the bimodality
is not believed to be due to the peakiness of the
Web object size distribution shown in Figure 1.
For the wider
bins, the distribution roughly follows a power law
(R^2 = 0.97)
whose equation is shown in the figure.
A large difference
between the hop and ping distributions can be noticed.

Figure 3 shows a frequency histogram of the minimum ping
response time of a sample of 10076 web servers. The bin width
is 10 msec.

Figure 4 shows the frequency histogram of the minimum ping
response time of the same sample of 10076 hosts seen in Figure 3,
but with a bin width of 120 msec. The curve is a power law fit to the
data with the parameters shown. The R^2 [AF] of the fit is
also shown.
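A power law fit of the kind shown in the figures can be obtained by fitting a straight line to the logarithms of the bin centers and bin counts. The sketch below is one common way of doing this and is offered as an assumption; the text does not state how the fits in the figures were computed, and the function name fit_power_law is hypothetical. The R^2 it returns is measured in log-log space.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b on log-log axes.

    Returns (a, b, r2). All x and y values must be positive, so empty
    histogram bins have to be dropped before calling this.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    syy = sum((v - my) ** 2 for v in ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                   # exponent of the power law
    a = math.exp(my - b * mx)       # multiplicative constant
    r2 = (sxy * sxy) / (sxx * syy)  # squared correlation in log-log space
    return a, b, r2
```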

The distribution of the median (of 10) GET responses, for 6578
Web servers selected from the IT cache list, for
2 different bin sizes is seen in Figures 5 and 6. There is some evidence
of bimodality, and again the distribution to the right of the peaks
roughly follows a power law (R^2 = 0.95).

Figures 5 and 6 show the frequency histograms of the median
HTTP GET response for objects from a sample of 6578 web servers.
Figure 5 is for 10 msec bins and Figure 6 is for 100 msec bins.
The line in Figure 6 is a power law fit with the parameters
and R^2 shown.

Figure 7 shows the frequency histogram of the ping losses observed
to 10076 web servers in the IT cache list. A power law fit
is also shown together with its parameters and R^2.

The statistics of the GET response
sizes and the response times
are summarized in Table 2.
The "Min.", "Avg." and "Max." refer
to the minimum, average and maximum of the 10 pings or 10
GETs done for each host.

Table 2: Statistical summary of two sets of
6000 Samples (the IT sample set is a subset of the first 6000 samples
reported in Table 1). The first statistical measure in each cell in
Table 2 is for the
IT cache, the second for the BO cache.

GET response sizes are in bytes; all response times are in msec.

Statistic           Size (bytes)  Min. GET      Min. Ping     Avg. GET      Avg. Ping     Max. GET      Max. Ping     Median GET
25 Percentile       331, 331      216, 217      87, 87        288, 297      96, 94        440, 433      114, 106      254, 253
50 Percentile       1602, 1562    393, 376      132, 127      554, 538      151, 140      826, 786      193, 174      461, 458
75 Percentile       3534, 3537    657, 624      205, 177      1027, 936     237, 201      1897, 1716    318, 260      852, 745
Average             2230, 2246    568, 525      215, 178      884, 803      252, 207      1927, 1788    322, 262      733, 664
Standard Deviation  3106, 2132    710, 664      359, 288      1107, 978     456, 386      2733, 2509    585, 515      973, 880
Minimum             11, 11        11, 15        1, 2          13, 16        3, 2          16, 19        4, 2          12, 15
Maximum             7991, 7999    16454, 12218  12152, 12633  16582, 14118  12152, 11085  20043, 19966  15363, 12633  16565, 14587

Correlations between ping RTT and GET response times

A typical scatter plot of GET versus ping
response time for 4000 (the maximum plottable by Excel [MS])
successful samples is shown below in Figure 8.

Figure 8 shows a scatter plot of the median GET response for 10 GETs/host
versus the minimum ping for 10 pings/host for 4000 samples. The samples
are the first 4000 IT cache samples summarized in Table 2,
and the measurements were made from 24
through 26 December, 1996. See Figure 9 below for more details on the
scatter plot in the low response time ranges. Two linear regression
fits are shown in Figure 8. The dashed
line is constrained to go through the origin, the other is unconstrained.
The parameters of the fits are also shown together with the squares of
the Correlation Coefficients (R^2). The Correlation
Coefficient of the unconstrained fit is R = 0.64, which is
indicative of a "strong" positive correlation (see below).

The square of the Correlation Coefficient (R^2)
defines the fraction of the total variance of y that is accounted
for by its regression on x [CDM].
1 - R^2 represents
the proportion of the total variability of the y values that is
not accounted for by the variable x.
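For readers who want to reproduce these quantities outside a spreadsheet, the following sketch computes the Correlation Coefficient R, the unconstrained least-squares line, and the line constrained to pass through the origin. The function names are hypothetical and the formulas are the standard textbook ones; nothing here is taken from the analysis program itself.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient R between two equal-length samples."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Unconstrained least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((u - mx) ** 2 for u in x)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_through_origin(x, y):
    """Least-squares slope of a line forced through the origin."""
    return sum(u * v for u, v in zip(x, y)) / sum(u * u for u in x)
```

With x as the minimum ping times and y as the GET times, pearson_r(x, y) ** 2 is the fraction of the variance of y accounted for by the regression on x; for example R = 0.64 corresponds to R^2 of about 0.41.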

Table 3 shows the Correlation Coefficients for various combinations
of the minimum, average and maximum ping*
and the minimum,
average, maximum and median GET responses
for sets of 10 pings and 10 GETs for each host in a sample set of
the first 4031 samples taken from the IT cache sample sets of
Tables 1 and 2.

Table 3: Measured Correlation
Coefficients for various combinations of the minimum,
average, and maximum ping and minimum,
average, maximum and median GET responses of 10
probes (10 for GET followed by 10 for ping).

Correlation Coefficient R  Min. GET    Avg. GET    Max. GET    Median GET
Min. Ping                  0.609       0.579       0.36        0.61
Avg. Ping                  0.583       0.558       0.35        0.587
Max. Ping                  0.538       0.521       0.331       0.546

Table 3 shows that the correlation is
best if we use the minimum of the 10 ping
responses for each host. It might be expected that this would
give better estimates of the ping response since the minimum
ping response has a lower bound, whereas the maximum is
unbounded and so outliers may make the average a less reliable
estimator. Similar effects are seen for the
GET correlations.
The correlations of the minimum, average and median
GET response times versus the minimum ping response
times may be said to be between "moderate" and "strong"
[AF].

Further correlation improvements can be made if one
ignores outlying samples with large GET
response times. For example, for the set of 6000 IT cache samples
described in Table 2,
the Correlation Coefficients R
for the minimum and median GETs versus the minimum
pings increase by 16% (from 0.595 to 0.698 for the minimum GETs) and
8% (from 0.594 to 0.645 for the median GETs)
if one excludes
the less than 1% of the samples which have average GET response times of
6 seconds or more.
A rationale for removing these
samples is that they represent hosts where the GET response time
is dominated by effects other than the network, such as an
overloaded Web server, a slow host, or a URL that invokes a CGI script.

Table 4: Typical linear regression fit parameters (slope & intercept)
for minimum, average and median GET responses versus minimum and
average ping responses. The sample set
is the IT cache sample set of 6000 samples described in Table 2.

Slope       Min. GET    Avg. GET    Median GET
Min. Ping   1.18        1.77        1.61
Avg. Ping   0.88        1.36        1.23

Intercept   Min. GET    Avg. GET    Median GET
Min. Ping   315 ms      502 ms      422 ms
Avg. Ping   345 ms      540 ms      386 ms

Typical linear regression fit slopes and intercepts are shown in Table 4 for
various combinations of minimum, average and median GET responses
versus minimum and average ping responses.
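As a worked example of how the fit parameters in Table 4 might be used, the sketch below estimates a GET response time from a ping response time. The slopes and intercepts are transcribed from Table 4; the names TABLE_4 and predict_get_ms are hypothetical. Given Correlation Coefficients of around 0.6, such an estimate should be read as a rough central value rather than a precise prediction.

```python
# (slope, intercept in msec) transcribed from Table 4,
# keyed by (ping statistic, GET statistic).
TABLE_4 = {
    ("min", "min"): (1.18, 315.0),
    ("min", "avg"): (1.77, 502.0),
    ("min", "median"): (1.61, 422.0),
    ("avg", "min"): (0.88, 345.0),
    ("avg", "avg"): (1.36, 540.0),
    ("avg", "median"): (1.23, 386.0),
}


def predict_get_ms(ping_ms, ping_stat="min", get_stat="median"):
    """Estimate a GET response time (msec) from a ping response time (msec)."""
    slope, intercept = TABLE_4[(ping_stat, get_stat)]
    return slope * ping_ms + intercept
```

For a minimum ping of 132 msec (the IT cache median in Table 2), the estimated median GET response is 1.61 x 132 + 422, or about 635 msec.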

Path names ending in a
slash (/), which we refer to as "index pages", may require
the server to compose a directory listing, which in
turn may take more time. To evaluate whether the results are skewed by
such path names, we re-analyzed the data excluding samples with
them. These paths comprised about 25% of the paths that we
measured.
Table 5 below shows that the difference
in Correlation Coefficient between including and excluding "index pages" is
negligible.

Table 5: Values of the Correlation Coefficient for the minimum, average and
median GETs versus the minimum ping response for the first 4031 samples
measured from the IT cache sample set
shown in Table 2.
The first row includes path names ending in slash ("index pages").
The second row excludes samples with path names
ending in a slash.

Correlation Coefficient R          Min. GET    Avg. GET    Median GET  Number of Samples
All samples                        0.609       0.579       0.61        4031
All samples excluding index pages  0.593       0.562       0.596       3120

There was a weak correlation (R ~ 0.15 - 0.19) between the minimum
ping response times and the GET response sizes in bytes. There was
a slightly larger but
still weak correlation (R ~ 0.20 - 0.23) between the minimum or median GET
response times and the GET response sizes in bytes. In one measurement run of about 1700 samples,
we fixed the ping payload to 1000 bytes,
instead of making the ping payload size equal to the GET response
size.
The Correlation Coefficient R
for minimum, average and median GET response against
the minimum ping response dropped by about 25% to about
R=0.45 as can be seen in
Table 6.

Table 6: Correlation Coefficients for minimum, average and
median GETs (10 GETs/host) versus minimum and average pings (10 pings/host)
where the ping payload was fixed at 1000 bytes. The sample size is
1734 hosts obtained from URLs in the IT cache
list.

Correlation Coefficient R  Min. GET    Avg. GET    Median GET
Min. Ping                  0.43        0.46        0.45
Avg. Ping                  0.38        0.42        0.41

We also plotted the GET response times versus the packet
loss, but could find only weak
correlations (R ~ 0.18 - 0.24).

There was a significant difference in R
between the IT and BO cache measurements. For example, for 2 sets of
6000 samples shown in Table 2 which were
measured over the same time interval (December 24-28, 1996) R is
as shown in Table 7. This difference is not currently understood.

Table 7: Correlation Coefficients R for the minimum, average and
median GETs versus the minimum ping responses for 6000 samples derived from
the IT and BO cache lists.

R for IT cache list hosts  Min. GET    Avg. GET    Median GET
Min. Ping                  0.595       0.575       0.594

R for BO cache list hosts  Min. GET    Avg. GET    Median GET
Min. Ping                  0.530       0.511       0.529

Lower bounds of GET with respect to ping response

The remarkably clear lower boundary seen in Figure 9 around y = 2x
is not surprising since:
a slope of 2 corresponds to HTTP GETs that take twice the ping
time; the minimum ping time is approximately the round trip time; and
a minimal TCP transaction involves two round trips, one round trip to exchange
the connection set-up (SYN) packets and
the second to send the request and receive the response. The connection
termination is done asynchronously and so does not show up in the timing.

Figure 9 shows a scatter plot of the minimum GET response versus the minimum
ping RTT response for the lower values of response time. The straight line shows the
boundary of y = 2x.

The lower boundary can also be visualized by displaying the distribution of residuals
between the measurements and the line y = 2x (where y =
HTTP GET response time and x =
minimum ping response time). Such a distribution is shown in Figure 10. The steep increase in
the frequency of measurements as one approaches zero residual value
(y = 2x) is apparent.
The Inter Quartile Range (IQR), the residual range between where
25% and 75% of the
measurements fall, is about 220 msec, and is indicated on the plot by the
red line.

Figure 10 shows the frequency histogram of the residual of
minimum(HTTP GET response) - minimum(ping RTT response) for the
data shown in Figure 9.
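The residuals about the two-round-trip boundary, and their inter-quartile range, can be computed as in the sketch below. It assumes the residual is taken as the GET response time minus twice the minimum ping time, as described in the text above; the function names and the "inclusive" quartile convention are illustrative assumptions.

```python
import statistics


def residuals_from_bound(get_ms, ping_ms, round_trips=2):
    """Residual of each GET time about the line y = round_trips * x."""
    return [g - round_trips * p for g, p in zip(get_ms, ping_ms)]


def interquartile_range(values):
    """Width of the range between the 25th and 75th percentiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1
```

A residual near zero means the GET took little more than the two network round trips; large positive residuals indicate time spent elsewhere, for example in the server.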

In summary there is a moderate to strong correlation between the
GET and ping response times for typical Web GET response sizes for
Internet Web servers. Better correlations are obtained if one
compares the minimum GET response times versus the minimum ping response times.
Between 25% and 40% of the total variance of
the minimum GET response is accounted for by its regression on the minimum
ping response.
Since ping measures the response time of the lower
network layers, we may say that the response time to GET Web
pages over the Internet is moderately to strongly dependent on the
network's performance. Another way of putting this is that if
one knows in advance the minimum or average ping response time to the Web server,
the GET response is moderately predictable for the typical
size of Web GET response retrieved.
The ability to make such a prediction, however, is only a part of being able to
predict what the user experiences. There are many other factors involved
including:

the user is unlikely to GET the same page 10 times;

many Web pages are composed of multiple GETs;

usually the user does not know in advance how many GETs compose a page;

usually the user does not know in advance the sizes of the GET responses,
and this lack of foreknowledge can reduce the correlation
(compare Table 6 with Table 3);

different browsers may perform the GETs composing a page in parallel
or sequentially;

after receiving each GET response for a path, the browser still has to render it;

the Web server may have to execute a lot more code and hence be
much slower in responding for some requests
(e.g. CGI access to databases);

the GET responses may be cached.

Other observations include:

The number (about 5%, see Table 1) of Web servers that successfully responded to
GETs but
did not respond to pings, resulting in 100% ping packet loss, is interesting.
Possibly this
is a security/safety measure triggered in part by recent "Ping o' Death"
concerns [MB]. This could raise concerns for people monitoring the network
by using pings, and for attempts to select one of a set of replicated servers
based on the ping response [MC].

Less than 10% of the hosts pinged had packet losses of > 10%. Less than
0.25% of the hosts pinged had duplicate responses, and we received packets
out of order for less than 0.25% of the hosts pinged.

The large number of failures (over 10% in our samples, see Table 1)
to GET a URL is especially
interesting and is probably worth long term tracking since it is directly
related to users' Web experiences.
Note that these failures do not include cases where the server could not find
a file, or the file is read protected, and the server
reports an error to the client
via an HTTP status code. It is possible that some of the effect might be
due to Web servers being shut down over the extended
holiday season when most of the measurements were made.

The weak correlations between both GET versus bytes transferred and ping
versus bytes transferred may indicate that, for the GET response sizes
we measured,
the performance is more related to start-up effects and delays than to
bandwidth availability.

There was little change in the correlations if we excluded "index pages"
(see Table 5).

For sets of 10 pings for the purpose of correlation measures, there
seems little need to provide the median as well as the average.

The measurements cause network traffic. Their unusual nature
(e.g. 10 closely separated GETs) may also raise concerns for an administrator
who carefully reviews her/his Web server's or other logs.

Many of the Web server hosts monitored have nothing to do with SLAC's mission, and
the account doing the monitoring may wind up on some very non-work related mailing lists.

Possibilities for future work include:

Use the cache URLs that point to FTP or Gopher servers to
measure and analyze the correlation between FTP and ping, and Gopher and ping.

Gather and report on long term information on the GET failure rates at the
network level.

Gather and report on long term information on HTTP status code
frequencies.

For the last two items,
relevant
data may already be gathered, or the capability to
gather it could simply be added, at Web server caches such as those associated with
NLANR.

Appendix: Pathology due to early measurement method

Looking in more detail at the scatter plots in the lower ranges of
median GET and minimum ping responses, some
clustering about fixed values of the median GET response is visible.
Figure 11
(for about 10000 samples measured on an RS/6000 model 250) shows an example
of this clustering.

Figure 11 shows a scatter plot of the lower ranges of the GET and ping responses
measured from SLAC to about 10000 web servers.

The first cluster is
around median GET responses of 250-265 msec, and a further
cluster at 450-465 msec can be
observed. Histogramming the frequency of median GET responses against the
median GET response time (see Figure 12)
shows several distinct peaks which are separated by about 200 msec.

Figure 12 shows the frequency histogram of the GET response data in
Figure 11.

Samples comprising these peaks, compared with the complete sample set, do
not show statistically significant differences in the
distributions of:

ping packet loss (one might expect longer GET responses if there
were more TCP retries which in turn might be expected if the ping packet
loss were higher);

Web server HTTP status codes;

bytes in the GET response;

reads required on the monitoring host to GET each response.

The effects are less pronounced for the minimum GETs and disappear for
the first GET of each set of 10 GETs. Possibly this is due to smearing
out of the effect since the first GET response is more variable than
the median of 10 GETs.

No such peaks are seen in the equivalent histogram of ping response
times, though the ping responses do appear to be bimodal with a
peak at about 28 msec and a larger peak at 106 msec.
The GET effect is reproducible across several monitoring host architectures
including RS/6000 models
320H and 250 (both running AIX 3.2.5) and
a Sun 4/50 running SunOS 4.1, all located at SLAC, and a Sun SuperSparc 10 running
SunOS 4.1 located at the Fermi National Accelerator Laboratory (FNAL). For these
different monitoring hosts, the
location of the first GET response peak changes; for example, it is at
about 320 msec
for an RS/6000 320H, about 255 msec for an RS/6000 250 and 210 msec for
a Sun 4/50. However the separation of the peaks stays fairly constant at
about 200 msec.

The effect was an artifact of the measurement method, where the repeated GETs (up to
10) tended to synchronize with the TCP delayed ACK timer [ST]. The solution was to delay
the request for the second and subsequent GETs by a random time. The clue to this
was provided by Vern Paxson.
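The idea of separating repeated probes by a random delay can be sketched as follows. This is not the xchkaccess code: the function name, the probe callback and the default delay bounds are all hypothetical, and the text does not state what range of random delays was actually used.

```python
import random
import time


def repeat_with_jitter(probe, repeats=10, min_delay_s=0.0, max_delay_s=1.0,
                       rng=None, sleep=time.sleep):
    """Call probe() `repeats` times, sleeping a random time between calls.

    The delay bounds are illustrative assumptions. Drawing each delay
    independently keeps successive probes from falling into step with a
    periodic timer on either host.
    """
    rng = rng or random.Random()
    results = [probe()]
    for _ in range(repeats - 1):
        sleep(rng.uniform(min_delay_s, max_delay_s))
        results.append(probe())
    return results
```

Passing the sleep function as a parameter makes the routine easy to exercise without actually waiting, for example by supplying a function that only records the requested delays.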

Acknowledgements

We would like to thank Bill Wing of Oak Ridge National Laboratory for
encouraging us to make these measurements, Dave Martin of the
High Energy Physics Network Resource Center (HEPNRC) at FNAL for help in
running xchkaccess at FNAL, Connie Logg of SLAC for help with capturing packets,
Bill Weeks of SLAC for useful discussions and help looking at captured packets,
and Vern Paxson of LBNL for suggesting the probable source of the GET
clustering.

Footnotes

+ We wanted to avoid CGI scripts since they can cause the
Web server to take much longer
to provide the information than a simple page reference, and hence are less
typical of network effects and will skew the results.

* For later measurements we also measured the median ping response.
For these measurements (about 1900 samples), the Correlation
Coefficient obtained using the average ping versus the minimum, average or
median GET differed from that obtained using the median ping by on the order of
1%, which was within the expected statistical fluctuations.
For the bulk of the measurements and analysis we focussed on the minimum and
average ping responses rather than the median ping response. This was because the
summary report from the standard ping tool used by most users provides the
minimum, average and maximum responses and not the
median.