Abstract

Technology trends are making it more and more difficult to observe and record the large amount of data generated by high speed
links. Traffic sampling techniques provide a simple alternative that reduces the volume of data collected. Unfortunately,
existing sampling techniques largely hide any temporal relationship in the recorded data. Our proposed method, "FastCARS",
naturally captures statistics for packets that are 1, 2 or more steps away. It has the following properties: (a) it provides
accurate measurements of full trace's statistics, (b) it is simple and can be easily implemented, (c) it captures correlations
between successive packets, as well as packets that are further apart, and (d) it is scalable and flexible such that it can
be easily adjusted to take into account prior knowledge about characteristics of particular traces. We also propose several
new tools for network data mining that use the information provided by FastCARS. The experimental results on multiple, real-world
datasets (233Mb in total), show that the proposed FastCARS sampling method and these new data mining tools are effective. With
these tools, we show that the independence assumption of packet arrival is not correct, and packet trains may not be the only
cause of dependence among arrivals.

BibTeX Entry

@InProceedings{GlobalInternet02FastCARS,
author = {Jia-Yu Pan and Srinivasan Seshan and Christos Faloutsos},
title = {FastCARS: Fast, Correlation-Aware Sampling for Network Data Mining},
booktitle = {Proceedings of IEEE GLOBECOM 2002 - Global Internet Symposium},
year = 2002,
wwwnote = {Taipei, Taiwan, November 17-21, 2002},
abstract = {Technology trends are making it more and more difficult to observe and record the large amount of data generated by high speed links. Traffic sampling techniques provide a simple alternative that reduces the volume of data collected. Unfortunately, existing sampling techniques largely hide any temporal relationship in the recorded data.
Our proposed method, ``FastCARS'' naturally captures statistics for
packets that are 1, 2 or more steps away. It has the
following properties: (a) it provides accurate measurements
of full trace's statistics, (b) it is simple and can be easily
implemented, (c) it captures correlations between successive packets,
as well as packets that are further apart, and (d) it is scalable and
flexible such that it can be easily adjusted to take into account prior
knowledge about characteristics of particular traces.
We also propose several new tools for network data mining that use the
information provided by FastCARS.
The experimental results on multiple, real-world datasets (233Mb in
total), show that the proposed FastCARS sampling method and these new data
mining tools are effective. With these tools, we show that the
independence assumption of packet arrival is not correct, and packet
trains may not be the only cause of dependence among arrivals.},
bib2html_pubtype = {Refereed Conference},
bib2html_rescat = {Network Data Mining},
}