IAJIT

Selectivity Estimation of Range Queries in Data Streams using Micro-Clustering

Sudhanshu Gupta and Deepak Garg

Computer Science and Engineering Department, Thapar University, India

Abstract:
Selectivity estimation is an important task in query optimization. Common data
mining techniques are not applicable to large, fast, continuous data streams,
because streams require one-pass processing of the data. This requirement makes
range query estimation a challenging task. We propose a technique that performs
range query estimation using micro-clustering. The technique maintains cluster
statistics in the form of micro-clusters. Each micro-cluster also maintains
information about the distribution of its values as a set of cosine
coefficients, and these coefficients are used to estimate range queries. The
estimation can be done over a range of data values that spans several clusters.
The technique has been compared with the cosine series technique for
selectivity estimation. Experiments have been conducted on both synthetic and
real datasets of varying sizes, and the results confirm that our technique
offers substantial improvements in accuracy over other methods.
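
The idea of keeping cosine coefficients per micro-cluster and combining them to answer a range query can be illustrated with a minimal sketch. It is not the authors' algorithm: the fixed cluster boundaries, the names MicroCluster, build_clusters and estimate_selectivity, and the choice of eight coefficients are all assumptions made for illustration. Each cluster keeps only a count and running sums of cosine terms, so every value is seen once.

```python
import math


class MicroCluster:
    """Hypothetical one-pass summary of the values falling in [lo, hi)."""

    def __init__(self, lo, hi, num_coeffs=8):
        self.lo, self.hi = lo, hi
        self.n = 0
        # Running sums of sqrt(2)*cos(k*pi*u) for k = 1..num_coeffs.
        self.sums = [0.0] * num_coeffs

    def _norm(self, x):
        # Map a value from [lo, hi] to [0, 1].
        return (x - self.lo) / (self.hi - self.lo)

    def add(self, x):
        # Update the count and coefficient sums; the value itself is not stored.
        u = self._norm(x)
        self.n += 1
        for k in range(1, len(self.sums) + 1):
            self.sums[k - 1] += math.sqrt(2) * math.cos(k * math.pi * u)

    def count_in(self, a, b):
        """Estimated number of summarised values lying in [a, b]."""
        a, b = max(a, self.lo), min(b, self.hi)
        if self.n == 0 or a >= b:
            return 0.0
        ua, ub = self._norm(a), self._norm(b)
        # Integral of the cosine-series density estimate over [ua, ub].
        frac = ub - ua
        for k in range(1, len(self.sums) + 1):
            coeff = self.sums[k - 1] / self.n
            diff = math.sin(k * math.pi * ub) - math.sin(k * math.pi * ua)
            frac += coeff * math.sqrt(2) * diff / (k * math.pi)
        # A truncated series can stray slightly outside [0, 1]; clamp it.
        return self.n * min(max(frac, 0.0), 1.0)


def build_clusters(values, bounds, num_coeffs=8):
    # Assumption: clusters are fixed, non-overlapping intervals given up front.
    clusters = [MicroCluster(lo, hi, num_coeffs) for lo, hi in bounds]
    for x in values:
        for c in clusters:
            if c.lo <= x < c.hi:
                c.add(x)
                break
    return clusters


def estimate_selectivity(clusters, a, b):
    """Estimated fraction of all summarised values that lie in [a, b]."""
    total = sum(c.n for c in clusters)
    if total == 0:
        return 0.0
    return sum(c.count_in(a, b) for c in clusters) / total
```

A query such as estimate_selectivity(clusters, 3.0, 7.0) over clusters covering [0, 5) and [5, 10) sums the partial estimates from both clusters, which mirrors the statement that a range may be spread over a number of clusters.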
