Space-Efficient Sampling

Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:171-178, 2007.

Abstract

We consider the problem of estimating nonparametric probability density functions from a sequence of independent samples. The central issue that we address is to what extent this can be achieved with only limited memory. Our main result is a space-efficient learning algorithm for determining the probability density function of a piecewise-linear distribution. However, the primary goal of this paper is to demonstrate the utility of various techniques from the burgeoning field of data stream processing in the context of learning algorithms.

Related Material

@InProceedings{pmlr-v2-guha07a,
title = {Space-Efficient Sampling},
author = {Sudipto Guha and Andrew McGregor},
booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
pages = {171--178},
year = {2007},
editor = {Marina Meila and Xiaotong Shen},
volume = {2},
series = {Proceedings of Machine Learning Research},
address = {San Juan, Puerto Rico},
month = {21--24 Mar},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v2/guha07a/guha07a.pdf},
url = {http://proceedings.mlr.press/v2/guha07a.html},
abstract = {We consider the problem of estimating nonparametric probability density functions from a sequence of independent samples. The central issue that we address is to what extent this can be achieved with only limited memory. Our main result is a space-efficient learning algorithm for determining the probability density function of a piecewise-linear distribution. However, the primary goal of this paper is to demonstrate the utility of various techniques from the burgeoning field of data stream processing in the context of learning algorithms.}
}

%0 Conference Paper
%T Space-Efficient Sampling
%A Sudipto Guha
%A Andrew McGregor
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-guha07a
%I PMLR
%J Proceedings of Machine Learning Research
%P 171--178
%U http://proceedings.mlr.press
%V 2
%W PMLR
%X We consider the problem of estimating nonparametric probability density functions from a sequence of independent samples. The central issue that we address is to what extent this can be achieved with only limited memory. Our main result is a space-efficient learning algorithm for determining the probability density function of a piecewise-linear distribution. However, the primary goal of this paper is to demonstrate the utility of various techniques from the burgeoning field of data stream processing in the context of learning algorithms.

TY  - CPAPER
TI  - Space-Efficient Sampling
AU  - Sudipto Guha
AU  - Andrew McGregor
BT  - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
PY  - 2007/03/11
DA  - 2007/03/11
ED  - Marina Meila
ED  - Xiaotong Shen
ID  - pmlr-v2-guha07a
PB  - PMLR
SP  - 171
DP  - PMLR
EP  - 178
L1  - http://proceedings.mlr.press/v2/guha07a/guha07a.pdf
UR  - http://proceedings.mlr.press/v2/guha07a.html
AB  - We consider the problem of estimating nonparametric probability density functions from a sequence of independent samples. The central issue that we address is to what extent this can be achieved with only limited memory. Our main result is a space-efficient learning algorithm for determining the probability density function of a piecewise-linear distribution. However, the primary goal of this paper is to demonstrate the utility of various techniques from the burgeoning field of data stream processing in the context of learning algorithms.
ER  -