NAME

DESCRIPTION

It is intended to provide a place to keep a variety of functions and formals for use in callback APIs in DBD::SQLite.

AGGREGATE FUNCTIONS

Variance

This is a simple aggregate function which returns a variance. It is adapted from an example implementation in pysqlite.

packagevariance;sub new {bless[],shift;}sub step {my($self,$value)=@_;push@$self,$value;}sub finalize {my$self=$_[0];my$n=@$self;# Variance is NULL unless there is more than one rowreturnundefunless$n||$n==1;my$mu=0;foreachmy$v(@$self){$mu+=$v;}$mu/=$n;my$sigma=0;foreachmy$v(@$self){$sigma+=($v-$mu)**2;}$sigma=$sigma/($n-1);return$sigma;}# NOTE: If you use an older DBI (< 1.608),# use $dbh->func(..., "create_aggregate") instead.$dbh->sqlite_create_aggregate("variance",1,'variance');

The function can then be used as:

SELECTgroup_name,variance(score)FROMresultsGROUPBYgroup_name;

Variance (Memory Efficient)

A more efficient variance function, optimized for memory usage at the expense of precision:

packagevariance2;sub new {bless{sum=>0,count=>0,hash=>{}},shift;}sub step {my($self,$value)=@_;my$hash=$self->{hash};# by truncating and hashing, we can comsume many more data points$value=int($value);# change depending on need for precision# use sprintf for arbitrary fp precisionif(exists$hash->{$value}){$hash->{$value}++;}else{$hash->{$value}=1;}$self->{sum}+=$value;$self->{count}++;}sub finalize {my$self=$_[0];# Variance is NULL unless there is more than one rowreturnundefunless$self->{count}>1;# calculate avgmy$mu=$self->{sum}/$self->{count};my$sigma=0;while(my($h,$v)=each%{$self->{hash}}){$sigma+=(($h-$mu)**2)*$v;}$sigma=$sigma/($self->{count}-1);return$sigma;}

The function can then be used as:

SELECTgroup_name,variance2(score)FROMresultsGROUPBYgroup_name;

Variance (Highly Scalable)

A third variable implementation, designed for arbitrarily large data sets: