{- |RandProc.hs - a Haskell library for working with random processes in a mathematically rigorous way
(Concepts taken from /Random Processes - a Mathematical Approach for Engineers/ by:
- Robert M. Gray
- Lee D. Davisson
Prentice-Hall Information and System Sciences Series, Thomas Kailath, Series Editor)
$Id: RandProc.hs 31 2011-06-22 13:49:48Z dbanas $
David Banas
12 March 2011
Copyright (c) 2011 by David Banas; All rights reserved World wide.
/Revision History:/
[@Date SVN #@] Description
[@2011-03-13 3@] Data structures stabilized. 'isSigma' working under minimal,
discrete sample testing.
[@2011-03-18 4@] Added 'isProbMeas', as well as monadic debugging versions of
both it and 'isSigma'.
Added an example probability space representing a fair die.
[@2011-03-29 7@] Custom intersection functions added and briefly tested.
[@2011-04-02 8@] Custom union functions added and briefly tested.
Solution is crude: it is O(N^2), and requires 2 passes over
the sample list every time a join is successful. Perhaps, a
pre-sort?
[@2011-06-06 9@] Attempted fix of 'getCompEvent'
Added 'smplComp' function, as helper to revised 'getCompEvent'.
Changed 'Point' to accept Double.
Moved all sample spaces to new file, 'Main.hs'.
Added input sorting to 'range'.
Changed Ranges to be open intervals, in order to allow for
complementing out a Point from them.
[@2011-06-11 10@] Major re-write.
'getCompEvent' fixed. All 5 test spaces checking out ok.
[@2011-06-18 21@] Removed sample set order dependency from 'checkSigma'.
All 7 test spaces checking out ok.
[@2011-06-19 22@] Added 'union of events is an event' test to 'checkSigma'.
[@2011-06-20 23@] Changed 'Event' from data constructor to type alias, in order
to eliminate many instances of 'Event . f . getSamps' code.
[@2011-06-20 25@] Modified 'smpsSetInt' to use a fold.
[@2011-06-20 26@] Defined public interface.
[@2011-06-21 27@] Modified comments for Haddock, and generated HTML docs.
[@2011-06-22 31@] Moved into 'Data' directory.
[End of Subversion revision history] This source has been moved to darcs.
[@2011-06-27@] Made `smplSetUnion` more efficient, and tuned remaining
performance bottlenecks.
/To Do:/
-}{-# LANGUAGE FlexibleInstances, TypeSynonymInstances, OverlappingInstances, NoMonomorphismRestriction #-}moduleData.RandProc(ProbSpace(ProbSpace),Measure(Measure),Sample(Empty),TestResult(..),ErrType(..),checkProbMeas,point,range,makeProbSpace,rangeBegin,rangeEnd,getProb,getEvent,getCompEvent,eventInt,smplComp,isElem,noDupEvents,smplInt,smplSetInt,smplUnion,smplSetUnion,checkSigma,getRsltStr)whereimportData.Listeps::Doubleeps=0.000001-- Used in floating point "equality" tests.{- |We take a probability space to consist of the following:
- an 'abstract space' composed of either discrete or continuous (or a mix) samples
- an 'event space', which must be a Sigma field defined over the abstract space
- a 'probability measure' defined over the event space
[Note:] For the sake of efficient coding, the /event space/ and the
/probability measure/ are combined in the Haskell data structure,
below. This is permissable, because there has to be a 1:1
correspondance between them anyway. And it is preferable, because it:
- keeps the probabilities more closely associated w/ the events, and
- avoids duplication of code (i.e. - the list of events).
-}dataProbSpace=ProbSpace{space::[Sample],measure::[Measure]}deriving(Show){- |This is our abstract data type, which represents a sample in the abstract space.
It has a constructor representing every possible element in the abstract space
we're modeling. (Currently, just points and ranges of /Double/s.)
Normally, none of the constructors of this type will be called directly.
Instead, helper functions are provided, such as 'point' and 'range', which
hide the implementation details from the user, and present a stable interface.
Currently, the sole exception to the above is the /Empty/ constructor, which is
really just a hack intended to put off the job of making the functions in this
library more intelligent, with regard to their handling of empty lists.
-}dataSample=PointDouble|Range(Double,Double)|Empty|Fullderiving(Eq,Show)-- [Note:] When constructing a range, it is the user's responsibility to ensure-- that the first element of the couple is less than the second.instanceOrdSamplewhere-- (<)_<Empty=FalseEmpty<_=TrueFull<_=False_<Full=True(Pointp)<(Pointp')=p<p'(Pointp)<(Range(r1,_))=p<=r1(Range(r1,r2))<(Pointp)=Pointp>=Range(r1,r2)(Range(r1,r2))<(Range(r3,r4))|r1==r3=r2<r4|otherwise=r1<r3-- (>)Empty>_=False_>Empty=True_>Full=FalseFull>_=True(Pointp)>(Pointp')=p>p'(Pointp)>(Range(r1,r2))=Pointp>=Range(r1,r2)(Range(r1,r2))>(Pointp)=Pointp<Range(r1,r2)(Range(r1,r2))>(Range(r3,r4))|r1==r3=r2>r4|otherwise=r1>r3-- (<=), (>=)(<=)s1=not.(s1>)(>=)s1=not.(s1<)-- custom data type and helper function for communicating sample types:-- (Helpful in places where pattern matching can't be used.)dataSampleType=STPoint|STRange|STEmpty|STFullderiving(Eq,Show)sampleType::Sample->SampleTypesampleType(Point_)=STPointsampleType(Range_)=STRangesampleTypeEmpty=STEmptysampleTypeFull=STFull-- |The custom type /Event/ is just an alias for a list of Samples.typeEvent=[Sample]-- |/Measure/ has 2 fields:---- * /event/ - a list of samples from the space, and---- * /prob/ - a number between 0 and 1 giving the events probability of-- occurence.dataMeasure=Measure{event::Event,prob::Double}deriving(Eq,Ord,Show)-- |This is the helper function intended to be used for constructing a point sample.point::Double->Samplepoint=Point-- |This is the helper function intended to be used for constructing a range sample.-- The range is considered /open/. That is, its end points are not included.range::(Double,Double)->Samplerange(a,b)|a==b=pointa|a<b=Range(a,b)|otherwise=Range(b,a)-- |This helper function generates a complete and valid probability space,-- given a discrete sample space and set of probabilities.makeProbSpace::[(Sample,Double)]->ProbSpacemakeProbSpace[]=ProbSpace{space=[Empty,Full],measure=[Measure{event=[Empty],prob=0},Measure{event=[Full],prob=1}]}makeProbSpaceps=ProbSpace{space=[fstp|p<-ps],measure=Measure{event=[Empty],prob=0}:[Measuree(sum[sndp|p<-ps,fstp`elem`e])|e<-ss]}wheress=filter(not.null)$subs$mapfstps-- Helpful info getters:-- |Gets the beginning point of a range, which is /not/ included in the range,-- since ranges are considered to be open.rangeBegin::Sample->DoublerangeBegin(Point_)=undefinedrangeBegin(Range(a,_))=arangeBeginEmpty=undefinedrangeBeginFull=undefined-- |Gets the ending point of a range, which is /not/ included in the range,rangeEnd::Sample->DoublerangeEnd(Point_)=undefinedrangeEnd(Range(_,b))=brangeEndEmpty=undefinedrangeEndFull=undefined-- |Extracts the probability from a Measure.getProb::Measure->DoublegetProb=prob-- |Extracts the Event from a Measure.getEvent::Measure->EventgetEvent=event-- |Get the complement of an event from the sample space.getCompEvent::[Sample]->Event->EventgetCompEvent[]_=[Empty]getCompEvent(s:ss)e=smplSetUnion$foldleventInt[s](map(smplComps)e)++getCompEventsse-- |Calculates the intersection of 2 events (i.e. - list of samples).eventInt::Event->Event->EventeventInts1s2|null(filter(/=Empty)s1)=[Empty]|null(filter(/=Empty)s2)=[Empty]|otherwise=smplSetUnion$concatMap(\s->(map(smplInts)s1))s2-- |Returns that portion of the first sample that is disjoint from the second.smplComp::Sample->Sample->[Sample]smplCompEmpty_=[Empty]smplCompsEmpty=[s]smplComp(Pointn)(Pointm)|m==n=[Empty]|otherwise=[Pointn]smplComp(Pointn)(Range(a,b))|(n>a)&&(n<b)=[Empty]|otherwise=[Pointn]smplComp(Range(a,b))(Pointn)|(n>a)&&(n<b)=[Range(a,n),Range(n,b)]|otherwise=[Range(a,b)]smplComp(Range(a,b))(Range(c,d))|(c>=b)||(a>=d)=[Range(a,b)]|(c<=a)&&(d>=b)=[Empty]|(c>a)&&(d<b)=[Range(a,c),Range(d,b),Pointc,Pointd]|a<c=[Range(a,c),Pointc]|otherwise=[Range(d,b),Pointd]smplComp_Full=[Empty]smplCompFull_=undefined-- |Determine if a sample is an element of a space.---- (Need this, as opposed to just using `elem`, in order to accomodate ranges.)isElem::[Sample]->Sample->BoolisElem[]_=FalseisElem_Empty=TrueisElem(s:ss)s'=testElemss'||isElemsss'-- testElem : Test whether the second sample is an element of the first.testElem::Sample->Sample->BooltestElemEmpty_=FalsetestElem_Empty=TruetestElem(Pointx)(Pointy)=x==ytestElem(Point_)(Range_)=FalsetestElem(Range(x,y))(Pointz)=(z>x)&&(z<y)testElem(Range(x,y))(Range(w,z))=(w>=x)&&(z<=y)testElemFull_=TruetestElem_Full=False-- |Checks a list of measures against duplicate events.noDupEvents::[Measure]->BoolnoDupEvents[]=TruenoDupEvents(m:ms)=notElem(eventm)es&&noDupEventsmswherees=mapeventms-- custom union and intersection functions for our 'Sample' data type-- |Returns the intersection between 2 samples.smplInt::Sample->Sample->SamplesmplIntEmpty_=EmptysmplInt_Empty=EmptysmplIntFulls=ssmplIntsFull=ssmplInt(Pointn)(Pointm)|m==n=Pointn|otherwise=EmptysmplInt(Pointn)(Range(a,b))|(n>a)&&(n<b)=Pointn|otherwise=EmptysmplInt(Range(a,b))(Pointn)=smplInt(Pointn)(Range(a,b))smplInt(Range(a,b))(Range(c,d))|(c>=b)||(a>=d)=Empty|otherwise=Range(maxac,minbd)-- |Reduces a list of samples to a single sample representing their intersection.smplSetInt::[Sample]->SamplesmplSetInt=foldlsmplIntFull-- |Returns the union of 2 samples.---- Unlike 'smplInt', /smplUnion/ must return a list since, if the 2 input-- samples aren't adjacent or overlapping, the union of them is a list-- containing both.smplUnion::Sample->Sample->[Sample]smplUnionEmptys=[s]smplUnionsEmpty=[s]smplUnion(Pointn)(Pointm)|m==n=[Pointn]|m<n=[Pointm,Pointn]|otherwise=[Pointn,Pointm]smplUnion(Pointn)(Range(a,b))-- Ranges are considered open, in order to allow complementing a point from them.|(n>a)&&(n<b)=[Range(a,b)]|n>=b=[Range(a,b),Pointn]|otherwise=[Pointn,Range(a,b)]-- The union operation is commutative.smplUnion(Range(a,b))(Pointn)=smplUnion(Pointn)(Range(a,b))smplUnion(Range(a,b))(Range(c,d))|c>=b=[Range(a,b),Range(c,d)]|a>=d=[Range(c,d),Range(a,b)]|otherwise=[Range(minac,maxbd)]smplUnionFull_=[Full]smplUnion_Full=[Full]-- |Absorbs a single input sample into a reverse-sorted list, as far as possible,-- via unioning. Assumes the single sample to be > the first list entry.smplUnionRecursRev::[Sample]->Sample->[Sample]smplUnionRecursRev[]s=[s]smplUnionRecursRev(Empty:ss)s=smplUnionRecursRevssssmplUnionRecursRev(s:ss)Empty=smplUnionRecursRevssssmplUnionRecursRev(Pointm:ss)(Pointn)|m==n=smplUnionRecursRevss(Pointn)|otherwise=Pointn:Pointm:sssmplUnionRecursRev((Range(a,b)):ss)(Pointn)|(n>a)&&(n<b)=smplUnionRecursRevss(Range(a,b))|otherwise=Pointn:Range(a,b):sssmplUnionRecursRev((Pointn):ss)(Range(a,b))=Range(a,b):Pointn:sssmplUnionRecursRev((Range(c,d)):ss)(Range(a,b))|a>=d=Range(a,b):Range(c,d):ss|otherwise=smplUnionRecursRevss(Range(c,maxbd))smplUnionRecursRev_Full=[Full]smplUnionRecursRev(Full:_)_=[Full]-- |Collapses a list of samples down to the maximally reduced set, which still-- composes a proper union of the input.smplSetUnion::[Sample]->[Sample]smplSetUnion=consolidateRPR.foldlsmplUnionRecursRev[].sort-- consolidateRPR : Reduces all occurences of:-- Range (a,b), Point (b), Range (b,c) into:-- Range (a,c).consolidateRPR::[Sample]->[Sample]consolidateRPR=scanRPR.sort-- scanRPR : Scans through a list of samples looking for the pattern:-- Range (a,b), Point (b), Range (b,c) and consolidates those into:-- Range (a,c).---- Note) This function depends upon receiving a SORTED sample list!scanRPR::[Sample]->[Sample]scanRPR[]=[]scanRPR(s:ss)|sampleTypes==STRange=ifheadIsPoint(rangeEnds)ss&&headIsRange(tailss)&&rangeBegin(head(tailss))==rangeEndsthenscanRPR$Range(rangeBegins,rangeEnd(head(tailss))):tail(tailss)elses:scanRPRss|otherwise=s:scanRPRss-- headIsPoint : Tests whether the head of the incoming list is a particular-- Point value.headIsPoint::Double->[Sample]->BoolheadIsPoint_[]=FalseheadIsPointa(s:_)=s==Pointa-- headIsRange : Tests whether the head of the incoming list is a-- Range.headIsRange::[Sample]->BoolheadIsRange[]=FalseheadIsRange(s:_)=sampleTypes==STRange-- |Custom data type used for test results and error reporting.dataTestResult=Fail{err::ErrType}|Passderiving(Show)instanceEqTestResultwherePass==Pass=TrueFaile==Faile'=e==e'_==_=False-- |Custom data type for reporting different errorsdataErrType=UnknownErr|EmptySampleSpace|EmptyEventSpace|MissingNullEvent|MissingCertainEvent|BadEventSamples|MissingCompEvent|MissingUnionEvent|EventMeasLenMismatch|DupEventsInMeas|MissingEventsInMeas|NullEventNonZeroProb|CertainEventNonUnityProb|EventAndCompNoSumOnederiving(Eq,Show)-- |getErrStr : Turns a value of type /ErrType/ into a more readable string.getErrStr::ErrType->StringgetErrStrUnknownErr="Unknown error"getErrStrEmptySampleSpace="Empty sample space"getErrStrEmptyEventSpace="Empty event space"getErrStrMissingNullEvent="The null event is missing from the event space."getErrStrMissingCertainEvent="The certain event is missing from the event space."getErrStrBadEventSamples="At least one event contains samples not in the sample space."getErrStrMissingCompEvent="At least one event's compliment is missing from the event space."getErrStrMissingUnionEvent="At least one union of events is missing from the event space."getErrStrEventMeasLenMismatch="Lengths of event and measure lists don't match."getErrStrDupEventsInMeas="There are duplicate events in the measure list."getErrStrMissingEventsInMeas="Some events aren't covered in the measure list."getErrStrNullEventNonZeroProb="The null event has been assigned a non-zero probability."getErrStrCertainEventNonUnityProb="The certain event has been assigned a probability other than 1."getErrStrEventAndCompNoSumOne="At least one pair of event and compliment have probabillities that don't add to 1."-- |Turns a value of type /TestResult/ into a human readable string.getRsltStr::TestResult->StringgetRsltStrPass="Ok"getRsltStrtr=getErrStr$errtr-- |Checks whether event space is actually a Sigma field over the sample space.checkSigma::ProbSpace->TestResultcheckSigmaps|null(filter(/=Empty)sp)=FailEmptySampleSpace|nulles=FailEmptyEventSpace|notElem[Empty]es=FailMissingNullEvent|notElemspes&&notElem[Full]es=FailMissingCertainEvent|not$all(all(\s->isElemsps||(s==Empty)))es=FailBadEventSamples|not$all(\e->getCompEventspe`elem`es)es=FailMissingCompEvent|not$all(`elem`es)(eventUnionses)=FailMissingUnionEvent|otherwise=Passwherees=map(sort.event)(measureps)ss=filter(not.null)$map(filter(/=Empty).event)(measureps)sp=spaceps-- |Power set generator, specific to a list of `Event`s---- Made necessary by the fact that generating all possible unions of all-- possible events grows as 2^(2^N), N = # of samples, due to much redundancy.eventUnions::[Event]->[Event]eventUnionses=concat$foldl'(\xsx->removeDups(map(smplSetUnion.concat.(x:))xs:xs))[[]]es'wherees'=filter(not.null)$map(filter(/=Empty))es-- |Power set generator, which removes duplicates as it generatessubs::[a]->[[a]]subs=foldl'(\xsx->xs++map(x:)xs)[[]]-- |Remove duplicates from a list.removeDups::(Eqa)=>[a]->[a]removeDups[]=[]removeDups(x:xs)=x:removeDupsyswhereys=[y|y<-xs,y/=x]-- |Checks a value of type 'ProbSpace' for correctness, and returns a value of-- type 'TestResult'.checkProbMeas::ProbSpace->TestResultcheckProbMeasps|cs/=Pass=cs|not(noDupEvents(measureps))=FailDupEventsInMeas|not$all(\m->getProbm==0.0)(filter(\m->getEventm==[]||getEventm==[Empty])(measureps))=FailNullEventNonZeroProb|not$all(\m->getProbm==1.0)(filter(\m->getEventm==spaceps)(measureps))=FailCertainEventNonUnityProb|not$all(\m->1.0-getProbm-getProb(head(filter(\m'->sort(getEventm')==getCompEvent(spaceps)(getEventm))(measureps)))<eps)(measureps)=FailEventAndCompNoSumOne|otherwise=Passwherecs=checkSigmaps