Revision as of 19:47, 28 August 2007

This page was added to discuss different versions of the code for collaborative filtering at Bryan's blog.

Chris' version

I renamed the variables and then reorganized the code a bit.

The predict' function replaces predict.
The update2

module WeightedSlopeOne (Rating, SlopeOne, empty, predict, update) where

import Data.List (foldl', foldl1')
import qualified Data.Map as M

-- The item type is a polymorphic parameter. Since it goes into a Map
-- it must be able to be compared, so item must be an instance of Ord.
type Count = Int
type RatingValue = Double

-- The Rating is the known (item,Rating) information for a particular "user"
type Rating item = M.Map item RatingValue

-- The SlopeOne matrix is indexed by pairs of items and is implemented
-- as a sparse map of maps. If the item type is an instance of Show
-- then so is the (SlopeOne item) type.
newtype SlopeOne item = SlopeOne (M.Map item (M.Map item (Count, RatingValue)))
  deriving (Show)

-- The SlopeOne' matrix is an unnormalized version of SlopeOne
newtype SlopeOne' item = SlopeOne' (M.Map item (M.Map item (Count, RatingValue)))
  deriving (Show)

empty = SlopeOne M.empty
empty' = SlopeOne' M.empty

-- This performs a strict addition on pairs made of two numeric types
addT (a, b) (c, d) = let (l, r) = (a + c, b + d) in l `seq` r `seq` (l, r)

-- There is never an entry for the "diagonal" elements with equal
-- items in the pair: (foo,foo) is never in the SlopeOne.
update :: Ord item => SlopeOne item -> [Rating item] -> SlopeOne item
update (SlopeOne matrixInNormed) usersRatings =
    SlopeOne . M.map (M.map norm) . foldl' update' matrixIn $ usersRatings
  where
    update' oldMatrix userRatings =
        foldl' (\oldMatrix (itemPair, rating) -> insert oldMatrix itemPair rating)
               oldMatrix itemCombos
      where
        itemCombos = [ ((item1, item2), (1, rating1 - rating2))
                     | (item1, rating1) <- ratings
                     , (item2, rating2) <- ratings
                     , item1 /= item2 ]
        ratings = M.toList userRatings
    insert outerMap (item1, item2) newRating =
        M.insertWith' outer item1 newOuterEntry outerMap
      where
        newOuterEntry = M.singleton item2 newRating
        outer _ innerMap = M.insertWith' addT item2 newRating innerMap
    norm (count, total_rating) = (count, total_rating / fromIntegral count)
    un_norm (count, rating) = (count, rating * fromIntegral count)
    matrixIn = M.map (M.map un_norm) matrixInNormed

-- This version of update2 makes an unnormalized SlopeOne' from each
-- Rating and combines them using Map.union* operations and addT.
update2 :: Ord item => SlopeOne' item -> [Rating item] -> SlopeOne' item
update2 s@(SlopeOne' matrixIn) usersRatingsIn
    | null usersRatings = s
    | otherwise =
        SlopeOne' . M.unionsWith (M.unionWith addT) . (matrixIn :) . map fromRating $ usersRatings
  where
    usersRatings = filter ((1 <) . M.size) usersRatingsIn
    fromRating userRating = M.mapWithKey expand1 userRating
      where
        expand1 item1 rating1 = M.mapMaybeWithKey expand2 userRating
          where
            expand2 item2 rating2
                | item1 == item2 = Nothing
                | otherwise = Just (1, rating1 - rating2)

predict :: Ord a => SlopeOne a -> Rating a -> Rating a
predict (SlopeOne matrixIn) userRatings =
    let freqM = foldl' insert M.empty
                  [ (item1, found_rating, user_rating)
                  | (item1, innerMap) <- M.assocs matrixIn
                  , M.notMember item1 userRatings
                  , (user_item, user_rating) <- M.toList userRatings
                  , item1 /= user_item
                  , found_rating <- M.lookup user_item innerMap ]
        insert oldM (item1, found_rating, user_rating) =
            let (count, norm_rating) = found_rating
                total_rating = fromIntegral count * (norm_rating + user_rating)
            in M.insertWith' addT item1 (count, total_rating) oldM
        normM = M.map (\(count, total_rating) -> total_rating / fromIntegral count) freqM
    in M.filter (\norm_rating -> norm_rating > 0) normM

-- This is a modified version of predict. It also expects the
-- unnormalized SlopeOne' but this is a small detail
predict' :: Ord a => SlopeOne' a -> Rating a -> Rating a
predict' (SlopeOne' matrixIn) userRatings =
    M.mapMaybeWithKey calcItem (M.difference matrixIn userRatings)
  where
    calcItem item1 innerMap
        | M.null combined = Nothing
        | norm_rating <= 0 = Nothing
        | otherwise = Just norm_rating
      where
        combined = M.intersectionWith weight innerMap userRatings
        (total_count, total_rating) = foldl1' addT (M.elems combined)
        norm_rating = total_rating / fromIntegral total_count
        weight (count, rating) user_rating = (count, rating + fromIntegral count * user_rating)

userData :: [Rating String]
userData = map M.fromList
    [ [("squid", 1.0), ("cuttlefish", 0.5), ("octopus", 0.2)]
    , [("squid", 1.0), ("octopus", 0.5), ("nautilus", 0.2)]
    , [("squid", 0.2), ("octopus", 1.0), ("cuttlefish", 0.4), ("nautilus", 0.4)]
    , [("cuttlefish", 0.9), ("octopus", 0.4), ("nautilus", 0.5)]
    ]

userInfo = M.fromList [("squid", 0.4), ("cuttlefish", 0.9), ("dolphin", 1.0)]

predictions = predict (update empty userData) userInfo

predictions' = predict' (update2 empty' userData) userInfo
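
To make the unnormalized representation easier to follow, here is a minimal standalone sketch of the same idea: each user's ratings are expanded into (count, summed difference) entries, the per-user maps are merged with unionsWith, and one item is predicted from the weighted totals. It is not the code from this page. It assumes a current containers package (Data.Map.Strict) rather than the 2007-era Data.Map API used in the listing, and the names Diffs, combine, predictItem, sampleUsers and lucy, as well as the sample ratings, are hypothetical and chosen only for illustration.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as M

type Count = Int
type RatingValue = Double
type Rating item = M.Map item RatingValue

-- Hypothetical name: unnormalized matrix holding, for each ordered pair of
-- items, (number of users who rated both, sum of their rating differences).
type Diffs item = M.Map item (M.Map item (Count, RatingValue))

-- Strict addition on pairs, in the spirit of addT in the listing.
addT :: (Count, RatingValue) -> (Count, RatingValue) -> (Count, RatingValue)
addT (a, b) (c, d) = let l = a + c; r = b + d in l `seq` r `seq` (l, r)

-- One user's contribution: an entry for every ordered pair of distinct items.
fromRating :: Ord item => Rating item -> Diffs item
fromRating userRating = M.mapWithKey expand1 userRating
  where
    expand1 item1 rating1 = M.mapMaybeWithKey expand2 userRating
      where
        expand2 item2 rating2
          | item1 == item2 = Nothing
          | otherwise      = Just (1, rating1 - rating2)

-- Merge all users, skipping users with fewer than two ratings.
combine :: Ord item => [Rating item] -> Diffs item
combine = M.unionsWith (M.unionWith addT) . map fromRating . filter ((1 <) . M.size)

-- Weighted prediction for a single item the user has not rated.
predictItem :: Ord item => Diffs item -> Rating item -> item -> Maybe RatingValue
predictItem diffs userRatings item = do
  innerMap <- M.lookup item diffs
  let combined = M.intersectionWith weight innerMap userRatings
      (n, t)   = foldl' addT (0, 0) (M.elems combined)
  if M.null combined then Nothing else Just (t / fromIntegral n)
  where
    -- Add count * user_rating to the summed differences for this pair.
    weight (count, total) r = (count, total + fromIntegral count * r)

-- Made-up sample data (assumption, not from the page).
sampleUsers :: [Rating String]
sampleUsers = map M.fromList
  [ [("A", 5), ("B", 3), ("C", 2)]
  , [("A", 3), ("B", 4)]
  , [("B", 2), ("C", 5)]
  ]

lucy :: Rating String
lucy = M.fromList [("B", 2), ("C", 5)]
```

Loaded in GHCi, predictItem (combine sampleUsers) lucy "A" works out to 13/3, about 4.33: the pair (A,B) contributes count 2 with total 1 + 2*2 = 5, the pair (A,C) contributes count 1 with total 3 + 1*5 = 8, and (5 + 8) / (2 + 1) = 13/3.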