module Data.CRF.Chain1.Dataset.Codec
( Codec
, CodecM
, encodeWord'Cu
, encodeWord'Cn
, encodeSent'Cu
, encodeSent'Cn
, encodeSent
, encodeWordL'Cu
, encodeWordL'Cn
, encodeSentL'Cu
, encodeSentL'Cn
, encodeSentL
, decodeLabel
, decodeLabels
, mkCodec
, encodeData
, encodeDataL
) where

import Control.Applicative ((<$>), (<*>), pure)
import Data.Maybe (catMaybes)
import Data.Lens.Common (fstLens, sndLens)
import qualified Data.Set as S
import qualified Data.Map as M
import qualified Data.Vector as V
import qualified Control.Monad.Codec as C

import Data.CRF.Chain1.Dataset.Internal
import Data.CRF.Chain1.Dataset.External

-- | A codec. The first component is used to encode observations
-- of type a, the second one is used to encode labels of type b.
type Codec a b = (C.AtomCodec a, C.AtomCodec b)

-- | Type synonym for the codec monad. It is important to notice that by a
-- codec we denote here a structure of two 'C.AtomCodec's while in the
-- monad-codec package it denotes a monad.
type CodecM a b c = C.Codec (Codec a b) c

-- | Encode the labeled word and update the codec.
encodeWordL'Cu :: (Ord a, Ord b) => WordL a b -> CodecM a b (X, Y)
encodeWordL'Cu word = do
    x <- mkX . map Ob <$> mapM (C.encode' fstLens) (S.toList $ fst word)
    y <- mkY <$> sequence
        [ (,) <$> (Lb <$> C.encode sndLens lb) <*> pure pr
        | (lb, pr) <- (M.toList . unDist) (snd word) ]
    return (x, y)

-- | Encode the labeled word and do *not* update the codec.
-- If the label is not in the codec, use the default value.
encodeWordL'Cn :: (Ord a, Ord b) => Int -> WordL a b -> CodecM a b (X, Y)
encodeWordL'Cn i word = do
    x <- mkX . map Ob . catMaybes <$>
        mapM (C.maybeEncode fstLens) (S.toList $ fst word)
    y <- mkY <$> sequence
        [ (,) <$> encodeL i lb <*> pure pr
        | (lb, pr) <- (M.toList . unDist) (snd word) ]
    return (x, y)
  where
    encodeL j y = Lb . maybe j id <$> C.maybeEncode sndLens y

-- | Encode the word and update the codec.
encodeWord'Cu :: Ord a => Word a -> CodecM a b X
encodeWord'Cu word =
    mkX . map Ob <$> mapM (C.encode' fstLens) (S.toList word)

-- | Encode the word and do *not* update the codec.
encodeWord'Cn :: Ord a => Word a -> CodecM a b X
encodeWord'Cn word =
    mkX . map Ob . catMaybes <$>
        mapM (C.maybeEncode fstLens) (S.toList word)

-- | Encode the labeled sentence and update the codec.
encodeSentL'Cu :: (Ord a, Ord b) => SentL a b -> CodecM a b (Xs, Ys)
encodeSentL'Cu sent = do
    ps <- mapM encodeWordL'Cu sent
    return (V.fromList (map fst ps), V.fromList (map snd ps))

-- | Encode the labeled sentence and do *not* update the codec.
-- Substitute the default label for any label not present in the codec.
encodeSentL'Cn :: (Ord a, Ord b) => b -> SentL a b -> CodecM a b (Xs, Ys)
encodeSentL'Cn def sent = do
    i <- C.maybeEncode sndLens def >>= \mi -> case mi of
        Just _i -> return _i
        Nothing -> error "encodeWordL'Cn: default label not in the codec"
    ps <- mapM (encodeWordL'Cn i) sent
    return (V.fromList (map fst ps), V.fromList (map snd ps))

-- | Encode the labeled sentence with the given codec. Substitute the
-- default label for any label not present in the codec.
encodeSentL :: (Ord a, Ord b) => b -> Codec a b -> SentL a b -> (Xs, Ys)
encodeSentL def codec = C.evalCodec codec . encodeSentL'Cn def

-- | Encode the sentence and update the codec.
encodeSent'Cu :: Ord a => Sent a -> CodecM a b Xs
encodeSent'Cu = fmap V.fromList . mapM encodeWord'Cu

-- | Encode the sentence and do *not* update the codec.
encodeSent'Cn :: Ord a => Sent a -> CodecM a b Xs
encodeSent'Cn = fmap V.fromList . mapM encodeWord'Cn

-- | Encode the sentence using the given codec.
encodeSent :: Ord a => Codec a b -> Sent a -> Xs
encodeSent codec = C.evalCodec codec . encodeSent'Cn

-- | Create the codec on the basis of the labeled dataset, return the
-- resultant codec and the encoded dataset.
mkCodec :: (Ord a, Ord b) => [SentL a b] -> (Codec a b, [(Xs, Ys)])
mkCodec =
    let swap (x, y) = (y, x)
    in  swap
      . C.runCodec (C.empty, C.empty)
      . mapM encodeSentL'Cu

-- | Encode the labeled dataset using the codec. Substitute the default
-- label for any label not present in the codec.
encodeDataL :: (Ord a, Ord b) => b -> Codec a b -> [SentL a b] -> [(Xs, Ys)]
encodeDataL def codec = C.evalCodec codec . mapM (encodeSentL'Cn def)

-- | Encode the dataset with the codec.
encodeData :: Ord a => Codec a b -> [Sent a] -> [Xs]
encodeData codec = map (encodeSent codec)

-- | Decode the label.
decodeLabel :: Ord b => Codec a b -> Lb -> b
decodeLabel codec x = C.evalCodec codec $ C.decode sndLens (unLb x)

-- | Decode the sequence of labels.
decodeLabels :: Ord b => Codec a b -> [Lb] -> [b]
decodeLabels codec xs = C.evalCodec codec $
    sequence [C.decode sndLens (unLb x) | x <- xs]