Contents

The Learning menu provides access to a wide range of learning algorithms and related functions.

Missing Values Processing

As its name implies, this function lets you select the Missing Values Processing algorithm:

Static Completion

Dynamic Completion

Structural EM

Stratification

If the Target State of the Target Node is very weakly represented, as is typical in fraud detection, Stratification allows you to modify the probability distribution of the Target Node by means of internal weights associated with its states. This modified distribution makes it possible to learn a structurally more complex network. Once the structure is learned, the parameters, i.e. the Conditional Probability Tables, are estimated on the unstratified data.

In the following dialog box, you can specify the proportion of each state of the Target Node. By default, the marginal distribution of the Target Node is shown. You can now use the sliders to set the proportions to the desired levels, or enter the percentages directly.
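BayesiaLab's internal weighting mechanism is not exposed, but the idea behind setting these proportions can be sketched as a simple re-weighting: each case receives a weight equal to the desired proportion of its target state divided by the observed proportion. The function and variable names below are illustrative, not part of BayesiaLab.

```python
from collections import Counter

def stratification_weights(target, desired):
    """Per-case weights that make the weighted distribution of the
    target variable match the desired state proportions."""
    n = len(target)
    observed = {state: count / n for state, count in Counter(target).items()}
    return [desired[state] / observed[state] for state in target]

# Toy fraud data: 2% "fraud", re-weighted to a 50/50 split.
y = ["fraud"] * 2 + ["ok"] * 98
w = stratification_weights(y, {"fraud": 0.5, "ok": 0.5})

# Weighted share of "fraud" is now 0.5 instead of 0.02.
fraud_share = sum(wi for wi, s in zip(w, y) if s == "fraud") / sum(w)
```

With such weights, rare target states contribute as much to structural learning as frequent ones, which is what lets the algorithm find a more complex structure around the Target Node.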

Once the Stratification is set, the corresponding icon is displayed in the status bar. You can remove the Stratification by right-clicking this icon and selecting Remove Stratification from the Contextual Menu.

In that case, an equivalent number has to be specified to indicate how many cases have been used for the construction of that network (by learning or expertise). This number is automatically set when the network is learned from a database. It corresponds to the sum of the weights of the database's learning set if weights are associated, or to the number of examples in the learning set. A virtual database representing that knowledge is then added to the current database in order to take this a priori knowledge into account. A new icon is then added in the task bar.
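The effect of this virtual database can be sketched as a standard equivalent-sample-size prior: the equivalent number of cases is distributed over the states according to the a priori probabilities and added to the observed counts before the Conditional Probability Table is normalized. This is a minimal illustration of the principle, not BayesiaLab's actual implementation; all names are hypothetical.

```python
def cpt_with_prior(counts, prior_probs, equivalent_size):
    """Blend observed state counts with a virtual database of
    `equivalent_size` cases distributed according to the prior."""
    total = sum(counts) + equivalent_size
    return [(c + p * equivalent_size) / total
            for c, p in zip(counts, prior_probs)]

# 8 observed cases split 6/2, plus a virtual database of 4 cases at 50/50:
# the estimate is pulled from 0.75 toward 0.5.
probs = cpt_with_prior([6, 2], [0.5, 0.5], 4)
```

A larger equivalent number gives the a priori knowledge more weight relative to the observed data, which is why it should reflect how many cases the prior network was built from.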

All the learning algorithms in BayesiaLab are based on the minimization of the MDL score (Minimum Description Length). This score takes into account both the adequacy of the Bayesian network to the data and the structural complexity of the graph. The score values are available in the console during learning, and the score of a given network can always be computed by updating its probabilities. The excluded nodes are not taken into account by the learning algorithms; the filtered states are taken into account.

A compression rate is also available in the console. This indicator measures the data compression obtained by the network with respect to the previous network (usually the unconnected network). This rate, which corresponds to the "adequacy of the network to the data" part of the MDL score, thus gives an indication not only of the probabilistic links present in the network, but also of the strength of these links. For example, with a database containing two binary variables that are strictly identical, the corresponding network will link these variables and describe in the conditional probability table that the value of the second variable is deterministically defined by the first. The compression rate will then be equal to 50%.
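The 50% figure in the example above can be checked with a small sketch of the data-adequacy term alone (the structural-complexity term of the MDL score is omitted here for brevity). Encoding each case costs its negative log2-probability under the network; with two identical binary variables, the unconnected network pays 1 bit per variable, while the connected network pays 1 bit for the first variable and 0 bits for the second, which is deterministic given the first.

```python
import math

def data_description_length(cases, log2_prob):
    """Data part of the MDL score: bits needed to encode the cases,
    i.e. the negative log2-likelihood under the model."""
    return -sum(log2_prob(c) for c in cases)

# Two strictly identical binary variables, uniform marginal.
cases = [(0, 0), (1, 1)] * 50

# Unconnected network: each variable coded independently at 1 bit.
unconnected = data_description_length(cases, lambda c: 2 * math.log2(0.5))

# Connected network: first variable at 1 bit, second deterministic (0 bits).
connected = data_description_length(cases, lambda c: math.log2(0.5))

compression_rate = 1 - connected / unconnected  # 100 vs 200 bits -> 50%
```

Halving the encoding cost of the data is exactly the 50% compression rate the documentation describes for this deterministic two-variable case.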