Adaboost.M1+J48

I am using AdaBoost.M1 + J48 for my master's thesis, which is concerned with reducing memory consumption. While the algorithm runs, it may create several trees (for example, three trees with 3 iterations).

I wonder whether, after the algorithm has finished, the memory occupied by the decision tree (J48) model is equal to that of the last decision tree or to that of the first classifier? (I take the number of nodes as the measure of memory consumption.)
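To make the question concrete, here is a minimal sketch of the node-count bookkeeping. It rests on one assumption stated plainly: AdaBoost.M1 predicts by a weighted vote over the base classifiers from all iterations, so the finished model has to keep every tree, not only the first or the last one. The `Node` class and the three toy trees are hypothetical and are not taken from Weka.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A toy decision-tree node; a leaf has no children."""
    children: List["Node"] = field(default_factory=list)


def count_nodes(node: Node) -> int:
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.children)


def ensemble_size(trees: List[Node]) -> int:
    """Total nodes held by an ensemble that stores every tree."""
    return sum(count_nodes(tree) for tree in trees)


# Three hypothetical trees, one per boosting iteration.
tree_1 = Node([Node(), Node()])
tree_2 = Node([Node([Node(), Node()]), Node()])
tree_3 = Node([Node(), Node(), Node()])
trees = [tree_1, tree_2, tree_3]

per_tree = [count_nodes(t) for t in trees]
total = ensemble_size(trees)
print("nodes per tree:", per_tree)
print("nodes in the whole ensemble:", total)
```

Under that assumption, the figure to report for the boosted model would be the sum over all iterations rather than the size of any single tree.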

Supervised binning of categorical variables with a large number of values

I have a classification problem where a few of the predictors have a very large number of values (levels), which is problematic for various models in Weka. I am debating whether "binning" such a categorical variable is a viable approach. I see the filter for supervised discretization of continuous variables. I suppose I am looking for a similar filter, but one that does supervised binning of categorical variables, so that the number of levels is brought down to a more manageable size for subsequent learning.

Is it possible to achieve something like this using the variety of filters and models available in Weka?
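To illustrate what "supervised binning" of a categorical variable could mean, here is a minimal sketch of one simple scheme: rank the levels by how often they occur with the class of interest, then cut the ranked list into a few groups. This is an assumption about the technique, not a description of any existing Weka filter, and the function name `supervised_bins` and the toy data are hypothetical.

```python
from collections import defaultdict


def supervised_bins(values, labels, positive, n_bins):
    """Map each categorical level to a bin index based on its positive-class rate."""
    counts = defaultdict(lambda: [0, 0])  # level -> [positives, total]
    for value, label in zip(values, labels):
        counts[value][0] += int(label == positive)
        counts[value][1] += 1
    # Rank levels by positive rate; ties are broken by level name for determinism.
    ranked = sorted(counts, key=lambda lv: (counts[lv][0] / counts[lv][1], lv))
    # Cut the ranked list into n_bins groups of nearly equal size.
    return {lv: i * n_bins // len(ranked) for i, lv in enumerate(ranked)}


# Toy data: four levels reduced to two bins.
values = ["a", "a", "b", "b", "c", "c", "d", "d", "d"]
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "no"]
mapping = supervised_bins(values, labels, positive="yes", n_bins=2)
binned = [mapping[v] for v in values]
print(mapping)
```

Because the bins are derived from the class labels, they should be computed on the training data only (inside each cross-validation fold) and then applied to the test data; otherwise the evaluation would be optimistic.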

JRIP rules and n-fold Crossvalidation

spaniard81 <raveesh.tmh <at> gmail.com>
2015-07-29 13:54:35 GMT

Hi,
When using JRip with 10-fold cross-validation, a new set of rules is learned
and tested in each iteration. Thus a total of 10 rule-sets are learned.
However, the Weka Explorer shows just one rule-set.
Where does this rule-set come from? Is it the best one among the 10, or a
rule-set learned from an 80-20 split of the entire dataset (in fact, the
rule-set is printed in the Explorer even before the 10 folds are completed)?
If the latter is true, then what does the performance summary report: i) the
average of the figures from the 10 iterations of cross-validation (this would
not make sense for the printed rule-set), or ii) the figures of the 10 folds
evaluated on just this one rule-set that was learned from the entire data?
I am tweaking the JRip code. I want to learn rules that achieve high
precision without worrying about recall, i.e. I am only interested in
learning rules that precisely describe the class of interest. I have not yet
figured out the solution, as it seems the pruning and optimization steps
already rely on accuracy. That is work in progress. But I do not know how to
print on my console the rule-set that I see in the Weka Explorer. What I see
on my console is the rules from each iteration of the cross-validation.
Any help clarifying where the rule-set in the Weka Explorer comes from, and
how to obtain it through the Java API, would be appreciated.
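For reference, the usual reading of this setup can be sketched as follows: the k models built during cross-validation are used only to produce the performance figures (pooled over the held-out folds), while the single model that gets printed is a separate one trained on all of the data. The sketch below is plain Python with a toy majority-class learner standing in for the rule learner; `train_majority` and `cross_validate` are hypothetical names, and none of this is Weka's own code.

```python
from collections import Counter


def train_majority(rows):
    """Toy learner: the 'model' is simply the most common label."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def cross_validate(rows, k):
    """Return (fold_models, pooled_accuracy) for k-fold cross-validation."""
    fold_models, correct = [], 0
    for fold in range(k):
        test = rows[fold::k]  # every k-th row is held out
        train = [r for i, r in enumerate(rows) if i % k != fold]
        model = train_majority(train)
        fold_models.append(model)
        # Predictions are pooled over all held-out rows, not averaged per fold.
        correct += sum(1 for _, label in test if label == model)
    return fold_models, correct / len(rows)


rows = [(i, "pos" if i % 3 else "neg") for i in range(12)]

# k models that exist only to estimate performance ...
fold_models, accuracy = cross_validate(rows, k=4)
# ... plus one more model, trained on everything, which is the one to print.
final_model = train_majority(rows)

print("models built during cross-validation:", len(fold_models))
print("pooled accuracy:", accuracy)
print("model to report:", final_model)
```

In this reading, the reported figures estimate how well the learning procedure generalises; they are not measurements of the printed model on the folds.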
--
View this message in context: http://weka.8497.n7.nabble.com/JRIP-rules-and-n-fold-Crossvalidation-tp35256.html
Sent from the WEKA mailing list archive at Nabble.com.
_______________________________________________
Wekalist mailing list
Send posts to: Wekalist <at> list.waikato.ac.nz
List info and subscription status: http://list.waikato.ac.nz/mailman/listinfo/wekalist
List etiquette: http://www.cs.waikato.ac.nz/~ml/weka/mailinglist_etiquette.html

Similar Phrase Clustering

Dear All,
I am working on a project where I want to cluster (unsupervised) a large
number of English phrases (on the order of 250k). I do not know the number
of clusters for these phrases. So far, I have been working with Weka's DBSCAN
clusterer (tuning the epsilon parameter). I have read in the Weka data
mining book that hierarchical clustering can also be used, provided I can
specify some termination condition for the clustering. But I cannot find such
a parameter in the Weka GUI (neither Explorer nor KnowledgeFlow). Please
advise which algorithms I should use to achieve the best results. Thank you
for the awesome work building and sustaining WEKA :)
Best Regards
Baldie
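As an illustration of the kind of termination condition meant above, here is a minimal sketch of single-link agglomerative clustering that stops merging once the two closest clusters are farther apart than a threshold, so the number of clusters does not have to be given in advance. The distance (Jaccard distance on word sets), the function names and the example phrases are all assumptions made for the sketch. It is not Weka code, and this naive version is far too slow for 250k phrases.

```python
def jaccard_distance(a, b):
    """1 - |shared words| / |all words|; assumes both phrases are non-empty."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return 1.0 - len(words_a & words_b) / len(words_a | words_b)


def agglomerate(phrases, max_distance):
    """Single-link clustering that stops when the closest pair exceeds max_distance."""
    clusters = [[p] for p in phrases]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(jaccard_distance(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > max_distance:
            break  # termination condition: nothing close enough left to merge
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters


phrases = ["red apple pie", "green apple pie",
           "fast red car", "fast blue car", "open source"]
clusters = agglomerate(phrases, max_distance=0.6)
for cluster in clusters:
    print(cluster)
```

The threshold plays a role similar to epsilon in DBSCAN: a smaller value gives more, tighter clusters, and a larger value gives fewer, looser ones.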
--
View this message in context: http://weka.8497.n7.nabble.com/Similar-Phrase-Clustering-tp35241.html