Hi Christian, the mailing lists of the Special Interest Groups are listed on the SourceForge page of RapidMiner. We are still working on a page that explains the topics and aims of each group in more detail, but the lists are already online. You can join there directly, or write me an email and I will put you on the list.

The idea with a third party forum is good. I will add one.

I've never been to Kaiserslautern. It seems like a good idea to change that now. I will contact you by mail.

A similar (though not identical) and very effective feature is offered by IBM SPSS Modeler, for instance, as an automatic modeling operator: several models are built automatically and the best of them are proposed to the user. Moreover, the models may be combined into a kind of voting model, which on some occasions performs better than the individual models. See a demo here.

Whoa, but that's quite a difference: in SPSS all models are actually tested (which can also be done with the PaREn extension during the evaluation step, and is also possible with a simple process in core RapidMiner, as Simon has pointed out).

The cool thing about the PaREn extension is that it predicts which model is probably the best even without any testing. This is the first time I have actually seen this meta-learning approach really working, and this is probably the reason why we at Rapid-I and many others love it. Kudos to Christian and the team at the DFKI for this great extension!

I also have a suggestion: it would be great if a k-fold cross validation, or even a single split, could be selected instead of the rather time-consuming LOO (leave-one-out) evaluation.
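To make the cost difference concrete, here is a minimal sketch in plain Python. It is not the PaREn or RapidMiner implementation; `k_fold_indices` is a hypothetical helper that only counts how many models would have to be trained. Leave-one-out is the special case where k equals the number of examples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds.

    Hypothetical helper for illustration only; no shuffling or stratification.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


n = 150  # number of examples in the Iris dataset
loo_fits = sum(1 for _ in k_fold_indices(n, n))        # leave-one-out: one fit per example
ten_fold_fits = sum(1 for _ in k_fold_indices(n, 10))  # 10-fold: ten fits
print(loo_fits, ten_fold_fits)  # prints "150 10"
```

Each fold means one training run per candidate model, so on Iris a 10-fold evaluation needs 10 runs where LOO needs 150.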

Cheers,
Ingo

I am getting an error when I perform Step 3 of the Automatic System Construction with the Iris dataset. It states, "No parameters were specified which should be optimized". The wizard closes after I click the OK button. I have followed the instructions for this extension exactly. Has anyone else had the same problem?

10-27-10: At the time of my post I thought I had the latest update for PaREn. Since installing the latest version, I no longer experience this problem.

@ Ingo: Note that SPSS Modeler has fewer, but very carefully chosen and highly optimised, algorithms (where that is possible; for example, take C5 as opposed to the C4.5 implemented in open source software). Therefore one can afford to create models for most classification algorithms available in SPSS, and to retain the best ones, in a reasonable amount of time.

Factually speaking (and, by the way, as a fan of both RM and SPSS Modeler), there are obviously similarities and differences in the features we are discussing, and I am afraid the differences show that, for now, SPSS Modeler is far ahead: the running time needed to build the best models, the reliability and performance of the models (see my previous posting above regarding unexpectedly suboptimal optimised PaREn models), the combination of the best models into one overall model, etc. In addition, the estimated accuracies in PaREn were quite far from the actual accuracies in most of my experiments, although the idea is interesting.

@ Christian et al.: I have an additional suggestion, which occurred to me when posting questions earlier in this topic. ROC analysis can be added to the search for the model with the best accuracy when the output/label attribute is binominal. More precisely, after finding the best parameters for a learner on a given dataset, one can also take the optimised threshold from the ROC curve (as opposed to using the default threshold of 0.5), which gives the best accuracy on the data the curve was built from.

However, this suggestion may be more useful to consider once the ROC analysis implemented in RapidMiner has been revised, as it is still unreliable in this package (i.e. the AUC calculation needs corrections, as I have shown on the forum http://rapid-i.com/rapidforum/index.php?PHPSESSID=18d6261d2d63b2ca946477f03c2552bc&topic=2237.0, and the Find Threshold operator does not find the best threshold as expected but provides suboptimal solutions - I emailed a complete report to the RM development team, with relevant processes illustrating this).
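To illustrate the threshold suggestion, here is a minimal sketch in plain Python, independent of RapidMiner and of the Find Threshold operator. The function names, scores and labels are made up for illustration. Every distinct confidence score is tried as a cut-off, which corresponds to walking along the points of the ROC curve, and the cut-off with the highest accuracy is kept. Note that the selected threshold is only optimal on the data it was chosen from; it should be picked on validation data and checked on a separate test set.

```python
def accuracy_at(threshold, scores, labels):
    """Accuracy when predicting the positive class for score >= threshold."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_accuracy_threshold(scores, labels):
    """Return (threshold, accuracy) with the highest accuracy.

    Candidates are the distinct scores plus +inf ("predict everything negative").
    Ties are resolved in favour of the lowest threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_t, best_acc = None, -1.0
    for t in candidates:
        acc = accuracy_at(t, scores, labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Made-up confidences for the positive class and true labels (1 = positive).
scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.80]
labels = [0, 0, 1, 1, 1, 1]

default_acc = accuracy_at(0.5, scores, labels)               # 4 of 6 correct
best_t, best_acc = best_accuracy_threshold(scores, labels)   # threshold 0.40, 6 of 6 correct
print(default_acc, best_t, best_acc)
```

In this toy example the default threshold of 0.5 misclassifies the two positives scored 0.40 and 0.45, while a threshold of 0.40 classifies all six examples correctly.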

PaREn is an excellent initiative towards RM's enrichment. However, the extension needs to be more practical and more accurate. It requires a relatively long processing time, and the models are not as optimised as expected - see the postings in this thread explaining that both an ad hoc model created with no particular settings and a trivial model that blindly predicts the most frequent class are more accurate than the optimised, time-consuming PaREn model. Improvement here would be very beneficial, and indeed necessary, for the extension. Other users of the extension may wish to generate ad hoc models in addition to the PaREn models and compare their accuracies - this would be useful feedback for the development team.
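For anyone who wants to run such a comparison outside RapidMiner, here is a minimal sketch in plain Python. The labels and predictions are invented for illustration and are not results from PaREn; `majority_baseline_accuracy` is a hypothetical helper that scores the trivial model which always predicts the most frequent training class.

```python
from collections import Counter


def accuracy(predictions, true_labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, true_labels)) / len(true_labels)


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent class seen in training."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return accuracy([majority] * len(test_labels), test_labels)


# Invented data: class "a" dominates the training set.
train_labels = ["a"] * 7 + ["b"] * 3
test_labels = ["a", "a", "b", "a", "b"]
model_predictions = ["b", "a", "b", "b", "a"]  # output of some hypothetical model

baseline_acc = majority_baseline_accuracy(train_labels, test_labels)  # 3 of 5 correct
model_acc = accuracy(model_predictions, test_labels)                  # 2 of 5 correct
beats_baseline = model_acc > baseline_acc
print(baseline_acc, model_acc, beats_baseline)  # prints "0.6 0.4 False"
```

A model that does not beat this baseline on held-out data has not learned anything useful, no matter how long its optimisation took.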

I hope the feedback and suggestions in this thread are useful to PaREn, as part of the community's contribution to improving the open source software. Good luck!

Does the PaREn extension have anything to do with the Amine platform? The formalism they use is said to be compatible with parallel processing (lambda calculus mixed into ontologies): http://amine-platform.sourceforge.net/