I'm trying to do hierarchical classification of documents and I believe the 'hierarchical classification' operator is the way to go as recommended here in the forum. My problem is that I couldn't figure out how to use this operator and what to expect as an output. I couldn't find any example of use in the forum either. Can somebody post a sample process using this operator?

Here's an example of top down clustering. It uses the top down clustering operator, which itself contains another clustering operator; in this case expectation maximization with k = 2. From observation, it works something like this. The outer operator invokes the inner one, which splits the example set into k = 2 clusters. The outer operator then repeats this on the examples from each of these 2 clusters, and the inner operator duly splits each of them into 2 more clusters. This repeats up to the number of levels defined in the max depth parameter of the top down clustering operator. I believe the flatten clusters operator is what is needed to extract a particular clustering, and to prove this to myself I added a map labels operator with performance to see how well the clusters map to the ground truth.
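To make the recursion described above concrete, here is a minimal Python sketch. It is not the RapidMiner process and not how the operators are implemented: a tiny 2-means split on one-dimensional values stands in for expectation maximization, and the names split_in_two, top_down and flatten, as well as the data, are made up for illustration.

```python
from statistics import mean


def split_in_two(values):
    """Split 1-D values into two groups with a small 2-means loop.

    Stand-in for the inner clustering operator (the post uses expectation
    maximization with k = 2; 2-means is a deliberate simplification).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return [values]  # all values identical, nothing to split
    centers = [lo, hi]
    for _ in range(20):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        new_centers = [mean(g) for g in groups]
        if new_centers == centers:
            break
        centers = new_centers
    return groups


def top_down(values, max_depth, depth=0):
    """Outer loop: keep splitting each cluster until max_depth is reached."""
    if depth == max_depth or len(values) < 2:
        return values  # leaf cluster
    parts = split_in_two(values)
    if len(parts) == 1:
        return values
    return {"children": [top_down(p, max_depth, depth + 1) for p in parts]}


def flatten(tree):
    """Collect the leaf clusters of the tree into one flat clustering."""
    if isinstance(tree, list):
        return [tree]
    leaves = []
    for child in tree["children"]:
        leaves.extend(flatten(child))
    return leaves


# Made-up example values, purely for illustration.
data = [1.0, 1.2, 0.9, 5.0, 5.3, 9.8, 10.0, 10.4]
clusters = flatten(top_down(data, max_depth=2))
for i, cluster in enumerate(clusters):
    print(f"cluster_{i}: {cluster}")
```

With max_depth = 2 the data can be split at most twice, so the flat result contains at most 4 clusters.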

Thanks Andrew for the reply. But I'm looking for hierarchical classification, particularly its operator. I have hierarchical labels which I can enter in the operator's table. But other than that, I have no idea how to use it (what input it expects and what output it produces).

Good point - I didn't pay attention to the question and substituted clustering for classification.

I'm not familiar with hierarchical classification in the context of machine learning, but I'm guessing it has something to do with dividing example sets into smaller and smaller pieces based on a rule at each stage. That's sort of what the clustering example is doing, with the proviso that the rule is not controllable because it is the same clustering algorithm at every stage. It also produces a prediction, so it is usable as a classifier, again with one proviso: the label results are not derived from the training data, so there would also be ambiguity about the true identity of the clusters.

I created a hierarchical classification a couple of years ago similar to what you described: modelling and applying a different set of labels to each divided example set. The sets of labels are hierarchical. But since there is this 'Hierarchical Classification' operator, I thought it could make the process simpler.

Anyways, if anybody has a sample process, please post it, or maybe a hint on how it works.

The following process performs a hierarchical classification on Iris. You have to define the hierarchy in tabular form, starting from a "root" node. Please have a look at the process below and come back with any questions you have.
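For readers without RapidMiner at hand, the idea of a hierarchy table starting from "root" can be sketched in Python. This is only an illustration under assumptions, not the operator's actual implementation: the inner node name "group_a", the nearest-centroid learner, and the one-dimensional training values are all made up, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hierarchy as (parent, child) rows, starting from "root".
# "group_a" is a hypothetical inner node invented for this sketch.
HIERARCHY = [
    ("root", "Iris-setosa"),
    ("root", "group_a"),
    ("group_a", "Iris-versicolor"),
    ("group_a", "Iris-virginica"),
]

children = defaultdict(list)
for parent, child in HIERARCHY:
    children[parent].append(child)


def leaves_under(node):
    """Return the set of leaf labels below a node (or the node itself)."""
    if node not in children:
        return {node}
    found = set()
    for c in children[node]:
        found |= leaves_under(c)
    return found


def train(examples):
    """Fit one small model per inner node.

    Each model is a nearest-centroid classifier over that node's children,
    trained on the examples whose leaf label lies below each child.
    """
    models = {}
    for node, kids in list(children.items()):
        centroids = {}
        for kid in kids:
            covered = leaves_under(kid)
            centroids[kid] = mean(x for x, y in examples if y in covered)
        models[node] = centroids
    return models


def predict(models, x):
    """Walk down from root, choosing the nearest child at each level."""
    node, path = "root", []
    while node in models:
        centroids = models[node]
        node = min(centroids, key=lambda k: abs(x - centroids[k]))
        path.append(node)
    return path


# Made-up one-feature training values, not the real Iris data.
train_set = [
    (1.4, "Iris-setosa"), (1.5, "Iris-setosa"),
    (4.2, "Iris-versicolor"), (4.5, "Iris-versicolor"),
    (5.8, "Iris-virginica"), (6.0, "Iris-virginica"),
]
models = train(train_set)
print(predict(models, 1.6))
print(predict(models, 5.7))
```

Each prediction is returned as the path of nodes visited below "root", so the last element is the leaf label and the earlier elements are its parent groups.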

Thanks Marius. It works, but if I apply the model to an example set, the result does not show the hierarchical labels, just the original labels (iris-*). Is there a way to make the prediction include the parent labels too, for example as another column?
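One possible workaround, sketched here in Python rather than as a RapidMiner process, is to look the parent labels up from the same hierarchy table after the model has been applied. Whether the operator itself can output parent labels is not something this sketch answers. The inner node name "group_a", the column names, and the sample predictions are assumptions made for illustration.

```python
# Hierarchy as (parent, child) rows, starting from "root".
# "group_a" is a hypothetical inner node invented for this sketch.
HIERARCHY = [
    ("root", "Iris-setosa"),
    ("root", "group_a"),
    ("group_a", "Iris-versicolor"),
    ("group_a", "Iris-virginica"),
]

# Reverse lookup: child -> parent.
parent_of = {child: parent for parent, child in HIERARCHY}


def path_from_root(label):
    """Return the labels from just below root down to the given label."""
    path = [label]
    while parent_of.get(path[0], "root") != "root":
        path.insert(0, parent_of[path[0]])
    return path


# Made-up leaf predictions standing in for the model's output column.
predictions = ["Iris-setosa", "Iris-virginica", "Iris-versicolor"]

# Add the parent label and the full path as extra columns.
rows = [
    {
        "prediction": p,
        "parent": parent_of[p],
        "path": "/".join(path_from_root(p)),
    }
    for p in predictions
]
for row in rows:
    print(row)
```

The same lookup could be applied to any column of leaf labels, as long as every label appears as a child in the hierarchy table.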