Nouns are not correctly recognized in Dutch

I am analyzing some documents related to Healthcare.
An example of the text is: "rustig voorste oogsegment" (Dutch for "quiet anterior eye segment").

For some reason, the word "oogsegment" is not recognized as a noun. Instead, "oog" and "segment" are each recognized as separate nouns. When I search for "oogsegment", no documents are returned. In Content Analytics Studio, I can see that "oog" and "segment" are recognized, but "oogsegment" is not.

Re: Nouns are not correctly recognized in Dutch

2013-03-20T14:54:11Z

This is the accepted answer.

This is due to the decomposition paradigm, which decomposes "oogsegment" into "oog" and "segment".
You can add this type of compound word to a custom dictionary and use it in your model if you need to.

If you want to see compound words as single tokens in the rules editor, so that you can write complex rules on top of them, I would suggest creating a rule that annotates any sequence of nouns carrying the feature "isConnectedToPrevious" as a "CompoundNoun", and then using this new type in subsequent rules. This is a bit like creating a shallow parsing grammar for your model.
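
To make the logic of such a rule concrete, here is a minimal Python sketch. It is not ICA or Content Analytics Studio code: the `Token` and `CompoundNoun` classes and the `find_compound_nouns` function are hypothetical names invented for illustration, and the sketch assumes that "isConnectedToPrevious" is set on every noun that was split off from the token before it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Token:
    """Hypothetical stand-in for a token produced by the Dutch analyzer."""
    text: str
    pos: str                                # part of speech, e.g. "noun"
    is_connected_to_previous: bool = False  # assumed meaning of isConnectedToPrevious


@dataclass
class CompoundNoun:
    """Hypothetical annotation covering a run of connected nouns."""
    text: str
    parts: List[str]


def find_compound_nouns(tokens: List[Token]) -> List[CompoundNoun]:
    """Group each run of nouns joined by is_connected_to_previous."""
    compounds: List[CompoundNoun] = []
    run: List[Token] = []

    def flush() -> None:
        # Only a run of two or more nouns counts as a compound.
        if len(run) > 1:
            parts = [t.text for t in run]
            compounds.append(CompoundNoun("".join(parts), parts))
        run.clear()

    for tok in tokens:
        if tok.pos == "noun" and tok.is_connected_to_previous and run:
            run.append(tok)      # continues the current compound
        else:
            flush()              # close any run that was open
            if tok.pos == "noun":
                run.append(tok)  # a noun may start a new run
    flush()
    return compounds


# "rustig voorste oogsegment" after decomposition (POS tags are assumed here)
tokens = [
    Token("rustig", "adjective"),
    Token("voorste", "adjective"),
    Token("oog", "noun"),
    Token("segment", "noun", is_connected_to_previous=True),
]
compounds = find_compound_nouns(tokens)
print(compounds[0].text)  # oogsegment
```

In the rules editor the same idea would be expressed as a pattern over token annotations rather than a loop; the sketch only shows which tokens the "CompoundNoun" annotation is meant to cover.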

If you really want to turn off decomposition, note that this is advanced usage: you need to contact IBM through the support channel from which you bought the ICA license.

It is possible to turn off decomposition, but please be aware that there are side effects, mainly on part-of-speech tagging precision, which may degrade.