Well, there is also a third option: you could first map the values to the numbers you want to use instead (operator "Map") and then parse them afterwards (operator "Parse Numbers").
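To make the idea concrete outside of RapidMiner, here is a minimal Python sketch of the same two steps. The names `LEVELS`, `map_values` and `parse_numbers` are made up for this illustration and are not RapidMiner APIs; the point is only that you choose the numbers yourself instead of relying on an internal mapping.

```python
# Hypothetical explicit mapping chosen by the analyst
# (this is NOT RapidMiner's internal mapping).
LEVELS = {"low": "1", "medium": "2", "high": "3"}


def map_values(values, mapping):
    """Step 1 (like "Map"): replace each nominal value by the string chosen for it."""
    return [mapping[v] for v in values]


def parse_numbers(values):
    """Step 2 (like "Parse Numbers"): turn the numeric strings into real numbers."""
    return [float(v) for v in values]


column = ["medium", "low", "high", "low"]
numeric = parse_numbers(map_values(column, LEVELS))
print(numeric)  # [2.0, 1.0, 3.0, 1.0]
```

Because the mapping is written down explicitly, applying it to a second data set gives the same numbers for the same values.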

Whether using the nominal to numeric operator is a problem depends a bit on what you are doing and on which data. Indeed, the operator simply produces numbers based on the internal mapping used by RapidMiner. If you produce those numbers for two data sets with different internal mappings, the same nominal value can end up with different numbers in each set. You can deal with this by ensuring that the same internal mapping is used for all data sets.
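The effect can be sketched in a few lines of Python. This is only a simulation: it assumes, purely for illustration, that an internal mapping assigns indices in order of first appearance, and `build_mapping` is a hypothetical helper, not a RapidMiner function.

```python
def build_mapping(values):
    """Assign an index to each distinct value in order of first appearance.

    Assumption for illustration only: this stands in for an 'internal mapping'.
    """
    mapping = {}
    for v in values:
        if v not in mapping:
            mapping[v] = len(mapping)
    return mapping


train = ["medium", "low", "high"]
test = ["high", "medium", "low"]

train_map = build_mapping(train)  # {'medium': 0, 'low': 1, 'high': 2}
test_map = build_mapping(test)    # {'high': 0, 'medium': 1, 'low': 2}

# The same nominal value gets different numbers in the two data sets.
print(train_map["high"], test_map["high"])  # 2 0

# Fix: reuse the mapping built on the first data set for the second one.
test_codes = [train_map[v] for v in test]
print(test_codes)  # [2, 0, 1]
```

Reusing one mapping for all data sets makes the numbers comparable, although they still carry no meaning of their own.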

But even then the internal mappings don't have any real meaning. For example, if you have the three nominal values "low", "medium", and "high", you probably would not like to end up with the numbers "2", "1", and "3" but would prefer at least something like "1", "2", and "3" instead. But even this might become problematic: is the step from "medium" to "high" really exactly as large as the step from "low" to "medium"? Who knows?
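Both concerns can be checked mechanically. The sketch below uses the numbers from the example above; `preserves_order` and `gaps` are hypothetical helper names introduced just for this illustration.

```python
# Codes as they might come out of an arbitrary internal mapping (from the example).
internal = {"low": 2, "medium": 1, "high": 3}
# Codes that at least respect the natural order.
ordered = {"low": 1, "medium": 2, "high": 3}

# The order the analyst actually has in mind.
ranking = ["low", "medium", "high"]


def preserves_order(codes, ranking):
    """True if the codes increase strictly along the intended ranking."""
    nums = [codes[v] for v in ranking]
    return all(a < b for a, b in zip(nums, nums[1:]))


def gaps(codes, ranking):
    """Numeric distance between neighbouring levels of the ranking."""
    nums = [codes[v] for v in ranking]
    return [b - a for a, b in zip(nums, nums[1:])]


print(preserves_order(internal, ranking))  # False
print(preserves_order(ordered, ranking))   # True
print(gaps(ordered, ranking))              # [1, 1]
```

The last line shows the hidden assumption: with "1", "2", "3" every step is treated as equally large, and any learner that uses distances will take that literally.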

For both reasons (especially the second one, since the first one can be dealt with if you are cautious) I would agree with your boss that method 2 should usually be preferred. If memory is getting low, you could try to create a view instead, which calculates the values on the fly rather than calculating and storing them up front. If this still does not work, you could use method 3, which I introduced above, so that at least both problems discussed above become smaller.
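The memory argument for a view can be illustrated generically in Python. This is not how RapidMiner implements views; `LazyColumnView` is a hypothetical class, and the `indicator` transform (one 0/1 flag per level) is only a placeholder for whatever transformation you need, chosen here as an assumption.

```python
class LazyColumnView:
    """Wraps a stored column and applies `transform` only when a value is read.

    Nothing is precomputed, so the extra memory cost is independent of the
    number of rows.
    """

    def __init__(self, source, transform):
        self._source = source
        self._transform = transform

    def __len__(self):
        return len(self._source)

    def __getitem__(self, index):
        return self._transform(self._source[index])

    def __iter__(self):
        return (self._transform(v) for v in self._source)


LEVELS = ["low", "medium", "high"]


def indicator(value):
    """Placeholder transform (an assumption): one 0/1 flag per level."""
    return tuple(1 if value == level else 0 for level in LEVELS)


column = ["medium", "low", "high", "low"]
view = LazyColumnView(column, indicator)

print(view[0])     # (0, 1, 0)
print(list(view))  # [(0, 1, 0), (1, 0, 0), (0, 0, 1), (1, 0, 0)]
```

The trade-off is the usual one: the view saves memory but recomputes the values on every access.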

Cheers, Ingo

