Why is the context index on the MV so large?

I’m not saying that context indexes on materialized views are always large, but sometimes they can be. Today we found an issue in a production database where the token information table (named DR$<index_name>$I) of the context index <index_name> was ~1.7 GB in size, while the MV table segment the index was created on was only ~8 MB. I decided to dig in and find out what the hell was going on there. After a short while I discovered that the index table held ~500 sets of token information for each row in the materialized view. The following demo illustrates how that happened:
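A minimal sketch of such a reproduction is given here. It is not the original listing, and it assumes a non-partitioned context index that is synchronized manually after each refresh. The object names (t_src, mv_demo, mv_demo_ctx) are hypothetical, and the session needs execute rights on CTX_DDL.

```sql
-- Hypothetical objects for illustration only: t_src, mv_demo, mv_demo_ctx.
CREATE TABLE t_src (id NUMBER PRIMARY KEY, txt VARCHAR2(200));

INSERT INTO t_src (id, txt)
SELECT level, 'some sample text ' || level
FROM   dual
CONNECT BY level <= 1000;
COMMIT;

-- Materialized view that is only ever refreshed completely.
CREATE MATERIALIZED VIEW mv_demo
  REFRESH COMPLETE ON DEMAND
  AS SELECT id, txt FROM t_src;

-- Context index on the MV table segment.
CREATE INDEX mv_demo_ctx ON mv_demo (txt)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- Baseline: number of rows in the token information table.
SELECT COUNT(*) AS token_rows FROM dr$mv_demo_ctx$i;

-- One complete refresh followed by an index sync.
BEGIN
  DBMS_MVIEW.REFRESH('MV_DEMO', method => 'C');
  CTX_DDL.SYNC_INDEX('MV_DEMO_CTX');
END;
/

-- Compare with the baseline; repeat the refresh and sync to watch the trend.
SELECT COUNT(*) AS token_rows FROM dr$mv_demo_ctx$i;
```

If the behaviour described above is present, the second count should be higher than the baseline even though the MV still holds the same number of rows, and it should keep growing with every further refresh and sync.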

I’m not sure it’s a bug (though it certainly looks like one). I reproduced it on 10.2.0.4 (HP-UX) and on 11.2.0.2 (Linux x86). Digging further showed that a complete refresh of the materialized view changes the rowids of all rows, and this is most likely why the token information table contains duplicated entries. In fact, the duplication factor should equal the number of distinct combinations of row data and rowid that a row has had in the MV table segment, unless the index has been rebuilt at some point. It looks as though the cleanup part of context index maintenance is not done properly when an MV is refreshed.

Here is a select you can use to identify the context indexes that are created on table segments of materialized views, and check the sizes of related segments to find out if they look reasonable:
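One way such a query could look is sketched below. It is an illustration rather than the original statement, and it assumes access to the DBA_* dictionary views, non-partitioned context indexes, and the default DR$<index_name>$I naming for the token information table.

```sql
-- Sketch only: lists context indexes built on MV container tables together
-- with the size of the MV segment and of the DR$<index_name>$I token table.
SELECT i.owner,
       i.index_name,
       m.mview_name,
       (SELECT ROUND(SUM(s.bytes) / 1024 / 1024, 1)
        FROM   dba_segments s
        WHERE  s.owner        = m.owner
        AND    s.segment_name = m.container_name) AS mv_mb,
       (SELECT ROUND(SUM(s.bytes) / 1024 / 1024, 1)
        FROM   dba_segments s
        WHERE  s.owner        = i.owner
        AND    s.segment_name = 'DR$' || i.index_name || '$I') AS token_table_mb
FROM   dba_indexes i
JOIN   dba_mviews  m
  ON   m.owner          = i.table_owner
 AND   m.container_name = i.table_name
WHERE  i.index_type = 'DOMAIN'
AND    i.ityp_owner = 'CTXSYS'
AND    i.ityp_name  = 'CONTEXT'
ORDER  BY token_table_mb DESC NULLS LAST;
```

A token table that is many times larger than the MV segment it indexes is a candidate for a closer look.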