Essentially you’re looking at sequences of length $Y$ over an alphabet of $X$ different symbols, and you want to count those that do not contain every symbol at least once; call these good sequences for short.
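
For small $X$ and $Y$ the definition can be checked directly by brute force. The following is only a minimal sketch; the function name `count_good_sequences` and the choice of the integers $0,\dots,X-1$ as the alphabet are assumptions made for illustration, not part of the question.

```python
from itertools import product


def count_good_sequences(x: int, y: int) -> int:
    """Count length-y sequences over x symbols that miss at least one symbol.

    Brute force over all x**y sequences, so only practical for small inputs.
    """
    symbols = range(x)  # assumed alphabet: 0, 1, ..., x-1
    return sum(
        1
        for seq in product(symbols, repeat=y)
        if len(set(seq)) < x  # fewer than x distinct symbols: some symbol is missing
    )


if __name__ == "__main__":
    print(count_good_sequences(3, 3))
```

With $X=3$ and $Y=3$ this prints 21: of the $3^3=27$ sequences, only the $3!=6$ permutations of the three symbols contain every symbol. Any closed formula for the number of good sequences should agree with this kind of enumeration on small cases.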

For each symbol there are $(X-1)^Y$ sequences that don’t contain that symbol, so as a first approximation there are $X(X-1)^Y$ good sequences. However, this counts many good sequences more than once: a sequence that misses both symbol $s_1$ and symbol $s_2$, for instance, gets counted once for missing $s_1$ and once for missing $s_2$. The method of inclusion-exclusion corrects for this kind of overcounting and yields the result that there are