What's happening is that PolyAnalyzer is performing more aggressive processing than RegexTokenizer. In addition to tokenizing, it converts all text to lower case so that searches are case-insensitive, and it uses a "stemming" algorithm to reduce related words to a common stem (senat, in this case).

PolyAnalyzer is actually multiple Analyzers wrapped up in a single package. In this case, it's three-in-one, since specifying a PolyAnalyzer with language => 'en' is equivalent to this snippet:
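
A sketch of that three-analyzer chain, assuming the Apache Lucy namespace (Lucy::Analysis::...); on a KinoSearch install the class prefix would differ, and the variable names are illustrative:

```perl
use strict;
use warnings;
use Lucy::Analysis::CaseFolder;
use Lucy::Analysis::RegexTokenizer;
use Lucy::Analysis::SnowballStemmer;
use Lucy::Analysis::PolyAnalyzer;

# Stage 1: fold everything to lower case.
my $case_folder = Lucy::Analysis::CaseFolder->new;

# Stage 2: split the text into tokens.
my $tokenizer = Lucy::Analysis::RegexTokenizer->new;

# Stage 3: reduce each token to its English stem.
my $stemmer = Lucy::Analysis::SnowballStemmer->new( language => 'en' );

# PolyAnalyzer runs the stages in the order listed,
# feeding the output of each one into the next.
my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new(
    analyzers => [ $case_folder, $tokenizer, $stemmer ],
);
```

Listing the analyzers explicitly makes it possible to drop a stage, for instance leaving out the stemmer when exact word forms matter.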

Sometimes you don't want an Analyzer at all. That was true for our "url" field because we didn't need it to be searchable, but it's also true for certain types of searchable fields. For instance, "category" fields are often set up to match exactly or not at all, as are fields like "last_name" (because you may not want to conflate results for "Humphrey" and "Humphries").

To specify that no analysis should be performed at all, use StringType:
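
A minimal sketch, again assuming the Apache Lucy namespace (Lucy::Plan::...); the "category" field name comes from the discussion above, and creating a fresh $schema here is only for illustration:

```perl
use strict;
use warnings;
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;

# Illustrative schema; in a real indexer this would be the
# schema object that already holds the other field specs.
my $schema = Lucy::Plan::Schema->new;

# StringType fields are not passed through any Analyzer.
my $type = Lucy::Plan::StringType->new;
$schema->spec_field( name => 'category', type => $type );
```

With this type the whole field value is indexed as a single term, so a query matches the field only when it supplies that exact value.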