You cannot run an efficient algorithm on highly disorganised data and expect good results. Data organisation is as important as any data processing algorithm.

Step 1 should be organising your data. As I understand it, when you parse any plain-text data, keeping the following two things in mind will help your parsing algorithm work effectively.

Delimiter - A character used as a boundary to separate two consecutive values, e.g. a comma (,)

For example: Alabama, Montgomery, Louisiana, Baton Rouge

Field Qualifier (Optional) - A character that encloses each value, so that a value containing special characters such as a space or the delimiter itself is still read as one value and you avoid unintended results. For example, Baton Rouge doesn't end up being split into Baton and Rouge.

With a double quote (") as the qualifier you can have

"Alabama", "Montgomery", "Louisiana", "Baton Rouge"

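Once you have your data in that format, you can split each line on the delimiter with String's split method, strip the qualifiers, and proceed from there. Keep in mind that a plain split only works as long as the values themselves do not contain the delimiter.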
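As a rough illustration of that step, here is a minimal sketch in Java. The class name `QualifiedSplit` and the method `parse` are made up for this example, and it assumes the qualifier is a double quote and that no value contains a comma. If your values can contain the delimiter, a plain split is not enough and you would need a proper CSV parser instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class QualifiedSplit {

    // Splits a line on commas, then trims whitespace and removes
    // one pair of surrounding double quotes from each value.
    // Assumption: values never contain a comma themselves.
    static List<String> parse(String line) {
        List<String> values = new ArrayList<>();
        for (String token : line.split(",")) {
            String value = token.trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "\"Alabama\", \"Montgomery\", \"Louisiana\", \"Baton Rouge\"";
        // prints [Alabama, Montgomery, Louisiana, Baton Rouge]
        System.out.println(parse(line));
    }
}
```

Note that Baton Rouge stays a single value here because the split happens on the comma, not on the space.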

One word of caution: when you use a String as a key, Alabama and alabama are treated as two separate keys. It is better to store the keys in all uppercase or all lowercase so that each key has a single, unique identity.
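A small sketch of that normalisation, again in Java. The names `StateCapitals`, `normalise` and `build` are hypothetical, and the sketch assumes the values come in pairs of state followed by capital, which is only my reading of the sample data above.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of any library.
class StateCapitals {

    // Lower-cases the key so that "Alabama" and "alabama" map to the same entry.
    // Locale.ROOT keeps the result independent of the machine's default locale.
    static String normalise(String key) {
        return key.trim().toLowerCase(Locale.ROOT);
    }

    // Assumption: values alternate between state and capital.
    static Map<String, String> build(String[] values) {
        Map<String, String> capitals = new HashMap<>();
        for (int i = 0; i + 1 < values.length; i += 2) {
            capitals.put(normalise(values[i]), values[i + 1].trim());
        }
        return capitals;
    }

    public static void main(String[] args) {
        String[] values = {"Alabama", "Montgomery", "Louisiana", "Baton Rouge"};
        Map<String, String> capitals = build(values);
        // Normalise the lookup key the same way as the stored key.
        System.out.println(capitals.get(normalise("ALABAMA")));   // prints Montgomery
        System.out.println(capitals.get(normalise("louisiana"))); // prints Baton Rouge
    }
}
```

The important part is that the same `normalise` call is applied both when storing and when looking up a key.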