MonkeyMatch will find and help you fix names with similar spellings. It will search the MediaMonkey database for the selected category - Artists & Album Artists, Albums or Song Titles - and do a fuzzy compare of each name to all other such names in the MediaMonkey database. When similar names are found, they will be listed, allowing you to quickly and easily choose the correct spelling. It will then modify all songs that had the incorrect spelling, setting the applicable field to the correct spelling you chose.

The accuracy of the fuzzy search is selectable, with 4 presets and 1 that can be fully customized. The search is highly optimized, but can still be slow with large databases, so filters can be set to limit which names will be compared.

The correction process is designed to be very fast, requiring only 2 clicks to correct a name. A right-click menu allows you to edit a name, look up more info about the selected name, and more.

Getting Started Tutorial

The following is a quick tutorial to get you started. It's quite simple, but there are a couple of steps which aren't intuitive at all.

Finding Matches

Install MonkeyMatch, and start it. The main screen will appear.

For now, concentrate on the top left corner of the screen. The Filters and other buttons will be explained later in this document.

When started, MonkeyMatch will have Artists as the selected Category, and the Match Accuracy will be set to 5, Extremely Accurate. This is a good starting point for now, so click on the Find Matches button.

MonkeyMatch will search the MediaMonkey database, getting all unique Artist and Album Artist names. (When Albums or Songs are selected, MonkeyMatch only gets the Album names or Song Titles, respectively.)

You will see the counters increasing for SubSet Names and then SuperSet Names (more on these later). Once it has both sets of names, MonkeyMatch will start comparing the two lists, doing a fuzzy search looking for similar names. Whenever it finds a pair of similar names it will increase the Match Pairs counter.

Matches Found

When it's done finding matches, it will load the first group of matches into the Match Group window, and the Next Match Group button will become enabled.

From this screen you can see that my database has 2,987 unique Artist and Album Artist names. After comparing all of them, it found 63 pairs of names that were considered to be matches according to the current accuracy setting. The first pair of matches is displayed, leaving 62 pairs.

Correcting Matches

The first pair of matches is displayed in the Match Group window in that large Courier New font. (The font can be changed if you prefer something else.)

According to their web site, Alice In Chains should have the "I" capitalized for the word "In", meaning the bottom name is the correct one. Click on the Preferred button to indicate this choice. The background of the selected name will turn green.

To correct the top spelling, simply double-click on the name. The name will be copied from the Preferred entry and the text will change to a bold red font, indicating that it has changed. The Save Changes button will become enabled, indicating that a change has been made and needs to be saved to the MediaMonkey database.

Save Changes

Clicking the Save Changes button will update the MediaMonkey database. It will search for all songs that have the Artist or Album Artist set to the incorrect spelling and will change each of them to have the correct spelling. When the selected Category is Artists, MonkeyMatch will search both the Artist and Album Artist fields, but will correct each individually. If either field contains the incorrect spelling it will be replaced by the correct one.

Once the change is saved, the changed name will be removed from the list. If a single name remains in the list, then no more edits are possible so the next group of matches will be displayed.

Next Match Group

This button will clear the Match Group window, get the next group of matches and display them, and decrement the Matches Left counter. If there are any changes that have not been saved, a warning box will pop up before the next group is displayed.

Right-Click Menu

Right-click on a name to list a few options. More Info will list all the songs associated with the selected name. More Info (All) - or F11 - will list the songs associated with all the names shown. Warning - these can take a few seconds if there are many songs to be shown.

Google Search will launch Google with the listed name as the search string.

Edit will pop up a box letting you free-edit that name.

Revert To Original will undo all changes made to the name.

Blacklist All Pairs - or F12 - will add all possible combinations of the listed names to a list of names that are known to be unique. These pairs will never again be considered to be a match. Note that each name may still match some other name, but the combination of the two names will never match again. For example, if you blacklist Rush and Bush, you will never see those two names paired together. But if you ever add a band called Mush then you will see Rush matching Mush, and Bush matching Mush.
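The pairing rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MonkeyMatch's actual code (MonkeyMatch is written in C#); the function names `pair_key`, `blacklist_all_pairs` and `is_blacklisted` are hypothetical, and storing each pair as an order-independent `frozenset` is an assumption.

```python
from itertools import combinations


def pair_key(a, b):
    """Order-independent key, so (Rush, Bush) and (Bush, Rush) are the same pair."""
    return frozenset((a, b))


def blacklist_all_pairs(names, blacklist):
    """Add every possible 2-name combination of `names` to the `blacklist` set."""
    for a, b in combinations(names, 2):
        blacklist.add(pair_key(a, b))


def is_blacklisted(a, b, blacklist):
    """True only if this exact pair was blacklisted; each name may still match others."""
    return pair_key(a, b) in blacklist


blacklist = set()
blacklist_all_pairs(["Rush", "Bush"], blacklist)
print(is_blacklisted("Bush", "Rush", blacklist))  # True
print(is_blacklisted("Rush", "Mush", blacklist))  # False
```

Because the pair is blacklisted rather than the individual names, a new name such as Mush can still be matched against both Rush and Bush.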

Copy and Paste work with the Windows Clipboard, just like in any other program.

Tutorial Complete

That's it. OK, there's more that can be done, and there's a LOT more information about what MonkeyMatch is actually doing. But that will get you started.

Change History

Beta 0.5.30 - June 11, 2013
Fixed a bug that causes an "Index Out Of Range" error

Beta 0.5.34 - June 15, 2013
Added Blacklisting Support
Added Function Keys:
--- F11 - More Info On All
--- F12 - Blacklist All Pairs
Added Composer to all Artist and AlbumArtist searches
Added Match Progress counter to show number of SubSet names checked
Small performance increase gained by only showing counters in hundreds

Beta 0.5.49 - June 30, 2013
Fixed a bug caused by a song without Artist, Album, and Song Title fields.

Beta 0.5.53 - July 4, 2013
Fixed a bug caused by extremely long names
Added a counter to show how many songs have been changed
Changed the way that changed songs are saved and updated by MediaMonkey
Added an option to stop matching once a certain number of Match Pairs have been found
Clicking Help in the Configure menu now works
When all Match Pairs have been processed, the Match Pair list disappears
Made some minor tweaks and stability improvements, and added some more debugging capabilities

Beta 0.5.56 - July 7, 2013
TREMENDOUS performance improvement - even more so if you use Blacklisting. Simply huge.
Changed "Revert To Original" to "Undo All Changes" because the latter is more standard

Last edited by Scottes on Tue Jul 09, 2013 4:57 pm, edited 14 times in total.

This is great! I've been looking for a script like this for a while! I will try it out tonight when I get home. Thank you!

Two quick questions.

Is it possible to include 'composer' to the people match list? I.e., not just artist and album artists? I find that most of my inconsistencies occur on the composer field. And since many artists are also composers, it might make sense to compare them all.

Also, is there any way to flag common equivalent expressions/names? So something like an editable file with common matches like (Robert, Bob, Rob), (2nd, Second), (and, &, 'n), (orchestra, orch.) or (Benjamin, Ben, Benny), etc. This would make the script very powerful in identifying matches accurately...

Concerning composers... Yes, it is possible. However, it won't be easy from what I learned with the way I mashed Artist & AlbumArtist together. If I were going to add Composer then I'd probably want to clean up some of that ugly code and make it more manageable.

Concerning Equivalent expressions/names... Interesting concept. I'm going to let my brain spin on this one for a while and see how it would fit in.

I would not think that MM 4.1 would matter because I am only using two MM function calls in this, OpenSQL() and QuerySongs(). AFAIK those have been stable for a while. I'm going to install 4.1 and poke at it a bit and see if I can figure something out.

My worry is that it is a localization thing. The last piece of shareware I wrote caused a problem for a German user, but that one was easy to solve since it involved the use of commas and periods in numbers and was readily apparent.

I'm going to install MM 4.1 and see if that gives me any issues with my program. If not, would you be willing to send me your MM database file? Though I'm not sure if that will even help since I'm on an American Windows.

EDIT: Well, it's not caused by MM 4.1. MonkeyMatch handles the database without issue on my system. It must be the localization, German vs. American.

I've been doing some research, and I'm beginning to think that this is caused by one of the names in your database. For some reason the query is returning only one name - from the actions.log you posted:

Jun 02 22:16:08 Found 1 names for SubSet

I would think that you have more than 1 unique artist name in your database, so I think there may be a problem caused by a special character - special to an American language program, such as the umlauted letters Ä, Ö, Ü, ä, ö, ü. My database has numerous "special characters", but certainly not all of them.

trixmoto wrote: Out of curiosity, what fuzzy matching algorithm(s) do you use?

It uses the Damerau-Levenshtein algorithm, which allows for an early exit if the distance exceeds a configured threshold. I changed the base distance to a normalized score based on the string lengths. I have a few other optimizations around it, so it doesn't execute unless the two strings are within tolerances.

I did this once before in Python, using the basic Levenshtein algorithm and barely any optimizations around it. Running against my database took 522 seconds. Using C#, and Damerau-Levenshtein with the optimizations, running against the same database took 77 seconds - and produced much better results.
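For readers curious about what such a comparison might look like, here is a minimal Python sketch. It is not the MonkeyMatch source: it uses the restricted form of Damerau-Levenshtein (optimal string alignment), the names `osa_distance` and `similarity` are hypothetical, and the normalization `1 - distance / longest length` is an assumption, since the exact formula is not given above.

```python
def osa_distance(a, b, max_distance=None):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Returns None as soon as the distance is known to exceed max_distance.
    """
    if max_distance is not None and abs(len(a) - len(b)) > max_distance:
        return None  # the length gap alone already exceeds the threshold

    prev2 = None                     # row i-2, needed for transpositions
    prev = list(range(len(b) + 1))   # row i-1
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # substitution or match
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # adjacent transposition
        # Row minimums never decrease, so the final distance cannot be smaller.
        if max_distance is not None and min(cur) > max_distance:
            return None  # early exit
        prev2, prev = prev, cur

    distance = prev[len(b)]
    if max_distance is not None and distance > max_distance:
        return None
    return distance


def similarity(a, b, max_distance=None):
    """Score in [0, 1] normalized by the longer string, or None on early exit."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    distance = osa_distance(a, b, max_distance)
    if distance is None:
        return None
    return 1.0 - distance / longest


print(osa_distance("Alice in Chains", "Alice In Chains"))  # 1
print(similarity("Rush", "Bush"))                          # 0.75
print(osa_distance("kitten", "sitting", max_distance=2))   # None
```

The length check at the top is one example of a cheap test that avoids running the full comparison when two strings cannot possibly be within tolerance.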

NEW - Blacklist All Pairs (F12)

The fuzzy matching algorithm will often match a pair of names that are actually unique, such as two Albums named "Greatest Hits I" and "Greatest Hits II". When presented with a list of names that you know are unique you can choose to Blacklist them, which will ensure that these names will never again be presented as a match.

Note that this option will Blacklist all possible combinations of the names shown in the Match Group window, so be sure that ALL names are unique before executing this function.

Pressing the F12 key is another way to execute this functionality.

More Info On All (F11)

This will pop up a new window containing all the songs related to all the entries in the current Match Group.

NEW: Pressing the F11 key is another way to execute this functionality.

Added Composer to all Artist and AlbumArtist searches

When the selected category is Artists, MonkeyMatch will also search for and correct Album Artists and Composers since these three will often be the same for each song. When corrections are done, Artists, Album Artists and Composers are processed separately. Each field will be checked individually, and each field will be changed only when it matches the incorrect spelling as you specified.
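As a rough illustration of the per-field behaviour described above, here is a small Python sketch. It is not how MonkeyMatch talks to the MediaMonkey database: songs are modelled as plain dictionaries, and the field names `artist`, `album_artist` and `composer` as well as the function `correct_name` are hypothetical.

```python
FIELDS = ("artist", "album_artist", "composer")  # hypothetical field names


def correct_name(songs, wrong, right, fields=FIELDS):
    """Replace `wrong` with `right` in each field independently.

    A field is changed only when it exactly equals the incorrect spelling.
    Returns the number of songs that had at least one field changed.
    """
    changed_songs = 0
    for song in songs:
        touched = False
        for field in fields:
            if song.get(field) == wrong:
                song[field] = right
                touched = True
        if touched:
            changed_songs += 1
    return changed_songs


songs = [
    {"artist": "Alice in Chains", "album_artist": "Alice In Chains", "composer": "Jerry Cantrell"},
    {"artist": "Rush", "album_artist": "Rush", "composer": "Rush"},
]
print(correct_name(songs, "Alice in Chains", "Alice In Chains"))  # 1
print(songs[0]["artist"])                                         # Alice In Chains
```

Only the field that held the incorrect spelling is rewritten; the other fields of the same song are left alone.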

NEW - The Composer will now be searched and matched along with the Artists and AlbumArtists.

Added Match Progress counter

NEW: Added Match Progress counter to show number of SubSet names checked

NEW: Small performance increase gained by only showing counters in hundreds