Is there an easier way to organize my lists?

It is 12 minutes since yesterday when I started sorting my lists. I am not done, and I have been working at this for over 4 hours. With the articles "A" and "The" not being ignored in a standard sort, I am having to keep a piece of paper in front of me so I can get this list properly alphabetized. Please let me take my lists offline, sort them with Perl, and then upload the properly sorted list.

My watchlist is 6 pages long in Compact view.
My "Own on DVD" is 2 pages long in Compact view.

Sorting manually ignoring articles is a right pain in the rear end; sorting by date for franchises is also giving my poor wrists a lot of strain. I feel sorry for any user with arthritis.

Unfortunately, IMDb currently does not sort titles in lists properly. The articles "A", "An" and "The" are included as part of the sort. Although a user can export a list, IMDb does not allow a user to import a list. There is already an Idea to import lists. For details, see: Importing other sites' history to IMDb [https://getsatisfaction.com/imdb/topi...]. If you support this idea, I recommend giving it a +1.

Likewise, your question would be the basis of a good idea. When I get a chance, I will reformulate it as an idea.

They dropped it, at least in part, due to the problems with articles in other languages. For example: the French article L' not only needs to be prepended with no space, but the following letter needs to be lower case, so the original title of One Deadly Summer in storage format would be été meurtrier, L' which looks bad.

Even more problematic is the German article Die. If you had a title like Die, Scumbag, Die how would the system know that this should not be treated like Jungfrauen von Bumshausen, Die?

IMDb really needs to set up a system like the MARC 21 system used by libraries. This stores the title in the display format, but adds a number to indicate the number of characters that must be ignored for the sort; i.e. The Accidental Tourist would have an ignore number of 4, so the sort would be on Accidental Tourist.
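A minimal sketch of that idea in Python, assuming each record simply carries its display title plus an ignore count. The `Title` type and `sort_key` function are made-up names for illustration, not anything IMDb or MARC 21 actually defines:

```python
from typing import NamedTuple


class Title(NamedTuple):
    display: str  # title exactly as shown to users
    ignore: int   # number of leading characters to skip when sorting


def sort_key(title: Title) -> str:
    # Drop the ignored prefix, then compare case-insensitively.
    return title.display[title.ignore:].casefold()


titles = [
    Title("The Accidental Tourist", 4),  # "The " is 4 characters
    Title("Zelig", 0),
    Title("A Beautiful Mind", 2),        # "A " is 2 characters
    Title("Die Hard", 0),                # English title, "Die" is not an article
]

for t in sorted(titles, key=sort_key):
    print(t.display)
```

The display title is never altered, so nothing like "Accidental Tourist, The" ever has to be shown; only the sort key changes.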

If they stored the language the titles were in, they could apply language specific rules about articles. I could probably write a sort in Perl based on language rules in a few moments, if I had all of the language rules in front of me.
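As a rough illustration of what such language-based rules could look like, here is a Python sketch. The `ARTICLES` table is a deliberately tiny, incomplete assumption covering only three languages, and the language codes and the `sort_key` name are hypothetical:

```python
# Hypothetical, incomplete article table keyed by language code.
# Each entry includes its trailing separator: a space, or nothing for "l'".
ARTICLES = {
    "en": ("the ", "an ", "a "),
    "fr": ("les ", "le ", "la ", "l'"),
    "de": ("der ", "die ", "das "),
}


def sort_key(title: str, lang: str) -> str:
    """Return a case-insensitive key with the leading article removed."""
    lowered = title.casefold()
    for article in ARTICLES.get(lang, ()):
        # Only strip when something is left over after the article.
        if lowered.startswith(article) and len(lowered) > len(article):
            return lowered[len(article):]
    return lowered


print(sort_key("Die Hard", "en"))
print(sort_key("Die Jungfrauen von Bumshausen", "de"))
print(sort_key("L'été meurtrier", "fr"))
```

Because the rule is chosen by the stored language, "Die" is stripped only from titles recorded as German. This sketch only recognizes the straight apostrophe in "L'"; a typographic apostrophe would need its own entry.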

To be honest, you could just crowdsource it, with a sort field/attribute that people could update. Also, as mentioned above, they did have all the data for correctly sorting titles, so could probably run a script that'd reintroduce those and rely on crowdsourcing to pick up any exceptions or new entries not covered. Granted there'd always be a few entries that'd mess up the neatness but people will always be spotting and fixing these too (I personally could live with a little messiness sneaking in for better sorting).

Setting up fully automated rules might not be quite that simple. When researching the problem (both to comment on IMDb and for my own database), I found a table of language/article combinations which listed 44 languages and 138 articles from a to yr (including ones like 'r, ang mga and na h-) - and this does not include the various spellings in romanizations of the Arabic al- - which provided a total of 225 valid article/language combinations.

I said then, and still feel now, that the way to go is to create an extra field as is done in the MARC 21 system (although that system actually uses (half of) the first byte of the title field as the flag). This field could be automatically populated for English, French and German articles, although a flag would need to be raised for Les and Die so the Data Manager can decide if these are actually articles. The other languages are probably infrequent enough that a weekly scan of all new/modified titles for initial strings that match the list of potential articles, for a Data Manager to review, would be enough to get the rest correctly sorted.
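The auto-populate-and-flag step could be sketched roughly like this in Python. The two article lists are illustrative assumptions, not a complete set, and `propose_ignore` is a hypothetical helper name:

```python
from typing import Tuple

# Assumed lists for illustration only; a real table would be much longer.
SAFE_ARTICLES = ("the ", "an ", "a ", "le ", "la ", "l'", "der ", "das ")
REVIEW_ARTICLES = ("les ", "die ")  # may or may not be articles


def propose_ignore(title: str) -> Tuple[int, bool]:
    """Return (proposed ignore count, needs manual review)."""
    lowered = title.lower()
    for article in REVIEW_ARTICLES:
        if lowered.startswith(article):
            # Propose a count, but ask a human to confirm it.
            return len(article), True
    for article in SAFE_ARTICLES:
        if lowered.startswith(article):
            return len(article), False
    return 0, False


for title in ("The Accidental Tourist", "Die Hard", "Les Misérables", "Zelig"):
    count, review = propose_ignore(title)
    print(f"{title!r}: ignore={count}, review={review}")
```

Here "Die Hard" comes back with a proposed count of 4 and the review flag set, so a Data Manager can reset it to 0, while "The Accidental Tourist" is populated without any manual step.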

This system has an additional benefit - you are not restricted to articles. Various symbols and punctuation marks can be included in the ignore, allowing shows like 'Allo 'Allo! to be sorted with the A's and also bringing ...And God Created Woman (1956) and And God Created Woman (1988) together in the sorted list, rather than having the first showing at the very top of the list.
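To show how the same ignore count can cover punctuation, here is a small Python sketch. It assumes "ignorable" just means any leading character that is not a letter or digit, which is a simplification, and the function names are made up for the example:

```python
def leading_ignore(title: str) -> int:
    """Count leading characters that are neither letters nor digits."""
    count = 0
    for ch in title:
        if ch.isalnum():
            break
        count += 1
    # If the whole title is punctuation, ignore nothing.
    return 0 if count == len(title) else count


def sort_key(title: str) -> str:
    return title[leading_ignore(title):].casefold()


titles = [
    "...And God Created Woman (1956)",
    "Zelig",
    "'Allo 'Allo!",
    "And God Created Woman (1988)",
]

for t in sorted(titles, key=sort_key):
    print(t)
```

With this key, 'Allo 'Allo! sorts under A and the two And God Created Woman titles end up next to each other. A letter such as Æ counts as alphanumeric, so Æon Flux would not have anything stripped.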

Certainly the current sort is not really acceptable, and a significant change is necessary. It seems to me that if they had put even a small fraction of the effort expended in fancying up the display into solving this, we would have had a solution long ago.

Yes, I suspect the system was abandoned because there was no way to automate it (the Die Hard franchise probably pushed them over the edge), but I don't see any reason this couldn't be crowdsourced. IMDb has its old sorting data, which could be added back in, and then they could allow people to update problematic films. I'd imagine it'd be easy enough to also have some kind of check run during the submission process, forcing someone to look over the sort attribute for the name (perhaps with a small degree of automation to address the most obvious cases with a few language specific rules) and tick a checkbox to continue. That should make sure the majority of new submissions conform to the right format (and data processors would be able to check to pick up any mistakes), and anything that falls through the net can be crowdsourced and sorted out.

There are other complications, as well as the obvious one - you mention Die Monster Die, for example, but if you ran language specific rules to generate a specific search attribute for each title then the West German title "Das Grauen auf Schloss Witley" would be fine, but another is also listed: "Die Monster". My German isn't good, so I couldn't say if they have just used the English title, or if they've used the German article. Probably easy enough to resolve (I assume it is the former). It seems this issue might have been largely addressed in German translations to avoid confusion anyway - Die Hard is Stirb Langsam. So it might not be a big issue.

DavidAH_Ca brought up something I had not thought of, movies with punctuation as the first character in the title. When I sort those titles, they always go on top, such as *batteries not included and 'Til There Was You. Now I'm thinking I need to rewrite my own sorting function to remove the initial punctuation which would leave Æon Flux as the only film with a special character at the beginning of the title. (I might have to add a new field to my own database to add titles' languages which would be a minor pain in the neck for me, but for a company like IMDb, it should be easy.)

I just wish I could sort my Watch and other lists more easily. I have pages upon pages of unsorted titles there because I have a problem getting a title from one page to another. Also, the last time I was sorting, IMDb got really buggy on me. I had titles with the same sort number and other sort numbers were really out of whack. I had to quit for the day since refreshing didn't do me any good. I even closed my browser and reopened it, and the problem was still there. And yes, I have five films which are supposedly at the 55 position. :S