I have a question about your 'Find duplicates' plugin. Is there a possibility to auto delete all the binary duplicates save one? Just the actual binary duplicates and not the whole book entry?

Because I have quite a few books, many with different formats in one entry, but also duplicates of those.

A lil example:
I have several entries of the same book. Call it 'Book A'. Then there are several formats of said book. And to mix it up a bit more, there are different versions of those formats (mobi version1 & mobi version2...)

So when I get the results I can't just delete all the mobi formats save one, or all the pdf duplicates. I know that one or the other is the exact same file, but I can't see the difference without opening up every single file when I have a result like the example above.

If auto delete is not an option, maybe highlighting the formats which are a binary duplicate would be possible?

I don't know about adding this feature to this plugin, but if you have any more books to add to calibre it would be easier to find and automatically delete binary duplicates prior to adding the books to calibre.
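For anyone wanting to try that outside calibre, here is a minimal sketch of finding binary duplicates in a folder before importing it. It is not part of the plugin: the function names `file_digest` and `find_binary_duplicates` are made up for illustration, and it treats two files as binary duplicates when their SHA-256 digests match. It only reports groups and deletes nothing.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_binary_duplicates(root):
    """Return groups of paths under root that share the same SHA-256 digest."""
    # Cheap first pass: only files of equal size can be identical.
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a binary duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        # Keep only digests shared by two or more files.
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_binary_duplicates("/path/to/new/books")` (a placeholder path) gives a list of groups; you would then decide yourself which copy in each group to keep before adding the folder to calibre.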

Firstly re the "Back" button equivalent - do you mean something specific to this plugin (as in going backwards through the Find Duplicates result groups) or backwards through your search queries in Calibre? If the former then the functionality exists in the plugin and you can assign it to a keyboard shortcut. If the latter then take a look at the Walk Search History plugin, to which I assign Alt+Left and Alt+Right to go back/fwd through my search queries...

Now for your queries concerning binary duplicate deletion.

Unfortunately there is no way to colour highlight like you suggest exactly which format is in conflict, for a number of reasons I won't bore you with, just trust me! I always anticipated there might one day be a "Smart Merge" UI plugin that could incorporate such functionality but I don't have the time/motivation to write that so it remains vapourware.

So I might be more amenable to a change to this plugin along the lines of your suggestion (since it does not involve trying to merge the book records automatically, instead *just* removing duplicated formats). Off the top of my head it would most probably be by adding a checkbox option to the Find Duplicates screen rather than yet another interactive prompt. It would have to decide which book record in the group should have its format kept of course (perhaps oldest by Date column would usually make most sense).

The rest of the duplicates process would remain unchanged - the user would still be presented with a list of the duplicate groups, it just might be the case that some of those book records now have no attached formats (if the duplicate was the only format it had). So all the normal merging stuff would still apply, leaving it manually to the user to either delete such records or manually merge them.

Please help. I'm entirely new to this. I installed the duplicate finder plugin and restarted, but the button was nowhere to be found. I looked it up in the User plugins list under the "installed" filter and it showed as being installed, but it is nowhere to be found on the menu bar or in any of the dropdowns...

Thanks PeterT...that got me in the right direction. I clicked on the Preferences tab and under the "Interface" section I clicked "Toolbar" and then under the dropdown I chose to edit the "Main toolbar" and there it was on the left side.

I found a problem with the binary comparison: because calibre's epub reader modifies the epub to store the last page read (an option that is on by default), I have a lot of books which are not detected as duplicates.
Is it possible to have an option to also compare a specified column (e.g. page count or word count)?
Then I could compare with the current options on author and editor, and add a restriction to see only books with the same value in the chosen column, in my case word count.

@P.K.Dick - what you found is the very good reason why some of us turn off that feature of the ebook viewer. You can do that by bringing up the ebook viewer, clicking the preferences icon and unticking the "Remember the current page when quitting" option. That will at least stop any more "damage" being done.

To answer your other question, no it isn't something I am interested in adding (it has been discussed previously on this thread). Apart from introducing dependencies between plugins (standard calibre has no page count or word count feature) it also won't work for books with multiple formats associated with them.

Store configuration in the calibre library database rather than a json file, to allow reuse from different computers (not simultaneously!)

Add a support option to the configuration dialog allowing viewing the plugin data stored in the database

Add an option to allow automatic removal of binary duplicates (does not delete book records, only the newest copies of that format).

I've got a number of plugin updates that will be released today based around calibre 0.8.57, taking advantage of a new feature to store plugin settings inside the calibre library. The benefit of this is for users that store their library in a shared network location and use calibre from multiple machines (though never at the same time - that limitation hasn't changed!). Your plugin configuration settings which are specific to each library (such as custom column names, book exemptions, reading lists etc) will be automatically kept in sync when calibre is next opened on each machine.

Specifically for this plugin it means any duplicate exemptions you configure for books and authors do not have to be repeated on each machine you open the library on.

The other addition to this plugin is a feature that has been requested a number of times over the last x months. When you do a Binary Compare using this plugin, it is not possible to see in the calibre gui which of the formats actually are duplicates of each other in the situation where each of the books in the group has multiple formats associated. So if book 1 has EPUB, MOBI, PDF and book 2 has EPUB and MOBI, you couldn't tell if it was the EPUB, MOBI or both that were found as binary identical in that group (without opening each folder and comparing file sizes manually).

You now have a checkbox on the plugin dialog which allows the plugin to automatically remove the duplicate format(s) for you. Note this does not delete the book records in your database - just all but one of the duplicate formats and physical files. It chooses the book record that is the oldest in your library within that group to keep the format for.

So with this option enabled and the example above, if the EPUB was a binary duplicate and book 1 was the first copy created in your library, then the results will leave you with book 1 having the EPUB, MOBI and PDF formats, and book 2 having just the MOBI. If book 2's MOBI was also a binary duplicate of book 1's, then book 2 would be left with no formats.

You can then resolve the duplicate groups in the usual manner - either merging the records, or deleting the books with no formats.

Some of you may ask why not completely delete the book if it has no formats? The answer is because you might have other metadata on that book record which you want to keep. There is no guarantee that the "oldest" version of a book duplicate is the one that should be kept - for instance you might have added a "Read" tag/custom column and it happened to be set on the newer record. Or perhaps you downloaded metadata and a nice cover for the newer record, not realising your older one existed. The approach I have taken ensures no important data is lost if you enable this option, and leaves the decisions about whether to merge/which direction or delete entirely up to you.
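To make the rule concrete, here is a small simulation of the behaviour described above, using plain Python objects rather than calibre's database. It is only a sketch and not the plugin's actual code: the `Book` class, the `added` ordinal standing in for the date, the placeholder hash strings, and the choice to match on (format, hash) pairs are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Book:
    title: str
    added: int  # hypothetical "date added" ordinal; lower means older
    formats: dict = field(default_factory=dict)  # format name -> content hash


def remove_duplicate_formats(group):
    """Keep each binary-identical format only on the oldest book in the group."""
    seen = set()  # (format, hash) pairs already kept on an older book
    for book in sorted(group, key=lambda b: b.added):
        for fmt, digest in list(book.formats.items()):
            if (fmt, digest) in seen:
                del book.formats[fmt]  # the format goes, the book record stays
            else:
                seen.add((fmt, digest))
    return group


# The example from the post: only the EPUB is binary identical.
book1 = Book("Book 1", added=1, formats={"EPUB": "e1", "MOBI": "m1", "PDF": "p1"})
book2 = Book("Book 2", added=2, formats={"EPUB": "e1", "MOBI": "m2"})
remove_duplicate_formats([book1, book2])
print(sorted(book1.formats), sorted(book2.formats))
```

After the call, book 1 still has EPUB, MOBI and PDF while book 2 has only its MOBI. If book 2's MOBI hash were also `"m1"`, book 2 would end up with an empty `formats` dict but would still exist as a record, which is the point of leaving the merge or delete decision to the user.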

Thanks for "Find Duplicates". It's a great plugin ... many useful features and easy for a novice to use.

I have over 600 books in Calibre. Yesterday, after loading some books, I noticed that my book-count and title-count were different. I sorted by title and after a long and tedious search, I discovered that 3 of the books I had just added were duplicate titles with different authors. Never had a duplicate before ... then 3 in one day! (Last Man Standing, Relentless, Split Second).

Did some searching on MobileRead and learned about your MANY Calibre plugins. Sincere thanks for all your time and effort.

I'm working on the next version of this plugin at the moment, which has a new screen that allows searching for author, series, publisher and tag variations and renaming them in a convenient way. It should be out in a week or so...

No there isn't. The problem is how to present the results - you can't display them as groups together in the library view since you can only show books from one library. So it would either have to be a report, or a completely new screen.

I'd be happy with a report - but don't break a leg. It wasn't much more than an idle thought, I won't die for its lack - promise