This plugin determines the number of pages and/or words in a book and stores the result in custom column(s). Beyond general library browsing, Kindle users can generate APNX files using the value from a pages custom column (requires calibre 0.8.40). When you then send an ebook to your Kindle device from calibre, you will have page numbering similar to that of Amazon books which offer this feature.

You have two main methods of determining page count with this plugin.

The first approach is estimation based on the book content, provided you have an ePub format or a format that is convertible to ePub. If your book has multiple formats, the one used is chosen according to your Preferred Input Format order, which you set in Preferences -> Behavior.
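To illustrate the format selection described above, here is a minimal sketch. It is not the plugin's code: the function name pick_format and the choice to return None when nothing matches are assumptions made for the example.

```python
def pick_format(available, preferred_order):
    """Return the first format in preferred_order that the book has, else None.

    Hypothetical helper for illustration only; the real plugin reads the
    order from calibre's Preferred Input Format setting.
    """
    # Compare case-insensitively, since formats may be written "epub" or "EPUB".
    available_upper = {fmt.upper() for fmt in available}
    for fmt in preferred_order:
        if fmt.upper() in available_upper:
            return fmt.upper()
    return None


if __name__ == "__main__":
    # A book with both formats, in a library that prefers EPUB over MOBI.
    print(pick_format(["mobi", "epub"], ["EPUB", "MOBI", "PDF"]))
```

Note the order of the book's own formats does not matter here, only the order of the preference list.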

Note that if you use this option the result can only approximate a paperback edition, due to differences in fonts, images, layouts etc. By default it uses an "accurate" algorithm similar to that created by user_none for generating APNX files for Kindle users. Alternatively, in the configuration you can choose to use the page count used by the calibre e-book viewer, or you can use the Adobe algorithm used by their ADE software and some devices like a Nook. However, if the format being counted is a PDF, then as of v1.4.2 there is a special optimisation that reads the actual page count rather than estimating it with any of the above algorithms.

The second page count option (added in v1.3) is to download the page count from a web page on the Goodreads.com website for your specific linked edition. This can be used for a book with any formats (or even none). How is a Goodreads identifier linked? Either by using the Goodreads metadata download plugin, the Goodreads Sync plugin, or by manually typing a goodreads:xxx id into your identifiers field for the edition of interest. If the edition you have linked to has no page count, you can switch editions using a feature added to the Goodreads Sync plugin.
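As a rough illustration of the goodreads:xxx identifier mentioned above, the sketch below pulls the Goodreads id out of a comma-separated identifiers string of the kind you type into the identifiers field. The function name goodreads_id is hypothetical and this is not how the plugin itself looks up the identifier.

```python
def goodreads_id(identifiers_text):
    """Return the value of the goodreads:xxx entry, or None if there is none.

    Assumes the text is a comma-separated list of key:value pairs,
    e.g. "isbn:9780000000000, goodreads:12345" (made-up values).
    """
    for part in identifiers_text.split(","):
        key, sep, value = part.strip().partition(":")
        # Only accept well-formed "goodreads:<something>" entries.
        if sep and key.lower() == "goodreads" and value:
            return value
    return None


if __name__ == "__main__":
    print(goodreads_id("isbn:9780000000000, goodreads:12345"))
```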

Word count is optionally calculated independently of page count. As this is unavailable on a website, it is subject to the same limitations as estimating page count above, in that you must have either an ePub or a format convertible to ePub available for it to work.

Configure whether the default for clicking on the toolbar button is to use an estimated page count or a downloaded page count.

Optionally configure whether to only overwrite counts if you don't already have one

Optional keyboard shortcuts

Special Notes:

Requires calibre v1.0.0 or later.

Installation Steps:

Download the attached zip file and install the plugin/restart calibre/add to context menu or toolbar as described in the Introduction to plugins thread.

Create custom column(s) to store the page count and/or word count in. See the instructions in the spoiler below.

The first time you use this plugin in a library, you will be asked to configure the custom columns. Select the column you defined above in the dropdown. If you do not want to retrieve one of the counts, leave its custom column blank.

Counting pages runs as a job in calibre and when complete you will be prompted to update the books with the page/word counts.

Creating a Count Custom Column:

Follow these instructions to configure a page count column as per screenshot 3 below. Perform similar steps for a word count column.

Type the name of your custom column in the textbox as shown in the attached screenshot 6 below.

Paypal Donations:

If you find this or any of my other plugins useful please feel free to show your appreciation. I have spent many hundreds of unpaid hours in their development and support so any encouragement for me to continue is appreciated!

Version History:

Version 1.6.9 - 05 Jul 2015
Added option to disable the confirmation prompt each time to update the page/word counts. Use at your own risk - if you make simultaneous other changes to the book record they may get lost.
Fix for Cancel on the progress dialog (submitted by Raúl)

Version 1.6.6 - 09 May 2013
For Mac users using the ADE algorithm fix an issue with paths (as submitted by SimpleText)

Version 1.6.5 - 06 Dec 2012
If user chooses Adobe page count algorithm, do not attempt it on any formats other than EPUB.

Version 1.6.4 - 05 Dec 2012
Add a "Custom" algorithm option for page count, for users who want to specify the number of characters per page.
When switching libraries, ensure keyboard shortcuts are reactivated
Prevent plugin being used in Device View or on Device View context menu

Version 1.6.3 - 26 Jul 2012
If no page count downloaded from goodreads, prevent wrong error appearing in log
If book configured for page count only and has no formats, prevent error in log (if downloading from Goodreads)

Version 1.6.1 - 17 Jul 2012
If a book has zero words, just display an error in log rather than storing zero in the column

Version 1.6.0 - 14 Jul 2012
Add three new statistics for calculating readability - Flesch Reading Ease, Flesch-Kincaid Grade Level and Gunning Fog.
Remove the redundant Words algorithm combo since only one algorithm offered.
Make page algorithm a per library setting rather than a plugin level setting
For CBR and CBZ book formats, calculate the number of pages as being the number of image files rather than converting to ePub
For CBR and CBZ book formats, only allow the Count Pages statistic and ignore all other statistics
Fix tooltip missing line breaks in configuration dialog

Version 1.5.0 - 22 Jun 2012
Now requires calibre 0.8.57
Store configuration in the calibre database rather than a json file, to allow reuse from different computers (not simultaneously!)
Add a support option to the configuration dialog allowing viewing the plugin data stored in the database
Remove the additional menu items for individual word/page counts added in v1.4.0 as they cluttered the interface

Version 1.4.3 - 02 Jun 2012
Add another page count algorithm of "Adobe Digital Editions (ADE)", which matches that used by the ADE software and some devices like Nook.
Rename the "Calibre Viewer (Adobe)" option to "E-book Viewer (calibre)" as it was misleading, calibre uses its own calculation not the Adobe one.

Version 1.4.2 - 31 May 2012
Minimum version set to calibre 0.8.54
Optimisation for counting pages for PDFs to read the page count from the PDF info rather than estimating it
Revert the performance optimisation from 1.4.0 which affected the character count statistics

Version 1.4.1 - 30 May 2012
Fix problem with new overwrite existing behaviour not counting pages in some circumstances

Version 1.3.2 - 07 Apr 2012
Fix bug where preferred input order not being correctly applied (was alphabetical instead!)
Fix bug where LIT formats would cause file in use errors

Version 1.3.1 - 03 Mar 2012
Support count page/word estimates for any book format that is convertible to ePub, using preferred input format order

Version 1.3.0 - 12 Feb 2012
Add a Download from Goodreads option to allow retrieving book count from books that have a Goodreads identifier
If word count is disabled (i.e. only page count) allow download of page count for any book regardless of formats
Attempted workaround for Qt issue on Mac where some books would crash calibre.

Version 1.2.0 - 11 Sep 2011
Upgrade to support the centralised keyboard shortcut management in Calibre

Version 1.1.1 - 12 Jun 2011
Fix bug if user chooses to retrieve only word count
If an unexpected error thrown while counting, include in log
Display log and no results dialog if no statistics were gathered
Change Mobi word count to not require a conversion

Version 1.1 - 09 Jun 2011
Add option to generate a word count instead of or in addition to page count
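For anyone curious about the three readability statistics added in v1.6.0 above, the standard published formulas can be sketched as below. This works from totals you have already counted and is not the plugin's own code; how the plugin counts sentences, syllables and "complex" words is not stated in this thread, and the function names are made up for the example.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula (higher means easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula (approximate US school grade)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Standard Gunning Fog index; complex_words are words of 3+ syllables."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


if __name__ == "__main__":
    # Made-up totals purely to show the call shape.
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
    print(gunning_fog(words=100, sentences=5, complex_words=10))
```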

I did initially wonder about supporting other formats, but the immediate question becomes how it would work when you have a book that has multiple formats. Say you have a book that has an ePub and a mobi version. Which would it choose to calculate on? Should there be a user configurable preference list (a bit like the input format order)?

And then if using the apnx code, there are clearly two algorithms - should the choice be hard-coded, user configurable, or use the Kindle driver setting?

It opens questions I chose to avoid for the sake of a quick plugin that was primarily a technical experiment as a precursor to the new version of Extract ISBN...

That is not to say I wouldn't be willing to make the changes to make it more useful to others, clearly from the posts above there is interest. I just would appreciate some input as to how you would like to see it working.

For myself I only store ePub and mobi versions so always have an ePub to "calculate" on. I guess it will be interesting to see how the numbers differ on user_none's page calculations versus the simple calc done on ePubs for the viewer.

The fast APNX algorithm will give approximately half of the page count as the viewer. It counts every 2300 characters as a page. I believe the viewer uses 1024. The more accurate APNX algorithm will be substantially different because it only looks at visible characters and checks for paragraph tags / length.
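To put numbers on the characters-per-page figures quoted above, here is a minimal sketch. The 2300 value is the one stated in the post and the 1024 value is only the poster's belief; rounding up and the function name estimate_pages are assumptions for the example, not taken from the APNX or viewer code.

```python
import math

FAST_APNX_CHARS_PER_PAGE = 2300  # figure quoted in the post above
VIEWER_CHARS_PER_PAGE = 1024     # "I believe the viewer uses 1024" (unverified)


def estimate_pages(char_count, chars_per_page):
    """Estimate a page count by dividing the character count by a fixed page size.

    Rounds up so a partial final page still counts (an assumption here).
    """
    if chars_per_page <= 0:
        raise ValueError("chars_per_page must be positive")
    return max(1, math.ceil(char_count / chars_per_page))


if __name__ == "__main__":
    chars = 460000  # made-up book length
    print(estimate_pages(chars, FAST_APNX_CHARS_PER_PAGE))
    print(estimate_pages(chars, VIEWER_CHARS_PER_PAGE))
```

With these two page sizes the fast figure always comes out a bit under half of the viewer figure, which matches the "approximately half" above.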

I cannot comment on the technical aspect, however as a user I would like it to be user configurable as to which format should be the default when more than one exists. While I have both mobi and epub in my library, mobi is my primary format so that is the one that I would like to use. Unfortunately I am not able to use the apnx plugin because the k2 does not support it.

@user_none - thx for that info. Yes I must confess in my random sampling of ePub page counts I found the published value was between 50% and 65% of the Calibre calculated value. So that sounds like using your mobi calculations as the "first choice" would be a nice default. I guess the problem is that means you would get very inconsistent results in your library though, depending on which format you chose. And that would rather negate the intent of the plugin, allowing you to compare at a glance books to see if you wanted a "quick read" versus something to get stuck into.

Perhaps the solution to that is to not use Calibre's page count from the EBookIterator, and to instead replicate your apnx approach for ePubs. What are your thoughts on that - is that feasible in your opinion?

@Nyn - thx for your thoughts. Provided both ePub and mobi used "similar" algorithms, then hopefully it should be mostly immaterial which you ran (assuming they are a conversion of the same edition of course).

Support option to prioritise either Mobi formats (using APNX algorithm) or ePub files

Change ePub page count algorithm to be similar to the Mobi APNX algorithm

Thanks to all for the feedback above. Note that I have decided to change the ePub count away from the Adobe "1 page = 1024 chars" that the ebook viewer uses and instead apply an algorithm very similar to that created by user_none for generating APNX files for Mobi formats. So you should now get roughly similar numbers depending on whether you scan Mobi or ePub formats. You can choose which to prioritise in the configuration dialog.

I've added some extra logic in the ePub counting that the APNX mobi code does not have. So you might find for a minority of "weird" books internally that the ePub count offers a more consistent result. It is a luxury I have that user_none did not for his purposes of not sacrificing performance while still offering a useful solution to Kindle users. However for the most part either should give you comparative results, and thanks to user_none for his starting point and information.

In my own quick sampling I found that the algorithms generally tend to slightly underestimate the pages compared to paperback versions, but there were exceptions where the reverse was true. There are also inconsistent results between printed page counts themselves - hardback vs paper vs ebook vs large print editions for instance, fonts, line spacing etc.

So treat the numbers from this plugin just as a general indication of relative size and have some fun.

Just one tiny thing, without realizing it I selected my clippings file to have a page number calculated. This "book" is only in a text format, I have the mobi format selected as my preferred format. The error message said it could not calculate the page count because no epub version existed, which is correct. However, would it be better to have the error message state the preferred format didn't exist, rather than epub?? Just a question.

The page count is still about 30% more than the hardcover and paperback counts listed on Amazon. Will it be possible to get those numbers? I would really prefer those numbers, or for the calculated numbers to be close to them.
I do know that for some books the count won't be that close, as sometimes the typeface used might be bigger. But for the 10 or so books I compared, the count was consistently around 30% more than that for the hardcover. I do read a lot of fantasy and scifi and usually the typeface is more or less the same for those books, so maybe that's why I got such a consistent result.
If that's not possible, would it be possible to make the page count user configurable, so maybe we can specify how many words or letters count as one page?

Ok forget about my previous post as I don't think you can get a very accurate result. The 10 books I had tried before were the largest by page from the earlier page count and all were large fantasy and scifi books. With the updated plugin the count for all those books was about 30-35% more than that stated on Amazon for hardcover and paperback.
But for other genres like romance the count is less by about the same amount or more, so this count is good only for comparing ebook libraries.

@dopedangel - like I posted above there is no way for it to be particularly accurate so don't get hung up on trying to compare with actuals. Think about all the differences between books on your shelf in terms of size of fonts etc. In my own testing I didn't find the variance to be that large but of course it can happen.