Update: Current Version - v2.5.211 (17 Jun 2013). Having installed this, I have also updated the opening post. PDF-XChange Viewer now gets a 5 x from me (was 4½). No issues re OCR now; it all seems to work just fine. I see that it currently caters for English, French, German and Spanish, but I have only used the English OCR functionality so far.

I had tried the OCR function in v2.5.205 and found it to be spotty. However, consistent with IainB's experience with v2.5.211, when I upgraded to v2.5.208 several months ago most of the issues disappeared, at least when the fonts in the original were standard (e.g., Arial, Times New Roman). I hope the function has continued to improve in v2.5.211, since I have been downloading an increasing number of documents that use human-readable but less standard fonts, which the OCR in v2.5.208 has had a great deal of difficulty converting to text.

Despite this issue, PDF-XChange Viewer is still far and away the most versatile FREE PDF reader I have found.

Very interesting comments from the xplorer² blog about Adobe apparently playing unethical tricks on us for their own benefit (also mentions PDF-XChange Viewer in a positive way at the end). (Copied below sans embedded hyperlinks/images.)

I can understand the frustration of the guys who run Adobe, having introduced the most popular portable document format (PDF) and being unable to make money out of it — or at least making much less than they feel they're due. With the release of adobe reader X (version 10) they started castrating their PDF shell integration and in particular the text extraction filter that allows PDFs to be searchable for keywords.

The other day I installed the latest adobe PDF reader version 11 on a windows 8 machine to see how things fare nowadays. Intriguingly, adobe have reintroduced the text filter (IFilter) functionality but somehow it only worked for windows search (!) and not for other IFilter aware programs like xplorer². How did they manage that? I was not alone wondering about this duality but there was no solution forthcoming.

Technically speaking adobe supplied and correctly registered the PDF text extraction filter DLL (ACRORDIF.DLL) but it wouldn't be instantiated by any common means, that is either using LoadIFilter API or using direct COM object creation after looking up the filter object CLSID in the registry. Was it broken? No, because somehow windows search could use it!? Some people argued that the filter was dropped in STA threading mode (like it did in the old v6 days) but that isn't corroborated by the ThreadingModel of the filter DLL. Some talked about running it only through a Job object. Adobe support kept themselves tight lipped and were claiming that the restriction was there for our security — ahem.

Anyway, here's a spoiler for Adobe, I present to you the way to obtain the IFilter object in C++ for use in your program (after adding some error corrections). Instead of LoadIFilter, you must obtain a stream interface on the PDF file, then create the filter COM object and use its IPersistStream interface to pass the file to be extracted. Also note that the whole process has to be running as a job or you receive E_FAIL.

// pass the file to the filter
CComQIPtr< IPersistStream > pdfStream = pdf;
hr = pdfStream->Load(iStream);
// from now on proceed as usual with Init()ializing the filter

This approach works for version 11 of the adobe filter. However that's not the end of the story.

The plot thickens: PDF reader v10

Version XI isn't available on windows XP, the last supported version there is X, so I ran a quick check on XP to confirm that the above code for initializing the PDF IFilter works... but it didn't!! As usual windows search had no problems finding text so the handler worked, and so did all microsoft filter test tools. Back to head-scratching.

At first I thought that they could be playing on the job object trick and use some particular name for it, that only FILTDUMP.EXE used. So I wasted a few hours hunting the job name using process explorer, but it looks like FILTDUMP doesn't register a job object at all. Can you guess how the trick works? They hard coded the names of MS tools like FILTDUMP in the PDF filter ACRORDIF.DLL!!! So when the PDF IFilter object is being instantiated, it checks the calling process name, and if it is one in the "whitelist" it works, otherwise it fakes a problem and E_FAILs. Scandalous. For proof, rename your program to "filtdump.exe" and as if by magic everything works, even plain LoadIFilter without job objects.
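To make the trick described above concrete, here is a minimal sketch of what a "whitelist by calling process name" check could look like. This is purely illustrative and is not Adobe's code: the function names, the one-entry whitelist (only FILTDUMP.EXE is named in the post) and the example paths are all assumptions of mine. A real filter DLL on Windows would presumably obtain the host executable's path from something like GetModuleFileName; here the path is simply passed in as a string so the logic can be followed on any platform.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical whitelist: only FILTDUMP.EXE is named in the post above.
constexpr std::array<std::string_view, 1> kAllowedHosts = {"filtdump.exe"};

// Return the file-name part of a Windows or POSIX style path, lower-cased.
std::string base_name_lower(std::string_view path) {
    const auto sep = path.find_last_of("\\/");
    if (sep != std::string_view::npos) {
        path.remove_prefix(sep + 1);
    }
    std::string name(path);
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return name;
}

// True if the calling executable's name is on the whitelist (case-insensitive).
bool host_is_whitelisted(std::string_view exe_path) {
    const std::string name = base_name_lower(exe_path);
    return std::find(kAllowedHosts.begin(), kAllowedHosts.end(),
                     std::string_view(name)) != kAllowedHosts.end();
}

// The "rename your program" experiment, with made-up paths.
void self_check() {
    assert(host_is_whitelisted("C:\\Tools\\FILTDUMP.EXE"));
    assert(!host_is_whitelisted("C:\\Program Files\\xplorer2\\xplorer2.exe"));
}
```

A check like this looks only at the file name, which would explain why renaming any program to "filtdump.exe" is enough to get past it.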

So it wasn't really sandboxing or security Adobe were after, but a callous attempt to stop 3rd party tools extracting PDF text. Interestingly, FILTDUMP.EXE is still hard coded in the version 11 DLL but they must have turned off the hack. Adobe naughty <g> I say dump Adobe reader altogether (who needs 100MB installs just to read documents?) and go with a better solution like PDF-XChange Viewer. All the shell integration features work (for free), both for 32 and 64 bit windows. That's the best plugin for use with xplorer² too.

Some pretty good points there. I dumped Adobe Reader some time ago, and avoid using anything sourced from Adobe if I can help it. After reading the above, I think I shall shun Adobe products from here on.

I agree that PDF-XChange Viewer is excellent. I used the free version for years, and have recently upgraded to PRO.

Regarding OCR, it is generally fine for most purposes. Where it begins to have problems is with poorly scanned texts. For those situations I use ABBYY FineReader, and there can be a big difference: where PDF-XChange Viewer might have a 60-70% success rate in recognising text (which is basically unusable, as you can't understand a sentence where a third of the words are unintelligible), FineReader produces a 99.99% correct OCR. But these are marginal cases I'm talking about (book pages scanned at a low resolution).

Update 2013-09-10: PDF-XChange restores PDF attributes in explorer detailed view mode. This is a bonus that I had been unaware of. I have updated the opening post mini-review. As described in the xplorer² blog on 2013-09-08: (Copied below sans embedded hyperlinks/images.)

Quote

See PDF details (Subject/Keywords/etc) in windows explorer and xplorer²

In many respects windows XP was the pinnacle of shell integration. Things were simpler but just worked. With the onslaught of windows vista (and 7/8) things got very complicated, the documentation available for the new shell features was poor to non-existent, and many things, like shell column handlers, just stopped working.

Adobe PDF reader quickly lost interest in PDF metadata and attributes like Author or Subject (what you see in detailed view mode in windows explorer). People who used this information to browse and organize their scanned PDF documents in windows XP immediately felt the problem. In later windows you can only see such properties using the PDF property page, which isn't very convenient for large collections of documents, nor can you search for such PDF information.

You can bring back this information in your windows explorer and other shell-aware file managers like xplorer² by installing PDF-XChange Viewer. Unlike Adobe, these guys know all about offering quality shell extensions that work both in 32 and 64 bit windows. When you install PDF-XChange Viewer you can untick all components except for Shell extensions, if you are after a lightweight solution. This part of the program is completely free for all uses (including commercial).

As you can see in the picture to the right, after installing this tool the PDF columns come back to life and you can browse the missing details. Note that the Keyword property still isn't available in windows explorer but you can see it in xplorer². Moreover with xplorer² you can search and filter using such PDF properties as a rule.