OCR Smackdown: ABBYY FineReader vs. Adobe Acrobat

A very common request that I get here at DocumentSnap is to compare the Optical Character Recognition (OCR) capabilities of ABBYY FineReader with Adobe Acrobat. Why? Well, for starters, both of them come included with models of the Fujitsu ScanSnap as well as other scanners.

I decided to do a quick test comparing the OCR of the two packages using the following criteria:

Yes, I realize that Adobe Acrobat X is out, but since I am not aware of any scanners that come bundled with it yet, I decided to stick with the versions that ship with the ScanSnap. I’ll cover Acrobat X in a later post.

The Document

I scanned a magazine article for this test. It probably would have been better to do this with a bunch of different documents to compare, but hey.

In all cases except one, I scanned without OCR so that I could run it standalone later. Here’s some info on the document that I used:

Pages: 2

Scan Quality: 300dpi, Color

Resulting File Size: 1.5 MB

Columns: 2, with some images

Maybe I am blind, but I couldn’t figure out a way to run ABBYY FineReader for ScanSnap on Windows standalone. If you know how, please leave a message in the comments. For that test (the one exception mentioned above), I re-scanned with “Create Searchable PDF” checked in the ScanSnap Manager settings.

The Settings

I tried not to do too many fancy settings to keep things as “real-life” as possible. There were essentially three configurations:

ABBYY FineReader

I set Save Mode to “Text under page image” and Quality to High. These were the settings for ABBYY on the Mac, and I believe they are what ScanSnap Manager on Windows uses as well.

Adobe Acrobat (Normal)

I set the output style to “Searchable Image (Exact)” because leaving it just as Searchable Image in my experience has caused some weird things to happen with the resulting PDF. I used these settings on both Windows and Mac.

Adobe Acrobat (With ClearScan)

In Acrobat 9 there is a setting called ClearScan. I used that as an additional test to see what the difference is.

Speed

Windows

ABBYY Windows: 20.5 seconds

Acrobat 9: 13.9 seconds

Acrobat 9 With ClearScan: 17.6 seconds

Mac

ABBYY Mac: 44.7 seconds

Acrobat 8: 20.2 seconds

Winner: Acrobat!

Since they are different machines, you can’t directly compare the Windows and Mac times, but clearly in both cases Acrobat is faster.
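If you want to put a number on "faster", here is a small Python sketch that divides each time by the Acrobat time on the same machine. The timings are the ones listed above; the dictionary layout and the `slowdown` helper are just my own illustration, not anything that comes with either package.

```python
# Timings in seconds, copied from the results above.
timings = {
    "Windows": {"ABBYY": 20.5, "Acrobat 9": 13.9, "Acrobat 9 with ClearScan": 17.6},
    "Mac": {"ABBYY": 44.7, "Acrobat 8": 20.2},
}


def slowdown(platform_times, baseline):
    """Return how many times longer each package took than `baseline`."""
    base = platform_times[baseline]
    return {name: round(t / base, 2) for name, t in platform_times.items()}


# Compare only within a platform, since the two machines are different.
windows_ratios = slowdown(timings["Windows"], "Acrobat 9")
mac_ratios = slowdown(timings["Mac"], "Acrobat 8")

print(windows_ratios)
print(mac_ratios)
```

On these numbers, ABBYY took roughly one and a half times as long as Acrobat 9 on Windows and a bit more than twice as long as Acrobat 8 on the Mac.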

File Size

The non-OCR’ed PDF was 1.5 MB.

Windows

ABBYY Windows: 1.7 MB (+0.2 MB)

Acrobat 9: 1.5 MB (same)

Acrobat 9 With ClearScan: 315 KB (-1.16 MB)

Mac

ABBYY Mac: 1.4 MB (-0.1 MB)

Acrobat 8: 1.5 MB (same)

Winner: Acrobat 9 with ClearScan!

With an astonishing 1.16 MB reduction in file size after OCR, Acrobat 9 with ClearScan is the winner. Wow.
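For a sense of scale, here is a quick Python sketch that turns those sizes into percent changes against the 1.5 MB original. It assumes 1 MB = 1000 KB and works from the rounded figures above, so treat the percentages as approximate. The `percent_change` helper is my own naming.

```python
BASELINE_KB = 1500  # the 1.5 MB non-OCR'ed PDF, assuming 1 MB = 1000 KB

# Output sizes in KB, converted from the (rounded) figures above.
sizes_kb = {
    "ABBYY Windows": 1700,
    "Acrobat 9": 1500,
    "Acrobat 9 with ClearScan": 315,
    "ABBYY Mac": 1400,
    "Acrobat 8": 1500,
}


def percent_change(size_kb, baseline_kb=BASELINE_KB):
    """Percent change relative to the baseline (negative means smaller)."""
    return round((size_kb - baseline_kb) / baseline_kb * 100, 1)


changes = {name: percent_change(kb) for name, kb in sizes_kb.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Under those assumptions the ClearScan file is about 79% smaller than the original, while the other results all stay within roughly 15% of it.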

Accuracy

Here is a passage from the article:

Let’s see how each of the packages did:

ABBYY Windows

The spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, mostsignificantly in itswidespread use as a de facto reporting tool.
The appeal of the spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional services withMississauga, Ont.-basedProphixSoftware, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 9 Windows

T he spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary su·ategic plans to financial statements. As with any farniliar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of tlle spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably tlle most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want witll tlle data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 9 With ClearScan

The spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary su·ategic plans to financial statements. As with any farniliar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of tlle spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably tlle most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want witll tlle data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

ABBYY Mac

The spreadsheet has become the virtual “slide rule” for CiMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of die spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional sendees with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 8 Mac

T he spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything frorn preliminary strategic plans to financial statements. Aswith any familiar method, it finds its way into numerous situations where better alterna tives are available, most significantly in its widespread use as a de facto reporting tool.
T he appeal of the spreadsheet as the quickest
way to get a report out is not hard to appreciate.
“Excel is probably the most comfortable
environment for a lot of financial professionals,” avaJlaun:.:,JIIU:::’l;)It;IIIULauuy1111l::>WIUC::>PU:C1U uocd::>
a de facto reporting tool. T he appeal of the spreadsheet as the quickest
way to get a report out is not hard to appreciate. “Excel is probably me most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “T here’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organiza tions.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Winner: ABBYY FineReader for Mac looks the best to me. Acrobat 8 on the Mac is pretty terrible (in this example anyways).
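I judged accuracy by eye, but if you want something more objective, one common approach is to count the character edits needed to turn the OCR output into the correct text (the Levenshtein distance). Here is a minimal Python sketch on a single sentence from the passage. Since the original passage isn't reproduced as text in this post, the `reference` string is my assumption of the clean reading, and the function and variable names are just for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# ASSUMPTION: a clean reading of the sentence, used as the ground truth.
reference = (
    "The appeal of the spreadsheet as the quickest way to get a report out "
    "is not hard to appreciate."
)

# The same sentence as each package produced it in the samples above.
ocr_outputs = {
    "ABBYY Windows": "The appeal of the spreadsheet as the quickest way to get a report out is not hard to appreciate.",
    "Acrobat 9 Windows": "The appeal of tlle spreadsheet as the quickest way to get a report out is not hard to appreciate.",
    "ABBYY Mac": "The appeal of die spreadsheet as the quickest way to get a report out is not hard to appreciate.",
}

errors = {name: edit_distance(text, reference) for name, text in ocr_outputs.items()}
for name, count in errors.items():
    print(f"{name}: {count} character edits")
```

One sentence is far too small a sample to pick a winner. To get a meaningful score you would run the same count over the whole passage for every package.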

Conclusion

Is there a “best” choice? It seems that in this example anyways, Adobe Acrobat 9 with ClearScan turned on gives fast results with good OCR while dramatically reducing the file size.

If you don’t really care about speed so much, FineReader produces good OCR results and, for ScanSnap users, has the additional benefit of being integrated with ScanSnap Manager.

As with most things, the best software is the one that works the best for you. Have you found similar results? Any other tests of your own to share? Leave a note in the comments.

About the Author

Brooks Duncan helps individuals and small businesses go paperless. He's been an accountant, a software developer, a manager in a very large corporation, and has run DocumentSnap since 2008. You can find Brooks on Twitter at @documentsnap or @brooksduncan. Thanks for stopping by.

Leave a Reply:

Would you mind if I shared a few of your blog posts, as long as I give credit and link back to your website? My blog is in the same niche as yours, and my readers would definitely benefit from a lot of the information you present here. Please let me know if this is OK with you. Thanks!

Leave a Reply:

For some years (from time to time, not regularly) I have been reading your articles, and I enjoy them!

I have been working with DevonThink Pro Office with ABBYY FineReader integrated.

The OCR results were quite good. But I left DevonThink Pro Office (my son, who is an MBA engineer, did the same for the same reasons): we both had the unhappy experience of losing a lot of data when we had to reinstall OS X on our Macs or even just upgrade DevonThink Pro Office versions.

For about 2 years I have been using Evernote Premium, still with my old Fujitsu ScanSnap S510M.

Although Evernote has an integrated PDF OCR converter utility, I do use some other apps when I just want to convert PDF documents already on my iMac into searchable PDF or text documents.

I wouldn’t hesitate to buy ABBYY FineReader, PDFpen Pro from Smile, or PDF Converter from Nuance if there weren’t drawbacks:
1. more than 100 € is quite expensive, not compared to Adobe but to my income(!)
2. the resulting quality of the converted docs isn’t always convincing
3. the processing time is still quite long
(as you will know, the longest is with Evernote)

Time passed, and now in 2016 I wonder which program would be the best for me, my macs and my small purse.

Thank you very much in advance for your answer

Yours sincerely
Werner L. Ende 2016-04-07 12.07.54

PS: I am sorry if there are errors in my English text.

_______________________________________

Please consider:
English is my second language,
German is my first language.
_______________________________________

Leave a Reply:

Adobe should apply ClearScan automatically when you are doing batch OCR.
But what is the point of ClearScan? Almost every scanning program has a similar tool (applied already at scan time).
You can also lose important data with ClearScan, so you should use it with care.

Leave a Reply:

A simple two-page document is not complex enough to compare these programs. Scan a document with multiple pages (40-50) that contain tables with colored backgrounds, various text sizes and typefaces, and other embedded objects like pictures with text on them, and then compare. I am sure you'll see then which software is better. To me, your article is an absolute joke.

Leave a Reply:

I would suggest just cleaning it with warm water and a mild soap to remove any salt, and melt any ice that is stuck. You can soak it for a few minutes, then make sure it gets very dry, even between the pads.

Leave a Reply:

Awesome info… Nice to see perspective on how each one works before investing the time and money to figure it out on my own! For now, I use an online OCR service that I recently discovered that's offered by Ricoh Innovations. http://beta.rii.ricoh.com/betalabs/content/docume…

Leave a Reply:

I like Dave's suggestion about reducing the file size using Adobe Acrobat 8. However, Ed's use of DevonThink Pro Office is appealing as a "one-button" solution. I am concerned about the file sizes of the PDFs I scan. Is there a file reduction process similar to "optimize scanned PDF" in the DevonThink Pro Office software?

Leave a Reply:

It's worth noting that DevonThink Pro Office (a Mac-only unstructured database) has the ABBYY Reader built-in, and applies it automatically when a ScanSnap (or other document scanner) is set up to send the scan to DevonThink. So, once set up, you get a one-button, automated, OCRed PDF of every document you scan.

Leave a Reply:

On Mac OS X, I've found a good compromise between accuracy and file size. First, I OCR the scanned document with ABBYY FineReader, then open it in Adobe Acrobat 8 to run through the "Optimize Scanned PDF" process. I get similar file size reductions, and the excellent accuracy of FineReader, even though it's an extra step in the process.
