Thanks Jim. In my case ALL of my scans done with the old ScanSnap FI-5110EOX had no text (i.e. no OCR done), as the software I was using for the past 7+ years didn't easily support it without buying a separate OCR package and writing some sort of scripts. So they were all PDF image copies of the originals done using ScanTango.

I agree that IF you're scanning new material you will not need a pre/post OCR set of stages as it's all done at once. However, in my case I was converting older documents from the older scanner, and that is when I noticed the fuzziness, etc.

As for the speed of the S1300, even at 100DPI the speed appeared to be the same -- something akin to about 15-20 seconds per page (duplexed of course) -- I just checked the specs and it's 4 pages per minute so that would be about right.

I've got a standing Craigslist search for an S1500M and if I find one for perhaps $200-$250 that works fine I may decide to upgrade to the much faster device.

If you buy one on Craigslist... be sure to check the version of Acrobat that is included. Also... it would be best if you knew that the installation was de-authorized on the old machine. When I bought my first S1500M... Acrobat 9 was shipping and the ScanSnap shipped with A8. When I bought my second (for my daughter)... Acrobat X was shipping... and the ScanSnap was shipping with A9. It seems that Fujitsu has a license to ship a one-generation-old Acrobat. Given the price of Acrobat (about $450)... getting a one-gen-old version bundled in with the S1500M is still a great bargain.

I also did some PDF -> PDF+Text conversion... but not a lot. I honestly cannot say if significant image degradation occurred.

I'm coming in a bit late in this thread but the topic still seems relevant. I recently purchased the Pro Office version of DevonThink specifically because it has the OCR feature to use in conjunction with our recently bought ScanSnap S1300 (to replace our aging but still functioning ScanSnap FI-5110EOX -- no decent drivers for it on the Mac that would work seamlessly with DTPO). Anyway, we are slowly trying to get to be mostly paperless -- scanning bills, medical statements, importing bank e-statements directly (they're much smaller than the scanned equivalents), etc.

A few things I'm hesitant to destroy after scanning -- deeds in particular or papers from the local county/state office with an official seal -- and obviously NOT birth certs! What do you all do with those more 'official' docs?

By the way, for those of you who have the S1500, is the speed of scanning pretty decent? We are currently using the S1300 (smaller brother to the S1500) and it's pretty slow compared to our older FI-5110EOX ScanSnap, which was pretty darned fast (I think the S1500 is more or less a newer version of our old FI-5110). Luckily I picked this S1300 up from Craigslist for $150 so the out of pocket wasn't bad -- the old FI-5110 scanner was close to $800 about 7-8 years ago! Ouch!

As for DTPO -- I'm using one database on the 2nd Mac drive and have imported ALL of my docs into it, and I think I've converted most PDFs scanned by the old scanner into OCR'd "PDF + text" equivalents. Interestingly enough, the file sizes are HUGELY different between the originals and the OCR'd versions. With our old scanner I was using ScanTango at 200 DPI. It would create HUGE documents in the last few versions of OS X -- a 10 page doc could easily be 100 MB in size with no OCR -- and that same document after DTPO converted it was perhaps 1-2 MB in size -- a huge difference.

Anyway, my DTPO database is a bit over 25 GB in size, although it will probably shrink by a gig or two once I delete the original (larger) pre-OCR PDFs and empty the DTPO trashcan.
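A rough sanity check on that 100 MB figure: it is about what you would expect if each page were stored as an uncompressed color image. The sketch below is only a back-of-the-envelope estimate. It assumes US Letter pages (8.5 x 11 in), 24-bit color, and no compression at all. None of those settings are confirmed for ScanTango; they are just plausible guesses, and the function name is made up for illustration.

```python
def uncompressed_scan_mb(pages, dpi, width_in=8.5, height_in=11.0, bytes_per_pixel=3):
    """Estimate the raw (uncompressed) image size of a scanned document in MB.

    Assumptions (not taken from the thread): US Letter paper,
    24-bit color (3 bytes per pixel), no compression, 1 MB = 1,000,000 bytes.
    """
    # Pixel dimensions of one page side at the given resolution
    pixels_per_page = round(width_in * dpi) * round(height_in * dpi)
    return pages * pixels_per_page * bytes_per_pixel / 1_000_000


# Ten single-sided pages at 200 DPI
print(f"{uncompressed_scan_mb(10, 200):.1f} MB")  # 112.2 MB
```

Under those assumptions a 10 page document comes out at roughly 112 MB, so files around 100 MB point to little or no image compression in the original scans.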

A few questions for you all:

1) Is there a way to have newly scanned docs get filenames applied without a popup dialog box? Currently after OCR'ing is done I get a popup dialog asking for the filename + timestamp and DTPO will stop and wait until "OK" is pressed before continuing to OCR the next document in the queue.

2) I notice when converting a document to an OCR'd equivalent that occasionally the new OCR'd version of the document is a bit fuzzier than the original was -- is there a way to adjust this? There have been a few times where I've considered keeping the non-OCR'd version of the document because of this fuzziness.

Also -- now that you're making your databases, please ensure that you back them up from time to time! I'm personally using CrashPlan to back up my stuff offline (and encrypted of course).

With regard to the fuzziness you noticed on the post-OCR images, I think you've actually answered your own question earlier in your comment...

You noted that file sizes drop substantially after OCR. The reason for this is that the resulting PDF contains only a low-res image, intended primarily for use in verifying the accuracy of the OCR if there is any question later on. This makes your documents quick to index and small to store.

Also, with regard to managing certain official documents, we must accept that we really can't be truly paperless at this point. After scanning, my documents all get put in a 9x12x5 safe deposit box at my bank. I pay $50 a year for not having to worry about documents being destroyed at my home if anything were to happen.

With regard to the fuzziness you noticed on the post-OCR images, I think you've actually answered your own question earlier in your comment...

You noted that file sizes drop substantially after OCR. The reason for this is that the resulting PDF contains only a low-res image, intended primarily for use in verifying the accuracy of the OCR if there is any question later on. This makes your documents quick to index and small to store.

This is correct, and this behavior is also controllable. In DTPO, under Preferences->OCR, there is a setting to adjust the resolution of the converted scan; I think the default is 150dpi. However, there is a check box to leave the resolution coming off the ScanSnap unaltered if you wish to retain full image quality.
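To put a number on how much detail that resolution setting throws away, here is a small sketch. It takes the 200 DPI source from the earlier post and the 150 DPI default mentioned above (which is itself only a recollection, not a confirmed value). The function name is hypothetical and this is plain arithmetic, not anything DTPO exposes.

```python
def pixel_fraction_kept(source_dpi, target_dpi):
    """Fraction of the original pixels left after downsampling.

    Resolution scales in both width and height, so the pixel count
    scales with the square of the DPI ratio.
    """
    return (target_dpi / source_dpi) ** 2


print(pixel_fraction_kept(200, 150))  # 0.5625
print(pixel_fraction_kept(300, 150))  # 0.25
```

Going from 200 to 150 DPI keeps a bit over half the pixels, which is enough to make small print look softer. Resolution alone would only roughly halve the file size, though, so a drop from 100 MB to 1-2 MB presumably also involves much heavier image compression in the converted PDF.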