Aside from using ADO, is there any other way of extracting data from a PDF file using VB6? I have successfully managed to open a PDF file using VB6 without using Acrobat Reader, but the next step in my project requires reading the PDF file and finding data in it. For example: which pages contain the word "cement"?

1 Answer

In general, you will need to rely on an external library. A pure VB solution (i.e. reading the file as raw bytes and parsing it yourself) is not something you can pull off in a week.

You can use Adobe Acrobat via automation. An example to get you started is http://www.freevbcode.com/ShowCode.asp?ID=7066. Note, however, that Adobe Reader is not sufficient; you really need the full Acrobat. There are other popular PDF reading libraries (e.g. poppler), but you might have a hard time using those from VB6.

As a general remark, your chances of success depend on what you mean by "extract". Simply put, PDF is a purely descriptive format without metainformation, i.e. the file contains instructions like "Put an A at (x1,y1); put 'foo' at (x2, y2)", etc. Finding a single word on a page is feasible, but reading tables or any other kind of structured information would require huge amounts of heuristics.
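
To make the "put text at a position" point concrete, here is a rough Python sketch (not VB6) that reads a heavily simplified, uncompressed page content stream. It only understands the PDF operators BT, Td, Tj, Tf and ET; the function name `placed_text` and the sample stream are made up for illustration. Real files usually compress their streams and use font-specific encodings, so treat this as an illustration of the format, not a working extractor.

```python
import re

# A token is either a (string) literal or any run of non-space, non-paren characters.
TOKEN = re.compile(r"\((?:[^()\\]|\\.)*\)|[^\s()]+")

def placed_text(stream):
    """Return (x, y, text) for each Tj operator in a simplified content stream."""
    x = y = 0.0
    operands = []
    out = []
    for tok in TOKEN.findall(stream):
        if tok == "BT":            # begin text object: position resets to the origin
            x = y = 0.0
            operands = []
        elif tok == "Td":          # move relative to the start of the current line
            dx, dy = float(operands[-2]), float(operands[-1])
            x, y = x + dx, y + dy
            operands = []
        elif tok == "Tj":          # show a string at the current position
            out.append((x, y, operands[-1][1:-1]))
            operands = []
        elif tok in ("Tf", "ET"):  # font selection / end of text: ignored here
            operands = []
        else:
            operands.append(tok)   # operands come before their operator
    return out

sample = "BT /F1 12 Tf 72 700 Td (A) Tj 100 -20 Td (foo) Tj ET"
for x, y, text in placed_text(sample):
    print(x, y, text)
```

All you get back is a list of strings with coordinates. Nothing in the stream says that two strings belong to the same table row or paragraph, which is where the heuristics come in.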

The best course of action is probably to try to get the data you want to extract in a better-suited format (plain text, XML, whatever).
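
Once the document is available as plain text, the "which pages show the word cement" question becomes a simple search. A minimal Python sketch, assuming the text export separates pages with form-feed characters (as the output of a tool like pdftotext typically does); the function name `pages_containing` is hypothetical:

```python
import re

def pages_containing(text, word):
    """Return 1-based page numbers whose text contains `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    pages = text.split("\f")  # assumption: one form feed between pages
    return [number for number, page in enumerate(pages, start=1)
            if pattern.search(page)]

exported = "Cement mix ratios\fsand and gravel only\f20 bags of cement\f"
print(pages_containing(exported, "cement"))
```

The same split-and-search logic is easy to port to VB6 with `Split` and `InStr` once you have the text file.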