Has any work been done on using Perl to recognize document types or to pull specific information out of documents?

For instance, I might want to find certain diagnostic results in medical documents. I am looking to build a system that can recognize key phrases in a document that has been processed by OCR and pull names, dates, and other information from it. The system also needs to be able to learn to some extent.