Recognize Text & Objects in Graphical Images with PHP

21 May 2008

An OCR in PHP? It doesn’t sound like a very common topic for PHP developers, but Andrey Kucherenko from Ukraine has started a very interesting project to build the first PHP OCR, phpOCR. His classes can recognize text in monochrome graphical images after a training phase. The training phase is necessary so that the class can build recognition data structures from images of known characters. Those data structures are then used during recognition to try to identify text in real images using the corner algorithm.

phpOCR won the PHPClasses innovation award of March 2006, and it shows the power of what can be implemented with PHP 5.

Certain types of applications need to read text from documents that are stored as graphical images. Scanned documents are the typical case.

An OCR (Optical Character Recognition) tool can be used to recover the original text that is written in scanned documents. These are sophisticated tools that are trained to recognize text in graphical images.

This class provides a base implementation for an OCR tool. It can be trained to learn how to recognize each letter drawn in an image. Then it can be used to recognize longer texts in real documents.
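
To make the train-then-recognize idea concrete, here is a minimal sketch in PHP. It is not phpOCR's API and it does not implement the corner algorithm: the class name TemplateOcr, its methods, and the '#'/'.' bitmap format are all hypothetical, and plain pixel-by-pixel template matching stands in for the real recognition step. It assumes PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical toy recognizer: NOT the phpOCR API and NOT its corner algorithm.
 * Glyphs are monochrome bitmaps given as rows of '#' (ink) and '.' (background).
 */
final class TemplateOcr
{
    /** @var array<string, string> label => flattened training bitmap */
    private array $templates = [];

    /** Training phase: remember the bitmap of a known character. */
    public function train(string $label, array $rows): void
    {
        $this->templates[$label] = implode('', $rows);
    }

    /** Recognition phase: return the label whose template differs in the fewest pixels. */
    public function recognize(array $rows): ?string
    {
        $pixels = implode('', $rows);
        $best = null;
        $bestDistance = PHP_INT_MAX;

        foreach ($this->templates as $label => $template) {
            if (strlen($template) !== strlen($pixels)) {
                continue; // only compare bitmaps of the same size
            }
            // Hamming distance: number of positions where the pixels differ.
            $distance = 0;
            for ($i = 0, $n = strlen($pixels); $i < $n; $i++) {
                if ($pixels[$i] !== $template[$i]) {
                    $distance++;
                }
            }
            if ($distance < $bestDistance) {
                $bestDistance = $distance;
                $best = (string) $label; // PHP turns numeric string keys into ints
            }
        }

        return $best; // null when nothing comparable has been trained
    }
}

// Train on three known 3x3 letters.
$ocr = new TemplateOcr();
$ocr->train('I', ['.#.', '.#.', '.#.']);
$ocr->train('L', ['#..', '#..', '###']);
$ocr->train('T', ['###', '.#.', '.#.']);

// A noisy "L" with one pixel missing from the bottom row.
echo $ocr->recognize(['#..', '#..', '##.']), "\n"; // prints "L"
```

A real tool would first have to split the scanned page into individual characters and normalize their size, which this sketch skips entirely.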

6 responses

While I appreciate the noble efforts of Andrey Kucherenko, image processing software written in PHP will never match the speed of a compiled language. This project is still quite young, and it takes a good deal of effort to get it to do anything useful. If you’re looking to get something working quickly, GOCR or Tesseract are much better options.