Re: Index the document

Ok - so these are image files that have not been OCRed, so their text is not in a readable format. Alfresco does not provide OCR tools out of the box, but there are plenty of options. You can integrate with another tool, like AWS Textract (I am not sure of your architecture - on premise or cloud, etc.). You can also use transformations to perform OCR with other tools.
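To give a rough idea of the Textract route, here is a minimal sketch in Python. It assumes boto3 is installed and AWS credentials are already configured; the function names `extract_lines` and `ocr_image` and the default region are my own placeholders, not anything Alfresco ships. How you then get the text back into Alfresco (REST API, a custom transformer, etc.) is a separate step that is not shown here.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def ocr_image(image_bytes, region_name="us-east-1"):
    """Send one image (e.g. PNG or JPEG bytes) to Textract and return its text.

    Assumption: credentials come from the usual boto3 sources
    (environment variables, shared config, or an instance role).
    """
    # Imported here so extract_lines can be used without boto3 installed.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    # Synchronous text detection on the raw bytes of a single image.
    response = client.detect_document_text(Document={"Bytes": image_bytes})
    return "\n".join(extract_lines(response))
```

You would call `ocr_image(open("scan.png", "rb").read())` for each image and then store the returned text wherever your indexing needs it.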

However, based on what you are trying to do, the best method might be a capture (ingestion) provider - like Ephesoft. These tools can be trained to find specific information (by zone or by surrounding text), run optical character recognition on it, and then either save it as full text or apply it to particular custom metadata fields.
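Just to illustrate the "surrounding text" idea, and not how Ephesoft or any other capture product actually does it, here is a small Python sketch that looks for a label in the OCR output and maps the value next to it to a metadata property. The property names (`acme:invoiceNumber`, `acme:invoiceDate`) and the label patterns are made up for the example; you would replace them with your own content model and document layout.

```python
import re

# Hypothetical custom properties mapped to the label text that precedes each value.
FIELD_PATTERNS = {
    "acme:invoiceNumber": re.compile(
        r"Invoice\s*(?:No\.?|Number|#)\s*:?\s*(\S+)", re.IGNORECASE
    ),
    "acme:invoiceDate": re.compile(
        r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE
    ),
}


def extract_metadata(ocr_text):
    """Return a dict of property name -> value for every label found in the text."""
    properties = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            # Keep only the captured value, not the label itself.
            properties[name] = match.group(1)
    return properties
```

The resulting dict is the kind of thing you could then set as properties on the node, while the full OCR text goes to the index. A real capture tool adds training, zones and confidence scoring on top of this, which is why I would still lean that way for anything beyond simple layouts.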

Either way, you will need another product to work in conjunction with Alfresco - or at least that is my experience.
