Wednesday, April 2, 2014

[ Java Package ] PDFBox - Extract text from PDF file

Preface: The Apache PDFBox™ library is an open source Java tool for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents, and extraction of content from documents. Apache PDFBox also includes several command line utilities. Apache PDFBox is published under the Apache License v2.0. This post looks at how to use this library to output the text content of a PDF file.