This guide shows how to convert a PDF file to plain text (TXT) with the pdftotext utility. pdftotext comes preinstalled on Ubuntu as part of the poppler-utils package. Thanks to the Poppler project and Glyph & Cog for providing this utility.

Converting As Is

pdftotext <pdf_file_name> <txt_file_name>

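Explanation: this command converts every page of pdf_file_name into a single text file, txt_file_name. If txt_file_name is omitted, pdftotext writes to a file with the same name as the PDF but with a .txt extension.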
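To apply the same command to many files, a small shell loop can help. The following is a minimal sketch, assuming a POSIX-compatible shell such as the one Ubuntu provides; the function name convert_all_pdfs and its directory argument are hypothetical and not part of pdftotext.

```shell
# convert_all_pdfs is a hypothetical helper, not part of poppler-utils.
# It converts every .pdf file in the given directory (default: the
# current directory) to a .txt file with the same base name.
convert_all_pdfs() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDF files.
        [ -e "$pdf" ] || continue
        # ${pdf%.pdf} strips the .pdf suffix so .txt can be appended.
        pdftotext "$pdf" "${pdf%.pdf}.txt"
    done
}

# Usage: convert_all_pdfs ~/Documents
```

Each output file lands next to its source PDF, so an existing .txt file with the same base name would be overwritten.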

Converting While Preserving the Original Text Layout

pdftotext -layout <pdf_file_name> <txt_file_name>

Explanation: the -layout option makes txt_file_name keep the physical layout of the original PDF text (columns, indentation, spacing) as closely as possible.
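Before writing a file, it can be handy to check how well the layout survives. The sketch below assumes that passing - as the output name sends the text to standard output, which is how pdftotext documents it; the function name preview_layout and its default of 20 lines are hypothetical choices for illustration.

```shell
# preview_layout is a hypothetical helper, not part of poppler-utils.
# It prints the first lines of the layout-preserving conversion to the
# terminal instead of writing a file.
preview_layout() {
    pdf="$1"
    lines="${2:-20}"   # number of lines to show, 20 by default
    # "-" as the output name sends the text to standard output.
    pdftotext -layout "$pdf" - | head -n "$lines"
}

# Usage: preview_layout report.pdf 40
```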

Converting PDF to HTML

pdftotext -htmlmeta <pdf_file_name> <html_file_name>

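Explanation: the -htmlmeta option converts pdf_file_name into a simple HTML file. The extracted text is wrapped in <pre> tags and the document's meta information is added as HTML headers, so the result is plain text inside an HTML wrapper rather than a styled page.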

Converting Only Particular Pages

pdftotext -f <number> -l <number> <pdf_file_name> <txt_file_name>

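Explanation: the -f option sets the first page to convert and the -l option sets the last page; only the pages in that range are written to txt_file_name. To convert a single page, pass the same number to both options.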
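The page-range command can be wrapped in a small function that rejects an inverted range before pdftotext is called. This is only a sketch under that assumption; the name convert_pages and the argument order are hypothetical and not part of pdftotext.

```shell
# convert_pages is a hypothetical helper, not part of poppler-utils.
# Arguments: first page, last page, input PDF, output text file.
convert_pages() {
    first="$1"; last="$2"; pdf="$3"; txt="$4"
    # Refuse an inverted range before calling pdftotext.
    if [ "$first" -gt "$last" ]; then
        echo "first page ($first) is after last page ($last)" >&2
        return 1
    fi
    pdftotext -f "$first" -l "$last" "$pdf" "$txt"
}

# Usage: convert_pages 3 7 report.pdf pages-3-to-7.txt
```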