pdftohtml

NAME

pdftohtml - Portable Document Format (PDF) to HTML converter

SYNOPSIS

pdftohtml [options] PDF-file HTML-dir

DESCRIPTION

Pdftohtml converts Portable Document Format (PDF) files to HTML.

Pdftohtml reads the PDF file, PDF-file, and writes one HTML file
for each page, along with auxiliary images, into the directory
HTML-dir. Pdftohtml creates the HTML directory itself; if it
already exists, pdftohtml will report an error.
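
As a minimal sketch of the behavior described above, the shell function below checks for an existing output directory before invoking pdftohtml with the two positional arguments from the SYNOPSIS. The name convert_pdf is hypothetical and not part of pdftohtml; the up-front check only mirrors the error pdftohtml is documented to report.

```shell
# convert_pdf PDF-file HTML-dir
# Hypothetical wrapper (not part of pdftohtml): refuse to run when the
# output directory already exists, since pdftohtml expects to create it.
convert_pdf() {
    if [ -e "$2" ]; then
        echo "convert_pdf: $2 already exists" >&2
        return 1
    fi
    pdftohtml "$1" "$2"
}
```

For example, convert_pdf report.pdf report-html would run the conversion only if report-html does not exist yet.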

CONFIGURATION FILE

Pdftohtml reads a configuration file at startup. It first tries to find the
user’s private config file, ~/.xpdfrc. If that
doesn’t exist, it looks for a system-wide config file,
typically /usr/local/etc/xpdfrc (but this location can be
changed when pdftohtml is built). See the xpdfrc(5)
man page for details.
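
The lookup order above can be sketched as a small shell function. This only illustrates the documented order and is not pdftohtml's own code: the function name find_xpdfrc and the variable SYSTEM_XPDFRC are assumptions, and the real system-wide path is whatever was chosen when pdftohtml was built.

```shell
# SYSTEM_XPDFRC is an assumed stand-in for the build-time system-wide path.
SYSTEM_XPDFRC=${SYSTEM_XPDFRC:-/usr/local/etc/xpdfrc}

# find_xpdfrc: print the config file that the documented lookup order
# would select, or return 1 if neither file exists.
find_xpdfrc() {
    if [ -f "$HOME/.xpdfrc" ]; then
        # The user's private config file takes precedence.
        echo "$HOME/.xpdfrc"
    elif [ -f "$SYSTEM_XPDFRC" ]; then
        # Otherwise fall back to the system-wide file.
        echo "$SYSTEM_XPDFRC"
    else
        return 1
    fi
}
```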

OPTIONS

Many of the following options can be set with configuration file
commands. These are listed in square brackets with the
description of the corresponding command line option.

−f number

Specifies the first page to
convert.

−l number

Specifies the last page to
convert.

−z number

Specifies the initial zoom
level. The default is 1.0, which means 72 dpi, i.e., 1 point
in the PDF file will be 1 pixel in the HTML. Using ‘-z
1.5’, for example, will make the initial view 50%
larger.

−r number

Specifies the resolution, in
DPI, for background images. This controls the pixel size of
the background image files. The initial zoom level is
controlled by the ‘-z’ option. Specifying a
larger ‘-r’ value will allow the viewer to zoom
in farther without upscaling artifacts in the
background.

Treat all text as invisible. By
default, regular (non-invisible) text is not drawn in the
background image, and is instead drawn with HTML on top of
the image. This option tells pdftohtml to include the
regular text in the background image, and then draw it as
transparent (alpha=0) HTML text.

−opw password

Specify the owner password for
the PDF file. Providing this will bypass all security
restrictions.

−upw password

Specify the user password for
the PDF file.

−q

Don’t print any messages or errors. [config file:
errQuiet]

−cfg config-file

Read config-file in
place of ~/.xpdfrc or the system-wide config file.

−v

Print copyright and version information.

−h

Print usage information. (−help and
−−help are equivalent.)
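
To show how several of the options above combine, here is a hedged sketch of a wrapper that converts a page range. The function names convert_range and px_at_zoom are hypothetical, and the values 1.5 and 144 are arbitrary illustrative choices; only the flags themselves come from the option list above.

```shell
# convert_range PDF-file HTML-dir first last
# Hypothetical wrapper: convert pages first..last with the initial view
# 50% larger (-z 1.5), background images at 144 DPI (-r 144), and no
# messages or errors printed (-q).
convert_range() {
    pdftohtml -q -f "$3" -l "$4" -z 1.5 -r 144 "$1" "$2"
}

# px_at_zoom points zoom
# At zoom 1.0, 1 point in the PDF is 1 pixel in the HTML, so a length
# in points scales linearly with the zoom level.
px_at_zoom() {
    awk -v pt="$1" -v z="$2" 'BEGIN { print pt * z }'
}
```

For example, convert_range manual.pdf manual-html 2 5 would convert pages 2 through 5, and px_at_zoom 612 1.5 prints 918, the initial pixel width of a 612-point-wide page at that zoom level.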

BUGS

Some PDF files
contain fonts whose encodings have been mangled beyond
recognition. There is no way (short of OCR) to extract text
from these files.