Format Guide: HTML

If you have access to an HTML file of the open textbook you’d like to edit, you have options. Guidance for three of those options is included in this format guide: 1) using an HTML editor, 2) importing HTML into Pressbooks, and 3) converting HTML to a PDF so that you can then extract the contents into Microsoft Word.

Using an HTML Editor

There are many HTML editing programs. Which one you select will depend on the editor’s user-friendliness and your fluency with HTML. To work in an HTML editor, you must first save the original content as an HTML file so that you can open or upload it in the editor. Wikipedia offers an HTML editor comparison chart.
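
If the textbook is only available online, one way to get a local HTML file is to download the page yourself. The following is a minimal sketch in Python using only the standard library. The function name `save_page_as_html` and the file names are hypothetical, chosen for illustration. It saves a single page exactly as served and does not fetch linked images, stylesheets, or other pages.

```python
from pathlib import Path
from urllib.request import urlopen


def save_page_as_html(url: str, destination: str) -> Path:
    """Download one page and write its raw bytes to a local file.

    Only the page itself is saved; images, stylesheets, and linked
    pages are not downloaded.
    """
    with urlopen(url) as response:
        data = response.read()
    path = Path(destination)
    path.write_bytes(data)  # keep the bytes untouched so the encoding is preserved
    return path


# Hypothetical usage (replace the URL with the page you want to edit):
# save_page_as_html("https://example.org/textbook/chapter-1.html", "chapter-1.html")
```

The saved file can then be opened in whichever HTML editor you choose.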

Importing HTML into Pressbooks

Consult the Open Textbook Repository – Import Process by Shane Nackerud and Eric Wigham at the University of Minnesota for step-by-step guidance on importing openly licensed HTML into Pressbooks. The instructions assume that you’ve already created a place within Pressbooks to put your files. After you import the content into Pressbooks, cleanup will be required.

Converting to PDF

It’s possible to convert an HTML textbook to a PDF and capture the table of contents, document structure, and other formatting. Once you have a PDF, you can convert or extract the contents to an editable Microsoft Word file. Adobe offers instructions for this process.

You can convert individual pages without Adobe Acrobat Pro DC by selecting the content and right-clicking, but if you want the full site captured as a single concatenated document, you’ll need Acrobat Pro.

To capture an entire website as a PDF using Acrobat Pro:

Open Acrobat Pro

Press Shift+Ctrl+O

Click on the “Capture Multiple Levels” icon and select the “Get entire site” radio button

NOTE: Make sure you have enough free hard drive space before you start. Capturing an entire site can take some time and disk space to complete.

You can also save any single web page as an HTML file and then open it directly in Microsoft Word for editing, but the layout isn’t preserved as well as when you use Acrobat Pro to create a PDF. This option works best for quickly editing pages that have no embedded images. Credit: Monica Marlo, Portland Community College.
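
To judge whether a saved page is a good candidate for the Microsoft Word route, it can help to check how many embedded images it contains. The sketch below is one possible approach using Python's standard `html.parser` module. The names `ImageCounter` and `count_images` are hypothetical. It counts only `<img>` tags, so images supplied through CSS backgrounds or other elements are not detected.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Counts <img> tags encountered while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.image_count = 0

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag names, so <IMG> is counted as well.
        if tag == "img":
            self.image_count += 1


def count_images(html_text: str) -> int:
    """Return the number of <img> tags found in html_text."""
    parser = ImageCounter()
    parser.feed(html_text)
    parser.close()
    return parser.image_count


# Hypothetical usage with a file saved from the web:
# with open("chapter-1.html", encoding="utf-8") as f:
#     print(count_images(f.read()))
```

A count of zero suggests the page should open in Word with little layout loss. A higher count suggests the PDF route may preserve the page better.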