Hacking the Linux Desktop, Part 2

Editor's note: If you didn't get enough Linux tweaks last week from O'Reilly's Linux Desktop Hacks, here are two more hacks from the book to satiate your hacking needs.

View Microsoft Word Documents in a Terminal

Avoid the load time of OpenOffice.org and view
Microsoft Word documents in a terminal.

The simplest way to view a Microsoft Word
document in a terminal is to use the catdoc
command. But catdoc converts a Word document into plain
text, which does little or nothing to preserve the formatting of the
original. Obviously, it's nearly
impossible to view a Word document in a terminal exactly the way it
would look in Word. Heck, competing word processors have trouble
importing Word documents without upsetting the formatting, and they have
the advantage of being graphical desktop applications. But this hack
is still a vast improvement over the popular
catdoc program, because it preserves at least
some of the formatting of the original document by converting the
Word document to HTML.

You'll need both the wvWare set
of file conversion utilities and the hybrid web browser/pager
w3m, along with a little scripting magic to view
Word documents in a terminal or console while retaining at least some
of the original formatting.

wv, the All-Purpose Word Converter

To retain at least some of the original formatting while printing the
document to the screen, you need the set of utilities known as wvWare.
You can find the home page for wvWare at
http://wvware.sourceforge.net.
Packages of wvWare are readily available for almost all
Linux distributions, although the package name is usually just
wv. For example, if you don't
already have it installed on your system, you can install
wv in Debian Linux with this command:

# apt-get install wv

Users of the yum package manager can get the RPM version
of wv with this command:

# yum install wv

w3m, the All-Purpose Web Browser/Pager

That's not all you need for this hack. You also need a popular
pager/browser called w3m. Packages of
w3m should be available for most Linux
distributions, and the package name is usually
w3m. For example, you can install
w3m in Debian Linux with this command:

# apt-get install w3m

Users of the yum package manager can get the RPM version
of w3m with:

# yum install w3m
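
Before going further, it can help to confirm that both programs actually ended up on your PATH. The following is a minimal sketch, not part of the original hack; the function name missing_tools is made up for illustration, and it relies only on the standard command -v shell builtin:

```shell
# missing_tools is a hypothetical helper name, not part of wv or w3m.
# It prints the name of each listed command that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Prints nothing when both programs are installed.
missing_tools wvWare w3m
```

If the last line prints a name, install the corresponding package with one of the commands above.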

The w3m program is unusual in that it is a
web browser that works like a pager: you can pipe text
into w3m and use it to
simply page back and forth through the text. Some versions of
w3m even render graphics in a frame-buffer
console without having an X Window System desktop running.

You can combine the two utilities to get the desired result of
viewing a Word document in a terminal. Use
wvWare to convert a Microsoft Word document to
HTML format, and then pipe the output into the
w3m pager to view it. Here's
the full command you need to make it work (this command assumes
wvHtml.xml is stored in the
/usr/lib/wv directory, which might not be the
case on your Linux system):

$ wvWare -x /usr/lib/wv/wvHtml.xml document.doc | w3m -T text/html
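
Because the location of wvHtml.xml varies between systems, you may need to find it before the command above works. Here is a rough sketch of one way to look for it; the function name find_wv_config and the list of candidate directories are assumptions for illustration, not something wvWare defines:

```shell
# find_wv_config is a hypothetical helper name.
# It prints the path of the first wvHtml.xml found in the given directories
# and returns non-zero if none of them contains the file.
find_wv_config() {
    for dir in "$@"; do
        if [ -f "$dir/wvHtml.xml" ]; then
            printf '%s\n' "$dir/wvHtml.xml"
            return 0
        fi
    done
    return 1
}

# The candidate directories below are guesses; adjust them for your distribution.
find_wv_config /usr/lib/wv /usr/share/wv /usr/local/lib/wv /usr/local/share/wv \
    || echo "wvHtml.xml not found in the usual places" >&2
```

Whatever path this prints is the one to pass to the -x option of wvWare.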

That's a lot of typing every time you want to view a
Word document, so turn it into a script called
viewdoc to make it easier to use in the future.
Log in as root and use your favorite editor to create the following
script:

#!/bin/sh
wvWare -x /usr/lib/wv/wvHtml.xml "$1" 2>/dev/null | w3m -T text/html

Note the one subtle addition, 2>/dev/null.
This simply redirects any error messages to the twilight zone so that
they do not interfere with the presentation of the Word document.
Store it as /usr/local/bin/viewdoc and make the
script executable with this command:

# chmod +x /usr/local/bin/viewdoc

Now all you have to do to view a Word document in a text console or
terminal is issue this command:

$ viewdoc document.doc

Not only does this technique preserve at least some of the formatting
of a Word document, but hyperlinks are also live: you can
activate them to visit the URL from within the w3m viewer
you're using to view the document. Figure 7-3 shows an example of a Word document viewed
with w3m. Note both the bold headings and the
live link to http://www.bootsplash.de/files.
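
If you would rather not install a script system-wide as root, the same pipeline could live in a shell function in your own startup file (for example ~/.bashrc). The sketch below is one possible variant, not the book's script: the WV_CONFIG variable is a made-up override for the configuration path, and the argument checks are additions.

```shell
# WV_CONFIG is a hypothetical override variable, not something wvWare reads itself.
# The default path matches the one assumed earlier and may differ on your system.
WV_CONFIG="${WV_CONFIG:-/usr/lib/wv/wvHtml.xml}"

viewdoc() {
    # Require exactly one argument: the document to view.
    if [ "$#" -ne 1 ]; then
        echo "usage: viewdoc document.doc" >&2
        return 2
    fi
    # Fail early with a clear message if the file cannot be read.
    if [ ! -r "$1" ]; then
        echo "viewdoc: cannot read $1" >&2
        return 1
    fi
    # Discard converter messages so they do not clutter the rendered page.
    wvWare -x "$WV_CONFIG" "$1" 2>/dev/null | w3m -T text/html
}
```

Quoting "$1" keeps file names containing spaces intact, and the function is invoked exactly like the script: viewdoc document.doc.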