Doc2txt is an rc(1) script that uses olefs and mswordstrings to
extract the printable text from the body of a Microsoft Word document
and write it on the standard output. Doc2ps is similar, but emits
PostScript corresponding to the document. Wdoc2txt is similar
to doc2txt, but uses plumb(1) to send the output to a
new acme(1) window instead. Xls2txt performs a similar function
for Microsoft Excel documents.

Microsoft Office documents are stored in OLE (Object Linking and
Embedding) format, which is a scaled-down version of Microsoft's
FAT file system. Olefs presents the contents of an MS Office document
as a file system on mtpt, which defaults to /mnt/doc. Mswordstrings
or msexceltables may then be used to parse
the files inside, extracting a text stream. Msexceltables may
be given options to control the formatting of its output.
-a       Attempt conversion of non-tabular sheets in the workbook (charts).
-ddelim  Set the inter-field delimiter to the string delim; the default
         is a single space.
-D       Enable debugging output.
-crange  Range is a comma-separated list of column numbers and ranges.
         The two ends of a range are separated by a dash. Processing is
         limited to the columns named; by default all columns are output.
-n       Disable field padding to column width.
-q       Disable quoting of textual fields (see quote(2)).
-t       Truncate fields to the column width.
-wrange  Range is a comma-separated list of worksheet numbers and ranges,
         in the same syntax as the -c option above, limiting which sheets
         are output. Suppressed chart pages are always included in the
         sheet count.
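
The range syntax accepted by the column and worksheet options can be
illustrated with a short sketch. This is not the msexceltables source,
which is not shown here; the function name parse_ranges is hypothetical,
and the sketch assumes that ranges are inclusive at both ends and that
items may appear in any order. Whether numbering starts at 0 or 1 is not
stated above, so the sketch leaves the numbers uninterpreted.

```python
def parse_ranges(spec):
    """Expand a spec such as "1,3-5" into a sorted list of integers.

    Assumption: a range "a-b" includes both a and b.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty item in range spec {spec!r}")
        if "-" in part:
            # A range: two numbers separated by a dash.
            lo_text, hi_text = part.split("-", 1)
            lo, hi = int(lo_text), int(hi_text)
            if lo > hi:
                raise ValueError(f"descending range {part!r}")
            selected.update(range(lo, hi + 1))
        else:
            # A single column or worksheet number.
            selected.add(int(part))
    return sorted(selected)


if __name__ == "__main__":
    print(parse_ranges("1,3-5,8"))  # prints [1, 3, 4, 5, 8]
```

Under these assumptions, a range argument of 1,3-5,8 would select items
1, 3, 4, 5 and 8, and everything else would be skipped.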