How to Convert Word to HTML

Suggestions from 2006

1. Use the program HTML Tidy with its word2000 option

2. Find & Replace in Word: use the Find function to find italicized text, then replace it with <em>^&</em>. Then use a Word macro, or a script in your favorite text editor, to replace all the 'smart' quotation marks, character encodings, etc.

3. Save as an RTF file in Word, then use DocFrac to convert it to HTML.
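
Suggestion 2 leaves the smart-quote cleanup to a macro or script. A minimal sketch of such a script in Python follows. The mapping table and the function name plain_punctuation are illustrative assumptions, not part of the original suggestion, and the list of characters is not exhaustive.

```python
# Map Word's "smart" punctuation to plain ASCII equivalents.
# This table is an assumption for illustration; extend it as needed.
SMART_PUNCTUATION = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark / apostrophe
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # horizontal ellipsis
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(SMART_PUNCTUATION)


def plain_punctuation(text):
    """Return text with smart punctuation replaced by ASCII characters."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(plain_punctuation("\u201cIt\u2019s done\u201d"))
```

If the target page is served as UTF-8, the same table could map to HTML entities such as &rsquo; instead of ASCII; which is preferable depends on the site's conventions.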

Suggestions from 2002

I don't know if this was in the discussion, but it should be archived.
An Office HTML Filter can be downloaded from:

1. Install the HTML Filter software.
2. Open the document you want to save as HTML and filter.
3. On the File menu (in Word), point to Export To and click Compact
HTML (you can also select a CSS file).

There are still some MS tags after filtering, but not nearly as many.
Macromedia Dreamweaver also filters doc tags, but it is very expensive
software.
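
The MS tags that survive filtering are usually Office namespace elements (such as <o:p>), Mso class names, mso- style declarations and conditional comments. A rough sketch of removing them with regular expressions is shown here. The function name strip_word_markup is hypothetical, and regular expressions are a blunt tool for HTML, so treat this as a starting point and not a complete cleaner.

```python
import re


def strip_word_markup(html):
    """Remove common Word-specific residue from an HTML string."""
    # Drop conditional comments such as <!--[if gte mso 9]>...<![endif]-->
    html = re.sub(r"<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", html,
                  flags=re.DOTALL)
    # Drop Office namespace tags (<o:p>, </o:p>, <v:...>, <w:...>),
    # keeping whatever text sits between them.
    html = re.sub(r"</?[ovw]:[^>]*>", "", html)
    # Drop class attributes whose value starts with "Mso".
    html = re.sub(r'\s+class="?Mso[A-Za-z0-9]*"?', "", html)
    # Drop any style attribute containing an mso- declaration.
    # Note: this removes the whole attribute, including non-mso rules.
    html = re.sub(r'\s+style="[^"]*mso-[^"]*"', "", html)
    return html


if __name__ == "__main__":
    sample = ('<p class="MsoNormal" style="mso-margin-top-alt:auto">'
              'Hi<o:p></o:p></p>')
    print(strip_word_markup(sample))
```

Running the cleaned result through a validator afterwards is a sensible check, since a pattern-based pass can leave unbalanced markup in unusual documents.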

Sally

Tidy [1] does a good job of fixing bad MS HTML.
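
For anyone wanting to script this, here is a hedged sketch of calling Tidy from Python. It assumes a tidy executable is on the PATH and uses the word-2000, bare and clean configuration options; check the option names against your installed version. The function names and file names are placeholders.

```python
import subprocess


def tidy_command(src, dest):
    """Build a Tidy command line for cleaning Word-exported HTML."""
    return [
        "tidy",
        "-quiet",               # suppress non-essential output
        "--word-2000", "yes",   # strip surplus markup from Word 2000 HTML
        "--bare", "yes",        # strip Microsoft-specific HTML
        "--clean", "yes",       # replace presentational markup with CSS
        "-output", dest,        # write the result to dest
        src,
    ]


def run_tidy(src, dest):
    """Run Tidy and return its exit status.

    Tidy exits with 0 on success, 1 if there were warnings and
    2 if there were errors. Raises FileNotFoundError if the tidy
    executable cannot be found.
    """
    result = subprocess.run(tidy_command(src, dest), check=False)
    return result.returncode


if __name__ == "__main__":
    # "word-export.html" and "clean.html" are placeholder file names.
    print(run_tidy("word-export.html", "clean.html"))
```

An exit status of 1 is normal for Word output, since Tidy warns about most of what it repairs.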

Also, OpenOffice [2] seems to be able to open many MS documents, and
although its 'save as HTML' is also fairly crap, at least it's free.

Chris

[1] http://tidy.sf.net/

[2] http://www.openoffice.org/

I never use Word anymore - too many of these kinds of headaches. But
Abiword appears to give good, clean HTML output from a doc or rtf
file.

http://www.abiword.com/

Chris

HoT MetaL Pro is a great choice too. Unfortunately it is quite an
expensive tool.

Its current version is 6.0.3, but even 4.0, which I received as a
freebie in a Norwegian computer magazine, works fine (I've tried Open
Office too, but I like HoT MetaL better). A 30-day trial version can
be downloaded from www.hotmetalpro.com, though.

Mathais

I always do it in quite a simple manner.

1. Run M$ Word and copy the whole text to the clipboard
2. Run M$ WordPad, paste, and then cut it once more
3. Run M$ FrontPage and paste

That's all. This allows me to keep any italics and bold that were
originally in the text. Besides that, I like FrontPage because the
transformation to MIA CSS goes smoothly there and it doesn't add any
rubbish HTML tags, does it?

Greetings,
Wojtek

Oh yes. I had a hell of a time with FrontPage a while back. Everything
looks OK if you don't venture outside of Microsoft reality, but once
you do, your world falls apart. The problem is that if one is not
careful when creating a page with FrontPage, viewing it properly is
contingent on having Word installed. I would get all the Word crap out
and FrontPage would automatically insert it again.

A few big problems:

1) mso fonts being a pain or impossible to get rid of.
2) Changing the CSS (MIA) does not get rid of the Microsoft CSS.
3) Microsoft CSS used ActiveX to display properly arrrrghhhhhhhhhhh

:-) nate

====>True, but only in the case of DIRECT transformation from Word to
FrontPage. If you want to do this properly, then just be sure that you
don't omit the cut-and-paste in WordPad. If afterwards something is
not OK with the font or its size, then just select the options
"default font" and "normal" respectively from their menus. If you want
to know whether it's possible to do this, then just look at the code
of the things published in the Polish Section of MIA - they're all in
FrontPage. I like this programme so much because of the easy way in
which MIA CSS can be inserted - just point at the paragraph and, for
example, select "quote" from the list, and all is done! Although this
is a product of one of the most vicious capitalist firms, it is a
relatively good product. :)