Dealing with accent marks

I haven't kept up with how to deal with accent marks. Most of the sites I work with are in French, and I've been manually changing every accented character, such as é, to its numeric character reference (the ampersand-and-number form, &#233;). Do I really still have to change these manually, or is there a magic phrase I can put into the HTML so that I don't have to worry about all the different accented characters?

In addition to what Tommy wrote, I personally prefer the named entity references, rather than the numbered ones, i.e. &eacute; rather than &#233; and &Egrave; rather than &#200; (mainly because they're easier to remember).
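For anyone who does want references rather than raw characters, the conversion doesn't have to be done by hand. Here is a minimal Python sketch of both styles, assuming the text is already a correctly decoded string; the function names `to_numeric_refs` and `to_named_refs` are made up for illustration. It leaves ASCII characters (including `&` and `<`) untouched, so it is not a substitute for normal HTML escaping.

```python
from html.entities import codepoint2name


def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#233;."""
    # "xmlcharrefreplace" is a standard codec error handler that emits &#NNN;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_named_refs(text: str) -> str:
    """Use a named reference where HTML 4 defines one, else fall back to decimal."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


print(to_numeric_refs("é È"))  # &#233; &#200;
print(to_named_refs("é È"))    # &eacute; &Egrave;
```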

While I wouldn't change content that was being sent to me, when I build pages myself I assume that, even with UTF-8 declared in the document, in the meta tag and on the server, someone somewhere will have some goofy machine that ignores all that for some reason and shows ??? instead. So, at least for things like headers, menus and footers, if not the content, I still use ASCII-only character references (the hex ones, actually).
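The hex form mentioned here can be generated the same way. A small sketch, assuming Python and a hypothetical helper name `to_hex_refs`; the menu label is made-up sample text. Because the result is pure ASCII, it reads the same under UTF-8, Windows-1252 or ISO 8859-1, which is the whole point of the belt-and-braces approach.

```python
import html


def to_hex_refs(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal reference such as &#xE9;."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


menu_label = "Présentation"  # sample text, not from the thread
encoded = to_hex_refs(menu_label)
print(encoded)  # Pr&#xE9;sentation

# Decoding the references gives back the original string.
assert html.unescape(encoded) == menu_label
```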

In addition to what Tommy wrote, I personally prefer the named entity references, rather than the numbered ones

That's fine, as long as you don't use XHTML and you're willing to take the (negligible) risk of problems in really old browsers.
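The XHTML caveat comes from XML itself, which predefines only five named entities (`amp`, `lt`, `gt`, `quot`, `apos`); anything else has to come from a DTD that the parser may not read. A rough illustration using Python's standard XML parser, which stands in here for a generic non-validating XML parser and not for any particular browser; `parses_as_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup: str) -> bool:
    """Return True if the markup is accepted by a plain XML parser with no DTD."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml("<p>&#233;</p>"))     # True: numeric references always work
print(parses_as_xml("<p>&eacute;</p>"))   # False: eacute is undefined without a DTD
print(parses_as_xml("<p>&amp;</p>"))      # True: one of XML's five built-in entities
```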

Originally Posted by Stomme poes

Fortunately, as I understand it, UTF-8 is supposed to be the default for a UA if it doesn't know which charset a site is using?

I think most browsers use Windows-1252 (or possibly ISO 8859-1) as the default encoding, since that's usually the encoding in point-and-click publishing tools used by non-savvy authors who don't set up their servers properly.
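To see what that fallback does to an undeclared UTF-8 page, here is a short Python sketch. It only simulates the decoding step, assuming the bytes on the wire are UTF-8 and the reader guesses a different charset; `decode_as` is a hypothetical helper.

```python
def decode_as(text: str, assumed_encoding: str) -> str:
    """Encode text as UTF-8, then decode the bytes with the assumed encoding."""
    return text.encode("utf-8").decode(assumed_encoding)


# In UTF-8, é is stored as the two bytes C3 A9.
assert "é".encode("utf-8") == b"\xc3\xa9"

# Read with the right charset, the character survives.
assert decode_as("é", "utf-8") == "é"

# Read as Windows-1252 or ISO 8859-1, each byte becomes its own character.
assert decode_as("é", "cp1252") == "Ã©"
assert decode_as("é", "latin-1") == "Ã©"
```

An ASCII-only character reference such as &#233; is unaffected by this, because its bytes mean the same thing in all three encodings.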