In addition to replacing Microsoft Windows smart quotes, as sgaston demonstrated on 2006-02-13, I replace all other Microsoft Windows characters using suggestions[1] published by character code specialist[2] Jukka Korpela.

When having to deal with parsing an IIS4 or IIS5 metabase dump I wrote a simple function for converting those MS hexidecimal values into their ascii counter parts. Hopefully someone will find use for it.

Note that chr(10) is a 'line feed' and chr(13) is a 'carriage return' and they are not the same thing! I found this out while attempting to parse text from forms and text files for inclusion as HTML by replacing all the carriage returns with <BR>'s only to find after many head-scratchings that I should have been looking for line feeds. If anyone can shed some light on what the difference is, please do.

If you're planning on saving text from a form into a database for later display, you'll need to apply the following function so that it gets saved with the proper HTML tags.

<?php
$text = str_replace ( chr(10), "<BR>", $text );
?>

When you want to plug it back into that form for editing you need to convert it back.

I spent hours looking for a function which would take a numeric HTML entity value and output the appropriate UTF-8 bytes. I found this at another site and only had to modify it slightly; so I don't take credit for this.

Unicode version of chr() using mbstring<?phpfunction unichr($u) { return mb_convert_encoding(pack("N",$u), mb_internal_encoding(), 'UCS-4BE'); }?>It returns a string in internal encoding (possibly more than one byte).