User Contributed Notes 6 notes

The XML parser converts the text of an XML document into UTF-8, even if you have set a different character encoding for the XML, for example as the second parameter of the DOMDocument constructor. After the XML has been parsed with load(), all of its text is held as UTF-8.

If you append text nodes containing special characters (e.g. umlauts) to your XML document, you should therefore convert the text to UTF-8 before appending it, for example with utf8_encode() if the text is ISO-8859-1. Note that utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding() is the usual replacement. Otherwise you will get an error message like "output conversion failed due to conv error" when you call save(). See example below:
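A minimal sketch of the conversion step, assuming the incoming text is ISO-8859-1 and that the mbstring extension is available. The variable names and the sample string are made up for illustration, and mb_convert_encoding() is used in place of utf8_encode() because the latter is deprecated in current PHP versions.

```php
<?php
// Hypothetical input: "Müller" encoded as ISO-8859-1 (0xFC is "ü").
$latin1 = "M\xFCller";

// Convert to UTF-8 before handing the text to the DOM.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('name'));
$root->appendChild($doc->createTextNode($utf8));

// saveXML() is used here so the sketch needs no file;
// save($filename) serializes the same document to disk.
$xml = $doc->saveXML();
```

Passing $latin1 directly to createTextNode() is the case that typically triggers the conversion error, since the byte 0xFC on its own is not valid UTF-8.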

This is a small contribution about using DOMDocument::save() together with PHP's serialize() function.

Sometimes you need to serialize a PHP object before putting it into an XML structure (or something similar), and then save that XML structure to a file. Later, of course, you will need to read the file and unserialize the data before you can use its content.

The problem appears when the object to be serialized has properties declared as protected.

As the PHP documentation explains, serialize() prefixes the names of protected properties with an asterisk that has a NUL character on each side. In this case DOMDocument::save() stops writing the string just before the first NUL character it encounters (NUL is not a valid XML character, and the underlying library treats it as the end of the string). After that, unserializing the content of the file is impossible.

Before the unserialize:

$data_serial = explode("\x5C\x30\x2A\x5C\x30", $serialized_object); // split on the literal '\0*\0' string
$serial_object = implode("\x00\x2A\x00", $data_serial); // and rejoin with 'NUL * NUL', as written by the earlier serialize()

And a second solution:

For the serialize, before the DOMDocument::save():

$serialized_object = addslashes(serialize($object));
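The round trip can be sketched as follows. This is only an illustration under stated assumptions: the Account class and the temporary file are hypothetical, and stripslashes() is used on the way back as the inverse of addslashes(), which is my choice rather than something the note states. addslashes() also escapes quotes and backslashes, so undoing only the '\0*\0' sequence may not be enough for every object.

```php
<?php
// Hypothetical class with a protected property; serialize() stores its
// name as "\0*\0owner", which cannot be written to XML as-is.
class Account
{
    protected $owner = 'O\'Hara "admin"';

    public function owner()
    {
        return $this->owner;
    }
}

// Escape NUL bytes (and quotes/backslashes) before building the XML.
$payload = addslashes(serialize(new Account()));

$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->appendChild($doc->createElement('cache'));
$root->appendChild($doc->createTextNode($payload));

// Temporary file used only for this sketch.
$file = tempnam(sys_get_temp_dir(), 'dom');
$doc->save($file);

// Later: read the file back and undo the escaping before unserialize().
$loaded = new DOMDocument();
$loaded->load($file);
$restored = unserialize(stripslashes($loaded->documentElement->textContent));
unlink($file);
```

Encoding the serialized string with base64_encode() and decoding it with base64_decode() would be another way to keep NUL bytes out of the XML, at the cost of a less readable file.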

When creating a DOMDocument from scratch and saving it, the encoding will be UTF-8, although it's declared as ISO-8859-1.

When loading an XML file declared and saved as ISO-8859-1, PHP will keep the correct encoding when the file is saved after changes are made.

PHP converts files declared as ISO-8859-1 to UTF-8 internally. To add text containing special characters, the text must be encoded as UTF-8. When the document is saved, special characters are converted back to ISO-8859-1.
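A small sketch of that behaviour for a loaded document, assuming a made-up ISO-8859-1 input string (loadXML() and saveXML() stand in for load() and save() so no file is needed):

```php
<?php
// Hypothetical ISO-8859-1 document: 0xFC is "ü", 0xDF is "ß".
$latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
           . "<greeting>Gr\xFC\xDFe</greeting>\n";

$doc = new DOMDocument();
$doc->loadXML($latin1Xml);

// Inside the DOM the text is UTF-8, whatever the file declared.
$internal = $doc->documentElement->textContent;

// New text must therefore be UTF-8 as well ("\xC3\xA4" is "ä").
$doc->documentElement->appendChild($doc->createTextNode(" \xC3\xA4"));

// On output the document is converted back to its declared encoding.
$out = $doc->saveXML();
```

Here $internal holds the UTF-8 bytes, while $out contains the ISO-8859-1 bytes again, including the appended character.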

To save an XML document created from scratch, use fopen/fwrite and utf8_decode: