User Contributed Notes 19 notes

Instead of passing a URL, we can pass the XML content to this class (either you
want to use CURL, Socks or fopen to retrieve it first) and instead of using
array, I'm using separator '|' to identify which data to get (in order to make
it short to retrieve a complex XML data). Here is my class with built-in fopen
which you can pass URL or you can pass the content instead :

One note about magic quotes: magic_quotes_runtime needs to be disabled before parsing XML. It can cause strange errors during parsing. Just add the following at the top of your program:

set_magic_quotes_runtime(0);

If you need magic quotes you can use stripslashes or save the current magic quotes setting with get_magic_quotes_runtime() then disable and parse your XML and then restore the previous magic quotes setting.

Just improving a little bit on the code examples from tgrabietz and randlem below... everything in one pretty class, plus some checks in place so that the element data doesnt get split up (thanks to flobee on the xml_set_character_data_handler page)

<?php

/* Usage Grab some XML data, either from a file, URL, etc. however you want. Assume storage in $strYourXML;

This is very simple way to convert all applicable objects into associative array. This works with not only SimpleXML but any kind of object. The input can be either array or object. This function also takes an options parameter as array of indices to be excluded in the return array. And keep in mind, this returns only the array of non-static and accessible variables of the object since using the function get_object_vars().

To solve this, I had to push all of the open tags onto a separate stack, add any tag data to the tail end of an attributed multi-dimensional array, and then pop the tag name off of the stack once it was closed...

bbellwfu's code does not handle 'text nodes' properly.Consider the innards of a tag like <root>xxx<tag2/>yyy</root>The 'tagData' for root will be "xxxyyy" and you have lost all information about where "tag2" was in that sequence.

What this does is adds 'textnodes' as children of its containing parent, *in the right sequence* (rather like the internet browsers do it). This then lets you do some more sensible secondary work like recursively looking up internal references within the document...

This page has been a great help! I've adapted the examples below to make a class to parse an X(HT)ML file to a multidimensional array.

Spaces, tabs, breaks, etc. are included in the array as TEXT_NODE, since in XHTML they may be functional. A function to trim them can easily be added if so desired.

I've written my class to store multiple tags with the same name. In order to do this I've used nested arrays with the key NODES to store the order in which tags and data were parsed. These NODES arrays can be used as a blueprint to reconstruct the X(HT)ML part of the document in it's entirity, including formatting. (Doctype will have to be added for validity).

some findings that i would like to share: - <![cdata[my value here]]> (does not work on property value - xml file must be htmlentity based (if not using cdata) - xml line feed on node data seems to be double line feed on windows (still figuring why) - xml line feed on attribute value seems to be ignored...

I wanted to create a really simple XML parser, but I found the array management in xml_parse a bit daunting. So I flattened my XML and parsed it using string matching. It wouldn't be difficult to add xml depth (of 2 plus levels) by modifying the parsedXML array.