A few days ago I needed to fetch an XML document from a URL and load it into an XmlDocument. As usual, I used WebClient to retrieve the document over HTTP, and it looked like this:

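// Requires: using System.Net; using System.Xml;
using (WebClient client = new WebClient())
{
    // Download the whole document as a string, then parse the string.
    string xml = client.DownloadString("http://example.com/doc.xml");
    XmlDocument doc = new XmlDocument();
    doc.LoadXml(xml);
}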

I ran the function and got this very informative XmlException message: Data at the root level is invalid. Line 1, position 1. I’ve seen this error before, so I knew immediately what the problem was. The XML document retrieved from the web had three strange characters at the very beginning. It looked like this:

ï»¿<?xml version="1.0" encoding="utf-8"?>

Of course that results in an invalid XML document, and that’s why it threw the exception. The three characters are the bytes 0xEF 0xBB 0xBF, the preamble (byte order mark) of the UTF-8 encoding used by the document, decoded as if they were ordinary characters.
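To make the preamble less mysterious, here is a minimal sketch that reproduces the three characters without any network call. The class and method names are made up for illustration, and decoding as ISO-8859-1 is only my assumption about how the bytes ended up as text; the actual fallback encoding depends on the machine and the response headers.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical demo class for illustration; not part of the original code.
public static class PreambleDemo
{
    // The UTF-8 preamble (byte order mark): 0xEF 0xBB 0xBF.
    public static byte[] Utf8Preamble()
    {
        return Encoding.UTF8.GetPreamble();
    }

    // Decode bytes as ISO-8859-1, where every byte maps to exactly one character.
    // Assumption: this stands in for whatever single-byte encoding was used by mistake.
    public static string DecodeAsLatin1(byte[] bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(bytes);
    }

    // Returns true if LoadXml accepts the string, false if it throws an XmlException.
    public static bool LoadXmlSucceeds(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }
}
```

Calling PreambleDemo.DecodeAsLatin1(PreambleDemo.Utf8Preamble()) should give the three characters ï»¿, and PreambleDemo.LoadXmlSucceeds should return false for a document string that starts with them.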

As I said, I knew this error, and I also knew an easy way around it that still uses WebClient. Instead of retrieving the document as a string and loading it with the LoadXml method, the easiest way is to retrieve the response stream and pass it to the Load method of the XmlDocument. That way the XML parser reads the raw bytes and handles the preamble itself. It could look like this:

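// Requires: using System.IO; using System.Net; using System.Xml;
using (WebClient client = new WebClient())
using (Stream stream = client.OpenRead("http://example.com/doc.xml"))
{
    // Load directly from the response stream so the parser sees the raw bytes.
    XmlDocument doc = new XmlDocument();
    doc.Load(stream);
}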
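If you want to check the stream approach without hitting a real URL, a sketch like the following might help. It fakes the server response with a MemoryStream; the class and method names are hypothetical, and the sample document is just a placeholder.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical demo class for illustration; not part of the original code.
public static class StreamLoadDemo
{
    // Builds the bytes a server might send: the UTF-8 preamble followed by the UTF-8 encoded XML.
    public static byte[] WithUtf8Preamble(string xml)
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes(xml);
        byte[] all = new byte[preamble.Length + body.Length];
        Buffer.BlockCopy(preamble, 0, all, 0, preamble.Length);
        Buffer.BlockCopy(body, 0, all, preamble.Length, body.Length);
        return all;
    }

    // Loads the document from a stream over the raw bytes and returns the root element name.
    public static string RootNameFromStream(byte[] bytes)
    {
        XmlDocument doc = new XmlDocument();
        using (Stream stream = new MemoryStream(bytes))
        {
            doc.Load(stream);
        }
        return doc.DocumentElement.Name;
    }
}
```

For example, StreamLoadDemo.RootNameFromStream(StreamLoadDemo.WithUtf8Preamble("<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>")) should load without an exception even though the bytes start with the preamble.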

Often there are situations where the WebClient isn’t well suited for this or one might simply prefer to use the WebRequest and WebResponse classes. Still, the solution is very simple. Here is what it could look like: