Why does this code keep triggering the SAXParseException: "PI must not start with xml"?


This code generates an XML document from its String representation. It works fine in my small unit tests, but fails on my actual XML data. The line that triggers the exception is Document doc = db.parse(is);

Any ideas?

public static Document FromString(String xml)
{
    // from http://www.rgagnon.com/javadetails/java-0573.html
    try
    {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(xml));

        Document doc = db.parse(is);
        doc.normalize();

        return doc;
    }
    catch (Exception e)
    {
        Log.WriteError("Failed to parse XML", e, "XML.FromString(String)");
        return null;
    }
}

There are 6 answers

5
Kurru On BEST ANSWER

Thanks for your help everyone.

I discarded the <?xml version="1.0" encoding="utf-8"?> declaration, which cleared this error. I still don't understand why that was the cause, but it worked nonetheless.

I went on to find one of my buffered writers (when extracting from a zip file into memory) wasn't being flushed, which was causing the xml string to be incomplete.

Thanks everyone for your help!
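
For anyone wondering why removing the declaration helped: the XML declaration is only recognized at the very start of the document. If anything precedes it (even a single space), the parser reads <?xml ...?> as an ordinary processing instruction whose target is the reserved name "xml", which is what this error complains about. Below is a minimal sketch of that behaviour, assuming a standard JAXP parser; the class and method names are made up for illustration, and the exact error message differs between parser implementations.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XmlDeclarationDemo {
    // Hypothetical helper: true if the string parses, false if the parser rejects it.
    static boolean parses(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // SAXParseException is a subclass of SAXException
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String clean = "<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>";
        String padded = " " + clean; // one stray space before the declaration

        System.out.println(parses(clean));         // declaration comes first: accepted
        System.out.println(parses(padded));        // declaration is no longer first: rejected
        System.out.println(parses(padded.trim())); // leading whitespace removed: accepted
    }
}
```

So instead of deleting the declaration, it is usually enough to make sure nothing comes before it. Note that trim() only removes whitespace; other stray characters need separate handling.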

0
Saran On

You should have checked the encoding of the file instead of discarding the xml line.

I have found that my Eclipse (on Windows) had the same problem with a resource encoded as Unix-U8. After converting it to DOS-U8, the error went away.

0
jowett On

As @StaxMan said, remove any unknown characters before the first "<":

responseBody = responseBody.substring(responseBody.indexOf("<"));
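
One caveat with the one-liner above: indexOf returns -1 when the string contains no "<" at all, and substring(-1) then throws a StringIndexOutOfBoundsException. A slightly more defensive sketch could look like this (the class and method names are hypothetical, chosen only for illustration):

```java
class XmlPrefixCleaner {
    // Hypothetical helper: drops everything before the first '<'.
    // If there is no '<' at all, the input is returned unchanged
    // so that the parser can report the real problem.
    static String dropLeadingJunk(String responseBody) {
        int start = responseBody.indexOf('<');
        return start >= 0 ? responseBody.substring(start) : responseBody;
    }

    public static void main(String[] args) {
        String raw = "\uFEFF \n<?xml version=\"1.0\"?><root/>";
        System.out.println(dropLeadingJunk(raw));
    }
}
```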

0
Jorgesys On

This issue can also be caused by having the line <?xml version="1.0" encoding="UTF-8"?> on the same line as the XML data:

<?xml version="1.0" encoding="UTF-8"?><secciones><seccion><id>0</id><nombre>Portada<feedURL>http://iphone.elnorte.com/libre/online07/a ....

0
shaobin0604 On

Check whether your XML file starts with a BOM (byte order mark) header.
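
When a UTF-8 file with a BOM is decoded into a Java String, the BOM usually survives as the single character U+FEFF at index 0, so the XML declaration is no longer the first thing in the document. A small sketch of how one might detect and drop it before parsing (the class and method names are made up for illustration):

```java
class BomStripper {
    // U+FEFF is what a byte order mark becomes once the bytes are decoded to chars.
    static final char BOM = '\uFEFF';

    static boolean hasBom(String xml) {
        return !xml.isEmpty() && xml.charAt(0) == BOM;
    }

    // Returns the string without a leading BOM; other input is returned unchanged.
    static String stripBom(String xml) {
        return hasBom(xml) ? xml.substring(1) : xml;
    }

    public static void main(String[] args) {
        String withBom = BOM + "<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>";
        System.out.println(hasBom(withBom));                   // true
        System.out.println(stripBom(withBom).startsWith("<?xml")); // true
    }
}
```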

0
Timo Bakx On

I had the same problem while parsing XML generated by PHP. After I added the Content-Type header "text/xml", it worked like a charm.