Skipping HTML Content in Tag Attributes


I am using a SAX parser to parse the following piece of data, in which the "Description" attribute contains HTML content. However, I am getting the error "The value of attribute "Description" associated with an element type "null" must not contain the '<' character".

How can I make the SAX parser ignore this tag during XML processing?

<Thread ThreadID="22" Title="google"
                    Description="<a href="http://google.com/">http://google.com/</a>"
                    DisplayName="Sam" LoginID="hjaja" UserEmailID="abx@ers"
                    UserSapCode="12345"
                    IsAnonymous="Yes" CreatedDate="2015-04-29T21:56:04.943" ReplyCount="0"
                    ViewCount="0" PopularityPoints="0" LastUpdatedBy="" LastPostDate="" />

Thanks in advance.


There are 2 answers

crigore (best answer)

I really think you should take a look at this post (HTML code inside XML) to see how other people recommend tackling this problem.

Uday Shankar

No XML parser can parse this data, because it is not well-formed XML: a literal '<' is not allowed inside an attribute value. Please refer to the XML specification.
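To see the failure in isolation, here is a minimal sketch using the JDK's built-in SAX parser. The class name `WellFormedCheck` and the method `firstError` are made up for this illustration, and the input is a shortened version of the element from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of any library.
final class WellFormedCheck {

    /** Returns null if the input is well-formed XML, otherwise the parser's error message. */
    static String firstError(String xml) {
        try {
            // DefaultHandler rethrows fatal errors, so malformed input ends up in the catch below.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected for an in-memory string.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the element from the question: raw HTML inside an attribute.
        String broken = "<Thread Description=\"<a href=\"http://google.com/\">x</a>\" />";
        System.out.println(firstError(broken));
    }
}
```

The parser reports a fatal error as soon as it reaches the '<' inside the attribute value, before your handler gets a chance to skip anything, which is why the fix has to happen before parsing.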

There are two ways you can solve this:

  1. Change the source format

Change the source so that it produces well-formed XML. You can include HTML in an attribute value by escaping these characters:

"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;
  2. Change the target algorithm

The second option is to write your own parsing algorithm for your case.

The answer is usually the first one.
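As a rough sketch of the first option, assuming you control the code that generates the XML: escape the HTML with the five entities listed above before writing it into the attribute, and the SAX parser will hand the original HTML back to you. The class `AttributeEscaping` and its methods are hypothetical names used only for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of any library.
final class AttributeEscaping {

    /** Replaces the five characters from the table above with their predefined entities. */
    static String escapeAttribute(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Parses the XML with SAX and returns the Description attribute of the first Thread element. */
    static String readDescription(String xml) {
        final String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("Thread".equals(qName) && result[0] == null) {
                    // The parser has already turned the entities back into plain characters.
                    result[0] = attrs.getValue("Description");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://google.com/\">http://google.com/</a>";
        // Producer side: escape the HTML before placing it in the attribute.
        String xml = "<Thread ThreadID=\"22\" Title=\"google\" Description=\""
                + escapeAttribute(html) + "\" />";
        System.out.println(xml);
        // Consumer side: the SAX handler receives the original HTML string.
        System.out.println(readDescription(xml));
    }
}
```

In practice an XML writer library will do this escaping for you; the hand-written loop is only there to make the mapping visible.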