
Invalid predicate error due to double quotes inside an HTML attribute


I have the following HTML snippet:

<body class="item"> <a title="spre_|_"Marketing_|_Specialist""> Marketing Specialist </a> </body>

As you can see, the value of the title attribute in the <a> tag contains double quotes inside the enclosing double quotes. When I use lxml to select the element with XPath, I keep getting this error:

  File "src/lxml/etree.pyx", line 2314, in lxml.etree._ElementTree.xpath
  File "src/lxml/xpath.pxi", line 357, in lxml.etree.XPathDocumentEvaluator.__call__
  File "src/lxml/xpath.pxi", line 225, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: Invalid predicate

This is my code:

from lxml import etree
from io import StringIO
html_ = """<body class="loop"> <a title="spre_|_"Marketing_|_Specialist""> Marketing Specialist </a> </body>"""
xpath = """//a[@title="spre_|_"Marketing_|_Specialist""][1]"""
print(etree.parse(StringIO(html_), etree.HTMLParser()).xpath(xpath))

I tried to escape the double quotes with \, but nothing changed.
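For reference, XPath 1.0 string literals have no escape character, so a backslash does not help: the literal ends at the second double quote and the rest of the predicate becomes invalid. A minimal sketch of two ways around this follows. It assumes the inner quotes in the HTML are written as `&quot;`, since with the raw markup above the parser will most likely end the title value at the second double quote. The names `wanted`, `by_literal` and `by_variable` are only illustrative.

```python
from io import StringIO
from lxml import etree

# Assumption: the inner quotes are encoded as &quot; so the attribute is valid HTML.
html_ = """<body class="loop"> <a title="spre_|_&quot;Marketing_|_Specialist&quot;"> Marketing Specialist </a> </body>"""
tree = etree.parse(StringIO(html_), etree.HTMLParser())

# The attribute value as the parser sees it once &quot; has been decoded.
wanted = 'spre_|_"Marketing_|_Specialist"'

# Option 1: delimit the XPath string literal with single quotes,
# so the double quotes inside it are ordinary characters.
by_literal = tree.xpath("""//a[@title='spre_|_"Marketing_|_Specialist"'][1]""")

# Option 2: pass the value as an XPath variable via a keyword argument,
# which avoids quoting inside the expression altogether.
by_variable = tree.xpath("//a[@title=$title][1]", title=wanted)

print(by_literal[0].get("title"))
print(by_variable[0].text.strip())
```

The variable form is probably the safer choice when the value may contain both single and double quotes, because no XPath literal has to be built by hand.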
