Python RegEx for this HTML String


I have a string that looks like this:

<span class=\"market_listing_price market_listing_price_with_fee\">\r
\t\t\t\t\t&#36;92.53 USD\t\t\t\t<\/span>

I need to find this string with a regular expression. My attempt:

(^<span class=\\"market_listing_price market_listing_price_with_fee\\">\\r\\t\\t\\t\\t\\t&)

The problem is that the number of "\t" and "\r" characters may vary. Also, this regular expression covers only part of the string, not all of it.

So, what's the correct and full RegEx for this string?


There are 2 answers

1
BlackM (best answer)

Answering your question about the Regex:

"market_listing_price market_listing_price_with_fee\\">[\\r]*[\\t]*&

This will match the string you need, even if more \t's or \r's are added. If you need to edit this regex, I advise you to visit this website and test your modifications there. It will also help you understand how regular expressions work and build your own complete regex.
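
As a minimal sketch of tolerating a varying amount of whitespace, the pattern below uses \s* rather than counting individual \r and \t characters, and captures the price text between the tags. It assumes the escaped text has already been decoded (for example with json.loads), so that \" is a plain quote and \r, \t are real whitespace characters. The sample string and the names PRICE_RE, decoded, price_text and price are illustrative and not part of the original post.

```python
import html
import re

# Assumption: the string has been decoded, so \r, \n and \t are real
# whitespace characters and \" is an ordinary double quote.
decoded = (
    '<span class="market_listing_price market_listing_price_with_fee">\r\n'
    '\t\t\t\t\t&#36;92.53 USD\t\t\t\t</span>'
)

# \s* absorbs any number of \r, \n, \t or spaces on either side of the price;
# (.*?) lazily captures the text in between.
PRICE_RE = re.compile(
    r'<span class="market_listing_price market_listing_price_with_fee">'
    r'\s*(.*?)\s*</span>',
    re.DOTALL,
)

match = PRICE_RE.search(decoded)
price_text = match.group(1) if match else None

# Turn the &#36; entity into a literal dollar sign.
price = html.unescape(price_text) if price_text is not None else None
print(price)
```

If the text is still in its raw escaped form, the pattern would have to match literal backslashes instead, so decoding first is probably the simpler route.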

1
alecxe

Since this is an HTML string, I would suggest using an HTML Parser like BeautifulSoup.

Here is an example approach finding the element by class attribute value using a CSS selector:

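from bs4 import BeautifulSoup

# Replace with the real HTML
data = "my HTML data"

# Name the parser explicitly; omitting it makes bs4 emit a warning
soup = BeautifulSoup(data, "html.parser")

# select() returns a list of every <span> that carries both classes
result = soup.select("span.market_listing_price.market_listing_price_with_fee")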
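
To show how the price text itself might be pulled out of the matched element, here is a small sketch that runs the same selector against a sample string. The sample is an assumed, already decoded form of the string from the question, and the names span and price are illustrative.

```python
from bs4 import BeautifulSoup

# Assumed sample: the decoded form of the string from the question.
data = (
    '<span class="market_listing_price market_listing_price_with_fee">\r\n'
    '\t\t\t\t\t&#36;92.53 USD\t\t\t\t</span>'
)

soup = BeautifulSoup(data, "html.parser")

# select_one() returns the first matching element, or None if nothing matches.
span = soup.select_one("span.market_listing_price.market_listing_price_with_fee")

# get_text(strip=True) drops the surrounding whitespace; the parser has
# already converted the &#36; entity into a dollar sign.
price = span.get_text(strip=True) if span is not None else None
print(price)
```

With this approach the number of \r and \t characters does not matter, because the whitespace is stripped after parsing rather than matched in a pattern.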
