In my HTML, I need to add a p tag before and after each img tag. Each HTML document includes multiple images.

For example:

<br><img id="aimg_uhkH3" class="zoom" src="../Images/0001.jpg" border="0" alt="" width="430" height="20"><br>
foo <img id="acvdojj2" class="zoom" src="../Images/0002.jpg" width="430" height="20" border="0" alt=""> foo 

Desired Result:

<br><p><img id="aimg_uhkH3" class="zoom" src="../Images/0001.jpg" border="0" alt="" width="430" height="20"><p><br>
foo <p><img id="acvdojj2" class="zoom" src="../Images/0002.jpg" width="430" height="20" border="0" alt=""><p> foo

I can't get this to work with a regex.

My failed code (test_str is the HTML string):

re.sub(r'(<img.*>)','<p>\\1<p>',test_str)

My failed result (the second p tag on the first line lands after the br instead of right after the img):

<br><p><img id="aimg_uhkH3" class="zoom" src="../Images/0001.jpg" border="0" alt="" width="430" height="20"><br><p>
foo <p><img id="acvdojj2" class="zoom" src="../Images/0002.jpg" width="430" height="20" border="0" alt=""><p> foo

Any hints? Thanks in advance.

1 Answer

Keatinge

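Your match is ending too late. The .* quantifier is greedy, so it runs to the last > on the line and pulls the trailing <br> into the captured group. Using .*? makes the quantifier non-greedy, so the match ends at the first > after <img instead of the last one: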

re.sub(r'(<img.*?>)','<p>\\1<p>',test_str)
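To see the difference side by side, here is a small runnable sketch built on the sample input from the question. The variable names greedy, lazy, and negated are only illustrative, and the [^>]* variant is an alternative added here for comparison, not part of the original answer. All three patterns assume no attribute value contains a literal > character; if that can happen, an HTML parser is likely a safer tool than a regex.

```python
import re

# Sample input from the question: two lines, one img tag on each.
test_str = (
    '<br><img id="aimg_uhkH3" class="zoom" src="../Images/0001.jpg" '
    'border="0" alt="" width="430" height="20"><br>\n'
    'foo <img id="acvdojj2" class="zoom" src="../Images/0002.jpg" '
    'width="430" height="20" border="0" alt=""> foo'
)

# Greedy: .* runs to the last > on each line, so on the first line
# the trailing <br> ends up inside the captured group.
greedy = re.sub(r'(<img.*>)', r'<p>\1<p>', test_str)

# Non-greedy: .*? stops at the first > after <img,
# which is the end of the img tag itself.
lazy = re.sub(r'(<img.*?>)', r'<p>\1<p>', test_str)

# Alternative (an assumption, not from the answer): [^>]* can never
# cross a >, and unlike . it also matches newlines inside a tag.
negated = re.sub(r'(<img[^>]*>)', r'<p>\1<p>', test_str)

print(greedy)
print(lazy)
```

On this input the non-greedy and negated-class patterns give the same result. The negated class may be preferable when an img tag can be split across several lines, since . does not match a newline unless re.DOTALL is set.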