Regex to Find and Remove HTML ID Attribute That Starts With or Contains Specific Word


This is what I currently have:

<h2 id="bookmark-" class="xyz">
<h3 id="bookmark-" class="xyz">
<h4 id="bookmark-" class="xyz">
<h5 id="bookmark-" class="xyz">
<h6 id="bookmark-" class="xyz">

This is what the end result should be:

<h2 class="xyz">
<h3 class="xyz">
<h4 class="xyz">
<h5 class="xyz">
<h6 class="xyz">

Will this regular expression achieve what I need?

<id="bookmark-[^\"]*"

There is 1 answer

NemoRu On

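Try your pattern in an online regex tester, or any other tool that helps with debugging. I checked yours and it will not do the job. The [^\"]* part is fine: it matches zero or more characters that are not " (the backslash is unnecessary inside the brackets, but harmless). The problem is the < at the beginning: the text <id= never occurs in your input, because the tag name (h2, h3, ...) sits between < and id, so the pattern matches nothing. Drop the < and /id="bookmark-[^\"]*"/gm will match the whole attribute, closing quote included. To get exactly the output you show, match the space in front of id as well (/ id="bookmark-[^\"]*"/gm) and replace with an empty string; otherwise a stray space is left after the tag name. Below are the variants I tried before I understood why yours wasn't working.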
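
To check this without a regex tester, here is a minimal sketch in Python (my assumption, since the question does not say which language or tool runs the regex). The names BOOKMARK_ID, ORIGINAL and strip_bookmark_ids are made up for the example, and the pattern uses \s+ rather than a single space so that any whitespace before the attribute is consumed.

```python
import re

# Pattern from the question: the literal text '<id=' never occurs in the input.
ORIGINAL = re.compile(r'<id="bookmark-[^"]*"')

# Corrected pattern (illustrative): no leading '<', leading whitespace consumed,
# closing quote included.
BOOKMARK_ID = re.compile(r'\s+id="bookmark-[^"]*"')


def strip_bookmark_ids(html: str) -> str:
    """Remove every id attribute whose value starts with 'bookmark-'."""
    return BOOKMARK_ID.sub("", html)


html = '<h2 id="bookmark-" class="xyz">'
print(ORIGINAL.search(html))     # None
print(strip_bookmark_ids(html))  # <h2 class="xyz">
```

Requiring whitespace before id also keeps the pattern from touching attributes such as data-id, where id is preceded by a hyphen.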

/id="bookmark-"/gm This regex matches your case only when the attribute is exactly id="bookmark-".

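/(id="bookmark-)[\w\s]*"/gm This regex matches your id followed by any run of word characters and whitespace up to the closing ". So it matches, for example, id="bookmark-faes f1 2332454". It stops matching as soon as the value contains anything else, such as a second hyphen or a dot, which is why [^\"]* is the safer choice.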

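[abc] - Matches any one of the characters a, b or c.
\w - Matches any word character [a-zA-Z0-9_].
\s - Matches any whitespace character (space, tab, newline).
* - Matches the preceding token (the whole [...] class in our case) zero or more times.
Flags:
g - global: find every match, not only the first.
m - multiline: ^ and $ match at the start and end of each line. It has no effect on these patterns, since none of them uses ^ or $.
Delimiter - /.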
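
To see how the variants above differ, here is a small comparison sketch in Python (again my choice of language, not something the question specifies). The names EXACT, WORD_SPACE, NOT_QUOTE and which_match are hypothetical, and the third sample value is invented to show where [\w\s]* gives up.

```python
import re

EXACT = re.compile(r'id="bookmark-"')
WORD_SPACE = re.compile(r'(id="bookmark-)[\w\s]*"')
NOT_QUOTE = re.compile(r'id="bookmark-[^"]*"')


def which_match(attr: str) -> list:
    """Report, in order, whether EXACT, WORD_SPACE and NOT_QUOTE match all of attr."""
    return [bool(p.fullmatch(attr)) for p in (EXACT, WORD_SPACE, NOT_QUOTE)]


for sample in ('id="bookmark-"',
               'id="bookmark-faes f1 2332454"',
               'id="bookmark-intro-2.1"'):
    print(sample, which_match(sample))
# id="bookmark-" [True, True, True]
# id="bookmark-faes f1 2332454" [False, True, True]
# id="bookmark-intro-2.1" [False, False, True]
```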

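P.S. You probably shouldn't use regex to parse HTML. A parser copes with things a regex can miss, such as single-quoted values, attributes in a different order, or a tag split across lines.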
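
If you want to go the parser route, here is a rough sketch using only Python's standard html.parser module. IdStripper and strip_ids_with_parser are hypothetical names for this example. Treat it as an illustration under simplifying assumptions: it re-emits the markup rather than editing it in place, so tag names come out lowercased and attribute values double-quoted, and processing instructions are not handled.

```python
from html import escape
from html.parser import HTMLParser


class IdStripper(HTMLParser):
    """Re-emit the markup, dropping id attributes that start with a prefix."""

    def __init__(self, prefix: str = "bookmark-"):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.prefix = prefix
        self.out = []

    def _emit_tag(self, tag, attrs, closer=">"):
        parts = [tag]
        for name, value in attrs:
            if name == "id" and (value or "").startswith(self.prefix):
                continue  # this is the attribute we want to remove
            # Valueless attributes (e.g. "disabled") are emitted bare.
            parts.append(name if value is None
                         else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, closer=" />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def strip_ids_with_parser(html: str, prefix: str = "bookmark-") -> str:
    parser = IdStripper(prefix)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(strip_ids_with_parser("<h3 class='xyz' id='bookmark-intro'>Intro</h3>"))
# <h3 class="xyz">Intro</h3>
```

Unlike the regex, this handles the single-quoted, reordered attribute in the usage line without any extra cases.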