I'm working on a search function in Angular that highlights specific words or phrases within HTML text. The difficulty is that the text to be searched contains HTML elements. I insert the HTML into my application via innerHTML, because I receive pre-formatted HTML pages that have to be displayed with their formatting (bold, italic, and so on). When I search for a term that also occurs in the markup, such as class or <span>, the match lands inside a tag and the tags show up as visible text.
The issue is that the search does not ignore the HTML tags and ends up making them visible. At the same time, I need to insert <mark> tags to highlight the matched words or passages without breaking the existing HTML formatting.
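To make the problem concrete, here is a minimal sketch of the naive approach, which runs the search directly over the raw HTML string. The names `escapeRegExp` and `naiveHighlight` are made up for this illustration and are not part of Angular or of my actual code.

```typescript
// Hypothetical helper for this example: escape regex metacharacters in the term.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Naive approach: search the raw HTML string, tags included.
function naiveHighlight(html: string, term: string): string {
  const regex = new RegExp(`(${escapeRegExp(term)})`, 'gi');
  return html.replace(regex, '<mark>$1</mark>');
}

const incomingHtml = '<p> bla bla <span class="italic">Bone</span>';
console.log(naiveHighlight(incomingHtml, 'span'));
// The match lands inside the tag names, so the markup is no longer valid:
// <p> bla bla <<mark>span</mark> class="italic">Bone</<mark>span</mark>>
```

Once the browser parses that string, the broken tags are rendered as visible text, which is exactly the effect described above.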
Examples:

Search input: Bone
Incoming HTML: <p> bla bla <span class="italic">Bone</span>
Expected result: <p> bla bla <span class="italic"><mark>Bone</mark></span>

Search input: bla
Incoming HTML: <p> bla bla <span class="italic">Bone</span>
Expected result: <p> <mark>bla</mark> <mark>bla</mark> <span class="italic">Bone</span>

Search input: bla bone
Incoming HTML: <p> bla bla <span class="italic">Bone</span>
Expected result: <p> bla <mark>bla</mark> <span class="italic"><mark>Bone</mark></span>
My question is: How can I implement the search function in such a way that it ignores HTML tags during the search and does not make them visible, while correctly setting the <mark> tags to highlight the searched word or text without affecting the HTML formatting?
Edit and solution (for me):
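transform(html: string, searchText: string): any {
  // Nothing to highlight: return the HTML unchanged.
  if (!searchText) {
    return this.sanitizer.bypassSecurityTrustHtml(html);
  }
  // Parse into a DOM so that only text nodes are searched, never tag names or attributes.
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, 'text/html');
  // Each whitespace-separated term is highlighted on its own.
  searchText.split(/\s+/).forEach((term) => {
    if (term) {
      this.highlightText(doc.body, term);
    }
  });
  return this.sanitizer.bypassSecurityTrustHtml(doc.body.innerHTML);
}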
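public highlightText(node: Node, searchText: string): void {
  // Despite its name, escapedText is a case-insensitive, global RegExp for the escaped term.
  const escapedText = this.escapeSpecialCharactersFromText(searchText);
  if (node.nodeType === Node.TEXT_NODE) {
    // textContent is typed as string | null, hence the optional chaining.
    const match = node.textContent?.match(escapedText);
    if (match) {
      this.replaceMatchedTextWithMarker(node, escapedText);
    }
  } else if (node.nodeType === Node.ELEMENT_NODE) {
    // Elements are never searched themselves, only their descendants.
    this.goRecursivelyThroughTheNode(node, searchText);
  }
}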
private goRecursivelyThroughTheNode(node: Node, searchText: string): void {
  // Copy childNodes first: the live NodeList changes while text nodes are being replaced.
  Array.from(node.childNodes).forEach((child) => this.highlightText(child, searchText));
}
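private escapeSpecialCharactersFromText(searchText: string): RegExp {
  // Escape regex metacharacters, then wrap the term in a capture group so the matched text is captured.
  return new RegExp(`(${searchText.replace(/[-/\\^$*+?.()|[\]{}]/g, '\\$&')})`, 'gi');
}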
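private replaceMatchedTextWithMarker(node: Node, searchRegex: RegExp) {
  // Build DOM nodes instead of assigning innerHTML, so characters such as "<" or "&"
  // in the text are never re-parsed as markup. This also avoids an extra wrapper <span>.
  const fragment = document.createDocumentFragment();
  // split() with a capture group keeps every match at an odd index.
  (node.textContent ?? '').split(searchRegex).forEach((part, index) => {
    if (index % 2 === 1) {
      const mark = document.createElement('mark');
      mark.textContent = part;
      fragment.appendChild(mark);
    } else if (part) {
      fragment.appendChild(document.createTextNode(part));
    }
  });
  node.parentNode?.replaceChild(fragment, node);
}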
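
Note that the code above highlights every whitespace-separated term on its own, so it does not cover the third example, where "bla bone" should match as one phrase across the <span> boundary. Below is a rough sketch of one way that could be approached. It is string-based rather than DOM-based, the names `escapeRegExp` and `highlightPhrase` are hypothetical, and it assumes well-formed HTML in which a literal ">" never appears unescaped in text or inside attribute values. I have not used it in my application.

```typescript
// Hypothetical helper for this example: escape regex metacharacters in a word.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Highlight a phrase whose words may be separated by whitespace and/or tags.
function highlightPhrase(html: string, phrase: string): string {
  const words = phrase.trim().split(/\s+/).filter(Boolean);
  if (words.length === 0) {
    return html;
  }
  // Between two words, allow any mix of whitespace and complete tags.
  const gap = '((?:\\s|<[^>]*>)+)';
  // (?![^<]*>) rejects a word that sits inside a tag, e.g. "span" in <span ...>.
  const pattern = words
    .map((word) => `(${escapeRegExp(word)})(?![^<]*>)`)
    .join(gap);
  const regex = new RegExp(pattern, 'gi');
  return html.replace(regex, (_match: string, ...args: any[]) => {
    // args holds the captures first (word, gap, word, ...), then offset and input.
    const captures: string[] = args.slice(0, words.length * 2 - 1);
    // Even positions are words and get wrapped; odd positions are gaps and stay as they are.
    return captures
      .map((part, i) => (i % 2 === 0 ? `<mark>${part}</mark>` : part))
      .join('');
  });
}

const incoming = '<p> bla bla <span class="italic">Bone</span>';
console.log(highlightPhrase(incoming, 'bla bone'));
// <p> bla <mark>bla</mark> <span class="italic"><mark>Bone</mark></span>
```

Because each word gets its own <mark>, the existing tags between the words stay untouched and the nesting remains valid. A regex over an HTML string is fragile compared to walking text nodes, so treat this as an illustration of the idea rather than a drop-in replacement.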