JS bug: highlighting words, case-insensitive


This function highlights words easily: it is framework-free, case-insensitive, and works even on text inside HTML tags.

Could you help improve this pure JavaScript code?

It works almost perfectly.

    (document.body).realcar("word high");

The bug only happens when the user searches for consecutive words.

The second highlighted word is replaced with a different word from the sentence, the one that comes after a " " (the last space of the text fragment), instead of the second searched word.

It works perfectly when searching for "light nos" in this case: Highlight <strong>nossa!</strong>

Buggy function:

    HTMLElement.prototype.realcar = function(word) {
      var el = this;
      const wordss = word.trim().sanitiza().split(" ").filter(word1 => word1.length > 2);
      const expr = new RegExp(wordss.join('|'), 'ig');
      let expr00 = expr;
      const RegExpUNICO = wordss;
      const nodes = Array.from(el.childNodes);

      for (let i = 0; i < nodes.length; i++) {
        const node = nodes[i];

        if (node.nodeType === 3) {
          const nodeValue = node.nodeValue;
          let matches = [];
          while ((match = expr.exec((nodeValue).sanitiza())) !== null) {
            //console.log("++"+match);
            matches.push(match[0]);
            const palavrar = nodeValue.substring(match.index, match.index + match[0].length);
            RegExpUNICO.push(palavrar);
          }
          expr00 = RegExpUNICO.join('|');
          let expr0 = new RegExp(expr00, 'ig');
          console.log("**" + expr00);

          if (matches) {
            const parts = nodeValue.split(expr0);

            for (let n = 0; n < parts.length; n++) {
              if (n) {
                const xx = document.createElement("hightx");
                xx.style.border = '1px solid blue';
                xx.style.backgroundColor = '#ffea80';
                const startIndex = nodeValue.indexOf(parts[n - 1]) + parts[n - 1].length;
                const palavra = node.nodeValue.substr(startIndex, matches[n - 1].length);
                xx.appendChild(document.createTextNode(palavra));
                el.insertBefore(xx, node);
              }

              if (parts[n]) {
                el.insertBefore(document.createTextNode(parts[n]), node);
              }
            }

            el.removeChild(node);
          }
        } else {
          node.realcar(word);
        }
      }
    }

Try the code here: https://jsbin.com/vegihit/edit?js,console,output Please help debug and fix it!

There is 1 answer

Answer by trincot

The main issue is in this line:

    const startIndex = nodeValue.indexOf(parts[n - 1]) + parts[n - 1].length;

This assumes that parts[n - 1] occurs only once in nodeValue, but that is not guaranteed. indexOf returns the position of the first occurrence, so if parts[n - 1] is, for instance, a single space that also occurs earlier in the string, the index points to the wrong place and the wrong word is picked up.
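To make this concrete, here is a minimal sketch of that index computation outside the DOM. The sample text and the matches array are assumptions chosen for illustration (they are not taken from the JSBin), and the sanitiza() step is left out:

```javascript
// Hypothetical sample text: the separator " " occurs several times in it.
const nodeValue = "a word high up";
// What the exec() loop in the question would collect for "word high".
const matches = ["word", "high"];
const parts = nodeValue.split(/word|high/ig); // ["a ", " ", " up"]

// Same computation as in the question, for the second match (n = 2).
const n = 2;
const startIndex = nodeValue.indexOf(parts[n - 1]) + parts[n - 1].length;
const palavra = nodeValue.substr(startIndex, matches[n - 1].length);

console.log(startIndex); // 2, because indexOf(" ") finds the space right after "a"
console.log(palavra);    // "word", although the second match is "high"
```

The separator before "high" is a single space, and the first space in the text sits before "word", so the second highlight repeats "word".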

Another, less problematic issue is that if (matches) will always be true, since even an empty array is a truthy value. The test should be if (matches.length).
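A quick check of that truthiness rule, using a hypothetical empty array in place of matches:

```javascript
// Hypothetical stand-in for `matches` when nothing was found in a text node.
const noMatches = [];

const arrayIsTruthy = Boolean(noMatches);         // an array object is always truthy
const lengthIsTruthy = Boolean(noMatches.length); // 0 is falsy

console.log(arrayIsTruthy);  // true
console.log(lengthIsTruthy); // false
```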

There are several possible solutions to the first problem. Your aim is to know which adjacent substring comes next, so that you can extract it from nodeValue. You can achieve this by including the separators in the output of nodeValue.split(expr0), which is what happens when the regex contains a capture group. Your loop will then iterate over both the unmatched and the matched substrings, covering the whole string without omissions.
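The effect of the capture group on split can be seen in isolation. The sample text below is an assumption for illustration:

```javascript
const text = "A Word high up"; // hypothetical sample text

// Without a capture group the matched words are dropped from the result.
const withoutGroup = text.split(/word|high/ig);
console.log(withoutGroup); // [ 'A ', ' ', ' up' ]

// With a capture group the matched words are kept, at the odd indices,
// and in their original casing ("Word", not "word").
const withGroup = text.split(/(word|high)/ig);
console.log(withGroup); // [ 'A ', 'Word', ' ', 'high', ' up' ]

// Joining all parts gives back the original text, so nothing is lost.
const rebuilt = withGroup.join("");
console.log(rebuilt === text); // true
```

Because the matched text itself is in the array, there is no need to look up positions with indexOf afterwards.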

Here is the correction of the relevant code:

if (matches.length) { // Must test the .length
    // Move the creation of `expr0` here :
    // Create a capture group by adding parentheses: 
    const expr00 = "(" + RegExpUNICO.join('|') + ")";
    const expr0 = new RegExp(expr00, 'ig');
    const parts = nodeValue.split(expr0);
    
    for (let n = 0; n < parts.length; n++) {
        const textNode = document.createTextNode(parts[n]);
        if (n % 2) { // A matched term; always at odd index because of capture group
            const xx = document.createElement("hightx");
            xx.style.border = '1px solid blue';
            xx.style.backgroundColor = '#ffea80'; 
            // No more need to determine an index or length: we have the exact substring
            xx.appendChild(textNode);
            el.insertBefore(xx, node);
        } else if (parts[n]) { // A (nonempty) substring that is not part of a match
            el.insertBefore(textNode, node);
        }
    }
    el.removeChild(node);
}
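
The else if (parts[n]) test above matters because split can produce empty strings at the even indices, for example when two matches are adjacent or when a match sits at the very start or end of the text. A small sketch with a hypothetical input:

```javascript
// Hypothetical input in which the two search terms touch each other.
const parts = "wordhigh".split(/(word|high)/ig);
console.log(parts); // [ '', 'word', '', 'high', '' ]

// Even indices hold the unmatched text; here all of them are empty,
// so no plain text node would need to be inserted.
const unmatched = parts.filter((part, n) => n % 2 === 0 && part);
console.log(unmatched.length); // 0
```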