How to remove all text from a website with JavaScript


I want a JavaScript function that removes all text from a web page. The background: to compare how the rendered DOM looks in different browsers, I first need to eliminate the obvious differences. Since font rendering is a known difference, I want to remove all text. The solutions I found all looked like this:

if(start.nodeType === Node.TEXT_NODE) 
{
    start.parentNode.removeChild(start);
}

But this only removes pure text nodes. I also want to find constructs like:

 <div>
        <p>
             <em>28.11.2014</em>
             <img></img>
                Testtext
             <span>
                <i>Testtext</i>
                Testtext
             </span>
        </p>
  </div>

Here the element that contains text also contains child elements (such as the <img> and <span> inside the <p>), so the element itself is not recognized as a text node.

So I basically want to turn the above DOM into this:

 <div>
        <p>
             <em></em>
             <img></img>
             <span>
                <i></i>
             </span>
        </p>
  </div>

There are 3 answers

Givi (best answer)

You can try something like this.

HTML:

<div id="startFrom">
    <p>
        <em>28.11.2014</em>
            <img></img>
            Testtext
        <span>
            <i>Testtext</i>
            Testtext
        </span>
    </p>
</div>  

JavaScript:

var startFrom = document.getElementById("startFrom");

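// Recursively visit every descendant of `node` and blank out text nodes.
function traverseDom(node) {
    node = node.firstChild;
    while (node) {
        if (node.nodeType === 3) { // 3 === Node.TEXT_NODE
            node.data = "";        // the empty text node stays in the DOM
        }
        traverseDom(node);         // descend into this child's own children
        node = node.nextSibling;
    }
}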

traverseDom(startFrom);
console.log(startFrom);
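
If the traversal is started from the document root rather than from a single container, it would also blank the contents of <style> and <script> elements, which can change the layout being compared. The following is a minimal sketch of a variant that skips those elements. The name blankVisibleText and the SKIPPED list are illustrative assumptions, not part of the answer above.

```javascript
// Elements whose text is styling or code rather than visible text.
// This list is an assumption; extend it as needed.
var SKIPPED = { SCRIPT: true, STYLE: true };

function blankVisibleText(node) {
    for (var child = node.firstChild; child; child = child.nextSibling) {
        if (child.nodeType === 3) {            // 3 === Node.TEXT_NODE
            child.data = "";
        } else if (child.nodeType === 1 &&     // 1 === Node.ELEMENT_NODE
                   !SKIPPED[String(child.nodeName).toUpperCase()]) {
            blankVisibleText(child);           // descend, except into skipped elements
        }
    }
}
```

In a browser it could be called as blankVisibleText(document.documentElement).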
Sampath Liyanage

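With jQuery:

// 'selector' is a placeholder for the root element.
// addBack() (jQuery 1.8+) includes the root itself, so its direct text children are removed too.
$('selector').find("*").addBack().contents().filter(function() {
    return this.nodeType === 3; // 3 === Node.TEXT_NODE
}).remove();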
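
The same removal can be sketched without jQuery. The function below is a minimal illustration (removeTextNodes is a hypothetical name) that deletes text nodes instead of blanking them, which yields the structure asked for in the question. It also removes whitespace-only text nodes, so indentation between elements disappears.

```javascript
function removeTextNodes(root) {
    // Iterate backwards so that removing a child does not shift
    // the positions of the children still to be visited.
    for (var i = root.childNodes.length - 1; i >= 0; i--) {
        var child = root.childNodes[i];
        if (child.nodeType === 3) {            // 3 === Node.TEXT_NODE
            root.removeChild(child);
        } else if (child.nodeType === 1) {     // 1 === Node.ELEMENT_NODE
            removeTextNodes(child);
        }
    }
}
```

In a browser it could be called as removeTextNodes(document.getElementById("startFrom")).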
Timothy Ha

The code below has only been roughly tested, but you can put it in an external .js file and run it from your document's onload handler:

function cleantxt()
{
    var htmlsrc = document.documentElement.outerHTML;
    var htmlnew = '';
    var istag = false;
    for (var i = 0; i < htmlsrc.length; i++) { // declare i so it is not an implicit global
        if(htmlsrc.charAt(i)=='<') {
            istag = true;
            htmlnew = htmlnew + htmlsrc.charAt(i);
        }
        else if(htmlsrc.charAt(i)=='>') {
            istag = false;
            htmlnew = htmlnew + htmlsrc.charAt(i);
        }
        else if(istag) {
            htmlnew = htmlnew + htmlsrc.charAt(i);
        }
    }
    document.getElementsByTagName("html")[0].innerHTML = htmlnew + 'Cleaned'; // just a signature to see it works 
}
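
The scanning idea above can be isolated into a pure function that works on a string, which makes it easier to try outside a page. This is a minimal sketch and stripTextFromHtml is a hypothetical name. It assumes that "<" and ">" only ever appear as tag delimiters, so it would mishandle a ">" inside an attribute value or a comment, and it also drops the contents of <script> and <style> elements.

```javascript
function stripTextFromHtml(html) {
    var out = [];
    var inTag = false;
    for (var i = 0; i < html.length; i++) {
        var ch = html.charAt(i);
        if (ch === "<") {
            inTag = true;          // a tag starts here
        }
        if (inTag) {
            out.push(ch);          // keep only characters that belong to a tag
        }
        if (ch === ">") {
            inTag = false;         // the tag ends after this character
        }
    }
    return out.join("");
}

var cleaned = stripTextFromHtml("<p><em>28.11.2014</em><img>Testtext</p>");
// cleaned === "<p><em></em><img></p>"
```

Collecting the kept characters in an array and joining once avoids repeated string concatenation on large documents.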