HTML/jQuery - Preventing empty text nodes


I'm trying to prevent the creation of empty text nodes whenever I add

  • newlines,
  • spaces (including non-breaking spaces)
  • tabs

to my HTML structure.

Eg.

<div><div>    <!-- Node not created -->
<div>
                <!-- Node created -->
<div>

Eg1.

<div><div>    <!-- Node not created -->
<div> <div>     <!-- Node created -->

Eg2.

<div><div>    <!-- Node not created -->
<div>            <!-- Node created -->
    <div>
                 <!-- Node created -->
    </div>    
<div>            <!-- Node created -->

Here, for a better understanding, see what happens inside the first <DIV>s - jsFiddle

4

There are 4 answers

0
Kushagra Gour On

You can run a simple cleaning function on your DOM:

function cleanNode(node) {
 var child;
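 // Walk backwards so that removing a child does not shift the indexes still to visit.
 for (var i = node.childNodes.length; i--;) {
  child = node.childNodes[i];
  // If it is a text node (3) or comment node (8) and has no non-whitespace character in it, delete it.
  // The parentheses matter: && binds tighter than ||, so without them every text node would be removed.
  if ((child.nodeType === 3 || child.nodeType === 8) && !/\S/.test(child.nodeValue)) {
   node.removeChild(child);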
  }
  else {
   cleanNode(child);
  } 
 }
}
4
jasonslyvia On

Every time you manipulate the DOM, call node.normalize() on the parent node; it will do the job.

See more at Node.normalize

UPDATED

Based on the fiddle you provided, I took a deeper look at this issue and ran the following code in Chrome 29 against your HTML structure.

var i = 0;
function traverse(node){
    if (node.firstChild) {
        traverse(node.firstChild);
    }

    if (node.nodeType === 3) {
        console.log("text node " + ++i + ": " + node.nodeValue);
        if (node.nodeValue !== '') {
            console.log("text node " + i + " is not null");
        }
        if (node.nodeValue.match(/(\r\n|\r|\n)+/g)) {
            console.log("nonsense node");
        }
    }

    if (node.nextSibling) {
        traverse(node.nextSibling);
    }
}
document.addEventListener("DOMContentLoaded", doTraverse);

function doTraverse(){
    traverse(document.getElementsByClassName("selector")[0]);
}

and get these results:

text node 1: 

text node 1 is not null
nonsense node
text node 2: 

text node 2 is not null 
nonsense node 
text node 3: 

text node 3 is not null 
nonsense node 
text node 4: 

text node 4 is not null 
nonsense node 
text node 5: 
            text

text node 5 is not null 
text node 6: a paragraph 
text node 6 is not null
text node 7: 

text node 7 is not null 
nonsense node 
text node 8: a paragraph 
text node 8 is not null 
text node 9: a paragraph 
text node 9 is not null 
text node 10: 
             more text

text node 10 is not null 
text node 11: 


text node 11 is not null 
nonsense node 

To our surprise, there are far more empty text nodes than expected. However, if we inspect these elements in Chrome's inspector, everything seems to be working fine. I guess Chrome optimizes these nodes away in the rendering engine but not in the DOM.

It is now clear that those "empty" text nodes actually contain line breaks, which is why your code doesn't work as you expect. So if you do want to remove those text nodes, you can traverse the DOM, find them and remove them (which will be very inefficient; I would appreciate a better solution).

BTW, in this scenario node.normalize() won't work for you, since it only removes truly empty (zero-length) text nodes.
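
To make that distinction concrete, here is a minimal sketch. The helper names isEmptyText and isBlankText are made up for illustration and are not part of any DOM API. The first mirrors the zero-length case that normalize() cleans up; the second mirrors the whitespace-only case that shows up in the traversal output above.

```javascript
// Hypothetical helpers for illustration only; they work on plain strings,
// such as the nodeValue of a text node.

// True only for a zero-length string (a truly empty text node).
function isEmptyText(value) {
    return value.length === 0;
}

// True when the string contains no non-whitespace character.
// In JavaScript regular expressions, \S does not match line breaks,
// tabs, spaces or non-breaking spaces (U+00A0).
function isBlankText(value) {
    return !/\S/.test(value);
}

var samples = ["", "\n", " \t\u00a0", "\n   text\n"];
samples.forEach(function (s) {
    console.log(JSON.stringify(s), "empty:", isEmptyText(s), "blank:", isBlankText(s));
});
```

Only the first sample counts as empty, while the first three all count as blank. A line break between two tags falls into the second group, so it survives normalize().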

2
DGS On

Add comments if you think it necessary

<ul>
  <li>one</li><!--
  --><li>two</li><!--
  --><li>three</li>
</ul>

This comments out the white space, so no text node is inserted between the elements

from http://css-tricks.com/fighting-the-space-between-inline-block-elements/

eg 1

<div><div><!-- 
--><div><!--

--><div>

eg 2

<div><div><!--
--><div><!-- --><div>

eg 3

<div><div><!--
--><div><!--
    --><div><!--

    --></div><!--    
--><div>  

This makes your markup fairly ugly, but it removes the text nodes
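
If the markup is generated from script rather than written by hand, the same comment trick could be applied when joining the pieces. This is only a sketch: joinWithComments is a hypothetical helper, and it assumes each fragment is already a complete element string.

```javascript
// Hypothetical helper: joins markup fragments with a comment that swallows
// the line break and indentation between them, so the parser never sees
// whitespace between one closing tag and the next opening tag.
function joinWithComments(fragments, indent) {
    return fragments.join("<!--\n" + indent + "-->");
}

var items = ["<li>one</li>", "<li>two</li>", "<li>three</li>"];
console.log("<ul>\n  " + joinWithComments(items, "  ") + "\n</ul>");
```

This reproduces the <ul> layout from the list example above. The whitespace right after <ul> and right before </ul> is still outside any comment, just as in that example.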

0
12Me21 On

Using NodeIterator to remove the blank text nodes:
(this code works in browsers from ~2015+)

function cleanup() {
    document.createNodeIterator(
        document.body, NodeFilter.SHOW_TEXT,
        {acceptNode(node) {
            if (node.data.trim()=="") node.remove()
            return NodeFilter.FILTER_REJECT
        }}
    ).nextNode()
}

(Note: normally you would call .nextNode() in a loop, but by returning REJECT from the acceptNode function, you only have to call it once)
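
For comparison, the conventional loop-based form might look like the sketch below. The name removeBlankTextNodes is made up, and collecting the nodes before removing them is just a cautious choice for this sketch, not something the API requires.

```javascript
// Loop-based variant (hypothetical name). Uses the standard
// document.createNodeIterator and NodeFilter.SHOW_TEXT.
function removeBlankTextNodes(root) {
    var iterator = document.createNodeIterator(root, NodeFilter.SHOW_TEXT);
    var blanks = [];
    var node;
    // First pass: collect every text node that is empty or whitespace-only.
    while ((node = iterator.nextNode())) {
        if (node.data.trim() === "") blanks.push(node);
    }
    // Second pass: remove them, keeping the walk and the mutation separate.
    blanks.forEach(function (n) {
        n.parentNode.removeChild(n);
    });
    return blanks.length; // number of text nodes removed
}

// Only runs where a DOM is available and <body> already exists.
if (typeof document !== "undefined" && document.body) {
    console.log("removed", removeBlankTextNodes(document.body), "blank text nodes");
}
```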

You should call cleanup() as soon as possible after the HTML is parsed, e.g. using <script defer> or a <script> at the end of the document. Or, if that isn't possible:

if (document.readyState=='loading')
    document.addEventListener('DOMContentLoaded', cleanup)
else
    cleanup()