Strip JavaScript from HTML DOM Tree with JavaScript

How can one sanitize an HTML DOM tree of all JavaScript occurrences, meaning on(click|mouseover|etc) handlers, href:javascript..., <script> elements and every other variant of (inline) JavaScript, using JavaScript itself?

For example: I want users to upload an HTML file, copy the contents of its <body> element and insert them into one of my pages. I don't want to allow JavaScript in the uploaded HTML. I could use <iframe sandbox>, but I wonder whether there is another way.

Answer by Rick Hitchcock (best answer)

The following uses the Element.attributes collection to disable inline on* handlers and any attribute whose value contains the word "javascript", without affecting other DOM attributes.

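function disableInlineJS() {
  // Every element currently in the document.
  var elements = document.querySelectorAll('*');

  for (var i = 0; i < elements.length; i++) {
    var attrs = elements[i].attributes;
    // Index loop: for...in would also visit length, item() and other members.
    for (var j = 0; j < attrs.length; j++) {
      var attr = attrs[j];
      var name = attr.name.toLowerCase();
      var value = attr.value.toLowerCase();
      // Inline handlers (onclick, onmouseover, ...) or values mentioning "javascript".
      if (name.indexOf('on') === 0 || value.indexOf('javascript') > -1) {
        attr.value = '';  // blank the attribute so it no longer carries script
      }
    }
  }
}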
<button onclick="disableInlineJS()">Disable inline JavaScript</button><hr>

<div onmouseover="this.style.background= 'yellow';" ONmouseout="this.style.background= '';" style="font-size:25px; cursor: pointer;">
  Hover me
  <br>
  <a href="javAsCriPT:alert('gotcha')" style="font-weight:bold">Click me!</a>
  <br>
  <a href="http://example.com">Example.com!</a>
</div>
<button onclick="alert('gotcha')">Me, me!</button>
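
The substring test in the function also blanks harmless attributes such as title="Learn JavaScript", and it misses a URL whose scheme is split by a tab or newline, which browsers still treat as javascript:. A stricter attribute check could look roughly like the sketch below. The name `isScriptAttribute` is hypothetical and not part of any library, and the check covers only on* handlers and javascript: URLs, not every possible vector.

```javascript
// Hypothetical helper: decide whether a single attribute can carry inline script.
function isScriptAttribute(name, value) {
  // Event-handler attributes: onclick, onmouseover, ONmouseout, ...
  if (/^on/i.test(name)) {
    return true;
  }
  // URL parsing drops tabs and newlines and trims leading control characters
  // and spaces. Removing every such character here is deliberately broader,
  // so the check can only err on the side of flagging too much.
  var compact = String(value).replace(/[\u0000-\u0020]/g, '').toLowerCase();
  return compact.indexOf('javascript:') === 0;
}
```

Inside the loop of `disableInlineJS`, the condition could then be replaced by `isScriptAttribute(attr.name, attr.value)`.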

Note that this only handles attributes. I don't think there's a way to remove script elements that are already part of the page before they've had a chance to run.
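
For the upload scenario in the question, one possible way around this is to clean the markup before it ever reaches the live page. As far as I know, a document created with `DOMParser.parseFromString` is inert: its script elements are parsed but not executed. The following is a minimal sketch under that assumption. `sanitizeUploadedHtml` and `isUnsafeAttribute` are hypothetical names, and the checks cover only the cases named in the question, so other vectors (for example `srcdoc`, `data:` URLs or `<meta http-equiv="refresh">`) would still need handling.

```javascript
// Hypothetical helper: true for on* handlers and javascript: URLs.
function isUnsafeAttribute(name, value) {
  // Strip control characters and spaces so "java\tscript:" is still caught.
  var compact = String(value).replace(/[\u0000-\u0020]/g, '').toLowerCase();
  return /^on/i.test(name) || compact.indexOf('javascript:') === 0;
}

// Hypothetical helper: returns cleaned <body> markup for an uploaded HTML string.
// Browser-only, because it relies on the standard DOMParser API.
function sanitizeUploadedHtml(html) {
  // Parse into a separate document; nothing is attached to the page yet.
  var doc = new DOMParser().parseFromString(html, 'text/html');

  // Drop script elements outright.
  doc.querySelectorAll('script').forEach(function (el) {
    el.remove();
  });

  doc.body.querySelectorAll('*').forEach(function (el) {
    // Copy the live attribute list first, because removing entries mutates it.
    Array.prototype.slice.call(el.attributes).forEach(function (attr) {
      if (isUnsafeAttribute(attr.name, attr.value)) {
        el.removeAttribute(attr.name);
      }
    });
  });

  return doc.body.innerHTML;
}
```

The returned string could then be assigned to the `innerHTML` of a container element. For untrusted uploads, `<iframe sandbox>` or a maintained sanitizer library remains the more defensive choice.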