I'm writing a web crawler.
Getting a link from HTML is the easy part, but getting a link that is produced by JavaScript is not easy for me.
Can I get the result of the JavaScript so that I know where a link points?
For example, how can I retrieve the link to google.com from the JavaScript code below, using Python?
<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<a href="#" id="goog">to google</a>
</body>
<script>
document.getElementById('goog').onclick = function() {
window.location = "http://google.com";
};
</script>
</html>
You would need to install Node.js and run a separate piece of code that executes the JavaScript in context to emit the resulting HTML. This is possible using jsdom, but the key to it is extracting the JavaScript code from the HTML page and setting up the context correctly.
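The extraction step mentioned above can be sketched in Python with the standard library alone. Note that this is a different, simpler technique than running the page under jsdom: it does not execute any JavaScript. It only pulls the inline `<script>` bodies out of the page and scans them for literal `location` assignments. In the question's example the redirect happens inside an `onclick` handler, so it never appears in the rendered HTML until someone clicks, which is why a static scan may be enough here. The names `ScriptExtractor`, `LOCATION_RE`, and `find_redirect_targets` are hypothetical, chosen for illustration.

```python
import re
from html.parser import HTMLParser


class ScriptExtractor(HTMLParser):
    """Collect the text of every inline <script> element."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script text may arrive in several chunks, so append to the last entry.
        if self._in_script:
            self.scripts[-1] += data


# Assumption: the target URL is a plain string literal assigned to
# location, window.location, or window.location.href.
LOCATION_RE = re.compile(
    r"""(?:window\.|document\.)?location(?:\.href)?\s*=\s*(["'])(.*?)\1"""
)


def find_redirect_targets(html):
    """Return URLs assigned to `location` in the page's inline scripts."""
    parser = ScriptExtractor()
    parser.feed(html)
    parser.close()
    targets = []
    for script in parser.scripts:
        targets.extend(m.group(2) for m in LOCATION_RE.finditer(script))
    return targets


page = """<!DOCTYPE html>
<html lang="en">
<body>
<a href="#" id="goog">to google</a>
</body>
<script>
document.getElementById('goog').onclick = function() {
window.location = "http://google.com";
};
</script>
</html>"""

print(find_redirect_targets(page))  # ['http://google.com']
```

This only finds URLs written as string literals. If the script builds the URL at runtime (string concatenation, a function call, an external `.js` file), the regex will miss it, and executing the script in a real JavaScript context, as described above, is the more reliable route.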