Using PhantomJS, it's possible to get access to a copy of the modified DOM, post-parsing. Using a cURL call, you can get access to the page pre-parsing. In the pre-parsed code, you may find errors which are corrected by a browser.
How do you get access to both the post-rendered changes and the pre-rendered content to make a comparison of the fixes the browser does automatically?
Is the best method to run a diff on the two files, or does PhantomJS hold two copies of the content, the original and the modified form? I can't seem to find the right way to phrase this to get an answer via Google, and a search here (https://stackoverflow.com/search?q=[phantomjs]+save+unaltered+source) didn't turn up any results.
I'd like to avoid a second call to the same page for bandwidth/efficiency reasons.
There is no way to directly access the unaltered source (referred to as view-source in other browsers) in PhantomJS.
You could try to read the page from the PhantomJS cache (when run with the `--disk-cache=true` option), but there is an easier method. You can simply send an AJAX request to get the source "on the wire", but then you would need to handle redirects yourself. Even with such a simple script, you can already see that the two files are different, despite no JavaScript being involved.
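A minimal sketch of what such a script could look like, not a definitive implementation. The helper name `fetchRawSource`, the placeholder URL, and the output file names `content.html` and `source.html` are assumptions made for illustration. It saves the parsed DOM from `page.content` and the raw response from a synchronous XHR issued inside the page:

```javascript
// Hypothetical helper, executed inside the page context via page.evaluate().
// A synchronous XHR returns the response body as text, before any DOM fix-ups.
function fetchRawSource(url) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', url, false); // third argument false = synchronous
    xhr.send(null);
    return xhr.responseText;
}

// Only run the PhantomJS part when the script is started by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var fs = require('fs');
    var url = 'http://example.com/'; // placeholder URL (assumption)

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Failed to load ' + url);
            phantom.exit(1);
            return;
        }
        // Post-parsing DOM, serialized by PhantomJS.
        fs.write('content.html', page.content, 'w');
        // Raw source, requested again from inside the page.
        fs.write('source.html', page.evaluate(fetchRawSource, url), 'w');
        phantom.exit();
    });
}
```

The two output files can then be compared with any diff tool. Note that the XHR is a separate request for the same URL, so whether it actually goes over the network again depends on caching.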
You might need to run with the `--web-security=false` option. Instead of passing the `url` into the `get()` function, you may directly access `page.url`.
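As a sketch of the `page.url` variant (the helper name `saveBothVersions`, the placeholder URL, and the output file names are assumptions for illustration), the raw source can be requested for whatever URL the page ended up on instead of the URL originally passed to `page.open()`:

```javascript
// Hypothetical helper: writes the rendered DOM and the raw source of the
// currently loaded page. `page` is a PhantomJS webpage object and `fs` is
// the PhantomJS fs module.
function saveBothVersions(page, fs) {
    // page.url is the current URL of the loaded page; it is passed into the
    // page context, where a synchronous XHR fetches the raw response body.
    var raw = page.evaluate(function (url) {
        var xhr = new XMLHttpRequest();
        xhr.open('GET', url, false);
        xhr.send(null);
        return xhr.responseText;
    }, page.url);

    fs.write('content.html', page.content, 'w'); // post-parsing DOM
    fs.write('source.html', raw, 'w');           // source as delivered
}

// Only run the PhantomJS part when the script is started by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://example.com/', function (status) { // placeholder URL
        if (status === 'success') {
            saveBothVersions(page, require('fs'));
        }
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```

Assuming the script is saved as `script.js` (a hypothetical name), it could be started with `phantomjs --web-security=false script.js` if the request from inside the page is otherwise blocked.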