Given a DOMDocument object passed in as a parameter, as in the class below:
    class Comparison {
        public function __construct(?DOMDocument $domDocument = null) {
            if ($domDocument === null) {
                return; // nothing to compare
            }
            // getElementsByTagName() always returns a DOMNodeList;
            // iterating an empty list is simply a no-op.
            $anchors = $domDocument->getElementsByTagName('a');
            foreach ($anchors as $anchor) {
                $original = ''; // Not sure how to get this
                $ordered = $this->rearrangeAttributes($anchor);
                $difference = $this->diff($original, $ordered);
                echo 'Original Source: ' . $original . "\n";
                echo 'Ordered Source: ' . $ordered . "\n";
                echo 'Difference: ' . $difference . "\n\n";
            }
        }
    }
How do you get the original HTML string indicated by $original?
My current approach is based on the DOMNode documentation: http://php.net/manual/en/class.domnode.php
The idea is to get the parent of the node in question and then read its innerHTML. However, the source is altered to some degree when it is parsed into the DOM, so this doesn't look like a robust way to recover the original string. Is there a more reliable way to do this? I could pass in the raw HTML as well, but I'd like to avoid that rabbit hole if there's a known solution.
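To illustrate the alteration I mean, here is a minimal sketch that serializes a single anchor and prints it next to the raw input. The helper name outerHtml() and the sample markup are made up for this example; the only library feature it relies on is that DOMDocument::saveHTML() accepts an optional node argument (PHP 5.3.6 and later).

```php
<?php
// Hypothetical helper: serialize a single node, tags included.
function outerHtml(DOMNode $node): string
{
    // saveHTML() with a node argument dumps only that node's subtree.
    return (string) $node->ownerDocument->saveHTML($node);
}

// Hypothetical input with an uppercase tag and an unquoted attribute value.
$raw = '<p><A title=foo HREF="/x">link</A></p>';

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($raw);
libxml_clear_errors();

$anchor = $doc->getElementsByTagName('a')->item(0);
$serialized = outerHtml($anchor);

echo 'Raw input:  ', $raw, "\n";
echo 'Serialized: ', $serialized, "\n";
```

The serialized string comes from the parsed tree, not from the input, so details such as tag case and attribute quoting need not match what was originally written. That is why it cannot stand in for $original.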
UPDATE: If the cleaned parent source is enough and the original doesn't matter, then the question Marc B linked is very useful: How to return outer html of DOMDocument?
But if you want the original source, you can try getting the node's line number with DOMNode::getLineNo() (http://php.net/manual/en/domnode.getlineno.php), although it's not clear whether that number refers to the cleaned source or to the original raw source. Insight welcome!
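A rough way to test the line-number idea might look like the sketch below. It assumes the raw HTML string is still available next to the DOMDocument, and it assumes (unverified, this is exactly the open question) that getLineNo() is 1-based and counts lines in the raw input. The sample markup is hypothetical.

```php
<?php
// Hypothetical multi-line input; the anchor is written on the third line.
$raw = "<html><body>\n<p>intro</p>\n<a href=\"/x\" title=\"t\">link</a>\n</body></html>";

libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($raw);
libxml_clear_errors();

$rawLines = explode("\n", $raw);

foreach ($doc->getElementsByTagName('a') as $anchor) {
    $lineNo = $anchor->getLineNo();
    // Assumption: $lineNo is 1-based and refers to the raw input.
    // Fall back to a placeholder if it points outside the raw source.
    $rawLine = $rawLines[$lineNo - 1] ?? '(no such line)';
    echo "Line {$lineNo}: {$rawLine}\n";
}
```

If the printed line contains the anchor as it was originally written, the line number can be used to narrow the search in the raw string. Even then it only identifies a line, not the exact start and end of the tag, so several anchors on one line would still need to be told apart.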