Now that I can navigate a Web page via WWW::Mechanize and get information via HTML::TreeBuilder::XPath by accessing an id, I am left using Firebug to read the DOM in order to discover the layout of the HTML tree. The content that Mechanize captures is unstructured HTML, not good for human eyes.
Is using Firebug to find the id I am after a typical approach? Once I have the id I'm good to go; it's just that I've got several ids, and pages with more ids to chase down, and I was hoping to get (dump, print, etc.) a formatted layout of the DOM to make that discovery easier. Granted, Firebug makes it pretty easy, too. I'm just wondering if I am missing an easier method.
If you need text, xmllint --html --format (comes with libxml2) does a decent job.
If you want a tree that you can mess with, and a GUI in which to test out various expressions, then Xacobeo is your new best friend.
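As a rough sketch of the xmllint route, assuming the content Mechanize fetched has already been saved to a file (page.html is a made-up name), something like the following might do. The function names format_html and list_ids are purely illustrative, and the grep pattern is a crude text match rather than a real attribute query, so treat its output as a hint only.

```shell
#!/bin/sh
# Sketch only: re-serialize saved HTML with xmllint so ids are easier to spot.
# Assumes xmllint (libxml2) and a grep that supports -o are installed.

# format_html FILE: print FILE as re-serialized by xmllint.
# Parser complaints about sloppy markup go to stderr, so they are hidden here.
format_html() {
    xmllint --html --format "$1" 2>/dev/null
}

# list_ids FILE: print every id="..." string found in the formatted output.
# Crude: it would also match id="..." appearing inside ordinary page text.
list_ids() {
    format_html "$1" | grep -o ' id="[^"]*"'
}

# Usage (page.html is a hypothetical file name):
#   format_html page.html | less
#   list_ids page.html
```

Piping format_html through a pager gives the readable dump; list_ids is just a quick way to see which ids exist before going back to write the XPath.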
Note: since both those tools rely on libxml, replace HTML::TreeBuilder::XPath with HTML::TreeBuilder::LibXML for compatibility. Evaluating XPath will be faster that way, too.
If you know JavaScript/jQuery, then also install FireQuery. You can then test out CSS expressions in Firebug, and use them with modules that select HTML through CSS expressions, e.g. Web::Query.