I have two web pages:
Page 1:
<data>
<item>
<name>Item 1</name>
<url>http://someUrl.html</url>
</item>
</data>
Page 2: http://someUrl.html
<data>
<info>Info 1</info>
<info>Info 2</info>
<info>Info 3</info>
</data>
I want to crawl page 1, follow all the links there, and generate the following output:
Item 1, Info 1
Item 1, Info 2
Item 1, Info 3
...
How can I achieve this using Xidel?
I only recently found Xidel, so I'm no expert, but in my opinion it's an extremely powerful Swiss Army knife of a command-line scraping tool that deserves to be known by many more people.
Now, to answer your question I think the following (using html-templates) does exactly what you want:
Or, even shorter with CSS selectors:
Or, shortest with XPath:
The shortest line possible (but not in CSV format) would be:
The above commands are for Windows, so make sure to swap the single and double quotes when on Mac/Unix! If you need an explanation of the different parts of the lines, just ask... :-) Cheers!
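For anyone who wants to see the logic spelled out, here is a rough sketch in plain Python (not Xidel) of the crawl described in the question: read each name and URL from page 1, follow the URL, and pair the name with every info element on the linked page. The two pages are inlined as strings, and `PAGES`, `fetch`, and `crawl` are hypothetical names made up for this illustration; a real run would replace `fetch` with an actual HTTP request.

```python
import xml.etree.ElementTree as ET

# The two pages from the question, inlined so the sketch runs offline.
PAGE_1 = """<data>
<item>
<name>Item 1</name>
<url>http://someUrl.html</url>
</item>
</data>"""

PAGE_2 = """<data>
<info>Info 1</info>
<info>Info 2</info>
<info>Info 3</info>
</data>"""

# Hypothetical stand-in for the web: maps a URL to its page source.
PAGES = {"http://someUrl.html": PAGE_2}


def fetch(url):
    """Return the source of the page at url (assumption: no real HTTP here)."""
    return PAGES[url]


def crawl(index_xml, fetch=fetch):
    """Follow every <url> on the index page and pair <name> with each <info>."""
    rows = []
    for item in ET.fromstring(index_xml).iter("item"):
        name = item.findtext("name")
        url = item.findtext("url")
        linked = ET.fromstring(fetch(url))
        for info in linked.iter("info"):
            rows.append(f"{name}, {info.text}")
    return rows


if __name__ == "__main__":
    print("\n".join(crawl(PAGE_1)))
```

Passing `fetch` as a parameter keeps the pairing logic separate from how the pages are actually retrieved.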