I am attempting to parse an HTML table using Nokogiri. The table is marked up well and has no structural issues, except that the table header is embedded as an actual row instead of using <thead>. The problem is that I want every row but the first: I'm not interested in the header, only in everything that follows it. Here's an example of how the table is structured:
<table id="foo">
<tbody>
<tr class="headerrow">....</tr>
<tr class="row">...</tr>
<tr class="row_alternate">...</tr>
<tr class="row">...</tr>
<tr class="row_alternate">...</tr>
</tbody>
</table>
I'm interested in grabbing only the rows with the classes row and row_alternate. However, this syntax is not legal in Nokogiri as far as I'm aware:
doc.css('.row .row_alternate').each do |a_row|
  # do stuff with a_row
end
What's the best way to solve this with Nokogiri?
I would try this: