Search by class in Nokogiri nodeset


I got the name of a CSS class from a Nokogiri node. Now I want to find all the nodes that also have the same class attached.

I don't know which HTML tag the element I'm looking for uses, or how deeply it is nested. All I know is which class to search for.

I have already tried:

doc.xpath("//*[contains(@class, #{css})]")

But this seems to return WAY too many elements.

Also I have tried:

doc.xpath("//*[@class, #{css}]")

and this returns nothing.

I want to get the elements that contain that class, not every element that surrounds an element with that class.

Is it possible to do this with Nokogiri?

There are 2 answers

egwspiti (best answer)

Assuming that the class name is stored in class_name, I think that

doc.xpath("//*[contains(concat(' ', normalize-space(@class), ' '), ' #{class_name} ')]")

is what you're looking for.

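This will match every element whose class list includes class_name. For example, if class_name is 'box', it matches both elements like div class="box" and elements like div class="box left". It does not match div class="boxed", because the padding spaces force a comparison against whole class names rather than substrings.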

If you only want to match elements like div class="box", i.e. elements whose only class is the one you're looking for, then you could use this:

doc.xpath("//*[@class=\"#{class_name}\"]")

Jimeux

As I said in my comment, .css() or .search() can find all elements of a given class.

Here's an example from a scraper I wrote a while ago. It finds the only .content div on the page (at() will select the first element only), and then finds all .col divs inside it. Then it loops through them and prints the title.

content = page.at('.content')
content.css('.col').each do |col|
  puts col.at('h5').text
end