How do I use Hpricot to search the inner_text of all elements?

Asked by Jackson Henley on 17 September 2013 at 19:45

I would like to use Hpricot to scan the inner_text of all elements and know which element is currently being scanned. However, every approach I have tried ends up requiring recursion. Is there a built-in method for this in Hpricot (or Nokogiri)? The code below only scans one level down:

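require 'hpricot'

@t = []
doc = Hpricot(open("some html doc"))  # placeholder for the HTML source
(doc/"html").each do |e|
  # Only the direct children of <html> are inspected, hence one level.
  e.children.each do |child|
    if child.is_a?(Hpricot::Text)
      @t << child.to_s.strip
    end
  end
end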

There is 1 answer

**Mark Thomas** · Accepted Answer · 2013-09-19T01:21:43+00:00

Although I'm not sure exactly why you want to collect all text nodes (perhaps there is a more efficient solution), this should get you started:

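require 'nokogiri'
doc = Nokogiri::HTML(open('doc'))  # 'doc' is a placeholder for your HTML file

# traverse yields every node under <body>, and finally <body> itself
doc.at_css("body").traverse do |node|
  puts "***#{node.name}"  # name of the node currently being visited
  puts node.text          # for an element, the text of all its descendants
end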

It uses Nokogiri's traverse method, which recursively visits your starting node and every node beneath it, so you do not have to write the recursion yourself.
