open_uri / Nokogiri redirection problems

I am using Nokogiri to scrape web pages, which works fine unless the page redirects to another URL.

When I scrape this site: https://www.cardcomplete.com/besuchen-isie-uns-auf-facebook/

I get this error:

/home/balint/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/open-uri.rb:224:in `open_loop': redirection forbidden: https://www.cardcomplete.com/besuchen-isie-uns-auf-facebook/ -> http://www.facebook.com/cardcomplete (RuntimeError)

When I try to scrape the redirect target (http://www.facebook.com/cardcomplete) directly, I get the same error, but this time the redirect goes to the https version of the Facebook page:

/home/balint/.rvm/rubies/ruby-2.2.1/lib/ruby/2.2.0/open-uri.rb:224:in `open_loop': redirection forbidden: http://www.facebook.com/cardcomplete -> https://www.facebook.com/cardcomplete (RuntimeError)

Of course, scraping the https version of the Facebook page directly works.

I installed the open_uri_redirections gem, which fixes the Facebook http -> https redirection but not the first link (https -> http):

doc = Nokogiri::HTML(open('https://www.cardcomplete.com/besuchen-isie-uns-auf-facebook/', :allow_redirections => :safe))

How can I solve this?
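
The two errors above differ in direction: the first redirect goes from https to http, the second from http to https. As far as I understand the open_uri_redirections gem, `:safe` only permits http -> https, while `:all` also permits https -> http, which may be why `:safe` does not help with the first link. The sketch below only classifies a redirect by its schemes and picks the matching option. The helper names `redirect_kind` and `redirection_option_for` are hypothetical and not part of open-uri or the gem, and the code makes no network request.

```ruby
require 'uri'

# Hypothetical helper (not part of open-uri or open_uri_redirections):
# classify a redirect by comparing the schemes of the two URLs.
def redirect_kind(from, to)
  from_scheme = URI.parse(from).scheme.to_s.downcase
  to_scheme   = URI.parse(to).scheme.to_s.downcase

  return :same_scheme if from_scheme == to_scheme
  return :upgrade     if from_scheme == 'http'  && to_scheme == 'https'
  return :downgrade   if from_scheme == 'https' && to_scheme == 'http'
  :other
end

# Hypothetical helper: map the redirect kind to the :allow_redirections
# value that (by assumption, per the gem's README) permits it.
# Returns nil when no option should be needed or none applies.
def redirection_option_for(from, to)
  case redirect_kind(from, to)
  when :upgrade   then :safe # http -> https
  when :downgrade then :all  # https -> http needs the permissive mode
  end
end

first  = ['https://www.cardcomplete.com/besuchen-isie-uns-auf-facebook/',
          'http://www.facebook.com/cardcomplete']
second = ['http://www.facebook.com/cardcomplete',
          'https://www.facebook.com/cardcomplete']

p redirection_option_for(*first)  # => :all
p redirection_option_for(*second) # => :safe
```

If that assumption about the gem holds, passing `:allow_redirections => :all` instead of `:safe` in the `open` call would be the thing to try for the first link. Note that following an https -> http redirect means the final request is sent unencrypted.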
