When attempting to load a page that is a UTF-8 encoded CSV, using Mechanize 2.5.1, I used the following code:
a.content_encoding_hooks << lambda{|httpagent, uri, response, body_io|
response['Content-Encoding'] = 'none' if response['Content-Encoding'].to_s == 'UTF-8'
}
p4 = a.get(redirect_url, nil, ['accept-encoding' => 'UTF-8'])
but I find that the content encoding hook is not being called and I get the following error and traceback:
/Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:787:in 'response_content_encoding': unsupported content-encoding: UTF-8 (Mechanize::Error)
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:274:in 'fetch'
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:949:in 'response_redirect'
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:299:in 'fetch'
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:949:in 'response_redirect'
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize/http/agent.rb:299:in 'fetch'
from /Users/jackrg/.rbenv/versions/1.9.2-p290/lib/ruby/gems/1.9.1/gems/mechanize-2.5.1/lib/mechanize.rb:407:in 'get'
from prototype/test1.rb:307:in `<main>'
Does anyone have an idea why the content hook code is not firing and why I am getting the error?
What makes you think the content encoding hook is not being called?
The error message points at response_content_encoding in mechanize/http/agent.rb (line 787 in your traceback).
Mechanize only recognizes the content encodings '7bit', 'deflate', 'gzip', and 'x-gzip'.
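To make the failure concrete, here is a rough sketch of that kind of check, written against Ruby's standard library only. It is not Mechanize's source: `decode_body` and `gzip_bytes` are made-up names, and treating a missing header as pass-through is my assumption. The point is that anything outside the known list, such as 'UTF-8', ends in an "unsupported content-encoding" error.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (NOT Mechanize's code): pick a decoder based on the
# Content-Encoding header value and return the decoded body as a String.
def decode_body(content_encoding, body_io)
  case content_encoding
  when nil, '7bit'          # assumption: no header means "leave the body alone"
    body_io.read
  when 'deflate'
    Zlib::Inflate.inflate(body_io.read)
  when 'gzip', 'x-gzip'
    Zlib::GzipReader.new(body_io).read
  else
    # A charset name such as 'UTF-8' lands here, because it is not a
    # content coding at all.
    raise ArgumentError, "unsupported content-encoding: #{content_encoding}"
  end
end

# Hypothetical helper used only to build test input for the sketch.
def gzip_bytes(string)
  io = StringIO.new(String.new)   # binary buffer
  gz = Zlib::GzipWriter.new(io)
  gz.write(string)
  gz.finish                       # flush the gzip trailer, keep io open
  io.string
end

csv = "id,name\n1,alice\n"
puts decode_body('gzip', StringIO.new(gzip_bytes(csv)))

begin
  decode_body('UTF-8', StringIO.new(csv))
rescue ArgumentError => e
  puts e.message
end
```

A server that sends `Content-Encoding: UTF-8` is therefore mislabeling a charset as a content coding, and a client written this way has no decoder to pick for it.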
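According to the HTTP/1.1 spec, a content coding is a transformation applied to the entity body, typically compression such as gzip or deflate.
In other words, an HTTP content encoding has nothing to do with ASCII vs. UTF-8 vs. Latin-1. The character set is declared in the charset parameter of the Content-Type header instead.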
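A small sketch of that separation, standard library only. The helper names (`charset_from_content_type`, `decode_text`) are hypothetical, and falling back to UTF-8 when no charset is given is an assumption, not something HTTP or Mechanize promises.

```ruby
# Hypothetical helper: pull the charset parameter out of a Content-Type
# header value, e.g. "text/csv; charset=UTF-8". Returns nil if absent.
def charset_from_content_type(content_type)
  return nil if content_type.nil?
  match = content_type.match(/;\s*charset\s*=\s*"?([^";\s]+)"?/i)
  match && match[1]
end

# Hypothetical helper: label the raw body bytes with the declared charset.
# Assumption: fall back to UTF-8 when the header does not name one.
def decode_text(raw_bytes, content_type, default = 'UTF-8')
  name = charset_from_content_type(content_type) || default
  raw_bytes.dup.force_encoding(Encoding.find(name))
end

body = "caf\xC3\xA9,1\n".b   # raw bytes as they might come off the wire
text = decode_text(body, 'text/csv; charset=UTF-8')
puts text.encoding
puts text.valid_encoding?
```

The charset decides how bytes become characters after any content coding (gzip, deflate) has been undone, which is why the two headers are handled separately.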
In addition, the source code for Mechanize::HTTP::Agent has this in it:
So it doesn't even look like you are calling the right hook.
Here is an example I got to work:
myprog.rb: