Mojo::UserAgent - Inspect the Content-Encoding header before decoding


I'm attempting to use Mojo::UserAgent to verify the gzip compression (Content-Encoding) of an application.

Unfortunately, it appears that this UA silently decodes the content and removes the Content-Encoding header afterwards.

The following is my minimal example:

#!/usr/bin/env perl

use strict;
use warnings;

use Test::More tests => 3;

use Mojo::UserAgent;     # Version 8.26

my $ua = Mojo::UserAgent->new();

# As documented: https://docs.mojolicious.org/Mojolicious/Guides/Cookbook#Decorating-follow-up-requests
$ua->once(
    start => sub {
        my ( $ua, $tx ) = @_;
        $tx->req->headers->header( 'Accept-Encoding' => 'gzip' );
    }
);

my $tx = $ua->get('https://www.mojolicious.org');

is( $tx->req->headers->header('Accept-Encoding'), 'gzip', qq{Request Accept-Encoding is "gzip"} );

ok( $tx->res->is_success, "Response is success" );

# The following assertion fails.
# My theory is that Mojo::UserAgent is silently decoding the content, and changing
# the Content-Encoding and Content-Length to reflect the new values.  However, how
# do we inspect what the original response headers were?
is( $tx->res->headers->header('Content-Encoding'), 'gzip', qq{Response Content-Encoding is "gzip"} );

Results

$ perl mojo_useragent_content_encoding.pl
1..3
ok 1 - Request Accept-Encoding is "gzip"
ok 2 - Response is success
not ok 3 - Response Content-Encoding is "gzip"
#   Failed test 'Response Content-Encoding is "gzip"'
#   at mojo_useragent_content_encoding.pl line 30.
#          got: undef
#     expected: 'gzip'
# Looks like you failed 1 test of 3.

I was able to confirm that the payload is being gzipped by analyzing the Apache logs. Additionally, this curl call confirms that the example website uses gzip encoding for its responses:

$ curl -i -H "Accept-Encoding: gzip" https://www.mojolicious.org
HTTP/1.1 200 OK
Date: Mon, 18 Jan 2021 21:28:14 GMT
Content-Type: text/html;charset=UTF-8
...
Content-Encoding: gzip
...

I am able to use LWP::UserAgent to confirm the proper Content-Encoding of the response.

However, I'm unable to determine how to inspect the Mojo::UserAgent response to view the real headers before any theoretical post processing was performed.

Answer by clamp (best answer):

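You can call $ua->transactor->compressed(0) in your code, or set MOJO_GZIP=0 in your environment, to disable automatic decompression. The Content-Encoding header then stays on the response. Note that with compression disabled the user agent no longer negotiates gzip on its own, so keep setting Accept-Encoding explicitly, as you already do.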
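To try the first option without depending on a remote site, here is a minimal sketch that points the user agent at an embedded Mojolicious::Lite application. The route and its 'Hello World!' payload are hypothetical stand-ins for the real application, and the sketch assumes a Mojolicious release where Mojo::Util exports gzip and gunzip (8.26 should qualify).

```perl
use strict;
use warnings;

use Mojolicious::Lite;
use Mojo::UserAgent;
use Mojo::Util qw(gzip gunzip);

# Keep the embedded application quiet
app->log->level('fatal');

# Hypothetical stand-in for the real application: always replies with a gzip body
get '/' => sub {
    my $c = shift;
    $c->res->headers->content_encoding('gzip');
    $c->render( data => gzip('Hello World!') );
};

my $ua = Mojo::UserAgent->new;

# Route relative URLs to the application defined above
$ua->server->app(app);

# Disable gzip negotiation and automatic decompression
$ua->transactor->compressed(0);

# Accept-Encoding has to be sent explicitly now
my $tx = $ua->get( '/' => { 'Accept-Encoding' => 'gzip' } );

my $encoding = $tx->res->headers->content_encoding;
print 'Content-Encoding: ', $encoding // 'undef', "\n";

# The body is still compressed, so decode it by hand when it is needed
print 'Body: ', gunzip( $tx->res->body ), "\n";
```

With decompression disabled, the header should survive on the finished transaction, so the original third assertion can stay at the end of the script instead of moving into a callback.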

If you want to keep auto decompression and examine the headers before the decompression stage is reached (which also removes the Content-Encoding header), you can register a callback on the body event of the response content (Mojo::Content). This event is emitted after the headers are parsed but before the body is processed.

use strict;
use warnings;
use 5.30.0;
use Test::More tests => 3;
use Data::Dumper;
use Mojo::UserAgent;     # Version 8.26

my $ua = Mojo::UserAgent->new();

# As documented: https://docs.mojolicious.org/Mojolicious/Guides/Cookbook#Decorating-follow-up-requests
$ua->once(
    start => sub {
        my ( $ua, $tx ) = @_;
        $tx->req->headers->header( 'Accept-Encoding' => 'gzip' );

        my $res = $tx->res;
        say 'register event listener';
        $res->content->on( body => sub { test_res_encoding($tx) } );
    }
);
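# Optional: this also turns off auto decompression (the first approach above).
# Remove this line to keep auto decompression; the body callback still sees the header.
$ua->transactor->compressed(0);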

my $tx = $ua->get('https://www.mojolicious.org');

is( $tx->req->headers->header('Accept-Encoding'), 'gzip', qq{Request Accept-Encoding is "gzip"} );

ok( $tx->res->is_success, "Response is success" );

#say Dumper $tx->res->headers;
# Called from the body event: the response headers have been parsed, but the
# body has not been processed yet, so Content-Encoding is still present here.
sub test_res_encoding{
    my $tx = shift;
    is( $tx->res->headers->header('Content-Encoding'),
        'gzip',
        qq{Response Content-Encoding is "gzip"} );
}

Setting MOJO_EVENTEMITTER_DEBUG=1 in your env helps to see what is going on.
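The timing of the body event can also be checked offline by feeding a hand-built response straight into Mojo::Message::Response, with no user agent or network involved. This is only a sketch: the response bytes and the 'Hello World!' payload are made up for illustration, and it assumes that the response parser enables auto decompression by default, as it appears to in 8.x.

```perl
use strict;
use warnings;

use Mojo::Message::Response;
use Mojo::Util qw(gzip);

# Hand-built gzip response standing in for a real server reply
my $payload = gzip('Hello World!');
my $raw     = join "\x0d\x0a",
    'HTTP/1.1 200 OK',
    'Content-Encoding: gzip',
    'Content-Length: ' . length($payload),
    '',
    $payload;

my $res = Mojo::Message::Response->new;

# Record the header when the body event fires (headers parsed, body untouched)
my $seen_at_body;
$res->content->on(
    body => sub {
        my $content = shift;
        $seen_at_body = $content->headers->content_encoding;
    }
);

# Parse the whole response in one go
$res->parse($raw);

# Look at the header again once parsing, and with it decompression, is done
my $seen_after = $res->headers->content_encoding;

print 'At body event: ', $seen_at_body // 'undef', "\n";
print 'After parsing: ', $seen_after   // 'undef', "\n";
print 'Body:          ', $res->body, "\n";
```

If the behaviour matches the description above, the header should read gzip inside the callback and be gone once parsing has finished, while the body comes back decompressed.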