HAProxy 1.4: remove multiple slashes

1.9k views Asked by At

I want to remove multiple slashes with haproxy:

$ curl -I "http://www.host.com//some_path/sub_path/slug/"

and make it send a 301 response for SEO purposes:

HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path/subpath/slug/

First approach: Create ACL -> replace slashes -> redirect

acl multiple-bars path_reg /{2,}
reqrep ^([^\ :]*)\ /{2,}(.*)     \1\ /\2 if multiple-bars
...
redirect prefix / code 301 if multiple-bars

The reqrep changes the URL so that "multiple-bars" is no more true. Therefore I got no 301 redirect

$ curl -I "http://www.host.com//some_path/sub_path/slug/"
HTTP/1.1 200 OK
domain=www.host.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Content-Type: text/html; charset=UTF-8
Date: Tue, 09 Jun 2015 18:16:36 GMT

Second approach: Create a Custom Header and redirect if it finds the header

acl multiple-bars path_reg /{2,}
reqadd X-HadMultipleBars if multiple-bars
reqrep ^([^\ :]*)\ /{2,}(.*)     \1\ /\2 if multiple-bars
...
redirect prefix / code 301 if { hdr_cnt(X-HadMultipleBars) 1 }

But HAProxy first processes reqrep rules and then reqadd. So:

[ec2-user@haproxy-stage-m3m-a haproxy]$ sudo service haproxy restart
Stopping haproxy:                                          [  OK  ]
Starting haproxy: [WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:87] : a 'reqrep' rule placed after a 'reqadd' rule will still be processed before.
[WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:103] : a 'reqidel' rule placed after a 'reqadd' rule will still be processed before.
[WARNING] 159/181957 (11430) : parsing [/etc/haproxy/haproxy.cfg:104] : a 'reqrep' rule placed after a 'reqadd' rule will still be processed before.
                                                           [  OK  ]

It won't redirect:

$ curl -I "http://www.host.com//some_path/sub_path/slug/"
HTTP/1.1 200 OK
domain=www.host.com
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Content-Type: text/html; charset=UTF-8
Date: Tue, 09 Jun 2015 18:16:36 GMT

Third approach:

acl multiple-bars path_reg /{2,}
reqrep ^([^\ :]*)\ /{2,}(.*)    \1\ /\2\nX-HadMultipleBars: if multiple-bars
...
redirect prefix / code 301 if { hdr_cnt(X-HadMultipleBars) 1 }

I don't like this hack, it looks dirty, but it works: the hdr_cnt gets the value of 1.

It works with multiple slashes:

$ curl -I "http://www.host.com/////some_path/sub_path/slug/"
HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path/sub_path/slug/

But it doesn't work with sub_path multiple slashes:

$ curl -I "http://www.host.com//////some_path///sub_path////slug/////"
HTTP/1.1 301 Moved Permanently
Cache-Control: no-cache
Content-length: 0
Location: /some_path///sub_path////slug/////

I'm having trouble to make this regex somewhat recursive, but this code removes some 'random' slashes at each time:

http://www.host.com//////some_path///sub_path////slug/////
301 -> Location: /some_path///sub_path////slug/////
301 -> Location: /some_path//sub_path//slug///
301 -> Location: /some_path/sub_path/slug//
301 -> Location: /some_path/sub_path/slug/
200!

I tried to use this regex which works at http://www.regexr.com/ instead of backreferencing to ([^\ :]*)

#Parses Valid URL characters:
([!#$&-.0-;=?-[\]_a-z~]|%[0-9a-fA-F]{2})+

but it doesn't work on HAProxy 1.4 :( Can someone give me some help here?

Thanks!

1

There are 1 answers

0
alabama On

on haproxy version > 1.6

acl has_multiple_slash  path_reg /{2,}

http-request redirect code 301 location http://%[hdr(host)]%[url,regsub(/+,/,g)] if has_multiple_slash