I need to validate the request Accept header using a regex (Python). The regex must match application/json or application/jose+jwe, optionally followed by additional parameters (charset=utf8 and a q value).

Originally the requirement was to accept only application/json with an optional q value, and no other MIME types. I had the following regex, which worked.

(^application/json;q=(0|1|(0\.[1-9]))$)|(^application/json$)

I now need to be able to include the charset parameter (charset=utf8) and to match anywhere in the line.

I am new to regex and created the following regex, but it does not meet all the requirements (https://regex101.com/r/vFMCcI/11):

(application/json; q=(0|1|(0\.[1-9])))$|(application\/json; charset=utf8)|(application\/json; charset=utf8 q=(0|1|(0\.[1-9])))|(application/json)

The test strings are

application/json,
application/json; q=0.2
application/json; charset=utf8
application/json; q=0.2 charset=utf8
application/json; charset=utf8 q=0.2
text/html, application/json; q=0.2, application/pdf

application/jose+jwe
application/jose+jwe; q=0.2
application/jose+jwe; charset=utf8
application/jose+jwe; q=0.2 charset=utf8
application/jose+jwe; charset=utf8 q=0.2
text/html, application/jose+jwe; q=0.2, application/pdf

  1. Why am I only getting partial matches for application/json; charset=utf8 q=0.2?
  2. The regex is getting too long, and application/jose+jwe is not even included yet. It is adding milliseconds to the requests. Any pointers on how this can be better optimised?

EDIT:

Q value must be 0-1 and only to 1 decimal place

0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1

The only acceptable charset value is charset=utf8

Thanks

2 Answers

1
Spirit On

Try this regex. It works with an optional "q" and an optional "charset".

application\/(json|jose\+jwe)(;\s)?((charset=utf8|q=[0-1]\.\d)(\s)?)*

https://regex101.com/r/ABjXH4/5
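
As a rough illustration of the partial match from the question, here is a small Python sketch comparing the original pattern with the one above. In an alternation the branches are tried left to right, so the shorter "charset=utf8" branch succeeds before the longer "charset=utf8 q=..." branch is ever tried. The names ORIGINAL, SUGGESTED and matched_text are made up for this example, and it assumes re.search is how the header is checked.

```python
import re

# The pattern from the question. Alternatives are tried left to right, and the
# shorter "charset=utf8" branch comes before the "charset=utf8 q=..." branch.
ORIGINAL = re.compile(
    r"(application/json; q=(0|1|(0\.[1-9])))$"
    r"|(application\/json; charset=utf8)"
    r"|(application\/json; charset=utf8 q=(0|1|(0\.[1-9])))"
    r"|(application/json)"
)

# The pattern from this answer, compiled once so every request reuses it.
SUGGESTED = re.compile(
    r"application\/(json|jose\+jwe)(;\s)?((charset=utf8|q=[0-1]\.\d)(\s)?)*"
)


def matched_text(pattern, header):
    """Return the substring the pattern matched, or None if nothing matched."""
    m = pattern.search(header)
    return m.group(0) if m else None


header = "application/json; charset=utf8 q=0.2"
print(matched_text(ORIGINAL, header))   # application/json; charset=utf8
print(matched_text(SUGGESTED, header))  # application/json; charset=utf8 q=0.2
```

Note that the q part of this pattern, [0-1]\.\d, always expects a decimal digit, so a bare q=0 or q=1 is not picked up as a parameter and a value such as q=1.5 is. Tighten that part if the q range matters for validation.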

2
Rob Barber On

Here's a more specific pattern.

^application\/(json|jose\+jwe)(;(( q=(1|0\.\d))|( charset.utf8))+)?$

This will match what you described above. On the first test case there is a trailing comma but I wasn't sure if you wanted to include that. It's a simple add though.

^application\/(json|jose\+jwe)(;(( q=(1|0\.\d))|( charset.utf8))+)?,?$
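
Because these patterns are anchored with ^ and $, they will not match the list-style test strings such as "text/html, application/json; q=0.2, application/pdf". One possible way around that, sketched below in Python, is to split the header on commas and check each entry on its own. This is only a sketch under a few assumptions: the names PARAM, MEDIA_RE and accepts_supported_type are invented here, the q part is tightened to the values listed in the question's EDIT (0, 0.1 to 0.9, 1), charset is matched with a literal "=", and parameters are separated by a single space as in the question's test strings (real Accept headers separate parameters with ";").

```python
import re

# One parameter: a q value of 0, 0.1 ... 0.9 or 1, or the literal charset=utf8.
PARAM = r"(?:q=(?:0(?:\.[1-9])?|1)|charset=utf8)"

# Media type, then optionally "; " and one or more space-separated parameters
# (the separator style used in the question's test strings).
# Compiled once at import time so each request only pays for the match itself.
MEDIA_RE = re.compile(
    rf"application/(?:json|jose\+jwe)(?:; {PARAM}(?: {PARAM})*)?"
)


def accepts_supported_type(header: str) -> bool:
    """True if any comma-separated entry is a supported media type."""
    return any(MEDIA_RE.fullmatch(part.strip()) for part in header.split(","))


for sample in (
    "application/json,",
    "application/json; charset=utf8 q=0.2",
    "text/html, application/jose+jwe; q=0.2, application/pdf",
    "application/json; q=1.5",
):
    print(sample, "->", accepts_supported_type(sample))
```

This sketch does not stop a parameter from being repeated (for example "q=0.2 q=0.3"); if that matters, parsing the parameters after the split is probably simpler than extending the regex.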