Match browsers set to Scandinavian languages based on "Accept-Language"

Question

I am trying to match browsers set to Scandinavian languages based on the HTTP header "Accept-Language".

My regex is:

^(nb|nn|no|sv|se|da|dk).*

My question is whether this is sufficient, and whether anyone knows of any other odd (but "valid") Scandinavian language codes, or of obscure browser bugs that cause false positives.

Used for

The regex is used to display an English link at the top of the Norwegian web pages (Norwegian is the primary language and lives at the root of the domain and sub-domains). The link takes you to the English web pages (the secondary language, in a folder under the root) and is shown when the browser language is not Scandinavian. The link can be closed ("opted out") with a hash stored in JavaScript localStorage if the user doesn't want to see it again. We decided not to use IP geo-location because of limited time to implement it.

There are 2 answers

Matthew On BEST ANSWER

Depending on the language you are working in, there may already be code you can use to parse this header easily; see, for example, the post "Parse Accept-Language header in Java", which also provides a good code example.

Further, are you sure you want to anchor your regex to the start of the string? Several languages can be provided (the first is intended to mean "I prefer x but also accept the following"): http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
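To make the point about multiple languages concrete, here is a minimal sketch in Python that splits the header into its items and orders them by their `q` weight before applying the pattern from the question. The function names (`parse_accept_language`, `accepts_scandinavian`, `prefers_scandinavian`) are made up for this illustration, and the case-insensitive flag is my addition, on the assumption that browsers may vary the casing of language tags.

```python
import re

# Pattern text taken from the question; re.IGNORECASE is an added assumption.
SCANDINAVIAN = re.compile(r"^(nb|nn|no|sv|se|da|dk).*", re.IGNORECASE)


def parse_accept_language(header):
    """Return the language tags in the header, most preferred first."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        q = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((tag, q))
    # sorted() is stable, so items with equal weight keep their header order.
    ordered = sorted(ranges, key=lambda r: r[1], reverse=True)
    return [tag for tag, q in ordered if q > 0]


def accepts_scandinavian(header):
    """True if any acceptable language in the header matches the pattern."""
    return any(SCANDINAVIAN.match(tag) for tag in parse_accept_language(header))


def prefers_scandinavian(header):
    """True only if the most preferred language matches the pattern."""
    tags = parse_accept_language(header)
    return bool(tags) and SCANDINAVIAN.match(tags[0]) is not None
```

With a header such as `en-US,en;q=0.9,nb;q=0.8`, `accepts_scandinavian` is true while `prefers_scandinavian` is false, so which of the two you want depends on whether a Scandinavian language listed as a fallback should already suppress the English link.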

Otherwise your regex should work fine for what you were asking. Here is a list of all browser language codes: http://www.metamodpro.com/browser-language-codes

In your shoes, I would also make the "switch to X language" link easy to find for all users until they have opted not to see it again. I would expect that many people have a preference set by default in their browser but find it unexpected when a site actually uses it, i.e. a user experience like:

I prefer English but don't know enough to change this setting, and have never had a reason to before, as so few sites make use of it.

Mario Rossi On

That regular expression is sufficient if you are testing each item in Accept-Language individually.

If you are matching the whole header instead, there are two problems:

  • One of the expected languages might appear not at the beginning of the header, but later in it.
  • Some of the expected language abbreviations could appear as a qualifier (region subtag) of a completely different language.
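Both problems can be reproduced with a few lines of Python. This is only a sketch: `UNANCHORED` and `PER_ITEM` are hypothetical variants of the question's pattern that I introduce for illustration, and the case-insensitive flag on them is an assumption of mine, not something from the question.

```python
import re

# The pattern exactly as given in the question.
ANCHORED = re.compile(r"^(nb|nn|no|sv|se|da|dk).*")

# Hypothetical variant without the anchor, to show the second problem.
UNANCHORED = re.compile(r"(nb|nn|no|sv|se|da|dk)", re.IGNORECASE)

# Hypothetical per-item variant: the code must be followed by "-" or the end
# of the tag, so longer codes that merely start with the same letters fail.
PER_ITEM = re.compile(r"^(nb|nn|no|sv|se|da|dk)(-|$)", re.IGNORECASE)

# Problem 1: Swedish is listed, but not first, so the anchored pattern
# finds nothing when applied to the whole header.
swedish_missed = ANCHORED.match("en-US,en;q=0.9,sv;q=0.8") is None

# Problem 2: without the anchor, the region subtag of "English as used in
# Sweden" is mistaken for a language.
region_matched = UNANCHORED.search("en-SE") is not None


def scandinavian_items(header):
    """Test each item individually, ignoring its q parameter."""
    tags = [item.split(";")[0].strip() for item in header.split(",")]
    return [tag for tag in tags if PER_ITEM.match(tag)]
```

Here `scandinavian_items("en-SE,en;q=0.9,sv;q=0.8")` yields only `sv`, because `en-SE` is tested as a whole item and does not start with one of the codes. SE, DK and NO are also ISO 3166 country codes, which is how they end up as region subtags of other languages in the first place.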