showing umlauts in html with utf8 charset


This question has most likely been answered many times before, but I have been searching for hours and still don't understand one basic thing (most probably the UTF-8 charset itself).

I have an HTML file containing the German umlauts "ä" and "ö":

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
hällö
</body>
</html>

which results in the output "h�ll�".

When I leave out <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />, the result is "hällö" in my browser (probably because of some German charset setting?), as it should be.

Why don't umlauts work like "normal" (ASCII) characters with the UTF-8 charset, and what can I do to make them work (besides encoding, decoding and masking)?


There are 2 answers

Lucky's On BEST ANSWER

If you specify "charset=utf-8", the file you upload or serve must actually be encoded in UTF-8.

To do this on Windows:

  1. Open your HTML/PHP file in Notepad.
  2. Go to "File" and choose "Save As".
  3. Set the "Encoding" field to "UTF-8" and save.

-> Profit
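If you would rather not do this by hand, the same re-save can be sketched in a few lines of Python. This is only an illustration: the function name `reencode_to_utf8` and the default source encoding `cp1252` (Windows-1252) are my assumptions, so check what your file is really encoded in before running it.

```python
from pathlib import Path


def reencode_to_utf8(path, source_encoding="cp1252"):
    """Rewrite the file at `path` as UTF-8.

    `source_encoding` is an assumption about the file's current encoding;
    decoding with the wrong one will silently produce wrong characters.
    """
    file = Path(path)
    # Work on raw bytes so that line endings are left untouched.
    text = file.read_bytes().decode(source_encoding)
    file.write_bytes(text.encode("utf-8"))
```

You would call it as `reencode_to_utf8("index.html")`, where `index.html` is a placeholder for your own file. Run it only once per file: running it again on a file that is already UTF-8 would garble the umlauts.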

Remy Lebeau On

which results into the output of "h�ll�".

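Those boxes are actually the Unicode codepoint U+FFFD REPLACEMENT CHARACTER, which means your HTML file is not actually encoded in UTF-8. The bytes stored in the file for ä and ö (most likely the single bytes 0xE4 and 0xF6 of ISO-8859-1 or Windows-1252) are not valid UTF-8 byte sequences, so the browser replaces each of them with U+FFFD.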
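You can reproduce the effect outside the browser. The following Python sketch assumes the file was saved as ISO-8859-1 (Windows-1252 yields the same bytes for these two letters) and decodes those bytes the way a UTF-8 decoder would:

```python
# Bytes of "hällö" as an ISO-8859-1 editor would save them (assumed encoding).
raw = "hällö".encode("iso-8859-1")  # b'h\xe4ll\xf6'

# Decoding as UTF-8: 0xE4 and 0xF6 are not valid sequences here,
# so each one is replaced with U+FFFD.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the file was really saved in works fine.
as_latin1 = raw.decode("iso-8859-1")

print(as_utf8)
print(as_latin1)
```

The first line printed contains two U+FFFD characters in place of the umlauts, and the second is the intended "hällö".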

You need to either:

  • make sure the file is actually saved in UTF-8 to begin with.

  • change your declared charset to what it really is (most likely ISO-8859-1) (and make sure it also matches the charset attribute of the HTTP Content-Type header, if present).

  • use HTML named entities instead of actual characters:

    h&auml;ll&ouml;
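If you go the entity route for more than a few words, the replacement can be automated. Here is a minimal Python sketch using the standard library's `html.entities.codepoint2name` table; the helper name `to_entities` is made up for this example, and characters without a named entity fall back to a numeric reference.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace every non-ASCII character with an HTML entity."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)  # plain ASCII stays as it is
        elif code in codepoint2name:
            parts.append(f"&{codepoint2name[code]};")  # named entity, e.g. &auml;
        else:
            parts.append(f"&#{code};")  # numeric fallback
    return "".join(parts)


print(to_entities("hällö"))  # h&auml;ll&ouml;
```

The result is pure ASCII, so it displays the same regardless of which charset the page declares.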