My page often shows things like ë, Ã, ì, ù, à in place of normal characters.
I use utf8 for header page and MySQL encode. How does this happen?
My page often shows things like ë, Ã, ì, ù, à in place of normal characters.
I use utf8 for header page and MySQL encode. How does this happen?
If you see those characters you probably just didn’t specify the character encoding properly. Because those characters are the result when an UTF-8 multi-byte string is interpreted with a single-byte encoding like ISO 8859-1 or Windows-1252.
In this case ë
could be encoded with 0xC3 0xAB that represents the Unicode character ë
(U+00EB) in UTF-8.
Even though utf8_decode
is a useful solution, I prefer to correct the encoding errors on the table itself. In my opinion it is better to correct the bad characters themselves than making "hacks" in the code. Simply do a replace
on the field on the table. To correct the bad encoded characters from OP :
update <table> set <field> = replace(<field>, "ë", "ë")
update <table> set <field> = replace(<field>, "Ã", "à")
update <table> set <field> = replace(<field>, "ì", "ì")
update <table> set <field> = replace(<field>, "ù", "ù")
Where <table>
is the name of the mysql table and <field>
is the name of the column in the table. Here is a very good check-list for those typically bad encoded windows-1252 to utf-8 characters -> Debugging Chart Mapping Windows-1252 Characters to UTF-8 Bytes to Latin-1 Characters.
Remember to backup your table before trying to replace any characters with SQL!
[I know this is an answer to a very old question, but was facing the issue once again. Some old windows machine didnt encoded the text correct before inserting it to the utf8_general_ci collated table.]
I actually found something that worked for me. It converts the text to binary and then to UTF8.
Source Text that has encoding issues: If ‘Yes’, what was your last
SELECT CONVERT(CAST(CONVERT(
(SELECT CONVERT(CAST(CONVERT(english_text USING LATIN1) AS BINARY) USING UTF8) AS res FROM m_translation WHERE id = 865)
USING LATIN1) AS BINARY) USING UTF8) AS 'result';
Corrected Result text: If ‘Yes’, what was your last
My source was wrongly encoded twice so I had two do it twice. For one time you can use:
SELECT CONVERT(CAST(CONVERT(column_name USING latin1) AS BINARY) USING UTF8) AS res FROM m_translation WHERE id = 865;
Please excuse me for any formatting mistakes
These are utf-8 encoded characters. Use utf8_decode() to convert them to normal ISO-8859-1 characters.