The dreaded � character and displaying info from database in UTF8

So, I have a database and I use Navicat. We have a simple PHP website which is a few years old and we've upgraded the site to UTF8.

We have 'activities' on the site which handle UTF8 special characters perfectly, but we also have 'comments' on the site and curly single quotes and other special characters show me a �.

The database was converted to UTF-8 via:

ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

When I look at both tables in Navicat, I can see both are UTF8 and utf8_general_ci.

When I open the 'activities' table in design view, I can see the column is a MEDIUMTEXT and is set up with UTF8. When I open the 'comments' table in design view, the column that isn't working is a BLOB and it doesn't have any character encoding info.

We're doing a pretty basic SELECT and then displaying via $variable[column].

Does anyone know why the 'activities' would work perfectly with UTF8 and the 'comments' would have issues? We're not doing anything super fancy to either of them.

I have tried converting the BLOB to a text field, but when I do that the database then escapes itself when it's outputting to the page, so as soon as there is a single quote in the text, the output cuts off.

I have tried things like utf8_encode, stripslashes, mysql_real_escape_string, htmlentities, htmlspecialchars, but I'm not sure any of them would help anyway.

Thanks!

There is 1 answer

Daniel W.

BLOB means binary large object. Raw binary data has no character encoding attached to it.

So you have latin1 or whatever data in a blob, and you show it and treat it like utf-8 data.

You need to manually convert the data using PHP or whatever.
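A minimal sketch of that manual conversion in PHP, assuming the old rows in the BLOB are Windows-1252 (a guess on my part: a curly single quote is the lone byte 0x92 there, which is not valid UTF-8 on its own, hence the �). The helper name blob_to_utf8 and the source encoding are hypothetical, it needs the mbstring extension, and you should check what your data really contains before trusting it.

```php
<?php
// Hypothetical helper: convert bytes read from a BLOB column to UTF-8.
// ASSUMPTION: legacy rows are Windows-1252; adjust $assumedSource to match your data.
function blob_to_utf8(string $bytes, string $assumedSource = 'Windows-1252'): string
{
    // Bytes that are already valid UTF-8 (this includes plain ASCII) are
    // returned unchanged, so rows saved after the upgrade are not double encoded.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }

    // Otherwise reinterpret the bytes as the assumed legacy encoding.
    return mb_convert_encoding($bytes, 'UTF-8', $assumedSource);
}

// A right curly quote stored as the single Windows-1252 byte 0x92.
$raw  = "It\x92s here";
$text = blob_to_utf8($raw);

// Escape for HTML on output; ENT_QUOTES also escapes single quotes.
echo htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), PHP_EOL;
```

The validity check is only a heuristic: a few legacy byte sequences happen to be valid UTF-8 too, so spot-check some rows. ENT_QUOTES is there because text being cut off at a single quote often means the value is landing inside a quoted HTML attribute, but that is an assumption about your page.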

Here is a good article from the MySQL Performance Blog that describes what you can do:

http://www.mysqlperformanceblog.com/2013/10/16/utf8-data-on-latin1-tables-converting-to-utf8-without-downtime-or-double-encoding/

If you have problems running your queries, use the console instead of phpMyAdmin, and don't forget to set the connection encoding with SET NAMES:

master> ALTER TABLE t CONVERT TO CHARACTER SET utf8, CHANGE comment comment TEXT;
master> SET NAMES utf8;