update mysql table encode special characters

3.7k views Asked by At

I have a mysql table called "Movies". I just found that some movies contains special characters. For example, there are some movies that I crawled and inserted in my table such as 'WALL·E' or 'Mr. and Mrs. Smith', but they are inserted in DB as 'WALL & middot ; E' and 'Mr. & amp ; Mrs. Smith'. This caused a problem in my survey application, because for example I have an autocompletion field in my survey where user should insert his/her favorite movie name, so if user try to find WALL·E, he has to insert 'WALL & middot ; E'!!

Is there any way (preferably in mysql) that I can update my table and replace all movies that contains special characters with their encoded names?

Thanks,

3

There are 3 answers

5
spencer7593 On BEST ANSWER

It's not a MySQL issue.

The issue is the values you are storing to the database.

If you want to "decode" the HTML entities in a string into the characters they represent, you can use PHP html_entity_decode function on the value you've scraped from web pages, and then store that value in the database.

Reference: http://php.net/manual/en/function.html-entity-decode.php


If you want to store the values in the database as encoded HTML entities, then you could search for those by taking whatever value the user is searching for, and encoding that. (That probably wasn't clear).

If the user is searching for "Mr. & Mrs. Smith", you could encode that using the PHP htmlentities function, to return:

 Mr. & Mrs. Smith

And use that as the search string.

The downside of this approach is that there are multiple ways to encode the same character. And the way that htmlentities encodes a string may not match what was done on the webpage. It's possible that some characters may not be encoded at all, etc.

4
Markus AO On

If you don't mind dealing with each character individually, you can just do this for each:

UPDATE movies SET movie_title = REPLACE(movie_title, '·', '·');

That is, if you want to replace the HTML entity with the actual character. If you just want to remove the hard-to-type character from current movie titles, simply use '' instead. Do take a backup before you go nuking your database with REPLACE though.

You may want to create a new column, e.g. movie_easytype_title, where you store a version of the title that's easy to type and match, and still retain the proper full title for display.

You should probably be running html_entity_decode on your titles before you input them in the database to avoid storing troublesome HTML entities to begin with.

3
milan kyada On

These solution sounds weird and i also know that it is for url ecoding and decoding. But it works.
while inserting data into database you can use urlencode().
And while displaying it you should use urldecode().