Strange Behaviour From htmlentities


As I understand it, the HTML entity version of an apostrophe (single quote) is &#039;.

That is how it is being encoded when I add data to my database. However, when I try to search the database I run into a problem, because the code I use gives the apostrophe as &#39;, i.e. the zero is missing.

I have stripped the page down to the most basic bit of test code:

$hotelname = filter_input(INPUT_GET, 'hotelname', FILTER_SANITIZE_STRING);
//$hotelname = "Auberge de l'Etang Bleu";
$hotelname = htmlentities($hotelname,ENT_QUOTES,"UTF-8");
echo $hotelname;
exit();

Why would it do that? Is there something in different versions of PHP or something like that?

To further muddy the waters, if I comment out the first line and uncomment the second line, htmlentities appears to do nothing at all and it echoes the version with the apostrophe.

Punctuation and accents always drive me mad, but this is even worse than usual. Is it me? (No doubt it is.)

EDIT
See my solution down below.


There are 2 answers

4
ceejayoz (best answer)

As I understand it, the html entity version of an apostrophe (single quote) is &#039;.

You understand incorrectly. &#39; is correct (quick test indicates browsers understand both, but your DB will consider them different strings).

You should also consider encoding on display, not when saving to your database, which would prevent the difference from mattering at all. The PHP docs seem to indicate that some versions output &#039; and others output &#39;.
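A minimal sketch of the "encode on display" idea, not a definitive implementation. The helper name e() and the $hotels array are hypothetical; the array only stands in for a table that holds the raw, unencoded names. The point is that the search compares raw text with raw text, so it no longer matters which spelling of the entity an encoder produces.

```php
<?php
// Hypothetical helper: encode a value only at the moment it is written into HTML.
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Assumption: stand-in for a database table that stores the raw names.
$hotels = ["Auberge de l'Etang Bleu", "Hotel du Lac"];

// Raw search term, as it would arrive in $_GET['hotelname'].
$search = "Auberge de l'Etang Bleu";

// Compare raw with raw; no entities are involved in the lookup.
$matches = array_values(array_filter($hotels, fn($name) => $name === $search));

// Encode only while printing.
foreach ($matches as $name) {
    echo '<p>', e($name), '</p>', PHP_EOL;
}
```

With a real database the comparison would happen in the query, ideally through a prepared statement, and e() would still be applied only when echoing.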

0
TrapezeArtist

In the end it seems that it was FILTER_SANITIZE_STRING that was the culprit. It was deleting the 0 from &#039;.
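A small sketch that may help show the two behaviours from the question side by side. One assumption is labelled in the code: the $filtered literal stands in for the value FILTER_SANITIZE_STRING hands back, since that filter writes single quotes as &#39; and is deprecated as of PHP 8.1, so it is not called here.

```php
<?php
// Raw value, as in the commented-out test line from the question.
$raw = "Auberge de l'Etang Bleu";

// htmlentities() with ENT_QUOTES writes the apostrophe with a leading zero.
$direct = htmlentities($raw, ENT_QUOTES, 'UTF-8');

// Assumption: literal standing in for the output of FILTER_SANITIZE_STRING,
// which has already turned the apostrophe into &#39;.
$filtered = "Auberge de l&#39;Etang Bleu";

// A second encoding pass escapes the ampersand of the existing entity.
$double = htmlentities($filtered, ENT_QUOTES, 'UTF-8');

echo $direct, PHP_EOL; // Auberge de l&#039;Etang Bleu
echo $double, PHP_EOL; // Auberge de l&amp;#39;Etang Bleu
```

Viewed in a browser, the first line renders as a plain apostrophe, which is why htmlentities seemed to "do nothing", while the second renders as the literal text &#39;.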

I think I have now solved it by switching to:

$hotelname = $_GET['hotelname'];
$hotelname = htmlentities($hotelname, ENT_QUOTES);
$hotelname = mysqli_real_escape_string($db,$hotelname);