Trying to wrap my brain around HTML character encodings and htmlspecialchars()

I've been trying to get a proper understanding of character-encoding in HTML, and was hoping someone might be able to help me out with a small problem I've been encountering.

I'm pulling a paragraph of text from a MySQL database table (latin-1). The paragraph happens to have a right single quote in it, and I read that it's a good idea to run that sort of string data through htmlspecialchars() before displaying it on the screen, so I tried...

// So let's say $paragraph is a string like "The customer&#8217;s computer is on".

echo htmlspecialchars($paragraph);

This renders to the screen as "The customer&#8217;s computer is on". At first I thought that was weird, because I expected the &#8217; to automatically be rendered as a right single quote, but then I thought maybe I'd forgotten the meta tag. Since the database table was latin-1, I thought the following tag would help it render correctly...

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">

But still no dice, it's still showing up as &#8217;. I also tried...

htmlspecialchars($paragraph, ENT_QUOTES, 'ISO-8859-1');

But it's still rendering the same. If I don't even use htmlspecialchars() it renders to the screen as expected, but I guess I'm just trying to understand why htmlspecialchars() doesn't render the way I was expecting. Maybe I'm completely misunderstanding the functions and how they're supposed to be rendering in the browser, so any help on this would be really appreciated, thanks!

Edit: To add some more oddness to the equation, I tried manually typing &#8217; into the HTML document, and it does in fact render as a right single quote. However, when I look at the HTML source I see &amp;#8217; where htmlspecialchars() is outputting, instead of the &#8217; I was expecting. Does anyone know why that might be? Is that expected functionality?
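A minimal sketch of what seems to be going on, assuming the stored text holds the literal character reference &#8217; rather than an actual quote character (the variable names below are illustrative, not from the original code). htmlspecialchars() escapes the leading & of the reference, so the browser shows the reference as text instead of interpreting it. The fourth parameter, double_encode, can be set to false to leave existing entities alone, and html_entity_decode() is another option if the page is served as UTF-8:

```php
<?php
// Assumption: the database value contains the six-character reference, not a real quote.
$paragraph = 'The customer&#8217;s computer is on';

// Default behaviour: the "&" itself is escaped, so the reference is double-encoded.
$escaped = htmlspecialchars($paragraph, ENT_QUOTES, 'ISO-8859-1');
echo $escaped, "\n"; // The customer&amp;#8217;s computer is on

// double_encode = false leaves already-encoded entities untouched.
$kept = htmlspecialchars($paragraph, ENT_QUOTES, 'ISO-8859-1', false);
echo $kept, "\n"; // The customer&#8217;s computer is on

// Alternative: decode to a real U+2019 first (needs a charset that contains it,
// such as UTF-8), then escape the decoded text for output.
$decoded = html_entity_decode($paragraph, ENT_QUOTES, 'UTF-8');
$safe = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
echo $safe, "\n";
```

The first echo matches the symptom described above: the HTML source contains &amp;#8217;, which the browser displays as the text &#8217;.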

There is 1 answer

Phlume

Refer to this post: HTML code for an apostrophe

The apostrophe and the right single quote are two different characters. Perhaps it is rendering correctly because the apostrophe is what is in the db?
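To illustrate the distinction, here is a small sketch (assuming UTF-8 strings and PHP 7 or later for the \u{...} escape; the variable names are made up for the example). htmlspecialchars() with ENT_QUOTES converts the ASCII apostrophe but leaves a real right single quote alone, while htmlentities() converts it to a named entity:

```php
<?php
$apostrophe = "'";         // U+0027 APOSTROPHE
$rightQuote = "\u{2019}";  // U+2019 RIGHT SINGLE QUOTATION MARK, UTF-8 encoded

// Different characters, different bytes.
echo bin2hex($apostrophe), "\n"; // 27
echo bin2hex($rightQuote), "\n"; // e28099

// htmlspecialchars() only touches &, <, >, " and (with ENT_QUOTES) the apostrophe.
$a = htmlspecialchars($apostrophe, ENT_QUOTES, 'UTF-8');
$r = htmlspecialchars($rightQuote, ENT_QUOTES, 'UTF-8');
echo $a, "\n"; // &#039;
echo $r === $rightQuote ? "unchanged" : "changed", "\n"; // unchanged

// htmlentities() converts every character that has a named entity.
$e = htmlentities($rightQuote, ENT_QUOTES, 'UTF-8');
echo $e, "\n"; // &rsquo;
```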