Why does filter_var()'s FILTER_SANITIZE_STRING filter encode single quotes as &#39; and double quotes as &#34; while htmlentities() encodes single quotes as &#039; and double quotes as &quot;?
Code Sample:
<?php
$string = "Well that's \"different.\"";
echo "filter_var: ".filter_var($string, FILTER_SANITIZE_STRING)."\n";
echo "htmlentities: ".htmlentities($string, ENT_QUOTES)."\n";
echo "htmlspecialchars: ".htmlspecialchars($string, ENT_QUOTES)."\n";
Output:
filter_var: Well that&#39;s &#34;different.&#34;
htmlentities: Well that&#039;s &quot;different.&quot;
htmlspecialchars: Well that&#039;s &quot;different.&quot;
It's because the filter extension has nothing to do with HTML processing. It doesn't use the HTML entity conversion table. It is just a stupid encoding based on the ASCII value.

" is 34 in ASCII
' is 39 in ASCII

The same applies to any other character that the filter extension converts to HTML-encoded form. It takes the ASCII numerical value in decimal, prepends &# and appends ;. That's it! It's simple and efficient, even if it's not very correct.

No offence to anyone, but using this extension for anything HTML related is a rather dumb idea. The constant FILTER_SANITIZE_STRING is deprecated now and will be removed in a future version of PHP. There is a filter, FILTER_SANITIZE_FULL_SPECIAL_CHARS, which is just a wrapper around htmlspecialchars(), but I can't think of any reason to use it over the plain htmlspecialchars() function.

Some of these filters are a remnant of the era of lazy PHP. Developers used lazy approaches to security like magic quotes, which didn't provide enough security and often led to more mess. These HTML filters were created with the same lazy approach in mind: it's better to provide something than nothing to mitigate XSS. However, this is definitely not the recommended practice anymore. Please escape the output correctly, using the function appropriate to the output context, to avoid XSS rather than relying on filters for sanitization.
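To make the "ASCII value in decimal" rule concrete, here is a minimal sketch that mimics it for the two quote characters only. The function name encodeAsDecimalReference() is made up for this illustration, and this is not the filter extension's actual code (the real filter also strips tags, among other things). It only shows why the numbers come out as 39 and 34 without the leading zero or the named entity.

```php
<?php
// Hypothetical helper, for illustration only: replaces each listed
// character with "&#" + its decimal ASCII code + ";".
function encodeAsDecimalReference(string $input, string $chars = "\"'"): string
{
    $out = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $byte = $input[$i];
        // strpos() returns false when the byte is not in the list
        $out .= strpos($chars, $byte) !== false
            ? '&#' . ord($byte) . ';'
            : $byte;
    }
    return $out;
}

$string = "Well that's \"different.\"";

// Numeric encoding, as described above
echo encodeAsDecimalReference($string), "\n";
// prints: Well that&#39;s &#34;different.&#34;

// What you should actually use when writing into HTML
echo htmlspecialchars($string, ENT_QUOTES), "\n";
// prints: Well that&#039;s &quot;different.&quot;
```

Both forms decode to the same characters in a browser; the difference is only in which conversion produced them.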