I am parsing an XML file from a URL (code below) using file_get_contents() and SimpleXML, and inserting the data into a table. That part works, but I have a problem with the encoding of Russian words. I get this: Черногория. The file and database encoding are both set to UTF-8.
require_once 'mysql_connect.php';
/**
*
*
*/
error_reporting(E_ALL);
$sql = "CREATE TABLE IF NOT EXISTS `db_countries` (
`id` int(11) unsigned NOT NULL auto_increment,
`countrykey` varchar(255) NOT NULL default '',
`countryname` varchar(255) NOT NULL default '',
`countrynamelat` varchar(500) NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8";
mysql_query($sql);
$data = file_get_contents("http://www2.turtess-online.com.ua/export/dictionary/countries/");
$xml = new SimpleXMLElement($data);
echo $xml->body->dictionary->element["countryName"];
foreach ($xml->body->dictionary->element as $element) {
    // Cast each attribute to a string and escape every value used in the query.
    $countryname = mysql_real_escape_string((string) $element["countryName"]);
    $countrynamelat = mysql_real_escape_string((string) $element["countryNameLat"]);
    $countrykey = mysql_real_escape_string((string) $element["countryKey"]);
    if ($countrykey !== '') {
        $q = 'INSERT INTO db_countries (countrykey, countryname, countrynamelat) VALUES ("' . $countrykey . '", "' . $countryname . '", "' . $countrynamelat . '")';
        mysql_query($q);
    } else {
        echo "not valid key of country";
    }
}
Make sure the content you insert is actually UTF-8 as well; the database charset does not do any "automagic" conversion.
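One way to check this before inserting is to test whether the bytes in hand are valid UTF-8. The sketch below is only an illustration: it assumes the mbstring extension is loaded, and the helper name describeEncoding is made up here. With the legacy mysql_* functions used in the question, the connection charset matters too; mysql_set_charset('utf8') exists for that purpose.

```php
<?php
// Hypothetical helper (not from the original post): reports whether a string
// is valid UTF-8 and shows its raw bytes as hex, to help diagnose encoding issues.
// Assumes the mbstring extension is available.
function describeEncoding(string $value): string
{
    $valid = mb_check_encoding($value, 'UTF-8') ? 'valid UTF-8' : 'NOT valid UTF-8';
    return $valid . ' (' . bin2hex($value) . ')';
}

// The Cyrillic letter "Ч" encoded as UTF-8 is the two bytes D0 A7.
echo describeEncoding("\xD0\xA7"), "\n";

// The same letter in Windows-1251 is the single byte D7,
// which on its own is not a valid UTF-8 sequence.
echo describeEncoding("\xD7"), "\n";
```

If the check fails for a value, that value needs converting before it goes into the query.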
As an alternative, I would suggest
utf8_encode($countryname)
Update: indeed, the XML source file declares a Windows-1251 charset.
Update 2: I tested the code against this nifty little function and it works at last :)
Credit goes to Martin Petrov.
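The function mentioned in the second update is not reproduced in this post. As an independent illustration (not Martin Petrov's code), a minimal sketch of converting raw Windows-1251 bytes to UTF-8 might look like the following. It assumes the mbstring extension is loaded, and the helper name cp1251ToUtf8 is hypothetical. Two caveats: utf8_encode() assumes ISO-8859-1 input, so it does not convert Windows-1251 text correctly, and SimpleXML normally returns attribute values already converted to UTF-8, so a conversion like this applies only to bytes that really are still Windows-1251.

```php
<?php
// Hypothetical helper (an assumption for illustration, not the function
// credited in the post): converts Windows-1251 bytes to UTF-8.
// Assumes the mbstring extension is available.
function cp1251ToUtf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1251');
}

// "Черногория" written out as raw Windows-1251 bytes (one byte per letter).
$raw = "\xD7\xE5\xF0\xED\xEE\xE3\xEE\xF0\xE8\xFF";
$utf8 = cp1251ToUtf8($raw);

echo $utf8, "\n";

// Confirm the converted string is valid UTF-8.
var_dump(mb_check_encoding($utf8, 'UTF-8'));
```

Converting once, at the point where the raw bytes are read, avoids double-encoding later in the pipeline.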