I am testing an endpoint that returns XML. Using the following code, I am able to fetch the XML and convert it into an object:

$url = "http://myendpoint.com/payload"; // just a sample URL; the real one is confidential

$opts = array('http' => array('header' => 'Accept-Charset: UTF-8, *;q=0'));
$context = stream_context_create($opts);
$output = file_get_contents($url, false, $context);
$xml = simplexml_load_string($output);

Our database contains text in foreign languages, which is certainly stored in different charsets. With the code above, I am able to retrieve the output properly. However, I have observed that file_get_contents() takes too much time, so I decided to use cURL.

Here is my cURL code:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, "");
$output = curl_exec($ch);
$xml = simplexml_load_string($output);

The problem is that the foreign-language characters are not being recognized. simplexml_load_string() keeps reporting:

PCDATA invalid Char value 8

I'm quite sure this is because I am unable to convert this text into a UTF-8 readable format.
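
A minimal sketch for narrowing this down. The number in a libxml message of this form is the character code of the offending character, so value 8 would be the backspace control character (0x08), which XML 1.0 does not allow in any encoding. The sketch assumes a made-up sample payload in place of the real response and an ASCII-compatible encoding such as UTF-8; the helper name findInvalidXmlChars is hypothetical.

```php
<?php
// Hypothetical helper: lists control characters that XML 1.0 forbids,
// keyed by their byte offset in the input.
// Assumes an ASCII-compatible encoding (e.g. UTF-8), where bytes below
// 0x20 never occur inside a multibyte sequence.
function findInvalidXmlChars(string $xml): array
{
    $found = [];
    // Below 0x20, XML 1.0 allows only tab (0x09), LF (0x0A) and CR (0x0D).
    if (preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', $xml, $m, PREG_OFFSET_CAPTURE)) {
        foreach ($m[0] as [$char, $offset]) {
            $found[$offset] = ord($char);
        }
    }
    return $found;
}

// Made-up sample standing in for the real response body:
// valid UTF-8 text followed by a stray backspace byte.
$output = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><root><name>caf\u{00E9}\x08</name></root>";

// Collect libxml errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_string($output);
$errors = array_map(fn($e) => trim($e->message), libxml_get_errors());
libxml_clear_errors();

// Offsets and character codes of anything XML 1.0 forbids.
$invalid = findInvalidXmlChars($output);
```

If $invalid is non-empty for the real response, the parse failure comes from control characters in the data rather than from the charset, and $errors shows what libxml reported for the same input.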


One of the fixes I tried was adding the following code, in the hope of imitating my file_get_contents() approach:

$headers = array(
    "Accept-Charset: UTF-8, *;q=0"
);

//cURL code
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

But to no avail; the problem still persists.
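
The Accept-Charset header only states a preference; the server may ignore it. Whether a body is actually valid UTF-8 can be checked directly, independent of the transport. A minimal sketch, assuming made-up sample strings in place of the real response (the helper name isValidUtf8 is hypothetical):

```php
<?php
// Hypothetical helper: true when $body is well-formed UTF-8.
// With the /u modifier PCRE validates the subject string first, and
// preg_match() returns false when it is not valid UTF-8.
function isValidUtf8(string $body): bool
{
    return preg_match('//u', $body) === 1;
}

// Made-up samples: the same word encoded as UTF-8 and as ISO-8859-1.
$utf8Body   = "<name>caf\xC3\xA9</name>";
$latin1Body = "<name>caf\xE9</name>";

// A backspace byte is still valid UTF-8, even though XML 1.0 rejects it.
$controlBody = "<name>caf\xC3\xA9\x08</name>";

$utf8Ok    = isValidUtf8($utf8Body);
$latin1Ok  = isValidUtf8($latin1Body);
$controlOk = isValidUtf8($controlBody);
```

With a live handle, curl_getinfo($ch, CURLINFO_CONTENT_TYPE) returns the Content-Type the server actually sent, including any charset parameter, which can be compared against the result of this check.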

How can I fix this? I have already spent a lot of time debugging this error.
