Perl: UTF-8 characters get distorted while transmitting


I have a script which sends HL7 messages to Mirth. Here is the program:

use Net::HL7;
use Net::HL7::Connection;
use open ( ":encoding(UTF-8)", ":std" );
binmode(STDOUT, ":utf8");


my $conn = new Net::HL7::Connection('127.0.0.1', 7010);
my $msg = getHl7Message(); # Too many things happening in getHl7Message()
print($msg);               # All characters are correct in $msg when printed

my $hl7msg = new Net::HL7::Message($msg);

print($hl7msg->toString(1)); # All characters are correct
my $response = $conn->send($hl7msg); # Sent to Mirth

Now, when I check Mirth, all characters outside the ASCII set are distorted.

What shall I do? Net::HL7::Connection uses IO::Socket internally.

I am also receiving this warning: Wide character in print at /usr/local/share/perl5/Net/HL7/Connection.pm line 143. I tried running the script with -CS, but it made no difference.

Some info:

[user@server gs]$ lsb_release -a
LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID: OracleServer
Description:    Oracle Linux Server release 6.6
Release:    6.6
Codename:   n/a

/etc/environment and /etc/default/locale were empty. I added these two lines:

LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

Result:

[user@server gs]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8

There is 1 answer

mob (best answer)

You are putting the UTF-8 encoding layer on STDOUT, but you don't put that layer on the HL7 channel (whatever that is) and you don't encode your message in UTF-8. That is why you see the "Wide character in print" warning, and why -CS does not help: it only affects the standard streams, not the connection. Write a UTF-8 encoded string to the HL7 connection.

use Encode;

my $msg = getHl7Message();
# $msg is a character string; it has not been encoded yet.
# map { ord } split //, $msg should produce some values larger than 255

print($msg);
# This is OK because the :utf8 layer was applied to STDOUT.
# Behind the scenes, the string $msg is encoded before it is output
# to the terminal.

my $msgutf8 = Encode::encode("UTF-8", $msg);
# Now $msgutf8 is a UTF-8 encoded "octet string".
# map { ord } split //, $msgutf8 should only produce values between 0 and 255,
# so it is safe to transmit.

my $hl7msg = new Net::HL7::Message($msgutf8);
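
To see the difference between the character string and the octet string without involving Net::HL7 or Mirth at all, a small standalone sketch like the following may help. The sample text in $chars is a hypothetical stand-in for whatever getHl7Message() returns; only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample text standing in for the output of getHl7Message().
# Written with \x{...} escapes so the source file itself stays pure ASCII.
my $chars = "Gr\x{00FC}\x{00DF}e \x{0141}\x{00F3}d\x{017A}";

# Turn the character string into UTF-8 octets, as the answer suggests.
my $octets = encode("UTF-8", $chars);

# Largest ordinal value found in each string.
my ($max_char)  = sort { $b <=> $a } map { ord } split //, $chars;
my ($max_octet) = sort { $b <=> $a } map { ord } split //, $octets;

printf "characters: %d, largest code point: %d\n", length($chars),  $max_char;
printf "octets:     %d, largest byte value: %d\n", length($octets), $max_octet;

# Decoding the octets should give back the original characters.
my $back = decode("UTF-8", $octets);
print "round trip ok\n" if $back eq $chars;
```

If the largest code point is above 255, printing $chars to a handle without an encoding layer is what triggers "Wide character in print". The encoded $octets never contains a value above 255, which is why it can be written to a raw socket. The receiving side (Mirth, in this case) still has to be configured to read the channel as UTF-8.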