I occasionally have to write simple Perl scripts to export data from XML files into CSV files for loading into a database.
I am running into a problem when printing an element that has no value. Instead of printing nothing, the script prints a string such as "HASH(0x1ca05f8)" (the address varies).
How do I stop it from doing this?
Below are the code and the data that I am using. Thanks, --sw
parse.pl:
#!/usr/bin/perl
# use module
use XML::Simple;
use Data::Dumper;
# create object
$xml = new XML::Simple;
# read XML file
$data = $xml->XMLin("$ARGV[0]", ForceArray=>1);
foreach $pr (@{$data->{product}})
{
    foreach $rv (@{$pr->{reviews}})
    {
        foreach $fr (@{$rv->{fullreview}})
        {
            print "$ARGV[1]", ",";
            print "$ARGV[2]", ",";
            print "$ARGV[3]", ",";
            print "$ARGV[4]", ",";
            print $pr->{"pageid"}->[0], ",";
            print $fr->{"status"}->[0], ",";
            print $fr->{"source"}->[0], ",";
            print $fr->{"createddate"}->[0], ",";
            print $fr->{"overallrating"}->[0], ",";
            print $fr->{"email_address_from_user"}->[0], ",";
            foreach $csg (@{$fr->{confirmstatusgroup}})
            {
                print join(";", @{$csg->{"confirmstatus"}});
            }
            print "\n";
        }
    }
}
data.xml:
<?xml version="1.0" encoding="UTF-8"?>
<products xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <product xsi:type="ProductWithReviews" locale="en_US">
    <pageid>bshnbat612</pageid>
    <reviews>
      <fullreview>
        <status>Approved</status>
        <createddate>2014-03-28</createddate>
        <source>email</source>
        <confirmstatusgroup>
          <confirmstatus>Verified Purchaser</confirmstatus>
          <confirmstatus>Verified Reviewer</confirmstatus>
        </confirmstatusgroup>
        <overallrating>5</overallrating>
        <email_address_from_user/>
      </fullreview>
    </reviews>
  </product>
</products>
The output this creates:
,,,,bshnbat612,Approved,email,2014-03-28,5,HASH(0xe9fee8),Verified Purchaser;Verified Reviewer
In response to a suggestion made below, here is the Dumper output:
$VAR1 = {
  'xmlns:xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
  'product' => [
    {
      'xsi:type' => 'ProductWithReviews',
      'reviews' => [
        {
          'fullreview' => [
            {
              'source' => [
                'email'
              ],
              'email_address_from_user' => [
                {}
              ],
              'overallrating' => [
                '5'
              ],
              'confirmstatusgroup' => [
                {
                  'confirmstatus' => [
                    'Verified Purchaser',
                    'Verified Reviewer'
                  ]
                }
              ],
              'status' => [
                'Approved'
              ],
              'createddate' => [
                '2014-03-28'
              ]
            }
          ]
        }
      ],
      'pageid' => [
        'bshnbat612'
      ],
      'locale' => 'en_US'
    }
  ]
};
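The {} under 'email_address_from_user' in the dump is an empty hash reference, and a reference printed as a string shows up as HASH(0x...). The following is only a minimal sketch of that behaviour in plain Perl; the %fullreview hash is hand-built to mimic the dump, and scalar_or_blank is a hypothetical helper name, not part of XML::Simple.

```perl
use strict;
use warnings;

# Hand-built stand-in for two entries of the dumped structure (assumption:
# same shape as 'email_address_from_user' and 'overallrating' above).
my %fullreview = (
    email_address_from_user => [ {} ],
    overallrating           => [ '5' ],
);

# Hypothetical helper: return plain scalars unchanged and turn undef or
# any reference into an empty string.
sub scalar_or_blank {
    my ($value) = @_;
    return ( defined $value && !ref $value ) ? $value : '';
}

# Interpolating a hash reference yields its type and address.
my $raw = "$fullreview{email_address_from_user}[0]";
print "raw:     $raw\n";

# The guarded versions print an empty field and the real rating.
print "guarded: [", scalar_or_blank( $fullreview{email_address_from_user}[0] ), "]\n";
print "rating:  ", scalar_or_blank( $fullreview{overallrating}[0] ), "\n";
```

A guard like this could wrap each printed field if changing the parser options is not possible.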
Take a look at the SuppressEmpty option that can be passed to XML::Simple. Without it, XML::Simple provides an empty hash reference for each empty element (the {} in your Dumper output), and that is what gets printed as HASH(0x...). By calling
XMLin("$ARGV[0]", ForceArray=>1, SuppressEmpty=>1);
your output should be:
,,,,bshnbat612,Approved,email,2014-03-28,5,,Verified Purchaser;Verified Reviewer
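As a rough sketch of how the different SuppressEmpty settings behave, assuming XML::Simple is installed: the inline XML string is a cut-down stand-in for data.xml, and the variable names are only illustrative.

```perl
use strict;
use warnings;
use XML::Simple;

# Cut-down stand-in for one <fullreview> element (assumption, not the real file).
my $xml_text = '<fullreview><overallrating>5</overallrating>'
             . '<email_address_from_user/></fullreview>';

# No SuppressEmpty: the empty element becomes an empty hash reference.
my $default = XMLin($xml_text, ForceArray => 1);

# SuppressEmpty => 1: the empty element is left out of the result.
my $skipped = XMLin($xml_text, ForceArray => 1, SuppressEmpty => 1);

# SuppressEmpty => '': the empty element becomes an empty string.
my $blank = XMLin($xml_text, ForceArray => 1, SuppressEmpty => '');

print "default: ", ref( $default->{email_address_from_user}[0] ), "\n";
print "skipped: ",
    ( exists $skipped->{email_address_from_user} ? "key present" : "key absent" ), "\n";
print "blank:   [", $blank->{email_address_from_user}[0], "]\n";
```

With SuppressEmpty => 1 the key is missing, so $fr->{"email_address_from_user"}->[0] is undef; it prints as nothing but would raise an "uninitialized value" warning if the script enabled warnings. SuppressEmpty => '' keeps the key with an empty string and avoids that warning.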