<Search>
<Country>USA</Country>
<Region>West</Region>
<Address>
<Home>
<Item>
<id>Number</id>
<value>135</value>
</Item>
<Item>
<id>Street</id>
<value>Pacific</value>
</Item>
<Item>
<id>City</id>
<value>Irvine</value>
</Item>
</Home>
<Home>
<Item>
<id>Number</id>
<value>1672</value>
</Item>
<Item>
<id>Street</id>
<value>Madison</value>
</Item>
<Item>
<id>City</id>
<value>Denver</value>
</Item>
</Home>
</Address>
</Search>
I am trying to create the table structure below, but I am not getting the desired result.
| Country | Region | Map |
|---|---|---|
| USA | West | {Number:135,Street:Pacific,City:Irvine} |
| USA | West | {Number:1672,Street:Madison,City:Denver} |
`CREATE EXTERNAL TABLE search(
country string,
region string,
search array<struct<item:map<string,string>>>
)
PARTITIONED BY(date STRING)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES(
"column.xpath.country" = "/Search/country/text()",
"column.xpath.region" = "/Search/region/text()",
"column.xpath.item"="/Search/Address/Home/Item"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/search'
TBLPROPERTIES (
"xmlinput.start"="",
"xmlinput.end"=""
);`
Is this possible, or are there any other suggestions for getting the data into the format above? Any help would be appreciated. Thank you.
Given the XML, the best you can do is probably something like this:
Here is the output:
To get to the elements:
To get the desired output, you will have to flatten the `address` array using LATERAL VIEW: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView
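As a minimal sketch of that flattening step, assuming the table is named `search` and exposes an array-typed column named `address` with one element per `<Home>` (the element type depends on the DDL, which is not reproduced here; the aliases `a` and `home` are made up for illustration), the query could look roughly like this:

```sql
-- Assumption: search.address is an ARRAY column with one element per <Home>.
-- explode() emits one row per array element; LATERAL VIEW joins each of
-- those rows back to the country and region of the record it came from.
SELECT
  s.country,
  s.region,
  a.home            -- one <Home> entry per output row
FROM search s
LATERAL VIEW explode(s.address) a AS home;
```

With the sample XML this would give two rows, one per `<Home>`, each repeating `USA` and `West`. Turning the `id`/`value` pairs of each home into a single map may still need an extra step, depending on how the column is typed.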