How do we work with large XML schema(XSD)/XML messages on GreenPlum?
I reviwed the GreenPlum DBA guide. It talks about processing XML messages using STX tranformation.
If we have an XSD and we need to process the XML messages that was compliant with XSD.
How do we approach this situation in GreenPlum?
What are the best approaches?
Are there any other approaches to process XSD based /XML in Greenplum?
Thanks for your insights.
In general, there's no built-in functionality for processing XML files in Greenplum. The thing you are talking about is the gpfdist client-side transformation that allows you to convert your XML to CSV and then process it inside of the database.
The best option in processing the XMLs inside of the Greenplum would be to decompose them to the relational model and load as a set of tables. If you need to process just a general XML files, you can write a function in PL/Java or PL/Python that would parse XML for you and extract the information you need.
Here's an example of the function:
And here's an example of its invocation: