In our business, we receive and need to process thousands of XML files per day and all these files are in the same format. We would like to store these data into oracle tables and reserve the hierarchical relationship of these data and then we can query them using traditional SQL and do further analysis. What is the best way to do that? Is XML DB the right choice?
Update:
Currently, I am thinking using XML DB with data stored in structured storage. So I understand that I can define a XML Schema with annotations and I know the steps involved but I would like to get confirmed answers for the following questions:
- Can I annotate one XML Schema to create multiple tables for the parent-child relationship? I would like data to be stored in relational tables, not objects.
- Can I define the Primary keys, foreign keys for all these tables? It is not very clear to me how Oracle maintain the parent-child relationship in these tables.
- Can anyone show me a good example?
You definitely want to begin with XMLDB.
XMLDB is a whole world of features and functionality within itself.
Very briefly, you have three storage options w/ XMLDB and XMLTYPE data.
You can store:
1.) Into a CLOB datatype. If you do this, the XML just sits in the database, and it's a LOB. You can't index it, search it, etc. The database is a bit bucket, and you store the XML.
2.) BINARY XML: This is the newest option, introduced in 11gR2. This will tokenize and compress the XML, and store in in an encoded binary format. The advantage here, is that you can then use Oracle Text and XMLINDEX domain indexes, to index and search the content. This is the option I have the most experience with. We have a billion documents, average size around 12k, stored in an 11gR2 database. It's working really well for us.
3.) XML to relational mapping: You define a relational schema to store the data in your XML schema, and define how the XML elements map to which relational tables and fields. On ingestion, the XML is decomposed and stored into a conventional relational model. You can then search, sort, index, as you would any traditional database.
It really depends on what you want to do, as to which XML storage model you choose to go with.
You may try asking your questions on the Oracle XMLDB forum. It's pretty active, and has some folks that are a lot more knowledgeable on the subject than I am. https://forums.oracle.com/forums/forum.jspa?forumID=34
Hope that helps.
Reply to Update 1:
I really only have experience with Binary XML storage option, sorry. I think that checking out the XMLDB Sample Code on OTN, would be useful.
Please see:
http://www.oracle.com/technetwork/indexes/samplecode/xmldb-sample-523617.html