I am working on a data ingest application. I have to run queries in Oracle and then insert the result set into Vertica. However, this task takes far longer than expected. Here is what I do:
1) Execute the query in Oracle and obtain the result set. Also open a connection to Vertica and create a prepared statement.
2) Read the Oracle result set in batches of 30K records and execute them with the Vertica prepared statement, roughly as in the sketch after this list.
3) Read the next batch of records from Oracle and repeat step 2 until all records have been read from the Oracle result set.
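For reference, the batch loop looks roughly like this (a minimal sketch; source_table, target_table, the column list, and how the connections are obtained are placeholders for my actual schema):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class OracleToVerticaBatch {
    private static final int BATCH_SIZE = 30_000;

    public static void copy(Connection oracleConn, Connection verticaConn) throws Exception {
        verticaConn.setAutoCommit(false);
        try (Statement src = oracleConn.createStatement();
             PreparedStatement ps = verticaConn.prepareStatement(
                     "INSERT INTO target_table (id, name, amount) VALUES (?, ?, ?)")) {

            src.setFetchSize(BATCH_SIZE);  // fetch Oracle rows in large chunks as well
            try (ResultSet rs = src.executeQuery(
                    "SELECT id, name, amount FROM source_table")) {
                int pending = 0;
                while (rs.next()) {
                    ps.setLong(1, rs.getLong(1));
                    ps.setString(2, rs.getString(2));
                    ps.setBigDecimal(3, rs.getBigDecimal(3));
                    ps.addBatch();
                    if (++pending == BATCH_SIZE) {
                        ps.executeBatch();   // push the full batch to Vertica
                        pending = 0;
                    }
                }
                if (pending > 0) {
                    ps.executeBatch();       // flush the final partial batch
                }
            }
            verticaConn.commit();
        }
    }
}
```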
I face two problems with this approach:
1) The process is extremely slow.
2) Some records are missed and never make it into Vertica.
Solutions I am considering:
- Copy the Oracle result set to a file and pass that file to Vertica. This requires adding file-cleanup logic to the code.
- Use the VerticaCopyStream API to insert the records into Vertica tables. VerticaCopyStream is much faster than prepared statements; it needs to be given an input stream for the file.
- If solution 1 is not better than the current approach, I could instead pass an input stream built directly from the result set to the VerticaCopyStream API (see the sketch at the end of this question).
I am supposed to do this only through Java. Can you suggest an optimized solution for this situation?
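To make option 3 concrete, this is roughly what I have in mind, assuming the Vertica JDBC driver's VerticaCopyStream class (start/addStream/execute/finish) and a second thread that serializes the Oracle rows as delimited text into a pipe; table names, columns, and the delimiter are placeholders:

```java
import com.vertica.jdbc.VerticaConnection;
import com.vertica.jdbc.VerticaCopyStream;

import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingCopy {

    public static void copy(Connection oracleConn, Connection verticaConn) throws Exception {
        verticaConn.setAutoCommit(false);
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out, 1 << 20);  // 1 MB pipe buffer

        // Writer thread: serialize the Oracle result set as pipe-delimited text.
        Thread writer = new Thread(() -> {
            try (PrintWriter pw = new PrintWriter(out);
                 Statement st = oracleConn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT id, name, amount FROM source_table")) {
                while (rs.next()) {
                    pw.println(rs.getLong(1) + "|" + rs.getString(2) + "|" + rs.getBigDecimal(3));
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        // Reader side: stream the pipe straight into a Vertica COPY.
        VerticaCopyStream copy = new VerticaCopyStream(
                (VerticaConnection) verticaConn,
                "COPY target_table FROM STDIN DELIMITER '|' ABORT ON ERROR");
        copy.start();
        copy.addStream(in);
        copy.execute();
        long rowsLoaded = copy.finish();   // number of rows accepted by the COPY
        writer.join();
        verticaConn.commit();
        System.out.println("Rows loaded: " + rowsLoaded);
    }
}
```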
You can use ProcessBuilder to spawn a process that dumps the data from Oracle to a file, and then use ProcessBuilder again to load that file into Vertica. In other words, build the command line that dumps the data into a file, and then the command line that loads it back, something like the sketch below.
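A rough sketch (credentials, host name, file path, and table name are placeholders, and dump.sql is assumed to be a script that SPOOLs the query result to /tmp/dump.csv as pipe-delimited text):

```java
import java.io.File;
import java.io.IOException;

public class CommandLineDump {

    // Run a command, inherit its console output, and fail on a non-zero exit code.
    private static void run(String... command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .inheritIO()
                .start();
        if (p.waitFor() != 0) {
            throw new IllegalStateException("Command failed: " + String.join(" ", command));
        }
    }

    public static void main(String[] args) throws Exception {
        // 1) Dump from Oracle. dump.sql is assumed to contain the SET/SPOOL commands
        //    that write the query result to /tmp/dump.csv.
        run("sqlplus", "-S", "scott/tiger@ORCL", "@dump.sql");

        // 2) Bulk-load the file into Vertica with vsql's COPY.
        run("vsql", "-h", "vertica-host", "-U", "dbadmin", "-w", "password",
            "-c", "COPY target_table FROM LOCAL '/tmp/dump.csv' DELIMITER '|' ABORT ON ERROR");

        // 3) Clean up the intermediate file.
        new File("/tmp/dump.csv").delete();
    }
}
```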