Getting data from a database and outputting it in another DB dynamically with Pentaho


I've been trying to solve the following problem for a while now. I hope you guys can help.

I am copying a table from one database to another using Pentaho. In a Table Input step I select the last Max ID; then, in a second Table Input step, I select some of the table's columns with a condition of < Max ID from the previous step. Finally, I map these fields in a Table Output step.
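To make the three-step flow concrete, here is a minimal sketch in Python with sqlite3 standing in for the two database servers. The `orders` table and its columns are hypothetical stand-ins for one of the real tables; the `< Max ID` comparison is kept exactly as described above.

```python
import sqlite3

# Two in-memory databases stand in for the source and target servers.
# The "orders" table and its columns are hypothetical examples.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, 10.0), (2, 20.0), (3, 30.0)])
dst.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

# Step 1: first Table Input -- read the last Max ID.
max_id = src.execute("SELECT MAX(id) FROM orders").fetchone()[0]

# Step 2: second Table Input -- select the columns with the "< Max ID"
# condition, as the question describes it.
rows = src.execute("SELECT id, amount FROM orders WHERE id < ?",
                   (max_id,)).fetchall()

# Step 3: Table Output -- write the selected rows to the target database.
dst.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", rows)
dst.commit()
```

Repeating this pattern per table is what produces the 100 near-identical transformations the question is trying to eliminate.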

The problem is that I have 100 separate transformations, one per table, which as you might imagine is very hard to maintain. I've been trying to build a loop that picks up each table, selects its rows, and passes them on dynamically, but I am struggling to do so.

Has anyone encountered this scenario?


There are 2 answers

AudioBubble

I've been trying to create a loop which would get each table, select the rows and pass the rows dynamically

Given that, if you want to automate running the transformations, use a Job. Go to:

file > new > job

and there you can add your transformations, chaining them from the "Start" entry, whose schedule can be used to run them.

Bert-Jan Stroop

If I understand your problem correctly, you are trying to set up a generic transformation for loading data from table X on data server A to table Y on data server B, so that you can feed lists of tables and columns into it and run it for all of those tables without building separate transformations for each one.

If that is the case, then the magic word is metadata injection.

Just create a single transformation the way you want it to work for all tables, leaving the table-name and column fields of its steps empty.

Then, instead of calling this transformation from another transformation, use the "ETL Metadata Injection" step. It works much like calling a sub-transformation, except that you can push lists into it from your data flows, which are then used to populate the metadata of the underlying transformation.
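The injection idea can be sketched outside Pentaho as one generic copy routine whose table and column names arrive as data rather than being hard-coded. The `metadata` list below plays the role of the rows pushed into the ETL Metadata Injection step; the table names are hypothetical examples.

```python
import sqlite3

def copy_table(src, dst, table, columns):
    """Generic copy routine: table/column names are supplied as metadata."""
    cols = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = src.execute(f"SELECT {cols} FROM {table}").fetchall()
    dst.executemany(f"INSERT INTO {table} ({cols}) VALUES ({placeholders})",
                    rows)
    dst.commit()

# In-memory databases stand in for the two servers; the tables are
# hypothetical stand-ins for the 100 real ones.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
for db in (src, dst):
    db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    db.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.execute("INSERT INTO customers VALUES (1, 'Ann')")
src.execute("INSERT INTO orders VALUES (1, 9.5)")

# The "injected" metadata: one entry per table, like the lists a driving
# transformation would push into the ETL Metadata Injection step.
metadata = [("customers", ["id", "name"]), ("orders", ["id", "amount"])]
for table, columns in metadata:
    copy_table(src, dst, table, columns)
```

One transformation (here, one function) then serves every table; adding a table means adding a metadata row, not building a new transformation.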

More details can be found here: