Kettle database lookup case insensitive

1.7k views Asked by At

I've a table "City" with more than 100k records. The field "name" contains strings like "Roma", "La Valletta".

I receive a file with the city name, all in upper case as in "ROMA". I need to get the id of the record that contains "Roma" when I search for "ROMA".

In SQL, I must do something like:

select id from city where upper(name) = upper(%name%)

How can I do this in kettle?

Note: if the city is not found, I use an Insert/update field to create it, so I must avoid duplicates generated by case-sensitive names.

3

There are 3 answers

0
Daniele Licitra On BEST ANSWER

This is how I did:

First "Modified JavaScript value" step for create a query:

var queryDest="select coalesce( (select id as idcity from city where upper(name) = upper('"+replace(mycity,"'","\'\'")+"') and upper(cap) = upper('"+mycap+"') ), 0) as idcitydest";

Then I use this string as a query in a Dynamic SQL row.

After that,

IF idcitydest == 0 then 
   insert new city; 
else
   use the found record

This system make a query for file's row but it use few memory cache

2
Rishu S On

You can make use of the String Operations steps in Pentaho Kettle. Set the Lower/Upper option to Y

Pass the city (name) from the City table to the String operations steps which will do the Upper case of your data stream i.e. city name. Join/lookup with the received file and get the required id.

More on String Operations step in pentaho wiki.

0
matthiash On

You can use a 'Database join' step. Here you can write the sql:

select id from city where upper(name) = upper(?)

and specify the city field name from the text file as parameter. With 'Number of rows to return' and 'Outer join?' you can control the join behaviour.

This solution doesn't work well with a large number of rows, as it will execute one query per row. In those cases Rishu's solution is better.