IBM DataStage: Evaluate string as code/expression

1.6k views Asked by At

I have a complex transformation where a lookup stage specifies one of approximately 30 different/specific string operations that has to be done on a row. I am wondering how to do this efficiently in DataStage?

The requirement is something like this:

    If 
      col_a = 1 
    Then 
      col_b := some_string_function(col_c) 
    Else If 
      col_a = 2 
    Then 
      col_b := some_other_string_function(col_d) 
    Else If 
      col_a = 3 
    Then 
      col_b := yet_another_string_function(col_c & col_d) 
    Else If ...

... and so on.

What I have explored so far:

My first impulse was to include the code (field name(s) and string functions) as string/field in the lookup table and use that code after the lookup in a transformer stage expression. However, there seems to be no way to evaluate a string as code inside a transformer expression?

Another solution I have come up with is to put the code into a lot of nested control statements inside a transformer stage, which seems terribly inefficient, especially since DataStage does not seem to offer a control statement equivalent to something like "CASE"/"SWITCH". Or does it?

Substituting (part of) the control statements with a switch stage feeding into different lookup/transfomer stages would seem more efficient since they could be done in parallel but would be a pain to design.

I have not yet dabbled in server routines.

2

There are 2 answers

0
Ray Wurlod On

DataStage BASIC has a bazillion string functions. OK, maybe only 440 or so. Its CASE construct compiles to the equivalent If..Then..Else structure. Actually, both of them compile to a series of TEST..JUMP instructions at the lowest level.

0
Hana On

I'm familiar with Datastage 8.5. Having a long If/Then/Else statement in the transform would work, but yes, it's messy and inefficient.

My first thought is to use a Server routine, of type Transform function.

The function could work like this:

Transform function    
Arguments: col_A, col_C, col_D

    FUNCTION CALC_B(col_A,col_C,col_D)
    Begin Case
      Case colA = 1
           Ans = StringFunc(colC)

      Case colA = 2
           Ans = OtherStringFunc(colC,colD)

      Case colA = 3 OR colA = 4
            Ans = YetOtherStringFunc(colC,colD)
    End Case

Then in your transform you could use the function to set your col_B value.

CALC_B(myrow.colA,myrow.colC,myrow.colD)

I think the biggest problem with this is whether BASIC has the string operations you need. Below is a link to their programming page.

IBM - Working with Routines

IBM - Basic Programming Language