SSIS Expression to handle NULLs in multiple columns

868 views Asked by At

I have an OLE DB source that has some nulls, it has 50 columns and I'm trying to extract the rows that have NULLs to Bad data destination, the expression that I had for the conditional split is

ISNULL([StudentName]) == TRUE

But that means I have to repeat it for all the columns that I have and I'm wondering if there's another way to handle this. especially since I have multiple tables to process. Thanks

5

There are 5 answers

0
KeithL On BEST ANSWER

Believe it or not this has been puzzling me since you posted it.

I have not figured a way to do it SSIS but here is a possible solution using a script component source. This will find all the keys associated with a row that has at least one null column.

    DataTable dt = new DataTable();

    string sql = "Enter your extract SQL here";
    using(OleDbConnection conn = new OleDbConnection("Data Source=XXXXXXXXX;Initial Catalog=YYYYY;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"))
    {
        using (OleDbCommand cmd = new OleDbCommand(sql,conn))
        {
            cmd.CommandType = CommandType.Text;
            conn.Open();
            dt.Load(cmd.ExecuteReader());
        }
    }

    foreach (DataRow row in dt.Rows)
    {
        foreach (DataColumn col in dt.Columns)
        {
            if (row[col] == DBNull.Value)
            {
                Output0Buffer.AddRow();
                Output0Buffer.NullRowKey = row.Field<int>(0); //This is the 0 based index of the key in the data
                break;
            }
        }
    }

The problem I was running into was trying to use Columns in an Output0Buffer data type which is what you have to use when using Script Component of a transformation. I need to actually use script component source and load the data into a data table in order to have access to a Columns property.

0
Hadi On

The easiest solution (I am not sure if it is recommended from performance perspective) is to use an SQL command in the OLE DB Source where you should add a new column which is a concatenation of all columns. Then, you should use this column in the conditional split instead of mentioning all columns. As example:

SELECT *, [Column1] + [Column2] + [Column3] as [CheckColumn]
FROM [Table]
0
K4M On

It's not clear if you meant the data bad when ALL columns are NULL or just ANY column is NULL.

If it is ALL, then you can do this:

COALESCE(col1, col2, col3, .....) IS NULL

If it is 'ANY`, then you can do the following:

(col1 + col2 + col3 + .....) IS NULL

In BOTH cases:

you will need to convert non-string columns (numeric, date, etc) to string via CONVERT(varchar(8000), col)

0
KeithL On

Here is a completely different solution using your already formed conditional split of:

ISNULL([StudentName])

You can build an OR test in Excel by pasting a list of columns (I get this from the metadata on the data flow path). I usually use this technique for building the destination table.

Paste into excel and get the column names into Col A.

Enter this formula into ColB:

="ISNULL("&A1&")||"

Enter this formula directly in B2:

=B1&"ISNULL("&A2&")||"

This should yield a result in B2 like this:

ISNULL(Col1)||ISNULL(Col2)||

Now drag and drop B2 to the bottom of your list (in your case B50).

B50 will now hold your final formula for the conditional split:

ISNULL(Col1)||ISNULL(Col2)||ISNULL(Col3)||ISNULL(Col4)||ISNULL(Col5)||ISNULL(Col6)||ISNULL(Col7)||ISNULL(Col8)||ISNULL(Col9)||ISNULL(Col10)||ISNULL(Col11)||ISNULL(Col12)||ISNULL(Col13)||ISNULL(Col14)||ISNULL(Col15)||ISNULL(Col16)||ISNULL(Col17)||ISNULL(Col18)||ISNULL(Col19)||ISNULL(Col20)||ISNULL(Col21)||ISNULL(Col22)||ISNULL(Col23)||ISNULL(Col24)||ISNULL(Col25)||ISNULL(Col26)||ISNULL(Col27)||ISNULL(Col28)||ISNULL(Col29)||ISNULL(Col30)||ISNULL(Col31)||ISNULL(Col32)||ISNULL(Col33)||ISNULL(Col34)||ISNULL(Col35)||ISNULL(Col36)||ISNULL(Col37)||ISNULL(Col38)||ISNULL(Col39)||ISNULL(Col40)||ISNULL(Col41)||ISNULL(Col42)||ISNULL(Col43)||ISNULL(Col44)||ISNULL(Col45)||ISNULL(Col46)||ISNULL(Col47)||ISNULL(Col48)||ISNULL(Col49)||ISNULL(Col50)||

Just delete the final ||.

1
MLeblanc On

Sure there is another way, why not create a script that will generate the condition for you?

if the 50 columns is not in a particular table, just create a tmp table with all the column you need ( select ..... into #tmp from .... )

Then, generate the condition.

declare @schema sysname = 'Purchasing'
declare @table sysname = 'PurchaseOrders'
        
;with cols as (
        select  convert(varchar(max), 'ISNULL(' + QUOTENAME(column_name) + ') == TRUE') as col, TABLE_SCHEMA, TABLE_NAME, ordinal_position 
        from INFORMATION_SCHEMA.COLUMNS 
        where TABLE_SCHEMA = @schema and TABLE_NAME = @table and ORDINAL_POSITION = 1
        union all
        select CONVERT (varchar(max) , cl.col + ' || ISNULL(' + QUOTENAME(column_name) + ') == TRUE') as col, c.TABLE_SCHEMA, c.TABLE_NAME, c.ordinal_position 
        from INFORMATION_SCHEMA.COLUMNS c
        inner join cols cl on cl.TABLE_SCHEMA = c.TABLE_SCHEMA and cl.TABLE_NAME = c.TABLE_NAME and c.ORDINAL_POSITION = cl.ORDINAL_POSITION + 1
    )
    select '= ' + cols.col 
    from cols
    where ORDINAL_POSITION = (select MAX(ordinal_position) from cols)

using the WideWorldImporters databases, this will result with condition, remove what not needed.

= ISNULL([PurchaseOrderID]) == TRUE || ISNULL([SupplierID]) == TRUE || ISNULL([OrderDate]) == TRUE || ISNULL([DeliveryMethodID]) == TRUE || ISNULL([ContactPersonID]) == TRUE || ISNULL([ExpectedDeliveryDate]) == TRUE || ISNULL([SupplierReference]) == TRUE || ISNULL([IsOrderFinalized]) == TRUE || ISNULL([Comments]) == TRUE || ISNULL([InternalComments]) == TRUE || ISNULL([LastEditedBy]) == TRUE || ISNULL([LastEditedWhen]) == TRUE

But beware, no matters which method you choose (mine or generating condition with Excel), testing 50 columns in SSIS might add an enormous overheap, especially if you have a lot of rows.

A better alternative would be to put a flag on each row in the dataset and filter/redirect whether "HasNullValues" is set or not.