SSIS query with rowcount conditional split writes 0 rows

1k views Asked by At

I have an OLE DB source with a simple query that writes a flat csv file to the network. If I do only that, row(s) are written from the source to the file fine. But, I only want to do that if there are more than 0 rows.

So, I added a variable (apptCount) and a conditional split. (first time using these) The data viewer shows rows are there between the oledb data source and the row count object that is mapped to the variable apptCount The data viewer shows rows are also between the row count object and the conditional split object. In the conditional spit, I have Case 1 set to @[User::apptCount]>0

But, no rows are written to the csv file. Any help appreciated. /jb

2

There are 2 answers

2
billinkc On

Variables are only read/written to when the data flow starts and complete. When the DF is actively running, the Conditional Split is going to use the value of the @[User::apptCount] before the Data Flow began (likely the design-time value of 0 unless you initialized it otherwise).

Your choices are to either run your query twice (Execute SQL Task that gets the expected Row Count) and then enable/disable the DF as needed. Or have it OLEDB Source -> Row Count -> Flat File destination and then have a conditional File System Task go in and delete the file if the Row Count variable was zero.

You might also check because if the DF starts, I think it's going to initialize your output to an empty file which may or may not be the expected behavior for your systems. If presence of file means you had data, a consumer might be disappointed to see that it is empty.

Roll your own Row Count transformation

As a thought experiment, let's implement the row count transformation as a Script component. It'a trivial component.

Add your row count variable @[User::apptCount] to the ReadWriteVariables collection

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    int mRowCount;

    public override void PreExecute()
    {
        base.PreExecute();
        this.mRowCount = 0;
    }

    public override void PostExecute()
    {
        base.PostExecute();
        this.Variables.apptCount = this.mRowCount;
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        mRowCount++;
    }
}

That's it. 1 local variable, mRowCount that we initialize in PreExecute. In PostExecuted, we assign the value of our local member variable to the global/SSIS Variable we passed in. And in the *ProcessInputRow method, we just increment the member variable.

If you try and access or modify the value of apptCount anywhere but in Pre/Post Execute, it's not going to work. I think at one point (2005 days), it'd allow you to do it but no change would happen but now I think it properly throws an exception. A Data Viewer is likely doing something similar. It's batching up a 100 rows or whatever and showing them in a grid box but that's just internal/member variables. The Data Viewer can't do anything like "Send this row to the error path." It's just inspection.

The SSIS execution engine is designed to go parallel when it can. There's no way to get the internal pieces exposed to the external pieces except for coordinated/thread safe methods like PostExecute.

I suppose if you wanted, you could write an Asynchronous script component that only emitted rows to the output buffer if at least one row was received. I wouldn't do that. It's a lot of maintenance and tedious coding for no benefit but it's an option.

0
Janet Barnett On

This wasn't working so I backed out and set the variable directly in an execute sql event using returnValue in control flow and then using an expression constraint. Much easier than trying to set row count within a data flow.