skip error rows from flat file in ssis

605 views Asked by At

I am trying to load data from a flat file. The file is around 2.5 GB in size and row count is close to billion. I am using a flat file source inside DFT. Few rows inside the file does not follow the column pattern, for example there is a extra delimiter or say text qualifier as value of one column. I want to skip those rows and load rest of the rows which has correct format. I am using SSIS 2014. Flat file source inside DFT is failing. I have set alwaysCheckforrowdelimiter property to false but still does not work. Since the file is too huge manually opening and changing is not possible. Kindly help.

1

There are 1 answers

0
Arnaud Gastelblum On

I have the same idea as Nick.McDermaid but I can maybe help you a bit more. You can clean your file with a regular expression. (In a script)

You just need to define a regex to match lines with the number of delimiter you want. Other lines should be deleted.

Here is a visual example executed in Notepad++

Notepad++ Example screenshot

Here is the pattern used for my example:

^[A-Z]*;[A-Z]*;[A-Z]*;[A-Z]*$

And the data sample:

AA;BB;CC;DD
AA;BB;CC;DD
AA;BB;CC;DD;EE
AA;BB;CC;DD
AA;BB;CC
AA;BB;CC;DD
AA;BB;CC;DD

You can try it on line: https://regex101.com/r/PIYIcY/1

Regards, Arnaud