Substring from-to

76 views Asked by At

I am working on a SSIS(2017) solution to read and load data from these 3 excel file names:

message_EDF_100420202.csv
message_UltaBIO_10042020.csv
message_SEIDV_10042020.csv

What I need to do is get only EDF or UltraBIO or SEIDV as a new column (derived column task)

so I need some help to set up correctly the substring function inside the derived column task.

any suggestion?

2

There are 2 answers

0
billinkc On

It appears your pattern is message_ Stuff-I-Want _junk (spaces not present in actual pattern). It's delimited by underscores and since the starting text is constant, that makes life easier.

Create a new column called MessageLessName

Remove the message_ portion with an expression

REPLACE([SourceFile], "message_", "")

Now, we want to take the left N most characters where N corresponds to the location of the underscore in our new column MessageLessName. For ease of debugging, I propose you add a second Derived Column Task to the output of the first one (where we defined MessageLessName). Here, we're going to create FirstUnderscore column

findstring([MessageLessName], "_", 1)

Finally, we'll add a third Derived Column Task and here-in is where we'll get to the final file name.

LEFT([MessageLessName], [FirstUnderscore])

Now that may be off by one due to my being lazy but because you can check each step along the way, you can verify MessageLessName is exactly what you think it should be and that FirstUnderscore is N characters in from our MessageLessName column.

2
KeithL On

script component using Split.

Row.ColumnName.ToString().Split('_')[1];
  1. You are taking the column value and casting to string. (current value is the whole string)
  2. Next is splitting based on '_' (current value is an array of three strings)
  3. Finally you are taking the second value (0 based) (current value is the string you want)

Here's a little bonus. Getting the date as well:

    string[] breakdown = Row.fileNames.Split('_');
    Row.Type = breakdown[1];

    string dateToFix = breakdown[2].Replace(".csv", "");
    Row.Date = DateTime.Parse(dateToFix.Substring(0,2) +"/" 
         + dateToFix.Substring(2,2) + "/" + dateToFix.Substring(4,4));