SQL Server ETL via BIML

634 views Asked by At

I'm quite new to BIML and I'm pretty stuck with a problem, for which I couldn't find a proper solution here or in any other forum.

The task at hand is to create SSIS ETL packages for something like 60 tables in order to load data from server A to server B. Since I didn't fancy the idea to create the packages manually, I decided to give BIML a try. However, I GET the desired packages (one per table), BUT each and every package throws a "The metadata of the following output column does not match the metadata of the external columns with which the output columns are associated" in the source of the data flow and with this an error/warning in the destination as well. The column which throws the error (or at least one column...) is of DataType [image] (I don't have influence on this data type - this is as it is provided by the source system). However, the OutputColumn which BIML generates receives the datatype DT_NTEXT, which of course does not match the type of the external column (image).

Following what I got so far:

<#@ template language="C#" hostspecific="true"#>
<#@ import namespace="System.Data" #>

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <Connections>
        <OleDbConnection Name="Source" ConnectionString="#####"/>
        <OleDbConnection Name="Target" ConnectionString="#####"/>
    </Connections>
    <Packages>
        <#
            string metadataConnectionString = "#####";
            DataTable tables = ExternalDataAccess.GetDataTable(metadataConnectionString,
            "SELECT s.name + '_' + t.name AS Bez, '[' + s.name + '].[' + t.name + ']' AS Col FROM sys.tables t INNER JOIN sys.schemas s on t.schema_id = s.schema_id");
            foreach (DataRow row in tables.Rows)
            {
            #>
              <Package Name="Extract_<#=row[0]#>" ConstraintMode="Parallel" AutoCreateConfigurationsType="None">
                    <Tasks>
                        <ExecuteSQL Name="Truncate <#=row[0]#>" ConnectionName="Target">
                            <DirectInput>
                                Truncate Table <#=row[1]#>
                            </DirectInput>
                        </ExecuteSQL>

                        <Dataflow Name="Copy Data">
                            <PrecedenceConstraints>
                                <Inputs>
                                    <Input OutputPathName="Truncate <#=row[0]#>.Output" />
                                </Inputs>
                            </PrecedenceConstraints>
                            <Transformations>
                                  <OleDbSource Name="Retrieve Data" ConnectionName="Source">
                                      <ExternalTableInput Table="<#=row[1]#>" />
                                  </OleDbSource>
                                  <OleDbDestination Name="Insert Data" ConnectionName="Target">
                                      <ExternalTableOutput Table="<#=row[1]#>"/>
                                  </OleDbDestination>
                              </Transformations>
                          </Dataflow>
                    </Tasks>
              </Package>
            <#
            }
        #>
    </Packages>
</Biml>

Since I read that it might be connected to the Provider used in the Connections: I am using Provider=SQLNCLI11.1 with Integrated Security=SSPI

Any help would be highly appreciated. Thanks!

1

There are 1 answers

0
SqlKindaGuy On BEST ANSWER

Its because BIML and SSIS uses its own datatype. See this link for correct Datatypes:

https://www.cathrinewilhelmsen.net/2014/05/27/sql-server-ssis-and-biml-data-types/

You could put this in your code in a snippet and call it. Properbly you also need to change your source from ExternalTableInput to direct input, so that you write your SQL Select automatically. This is old code - Its much easier to put into a DLL Libaray.

What i have done is like this:

<Packages>  
<# foreach (DataRow dr in dt.Rows) {
PackageName = "EXTRACT_SQL_"+dr["S_EntityName"].ToString().Replace(".","_").ToUpper(); #>
<Package Name="<#=PackageName #>" ConstraintMode="Linear">
    <#=CallBimlScript("..\\..\\..\\Log\\StaticVariables.biml", "Log") #>
    <Tasks>
        <Container Name="SEQ Load arc <#=dr["D_EntityName"] #>" ConstraintMode="Linear">
            <Tasks>
                <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogStart.biml", PackageName) #>
                <ExecuteSQL Name="SQL TRUNCATE TABLE <#=dr["D_EntityName"] #>"  ConnectionName="<#=dr["D_DatabaseName"] #>" ResultSet="None"> 
                    <DirectInput>   
                        TRUNCATE TABLE [<#=dr["D_SchemaTypeName"]#>].[<#=dr["D_EntityName"] #>]
                    </DirectInput>
                </ExecuteSQL>
                <Dataflow Name="DFT_SourceToExtract_<#=dr["D_EntityName"] #>">
                    <Transformations>
                        <OleDbSource Name="OLDB SRC_<#=dr["S_DatabaseName"]#>_<#=dr["S_EntityName"] #>" ConnectionName="<#=dr["S_DatabaseName"]#>">
                            <DirectInput><# DataTable ColumnMetaData = ExternalDataAccess.GetDataTable(metadataConnectionString, @"
                                SELECT DISTINCT [S_ColumnName] 
                                FROM [model].[vRelationDatabaseSchemaEntityColumn]
                                WHERE [S_SystemTypeName] = 'SQL Server' AND S_EntityID = "+dr["S_EntityID"]+"");
                                string columns = String.Join(String.Format(", {0}", System.Environment.NewLine), ColumnMetaData.Rows.Cast<DataRow>().Select(r => "["+r["S_ColumnName"]+"]").ToArray()); #>
                                SELECT 
                                <#=columns #> 
                                FROM [<#=dr["S_SchemaTypeName"] #>].[<#=dr["S_EntityName"] #>] WITH (NOLOCK)
                            </DirectInput>
                        </OleDbSource>
                        <DerivedColumns Name="DRV Konverter datatyper">
                            <InputPath OutputPathName="OLDB SRC_<#=dr["S_DatabaseName"]#>_<#=dr["S_EntityName"] #>.Output" />
                                <#=CallBimlScript("..\\DerivedColumnDatatypeConversion.biml",dr["S_EntityID"]) #>
                        </DerivedColumns>
                        <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogDataFlowDerivedColumns.biml") #>
                        <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogDataFlowRowCount.biml") #>
                        <OleDbDestination Name="OLDB DEST_<#=dr["D_DatabaseName"] #>_<#=dr["D_EntityName"] #>" ConnectionName="<#=dr["D_DatabaseName"] #>">
                            <ExternalTableOutput Table="[<#=dr["D_SchemaTypeName"] #>].[<#=dr["D_EntityName"] #>]" />
                                <#=CallBimlScript("..\\OleDbDestinationColumnMapping.biml", dr["S_EntityID"]) #>
                        </OleDbDestination>
                    </Transformations>
                </Dataflow>
                <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogEnd.biml") #>
            </Tasks>
        </Container>
    </Tasks>
</Package>  
<#} #>
<Package Name="<#=MasterPackageName#>" ConstraintMode="Linear" ProtectionLevel="DontSaveSensitive">
    <#=CallBimlScript("..\\..\\..\\Log\\StaticVariables.biml", "Log") #>
    <Tasks>
        <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogStart.biml", MasterPackageName) #>
        <Container Name="SEQ Run packages in parallel" ConstraintMode="Parallel">
            <Tasks>      
            <# foreach (string package in packagescreated) { #>
                 <ExecutePackage Name="EPT_<#=package#>">
                    <ExternalProjectPackage Package="<#=package#>.dtsx" ></ExternalProjectPackage>
                </ExecutePackage>
            <# } #>
            </Tasks>
        </Container>
        <#=CallBimlScript("..\\..\\..\\Log\\ETL_ExecutionLogEnd.biml") #>
    </Tasks>
</Package>

As you can see i call DerivedColumnDatatypeConversion which looks like this:

<#@ import namespace="System.Data" #>
<#@ include file="..\\..\\Connections\\LocalConnection.biml" #>
<#@ include file="..\..\Utility\i-GetSqlToBimlDatatype.biml" #>
<#@ property name="pEntityID" type="Int32" #> 
<#
int EntityID = pEntityID;
bool isLenghtInt = false; 
int intLenght;

var metadataSelect = @"
SELECT [S_ColumnName]
  ,[S_Datatype]
  ,[D_ColumnName]
  ,[D_Datatype]
  ,[D_Length]
  ,[D_Precision]
  ,[D_Scale]
  ,[Conversion]
  ,[S_SystemTypeName]
  ,[FlatFileType] as [S_FlatFileType]
 FROM [model].[vRelationDatabaseSchemaEntityColumn] v
left join [model].[SourceFile] s on s.[FK_Entity]=v.[S_EntityID]
 WHERE [S_EntityID] = " + EntityID + "";

DataTable dt = ExternalDataAccess.GetDataTable(metadataConnectionString, 
metadataSelect);
#>
<Columns>
<Column Name="DummyIfEmpty" DataType="Int32">1</Column>
<# foreach (DataRow dr in dt.Rows) {
isLenghtInt = Int32.TryParse(dr["D_Length"].ToString(),out intLenght);

    if (dr["Conversion"].ToString() == "Convert_Date")
    {
        if(dr["S_Datatype"].ToString().ToLower() == "datetime" || dr["S_Datatype"].ToString().ToLower() == "datetime2" || dr["S_Datatype"].ToString().ToLower() == "date") 
        {
            if(dr["S_SystemTypeName"].ToString().ToLower() == "db2" || dr["S_FlatFileType"].ToString().ToLower() == "Fixed Width"){
                //Fjern kommenteringen på denne for at genaktivere datatypekonvertering på DB2
                //WriteLine('<Column Name=\"' + dr["S_ColumnName"] + '_Converted' + '\" ' + GetSqlToBimlDatatype(dr["D_DataType"].ToString())  +'>(DT_DBDATE)(SUBSTRING(['+dr["S_ColumnName"]+'],7,4)+"-"+SUBSTRING(['+dr["S_ColumnName"]+'],4,2)+"-"+SUBSTRING(['+dr["S_ColumnName"]+'],1,2))</Column>');
            } else {
                WriteLine("<Column Name=\"" + dr["S_ColumnName"] + "_Converted" + "\" " + GetSqlToBimlDatatype(dr["D_DataType"].ToString())  +">(DT_DBDATE)["+dr["S_ColumnName"]+"]</Column>");
            }
        }
        else
        {
            WriteLine("<Column Name=\"" + dr["S_ColumnName"] + "_Converted" + "\" " + GetSqlToBimlDatatype(dr["D_DataType"].ToString())  +">DATEADD(\"d\",(DT_I8)[" + dr["S_ColumnName"] + "],(DT_DBDATE)\"1960-01-01\")</Column>");
        }
    }


if(dr["S_Datatype"].ToString().ToLower() == "varchar" || dr["S_Datatype"].ToString().ToLower() == "text" ) {

    if(dr["D_Length"].ToString().ToLower() == "max" || (intLenght > 4000 && isLenghtInt == true) )  { //Hvis varchar er max eller over 4000 så konverter til nvarchar(4000) 

    WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" "+GetSqlToBimlDatatype("nvarcharmax")  +">(DT_WSTR,4000)["+dr["S_ColumnName"]+"]</Column>");

    }
    else
    {
    WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" "+GetSqlToBimlDatatype(dr["D_DataType"].ToString(),dr["D_Length"].ToString())  +">(DT_WSTR,"+dr["D_Length"]+")["+dr["S_ColumnName"]+"]</Column>");
}
}


    if(dr["S_Datatype"].ToString().ToLower() == "char")
    {
        if(dr["D_Length"].ToString().ToLower() == "max")
        {
            WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" "+GetSqlToBimlDatatype("charmax") +">(DT_WSTR,4000)["+dr["S_ColumnName"]+"]</Column>");
}
else
{
            WriteLine("<Column Name=\"" + dr["S_ColumnName"] + "_Converted" + "\" "+GetSqlToBimlDatatype(dr["D_DataType"].ToString(),dr["D_Length"].ToString()) +">(DT_WSTR,"+dr["D_Length"]+")["+dr["S_ColumnName"]+"]</Column>");

}
        }
  if (dr["Conversion"].ToString() == "Convert_Datetime")
 {
  if(dr["S_Datatype"].ToString().ToLower() =="datetime")
    { 
 WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" "+ 
 GetSqlToBimlDatatype(dr["D_DataType"].ToString()) + ">(DT_DBTIMESTAMP)[" + 
 dr["S_ColumnName"] + "]</Column>");

}
else
{
    WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" "+ GetSqlToBimlDatatype(dr["D_DataType"].ToString()) + ">([" + dr["S_ColumnName"] + "] > 2147483647) ? DATEADD(\"Hh\",(DT_I8)[" + dr["S_ColumnName"] + "]/3600,(DT_DBTIMESTAMP)\"1960-01-01\") : DATEADD(\"s\",(DT_I8)[" + dr["S_ColumnName"] + "],(DT_DBTIMESTAMP)\"1960-01-01\")</Column>");
}
}

    if (dr["Conversion"].ToString() == "Convert_Int")
    WriteLine("<Column Name=\"" + dr["S_ColumnName"] + "_Converted" + "\" "+ 
GetSqlToBimlDatatype(dr["D_DataType"].ToString()) + ">(DT_I4)ROUND([" + 
dr["S_ColumnName"] + "],0)</Column>");
if (dr["Conversion"].ToString() == "Convert_Bigint")
    WriteLine("<Column Name=\"" + dr["S_ColumnName"] + "_Converted" + "\" " 
+ GetSqlToBimlDatatype(dr["D_DataType"].ToString())+">(DT_I8)ROUND([" + 
dr["S_ColumnName"] + "],0)</Column>");
if (dr["Conversion"].ToString() == "Convert_Decimal")
    WriteLine("<Column Name=\"" + dr["S_ColumnName"]  + "_Converted" + "\" " + GetSqlToBimlDatatype(dr["D_DataType"].ToString(), null, dr["D_Precision"].ToString(), dr["D_Scale"].ToString()) + ">(DT_NUMERIC," + dr["D_Precision"] + "," + dr["D_Scale"] + ")[" + dr["S_ColumnName"] + "]</Column>");

}#> 

 <#+ 
private string GetSqlToBimlDatatype(string datatype, string length = null, string precision = null, string scale = null)
    {
        string bimlDatatype = "DataType=\"AnsiString\" Length=\"50\" CodePage=\"1252\"";
        switch (datatype)
        {
            case "uniqueidentifier":
                bimlDatatype = "DataType=\"Guid\" Length=\"12\"";
                break;
                // Dates
            case "image":
                bimlDatatype = "DataType=\"Binary\"";
                break;
            case "date":
                bimlDatatype = "DataType=\"Date\" ";
                break;
            case "time":
                bimlDatatype = "DataType=\"Time\"";
                break;
            case "smalldatetime":
                bimlDatatype = "DataType=\"DateTime\"";
                break;
            case "datetime":
                bimlDatatype = "DataType=\"DateTime\"";
                break;
            case "datetime2":
                bimlDatatype = "DataType=\"DateTime2\"";
                break;
            case "datetimeoffset":
                bimlDatatype = "DataType=\"DateTimeOffset\"";
                break;
                // Intigers
            case "tinyint":
                bimlDatatype = "DataType=\"SByte\"";
                break;
            case "smallint":
                bimlDatatype = "DataType=\"Int16\"";
                break;
            case "int":
                bimlDatatype = "DataType=\"Int32\"";
                break;
            case "bigint":
                bimlDatatype = "DataType=\"Int64\"";
                break;
                // Numerics
            case "real":
                bimlDatatype = "DataType=\"Single\"";
                break;
            case "money":
                bimlDatatype = "DataType=\"Currency\"";
                break;
            case "float":
                bimlDatatype = "DataType=\"Double\"";
                break;
            case "numeric":
                bimlDatatype = "DataType=\"Decimal\" Precision=\"" + precision + "\" Scale=\"" + scale + "\"";
                break;
            case "smallmoney":
                bimlDatatype = "DataType=\"Currency\"";
                break;
            case "decimal":
                bimlDatatype = "DataType=\"Decimal\" Precision=\"" + precision + "\" Scale=\"" + scale + "\"";
                break;
            case "timestamp":
                bimlDatatype = "DataType=\"Binary\"";
                break;
            case "bit":
                bimlDatatype = "DataType=\"Boolean\"";
                break;
            case "hierarchyid":
                bimlDatatype = "DataType=\"String\" Length=\"36\"";
                break;
                // Strings
            case "varchar":
                bimlDatatype = "DataType=\"AnsiString\" Length=\"" + length + "\"";
                break;
            case "char":
                bimlDatatype = "DataType=\"AnsiStringFixedLength\" Length=\"" + length + "\"";
                break;
                case "charmax":
                bimlDatatype = "DataType=\"String\"";
                break;
            case "nvarchar":
                bimlDatatype = "DataType=\"StringFixedLength\" Length=\"" + length + "\"";
                break;
            case "nvarcharmax":
                bimlDatatype = "DataType=\"String\"";
                break;
            case "nchar":
                bimlDatatype = "DataType=\"StringFixedLength\" Length=\"" + length + "\"";
                break;
            case "sysname":
                bimlDatatype = "DataType=\"String\" Length=\"128\"";
                break;
                // Review as these types should not be part of DWH
            case "sql_variant":
                bimlDatatype = "DataType=\"Object\"";
                break;
            case "text":
                bimlDatatype = "DataType=\"AnsiString\" Length=\"" + length + "\"";
                break;
            case "ntext":
                bimlDatatype = "DataType=\"String\" Length=\"" + length + "\"";
                break;
                // Review as these types should not be part of DWH
            case "varbinary":
                bimlDatatype = "DataType=\"Binary\" Length=\"" + length + "\"";
                break;
            case "binary":
                bimlDatatype = "DataType=\"Binary\" Length=\"" + length + "\"";
                break;
            case "xml":
                bimlDatatype = "DataType=\"Xml\"";
                break;
           // default:
             //   break;
        }
        return bimlDatatype;
    }
#>