Azure Data Factory using Blob and package located in a Managed Instance

62 views Asked by At

I have an Azure Managed Instance and also deployed an Azure Data Factory V2 for one of our clients.

My doubt right now is about one request from the client regarding execute a package SSIS but one of the steps is about get data from an excel file.

This excel file is located in an on-prem server so I suggested configure Windows Authentication in order to be able to access and get and retrieve that excel file from on-prem server.

But I think would be better and easier change that location and move those excel files in a blob container. For that reason I already linked a new service of Azure Blob in ADF and also created a new 'integration runtimes' but here is my problem:

1º: In this moment I have two integration runtimes, one was created in order to connect and create SSISDB in our managed instance. And the new one was to linked service with AZ Blob. Is this really neccesary? Having two IRT will produce connection problems between AZ Managed Instance and the Blob Container? How will the MI connect and get the excel file from the blob container?

2º: My client have his own package SSIS, they only need change the path of the excel file?

Example:

First version: C/ssis/excelfile.xlsx

Second version: https://StoAcc.blob.core.windows.net/ssis/excelfile.xlsx (example)

The rest of the steps works fine, the error comes only when try to access to excel file. I don't know if I need to do more steps and configuration on Data Factory or Managed Instance. It's my first time mounting an Azure Data Factory and Managed Instance since zero.

Thanks and best regards

Edit:

enter image description here

1

There are 1 answers

0
Alberto Morillo On

About the first question, you need create linked services for the source and the sink (destination), and then use a Visual Studio project to create a pipeline that copies the data from Azure Storage to Azure SQL Managed Instance (MI), and here you will find steps to do it.

Do you need an integration runtime for Azure SQL MI? The answer is yes. The default Integration Runtime (IR) that Azure Data Factory provides does not have support for VNets and thus it can’t be used to connect to SQL MI VNet, which means that it can’t be used to move data in and out of an Azure SQL MI. However, if you install an Self-Hosted Integration Runtime (SHIR) on an Azure VM that has been created on the same subnet of the Azure SQL MI or on Azure VM created on a different VNet that has been peered to the Azure SQL MI Vnet, then ADF can access the Azure SQL Managed Instance. The process is now just the same you follow to create a pipeline or use a Copy Data Tool to load the Excel data into an Azure SQL Database with the exception that when creating the Linked Service that will allow pipelines to connect to Azure SQL or when using the Copy Data Tool you have to specify the newly created runtime.

enter image description here

You also need a linked service to connect to Azure Storage Accounts. The default Azure Data Factory IR can connect to Azure Storage accounts, so the important integration runtime is the SHIR for Azure SQL MI.

The Azure SSIS-Integration Runtime is needed of you would like to continue using the traditional SSIS packages with Azure SQL MI instead of using more the ADF pipelines or Copy Data Tool.