I've created an Azure Data Factory to schedule a U-SQL script using the "DataLakeAnalyticsU-SQL" activity. See the code below:
InputDataset:
{
    "name": "InputDataLakeTable",
    "properties": {
        "published": false,
        "type": "AzureDataLakeStore",
        "linkedServiceName": "LinkedServiceSource",
        "typeProperties": {
            "fileName": "SearchLog.txt",
            "folderPath": "demo/",
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": "|",
                "quoteChar": "\""
            }
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}
OutputDataset:
{
    "name": "OutputDataLakeTable",
    "properties": {
        "published": false,
        "type": "AzureDataLakeStore",
        "linkedServiceName": "LinkedServiceDestination",
        "typeProperties": {
            "folderPath": "scripts/"
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}
Pipeline:
{
    "name": "ComputeEventsByRegionPipeline",
    "properties": {
        "description": "This is a pipeline to compute events for en-gb locale and date less than 2012/02/19.",
        "activities": [
            {
                "type": "DataLakeAnalyticsU-SQL",
                "typeProperties": {
                    "scriptPath": "scripts\\SearchLogProcessing.txt",
                    "degreeOfParallelism": 3,
                    "priority": 100,
                    "parameters": {
                        "in": "/demo/SearchLog.txt",
                        "out": "/scripts/Result.txt"
                    }
                },
                "inputs": [
                    {
                        "name": "InputDataLakeTable"
                    }
                ],
                "outputs": [
                    {
                        "name": "OutputDataLakeTable"
                    }
                ],
                "policy": {
                    "timeout": "06:00:00",
                    "concurrency": 1,
                    "executionPriorityOrder": "NewestFirst",
                    "retry": 1
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "CopybyU-SQL",
                "linkedServiceName": "AzureDataLakeAnalyticsLinkedService"
            }
        ],
        "start": "2016-12-21T17:44:13.557Z",
        "end": "2016-12-22T17:44:13.557Z",
        "isPaused": false,
        "hubName": "denojaidbfactory_hub",
        "pipelineMode": "Scheduled"
    }
}
I've created all the required linked services successfully. But after deploying the pipeline, no time slice is created for the input dataset. See the image below:
Meanwhile, the output dataset is waiting for an upstream input dataset time slice. As a result, the output dataset's time slices remain in the Pending Execution state and my Azure Data Factory pipeline does not run. See the image below:
Any suggestions to resolve this issue?
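If you don't have another activity that is creating your InputDataLakeTable, you need to add the attribute "external": true to that dataset's properties. This tells Data Factory that the data is produced outside the factory, so the input slices do not wait for an upstream activity. See: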
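As a rough sketch, the input dataset from the question could be marked as external like this. The "external" flag is the relevant change. The "policy" block with "externalData" is optional, and the retry values in it are illustrative assumptions, not settings taken from the question:

```json
{
    "name": "InputDataLakeTable",
    "properties": {
        "published": false,
        "type": "AzureDataLakeStore",
        "linkedServiceName": "LinkedServiceSource",
        "typeProperties": {
            "fileName": "SearchLog.txt",
            "folderPath": "demo/",
            "format": {
                "type": "TextFormat",
                "rowDelimiter": "\n",
                "columnDelimiter": "|",
                "quoteChar": "\""
            }
        },
        "external": true,
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "policy": {
            "externalData": {
                "retryInterval": "00:01:00",
                "retryTimeout": "00:10:00",
                "maximumRetry": 3
            }
        }
    }
}
```

After redeploying the dataset, the output slices should be able to leave the pending state, assuming the file named in the dataset actually exists in the store.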
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-faq
https://learn.microsoft.com/en-us/azure/data-factory/data-factory-create-datasets