ADF :: How to substitute files in Blob Storage


In Azure Cost Management > Exports, there is an export type called "Daily export of month-to-date".

This export runs daily and writes its output to a Blob Storage container.

PROBLEM: Because it's a month-to-date export, if today is the 26th of April, today's run creates a .csv file with all the Cost Management data from the 1st of April to the 26th.

But tomorrow another .csv file will be generated with all the costs from the 1st of April to the 27th.

This way the container holds overlapping files, and anything that reads all of them will count the same costs twice.

GOAL: Ideally, as soon as a new file is exported to that Storage Account, the old file is deleted.

So there is always only one .csv file, containing all the month-to-date data.

SCOPE: Everything can be in scope:

  • Azure Data Factory
  • Logic Apps
  • Automation Accounts
  • Power Automate

...whatever works.


There are 2 answers

RithwikBojja On BEST ANSWER

I reproduced this in my environment with a Logic App and got the expected result.

Design: the workflow triggers when a blob is added or modified, lists the blobs in the folder, compares their last-modified times, and is meant to delete the older one. You can reproduce the design with the code below:

Logic app code:

{
    "definition": {
        "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
        "actions": {
            "Delete_blob_(V2)": {
                "inputs": {
                    "headers": {
                        "SkipDeleteIfFileNotFoundOnServer": false
                    },
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "delete",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/files/@{encodeURIComponent(encodeURIComponent('/rithwik/',variables('ammo')))}"
                },
                "runAfter": {
                    "For_each": [
                        "Succeeded",
                        "Failed"
                    ]
                },
                "type": "ApiConnection"
            },
            "For_each": {
                "actions": {
                    "Append_to_array_variable": {
                        "inputs": {
                            "name": "emo",
                            "value": "@triggerBody()?['LastModified']"
                        },
                        "runAfter": {
                            "Compose": [
                                "Succeeded"
                            ]
                        },
                        "type": "AppendToArrayVariable"
                    },
                    "Append_to_array_variable_2": {
                        "inputs": {
                            "name": "vammo",
                            "value": "@items('For_each')?['DisplayName']"
                        },
                        "runAfter": {
                            "Append_to_array_variable": [
                                "Succeeded"
                            ]
                        },
                        "type": "AppendToArrayVariable"
                    },
                    "Compose": {
                        "inputs": "@items('For_each')?['LastModified']",
                        "runAfter": {},
                        "type": "Compose"
                    },
                    "Condition": {
                        "actions": {
                            "Append_to_string_variable": {
                                "inputs": {
                                    "name": "ammo",
                                    "value": "@variables('vammo')[0]"
                                },
                                "runAfter": {},
                                "type": "AppendToStringVariable"
                            }
                        },
                        "else": {
                            "actions": {
                                "Append_to_string_variable_2": {
                                    "inputs": {
                                        "name": "ammo",
                                        "value": "@variables('vammo')[1]"
                                    },
                                    "runAfter": {},
                                    "type": "AppendToStringVariable"
                                }
                            }
                        },
                        "expression": {
                            "and": [
                                {
                                    "less": [
                                        "@ticks(variables('emo')[0])",
                                        "@ticks(variables('emo')[1])"
                                    ]
                                }
                            ]
                        },
                        "runAfter": {
                            "Append_to_array_variable_2": [
                                "Succeeded"
                            ]
                        },
                        "type": "If"
                    }
                },
                "foreach": "@body('Lists_blobs_(V2)')?['value']",
                "runAfter": {
                    "Initialize_variable_3": [
                        "Succeeded"
                    ]
                },
                "type": "Foreach"
            },
            "Initialize_variable": {
                "inputs": {
                    "variables": [
                        {
                            "name": "emo",
                            "type": "array"
                        }
                    ]
                },
                "runAfter": {
                    "Lists_blobs_(V2)": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Initialize_variable_2": {
                "inputs": {
                    "variables": [
                        {
                            "name": "vammo",
                            "type": "array"
                        }
                    ]
                },
                "runAfter": {
                    "Initialize_variable": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Initialize_variable_3": {
                "inputs": {
                    "variables": [
                        {
                            "name": "ammo",
                            "type": "string"
                        }
                    ]
                },
                "runAfter": {
                    "Initialize_variable_2": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Lists_blobs_(V2)": {
                "inputs": {
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "get",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/foldersV2/@{encodeURIComponent(encodeURIComponent('JTJmcml0aHdpaw=='))}",
                    "queries": {
                        "nextPageMarker": "",
                        "useFlatListing": false
                    }
                },
                "metadata": {
                    "JTJmcml0aHdpaw==": "/rithwik"
                },
                "runAfter": {},
                "type": "ApiConnection"
            }
        },
        "contentVersion": "1.0.0.0",
        "outputs": {},
        "parameters": {
            "$connections": {
                "defaultValue": {},
                "type": "Object"
            }
        },
        "triggers": {
            "When_a_blob_is_added_or_modified_(properties_only)_(V2)_2": {
                "evaluatedRecurrence": {
                    "frequency": "Second",
                    "interval": 3
                },
                "inputs": {
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "get",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/triggers/batch/onupdatedfile",
                    "queries": {
                        "checkBothCreatedAndModifiedDateTime": false,
                        "folderId": "JTJmcml0aHdpaw==",
                        "maxFileCount": 1
                    }
                },
                "metadata": {
                    "JTJmcml0aHdpaw==": "/rithwik"
                },
                "recurrence": {
                    "frequency": "Second",
                    "interval": 3
                },
                "splitOn": "@triggerBody()",
                "type": "ApiConnection"
            }
        }
    },
    "parameters": {
        "$connections": {
            "value": {
                "azureblob": {
                    "connectionId": "/subscriptions/b83c1ed3-c5b6-74c23f/resourceGroups/rbojja-/providers/Microsoft.Web/connections/azureblob",
                    "connectionName": "azureblob",
                    "id": "/subscriptions/b8b-b5ba-2074c23f/providers/Microsoft.Web/locations/eastus/managedApis/azureblob"
                }
            }
        }
    }
}
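
The comparison at the heart of the workflow above (list the blobs, compare LastModified, delete the older one) can be sketched outside Logic Apps as well. This is only an illustration under assumptions: BlobInfo, blobs_to_delete, and the file names and dates are hypothetical, and no real storage calls are made. Unlike the workflow, which compares only the first two listed entries, this sketch keeps the newest of any number of blobs.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class BlobInfo(NamedTuple):
    """Hypothetical stand-in for one entry returned by a blob listing."""
    name: str
    last_modified: datetime


def blobs_to_delete(blobs: list[BlobInfo]) -> list[str]:
    """Return the names of every blob except the most recently modified one."""
    if len(blobs) < 2:
        # Nothing to clean up when the folder holds zero or one file.
        return []
    newest = max(blobs, key=lambda b: b.last_modified)
    return [b.name for b in blobs if b is not newest]


# Made-up listing: yesterday's export and today's export.
listing = [
    BlobInfo("export_0426.csv", datetime(2023, 4, 26, 6, 0, tzinfo=timezone.utc)),
    BlobInfo("export_0427.csv", datetime(2023, 4, 27, 6, 0, tzinfo=timezone.utc)),
]
print(blobs_to_delete(listing))
```

In a real cleanup job, the returned names would be passed to whatever delete operation your storage client provides.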

First, the container holds one blob (it can be of any type).

Then I uploaded a new blob, and the Logic App deleted the older one.

I also added a few extra steps to make the workflow more accurate. Code view:

{
    "definition": {
        "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
        "actions": {
            "Delete_blob_(V2)": {
                "inputs": {
                    "headers": {
                        "SkipDeleteIfFileNotFoundOnServer": false
                    },
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "delete",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/files/@{encodeURIComponent(encodeURIComponent('/rithwik/',variables('xyz')))}"
                },
                "runAfter": {
                    "Initialize_variable_4": [
                        "Succeeded",
                        "Failed"
                    ]
                },
                "type": "ApiConnection"
            },
            "For_each": {
                "actions": {
                    "Append_to_array_variable": {
                        "inputs": {
                            "name": "emo",
                            "value": "@triggerBody()?['LastModified']"
                        },
                        "runAfter": {
                            "Compose": [
                                "Succeeded"
                            ]
                        },
                        "type": "AppendToArrayVariable"
                    },
                    "Append_to_array_variable_2": {
                        "inputs": {
                            "name": "vammo",
                            "value": "@items('For_each')?['DisplayName']"
                        },
                        "runAfter": {
                            "Append_to_array_variable": [
                                "Succeeded"
                            ]
                        },
                        "type": "AppendToArrayVariable"
                    },
                    "Compose": {
                        "inputs": "@items('For_each')?['LastModified']",
                        "runAfter": {},
                        "type": "Compose"
                    },
                    "Condition": {
                        "actions": {
                            "Append_to_string_variable": {
                                "inputs": {
                                    "name": "ammo",
                                    "value": "@variables('vammo')[0]"
                                },
                                "runAfter": {},
                                "type": "AppendToStringVariable"
                            }
                        },
                        "else": {
                            "actions": {
                                "Compose_2": {
                                    "inputs": "@variables('vammo')[1]",
                                    "runAfter": {},
                                    "type": "Compose"
                                },
                                "Set_variable": {
                                    "inputs": {
                                        "name": "ammo",
                                        "value": "@{outputs('Compose_2')}"
                                    },
                                    "runAfter": {
                                        "Compose_2": [
                                            "Succeeded"
                                        ]
                                    },
                                    "type": "SetVariable"
                                }
                            }
                        },
                        "expression": {
                            "and": [
                                {
                                    "less": [
                                        "@ticks(variables('emo')[0])",
                                        "@ticks(variables('emo')[1])"
                                    ]
                                }
                            ]
                        },
                        "runAfter": {
                            "Append_to_array_variable_2": [
                                "Succeeded"
                            ]
                        },
                        "type": "If"
                    }
                },
                "foreach": "@body('Lists_blobs_(V2)')?['value']",
                "runAfter": {
                    "Initialize_variable_3": [
                        "Succeeded"
                    ]
                },
                "type": "Foreach"
            },
            "Initialize_variable": {
                "inputs": {
                    "variables": [
                        {
                            "name": "emo",
                            "type": "array"
                        }
                    ]
                },
                "runAfter": {
                    "Lists_blobs_(V2)": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Initialize_variable_2": {
                "inputs": {
                    "variables": [
                        {
                            "name": "vammo",
                            "type": "array"
                        }
                    ]
                },
                "runAfter": {
                    "Initialize_variable": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Initialize_variable_3": {
                "inputs": {
                    "variables": [
                        {
                            "name": "ammo",
                            "type": "string"
                        }
                    ]
                },
                "runAfter": {
                    "Initialize_variable_2": [
                        "Succeeded"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Initialize_variable_4": {
                "inputs": {
                    "variables": [
                        {
                            "name": "xyz",
                            "type": "string",
                            "value": "@variables('ammo')"
                        }
                    ]
                },
                "runAfter": {
                    "For_each": [
                        "Succeeded",
                        "Failed"
                    ]
                },
                "type": "InitializeVariable"
            },
            "Lists_blobs_(V2)": {
                "inputs": {
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "get",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/foldersV2/@{encodeURIComponent(encodeURIComponent('JTJmcml0aHdpaw=='))}",
                    "queries": {
                        "nextPageMarker": "",
                        "useFlatListing": false
                    }
                },
                "metadata": {
                    "JTJmcml0aHdpaw==": "/rithwik"
                },
                "runAfter": {},
                "type": "ApiConnection"
            }
        },
        "contentVersion": "1.0.0.0",
        "outputs": {},
        "parameters": {
            "$connections": {
                "defaultValue": {},
                "type": "Object"
            }
        },
        "triggers": {
            "When_a_blob_is_added_or_modified_(properties_only)_(V2)_2": {
                "evaluatedRecurrence": {
                    "frequency": "Second",
                    "interval": 3
                },
                "inputs": {
                    "host": {
                        "connection": {
                            "name": "@parameters('$connections')['azureblob']['connectionId']"
                        }
                    },
                    "method": "get",
                    "path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('AccountNameFromSettings'))}/triggers/batch/onupdatedfile",
                    "queries": {
                        "checkBothCreatedAndModifiedDateTime": false,
                        "folderId": "JTJmcml0aHdpaw==",
                        "maxFileCount": 1
                    }
                },
                "metadata": {
                    "JTJmcml0aHdpaw==": "/rithwik"
                },
                "recurrence": {
                    "frequency": "Second",
                    "interval": 3
                },
                "splitOn": "@triggerBody()",
                "type": "ApiConnection"
            }
        }
    },
    "parameters": {
        "$connections": {
            "value": {
                "azureblob": {
                    "connectionId": "/subscriptions/b8074c23f/resourceGroups/bojja/providers/Microsoft.Web/connections/azureblob",
                    "connectionName": "azureblob",
                    "id": "/subscriptions/b83c1ed3-c574c23f/providers/Microsoft.Web/locations/eastus/managedApis/azureblob"
                }
            }
        }
    }
}
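
If you adapt the definitions above to your own container, the folderId value has to change too. Judging only by the metadata entries in the code ("JTJmcml0aHdpaw==" maps to "/rithwik"), the id appears to be the percent-encoded path, Base64-encoded. That is an inference from this sample, not something confirmed by the connector documentation, and the helper names below are hypothetical. A small sketch to decode an id or build one for another path:

```python
import base64
import re
from urllib.parse import quote, unquote


def decode_folder_id(folder_id: str) -> str:
    """Base64-decode the id, then undo the percent-encoding."""
    return unquote(base64.b64decode(folder_id).decode("ascii"))


def encode_folder_id(path: str) -> str:
    """Percent-encode the path (lower-case hex, as in the sample), then Base64 it."""
    escaped = quote(path, safe="")
    # quote() emits upper-case hex digits; the sample id uses lower-case ones.
    escaped = re.sub(r"%[0-9A-F]{2}", lambda m: m.group(0).lower(), escaped)
    return base64.b64encode(escaped.encode("ascii")).decode("ascii")


print(decode_folder_id("JTJmcml0aHdpaw=="))
print(encode_folder_id("/rithwik"))
```

In practice, picking the folder in the Logic App designer fills in this value for you, so the sketch is mainly useful for checking what an existing id points to.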


Itachi07 On

With Blob Storage, writing a file with the same name to the same location automatically overwrites the existing one. I'm not too familiar with Cost Management, but you can use an ADF pipeline with your Blob Storage container as the sink and make sure the file name stays the same, so each run overwrites the previous file; e.g. fileApril.csv will overwrite fileApril.csv. You can also use parameters in ADF to change the naming convention and retain versions; e.g. a dynamic date parameter in the file name should let you keep one file per month, based on when the pipeline was run.
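
The naming idea can be illustrated with a short sketch: if the file name depends only on the year and month of the run, every daily run within a month maps to the same name and overwrites the previous file, while a new month starts a new file. The function name, the "costs" prefix, and the dates are assumptions made up for the example.

```python
from datetime import date


def monthly_blob_name(run_date: date, prefix: str = "costs") -> str:
    """Build a file name that changes once per month, not once per day."""
    return f"{prefix}_{run_date:%Y-%m}.csv"


# Two runs in the same month share one name, so the second overwrites the first.
print(monthly_blob_name(date(2023, 4, 26)))
print(monthly_blob_name(date(2023, 4, 27)))
# A run in the next month gets a new name, so April's file is retained.
print(monthly_blob_name(date(2023, 5, 1)))
```

In ADF itself, the same effect would come from a dynamic expression on the sink file name, for example one built from formatDateTime(utcnow(), 'yyyy-MM').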