Clear Databricks Artifact Location


I am using the dbx CLI to deploy my workflow to Databricks. My .dbx/project.json is configured as below:

{
    "environments": {
        "default": {
            "profile": "test",
            "storage_type": "mlflow",
            "properties": {
                "workspace_directory": "/Shared/dbx/projects/test",
                "artifact_location": "dbfs:/dbx/test"
            }
        }
    },
    "inplace_jinja_support": false,
    "failsafe_cluster_reuse_with_assets": false,
    "context_based_upload_for_execute": false
}

Every time I run dbx deploy ..., it stores my task scripts in DBFS under a new hash-named folder. If I run dbx deploy ... 100 times, it creates 100 hash folders to store my artifacts.

Questions

  1. How do I clean up these folders?
  2. Is there a retention or rolling policy that keeps only the last X folders?
  3. Is there a way to reuse the same folder on every deploy?

A lot of folders are generated every time we run dbx deploy. We only need the latest one; the older folders are no longer needed.



2 Answers

jlim:

I finally found a way to remove the old DBFS files: I simply run dbfs rm -r dbfs:/dbx/test before running the deploy. This method is not ideal, because if a cluster is running (or pending start) against a previous deployment, its job will fail once the old hash folder is removed. Instead of depending on DBFS, I configured my workflow to use Git, so I can remove the DBFS data without worrying that any job is still using it. Strangely, dbx still generates a hash folder even though no artifacts are uploaded to DBFS when Git is used as the workspace.
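
For the retention question (keeping only the last X folders), a rolling cleanup can be scripted against the DBFS REST API instead of wiping the whole location. Below is a minimal, untested sketch: KEEP and ARTIFACT_ROOT are placeholders, DATABRICKS_HOST and DATABRICKS_TOKEN must be set, jq must be installed, and it assumes the DBFS list endpoint returns a modification_time field for each entry.

# Rolling cleanup sketch (untested): keep only the $KEEP newest
# hash folders under the artifact location and delete the rest.
# Note: artifact_location "dbfs:/dbx/test" maps to the REST path "/dbx/test".
KEEP=5
ARTIFACT_ROOT="/dbx/test"

curl -s -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "$DATABRICKS_HOST/api/2.0/dbfs/list?path=$ARTIFACT_ROOT" \
| jq -r --argjson keep "$KEEP" \
    '.files | sort_by(-.modification_time) | .[$keep:][].path' \
| while read -r old_folder; do
    # Recursively delete each folder older than the newest $KEEP.
    curl -s -X POST -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      "$DATABRICKS_HOST/api/2.0/dbfs/delete" \
      -d "{\"path\": \"$old_folder\", \"recursive\": true}"
  done

Since the hash folder names themselves are not time-ordered, sorting by modification time is the only reliable way to tell old deployments from new ones.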

renardeinside:

Author of dbx here.

There is a built-in command that cleans up the workspace and the artifact location:

dbx destroy ...

Please carefully read the documentation before running this command.
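
A minimal usage sketch (the --dry-run flag is taken from the dbx documentation; run dbx destroy --help to confirm which flags your dbx version supports):

# Preview what would be removed, then actually remove the
# workspace directory and artifact location for this project.
dbx destroy --dry-run
dbx destroy

Note that this removes the deployment's assets entirely; it is a cleanup for the whole project rather than a per-deploy retention policy.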