Is there any difference between the Job data returned from Databricks Jobs API 2.1 vs 2.0?


The main difference I see is that 2.1 requires paging through results with a maximum page size of 25, while 2.0 does not support paging and returns all of the results in one call. Is there any difference in the structure or content of the "Job" objects returned from the Jobs Get and List APIs, e.g. /api/2.0/jobs/list? It is difficult for me to tell because I can't find the 2.0 specification, and the documentation only gives examples.


There are 2 answers

Saideep Arikontham On
  • The responses returned by the GET methods of Jobs API 2.1 and 2.0 differ for several of the endpoints. The differences I saw for each endpoint are described below.

jobs/list:

  • Comparing the sample responses from Jobs 2.0 and Jobs 2.1 (screenshots in the original answer), the cluster details and notebook paths are missing from the Jobs 2.1 response.

jobs/get:

  • Comparing the sample responses from Jobs 2.0 and Jobs 2.1 (screenshots in the original answer), information about tasks is present in the Jobs 2.1 response but not in the 2.0 response.

jobs/runs/get:

  • The responses from this endpoint show few notable differences between the two versions.

jobs/runs/get-output:

  • With the Jobs 2.0 API, the endpoint returned a response. With Jobs 2.1, the same call returned an error instead (screenshot in the original answer).
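For a multi-task job in 2.1, the output is fetched per task rather than for the parent run. Here is a minimal sketch of that pattern, assuming the run returned by jobs/runs/get carries a `tasks` array whose entries each have their own `run_id`. The `get` callable is a hypothetical stand-in for whatever authenticated HTTP client you use; it is not part of any Databricks SDK.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that performs an authenticated GET
# against the workspace and returns the decoded JSON body.
GetFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def collect_task_outputs(get: GetFn, run_id: int) -> Dict[str, Dict[str, Any]]:
    """Return {task_key: get-output response} for every task of a job run."""
    run = get("/api/2.1/jobs/runs/get", {"run_id": run_id})
    outputs: Dict[str, Dict[str, Any]] = {}
    for task in run.get("tasks", []):
        # Each task has its own run_id, distinct from the parent job run.
        outputs[task["task_key"]] = get(
            "/api/2.1/jobs/runs/get-output", {"run_id": task["run_id"]}
        )
    return outputs
```

Passing the parent run's `run_id` straight to get-output is what tends to fail for multi-task jobs, so the loop only ever uses the task-level ids.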

So, for the GET methods, the most noticeable difference is in the jobs/list endpoint: the cluster details and the notebook details are absent from the Jobs 2.1 response. Choose the API version according to which of these fields you need.
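If you want to stay on 2.1 and still see the per-task cluster and notebook settings in the list output, a paged listing along these lines may help. This is a sketch, not a reference implementation: the `limit`, `offset` and `expand_tasks` query parameters and the `has_more` response field reflect my understanding of the 2.1 list endpoint and are not taken from the answer above, and `get` is a hypothetical authenticated GET helper. Newer revisions of the API may prefer page tokens over `offset`.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: performs an authenticated GET and returns JSON.
GetFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_jobs(get: GetFn, page_size: int = 25,
              expand_tasks: bool = True) -> Iterator[Dict[str, Any]]:
    """Yield every job from /api/2.1/jobs/list, one page at a time."""
    offset = 0
    while True:
        page = get("/api/2.1/jobs/list", {
            "limit": page_size,
            "offset": offset,
            # Assumption: asks the API to include task and cluster details.
            "expand_tasks": "true" if expand_tasks else "false",
        })
        jobs = page.get("jobs", [])
        yield from jobs
        # Stop when the server reports no further pages or a page is empty.
        if not page.get("has_more") or not jobs:
            break
        offset += len(jobs)
```

Usage would be something like `all_jobs = list(iter_jobs(my_get))`, where `my_get` wraps your HTTP client and token.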

Alex Ott On

There are a few significant changes between the 2.0 and 2.1 APIs:

  • The biggest change is support for jobs with multiple tasks:
    • instead of a single task in the top-level object, you now specify tasks in the tasks array
    • you can specify dependencies between tasks
    • you can reuse the same job cluster across multiple tasks, which gives you faster task startup
    • 2.1 supports more task types: dbt, sql, ...
  • Similarly, calls such as get run output now support multiple tasks. As a consequence, you can't get output for the top-level job run; you need to get it for the individual tasks instead
  • The list operation now supports paginated output, which overcomes the previous limit of 3000 jobs per workspace. It also supports listing jobs by name
  • There is a new API call for repairing (re-running) failed tasks without re-running the whole job.
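To make the multi-task shape concrete, here is a rough sketch of a 2.1-style job settings payload with two notebook tasks, a dependency between them, and one shared job cluster. The field names (`tasks`, `task_key`, `depends_on`, `job_clusters`, `job_cluster_key`) are the ones I know from the 2.1 API; the helper name, task keys, notebook paths and cluster values are made up for illustration.

```python
import json
from typing import Any, Dict


def build_two_task_job(name: str, first_notebook: str, second_notebook: str,
                       cluster_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Build 2.1-style settings: two tasks sharing a single job cluster."""
    return {
        "name": name,
        # Declared once at job level, referenced by key from each task.
        "job_clusters": [
            {"job_cluster_key": "shared", "new_cluster": cluster_spec}
        ],
        "tasks": [
            {
                "task_key": "extract",
                "job_cluster_key": "shared",
                "notebook_task": {"notebook_path": first_notebook},
            },
            {
                "task_key": "transform",
                # Runs only after the "extract" task has finished.
                "depends_on": [{"task_key": "extract"}],
                "job_cluster_key": "shared",
                "notebook_task": {"notebook_path": second_notebook},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder cluster values; substitute ones valid for your workspace.
    spec = {"spark_version": "<runtime-version>",
            "node_type_id": "<node-type>", "num_workers": 1}
    print(json.dumps(build_two_task_job("demo", "/nb/a", "/nb/b", spec), indent=2))
```

In 2.0 the equivalent settings would carry a single `notebook_task` and `new_cluster` at the top level, which is why the two versions' Job objects do not line up field for field.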