At my company, we are using Clockify to track time. Now, it's my job to extract the data from Clockify API and use it in a Power BI report.
To do this, I use a POST request for a Time Entries Detailed Report, so I get all the time entries. However, there is a limit of a 1000 records when accessing this API. To overcome this issue, I use a function that POST the request, which takes a pagenumber parameter and a second function that generates a list from 1 to x, where it loops through all the pages and stops when 0 time entries are returned for that page. Code provided below.
This was working fine for a while, but more and more colleagues are tracking more and more time. Therefor my query refresh fails more and more often, due to request timeout. My hypotheses: looping through all the pages now takes too long.
Function:
fClockifyDetailedReport
= ( pagenumber ) =>
let
body = "{ ""dateRangeStart"": ""2023-01-01T00:00:00.000"", ""dateRangeEnd"": """&DateTime.ToText(DateTime.LocalNow(), [Format="yyyy-MM-dd'T'HH:mm:ss", Culture="en-US"]) &""", ""detailedFilter"": { ""page"": """&Number.ToText( pagenumber )&""", ""pageSize"": ""1000"", ""options"": { ""totals"": ""EXCLUDE"" } } }", Source = Json.Document(Web.Contents("https://reports.api.clockify.me/v1/workspaces/{WORKSPACEID}/reports/detailed", [ Headers = [#"Content-Type"="application/json", #"X-Api-Key"="{APIKEY}"], Content=Text.ToBinary(body) ]))
in
Source
Loop to make the table:
tClockifyDetailedReport
= List.Generate(
()=> [pagenumber = 1, report = fClockifyDetailedReport(1)],
each not List.IsEmpty([report][timeentries]),
each [pagenumber = [pagenumber] + 1, report = fClockifyDetailedReport( [pagenumber])],
each[report
]
)
Is there any way to overcome this issue? My solution would be a data warehouse to store the data, where data from previous months will no longer be refreshed, since they no longer change. That way I only have to request the data for the current month, what will reduce the number of pages significantly. Is there a way in Power Query to do this? Some way of only refreshing a part of the data? Or do I need to use different software to extract the data to a data warehouse and get my data into Power Bi by accessing this data warehouse instead of the API directly?
Help will be highly appreciated!
If your problem is your web request is getting rate limited, or a regular time out -- you can use Value.WaitFor()
Often APIs will return values in the response header that will say essentially 'you have X requests until time Y' or 'you hit the rate limit, you can resume at time t' where 't' might be 10 seconds from now
use
ManualStatusHandlingIf you don't, HTTP Status errors come back as exceptions. Instead, you probably want to capture
429,500,404.Easier debugging
For testing you can use httpbin.org/Status_codes to generate web Errors on demand.
Try catch
You can use
catchas syntactic sugar over the oldHasErrorsyntax.eis theErrorRecorde[Message]returns the error as plain text.If you just want the error as a text message, use
e[Message]A method to visually debug queries
Here's a longer example that visually catches errors