I am creating a service that lets users apply filters to BigQuery data and export the result as CSV or JSON. Is there a way to estimate how long BigQuery will take to export a given set of rows?
Currently, I record the number of rows and the time each export job took to finish. I then use the average time per exported row to estimate the time for a new job. But export time clearly does not scale linearly with the number of rows.
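For reference, the averaging approach described above amounts to roughly the following. This is a simplified sketch, not the actual service code: the class and method names are hypothetical, and it assumes all past jobs are pooled into a single seconds-per-row average.

```python
from dataclasses import dataclass


@dataclass
class ExportTimeEstimator:
    """Naive estimator: pooled average seconds per row over past export jobs.

    Hypothetical helper for illustration; it does not call BigQuery.
    """

    total_rows: int = 0
    total_seconds: float = 0.0

    def record(self, rows: int, seconds: float) -> None:
        """Store the outcome of one finished export job."""
        self.total_rows += rows
        self.total_seconds += seconds

    def estimate(self, rows: int) -> float:
        """Predict export time by scaling the average per-row time."""
        if self.total_rows == 0:
            raise ValueError("no completed exports recorded yet")
        return rows * (self.total_seconds / self.total_rows)


est = ExportTimeEstimator()
est.record(rows=1000, seconds=10.0)
est.record(rows=3000, seconds=20.0)
print(est.estimate(2000))
```

Because the estimate is strictly proportional to the row count, it cannot capture any fixed per-job overhead, which is one reason the predictions drift.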
Any suggestions for a prediction algorithm would be great too.
Unfortunately, there isn't a great way to predict how long an export will take. There are a number of factors:
Since most of these factors aren't really under your control, the best practices are:
Note that there are a couple of open bugs with respect to extract performance that we're working on addressing.