In my BigQuery client, I am enabling the storage read client, per the documentation and examples at https://github.com/googleapis/google-cloud-go/blob/f2b13307a85e81e278476ea51a359cbb3974667a/bigquery/examples_test.go#L165-L186
My code is basically identical:
```go
err = bigQueryClient.BQ.EnableStorageReadClient(context.Background(), option.WithCredentialsFile(o.GoogleServiceAccountCredentialFile))
```

[...]

```go
it, err := query.Read(context.TODO())
if err != nil {
	log.WithError(err).Error("error querying test status from bigquery")
	errs = append(errs, err)
	return status, errs
}
for {
	testStatus := apitype.ComponentTestStatusRow{}
	err := it.Next(&testStatus)
	if err == iterator.Done {
		break
	}
	// ...
}
```
I then run a query that returns about 150,000 rows (roughly 150 MB) and iterate over the results. This takes about 30 seconds, which seems very slow to me; similar activity against Postgres takes about 3 seconds.
I was trying to figure out whether my app is using the storage API at all; my understanding is that the Storage Read API is gRPC-based, rather than the JSON-over-REST API. I profiled the Go app using pprof and saw a lot of time being spent in JSON decoding inside the BigQuery client. I then looked at tcpdump and saw it making lots of HTTPS connections.
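For completeness, here is a minimal sketch of how a profile like this can be captured with `runtime/pprof`; `runQueryAndIterate` is a hypothetical stand-in for the query code above:

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// runQueryAndIterate stands in for the BigQuery query and row
// iteration shown above.
func runQueryAndIterate() {}

func main() {
	// Capture a CPU profile across the query and iteration, then
	// inspect the hot paths with: go tool pprof cpu.prof
	f, err := os.Create("cpu.prof")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	runQueryAndIterate()
}
```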
This makes me think it's not using the storage API. Is there a way to figure out why, or to force the client to use the faster storage API if it isn't?
While debugging and trying to use the storage API directly, I discovered my user was missing the permission required to use it (bigquery.readsessions.create). EnableStorageReadClient in the Go library doesn't actually return an error if you can't use it; everything just falls back silently to the much slower REST API.
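One way to surface the problem explicitly is to bypass the bigquery package and create a read session with the Storage Read API client directly: with a credential that lacks bigquery.readsessions.create, CreateReadSession fails loudly with a PermissionDenied error instead of silently falling back. A rough sketch, with placeholder project/table names (note that the storagepb import path differs across library versions):

```go
package main

import (
	"context"
	"fmt"
	"log"

	bqstorage "cloud.google.com/go/bigquery/storage/apiv1"
	"cloud.google.com/go/bigquery/storage/apiv1/storagepb"
)

func main() {
	ctx := context.Background()

	// The storage read client speaks gRPC; constructing it does not
	// verify permissions, but the CreateReadSession call does.
	client, err := bqstorage.NewBigQueryReadClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Placeholder project/dataset/table names.
	req := &storagepb.CreateReadSessionRequest{
		Parent: "projects/my-project",
		ReadSession: &storagepb.ReadSession{
			Table:      "projects/my-project/datasets/my_dataset/tables/my_table",
			DataFormat: storagepb.DataFormat_AVRO,
		},
		MaxStreamCount: 1,
	}

	// Without bigquery.readsessions.create this returns PermissionDenied
	// rather than silently degrading to REST.
	session, err := client.CreateReadSession(ctx, req)
	if err != nil {
		log.Fatalf("CreateReadSession failed (likely missing bigquery.readsessions.create): %v", err)
	}
	fmt.Println("read session created:", session.GetName())
}
```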
Reproducer code
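Something along these lines, with placeholder project, table, and credential file names, is enough to reproduce the silent fallback and compare timings between accounts:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()

	// Placeholder project and credentials; swap in service accounts
	// with and without bigquery.readsessions.create to compare.
	client, err := bigquery.NewClient(ctx, "my-project",
		option.WithCredentialsFile("sa.json"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Returns nil even when the credential cannot create read
	// sessions; the library then falls back to REST silently.
	if err := client.EnableStorageReadClient(ctx,
		option.WithCredentialsFile("sa.json")); err != nil {
		log.Fatal(err)
	}

	q := client.Query("SELECT * FROM `my-project.my_dataset.my_table`")
	start := time.Now()
	it, err := q.Read(ctx)
	if err != nil {
		log.Fatal(err)
	}

	var rows int
	for {
		var row []bigquery.Value
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		rows++
	}
	fmt.Printf("read %d rows in %s\n", rows, time.Since(start))
}
```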
Results using an account with and without that permission:
I filed a bug against the Go library: https://github.com/googleapis/google-cloud-go/issues/9102