How to select data from Google BigQuery in Clojure via Java interop?


I couldn't find any examples online. Can anyone point me to an example of how to select data from Google BigQuery in Clojure via Java interop?

[com.google.cloud/google-cloud-bigquery "2.16.0"]

Here's the Java example Google provides:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

// Sample to query in a table
public class Query {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "MY_PROJECT_ID";
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String query =
        "SELECT name, SUM(number) as total_people\n"
            + " FROM `"
            + projectId
            + "."
            + datasetName
            + "."
            + tableName
            + "`"
            + " WHERE state = 'TX'"
            + " GROUP BY name, state"
            + " ORDER BY total_people DESC"
            + " LIMIT 20";
    query(query);
  }

  public static void query(String query) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      QueryJobConfiguration queryConfig = QueryJobConfiguration.newBuilder(query).build();

      TableResult results = bigquery.query(queryConfig);

      results
          .iterateAll()
          .forEach(row -> row.forEach(val -> System.out.printf("%s,", val.toString())));

      System.out.println("Query performed successfully.");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Query not performed \n" + e.toString());
    }
  }
}

Answer by Martin Půda (best answer):

I wasn't able to test this code, so you will probably have to make some adjustments, but it should give you the general idea:

Dependency: [com.google.cloud/google-cloud-bigquery "2.16.0"]

Import in ns: (:import (com.google.cloud.bigquery BigQueryOptions QueryJobConfiguration BigQuery BigQueryException BigQuery$JobOption))

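(defn use-query [query]
  (try
    ;; Initialize the client; per the Java sample it can be created once and reused.
    (let [^BigQuery big-query (.getService (BigQueryOptions/getDefaultInstance))
          ^QueryJobConfiguration query-config (.build (QueryJobConfiguration/newBuilder query))
          ;; .query is a Java varargs method, so pass an empty JobOption array.
          results (.query big-query
                          query-config
                          (into-array BigQuery$JobOption []))]
      ;; Each row is iterable; print every field followed by a comma, as the Java sample does.
      (doseq [row (.iterateAll results)
              field row]
        (print (str field ",")))
      (println)
      (println "Query performed successfully."))
    (catch BigQueryException e (println "Query not performed \n" (str e)))
    (catch InterruptedException e (println "Query not performed \n" (str e)))))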

(let [project-id "MY_PROJECT_ID"
      dataset-name "MY_DATASET_NAME"
      table-name "MY_TABLE_NAME"
      query (str "SELECT name, SUM(number) as total_people\n"
                 " FROM `"
                 project-id
                 "."
                 dataset-name
                 "."
                 table-name
                 "`"
                 " WHERE state = 'TX'"
                 " GROUP BY name, state"
                 " ORDER BY total_people DESC"
                 " LIMIT 20")]
  (use-query query))
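
If you build the query string in more than one place, it may be easier to read as a small helper. This is only a sketch: table-ref and top-names-query are names I made up for illustration, and it assumes the project, dataset, and table names come from trusted configuration, since they are spliced directly into the SQL text.

```clojure
(require '[clojure.string :as str])

(defn table-ref
  "Returns the backtick-quoted, fully qualified table name.
  Hypothetical helper; the arguments are inserted verbatim."
  [project-id dataset-name table-name]
  (format "`%s.%s.%s`" project-id dataset-name table-name))

(defn top-names-query
  "Builds the same SELECT statement as above, one clause per line.
  Hypothetical helper, not part of the BigQuery client library."
  [project-id dataset-name table-name]
  (str/join "\n"
            ["SELECT name, SUM(number) AS total_people"
             (str "FROM " (table-ref project-id dataset-name table-name))
             "WHERE state = 'TX'"
             "GROUP BY name, state"
             "ORDER BY total_people DESC"
             "LIMIT 20"]))
```

You would then call it as (use-query (top-names-query "MY_PROJECT_ID" "MY_DATASET_NAME" "MY_TABLE_NAME")).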