How to increase query execution time of Presto in Spark

1.1k views Asked by At

I'm currently connecting to Presto using Spark. Our queries are getting timed out after 60m, to increase query execution time I've set query.max-execution-time parameters in getDBProperties() like below

private def constructPrestoDataFrame(sparkSession : SparkSession, jobConfig : Config, query : String) : DataFrame = {
    sparkSession
      .read
      .jdbc(getPrestoConnectionUrl(jobConfig), query, getDBProperties(jobConfig))
  }

  private def getDBProperties(jobConfig : Config) : Properties = {
    val dbProperties = new Properties
    dbProperties.put("user", jobConfig.getString("presto.user"))
    dbProperties.put("password", jobConfig.getString("presto.password"))
    dbProperties.put("Driver", jobConfig.getString("presto.driver.name"))
    dbProperties.put("query.max-execution-time", "2d")

    dbProperties
  }

  private def getPrestoConnectionUrl(jobConfig : Config) : String = {
    s"jdbc:presto://${jobConfig.getString("presto.host")}:8443/${jobConfig.getString("presto.catalogue.name")}?SSL=true&SSLTrustStorePath=${jobConfig.getString("sslTrustStorePath")}"+
      "&SSLTrustStorePassword="+URLEncoder.encode(jobConfig.getString("sslTrustStorePassword"))
  }

When I run the job I'm getting exception saying exception caught: Cause = null Message = Unrecognized connection property 'query.max-execution-time'

We use apache-spark-2.3.x, presto-jdbc-driver-300.

1

There are 1 answers

0
Beniamin H On

Adding MAX_EXECUTION_TIME to sessionVariables in the URL does the job for me:

jdbc:mysql://{host}:{port}/{database}?sessionVariables=MAX_EXECUTION_TIME=123456666

Query to verify:

SELECT @@max_execution_time

Expected output:

+--------------------+ 
|@@max_execution_time| 
+--------------------+ 
| 123456666          | 
+--------------------+