Spark Thrift Server loads full dataset into memory before transmission via JDBC

The Spark Thrift Server tries to load the full dataset into memory before transmitting it via JDBC; on the JDBC client I receive this error:

SQL Error: org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 48 tasks (XX GB) is bigger than spark.driver.maxResultSize (XX GB)

Query: select * from table. Is it possible to enable something like a streaming mode for the Thrift Server? The main goal is to grant access from Pentaho ETL to a Hadoop cluster using Spark SQL via a JDBC connection. But if the Thrift Server has to load the full dataset into memory before transmission, this approach will not work.
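
For reference, the query goes through the Hive JDBC interface that the Thrift Server exposes; a minimal way to reproduce the failure from the command line is beeline (thrift-host and port 10000 are placeholders for your endpoint):

  beeline -u "jdbc:hive2://thrift-host:10000/default" -e "select * from table"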

There are 2 answers

Sanjai Verma

In your situation, increase the Spark driver memory and the maximum result size: spark.driver.memory=xG and spark.driver.maxResultSize=xG, per https://spark.apache.org/docs/latest/configuration.html
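
A minimal sketch of passing these at startup, assuming the stock sbin/start-thriftserver.sh launcher and placeholder sizes (8g/4g) that you would tune to your data:

  # start the Thrift Server with a larger driver heap and result-size limit
  ./sbin/start-thriftserver.sh \
    --driver-memory 8g \
    --conf spark.driver.maxResultSize=4g

Per the linked configuration page, spark.driver.maxResultSize=0 removes the limit entirely, but that only trades the error for the risk of an OutOfMemoryError on the driver.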

Triffids

Solution: spark.sql.thriftServer.incrementalCollect=true
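
A minimal sketch of enabling this at startup, again assuming the stock sbin/start-thriftserver.sh launcher:

  # fetch results incrementally instead of collecting them all to the driver
  ./sbin/start-thriftserver.sh \
    --conf spark.sql.thriftServer.incrementalCollect=true

With incremental collect enabled, the Thrift Server fetches the result one partition at a time rather than collecting the whole result set to the driver, so driver memory is bounded by the largest partition instead of the full output of select * from table.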