DataType (UDT) vs. Encoder in Spark SQL


In Spark SQL, there is a limited set of DataTypes for defining a schema, and a limited set of Encoders for converting JVM objects to and from the internal Spark SQL representation.
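As a concrete reference point for the two "limited" sets, here is a minimal Spark 2.x sketch (the Person class is just an example of mine):

    import org.apache.spark.sql.Encoders
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

    // DataType side: a schema is composed from the built-in DataTypes
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType, nullable = true)))

    // Encoder side: built-in encoders cover primitives and Product types (case classes)
    case class Person(id: Int, name: String)
    val personEncoder = Encoders.product[Person]
    val intEncoder    = Encoders.scalaInt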

  • In practice, we may hit errors like this regarding DataType; they usually happen with a DataFrame holding custom types, BUT NOT with a Dataset[T] of custom types. Discussion of DataType (or UDT) points to How to define schema for custom type in Spark SQL? (A minimal sketch of code that can trigger this error follows below.)

java.lang.UnsupportedOperationException: Schema for type xxx is not supported
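To make the first case concrete, here is a minimal sketch of the kind of code that raises it in my experience (Spark 2.x assumed; Point and Record are made-up names, not from any library):

    import org.apache.spark.sql.SparkSession

    // A custom type that Catalyst knows nothing about
    class Point(val x: Double, val y: Double)
    case class Record(id: Int, location: Point)

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // createDataFrame infers the schema reflectively from Record; the Point field
    // has no corresponding Catalyst DataType, so this throws something like
    // "java.lang.UnsupportedOperationException: Schema for type Point is not supported"
    val df = spark.createDataFrame(Seq(Record(1, new Point(1.0, 2.0))))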

  • In practice, we may hit errors like this regarding Encoder; they usually happen with a Dataset[T] of custom types, BUT NOT with a DataFrame of custom types. Discussion of Encoder points to How to store custom objects in Dataset? (Again, a minimal sketch that reproduces it follows below.)

Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._ Support for serializing other types will be added in future releases
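A matching sketch for the second case (again Spark 2.x and the same made-up Point class). The failure is at compile time, because no implicit Encoder[Point] is in scope; the usual workaround from the linked answer is a generic Kryo-based encoder:

    import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

    // A plain class, not a case class, so spark.implicits._ provides no encoder for it
    class Point(val x: Double, val y: Double)

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Does NOT compile:
    // "Unable to find encoder for type stored in a Dataset ..."
    // val broken = Seq(new Point(1.0, 2.0)).toDS()

    // Workaround: supply an Encoder explicitly, here a Kryo-based one
    implicit val pointEncoder: Encoder[Point] = Encoders.kryo[Point]
    val ds = spark.createDataset(Seq(new Point(1.0, 2.0)))
    ds.count()  // the Dataset itself now works fine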

In my understanding, both touch the internal Spark SQL optimizer (which is why only a limited number of DataTypes and Encoders are provided), and both DataFrame and Dataset are just Dataset[A], so then..
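(For what it's worth, the sense in which I say both are "just Dataset[A]": in Spark 2.x, DataFrame is literally a type alias for Dataset[Row], so the untyped and the typed API share the same machinery. A small self-contained illustration:)

    import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val df: DataFrame = Seq((1, "a"), (2, "b")).toDF()   // columns _1, _2

    // DataFrame is just an alias for Dataset[Row] in Spark 2.x,
    // so this assignment needs no conversion at all ...
    val untyped: Dataset[Row] = df

    // ... while getting a typed view back requires an Encoder
    // (here the built-in tuple encoder from spark.implicits._)
    val typed: Dataset[(Int, String)] = df.as[(Int, String)]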

Questions (or, more precisely, confusions)

  • Why does the first error appear only with a DataFrame but not with a Dataset[T]? And the same question, in reverse, for the second error...

  • Can creating a UDT solve the 2nd error? Can creating an Encoder solve the 1st error?

  • How should I understand the relation between the two, and how do they interact with Dataset and the Spark SQL engine? (The sketch after this list illustrates the kind of interaction I mean.)
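Here is the interaction that puzzles me, under the same assumptions as above (Spark 2.x, made-up Point class). A Kryo-based Encoder makes a Dataset[Point] work, but on the DataType side the whole object is just one opaque binary column, so having an Encoder apparently does not give the type a real schema:

    import org.apache.spark.sql.{Encoders, SparkSession}

    class Point(val x: Double, val y: Double)

    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    // The Encoder side is satisfied: this Dataset can be mapped, filtered, counted ...
    val ds = spark.createDataset(Seq(new Point(1.0, 2.0)))(Encoders.kryo[Point])

    // ... but the DataType side sees only serialized bytes. This prints something like:
    //   root
    //    |-- value: binary (nullable = true)
    ds.printSchema()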

The purpose of this post is to explore the two concepts further and to invite open discussion, so please bear with me if the questions are not very specific. Any sharing of understanding is appreciated. Thanks.
