When to Use the Low-Level APIs?


In Spark, Resilient Distributed Datasets (RDDs) are a low-level API and DataFrames are a high-level API. My question is: when should I use the low-level APIs?


There is 1 answer

swapnil shashank

Spark has two fundamental sets of APIs: the low-level “unstructured” APIs, and the higher-level structured APIs.

An RDD can process both structured and unstructured data, whereas a DataFrame organizes the data into rows and columns and therefore works on structured data. You can convert a DataFrame to an RDD if required.
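To make the difference concrete, here is a minimal PySpark sketch, not a definitive recipe. It assumes a local Spark installation with the `pyspark` package available; the sample log lines and the names `parse_line` and `run_example` are made up for illustration. It parses free-form text with the RDD API, moves to a DataFrame once the data has a row/column shape, and then converts back to an RDD.

```python
def parse_line(line):
    """Split a free-form log line into (level, message); return None if malformed."""
    parts = line.strip().split(" ", 1)
    if len(parts) != 2:
        return None
    level, message = parts
    return (level.upper(), message)


def run_example():
    # Imported here so parse_line can be reused and tested without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # assumption: run locally on one core
        .appName("rdd-vs-dataframe")
        .getOrCreate()
    )
    try:
        # Hypothetical unstructured input: plain strings with no schema.
        raw = spark.sparkContext.parallelize([
            "ERROR disk full on node-3",
            "info job 42 finished",
            "garbage",
        ])

        # Low-level API: apply arbitrary Python functions to each record.
        parsed = raw.map(parse_line).filter(lambda rec: rec is not None)

        # The data now has a row/column shape, so switch to the structured API.
        df = spark.createDataFrame(parsed, ["level", "message"])
        df.groupBy("level").count().show()

        # A DataFrame can be converted back to an RDD of Row objects if required.
        levels = df.rdd.map(lambda row: row.level).collect()
        print(levels)
    finally:
        spark.stop()
```

Calling `run_example()` in an environment with Spark available runs the whole pipeline. The parsing step is kept as a plain function so the RDD stage stays easy to test on its own.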

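In general, people use DataFrames, and therefore the high-level APIs, because they provide more built-in operations. Which one fits depends purely on your requirements: reach for RDDs when you need something the structured APIs cannot express, such as processing unstructured data.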

I suggest reading a book such as 'Learning Spark' or 'Spark: The Definitive Guide' for more detail.