List Question
20 TechQA 2023-10-04T22:55:57.317000
Spark reads all columns in filtering when using scala syntax
129 views
Asked by jack
Why would finding an aggregate of a partition column in Spark 3 take very long time?
471 views
Asked by RyanCheu
How to structure large queries in spark
281 views
Asked by Zelazny7
Does Spark SQL optimize lower() on both sides?
92 views
Asked by Kathmandude
How do you inspect candidate logical plans of cost-based SQL optimizer in spark (scala)?
109 views
Asked by wpunter
is it possible to avoid second exchange when spark joins two datasets using joinWith?
168 views
Asked by dpolaczanski
Apache Spark What is the difference between requiredChildDistribution and outputPartitioning?
115 views
Asked by Arjunlal M.A
Apache Spark dataframe lineage trimming via RDD and role of cache
384 views
Asked by alexanoid
What is the role of Catalyst optimizer and Project Tungsten
3.9k views
Asked by Surendiran Balasubramanian
Dataframe API vs Spark.sql
2.7k views
Asked by Vijaya Bhaskar
Export a spark logical/physical plan?
1.6k views
Asked by Hamza EL KAROUI
Is it possible to outperform the Catalyst optimizer on highly skewed data using only RDDs
57 views
Asked by Allen Han
For "iterative algorithms," what is the advantage of converting to an RDD then back to a Dataframe
786 views
Asked by Allen Han
steps in spark physical plan not assigned to DAG step
131 views
Asked by user276537
Spark internals: benefits of Project
241 views
Asked by Alon
Long linear queries in Spark against a graph stored in Hive tables
83 views
Asked by Anthony Arrascue
Rewrite LogicalPlan to push down udf from aggregate
486 views
Asked by adream307
What happened to the ability to visualize query plans in a Databricks notebook?
475 views
Asked by mauna
Spark optimize "DataFrame.explain" / Catalyst
702 views
Asked by BiS
How to create custom Spark-Native functions without forking/modifying Spark itself
1k views
Asked by cozos