Can someone explain which component of the Spark architecture converts a Spark application into a DAG? And where can I find a complete, in-depth description of how Spark works internally?
I am trying to understand the Apache Spark architecture in depth. As a first step, I understand that a Spark application is converted into a DAG (Directed Acyclic Graph). This DAG is scheduled by the DAG Scheduler and executed according to the execution plan prepared by Spark's physical execution engine (Tungsten).
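For DataFrame, Dataset, and SQL code, that would be the Catalyst Optimizer, which turns your query into an optimized logical plan and then a physical plan. The DAGScheduler then splits the resulting RDD lineage into stages at shuffle boundaries. This article discusses the Catalyst Optimizer in quite some detail.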
Don't hesitate to look at the source code if you're looking for extreme detail. You'll always learn something new :D
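If it helps as a mental model before diving into the source, here is a toy sketch in plain Python of how a chain of transformations gets cut into stages wherever a shuffle is needed. To be clear, this is not Spark code: `Op` and `split_into_stages` are made-up names for illustration, the lineage is assumed to be a simple linear chain, and the real scheduler handles far more (multiple parents, caching, retries).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical stand-in for one transformation in a linear lineage."""
    name: str
    parent: Optional["Op"] = None
    wide: bool = False  # True if this op needs a shuffle of its parent's output


def split_into_stages(last: Op) -> List[List[str]]:
    """Walk the lineage backwards and start a new stage at every shuffle."""
    stages: List[List[str]] = []
    current: List[str] = []
    node: Optional[Op] = last
    while node is not None:
        current.append(node.name)
        if node.wide:
            # The op that reads shuffled data is the first op of its stage.
            stages.append(current[::-1])
            current = []
        node = node.parent
    if current:
        stages.append(current[::-1])
    # Stages were collected from last to first, so reverse them.
    return stages[::-1]


# A word-count-like lineage; only reduceByKey requires a shuffle here.
text_file = Op("textFile")
flat_map = Op("flatMap", parent=text_file)
mapped = Op("map", parent=flat_map)
reduced = Op("reduceByKey", parent=mapped, wide=True)
map_values = Op("mapValues", parent=reduced)

for i, stage in enumerate(split_into_stages(map_values)):
    print(f"stage {i}: {' -> '.join(stage)}")
```

In this toy model the narrow transformations (`textFile`, `flatMap`, `map`) are pipelined into one stage, and `reduceByKey` opens a second stage because it needs data from all partitions of the first.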
Hope this helps!