TechQA.

Linked Questions

  • spark job failing in windows with java.io.IOException: (null) entry in command string: null chmod 0644
  • org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute, tree: EMP_SAL#7736
  • Passing command line arguments to Spark-shell
  • Spark: Best practice for retrieving big data from RDD to local machine
  • Compressing sequence file in Spark?
  • Hadoop and Spark
  • Spark: Monitoring a cluster mode application
  • Installing Apache Spark on Windows
  • How can I use the literal value of a spark dataframe column?
  • How to filter invalid xmls
  • How to write data to hive table with snappy compression in Spark SQL
  • how does spark does in-memory computation
  • Killing Spark job using command Prompt
  • spark-sql 1.3 writes parquet much faster than spark-sql 2.4
  • Spark duplicated workers instantiated

Decide number of Executors to process large amount of data

Asked by sri on 11 May at 15:12

How many executors are required to process 1 PB of data in Spark? Is there a formula to calculate the number of executors?

Thanks,

apache-spark

0 Answers
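There is no exact formula, but a common back-of-envelope estimate works from the task count: Spark runs roughly one task per input partition, each executor runs `spark.executor.cores` tasks in parallel, and the job completes in some number of sequential task "waves". The numbers below (128 MB partitions, 5 cores per executor, 1000 waves) are illustrative assumptions, not values from the question:

```python
import math

def estimate_executors(data_bytes, partition_bytes=128 * 1024**2,
                       cores_per_executor=5, target_waves=1000):
    """Rough sizing sketch: one task per partition; each executor runs
    `cores_per_executor` tasks concurrently; the job finishes in
    `target_waves` sequential rounds of tasks. All defaults are assumptions."""
    partitions = math.ceil(data_bytes / partition_bytes)          # total tasks
    parallel_tasks = math.ceil(partitions / target_waves)         # tasks per wave
    executors = math.ceil(parallel_tasks / cores_per_executor)
    return partitions, executors

one_pb = 1024**5  # 1 PB in bytes
partitions, executors = estimate_executors(one_pb)
print(partitions, executors)  # 8388608 partitions -> 1678 executors
```

In practice the estimate is then checked against cluster limits (executor memory, YARN/Kubernetes container sizes) and tuned empirically; the wave count is just a proxy for how long you are willing to let the job run.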

Related Questions

  • API compatibility between scala and python?
  • Installed Spark, built against right hadoop version, getting cannot assigned requested address error
  • spark RDD (Resilient Distributed Dataset) can be updated?
  • Spark Clusters: worker info doesn't show on web UI
  • What will spark do if I don't have enough memory?
  • Does PySpark offer advantage when data size is bigger than memory?
  • Spark FileStreaming not Working with foreachRDD
  • Connection Refused When Running SparkPi Locally
  • What is the difference between map and flatMap and a good use case for each?
  • When to use SPARK_CLASSPATH or SparkContext.addJar
  • Apache Spark compile failed while installing Netty
  • Java samples for GraphX
  • Passing set of lines in Apache Spark
  • Passing configuration to Spark Job
  • Change Executor Memory (and other configs) for Spark Shell