When I create an RDD using sc.textFile in Spark, I get a result like:
org.apache.spark.rdd.RDD[String] = file:///home/cloudera/data MapPartitionsRDD[133] at textFile at <console>:23
What does the [133] represent? I see that it increases with each new RDD, so it looks like some kind of ID.
Yes. Looking at the implementation of RDD, it's the ID of the RDD, which identifies that RDD uniquely within its SparkContext.
The toString() method of RDD includes this id along with the RDD's name, class name, and creation site, which is why it shows up in the shell output.
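To make the mechanism concrete, here is a minimal sketch that mimics it without depending on Spark. It is a simplified model, not Spark's actual source: MiniContext, MiniRDD, and the kind parameter are hypothetical names invented for illustration. The assumption is that the context hands out ids from a shared counter and that toString formats the name, class name, id, and creation site in that order.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for SparkContext: hands out increasing ids.
class MiniContext {
  private val nextRddId = new AtomicInteger(0)
  def newRddId(): Int = nextRddId.getAndIncrement()
}

// Hypothetical stand-in for an RDD. `kind` plays the role of the class name
// (e.g. MapPartitionsRDD); `name` may be null, as for an unnamed RDD.
class MiniRDD(ctx: MiniContext, val name: String, kind: String, creationSite: String) {
  // The id is taken from the context once, when the RDD is constructed.
  val id: Int = ctx.newRddId()

  // Assumed format: optional name, class name, [id], creation site.
  override def toString: String =
    "%s%s[%d] at %s".format(
      Option(name).map(_ + " ").getOrElse(""), kind, id, creationSite)
}

object Demo {
  def main(args: Array[String]): Unit = {
    val ctx = new MiniContext
    val a = new MiniRDD(ctx, "file:///home/cloudera/data", "MapPartitionsRDD", "textFile at <console>:23")
    val b = new MiniRDD(ctx, null, "MapPartitionsRDD", "map at <console>:25")
    println(a)
    println(b)
  }
}
```

In a real spark-shell session you can read the same number directly with rdd.id. Every RDD created on a SparkContext takes the next id, including intermediate ones, so the number grows faster than the count of RDDs you assign to variables.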