When using textFile to create an RDD in Spark, what is the index that is displayed in the result?

When I create an RDD using sc.textFile in Spark, I get a result like:

org.apache.spark.rdd.RDD[String] = file:///home/cloudera/data MapPartitionsRDD[133] at textFile at <console>:23

What does the [133] represent? I see that it increases, so it looks like some kind of ID.

There is 1 answer

Sathish (best answer)

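Yes. Looking at the implementation of RDD, it's the ID of the RDD, which identifies the RDD uniquely within its SparkContext. Each new RDD takes the next value from a counter on the SparkContext, which is why the number keeps increasing as you create more RDDs.

See the toString() method of RDD below, where the id is included along with the optional name and the creation site.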

override def toString: String = "%s%s[%d] at %s".format(
    Option(name).map(_ + " ").getOrElse(""), getClass.getSimpleName, id, getCreationSite)
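
To check this yourself, you can read the id field directly instead of parsing the printed string. The following is a minimal sketch, assuming Spark is on the classpath and a local master is acceptable; the object name RddIdDemo and the sample data are made up for illustration. In spark-shell the sc value already exists, so only the lines inside the try block are needed there.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; not part of Spark itself.
object RddIdDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: running locally with a single worker thread.
    val conf = new SparkConf().setAppName("rdd-id-demo").setMaster("local[1]")
    val sc = new SparkContext(conf)
    try {
      val first = sc.parallelize(Seq("a", "b", "c")) // one new RDD
      val second = first.map(_.toUpperCase)          // a transformation creates another RDD

      // id is the same number that toString prints inside the square brackets.
      println(s"first.id  = ${first.id}")
      println(s"second.id = ${second.id}")
      println(second) // e.g. MapPartitionsRDD[<id>] at map at <creation site>

      // RDDs created later in the same SparkContext get larger ids.
      assert(second.id > first.id)
      assert(second.toString.contains(s"[${second.id}]"))
    } finally {
      sc.stop()
    }
  }
}
```

The number you saw was as high as 133 simply because many RDDs had already been created in that shell session. A call such as textFile can also consume more than one id, since it builds an intermediate RDD internally before returning the MapPartitionsRDD.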