I'm following http://spark.apache.org/docs/latest/sql-programming-guide.html
After typing:
val df = spark.read.json("examples/src/main/resources/people.json")
// Displays the content of the DataFrame to stdout
df.show()
// +----+-------+
// | age| name|
// +----+-------+
// |null|Michael|
// | 30| Andy|
// | 19| Justin|
// +----+-------+
I have some questions that I didn't see the answers to.
First, what is the $-notation? As in
df.select($"name", $"age" + 1).show()
Second, can I get the data from just the 2nd row (without knowing in advance what that row contains)?
Third, how would you read in a color image with Spark SQL?
Fourth, I'm still not sure what the difference is between a Dataset and a DataFrame in Spark. The variable df is a DataFrame, so could I change "Michael" to the integer 5? Could I do that in a Dataset?
$ is not an annotation. It is a method call (a shortcut for new ColumnName("name")).
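To make that concrete, here is a minimal sketch that builds the same three rows inline and selects the columns three ways: with $, with functions.col, and with df("..."). It assumes a local SparkSession; the object name DollarNotationSketch, the helper names people and selections, and the app name are made up for illustration. The $"..." syntax only compiles after import spark.implicits._ (spark-shell does that import for you).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical object and helper names, for illustration only.
object DollarNotationSketch {

  // Same rows as people.json in the question, built inline
  // so the sketch does not depend on the examples directory.
  def people(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(
      (None: Option[Long], "Michael"),
      (Some(30L), "Andy"),
      (Some(19L), "Justin")
    ).toDF("age", "name")
  }

  // Three ways to write the same select.
  def selections(spark: SparkSession): Seq[DataFrame] = {
    // Brings the $"..." string interpolator into scope.
    import spark.implicits._
    val df = people(spark)
    Seq(
      df.select($"name", $"age" + 1),         // $ interpolator
      df.select(col("name"), col("age") + 1), // functions.col
      df.select(df("name"), df("age") + 1)    // DataFrame.apply
    )
  }

  def main(args: Array[String]): Unit = {
    // Local session; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("dollar-notation-sketch")
      .master("local[*]")
      .getOrCreate()

    selections(spark).foreach(_.show())
    spark.stop()
  }
}
```

All three selects should produce the same rows, so which form to use is mostly a matter of taste; $ is simply the shortest to type.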