I am working on this notebook: https://databricks.com/notebooks/simple-aws/petastorm-spark-converter-pytorch.html
I tried running the first line:
from pyspark.sql.functions import col

df = spark.read.parquet("/databricks-datasets/flowers/parquet") \
    .select(col("content"), col("label_index")) \
    .limit(1000)
However, I got this error:
Path does not exist: dbfs:/databricks-datasets/flowers/parquet;
Where can I find the Parquet version of the flowers dataset on Databricks? FYI, I am working on the Community Edition.
This dataset was converted into Delta format, so the path is now
/databricks-datasets/flowers/delta
instead of
/databricks-datasets/flowers/parquet
and you need to read it as a Delta table (format "delta") rather than with spark.read.parquet.

P.S. You can always use the
%fs ls path
command to see what files are at a given path.

P.P.S. I'll ask to fix that notebook if it's possible.
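
For the read itself, a minimal sketch might look like the following. It assumes `spark` is the SparkSession that Databricks notebooks predefine. `load_flowers` and `FLOWERS_DELTA_PATH` are hypothetical names introduced here for illustration, and the column names are the ones from the question.

```python
# Assumption: this runs in a Databricks notebook, where `spark` already exists.
# FLOWERS_DELTA_PATH and load_flowers are illustrative names, not part of any API.
FLOWERS_DELTA_PATH = "/databricks-datasets/flowers/delta"


def load_flowers(spark, path=FLOWERS_DELTA_PATH, n=1000):
    """Read the flowers dataset as a Delta table and keep the two columns used in the question."""
    return (
        spark.read.format("delta")          # Delta reader instead of .parquet(...)
        .load(path)
        .select("content", "label_index")   # same columns as in the question
        .limit(n)
    )


# In a notebook cell:
# df = load_flowers(spark)
```

Column names are passed as plain strings here, so the `col` import is not needed. Using `col("content")` as in the original notebook should work as well.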