I want to read parquet files in a Synapse notebook. I tried using a wildcard in the path, but a "FileNotFoundError" occurred.
The folder structure I want to read looks like this:
test/year={yyyy}/month={MM}/day={dd}/*.parquet
This is the code I ran:
df = pd.read_parquet('abfss://[email protected]/test/*/*/*/*.parquet', storage_options = '')
Any help would be appreciated. Thank you.
The wildcard character (*) is not supported in the path passed to the pd.read_parquet() function. It treats * as a literal part of the file path and tries to read a file with that exact name. That is the reason for the "File not found" error. To read all the files under the specified folder, you can use the spark.read.parquet function.
Code:
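A minimal sketch of that approach, assuming a Synapse notebook where the `spark` session is already defined. The container and account names are hypothetical placeholders, and the two helper functions are only for illustration:

```python
def abfss_path(container: str, account: str, folder: str) -> str:
    """Build an abfss:// URI for a folder in ADLS Gen2."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder.strip('/')}"


def read_parquet_folder(spark, path: str):
    """Read every parquet file under `path` with Spark.

    Pointing Spark at the root folder lets it discover the
    year=/month=/day= partition directories and expose them as columns.
    """
    return spark.read.parquet(path)


# Hypothetical placeholder names; replace them with your own.
path = abfss_path("mycontainer", "myaccount", "test")

# In a Synapse notebook `spark` is predefined, so usage would look like:
# df = read_parquet_folder(spark, path)
# pdf = df.toPandas()  # only if a pandas DataFrame is needed and the data fits in memory
```

spark.read.parquet also accepts glob patterns such as test/*/*/*/*.parquet, but with a glob the year, month, and day partition columns may not be inferred unless the basePath option is set, so reading the root folder is usually the simpler choice.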