Here is my code to read Parquet files stored under an S3 bucket path. It works when the path contains Parquet files, but raises exceptions.NoFilesFound when it cannot find any.
import boto3
import awswrangler as wr
boto3.setup_default_session(profile_name="myAwsProfile", region_name="us-east-1")
path_prefix = 's3://example_bucket/data/parquet_files'
path_suffix = '/y=2021/m=4/d=13/h=17/'
table_path = path_prefix + path_suffix
df = wr.s3.read_parquet(path=table_path)
print(len(df))
Output:
22646
If there is no file in the S3 path, for example if I change path_suffix from '/y=2021/m=4/d=13/h=17/' to '/y=2021/m=4/d=13/h=170/', I get the following error:
---------------------------------------------------------------------------
NoFilesFound Traceback (most recent call last)
<ipython-input-9-17df460412d8> in <module>
11
12 file_prefix = table_path + date_prefix
---> 13 df = wr.s3.read_parquet(path=file_prefix)
/usr/local/lib/python3.9/site-packages/awswrangler/s3/_read_parquet.py in read_parquet(path, path_suffix, path_ignore_suffix, ignore_empty, ignore_index, partition_filter, columns, validate_schema, chunked, dataset, categories, safe, map_types, use_threads, last_modified_begin, last_modified_end, boto3_session, s3_additional_kwargs)
602 paths = _apply_partition_filter(path_root=path_root, paths=paths, filter_func=partition_filter)
603 if len(paths) < 1:
--> 604 raise exceptions.NoFilesFound(f"No files Found on: {path}.")
605 _logger.debug("paths:\n%s", paths)
606 args: Dict[str, Any] = {
NoFilesFound: No files Found on: s3://example_bucket/data/parquet_files/y=2021/m=4/d=13/h=170/.
The exception appears to come from the awswrangler Python library rather than from botocore, so botocore.exceptions can't catch it. I could wrap the call in a bare try: / except: to bypass it, but I want to catch this specific exception so I can handle it properly. How can I do this?
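One possible approach, as a minimal sketch: the traceback shows the error being raised as exceptions.NoFilesFound from inside awswrangler, so the class should be reachable as wr.exceptions.NoFilesFound and can be named in the except clause. The helper names read_or_none and read_parquet_or_none are hypothetical, and returning None for a missing path is an assumption about how the caller wants to handle it.

```python
from typing import Any, Callable, Optional, Type


def read_or_none(
    path: str,
    reader: Callable[[str], Any],
    missing: Type[Exception],
) -> Optional[Any]:
    """Return reader(path), or None when the reader raises `missing`.

    Hypothetical helper: only the named exception is handled, so any
    other failure (credentials, permissions, network) still propagates.
    """
    try:
        return reader(path)
    except missing as err:
        print(f"Nothing to read: {err}")
        return None


def read_parquet_or_none(path: str) -> Optional[Any]:
    """awswrangler-specific wrapper (hypothetical name).

    The import is done lazily so that read_or_none stays usable
    without awswrangler installed.
    """
    import awswrangler as wr

    return read_or_none(
        path,
        reader=lambda p: wr.s3.read_parquet(path=p),
        missing=wr.exceptions.NoFilesFound,
    )
```

With this in place, df = read_parquet_or_none(table_path) would give a DataFrame when files exist and None for a path such as the h=170 one, so the caller can check for None before calling len(df). Catching the named class rather than using a bare except: keeps unrelated errors visible.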
If you want to just catch the exception,