I have a parquet file with a number of columns of type converted_type (legacy): TIMESTAMP_MICROS. I want to check whether the flag isAdjustedToUTC is true. I can get it this way:
import pyarrow.parquet as pq
import re
arrow = pq.ParquetFile("/Parquet/File/Path/filename.parquet")
timestamp_string = str(arrow.metadata.row_group(0).column(79).statistics.logical_type)
re.search("isAdjustedToUTC=(.*), timeUnit",timestamp_string).group(1)
This gives me either true or false as a string. Is there another way to retrieve the value of isAdjustedToUTC without using a regex?
As far as I can tell it's not possible.
logical_type is of type pyarrow._parquet.ParquetLogicalType, which doesn't directly expose its underlying members. The only available fields are:
You could use the to_json function, but it's as dirty as the option you've suggested:
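A minimal sketch of the to_json route, not a definitive implementation. It assumes the JSON string returned by to_json() for a timestamp column carries a boolean "isAdjustedToUTC" key (something like {"Type": "Timestamp", "isAdjustedToUTC": true, "timeUnit": "microseconds"}). The helper names is_adjusted_to_utc and column_is_adjusted_to_utc are made up for illustration.

```python
import json


def is_adjusted_to_utc(logical_type_json):
    """Return the isAdjustedToUTC flag from a logical type's JSON string.

    Returns None when the key is absent (e.g. for a non-timestamp column).
    Assumption: the flag is stored under the key "isAdjustedToUTC".
    """
    return json.loads(logical_type_json).get("isAdjustedToUTC")


def column_is_adjusted_to_utc(path, column_index, row_group=0):
    """Hypothetical wrapper: read the flag for one column chunk of a parquet file."""
    # Imported here so the parsing helper above can be used without pyarrow.
    import pyarrow.parquet as pq

    metadata = pq.ParquetFile(path).metadata
    stats = metadata.row_group(row_group).column(column_index).statistics
    # statistics is None when no statistics were written for this column chunk.
    if stats is None:
        return None
    return is_adjusted_to_utc(stats.logical_type.to_json())
```

With the path and column index from the question, the call would be column_is_adjusted_to_utc("/Parquet/File/Path/filename.parquet", 79). Unlike the regex, this gives back a Python bool rather than the string "true" or "false", but it still relies on parsing a string representation.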