I am currently working with Great Expectations and am trying to validate a dataframe against expectations that were created from a very similar set of data. The two dataframes are nearly identical and have identical data types. However, according to the Data Docs, 50% of the expectations fail, even though the dataframe contains the expected values.
I suspected that adding the dataframe as a data source might have automatically converted the column types to something other than their defaults. So I tried changing the data types of the original dataframe used to create the expectations, but I still ran into identical errors. The code is below. Does anyone have suggestions?
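import great_expectations as gx
import pandas as pd

# Load the Data Context (assumes an existing project that already contains the suite)
context = gx.get_context()

# Specify the name of the suite you want to use
suite_name = "my_suite"

# Register a pandas datasource and load the dataframe to validate
datasource = context.sources.add_or_update_pandas(name="temp_datasource")
dataframe = pd.read_csv("temp_df.csv")

# Add the dataframe as an asset and build a batch request for it
name = "temp_dataframe"
data_asset = datasource.add_dataframe_asset(name=name)
batch_request = data_asset.build_batch_request(dataframe=dataframe)

# Create the checkpoint that validates the batch against the suite
checkpoint = context.add_or_update_checkpoint(
    name="my_checkpoint",
    validations=[
        {
            "batch_request": batch_request,
            "expectation_suite_name": suite_name,
        },
    ],
)
# Run the checkpoint; the results are what show up in the Data Docs
checkpoint_result = checkpoint.run()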
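
One way to test the data type suspicion independently of Great Expectations is to compare the dtypes of the two dataframes directly. The following is only a minimal sketch using plain pandas: `original_df` and its columns are hypothetical stand-ins for the dataframe the suite was built from, `dtype_differences` is a helper made up for illustration, and the CSV round trip is simulated in memory rather than reading `temp_df.csv`.

```python
import io

import pandas as pd


def dtype_differences(expected: pd.DataFrame, actual: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column whose dtype differs between the two dataframes."""
    comparison = pd.DataFrame(
        {
            "expected": expected.dtypes.astype(str),
            "actual": actual.dtypes.astype(str),
        }
    )
    return comparison[comparison["expected"] != comparison["actual"]]


# Hypothetical stand-in for the dataframe used to create the expectations
original_df = pd.DataFrame(
    {
        "id": ["001", "002"],                                   # strings with leading zeros
        "when": pd.to_datetime(["2023-01-01", "2023-01-02"]),  # datetimes
        "amount": [1.5, 2.0],                                   # floats
    }
)

# Simulate writing the dataframe to CSV and reading it back
buffer = io.StringIO()
original_df.to_csv(buffer, index=False)
buffer.seek(0)
reloaded_df = pd.read_csv(buffer)

# Columns listed here changed type during the round trip
diff = dtype_differences(original_df, reloaded_df)
print(diff)
```

In this sketch the `id` and `when` columns come back with different dtypes after the round trip, while `amount` does not. If the real dataframes show a similar difference, passing `dtype=` or `parse_dates=` to `pd.read_csv` may be worth trying before validating, though that may not be the cause of the failures here.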