Data Config
from sagemaker import clarify
data_config_unbalanced = clarify.DataConfig(
    s3_data_input_path='s3://a/b.csv',  # S3 object path containing the unbalanced dataset
    s3_output_path='s3://a/',  # S3 path to store the output
    label='My Categorical Target Column',  # target column
    headers=df.columns.to_list(),
    dataset_type='text/csv',
)
Clarify Config
bias_config_unbalanced = clarify.BiasConfig(
    label_values_or_threshold=['Yes'],  # desired sentiment
    facet_name='My facet continous int column',  # sensitive column (facet)
)
Any ideas on how to resolve this? The Clarify job keeps failing with:
ClientError: Threshold values must be provided for continuous features
The documentation was not very helpful.
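A possible direction, offered as a sketch rather than a confirmed fix: the error text suggests Clarify treats the integer facet column as continuous, and BiasConfig accepts a facet_values_or_threshold argument that takes a single-element list holding a threshold for such a facet. The helper name bias_config_kwargs, the sample values, and the median split below are all hypothetical; the right threshold depends on the data.

```python
from statistics import median


def bias_config_kwargs(facet_values, facet_name, label_values):
    """Build keyword arguments for clarify.BiasConfig with an explicit facet threshold.

    Hypothetical helper: the median is only one way to pick a threshold.
    """
    threshold = median(facet_values)
    return {
        "label_values_or_threshold": label_values,
        "facet_name": facet_name,
        # A single-element list is read as a threshold for a continuous facet.
        "facet_values_or_threshold": [threshold],
    }


# Hypothetical sample of the facet column; in practice, read it from the DataFrame.
sample_facet_values = [23, 35, 41, 52, 67]
kwargs = bias_config_kwargs(
    sample_facet_values,
    facet_name="My facet continous int column",
    label_values=["Yes"],
)
print(kwargs["facet_values_or_threshold"])  # -> [41]

# With the SageMaker SDK installed, the arguments would be passed on like this:
# from sagemaker import clarify
# bias_config_unbalanced = clarify.BiasConfig(**kwargs)
```

If the facet is really categorical and only stored as integers, listing the sensitive category values in facet_values_or_threshold instead of a threshold may be the better fit.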