Basic data validation in AWS Glue against a schema/expected file format, including row-level validation

I am new to AWS. I need to process daily feeds (which should be in the same format every day) that are received via SFTP and loaded into S3, then processed by AWS Glue and loaded into a database, i.e.:

SFTP --> Amazon S3 --> AWS Glue --> Validate data --> Load into Aurora DB

How can I validate the data received from S3 against a pre-defined schema? How do I log, report, or send alerts when some elements in the file are invalid? Does Glue support row-level validation (e.g., rows 5 and 9 are invalid, but the rest of the file is still accepted as valid)?
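
To make the row-level requirement concrete, here is a minimal sketch in plain Python of the behaviour I am after. It is not Glue-specific: the schema, the column names (`id`, `amount`, `date`), and the `validate_rows` function are all hypothetical and made up for illustration, and I do not know whether Glue offers an equivalent.

```python
import csv
import io
from datetime import datetime

# Hypothetical schema: column name -> parser that raises ValueError on bad input.
SCHEMA = {
    "id": int,
    "amount": float,
    "date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}

def validate_rows(text):
    """Split CSV text into (valid_rows, errors).

    errors is a list of (row_number, problems), where row_number is the
    1-based position of the data row (the header is not counted).
    """
    reader = csv.DictReader(io.StringIO(text))
    valid, errors = [], []
    for row_number, row in enumerate(reader, start=1):
        problems = []
        for column, parse in SCHEMA.items():
            value = row.get(column)
            try:
                if value is None:
                    raise ValueError("missing value")
                parse(value)
            except ValueError as exc:
                problems.append(f"{column}: {exc}")
        if problems:
            errors.append((row_number, problems))
        else:
            valid.append(row)
    return valid, errors

# Made-up sample feed: row 1 is fine, rows 2 and 3 contain bad values.
sample = (
    "id,amount,date\n"
    "1,9.99,2024-01-01\n"
    "x,5,2024-01-02\n"
    "3,abc,2024-13-01\n"
)
valid, errors = validate_rows(sample)
for row_number, problems in errors:
    print(f"row {row_number} rejected: {problems}")
```

In other words, the valid rows would go on to be loaded into Aurora, while the rejected row numbers and reasons would be logged or sent as an alert.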

I searched online but could not find this basic information.

There are 0 answers