I have a KFH application that writes Snappy-compressed JSON files to an S3 bucket. I also have a Glue crawler that builds a schema from that bucket. However, the crawler classifies the table as UNKNOWN; it does not detect that the files are JSON. According to the doc below, the Glue crawler supports Snappy compression for JSON files, but I couldn't get it to work. https://docs.aws.amazon.com/glue/latest/dg/add-classifier.html#classifier-built-in
Thanks.
This can happen when the JSON files don't all have the same schema, or when the structure is too complicated for the built-in classifiers to classify.
If the JSON files have different schemas, you should filter out the files whose schema differs. You can test this by running the crawler on just a few JSON files.
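Before re-running the crawler, it may help to check locally whether a handful of sample files really share a schema. The following is only a rough sketch: it assumes you have already downloaded and decompressed a few files, that each file holds one JSON object per line, and that comparing top-level keys is a good enough proxy for "same schema". The function name `group_files_by_schema` and the sample file names are made up for illustration.

```python
import json


def group_files_by_schema(samples):
    """Group files by the set of top-level keys found in their records.

    `samples` maps a file name to its already-decompressed text,
    assumed to contain one JSON object per line.
    """
    groups = {}
    for name, text in samples.items():
        keys = set()
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if isinstance(record, dict):
                keys.update(record.keys())
        # Files with an identical key set land in the same group.
        groups.setdefault(frozenset(keys), []).append(name)
    return groups


# Hypothetical sample data standing in for decompressed S3 objects.
samples = {
    "a.json": '{"id": 1, "name": "x"}\n{"id": 2, "name": "y"}',
    "b.json": '{"name": "z", "id": 3}',
    "c.json": '{"id": 4, "price": 1.5}',
}
for keys, files in group_files_by_schema(samples).items():
    print(sorted(keys), files)
```

If this yields more than one group, the files in the smaller groups are the ones to filter out or move to a separate prefix before crawling.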
If you are sure that the schema is the same but the crawler still can't read it, then build your own custom JSON classifier. You can read about it here. Once it is built, attach it to your crawler; the crawler should then be able to read the files, and the classification should change from UNKNOWN to your classifier's name.
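As a rough sketch of that last step, this is how creating a custom JSON classifier and attaching it to an existing crawler might look with boto3. The crawler name, classifier name, and JSON path shown in the usage comment are placeholders, not values from your setup; the right JSON path depends on how your records are laid out. The helper name `attach_json_classifier` is also just for illustration. The Glue client is passed in so the function can be tried against a stub first.

```python
def attach_json_classifier(glue, crawler_name, classifier_name, json_path):
    """Create a custom JSON classifier and attach it to an existing crawler.

    `glue` is expected to be a boto3 Glue client (or an object exposing the
    same three methods). All names passed in are the caller's assumptions.
    """
    # Register the custom classifier with the given JSON path.
    glue.create_classifier(
        JsonClassifier={"Name": classifier_name, "JsonPath": json_path}
    )
    # Point the crawler at the custom classifier. Note that this sets the
    # crawler's list of custom classifiers to exactly this one.
    glue.update_crawler(Name=crawler_name, Classifiers=[classifier_name])
    # Re-run the crawler so the table gets reclassified.
    glue.start_crawler(Name=crawler_name)


# Example usage (placeholder names; requires boto3 and AWS credentials):
#
#   import boto3
#   attach_json_classifier(
#       boto3.client("glue"),
#       crawler_name="my-crawler",
#       classifier_name="my-json-classifier",
#       json_path="$[*]",
#   )
```

After the crawler finishes, the table's classification should show the custom classifier's name instead of UNKNOWN if the classifier matched.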