BadGzipFile error when reading a .gz file with Python's gzip module


The source file is JSON, and we compressed it to gz format. The file itself looks fine: I can open it in Notepad++.

import gzip

# Open the GZIP file in text mode ('rt')
with gzip.open('example.gz', 'rt') as f:
    file_content = f.read()
    print(file_content)

The error I get is

BadGzipFile: Not a gzipped file (b'{\n')

I also tried reading it line by line and got the same error:

import gzip

with gzip.open('example.gz', 'r') as fin:
    for line in fin:
        print('got line:', line)

This is my sample json data:

{
  "metadata_version": 1,
  "created": "2024-01-31T16:02:11.400125+00:00",
  "domain": {
    "name": "myname",
    "version": 1,
    "type": "core"
  },
  "id1": "01HNG439A8M7395MB9CWC4XSKC",
  "id2": {
    "id3": "efbc9315-6a27-455b-9050-02ea08eb1b69",
    "id4": "05933069-eeb5-4801-8801-fdd9819d08bf",
    "id5": "8b642da5-e954-402c-bcb9-a196d594ed62"
  },
  "data": "AAAAAAAA22RzW7CMBCEXyXymVQJNOHnVgGlHIoikvbQ2+IsYMnYdNemQlXfvQ4Q4MB1ZvebWftXVASGQTplzcyrWozEWub9LM8HsUyyNE5TxBjSvoyTJEuyfPUMvXUqOmKJ3x7ZTcChGBmvdUeMNQIps3mznnE"
}

There is 1 answer

Anson (best answer)

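The .gz file was downloaded from AWS S3. When we downloaded it, it was automatically unzipped to its original JSON format, but the file name stayed myfile.gz. So despite its name, the file is actually plain JSON, not a gzip file. This matches the error message: the bytes it reports, b'{\n', are the start of a JSON document rather than the gzip magic number (1f 8b).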
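
If you cannot be sure whether a downloaded file is still compressed, one option is to check its first two bytes for the gzip magic number (1f 8b) before deciding how to open it. The following is only a minimal sketch: the helper names is_gzipped, read_text and load_json are made up for illustration, and it assumes the content is UTF-8 encoded JSON.

```python
import gzip
import json

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def read_text(path, encoding="utf-8"):
    """Read a file as text, decompressing only if it is really gzipped."""
    # Hypothetical helper: pick gzip.open or the built-in open based on content,
    # not on the file extension.
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rt", encoding=encoding) as f:
        return f.read()


def load_json(path):
    """Parse JSON from a file that may or may not be gzip-compressed."""
    return json.loads(read_text(path))
```

With this in place, load_json('example.gz') should work whether the file is real gzip or JSON that merely kept a .gz name.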