pd.read_excel / ExcelWriter raise BadZipFile: File is not a zip file

176 views Asked by At

I am trying to concatenate a series of excel files with Pandas. I created a for loop to read in all file for a provided directory and I specified the sheet name alongside the columns I wanted to read in. Once I read those files I concatenated them as a DataFrame variable which I named 'list_of_df'.
I have tested my code up to the point where they are concatenated and I can see the error is initialized once I begin to concatenate the various excel sheets. Currently I am getting the error 'File is not a zip file'

import pandas as pd
import glob

df = []
list_of_df = pd.DataFrame()

for filename in all_files:
    df = pd.read_excel(filename, 
                       sheet_name = 'cycle', 
                       usecols=['Chg. Cap.(mAh)','DChg. Cap.(mAh)'], 
                       engine = 'openpyxl')
    list_of_df = pd.concat([list_of_df. df], ignore_index = True)

print(list_of_df)

with pd.ExcelWriter('output3.xlsx',  engine = 'openpyxl', mode = 'a',) as writer:
    df.to_excel(writer, sheet_name = 'output3')

This is the error raised:

Traceback (most recent call last):

  File ~\AppData\Local\anaconda3\Lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File c:\documents\python scripts\chg._and_dchg._curves_for_excel5.py:30
    df = pd.read_excel(filename, sheet_name = 'cycle', usecols=['Chg. Cap.(mAh)','DChg. Cap.(mAh)'], engine = 'openpyxl')

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\util\_decorators.py:211 in wrapper
    return func(*args, **kwargs)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\util\_decorators.py:331 in wrapper
    return func(*args, **kwargs)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\io\excel\_base.py:482 in read_excel
    io = ExcelFile(io, storage_options=storage_options, engine=engine)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\io\excel\_base.py:1695 in __init__
    self._reader = self._engines[engine](self._io, storage_options=storage_options)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\io\excel\_openpyxl.py:557 in __init__
    super().__init__(filepath_or_buffer, storage_options=storage_options)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\io\excel\_base.py:545 in __init__
    self.book = self.load_workbook(self.handles.handle)

  File ~\AppData\Local\anaconda3\Lib\site-packages\pandas\io\excel\_openpyxl.py:568 in load_workbook
    return load_workbook(

  File ~\AppData\Local\anaconda3\Lib\site-packages\openpyxl\reader\excel.py:315 in load_workbook
    reader = ExcelReader(filename, read_only, keep_vba,

  File ~\AppData\Local\anaconda3\Lib\site-packages\openpyxl\reader\excel.py:124 in __init__
    self.archive = _validate_archive(fn)

  File ~\AppData\Local\anaconda3\Lib\site-packages\openpyxl\reader\excel.py:96 in _validate_archive
    archive = ZipFile(filename, 'r')

  File ~\AppData\Local\anaconda3\Lib\zipfile.py:1302 in __init__
    self._RealGetContents()

  File ~\AppData\Local\anaconda3\Lib\zipfile.py:1369 in _RealGetContents
    raise BadZipFile("File is not a zip file")

BadZipFile: File is not a zip file
1

There are 1 answers

0
OCa On

Summarizing the value from the comments:

  1. The error traceback indicates that the issue is with the reading of the Excel files df = pd.read_excel(filename, sheet_name = 'cycle', usecols=['Chg. Cap.(mAh)','DChg. Cap.(mAh)'], engine = 'openpyx), not the writing after concatenating, as you seemed to believe.
  2. Possible causes have been reported:

File corruption might be too easy a culprit and no evidence was given. The sensitivity tag on the other hand looks well documented.

Based on this, to turn this post into a useful one, both for you and present/future people experiencing the same issue,

  • in the spirit of MRE, reformulate your post into reading one or two files that cause / don't cause the issue (a major edit that would be better done by yourself - might not be accepted if I do it)
  • please come report your findings!