I tried using openpyxl and had a lot of trouble, now im trying pandas and an getting tons of tracebacks. Seems like copy pasting merged cells is hard and very specific, either that or im just a nooby. Can anyone help with tips or solutions?
import os
import pandas as pd
# Set the paths for the interim and finals folders
interim_folder = r'C:\Users\Jake_\Downloads\Code\Interims'
finals_folder = r'C:\Users\Jake_\Downloads\Code\Finals'
# Get a list of all files in the interim folder
interim_files = os.listdir(interim_folder)
# Iterate over each file in the interim folder
for file_name in interim_files:
# Check if the file is an Excel file
if file_name.endswith('.xlsx'):
interim_file_path = os.path.join(interim_folder, file_name)
finals_file_path = os.path.join(finals_folder, file_name)
# Read the interim Excel file using pandas with the xlrd engine
interim_data = pd.read_excel(interim_file_path, sheet_name='Financial Report Interim Mar 10', engine='xlrd')
# Copy the required data to a new DataFrame
finals_data = pd.DataFrame()
finals_data['Column B'] = interim_data.iloc[3:8, 1] # Copy B4:B8
finals_data['Column D'] = interim_data.iloc[5:201, 3] # Copy D6:D200
finals_data['Column E'] = interim_data.iloc[5:201, 4] # Copy E6:E200
finals_data['Column F'] = interim_data.iloc[5:201, 5] # Copy F6:F200
finals_data['Column G'] = interim_data.iloc[5:201, 6] # Copy G6:G200
# Write the data to the finals Excel file
writer = pd.ExcelWriter(finals_file_path, engine='openpyxl')
finals_data.to_excel(writer, sheet_name='Final Financial Report July 7', index=False)
writer.save()
and heres the tracebacks i get
What im trying to do: I have about 100 files in a folder called interims - and each of those files has a corresponding file thats literally the same name a folder called finals. I want to copy a specific few cells from a specific worksheet called "Financial Report Interim Mar 10" from each of the files in the interim folder and paste into a worksheet called "Final Financial Report July 7" in the corresponding files in the finals folder. Have been trying and failing for like 6 hours.
Thanks
Tried using openpyxl and now pandas, still no luck at all.
It looks like the excel workbook you have provided is the one being written to i.e. the 'Final' sheet, so not sure what the interim sheet looks like.
However to address the actual issue in the Openpyxl only code from the traceback you provided;
You want to obtain the top left cell of the merged cells. That cannot be obtained directly from merged cells, change that section
to something like this
With the above code example if the cells D6 and D7 were merged then
top_left_cell
will be'D6'
.But as noted in the comment with the main loop;
you will be trying to process non existent cells. Using the example of cells D6 and D7 being merged, on the first row loop the cells will be
D6, E6, F6
etc, then on the second loop the first cell will be the non existantD7
so at that point the code needs to determine that or it will break again.The same issue can also occur in the opposite way. If D7 is a valid cell in the interim sheet but merged on the Final sheet (and top left is 'D6') then the line
would also fail since you cannot write to it in the Finals sheet. It would return the error;