How to change pdf filenames in a directory from excel data?

56 views Asked by At

I have an excel file with a column named old_name. Which is a column of a list of old filenames that need to be changed. Then in the next column I have new_name. I want to change all pdfs in a directory from old name to the new name. Like this:

Old_name New_name
Nike Adidas
Puma Gucci

The code I've tried is:

import os
import pandas as pd

directory = r"C:\x\x\x\PDF"
df = pd.read_excel("New_name.xlsx")

for index, row in df.iterrows():
    old_name = row["Old_name"]
    new_name = row["New_name"]
    old_path = os.path.join(directory, old_name)
    new_path = os.path.join(directory, new_name)

    if os.path.exists(old_path) and old_name.endswith(".pdf"):
        os.rename(old_path, new_path)
        print(f"Renamed {old_path} to {new_path}")

The problem is that I get a error of FileNotFoundError. Using this code, it seems like it's looking for the new_name data in the directory, they are not there (they are in the excel) so then it says error due to not finding in the directory.

2

There are 2 answers

0
Tunahan A. On

This might work:

import pandas as pd
import os

df = pd.read_excel("New_name.xlsx") 
directory = r"C:/x/x/x/PDF/"
replace_dict = dict(zip(df["Old_name"].unique(), df["New_name"].unique()))

pdf_files = [f for f in os.listdir(directory) if ".pdf" in f]

for old, new in replace_dict.items():
    if f"{old}.pdf" in pdf_files:
        os.rename(f"{directory}/{old}.pdf", f"{directory}/{new}.pdf")
0
Timeless On

I can't trigger the FileNotFoundError with your code. Although this one does not rename the files, no error is thrown since the if is never satisfied. Anyways, to fix your code, you need to add the .pdf extension to the filenames from your spreadsheet/df :

directory = "C:/x/x/x/PDF"
df = pd.read_excel("New_name.xlsx")

for index, row in df.iterrows():
    old_name = f'{row["Old_name"]}.pdf' # <- updated
    new_name = f'{row["New_name"]}.pdf' # <- updated
    old_path = os.path.join(directory, old_name)
    new_path = os.path.join(directory, new_name)
        
    if os.path.exists(old_path): # <- updated
        os.rename(old_path, new_path)
        print(f"Renamed {old_path} to {new_path}")

Another variant (with ) :

from pathlib import Path

directory = Path("C:/x/x/x/PDF")
df = pd.read_excel("New_name.xlsx")

for old_name, new_name in zip(df["Old_name"], df["New_name"]):
    old_path = directory / f"{old_name}.pdf"
    new_path = directory / f"{new_name}.pdf"

    if old_path.exists():
        old_path.rename(new_path)
        print(f"Renamed {old_path} to {new_path}")