I have a folder containing files in several formats, such as PDF, TXT, DOCX and HTML. I want to validate the format of each file in Python.
Here is my attempt:
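import glob
import os
import pdftables_api

# Create the API client once instead of once per file
client = pdftables_api.Client('my_api')

for path in glob.glob(r"myfolder\*"):
    # Compare the extension case-insensitively so ".PDF" also matches
    if path.lower().endswith('.pdf'):
        # Each output needs a real file name; a "*" wildcard is not a valid target
        name = os.path.splitext(os.path.basename(path))[0]
        client.xlsx(path, os.path.join("destination", name + ".xlsx"))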
The idea is to iterate through each file, check whether it is a PDF and, if it is, convert it to Excel using the pdftables_api package and save the result in the destination folder. But I don't feel this is an efficient way to do it.
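To make the validation step concrete, here is a minimal sketch of the kind of check I have in mind. It assumes that the extension is a good first guess and that a real PDF starts with the bytes `%PDF-`. The names `KNOWN_FORMATS`, `detect_format`, `looks_like_pdf` and `find_pdfs` are hypothetical helpers made up for illustration, not part of any library.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a format label.
KNOWN_FORMATS = {".pdf": "pdf", ".txt": "txt", ".docx": "docx", ".html": "html"}


def detect_format(path):
    """Return a format label based on the extension, or None if it is unknown."""
    return KNOWN_FORMATS.get(Path(path).suffix.lower())


def looks_like_pdf(path):
    """Check whether the file starts with the '%PDF-' signature."""
    with open(path, "rb") as f:
        return f.read(5) == b"%PDF-"


def find_pdfs(folder):
    """Return the files in folder that have a .pdf extension and a PDF header."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and detect_format(p) == "pdf" and looks_like_pdf(p)
    )
```

With something like this, only the paths returned by `find_pdfs("myfolder")` would be sent to the API, so a file that was merely renamed to `.pdf` would not use up a conversion.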
Can anyone suggest a more efficient way of achieving this?