Process only files with filename above a certain number


I have a single directory full of millions of files with names such as:

234.txt
235.txt
236.txt

I would like to work through the files with a name that has an integer prefix above a certain value, which is determined by the last file processed in a previous run and fetched from a database.

At the minute I have:

import os

for root, dirs, files in os.walk(directory):
    for filename in files:
        # The integer prefix is everything before the first "."
        if int(filename.split(".", 1)[0]) > last_processed_id:
            ...  # do something with the file

But I have hundreds of thousands of files, so this approach spends a lot of time on pointless work, checking whether each filename has already been processed. Is there a faster or better way to limit the files returned from os.walk(), short of moving the files once they have been processed?
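For reference, here is a minimal sketch of one possible direction, not a definitive answer. It assumes the directory is flat (no subdirectories need to be walked) and that every file of interest is named `<integer>.txt`. The helper name `files_above` is hypothetical. The sketch still visits every directory entry, since the filesystem cannot filter by numeric prefix, but `os.scandir()` yields entries lazily instead of building a full list of names first, and it skips the regular expression.

```python
import os
from typing import Iterator


def files_above(directory: str, last_processed_id: int) -> Iterator[str]:
    """Yield paths of '<integer>.txt' files whose integer prefix
    is greater than last_processed_id (hypothetical helper)."""
    # os.scandir() returns an iterator, so entries are consumed one at a time
    with os.scandir(directory) as entries:
        for entry in entries:
            stem, ext = os.path.splitext(entry.name)
            # Ignore anything that is not named like "123.txt"
            if ext != ".txt" or not (stem.isascii() and stem.isdigit()):
                continue
            if int(stem) > last_processed_id:
                yield entry.path
```

Usage would look like `for path in files_above(directory, last_processed_id): ...`. Note that the order of the yielded paths is arbitrary, so the highest ID seen has to be tracked separately if it is written back to the database.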
