I have a highly branched folder structure as shown below. I want to match a barcode that is nested between other descriptors in a folder name, then open and read a file of interest in that barcode's folder. The contents of each XXbarcodeXX folder are basically the same.
[Image: first 3 levels of the directory structure]

I have tried to use os.walk(), glob.glob(), and os.listdir() in combination with fnmatch, but none of them yielded the correct folder. glob.glob() just returned an empty list, which I think means it didn't find anything.

My closest attempt is below. I did not let it finish because it seemed to be going top-down through every folder rather than just checking the folder names in the second level. This was taking far too long because some folders in the third and fourth levels contain hundreds of files and folders.

import os
import re

path='C:\\my\\path'
barcode='barcode2'
 
for dirname, dirs, files in os.walk(path):
    for folds in dirs:
        if re.match('*'+barcode+'*', folds):
            f = open(os.path.join(dirname+folds)+'FileOfInterest.txt', 'w')

There is 1 answer

n1colas.m On

The * at the start of your re.match pattern raises an error (nothing to repeat at position 0), because * is a quantifier (zero or more times) with no preceding token to repeat. You may try replacing the pattern with '..' + barcode + '..'. Since re.match anchors at the start of the string, this matches a folder name in which exactly two characters (any except line terminators) precede the barcode and at least two follow it. In os.path.join, pass the directory names and the desired file name as separate arguments in a single call, so that the OS-specific separator is inserted between them.
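
To make the difference between these patterns concrete, here is a minimal sketch that compares them on a few folder names. The names in `names` are made up for illustration and are not taken from the question; `re.search` and `fnmatch.fnmatch` are shown only as possible alternatives when the descriptors around the barcode are not exactly two characters long.

```python
import fnmatch
import re

barcode = 'barcode2'
# Hypothetical folder names, invented for this illustration.
names = ['AAbarcode2BB', 'sample_barcode2_run1', 'AAbarcode7BB']

# A leading '*' is a shell-style wildcard, not a valid regex: re raises re.error.
try:
    re.match('*' + barcode + '*', names[0])
    star_error = None
except re.error as exc:
    star_error = str(exc)

# re.match anchors at the start, so '..' demands exactly two leading characters.
anchored = [n for n in names if re.match('..' + barcode + '..', n)]

# re.search finds the barcode anywhere; re.escape guards any special characters.
anywhere = [n for n in names if re.search(re.escape(barcode), n)]

# fnmatch understands the shell-style '*' wildcard directly.
wildcard = [n for n in names if fnmatch.fnmatch(n, '*' + barcode + '*')]

print(star_error)
print(anchored)
print(anywhere)
print(wildcard)
```

Note that a plain substring test such as barcode2 would also match a name containing barcode20, so a stricter pattern may be needed if barcodes can share prefixes.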

import os
import re

path='dirStructure'
barcode='barcode2'

for dirname, dirs, files in os.walk(path):
    for folds in dirs:
        if re.match('..' + barcode + '..', folds):
            # 'with' closes the file automatically once the block ends
            with open(os.path.join(dirname, folds, 'FileOfInterest.txt'), 'r') as f:
                print(f.readlines())
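
Since os.walk descends into every subfolder, it may still be slow on a tree where deeper levels hold hundreds of entries. The following is a rough sketch of a search that looks only at folder names two levels below the root and never goes deeper. It assumes the barcode folders sit exactly at that depth (root/group/XXbarcodeXX), which is a guess based on the question; the function name `find_barcode_dirs` is hypothetical.

```python
import os


def find_barcode_dirs(root, barcode):
    """Return second-level folders under root whose name contains barcode.

    Assumes the layout root/<group>/<XXbarcodeXX>; deeper levels are never read.
    """
    matches = []
    with os.scandir(root) as level1:
        for top in level1:
            if not top.is_dir():
                continue
            # Only list the names directly inside each first-level folder.
            with os.scandir(top.path) as level2:
                for entry in level2:
                    if entry.is_dir() and barcode in entry.name:
                        matches.append(entry.path)
    return sorted(matches)
```

Each returned path can then be combined with the file name, for example os.path.join(folder, 'FileOfInterest.txt'), and opened as in the loop above. Under the same layout assumption, glob.glob(os.path.join(path, '*', '*' + barcode + '*')) should give roughly the same result, although it also returns matching files, not only folders.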