I am doing an os.walk() over a certain part of my OneDrive synced folder structure. It all worked fine until recently. Now ALL files from one specific directory are ignored. I tested several possible reasons and narrowed it down to this: The directory that is ignored is the one that holds the most files (897 at this point).
If I remove two of the files from said directory (it does not matter which two), it works and all files are recognized. When I add the files again, the result is the same: No files from that directory turn up in my os.walk() result list.
I did check Microsoft's Restrictions and limitations in OneDrive and SharePoint, but I am far from any of the file size and number (1, 2) limits mentioned.
My code looks like this:
import os

# Collect into a separate list so the name is not shadowed by os.walk()'s `files`
matches = []
for root, dirs, files in os.walk(mainDirectory):
    for f in files:
        if 'Common part' in root:
            matches.append(os.path.join(root, f))
'Common part' is a text string that all relevant folders in the mainDirectory have in common.
The directory itself is recognized every time; only its files are not added to my list. So I tried another approach using glob.glob(). Here, the results are a bit different but still not satisfactory:
import glob
import os

folders = []
for root, dirs, files in os.walk(mainDirectory):
    for d in dirs:
        if d.startswith('Common part'):
            folders.append(os.path.join(root, d))
# One list of matching .xlsx paths per folder
files = [glob.glob(os.path.join(f, '*.xlsx')) for f in folders]
This does give me approximately half the files from the problematic folder. Again, when I remove two files, it gives me the full list.
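As a side note, the list comprehension above produces one list per folder rather than a flat list of paths. A minimal sketch of a flat collection follows, assuming a hypothetical helper name find_xlsx and the same 'Common part' prefix convention. It does not fix the OneDrive problem. As far as I know, glob.glob() also suppresses OSError while listing a directory, which would be consistent with getting only part of the files.

```python
import glob
import os


def find_xlsx(main_directory, prefix="Common part"):
    """Return a flat list of .xlsx paths found directly inside every
    folder whose name starts with `prefix` (hypothetical helper)."""
    matches = []
    for root, dirs, _files in os.walk(main_directory):
        for d in dirs:
            if d.startswith(prefix):
                # extend() keeps the result flat instead of nesting lists
                matches.extend(glob.glob(os.path.join(root, d, "*.xlsx")))
    return matches
```

Comparing len(find_xlsx(mainDirectory)) with the file count shown in Explorer gives a quick check of how many files went missing.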
When I copy/move the files to a local (not OneDrive synced) path, it works. So I guess it does have to do with OneDrive. Having the files outside of OneDrive is not an option.
The directory in question is not directly in my OneDrive but a "Sync"/"Shortcut" from SharePoint.
All files can be opened; they are downloaded, not on-demand. I have removed the sync and re-synced the folder. I have restarted OneDrive (and my machine) several times.
I am really at a loss here. Any hints welcome!
Update: Thanks to the help of @GordonAitchJay, it could be established that, at a certain threshold of files (or sum of file sizes?), functions like os.listdir() and win32file.FindFilesW() stop returning their usual output and instead raise OSError: [WinError 87] The parameter is incorrect.
Also, in the meantime, we reproduced the same behaviour on another machine within the same organization. We tried this after a full reset of my OneDrive did not result in any improvement.
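This probably also explains why the files disappear silently instead of raising: os.walk() ignores an OSError from the underlying directory listing unless an onerror callback is passed. A minimal sketch that records such errors instead of dropping them (walk_reporting_errors is a hypothetical helper name, not part of the original code):

```python
import os


def walk_reporting_errors(top):
    """Walk `top` and return (file_paths, errors).

    By default os.walk() swallows any OSError raised while listing a
    directory; passing `onerror` lets us record each one instead.
    """
    errors = []
    paths = []
    for root, _dirs, files in os.walk(top, onerror=errors.append):
        for name in files:
            paths.append(os.path.join(root, name))
    return paths, errors
```

Calling paths, errors = walk_reporting_errors(mainDirectory) and printing each entry of errors should show which directory failed and with which error code. On the affected folder I would expect to see the WinError 87 there.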
Though I can't prove it, it seems that OneDrive is up to some sort of tomfoolery that causes win32's FindNextFileW to fail with an ERROR_INVALID_PARAMETER error, but apparently only when it is called by Python's os.walk, os.listdir, and win32file.FindFilesW, and when some files have been deleted from the OneDrive directory syncing a SharePoint folder. Utterly bizarre. I'm thinking maybe OneDrive hooks FindNextFileW, which remains after ending the OneDrive process and services with Task Manager.
A workaround is to use ctypes to call the lower level NtQueryDirectoryFile function (which is ultimately what FindNextFileW calls anyway).
Eryk Sun's answer to another question has a working example. I have copied it below, and have only changed the last couple lines: