I have a PHP application that scans (per request) for the existence of some files on a network share. I'm using glob for this, because usually I only know the beginning of the filename.
I noticed that glob does not return files that are currently opened by any client, so my application thinks file_xy does not exist if somebody has opened it.
Is there a way to make glob return opened (:= locked?) files as well?
The strange thing is that this is not mentioned anywhere. However, I can confirm that glob is NOT returning files that are currently opened by a client. (As soon as the client closes the accessing application, glob will return the file as usual.)
PS: Not even glob("\\server\share\*") returns the file as long as it is opened. (The network share allows the maximum number of concurrent users.)
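// Single quotes with escaped backslashes yield the UNC path \\server\share.
$dir = opendir('\\\\server\\share');
// Compare against false so an entry named "0" does not end the loop early.
while (($file = readdir($dir)) !== false) {
    echo $file . "<br />";
}
closedir($dir);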
shows the file in question perfectly fine, whether or not it is opened by another client. So I can almost rule out any access limit or permission issue.
I figured out the cause, even though I do not know the reason yet:
The issue with glob() not finding an opened file appears when the file is located on a drive that uses the built-in data deduplication feature of Windows Server 2012 R2.
If I move the file to a non-deduplicated share, glob() can read it, even when it is opened by multiple clients.
Since I have a working alternative, this question should mainly focus on why glob does not work here, or let's say works differently. There has to be a difference in how glob and readdir access the underlying filesystem to determine the contents.
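For reference, the working alternative can be sketched as a prefix search built on opendir/readdir, which keeps listing the file in my tests. This is only a minimal sketch: the helper name findByPrefix is made up for illustration, and the comparison is case-sensitive, which may differ from what glob does on a Windows share.

```php
<?php
// Hypothetical helper (name chosen for illustration): emulates a
// "prefix*" glob by walking the directory with opendir()/readdir().
function findByPrefix(string $dir, string $prefix): array
{
    $handle = opendir($dir);
    if ($handle === false) {
        return []; // directory could not be opened
    }

    $matches = [];
    // Compare against false so an entry named "0" does not end the loop.
    while (($entry = readdir($handle)) !== false) {
        if ($entry === '.' || $entry === '..') {
            continue; // skip the pseudo-entries
        }
        // strncmp() returns 0 when the first strlen($prefix) bytes are equal.
        if (strncmp($entry, $prefix, strlen($prefix)) === 0) {
            $matches[] = $entry;
        }
    }
    closedir($handle);

    sort($matches); // readdir() order is filesystem-dependent
    return $matches;
}
```

It would be called like findByPrefix('\\\\server\\share', 'file_xy') and returns bare filenames, not full paths.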
Another Proof
There is further evidence that this relates to data deduplication: I configured the feature to deduplicate "only" files older than 3 days.
I set up a cron job that opens and globs a certain file on the share. Once the file was about 3 days old (Windows decides when to deduplicate), glob failed to list it while it was opened by another client.
Thus, glob is able to find open files that were copied to the share WITHIN the last 3 days, and then starts to miss them once they have been deduplicated.
Observations
glob
glob fails, causing this post :-)
scandir
Using the mentioned scandir function shows the very same behavior:
- deduplicated file opened by a client - missing in the resulting array.
- deduplicated file not opened by a client - part of the resulting array.
opendir / readdir
I want to underline again that opendir along with readdir works in both cases.
RecursiveDirectoryIterator
This produced the expected result at all times as well.
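To reproduce these four observations side by side, something like the following could be used. It is a sketch under assumptions: the function name listingReport is made up, and it only reports whether each listing method sees a given filename in one directory; it does not explain the difference.

```php
<?php
// Hypothetical diagnostic (name chosen for illustration): reports which
// listing methods see the entry $name directly inside $dir.
function listingReport(string $dir, string $name): array
{
    $sep = DIRECTORY_SEPARATOR;

    // glob() returns full paths (or false on error), so compare basenames.
    $globbed = glob($dir . $sep . '*');
    $viaGlob = is_array($globbed)
        && in_array($name, array_map('basename', $globbed), true);

    // scandir() returns bare entry names (or false on failure).
    $scanned = scandir($dir);
    $viaScandir = is_array($scanned) && in_array($name, $scanned, true);

    // opendir()/readdir(): walk the handle entry by entry.
    $viaReaddir = false;
    $handle = opendir($dir);
    if ($handle !== false) {
        while (($entry = readdir($handle)) !== false) {
            if ($entry === $name) {
                $viaReaddir = true;
                break;
            }
        }
        closedir($handle);
    }

    // RecursiveDirectoryIterator: a plain foreach visits the top level only.
    $viaIterator = false;
    $iterator = new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS);
    foreach ($iterator as $info) {
        if ($info->getFilename() === $name) {
            $viaIterator = true;
            break;
        }
    }

    return [
        'glob' => $viaGlob,
        'scandir' => $viaScandir,
        'readdir' => $viaReaddir,
        'RecursiveDirectoryIterator' => $viaIterator,
    ];
}
```

Running var_dump(listingReport('\\\\server\\share', 'file_xy')) once while a client has the file open and once after it is closed should show which methods disagree.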
File Attributes
I noted that deduplicated files are shown with a "Size on Harddrive" of 0 bytes, while files that are not yet deduplicated (which are successfully found) are shown with the size they are logically occupying (based on the filesystem's cluster size).
However, this would not explain why it makes a difference whether a file is opened by a client or not. The reported size is the same at all times.
I'm not sure if this is what you're looking for, but I use scandir() to list all the files in a directory; you can then execute any command on them once you know the name. It will work on open files as well.
PHP scandir documentation source