Python - File modification time comparison, strange behavior


I use a Python backup script to back up files from my hard drive to two kinds of destinations: pen drives, which are normally detached from my PC, and permanently attached external drives.

I have logic in my script that does a copy from source to destination only if the source file is newer.

If the destination file is newer, I just report an error and don't do any copy.
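The logic described above might look roughly like the following sketch. The asker's actual script is not shown, so the function name `backup_if_newer` and its return values are hypothetical and only illustrate the comparison being discussed.

```python
import os
import shutil


def backup_if_newer(src, dst):
    """Copy src over dst only when src has the later modification time.

    Returns "copied", "unchanged", or "destination newer".
    (Hypothetical helper; not the asker's real code.)
    """
    if not os.path.exists(dst):
        # copy2 also carries the modification time over to the copy.
        shutil.copy2(src, dst)
        return "copied"

    src_mtime = os.path.getmtime(src)
    dst_mtime = os.path.getmtime(dst)

    if src_mtime > dst_mtime:
        shutil.copy2(src, dst)
        return "copied"
    if dst_mtime > src_mtime:
        # This is the case the question reports as an error.
        return "destination newer"
    return "unchanged"
```

Note that this compares the raw time stamps exactly, so any change in how the destination file system stores them can flip the result.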

This works well for the permanently attached external drives. But for the pen drives, for most of the files, the destination file is reported as being newer than the source file.

I use my pen drives for backups only and never for anything else. So it is impossible for files on the pen drives to be newer.

What could be the problem?

Thank you, Vishy


There are 2 answers

Alfe On

You could have come across a problem with different time stamps on different file system types. Since you post so little information on these, I have to take a wild guess.

The mechanism I'm thinking about is this:

Your original file system of type A (e.g. ext3fs, reiserfs, ntfs, ...) might store a time stamp for each file with a precision of milliseconds. The backup file system (e.g. fat32, ...) might have a coarser precision for the time stamps (e.g. only whole seconds, or 2-second steps for FAT modification times). During creation of the backup the system has to decide how to handle that. The millisecond information must be lost, and maybe the value gets rounded, so 12:23:34.789 might be rounded up to 12:23:35. (This of course should apply to around 50% of the files.)

When comparing file times, depending on the cleverness of the routines, this result might be interpreted as "the backup is newer than the original".
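One way to make the comparison less sensitive to this kind of rounding is to treat time stamps that differ by no more than a small tolerance as equal. The sketch below is only an illustration: the name `compare_mtimes` is hypothetical, and the default tolerance of 2 seconds is an assumption chosen to cover the 2-second resolution of FAT modification times.

```python
def compare_mtimes(src_mtime, dst_mtime, tolerance=2.0):
    """Compare two modification times given in seconds.

    Returns 1 if the source is newer, -1 if the destination is newer,
    and 0 if they differ by no more than `tolerance` seconds.
    (Hypothetical helper; the 2-second default is an assumption.)
    """
    diff = src_mtime - dst_mtime
    if abs(diff) <= tolerance:
        return 0  # close enough to count as the same moment
    return 1 if diff > 0 else -1


# 12:23:34.789 on the source vs. 12:23:35 on the backup, as seconds of the day.
source_time = 12 * 3600 + 23 * 60 + 34.789
backup_time = 12 * 3600 + 23 * 60 + 35
result = compare_mtimes(source_time, backup_time)  # treated as equal
```

With an exact comparison the backup in this example would look newer by about 0.2 seconds; with the tolerance it is treated as unchanged.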

As I said, this is just a wild guess, so you should have a look at the concrete time stamps to find out.
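To look at the concrete time stamps, something like the following sketch could be run on one source file and its backup copy. The function name `describe_mtimes` is hypothetical. The times are printed in UTC so that the local time zone does not obscure the difference.

```python
import os
from datetime import datetime, timezone


def describe_mtimes(src, dst):
    """Return both modification times (as UTC ISO strings) and their difference.

    (Hypothetical helper for inspecting a single source/backup pair.)
    """
    src_mtime = os.stat(src).st_mtime
    dst_mtime = os.stat(dst).st_mtime
    return {
        "source": datetime.fromtimestamp(src_mtime, tz=timezone.utc).isoformat(),
        "destination": datetime.fromtimestamp(dst_mtime, tz=timezone.utc).isoformat(),
        # Positive means the destination looks newer than the source.
        "difference_seconds": dst_mtime - src_mtime,
    }
```

A difference of a fraction of a second up to about two seconds would point to rounding; a difference very close to 3600 seconds would point to a one-hour clock offset instead.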

Tim Pietzcker On

The most probable reason is the following:

  • Your PC's hard drive uses NTFS.
  • Your pen drive uses FAT.
  • The last backup of file X was before a switch from daylight saving time to normal time.
  • You're running the next backup in normal time.

NTFS stores time stamps in UTC, while FAT stores them in local time, so only NTFS is DST-aware. That means that after the switch your files on the pen drive suddenly appear to be exactly one hour ahead of those on your hard drive.

So you should be checking for (and ignoring) timestamp differences of exactly one hour, or (better) format your pen drives in NTFS.

After all, your program can fail to back up a file that really is newer on your desktop than on your pen drive. This happens if the two modification times differ by less than an hour, the modification occurred before a switch to DST, and you run the backup after the switch.
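The first suggestion, ignoring differences of exactly one hour, could be sketched as follows. This is a workaround under stated assumptions, not a complete fix: the function name `destination_is_newer` is hypothetical, and the 2-second tolerance is an assumption meant to absorb the coarser time stamp resolution on FAT.

```python
ONE_HOUR = 3600.0


def destination_is_newer(src_mtime, dst_mtime, tolerance=2.0):
    """Decide whether the destination should be reported as newer.

    A destination that is ahead by (almost) nothing, or by one hour give or
    take `tolerance` seconds, is not reported, since that pattern is what a
    DST shift on a FAT drive would produce.
    (Hypothetical helper; both thresholds are assumptions.)
    """
    diff = dst_mtime - src_mtime
    if diff <= tolerance:
        return False  # source is newer, or the two are effectively equal
    if abs(diff - ONE_HOUR) <= tolerance:
        return False  # looks like a one-hour DST offset, so ignore it
    return True
```

Because this hides every difference of about one hour, it can also hide a real one, which is why formatting the pen drives with NTFS is the cleaner option.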