Currently I am using the following command to download an entire directory from an SFTP server to our own. The problem is that this directory gets larger every day and most of the files in it aren't necessary. So today I download the entire folder and then delete the unnecessary files.
But our customer is not enjoying this solution because it leads to heavy file transfers (which they are paying for).
The current version:
sshpass -p "$FTP_PASS" sftp -o StrictHostKeyChecking=no -o HostKeyAlgorithms=+ssh-dss [USERNAME]@[SFTP_DOMAIN].com <<EOF
get -r Export
EOF
I'd like to improve this script so that, instead of downloading the entire folder, it searches for files whose names start with specific strings and downloads only the latest version of each.
E.g. we are looking for the latest versions of the files whose names start with Subscribers_Extracts, Clicks, or Account_Extract, and we have the following list in the directory:
Subscribers_Extracts_1.csv
Subscribers_Extracts_2.csv
Subscribers_Extracts_3.csv
Subscribers_Extracts_4.csv (latest modified)
Clicks_ftyftyf.csv
Clicks_67546754675.csv (latest modified)
Clicks_783635ghgh.csv
Account_Extract_uguyfuyfuf.csv
Then the files we should download are:
Subscribers_Extracts_4.csv
Clicks_67546754675.csv
Account_Extract_uguyfuyfuf.csv
Note that we picked the files based on their modified date and not on the numbers in their names.
Also note that the last one, Account_Extract_uguyfuyfuf.csv, is the only file matching the third pattern, so we download it regardless of its modified date.
How can I save my customer a lot of data transfer?
rsync can sync files modified within a given amount of time, but for something more flexible you can check a file's last modification time (in seconds since the epoch) with:
date +%s -r filename
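As a rough sketch of this check and of the per-prefix loop described next, the snippet below picks the most recently modified file for a given filename root. It assumes the files are visible on a local filesystem (date -r cannot see files that live only on the SFTP server). The function names mtime and latest_for_prefix are hypothetical, and the stat fallbacks are an addition for systems where date -r does not accept a filename.

```shell
#!/usr/bin/env bash
# Sketch only: the helper names below are illustrative, not from the original post.

# Print a file's last-modification time in seconds since the epoch.
# Tries GNU date first, then GNU stat, then BSD/macOS stat.
mtime() {
  date +%s -r "$1" 2>/dev/null ||
    stat -c %Y "$1" 2>/dev/null ||
    stat -f %m "$1"
}

# Print the name of the most recently modified regular file in directory $1
# whose name starts with prefix $2. Prints nothing if no file matches.
latest_for_prefix() {
  local dir="$1" prefix="$2" newest="" newest_time=-1 f t
  for f in "$dir/$prefix"*; do
    [ -f "$f" ] || continue          # skips the unexpanded glob and non-files
    t=$(mtime "$f") || continue      # skip files whose time cannot be read
    if [ "$t" -gt "$newest_time" ]; then
      newest="$f"
      newest_time="$t"
    fi
  done
  if [ -n "$newest" ]; then
    basename "$newest"
  fi
}

# Example, assuming a local directory named Export:
#   for p in Subscribers_Extracts Clicks Account_Extract; do
#     latest_for_prefix Export "$p"
#   done
```

A prefix with a single matching file simply returns that file, which matches the Account_Extract case in the question.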
Then implement the check in a loop for each filename root (i.e. checking every f in Subscribers_Extract*), saving the filename for which the date is highest. However, date -r doesn't work on OS X systems.
EDITED
If you can ssh into the remote server and execute a bash script, this gives the name of the most recently modified "Subscribers_..." file, which you can then copy: