How to securely access local data in a remote SLURM job without exposing sensitive information?

I want to run a SLURM job on a remote cluster to analyze sensitive data stored on my local machine (/data/mydata.csv). The job is defined in job.sh, which executes an analysis script (r_script.R). To avoid uploading the sensitive data to the remote server, I attempted to load it directly from my local system by assembling an SSH command inside r_script.R,

ssh user@local-machine 'cat /data/mydata.csv'

which essentially opens a connection back to my local machine and streams the data directly into memory.

I set up SSH key pairs for passwordless access and wrote a Bash script (job.sh) that takes a -d flag specifying the data file location.
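For context, job.sh is in outline something like this (a simplified sketch; my real SBATCH directives and error handling are trimmed, and hostnames are placeholders):

#!/usr/bin/env bash
#SBATCH --job-name=analysis

# Parse -d user@host:/path/to/file; the location is handed to the R
# script, which assembles the `ssh ... 'cat ...'` command itself.
while getopts "d:" opt; do
    case "$opt" in
        d) DATA="$OPTARG" ;;
        *) echo "usage: $0 -d user@host:/path" >&2; exit 1 ;;
    esac
done

Rscript r_script.R "$DATA"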

This setup works fine when running the script directly,

$ ./job.sh -d user@local-machine:/data/mydata.csv

but fails under SLURM,

$ sbatch ./job.sh -d user@local-machine:/data/mydata.csv

because SSH keys (or an SSH agent) are not available in the SLURM job environment on the compute node: ssh exits with status 255, and running ssh-add -L inside the job confirms this, as it fails too.
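For debugging, adding checks like these at the top of job.sh shows what the batch environment actually provides (a sketch; the key path and hostname are placeholders):

# Is an agent socket visible on the compute node?
echo "SSH_AUTH_SOCK=${SSH_AUTH_SOCK:-unset}"
ssh-add -L || echo "no agent identities visible in this job"

# With an explicit key and prompting disabled, the connection must
# succeed non-interactively for this approach to work at all:
ssh -i "$HOME/.ssh/id_ed25519" -o BatchMode=yes user@local-machine true
echo "ssh exit status: $?"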

As an alternative, I considered using wget to fetch a password-protected .zip and passing the password via a -p flag for interactive input, but I realized this exposes the password in plain text, which is not secure.
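Even moving the password out of the command line only partially helps; something like the following (a sketch; the URL and paths are placeholders) still leaks the secret to the process list while unzip runs:

# Read the secret from a mode-600 file instead of a -p flag, so it does
# not end up in shell history or the sbatch command line.
PASSFILE="$HOME/.secrets/zip_pass"
wget -q https://example.org/mydata.zip -O mydata.zip
unzip -P "$(cat "$PASSFILE")" mydata.zip
# Caveat: the -P argument is still briefly visible in `ps` output.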

I am looking for a method, such as an sbatch flag or a secure file transfer mechanism usable from within a SLURM job, to get my sensitive local data to the job securely while avoiding direct uploads. How can this be achieved?
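For concreteness, the closest pattern I can think of is encrypting the file locally, copying only the ciphertext, and decrypting inside the job, roughly as sketched below (assumes gpg on both machines; hostnames and paths are placeholders, and passphrase handling inside the batch job, e.g. via --batch --passphrase-file, is exactly the part I do not know how to do safely):

# On my local machine: encrypt, then copy only the ciphertext.
$ gpg --symmetric --cipher-algo AES256 -o mydata.csv.gpg /data/mydata.csv
$ scp mydata.csv.gpg cluster.example.org:scratch/

# Inside job.sh: decrypt to a stream, so no plaintext file is written.
# (Assumes r_script.R can be pointed at a stream such as /dev/stdin.)
gpg --quiet --decrypt scratch/mydata.csv.gpg | Rscript r_script.R /dev/stdin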

I can share scripts for more context if needed. Any insights or suggestions are welcome.
