I tried the aes-gcm and sodiumoxide Rust crates, and in both cases I ran into the same issue: when I encrypt a file larger than the chunk buffer, I get an invalid ciphertext that cannot be decrypted.
Here are the example conditions I want this program to run under:
I need to encrypt a 50-gigabyte file but have only 2 gigabytes of RAM and 4 gigabytes of storage, so I can't load the entire file into memory and I can't write it to a new file in chunks. I need to somehow replace each chunk of the file with its encrypted counterpart.
Here's my code:
use clap::Parser;
use sodiumoxide::crypto::secretbox;
use sodiumoxide::crypto::secretbox::{Key, Nonce};
use sodiumoxide::hex;
use std::fs::OpenOptions;
use std::io::{self, Read, Seek, SeekFrom, Write};

const CHUNK_SIZE: usize = 1024; // Adjust chunk size as needed

#[derive(Parser)]
struct Options {
    #[clap(subcommand)]
    command: Command,
}

#[derive(Parser)]
enum Command {
    Encrypt,
    Decrypt,
}

fn main() -> io::Result<()> {
    let args = Options::parse();
    match args.command {
        Command::Encrypt => encrypt_file("poemfortest.txt"),
        Command::Decrypt => decrypt_file("poemfortest.txt"),
    }
}

fn encrypt_file(input_file_path: &str) -> io::Result<()> {
    let input_file = OpenOptions::new()
        .read(true)
        .write(true)
        .open(input_file_path)?;
    // Initialize sodiumoxide library
    sodiumoxide::init().expect("Failed to initialize sodiumoxide");
    let rawkey =
        hex::decode("ba72744932db553b55cb944aa8e5739cf23a7f668c32d164c51ce09d4d631160").unwrap();
    let rawnonce = hex::decode("48ccdb552de220538ac1667e7e99054fd39ad417c5d83c6e").unwrap();
    // Generate the key and nonce used for encryption
    let key = Key::from_slice(&rawkey).expect("Invalid key");
    let nonce = Nonce::from_slice(&rawnonce).expect("Invalid nonce");
    let mut input_file = input_file;
    let mut buffer = [0u8; CHUNK_SIZE];
    let mut offset: u64 = 0;
    loop {
        let bytes_read = input_file.read(&mut buffer)?;
        if bytes_read == 0 {
            break;
        }
        // Encrypt the chunk
        let encrypted_chunk = secretbox::seal(&buffer[..bytes_read], &nonce, &key);
        // Seek back to the beginning of the chunk
        input_file.seek(SeekFrom::Start(offset))?;
        // Write the encrypted chunk to the input file
        input_file.write_all(&encrypted_chunk)?;
        // Move the offset to the next chunk
        offset += bytes_read as u64;
    }
    println!("Encryption completed successfully.");
    Ok(())
}

fn decrypt_file(input_file_path: &str) -> io::Result<()> {
    let input_file = OpenOptions::new()
        .read(true)
        .write(true)
        .open(input_file_path)?;
    // Initialize sodiumoxide library
    sodiumoxide::init().expect("Failed to initialize sodiumoxide");
    let rawkey =
        hex::decode("ba72744932db553b55cb944aa8e5739cf23a7f668c32d164c51ce09d4d631160").unwrap();
    let rawnonce = hex::decode("48ccdb552de220538ac1667e7e99054fd39ad417c5d83c6e").unwrap();
    // Generate the key and nonce used for decryption
    let key = Key::from_slice(&rawkey).expect("Invalid key");
    let nonce = Nonce::from_slice(&rawnonce).expect("Invalid nonce");
    let mut input_file = input_file;
    let mut buffer = [0u8; CHUNK_SIZE];
    let mut offset: u64 = 0;
    loop {
        let bytes_read = input_file.read(&mut buffer)?;
        if bytes_read == 0 {
            break;
        }
        // Decrypt the chunk
        let decrypted_chunk = match secretbox::open(&buffer[..bytes_read], &nonce, &key) {
            Ok(decrypted) => decrypted,
            Err(_) => {
                eprintln!("Error: Failed to decrypt chunk.");
                return Ok(());
            }
        };
        // Seek back to the beginning of the chunk
        input_file.seek(SeekFrom::Start(offset))?;
        // Write the decrypted chunk to the input file
        input_file.write_all(&decrypted_chunk)?;
        // Move the offset to the next chunk
        offset += bytes_read as u64;
    }
    println!("Decryption completed successfully.");
    Ok(())
}
I expected it to work for files larger than the chunk buffer size, but it turns out my chunking implementation is broken.
I'd memory-map the file, encrypt it in place, and store the nonce and authentication tag at the end. In my opinion, GCM and ChaCha20/Poly1305 are both a bit awkward for this purpose. ChaCha20 is somewhat better because it offers a larger nonce and maximum message size, but it may be tricky to make the implementations behave correctly. Meanwhile, there is no problem with using HMAC with SHA-256 for authentication, and many processors have SHA-256 acceleration. If it is slow on a 64-bit system, try SHA-512; you might be surprised.
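To illustrate why in-place encryption needs a transform that keeps the length unchanged: `secretbox::seal` returns a ciphertext that is 16 bytes (the MAC) longer than its input, so writing it back over a 1024-byte chunk overwrites the start of the next, still unencrypted chunk. The question's loop also reuses one nonce for every chunk and relies on the file cursor left behind by the write. Below is a minimal sketch of the read, transform, seek back, write loop, using only the standard library. The names `transform_in_place`, `toy_xor` and `toy_keystream_byte` are hypothetical, and the XOR "keystream" is a deliberately insecure placeholder standing in for a real stream cipher such as AES-CTR or ChaCha20; it uses plain seeks rather than memory mapping.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical stand-in for a real stream cipher keystream.
/// NOT secure: it only illustrates a transform that keeps the length unchanged.
fn toy_keystream_byte(position: u64) -> u8 {
    (position.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 56) as u8
}

/// XORs `chunk` with the toy keystream, starting at absolute file `offset`.
/// Applying it twice at the same offset restores the original bytes.
fn toy_xor(offset: u64, chunk: &mut [u8]) {
    for (i, byte) in chunk.iter_mut().enumerate() {
        *byte ^= toy_keystream_byte(offset + i as u64);
    }
}

/// Rewrites `file` in place, one chunk at a time, and returns the number of
/// bytes processed. `transform` receives the chunk's absolute offset and must
/// not change the chunk length, otherwise later chunks would be clobbered.
fn transform_in_place<F, T>(file: &mut F, chunk_size: usize, mut transform: T) -> io::Result<u64>
where
    F: Read + Write + Seek,
    T: FnMut(u64, &mut [u8]),
{
    let mut buffer = vec![0u8; chunk_size];
    let mut offset: u64 = 0;
    loop {
        // Always read from the tracked offset rather than the implicit cursor.
        file.seek(SeekFrom::Start(offset))?;
        let bytes_read = file.read(&mut buffer)?;
        if bytes_read == 0 {
            break;
        }
        transform(offset, &mut buffer[..bytes_read]);
        // Seek back and overwrite exactly the bytes that were just read.
        file.seek(SeekFrom::Start(offset))?;
        file.write_all(&buffer[..bytes_read])?;
        offset += bytes_read as u64;
    }
    Ok(offset)
}
```

With a file opened for reading and writing, `transform_in_place(&mut file, 1024, toy_xor)` would process it chunk by chunk without ever growing it. In a real tool the closure would call a vetted stream cipher keyed with a fresh nonce per file, and an authentication tag over the ciphertext would be computed separately.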
The example was generated using ChatGPT, as I'm not a Rust programmer. I did review the code, though, and had it adjusted multiple times so that it follows my exact specification. I mainly kept it as generated to avoid syntax errors.
Users beware:
I almost forgot: to me the decryption function seemed too boring to add, but here it is anyway. Using its constants in the encryption function as well would probably be a good idea.
DO NOT perform operations on the file before the HMAC value has been verified and the nonce & HMAC have been removed! If this function fails for any reason you may need to perform some cleanup!
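As a rough illustration of the warning above, here is a minimal sketch of the "check first, touch later" step: it locates the trailer at the end of the file and compares tags without modifying anything. The layout (ciphertext, then nonce, then tag), the sizes (a 16-byte nonce and a 32-byte HMAC-SHA-256 tag), and the names `Trailer`, `read_trailer` and `tags_match` are all assumptions for illustration. Computing the HMAC itself is left out because it needs a crypto crate.

```rust
use std::io::{self, Read, Seek, SeekFrom};

// Assumed sizes: a 16-byte nonce and a 32-byte HMAC-SHA-256 tag.
const NONCE_LEN: usize = 16;
const TAG_LEN: usize = 32;
const TRAILER_LEN: u64 = (NONCE_LEN + TAG_LEN) as u64;

/// Assumed file layout: ciphertext || nonce || tag.
struct Trailer {
    ciphertext_len: u64,
    nonce: [u8; NONCE_LEN],
    tag: [u8; TAG_LEN],
}

/// Reads the trailer from the end of `file` without modifying the file.
fn read_trailer<F: Read + Seek>(file: &mut F) -> io::Result<Trailer> {
    let total_len = file.seek(SeekFrom::End(0))?;
    if total_len < TRAILER_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "file too short to contain nonce and tag",
        ));
    }
    let ciphertext_len = total_len - TRAILER_LEN;
    file.seek(SeekFrom::Start(ciphertext_len))?;
    let mut nonce = [0u8; NONCE_LEN];
    let mut tag = [0u8; TAG_LEN];
    file.read_exact(&mut nonce)?;
    file.read_exact(&mut tag)?;
    Ok(Trailer { ciphertext_len, nonce, tag })
}

/// Compares two tags without returning early on the first mismatching byte.
/// A vetted constant-time comparison from a crypto crate is preferable in practice.
fn tags_match(expected: &[u8], actual: &[u8]) -> bool {
    if expected.len() != actual.len() {
        return false;
    }
    let mut diff = 0u8;
    for (a, b) in expected.iter().zip(actual) {
        diff |= a ^ b;
    }
    diff == 0
}
```

Only after the HMAC recomputed over the first `ciphertext_len` bytes matches `tag` would the trailer be cut off, for example with `std::fs::File::set_len(ciphertext_len)`, and the in-place decryption started.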