I have a bunch of files with mixed IDs in a directory (Linux environment) that look like this:
SRR7821874_1.fastq.gz
SRR7821874_2.fastq.gz
SRR7821870_1.fastq.gz
SRR7821870_2.fastq.gz
I also have a two-column, tab-delimited file (called rename.tsv) that I want to use to replace the IDs:
Read Sample
SRR7821874 GSM3385663
SRR7821870 GSM3385659
In addition, I would like to change _1 to _S1_L001_R1_001 and _2 to _S1_L001_R2_001 in the file names at the same time, so the final result should look like this:
SRR7821874_1.fastq.gz --> GSM3385663_S1_L001_R1_001.fastq.gz
SRR7821874_2.fastq.gz --> GSM3385663_S1_L001_R2_001.fastq.gz
SRR7821870_1.fastq.gz --> GSM3385659_S1_L001_R1_001.fastq.gz
SRR7821870_2.fastq.gz --> GSM3385659_S1_L001_R2_001.fastq.gz
I tried the following loop (for the ID replacement part only) without success, apparently because mv needs the full file names:
while read -r Read Sample; do mv "$Read" "$Sample"; done < rename.tsv
You can try:
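A minimal sketch consistent with the explanation that follows. The variable names from, f and num come from that explanation; the function name rename_reads, the variable to, and the exact loop layout are assumptions made for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: rename <Read>_<n>.fastq.gz to
# <Sample>_S1_L001_R<n>_001.fastq.gz using a tab-delimited table
# whose first line is a header (Read, Sample).
rename_reads() {
  local table=${1:-rename.tsv}

  # tail -n +2 skips the header line; the loop runs in a subshell
  # because it is the right-hand side of a pipe.
  tail -n +2 "$table" | while read -r from to; do
    # With nullglob, a pattern that matches nothing expands to nothing,
    # so the for loop body is simply skipped.
    shopt -s nullglob
    for f in "${from}_"*.fastq.gz; do
      num=${f##*_}     # drop everything up to the last "_": 1.fastq.gz
      num=${num%%.*}   # drop everything from the first ".": 1
      mv -- "$f" "${to}_S1_L001_R${num}_001.fastq.gz"
    done
  done
}
```

Run it from the directory that holds the files, for example rename_reads rename.tsv. Putting echo in front of mv first gives a dry run that only prints the planned renames.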
We use tail to skip the header line, and we enable the nullglob bash option so that "${from}_"*.fastq.gz expands to the null string instead of the pattern itself if no file matches. As this is part of a pipe, the loop runs in a subshell, so the nullglob option is restored to its previous state at the end. "${f##*_}" and "${num%%.*}" are two of the numerous bash parameter expansions.

Note that you can use a more accurate pattern if needed. For instance, if you know that the number is always 1 or 2, you could replace "${from}_"*.fastq.gz with "${from}_"[12].fastq.gz. Or, if it is any one-digit number: "${from}_"[0-9].fastq.gz.