I have a file whose first line contains a series of tab-separated (\t) field names. I'm trying to walk through the lines and use some of the fields as variables for a program. The code I have so far is the following:
{
A=$(head -1 id_table.txt)
read;
while IFS='\t' read $A;
do
echo 'downloading '$SRA_Sample_s
echo $tissue_s
#out_dir=`echo $tissue_s | sed 's/ /./g'` #Replacing spaces by dots
#/soft/bio/sequence/sratoolkit-2.3.4-2/bin/fastq-dump.2.3.4 --split-3 --outdir $out_dir --ncbi_error_report $SRA_Sample_s
done
} <./id_table.txt
Output (Wrong):
downloading _s Inser
downloading provided> <no
downloading provided> <no
downloading provided> <no
It fails because it's not reading the fields correctly. Perhaps the <> characters are causing confusion? Different files have the columns in a different order, and some columns are missing in some files. I'm stuck here.
The file looks like this:
BioSample_s MBases_l MBytes_l Run_s SRA_Sample_s Sample_Name_s age_s breed_s sex_s Assay_Type_s AssemblyName_s BioProject_s BioSampleModel_s Center_Name_s Consent_s InsertSize_l Library_Name_s Platform_s SRA_Study_s biomaterial_provider_s g1k_analysis_group_s g1k_pop_code_s source_s tissue_s
SAMN02777951 4698 3249 SRR1287653 SRS607026 SL01 19 SL01 female RNA-Seq <not provided> PRJNA247712 Model organism or animal SICHUAN UNIVERSITY public 200 <not provided> ILLUMINA SRP041998 Chengdu Research Base of Giant Panda Breeding <not provided> <not provided> <not provided> blood
SAMN02777952 4451 3063 SRR1287654 SRS607028 XB01 12 XB01 male RNA-Seq <not provided> PRJNA247712 Model organism or animal SICHUAN UNIVERSITY public 200 <not provided> ILLUMINA SRP041998 Chengdu Research Base of Giant Panda Breeding <not provided> <not provided> <not provided> blood
SAMN02777953 4553 3139 SRR1287655 SRS607025 XB02 6 XB02 female RNA-Seq <not provided> PRJNA247712 Model organism or animal SICHUAN UNIVERSITY public 200 <not provided> ILLUMINA SRP041998 Chengdu Research Base of Giant Panda Breeding <not provided> <not provided> <not provided> blood
You may find an awk script more robust and less cumbersome to use than a shell loop:
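The original script is not preserved here, so what follows is only a minimal sketch of the idea: read the header row once, remember which column each name is in, then refer to fields by name. That copes with columns being ordered differently between files. The function name list_samples and the output format are assumptions made for illustration; the column names SRA_Sample_s and tissue_s come from the question.

```shell
# list_samples FILE
# Hypothetical helper: prints one line per data row of a tab-separated table,
# looking fields up by header name instead of by position.
list_samples() {
    awk -F'\t' '
    NR == 1 {
        # Header row: remember the column number of every field name.
        for (i = 1; i <= NF; i++) col[$i] = i
        # Stop early if this file lacks a column we need.
        if (!("SRA_Sample_s" in col) || !("tissue_s" in col)) {
            print "required column missing in " FILENAME > "/dev/stderr"
            exit 1
        }
        next
    }
    {
        sample  = $(col["SRA_Sample_s"])
        tissue  = $(col["tissue_s"])
        out_dir = tissue
        gsub(/ /, ".", out_dir)    # replace spaces with dots, like the sed call
        print "downloading " sample " (tissue: " tissue ", outdir: " out_dir ")"
    }
    ' "$1"
}
```

Called as list_samples id_table.txt, this only reports what would be downloaded. Because the field separator is a tab alone, values such as <not provided> that contain spaces stay in one field.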
If you were only doing text processing I'd say you should DEFINITELY avoid the shell loop; the only reason to consider one here is that you're calling an external command for each line.
Alternatively, consider using awk for the text processing and then piping to a shell loop for the external command execution:
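Again the original code is not preserved, so this is a sketch under stated assumptions rather than the exact script. awk picks out the two fields by header name and emits them tab-separated; the shell loop only reads those two values and runs the command. The function name download_samples and the FASTQ_DUMP override variable are hypothetical; the default path and the flags are copied from the commented-out line in the question. Note that IFS='\t' in the question sets IFS to the two characters backslash and t rather than to a tab, which is why a real tab is built with printf here.

```shell
# download_samples FILE
# Hypothetical helper: awk does the text processing, the shell loop runs the tool.
# Set FASTQ_DUMP=echo for a dry run that just prints the arguments.
download_samples() {
    tab=$(printf '\t')    # a literal tab character, portable across sh and bash
    awk -F'\t' -v OFS='\t' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
        out_dir = $(col["tissue_s"])
        gsub(/ /, ".", out_dir)    # replace spaces with dots
        print $(col["SRA_Sample_s"]), out_dir
    }
    ' "$1" |
    while IFS="$tab" read -r sample out_dir; do
        echo "downloading $sample"
        # stdin is redirected so the command cannot swallow the remaining rows
        "${FASTQ_DUMP:-/soft/bio/sequence/sratoolkit-2.3.4-2/bin/fastq-dump.2.3.4}" \
            --split-3 --outdir "$out_dir" --ncbi_error_report "$sample" < /dev/null
    done
}
```

A dry run would be FASTQ_DUMP=echo download_samples id_table.txt. This sketch assumes both columns are present and non-empty in every row; an empty field would shift the values that read sees, because tab counts as IFS whitespace.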