My data is a munged GWAS summary statistics file in which the first field contains the rs-number and the following fields contain data such as the alleles and z-values. I want to filter out every row that contains a valid rs-number but no statistical data (i.e. every following column is empty).
I work in the Windows console on a remote cluster with a gzip-compressed file (.gz). My command is:
#filter out empty columns with valid rs-number
zcat ${targetDir}/data.sumstats.gz | awk -F '\t' '$1 ~ /^rs[0-9]+$/ && NF > 1' | grep -vE '^\s*\t*$' > ${targetDir}/data.sumstats.filtered.txt
This still returns an output file containing both complete rows and rows that are empty except for the rs ID, though.
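Here is a minimal reproduction on made-up rows (the rs IDs and values are invented, not taken from my file). It suggests that awk counts empty tab-separated fields toward NF, so `NF > 1` may still be true for a row like `rs2<TAB><TAB><TAB>`. The second awk call is only a sketch of a possible alternative check, assuming "no data" means every field after the first is an empty string:

```shell
# Made-up sample: one complete row, one row with an rs ID followed only by
# empty tab-separated fields, and one row with a bare rs ID.
sample=$(printf 'rs1\tA\tG\t0.5\nrs2\t\t\t\nrs3\n')

# Original condition: the rs2 row survives because its empty fields
# still count toward NF (NF is 4 for that row).
kept_original=$(printf '%s\n' "$sample" | awk -F '\t' '$1 ~ /^rs[0-9]+$/ && NF > 1')

# Sketch: keep a row only if at least one field after the first is non-empty.
kept_checked=$(printf '%s\n' "$sample" | awk -F '\t' '
  $1 ~ /^rs[0-9]+$/ {
    for (i = 2; i <= NF; i++)
      if ($i != "") { print; next }
  }')

printf 'NF > 1 keeps:\n%s\n' "$kept_original"
printf 'field check keeps:\n%s\n' "$kept_checked"
```

If the real file uses placeholders such as NA, spaces, or Windows line endings (a trailing carriage return in the last field), those fields would count as non-empty here and the comparison would need to be adapted.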
Do you have an idea why that could be? Thanks for your help!