Compare two files and append the values, leaving the mismatches as they are in the output file


I'm trying to match two files, file1.txt (50,000 lines) and file2.txt (55,000 lines). I want to compare file2 against file1: where an id from file2 also appears in file1, take the values of columns 2 and 3 from file1, and leave the non-matching lines as they are. The output file must contain all the ids from file2, i.e. it should have 55,000 lines. Note: not all the ids in file1 are present in file2, so the actual number of matches could be less than 50,000.

file1.txt

ab1 12 345  
ab2 9 456  
gh67 6 987  

file2.txt

ab2 0 0  
ab1 0 345  
nh7 0 0  
gh67 6 987  

Output

ab2 9 456  
ab1 12 345  
nh7 0 0  
gh67 6 987 

This is what I tried, but it only prints the matches (so instead of 55,000 lines I have 49,000 lines in my output file):

awk "NR==FNR {f[$1]=$0;next}$1 in f{print f[$1],$0}" file1.txt file2.txt >output.txt

There is 1 answer

bkmoney On BEST ANSWER

This awk script will work

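# First file (file1.txt): remember each whole line, keyed by its id
NR == FNR {
    a[$1] = $0
    next
}
# Second file (file2.txt), id found in file1: print the id with columns 2 and 3 from file1
$1 in a {
    split(a[$1], b)
    print $1, b[2], b[3]
}
# Second file, id not found in file1: print the line unchanged
!($1 in a)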

If you save this as a.awk and run

awk -f a.awk file1.txt file2.txt

This will output

ab2 9 456
ab1 12 345
nh7 0 0
gh67 6 987
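
For comparison, the same idea can be written as a one-liner. This is only a sketch, assuming whitespace-separated columns and ids that are unique in file1.txt; the array names c2 and c3 are made up for illustration. The attempt in the question prints only inside the `$1 in f` block, which is why unmatched ids disappear. Here the matched lines get columns 2 and 3 overwritten, and a trailing always-true pattern prints every line of file2.txt, matched or not.

```shell
#!/bin/sh
# Recreate the sample inputs from the question.
printf '%s\n' 'ab1 12 345' 'ab2 9 456' 'gh67 6 987' > file1.txt
printf '%s\n' 'ab2 0 0' 'ab1 0 345' 'nh7 0 0' 'gh67 6 987' > file2.txt

# While reading file1.txt (NR == FNR): store columns 2 and 3, keyed by id.
# While reading file2.txt: if the id was seen in file1.txt, overwrite
# columns 2 and 3 with the stored values. The final "1" is an always-true
# pattern, so every line of file2.txt is printed, changed or not.
awk 'NR == FNR { c2[$1] = $2; c3[$1] = $3; next }
     $1 in c2  { $2 = c2[$1]; $3 = c3[$1] }
     1' file1.txt file2.txt > output.txt

cat output.txt
```

Since every line of file2.txt is printed exactly once, output.txt should end up with the same number of lines as file2.txt. Lines whose fields are overwritten are rebuilt with single spaces between columns; untouched lines keep their original spacing.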