Exclude duplicate lines from two different files and generate a new file?

I have a few wordlist files, each containing one word per line.

Now I want to generate a new file as follows:

Compare the first and second files, and put the words from the second file that are not found in the first into a third file.

File_1

word1

word2

word3

word4

word5

File_2

word1

word3

word5

word7

word9

I tried a few things:

cat file.1 file.2 | sort -n | uniq -u | cat > file.3

But I get:

File_3

word2

word4

word7

word9

I also tried:

cat file.1 file.2 | sort -n | uniq -d | cat > file.3

but again I get:

File_3

word1

word3

word5

I also tried with echo:

echo $(cat file.{1,2} | sort -n | uniq -u) > file.3

But I get the same words and, what is worse, they are all printed on one line.

The final File_3 should contain:

word7

word9

Because these words are not found in the first file.

Any idea how to accomplish this?

There are 4 answers

3
scriptmonster On BEST ANSWER

If your files are not big, you can simply cat the first file twice:

cat file.1 file.1 file.2 | sort | uniq -u > file.3

but this is expensive for big files.
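
Why the doubling works can be seen in a small sketch. It assumes mktemp is available and recreates the sample lists from the question in a scratch directory (the scratch directory is only for illustration). Every word of file.1 occurs at least twice in the combined stream, so uniq -u, which prints only lines that occur exactly once, can never print it. One caveat: a word that is repeated inside file.2 itself would also be dropped.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample word lists from the question.
printf '%s\n' word1 word2 word3 word4 word5 > file.1
printf '%s\n' word1 word3 word5 word7 word9 > file.2

# Listing file.1 twice guarantees each of its words occurs at least twice,
# so uniq -u (print only lines that occur exactly once) drops all of them.
# sort is needed because uniq only compares adjacent lines.
cat file.1 file.1 file.2 | sort | uniq -u > file.3

cat file.3
```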

Alternatively, you can achieve this with grep (thanks to @tripleee):

grep -F -x -v -f file.1 file.2 > file.3
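
For readers unfamiliar with the flags, here is the same grep command annotated, as a minimal sketch run against the sample lists from the question. The scratch directory and the use of mktemp are assumptions made only so the example is self-contained.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample word lists from the question.
printf '%s\n' word1 word2 word3 word4 word5 > file.1
printf '%s\n' word1 word3 word5 word7 word9 > file.2

# -f file.1 : read the patterns (one per line) from file.1
# -F        : treat each pattern as a fixed string, not a regex
# -x        : a pattern must match the whole line, not just part of it
# -v        : invert the match, i.e. print lines that match no pattern
grep -F -x -v -f file.1 file.2 > file.3

cat file.3
```

Unlike the sort-based pipeline, this keeps the lines of file.2 in their original order and does not require sorted input.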
0
user000001 On

Maybe use awk:

$ awk 'NR==FNR{a[$0];next}!($0 in a)' file_1 file_2 > file_3
$ cat file_3
word7
word9
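
The one-liner above is dense, so here is the same program spread over several lines with comments, as a sketch against the sample lists from the question. The array name seen and the scratch directory are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample word lists from the question.
printf '%s\n' word1 word2 word3 word4 word5 > file_1
printf '%s\n' word1 word3 word5 word7 word9 > file_2

awk '
    # NR counts lines across all inputs, FNR restarts for each file,
    # so NR == FNR is true only while reading the first file.
    NR == FNR { seen[$0]; next }   # remember every line of file_1
    !($0 in seen)                  # file_2: print lines not remembered
' file_1 file_2 > file_3

cat file_3
```

Note that the NR == FNR idiom misbehaves if the first file is empty, because the condition then stays true while the second file is being read.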
0
John Bartholomew On

You can use the comm program to do this:

comm -13 <(sort file_1) <(sort file_2)
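
The <(...) process substitution in that command needs a shell such as bash, ksh, or zsh. A roughly equivalent sketch that works in plain sh uses intermediate sorted files instead; the .sorted file names and the scratch directory are assumptions for illustration, and the result is written to file_3 as the question asks.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the sample word lists from the question.
printf '%s\n' word1 word2 word3 word4 word5 > file_1
printf '%s\n' word1 word3 word5 word7 word9 > file_2

# comm requires both inputs to be sorted in the same collation order.
sort file_1 > file_1.sorted
sort file_2 > file_2.sorted

# -1 suppresses lines found only in the first file,
# -3 suppresses lines common to both,
# which leaves the lines found only in the second file.
comm -13 file_1.sorted file_2.sorted > file_3

cat file_3
```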
0
tripleee On

Try this.

grep -F -x -v -f file.1 file.2 >file.3