Remove duplicates ignoring specific columns


I want to remove all duplicate lines from a file, but ignoring the first 2 columns, i.e. not comparing those columns.

This is my example input:

111  06:22  apples, bananas and pears
112  06:28  bananas
113  07:07  apples, bananas and pears
114  07:23  apples and bananas
115  08:01  bananas and pears
116  08:23  pears
117  09:22  apples, bananas and pears
118  12:23  apples and bananas

I want this output:

111  06:22  apples, bananas and pears
112  06:28  bananas
114  07:23  apples and bananas
115  08:01  bananas and pears
116  08:23  pears

I've tried this below, but it only compares the third column and ignores the rest of the line:

awk '!seen[$3]++' sample.txt

There are 2 answers

konsolebox (BEST ANSWER)

Store $0 in a temporary variable, set $1 and $2 to empty strings, then use the newly composed $0 as the key:

awk '{ t = $0; $1 = $2 = "" } !seen[$0]++ { print t }' sample.txt
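
One caveat: assigning to $1 and $2 makes awk rebuild $0, joining all fields with OFS, so the key loses the original spacing (usually harmless here). If you prefer to leave $0 untouched, a variant of the same idea (a minimal sketch, not part of the answer) builds the key from fields 3 through NF in a loop:

awk '{ key = ""; for (i = 3; i <= NF; i++) key = key FS $i } !seen[key]++' sample.txt

Since $0 is never modified, the default print emits each kept line exactly as it appeared in the input.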
Daweo

You might use the substr string function to get the desired part of the line for comparison. Let file.txt contain:

111  06:22  apples, bananas and pears
112  06:28  bananas
113  07:07  apples, bananas and pears
114  07:23  apples and bananas
115  08:01  bananas and pears
116  08:23  pears
117  09:22  apples, bananas and pears
118  12:23  apples and bananas

then

awk '!arr[substr($0,11)]++' file.txt

gives output

111  06:22  apples, bananas and pears
112  06:28  bananas
114  07:23  apples and bananas
115  08:01  bananas and pears
116  08:23  pears

Explanation: keep lines that are unique in the substring of the whole line ($0) starting at the 11th character, i.e. everything after the first two fixed-width columns.

(tested in GNU Awk 5.0.1)
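
Note that substr($0,11) assumes the first two columns always occupy exactly 10 characters. If the column widths can vary, a sketch of an alternative (my own assumption, not from the answer) strips the first two whitespace-separated fields from a copy of the line using the three-argument sub():

awk '{ key = $0; sub(/^[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+/, "", key) } !seen[key]++' file.txt

Here key starts as the whole line and sub() deletes the leading two fields plus their separators, so the comparison works regardless of how the columns are padded, while the original line is still printed unchanged.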