How to find duplicate lines in a file?

I have an input file with the following data:

line1
line2
line3
begin
line5
line6
line7
end
line9
line1
line3

I am trying to find all the duplicate lines. I tried

sort filename | uniq -c  

but it does not seem to work for me. It gives me:

  1 begin
  1 end
  1 line1
  1 line1
  1 line2
  1 line3
  1 line3
  1 line5
  1 line6
  1 line7
  1 line9

The question may seem like a duplicate of "Find duplicate lines in a file and count how many time each line was duplicated?", but the nature of the input data is different.

Please suggest.

There are 4 answers

Angel Bochev (BEST ANSWER)

Use this:

    sort filename | uniq -d

See man uniq for details.
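
On the sample input above, this prints only the lines that occur more than once, assuming the repeated lines are byte-identical (stray trailing whitespace would also explain the counts of 1 you got from uniq -c):

    line1
    line3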

RARE Kpop Manifesto

You'll have to modify the standard de-dupe code just a tiny bit to account for this.

If you want one copy of each duplicated line, it's very much the same idea:

  {m,g}awk 'NF~ __[$_]++' FS='^$'   # print a line on its 2nd occurrence only
  {m,g}awk '__[$_]++==!_'           # same idea: !_ is 1, so print when the prior count is exactly 1

If you want every copy of the duplicates printed, then the first time the condition yields true, print two copies (one for the first occurrence, one for the current line), and keep printing each further match along the way.
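
A minimal sketch of that idea in plain awk (the array name seen and the filename are placeholders):

    awk '
        { c = seen[$0]++ }      # c = how many times this line has appeared before
        c == 1 { print }        # 2nd occurrence: print once for the 1st copy ...
        c >= 1 { print }        # ... and once for this and every later occurrence
    ' filename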

Usually it's way faster to de-dupe first, then sort, instead of the other way around.
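
For example, this sketch sorts just the duplicated lines instead of the whole file, and produces the same sorted list of duplicated lines as sort filename | uniq -d:

    awk '__[$0]++ == 1' filename | sort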

wsdzbm

Try

    sort -u file

or

    awk '!a[$0]++' file
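
Note that both commands print each distinct line once, i.e. they remove duplicates rather than report them; the awk version also preserves the original line order, while sort -u does not.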

slyfox1186

Pass the file name as the first argument to this script.

Example:

    find-dupes.sh name.ext

    #!/usr/bin/env bash

    # Check if a file name is provided
    if [ $# -eq 0 ]; then
        echo "Usage: $0 [file]"
        exit 1
    fi

    # File to check for duplicates
    file="$1"

    # Check if the file exists
    if [ ! -f "$file" ]; then
        echo "Error: File not found."
        exit 1
    fi

    # Finding duplicates
    duplicates=$(sort "$file" | uniq -d)

    if [ -z "$duplicates" ]; then
        printf "\n%s\n" "No duplicates were found in $file."
    else
        printf "\n%s\n\n" "Duplicate lines in $file:"
        echo "$duplicates"
    fi
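
A sample run on the question's input, assuming the repeated lines are byte-identical and the script is saved as find-dupes.sh (input.txt is a placeholder name):

    $ ./find-dupes.sh input.txt

    Duplicate lines in input.txt:

    line1
    line3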