Trying to remove duplicate words in a string using sed on macOS

Question

183 views · Asked by Rafa_izu on 06 December 2024 at 13:35

I'm trying to remove duplicate words in a string using the sed that ships with macOS, with the following command:

sed -r 's/^([A-Za-z0-9_]+) \1$/\1/' <<< 'The best of of The United Kingdom'

But it only returns

The best of of United Kingdom

What am I missing? Could you give me a hand, please?

There are 4 answers

**dan** · Answer 1 · 2022-01-02T06:53:54+00:00

You are unnecessarily anchoring the regex at the start and end of the line, so it can only match a line that consists of exactly one repeated word. Remove ^ and $. Change -r to the POSIX -E and it will work on BSD/Mac sed. You also need the g flag to replace more than one repeated word per line.

sed -E 's/([A-Za-z0-9_]+) \1/\1/g'
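To make the effect of the anchors concrete, here is a minimal sketch that runs both patterns on sample input. The function name dedupe_adjacent is only an illustrative assumption. The sed expressions are the ones from the question and from this answer, and the input is piped with printf instead of a here-string so the snippet also runs in plain sh.

```shell
# dedupe_adjacent is a hypothetical helper name used only for this sketch.
# It applies the unanchored expression from the answer above to one string.
dedupe_adjacent() {
  printf '%s\n' "$1" | sed -E 's/([A-Za-z0-9_]+) \1/\1/g'
}

# Unanchored: the repeated word is found in the middle of the line.
dedupe_adjacent 'The best of of The United Kingdom'
# prints: The best of The United Kingdom

# Anchored (the pattern from the question): matches only when the whole
# line is exactly "word word", so it works here ...
printf '%s\n' 'of of' | sed -E 's/^([A-Za-z0-9_]+) \1$/\1/'
# prints: of

# ... but leaves the original sentence untouched.
printf '%s\n' 'The best of of The United Kingdom' |
  sed -E 's/^([A-Za-z0-9_]+) \1$/\1/'
# prints: The best of of The United Kingdom
```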

**sseLtaH** · Answer 2 · 2022-01-02T01:10:23+00:00

You can try this sed command:

$ sed 's/\([^ ]* \)\1\+/\1/' input_file
The best of The United Kingdom

Your original code had unneeded anchors (^ and $). Here is a fixed version:

$ sed -r 's/([A-Za-z0-9_]* )\1+/\1/' <<< 'The best of of The United Kingdom'
The best of The United Kingdom

**Ryszard Czech** · Answer 3 · 2022-01-02T22:20:33+00:00

Use

sed -E 's/[[:<:]]([[:alnum:]_]+)([[:space:]]+\1)+[[:>:]]/\1/'

EXPLANATION

--------------------------------------------------------------------------------
  [[:<:]]                  the boundary between a non-word char or
                           start of string and a word char
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [[:alnum:]_]+            any character of: letters and digits,
                             '_' (1 or more times (matching the most
                             amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (                        group and capture to \2 (1 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    [[:space:]]+             any character of: whitespace characters
                             (like \s) (1 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \1                       what was matched by capture \1
--------------------------------------------------------------------------------
  )+                       end of \2
--------------------------------------------------------------------------------
  [[:>:]]                  the boundary between a word char (\w) and
                           something that is not a word char
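The word boundaries matter because a pattern without them, such as the one in Answer 1, can match inside words: in 'this is a test' the "is" at the end of "this" pairs up with the following "is". As far as I know, [[:<:]] and [[:>:]] are specific to BSD-style regex libraries and GNU sed does not accept them (it uses \< and \> instead). The sketch below is one possible way to get the same effect with constructs that both implementations should accept under -E. It is an assumption of mine, not something taken from the answers, and dedupe_words is a hypothetical helper name.

```shell
# dedupe_words is a hypothetical helper name, not part of any answer above.
# The leading and trailing groups stand in for word boundaries:
#   \1 = start of line or a non-word character
#   \2 = the word, \3 = one or more whitespace-separated repeats of it
#   \4 = a non-word character or end of line
# The :a / ta loop reruns the substitution until nothing changes, because a
# delimiter consumed by one match cannot also start the next match under g.
dedupe_words() {
  printf '%s\n' "$1" |
    sed -E -e ':a' \
      -e 's/(^|[^[:alnum:]_])([[:alnum:]_]+)([[:space:]]+\2)+($|[^[:alnum:]_])/\1\2\4/' \
      -e 'ta'
}

# Without boundaries, a word ending is mistaken for a repeated word:
printf '%s\n' 'this is a test' | sed -E 's/([A-Za-z0-9_]+) \1/\1/g'
# prints: this a test

dedupe_words 'this is a test'                      # prints: this is a test
dedupe_words 'The best of of The United Kingdom'   # prints: The best of The United Kingdom
dedupe_words 'a a b b'                             # prints: a b
```

Like the expression in this answer, the sketch is case-sensitive and only collapses repeats that are directly adjacent.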

**Rafa_izu** · Answer 4 · 2022-01-02T01:48:11+00:00

I installed gnu-sed. Problem solved.
