How to migrate this regex to JavaScript

Question

77 views Asked by Samul At 30 March 2024 at 01:01

I have this regex that works perfectly with PHP:

$que = preg_replace("/(?<=^| )([a-z])\\1*(?= |$)\K|([a-z])(?=\\2)/","$3",$que);

This regex removes repeated characters inside words: for example, axxd becomes axd. A word made up entirely of one repeated letter is left alone, so xxx stays xxx. The problem I am facing is that it does not work in JS (I think the lookbehind does not work in JS).

More examples:

the string aaa baaab xxx would become aaa bab xxx
the string ahhj aaab cc iiik would become ahj ab cc ik

Do you have a solution for this that is reasonably efficient? I will probably run this regex on strings of about 1,000 characters, so if the regex is inefficient, the browser may freeze.

There are 3 answers

**Nick** · Answer 1 · 2024-03-30T01:52:49+00:00

The lookbehind (a positive one, in your pattern) is not likely to be your issue, as lookbehind assertions are supported in almost all current browser releases. However, JavaScript regex doesn't recognise \K as a meta sequence; it treats it as a literal K. You can work around that using this regex:

\b([a-z])\1+(?!\1|\b)|(?<=([a-z]))((?!\2)[a-z])\3+

This matches either \b([a-z])\1+(?!\1|\b):

\b : word boundary
([a-z]) : a letter, captured in group 1
\1+ : one or more repetitions of the captured letter
(?!\1|\b) : lookahead assertion that the next location is not another repetition of the captured letter or a word boundary

or (?<=([a-z]))((?!\2)[a-z])\3+:

(?<=([a-z])) : a positive lookbehind for a letter, captured in group 2
((?!\2)[a-z]) : another letter which is not the same as the previously captured letter, captured in group 3
\3+ : one or more repetitions of the captured letter

The first part of the regex will capture repeated letters at the beginning of a word; the second part captures repeated letters in the middle or at the end of a word.

You can then replace with $1$3 which will replace any repeated letter matched by the regex with just a single copy of itself.

In JavaScript:

console.log('aaa baaab xxx fjjj'.replace(/\b([a-z])\1+(?!\1|\b)|(?<=([a-z]))((?!\2)[a-z])\3+/g, '$1$3'))
console.log('ahhj aaab cc iiik'.replace(/\b([a-z])\1+(?!\1|\b)|(?<=([a-z]))((?!\2)[a-z])\3+/g, '$1$3'))
console.log('bbb ahhj aaab cc iiik xxx fjjj baaaaaab yyyaaa'.replace(/\b([a-z])\1+(?!\1|\b)|(?<=([a-z]))((?!\2)[a-z])\3+/g, '$1$3'))
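The point about \K at the top of this answer can be checked directly. The snippet below is only a small sketch: the test strings and variable names are made up for illustration. Without the u flag, JavaScript treats \K as an identity escape for the letter K; with the u flag, the same pattern is rejected when it is compiled.

```javascript
// Without the u flag, \K is an identity escape: it matches a literal "K".
const legacy = /a\Kb/;
const matchesLiteralK = legacy.test("aKb"); // true
const matchesPlainAb = legacy.test("ab");   // false, nothing is "reset" as in PCRE

// With the u flag, unknown escapes such as \K are a syntax error.
let unicodeModeError = null;
try {
  new RegExp("a\\Kb", "u");
} catch (err) {
  unicodeModeError = err;
}

console.log(matchesLiteralK, matchesPlainAb, unicodeModeError instanceof SyntaxError);
// true false true
```

So a PHP pattern that relies on \K will not fail loudly in a plain JavaScript regex literal; it will quietly look for the letter K instead.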

**Casimir et Hippolyte** · Answer 2 · 2024-03-30T03:10:15+00:00

let result = str.replace(/\b((.)\2+)\b|(.)\3+/g, '$1$3');
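No explanation accompanies this pattern, so here is one reading of it, offered as a sketch rather than the author's own description. The first alternative, \b((.)\2+)\b, matches a run of one repeated character that has a word boundary on both sides and puts the whole run back via $1. The second, (.)\3+, matches any other run and keeps a single copy via $3. Because . matches any character except line terminators, digits and punctuation are collapsed as well. The helper name collapseRepeats and the sample inputs are assumptions for illustration.

```javascript
// Hypothetical helper wrapping the pattern from this answer.
const pattern = /\b((.)\2+)\b|(.)\3+/g;

function collapseRepeats(str) {
  // $1 is set only for a whole "word" of one repeated character (kept as is);
  // $3 is set only for other runs (reduced to a single character).
  // The group that did not take part in the match expands to an empty string.
  return str.replace(pattern, '$1$3');
}

console.log(collapseRepeats('aaa baaab xxx'));     // aaa bab xxx
console.log(collapseRepeats('ahhj aaab cc iiik')); // ahj ab cc ik
console.log(collapseRepeats('a11b'));              // a1b (digits are collapsed too)
```

If only lowercase letters should be affected, as in the original PHP pattern, replacing each . with [a-z] would be one way to narrow it.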

**The fourth bird** · Answer 3 · 2024-03-30T10:45:33+00:00

You could rewrite (?<=^| ) as (?:\s|^) and keep that match in the replacement instead of using \K, which is not supported in JavaScript.

You could write the pattern as:

((?:\s|^)([a-z])\2+)(?=\s|$)|([a-z])(?=\3)

The pattern matches:

( Capture group 1
- (?:\s|^) Match either a whitespace char or assert the start of the string
- ([a-z])\2+ Capture a single char a-z in group 2 and repeat matching that same char 1 or more times
) Close group 1
(?=\s|$) Positive lookahead, assert either a whitespace char or the end of the string to the right
| Or
([a-z])(?=\3) Capture a single char a-z in group 3 while asserting the same character directly to the right

const regex = /((?:\s|^)([a-z])\2+)(?=\s|$)|([a-z])(?=\3)/g;

[
  "aaa baaab xxx",
  "ahhj aaab cc iiik",
  "#$aa$aa#aaa bbb"
].forEach(s =>
  console.log(s.replace(regex, "$1"))
)
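To make the role of "$1" more visible, the same replacement can be written with a replacer callback. This is only an illustrative sketch: the kept and removed arrays are bookkeeping added here, not part of the solution. When the first alternative matches, group 1 holds the leading whitespace plus the whole word, and it is put back unchanged; when the second alternative matches, group 1 is undefined and the matched letter is dropped.

```javascript
const regex = /((?:\s|^)([a-z])\2+)(?=\s|$)|([a-z])(?=\3)/g;

const kept = [];    // whole-word runs that are put back via group 1
const removed = []; // single letters dropped because the same letter follows

const result = "aaa baaab xxx".replace(regex, (match, g1) => {
  if (g1 !== undefined) {
    kept.push(g1);
    return g1; // same effect as "$1" when group 1 took part in the match
  }
  removed.push(match);
  return ""; // same effect as "$1" when group 1 did not take part
});

console.log(result);  // aaa bab xxx
console.log(kept);    // [ 'aaa', ' xxx' ]
console.log(removed); // [ 'a', 'a' ]
```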

If you want to match any letter:

const regex = /((?: |^)([\p{L}\p{M}])\2+(?= |$))|([\p{L}\p{M}])(?=\3)/gu;
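A quick usage sketch for the Unicode-aware pattern follows. The sample strings and the name anyLetter are assumptions for illustration, and the accented letter is written as a \u00e9 escape so that it is a single precomposed code point. Note that this variant uses a literal space rather than \s as the word separator.

```javascript
// Same pattern as above, under a different (hypothetical) name.
const anyLetter = /((?: |^)([\p{L}\p{M}])\2+(?= |$))|([\p{L}\p{M}])(?=\3)/gu;

// "ééé bééb": the first word is a single repeated letter and is kept as is.
const input = "\u00e9\u00e9\u00e9 b\u00e9\u00e9b";
const output = input.replace(anyLetter, "$1");

console.log(output);                                   // ééé béb
console.log("aaa baaab xxx".replace(anyLetter, "$1")); // aaa bab xxx
```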
