Regular expression does not match simple case


What I currently have:

^((([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)*\.)+$

Rules:

  1. The first char must either be a "." or [a-zA-Z] (it can only be "." if the string is of length 1)
  2. It must end in a "."
  3. Before any "." there can only be [a-zA-Z0-9]
  4. Other than a-zA-Z0-9 and ".", the only other allowed character is "-" (hyphen)
  5. After any "." there cannot be a "-"

examples that should match: . a. a-9. abc. abc.a-c.abc.

that should not match: -. -a. a-. a abc.-bc ab-.abc abc.a-@c ..

Currently it does not match "a.", which is one of the simplest cases. Do you have any suggestions on how to fix it?


There are 3 answers

2
The fourth bird On BEST ANSWER

As an alternative solution without lookarounds, you can start by matching a single character from a-zA-Z.

Then use an optional group that matches zero or more characters from the class that includes the hyphen and ends with one character from the class without the hyphen. This prevents a hyphen from appearing directly before a dot, both in the repeated part and at the end of the string.

With case-insensitive matching enabled:

^(?:[a-z](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9-]*[a-z0-9])*)?\.$

In parts

  • ^ Start of string
  • (?: Non capture group
    • [a-z] Match a single char a-z
    • (?: Non capture group
      • [a-z0-9-]* Match 0+ times any of a-z0-9-
      • [a-z0-9] End with a-z0-9 so that there cannot be a - before the .
    • )? Close group and make it optional
    • (?: Non capture group
      • \.[a-z0-9-]* Match a . and 0+ times any of a-z0-9-
      • [a-z0-9] End with a-z0-9 so that there cannot be a - before the .
    • )* Close group and repeat it 0+ times
  • )? Close group and make it optional to also allow a single dot
  • \. Match a single dot
  • $ End of string
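
As a quick sanity check, the pattern can be run against the example strings from the question. This is only a sketch in Python (the question does not name a language); `re.IGNORECASE` stands in for the case-insensitive flag, and the names `PATTERN`, `SHOULD_MATCH`, `SHOULD_NOT_MATCH` and `is_valid` are made up for illustration. It covers only the strings listed in the question, not every string the rules permit or forbid.

```python
import re

# Pattern copied from the answer; IGNORECASE replaces the "case insensitive" option.
PATTERN = re.compile(
    r"^(?:[a-z](?:[a-z0-9-]*[a-z0-9])?(?:\.[a-z0-9-]*[a-z0-9])*)?\.$",
    re.IGNORECASE,
)

# Example strings taken from the question.
SHOULD_MATCH = [".", "a.", "a-9.", "abc.", "abc.a-c.abc."]
SHOULD_NOT_MATCH = ["-.", "-a.", "a-.", "a", "abc.-bc", "ab-.abc", "abc.a-@c", ".."]


def is_valid(text: str) -> bool:
    # fullmatch sidesteps Python's rule that "$" also matches before a trailing newline.
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for s in SHOULD_MATCH + SHOULD_NOT_MATCH:
        print(f"{s!r:16} -> {is_valid(s)}")
```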


3
rzwitserloot On

Taking it left to right:

^ - start of stream, so far so good

(([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)* - in other words, attempt to match ([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+ as many times as we can; 0 is also acceptable.

Let's try to match it once:

([A-Za-z])+ - okay, that'll match the "a".

([A-Za-z0-9\-])* - that'll match nothing.

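([a-zA-Z0-9])+ - the match fails here. This part needs at least one letter or digit, but the next character is the "."; the earlier ([A-Za-z])+ cannot give back its "a" either, because + requires at least one character. In effect the group needs at least two characters before the dot.

Therefore, it doesn't match even once, and we fast forward right after that giant blob, after the *, and get to:

\. - this doesn't match; we're still on the "a", so the whole regex fails.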
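
The walk-through above can be reproduced with a few lines of code. This is a minimal sketch in Python (an assumption, since the question does not say which engine is used); `ORIGINAL`, `INNER` and `full` are illustrative names, and the patterns are copied verbatim from the question.

```python
import re

# The question's pattern and its inner group, copied verbatim.
ORIGINAL = r"^((([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+)*\.)+$"
INNER = r"([A-Za-z])+([A-Za-z0-9\-])*([a-zA-Z0-9])+"


def full(pattern: str, text: str) -> bool:
    # True only if the pattern consumes the entire string.
    return re.fullmatch(pattern, text) is not None


# The inner group has two "+" parts, so a single character is never enough.
inner_one_char = full(INNER, "a")
inner_two_chars = full(INNER, "ab")

# Consequently "a." fails while inputs with two or more characters before the dot pass.
original_results = {s: full(ORIGINAL, s) for s in ["a.", "ab.", "a-9."]}

if __name__ == "__main__":
    print("inner on 'a': ", inner_one_char)
    print("inner on 'ab':", inner_two_chars)
    print(original_results)
```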

0
Roko C. Buljan On
/^(?!-)([A-Z0-9]|[-.](?![-.]))*\.$/i

This will also handle the .. and -- case.
Give it a try.


Let's break it down:

/
^              Line start
(?!\-)         Must not start with -
(              Start of matching group
  [A-Z0-9]     Match list
  |            OR
  [-.](?![-.]) A - or . not followed by - or .
)*             End group matching 0 or more times
\.             Must end in . 
$              Line end
/i             Treat as case insensitive
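
To try the broken-down version outside a regex tester, here is a rough Python sketch. It uses the pattern exactly as it appears in the breakdown (with `[-.](?![-.])`), replaces the `/i` flag with `re.IGNORECASE`, and checks only the example strings from the question; `LOOKAHEAD_PATTERN`, `EXPECT_MATCH`, `EXPECT_NO_MATCH` and `accepts` are hypothetical names.

```python
import re

# Pattern as written in the breakdown; IGNORECASE plays the role of the /i flag.
LOOKAHEAD_PATTERN = re.compile(r"^(?!-)([A-Z0-9]|[-.](?![-.]))*\.$", re.IGNORECASE)

# Example strings taken from the question.
EXPECT_MATCH = [".", "a.", "a-9.", "abc.", "abc.a-c.abc."]
EXPECT_NO_MATCH = ["-.", "-a.", "a-.", "a", "abc.-bc", "ab-.abc", "abc.a-@c", ".."]


def accepts(text: str) -> bool:
    # fullmatch keeps a trailing newline from slipping past "$".
    return LOOKAHEAD_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for s in EXPECT_MATCH + EXPECT_NO_MATCH:
        print(f"{s!r:16} -> {accepts(s)}")
```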