Get substring using either perl or sed


I can't seem to get a substring correctly.

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g')

That still returns bugfix/US3280841-something-duh.

If I try to use perl instead:

declare BRANCH_NAME="bugfix/US3280841-something-duh";

# Trim it down to "US3280841"
TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9]|[A-Z0-9])+/; print $1');

That outputs nothing.

What am I doing wrong?

7 Answers

anubhava On Best Solutions

You may use this sed:

sed -E 's~^.*/|-.*$~~g' <<< "$BRANCH_NAME"


Or this awk:

awk -F '[/-]' '{print $2}' <<< "$BRANCH_NAME"
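A rough sketch of why both commands work, assuming the branch name always looks like prefix/ID-rest. The variable names here are only for illustration, and printf with a pipe stands in for the here-string so the snippet also runs in a plain POSIX shell.

```shell
branch="bugfix/US3280841-something-duh"

# sed: delete two things, "everything up to the last /" and
# "the first remaining - through end of line".
# ~ is the s-command delimiter, so / needs no escaping.
from_sed=$(printf '%s\n' "$branch" | sed -E 's~^.*/|-.*$~~g')

# awk: split on either / or -, giving the fields
# bugfix, US3280841, something, duh; print the second one.
from_awk=$(printf '%s\n' "$branch" | awk -F '[/-]' '{print $2}')

echo "$from_sed"   # US3280841
echo "$from_awk"   # US3280841
```

Note that the awk version depends on the ID being exactly the second field, so a prefix that itself contains a dash or a slash would shift the result.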

tso On

sed 's:[^/]*/\([^-]*\)-.*:\1:' <<< "bugfix/US3280841-something-duh"

Paul Hodges On

Using bash parameter expansion only:

$: # don't use caps; see below.
$: declare branch="bugfix/US3280841-something-duh"
$: tmp="${branch##*/}"
$: echo "$tmp"
$: trimmed="${tmp%%-*}" 
$: echo "$trimmed"

Which means:

$: tmp="${branch##*/}"
$: trimmed="${tmp%%-*}"

does the job in two steps without spawning extra processes.
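If you need this in more than one place, the two expansions can be wrapped in a small function. This is only a sketch: ticket_id is a made-up name, and it assumes the ID is whatever sits between the last slash and the first dash after it.

```shell
# Hypothetical helper: print the ID part of a branch name shaped like
# "prefix/ID-description", using parameter expansion only.
ticket_id() {
  local tmp
  tmp=${1##*/}                 # drop the longest prefix ending in "/"
  printf '%s\n' "${tmp%%-*}"   # drop the longest suffix starting with "-"
}

ticket_id "bugfix/US3280841-something-duh"   # US3280841
ticket_id "feature/team/US42-x-y"            # US42
ticket_id "US7"                              # US7 (no slash, no dash: unchanged)
```

Because ## and %% remove the longest match, extra slashes in the prefix and extra dashes in the description do no harm.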

In sed,

$: sed -E 's#^.*/([^/-]+)-.*$#\1#' <<< "$branch"

This says "after any or no characters followed by a slash, remember one or more that are not slashes or dashes, followed by a not-remembered dash and then any or no characters, then replace the whole input with the remembered part."
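To see what that expression does on a few inputs, here is a small sketch. The extra branch names are invented for the example, and printf with a pipe is used instead of the here-string only to keep it portable.

```shell
# The same substitution as above, kept in a variable for reuse.
extract='s#^.*/([^/-]+)-.*$#\1#'

a=$(printf '%s\n' "bugfix/US3280841-something-duh" | sed -E "$extract")
b=$(printf '%s\n' "feature/team/US42-x-y"          | sed -E "$extract")
c=$(printf '%s\n' "bugfix/US3280841"               | sed -E "$extract")

echo "$a"   # US3280841
echo "$b"   # US42
echo "$c"   # bugfix/US3280841
```

The last case is worth noticing: the pattern requires a dash after the ID, so a branch name without one does not match and sed prints the line unchanged.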

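Your original pattern was

s/\(^.*\)\/[a-z0-9]\|[A-Z0-9]\+/\1/g

Without -E, a sed that does not implement the GNU extensions reads this as "remember any number of anything followed by a slash, then a lowercase letter or a digit, then a literal pipe character, then a capital letter or digit, then a literal plus sign, and then replace it all with what you remembered." Nothing in your input matches that, so the line comes back unchanged. GNU sed does accept \| and \+ as alternation and repetition without -E, but the alternation then splits the whole expression in two, which is still not what you want.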

GNU's manual is your friend. I look stuff up all the time to make sure I'm doing it right. Sometimes it still takes me a few tries, lol.

An aside - try not to use all-capital variable names. By convention those are reserved for variables that are special to the shell or the environment, like RANDOM or IFS.

UjinT34 On

The Perl version just has the + in the wrong place. It should be inside the capture brackets:

TRIMMED=$(echo $BRANCH_NAME | perl -nle 'm/^.*\/([a-z0-9A-Z]+)/; print $1');

Barbaros Özhan On

In your sed case, just add a ^ before A-Z0-9 (this relies on GNU sed, which accepts \| and \+ without -E):

TRIMMED=$(echo $BRANCH_NAME | sed -e 's/\(^.*\)\/[a-z0-9]\|[^A-Z0-9]\+/\1/g')

Alternatively, and more briefly, you can delete every lowercase letter, slash, and dash:

TRIMMED=$(echo $BRANCH_NAME | sed "s/[a-z\/\-]//g" )


abdan On

Type this in a shell terminal:

$ BRANCH_NAME="bugfix/US3280841-something-duh"

$ echo $BRANCH_NAME| perl -pe 's/.*\/(\w\w[0-9]+).+/\1/'

Use the s (substitute) command instead of m (match).
Perl's s/// works much like sed's, so with GNU sed the same expression also works with 'sed -E' in place of 'perl -pe'.

G. Cito On

Another variant using Perl Regular Expression Character Classes (see perldoc perlrecharclass).

echo $BRANCH_NAME | perl -nE 'say m/^.*\/([[:alnum:]]+)/;'