The beginning of a 'heredoc' string in bash usually looks like
cat <<EOF or cat << EOF
i.e. there may or may not be a space between the two less-than characters and the token word "EOF". I would like to catch the token word, so I try the following
$ pcretest
PCRE version 8.45 2021-06-15
re> "^\s*cat.*[^<]<{2}[^<](.*)"
data> cat << EOF
0: cat << EOF
1: EOF
data> cat <<EOF
0: cat <<EOF
1: OF
As you can see in the string where there is no space between the << and EOF, I only catch "OF" and not "EOF". The expression must match exactly two less-than signs and fail if there are three or more. But why does it gobble up the 'E' so that only "OF" is returned?
In your pattern using are using a negated character class
[^<]which matches a single character other than<which in this case is theEchar in the string<<EOFFor your examples and using pcre, you could match the leading spaces and then match
<<without a following<The pattern matches:
^Start of string\h*Match optional horizontal whitespace charscat\h+Matchcatand 1+ horizontal whitespace chars<<(?!<)Match<<and assert not<directly to the right(.*)Capture optional chars in group 1See a regex demo