Here are two examples of string methods from the Python Data Science Handbook that I am having trouble understanding.
str.extract()
import pandas as pd
monte = pd.Series(['Graham Chapman', 'John Cleese', 'Terry Gilliam',
                   'Eric Idle', 'Terry Jones', 'Michael Palin'])
monte.str.extract('([A-Za-z]+)')
This operation returns the first name of each element in the Series. I don't understand the expression passed to the extract function.
str.findall()
monte.str.findall(r'^[^AEIOU].*[^aeiou]$')
This operation returns the original element if it starts and ends with a consonant, and an empty list otherwise. I figure that the ^ operator stands for negation of vowels, and that the * operator combines the uppercase and lowercase cases of the vowels. Yet I do not understand the rest of the operators.
Please help me with understanding these input expressions. Thanks in advance.
The first ^ means the beginning of the string, whereas $ means the end of the string. For example, searching for ^a in a string that contains several a characters finds at most one a, because the ^ sign only matches at the beginning of the string.
The same goes for $: a pattern ending in $ only matches at the end of the string.
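The anchor behaviour described above can be sketched with the standard re module. The test string aaa and the variable names are made up for illustration and are not the ones from the original post. The last line also shows the other meaning of ^: inside square brackets it negates the character set, which is how it is used in [^AEIOU].

```python
import re

text = "aaa"  # illustrative string, not from the original post

# Without an anchor, every a in the string is found.
all_matches = re.findall("a", text)

# ^ anchors the match at the start, so only the first a can match.
start_matches = re.findall("^a", text)

# $ anchors the match at the end, so only the last a can match.
end_matches = re.findall("a$", text)

# Inside brackets, ^ means "any character except these".
not_a = re.findall("[^a]", "abc")

print(all_matches)    # ['a', 'a', 'a']
print(start_matches)  # ['a']
print(end_matches)    # ['a']
print(not_a)          # ['b', 'c']
```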
The r prefix marks a raw string. A raw string is exactly what it looks like: for example, a backslash \ does not start an escape sequence, it is just a regular backslash.
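As a rough illustration of the raw string point, the sketch below compares a normal and a raw literal, then applies the pattern from the question with the standard re module instead of pandas (an assumption made to keep the example dependency-free; the two names are taken from the question's Series). Since this particular pattern contains no backslashes, the r prefix does not actually change it.

```python
import re

# In a normal string literal, \n is a single newline character.
normal = "\n"
# With the r prefix the backslash is kept: two characters, \ and n.
raw = r"\n"

print(len(normal))  # 1
print(len(raw))     # 2

# The pattern from the question has no backslashes, so the raw and
# normal spellings are the same string.
pattern = r'^[^AEIOU].*[^aeiou]$'
same = pattern == '^[^AEIOU].*[^aeiou]$'
print(same)  # True

# ^ and $ anchor the pattern to the whole string; [^AEIOU] and
# [^aeiou] require a non-vowel first and last character; .* matches
# everything in between.
names = ['Graham Chapman', 'Eric Idle']
results = [re.findall(pattern, name) for name in names]
print(results)  # [['Graham Chapman'], []]
```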