I currently have the following regular expression:
^\s*(.+)(?:[-\._ ]+)(\d+)\s*[xX]\s*(\d+)
This will match "show_3x01_ep. name" and retrieve show, 3, 01. I would like to extend this so that multiple episodes can be captured. For example:
show _3x01_3x02 ep. name
should return:
show, 3, 01, 3, 02
Could someone please explain to me how this might be done?
You are expecting too much from your regular expression. The simplest way is to do this in two steps.
Note first of all, though, that the (.+) which matches show in your example is too general. If you apply the pattern to "show _3x01_3x02 ep. name" then the first capture is "show _3x01", followed by only 3 and 02, because (.+) is greedy and the following [-._ ]+ (there is no need to escape the dot or enclose the character class in (?: ... )) is satisfied with just one character.
To do as you ask, find the first string of alphabetic characters, and then all pairs of digit strings that are separated by a single x.
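The two-step approach could be sketched as follows. This is a minimal illustration in Python, not the code from the original answer; the function name parse_episodes is hypothetical, and accepting an upper-case X as the separator is an assumption carried over from the pattern in the question.

```python
import re

def parse_episodes(title):
    """Return [show, season, episode, season, episode, ...] for a title."""
    # Step 1: take the first run of alphabetic characters as the show name.
    name = re.search(r'[A-Za-z]+', title)
    if name is None:
        return []
    # Step 2: collect every digits-x-digits pair that follows the name.
    # Assumption: a single 'x' or 'X' with no surrounding whitespace.
    pairs = re.findall(r'(\d+)[xX](\d+)', title[name.end():])
    result = [name.group()]
    for season, episode in pairs:
        result += [season, episode]
    return result

print(parse_episodes('show _3x01_3x02 ep. name'))
# prints ['show', '3', '01', '3', '02']
```

Because the name is limited to a single alphabetic run, a multi-word show name would yield only its first word; widening that first pattern is left as a choice for your actual data.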