Python 2.7 Regex not matching desired pattern

I am parsing all the rows of a .m3u file containing my IPTV playlist data. I am looking to isolate and print string sections within the file of the format:

tvg-logo="http://somelinkwithapicture.png"

...within a string that looks like:

#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD
http://someprovider.tv/play/2480.m3u8?token=465454=

My class looks like this:

import re

class iptv_cleanup():

    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'

    with open(filepath, "r") as text_file:
        a = text_file.read()
        b = re.search(r'tvg-logo="(.*?)"', a)
        c = b.group()
        print c

    text_file.close

iptv_cleanup()

All I am getting returned though is a string like this:

tvg-logo=""

I am a bit rusty with regexes, but I cannot see anything obviously wrong with this.

Can anyone assist?

Thanks

There are 2 answers

gdogg371 answered:

This worked in the end:

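import re

def iptv_cleanup():

    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'

    # The with block closes the file on exit, so no explicit close() is needed.
    with open(filepath, "r") as text_file:
        a = text_file.read()

    # findall returns the captured URL of every tvg-logo attribute, not just the first.
    b = re.findall(r'tvg-logo="(.*?)"', a)

    for i in b:
        print i

iptv_cleanup()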

Thank you all for your input.
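For anyone wondering why findall succeeded where search did not: re.search stops at the first match in the text, so if the first entry in the file happens to carry an empty tvg-logo="" attribute (an assumption, but one that fits the output shown in the question), that empty match is all it returns. The sketch below uses a made-up two-entry playlist to show the difference; the sample data and variable names are illustrative only.

```python
import re

# Hypothetical playlist: the first entry has an empty logo attribute.
playlist = (
    '#EXTINF:-1 tvg-name="Channel One" tvg-logo="" group-title="News",Channel One\n'
    'http://someprovider.tv/play/1.m3u8\n'
    '#EXTINF:-1 tvg-name="Sky Sports Action HD" '
    'tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" '
    'group-title="Sports",Sky Sports Action HD\n'
    'http://someprovider.tv/play/2480.m3u8\n'
)

pattern = re.compile(r'tvg-logo="(.*?)"')

first = pattern.search(playlist)
whole_match = first.group()    # entire matched text, attribute name included
first_value = first.group(1)   # only the text captured inside the quotes

all_values = pattern.findall(playlist)     # every captured value, in file order
non_empty = [v for v in all_values if v]   # drop entries that have no logo

print(whole_match)
print(non_empty)
```

Calling print with a single parenthesised argument behaves the same on Python 2.7 and Python 3, so the sketch should run on either. Note also that group() with no argument returns the whole match, while group(1) returns just the URL.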

lucas_7_94 answered:

Try (?:tvg-logo=")[^"]*(?<=\.png), which stops at the closing quote and only matches values that end in .png:

import re
reg = r'(?:tvg-logo=")[^"]*(?<=\.png)'

string = '#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD http://someprovider.tv/play/2480.m3u8?token=465454='

print re.findall(reg, string)[0]

$ python main.py
tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png
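
A caveat worth checking before running a pattern with a greedy wildcard such as [\w\W]* over a whole playlist instead of a single line: it keeps going to the last .png in the text, so several entries can end up in one match. The sketch below compares it with a negated character class, which cannot cross the closing quote. The two-entry sample and the variable names are invented for illustration.

```python
import re

# Hypothetical two-entry sample; only the attribute layout mirrors the question.
text = (
    '#EXTINF:-1 tvg-id="a.uk" tvg-logo="http://someprovider.tv/logos/a.png" group-title="Sports",A\n'
    'http://someprovider.tv/play/1.m3u8\n'
    '#EXTINF:-1 tvg-id="b.uk" tvg-logo="http://someprovider.tv/logos/b.png" group-title="Sports",B\n'
    'http://someprovider.tv/play/2.m3u8\n'
)

# Greedy: runs from the first tvg-logo=" to the last ".png" in the text.
greedy = re.findall(r'tvg-logo="[\w\W]*(?<=\.png)', text)

# Bounded: [^"]* cannot pass the closing quote, so each entry matches separately.
bounded = re.findall(r'tvg-logo="[^"]*(?<=\.png)', text)

print(len(greedy))
print(bounded)
```

Here the greedy pattern yields a single match that swallows both entries, while the bounded pattern yields one match per logo.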