string.replace('the','') is leaving white space


I have a string that is the name of an artist that I get from the MP3 ID3 tag

sArtist = "The Beatles"

What I want is to change it to

sArtist = "Beatles, the"

I have run into 2 different problems. My first problem is that I seem to be trading 'The' for ' ' (a space) instead of for nothing.

if sArtist.lower().find('the') == 0:
    sArtist = sArtist.lower().replace('the','')
    sArtist = sArtist + ", the"

My second problem is that since I have to check for both 'The' and 'the' I use sArtist.lower(). However this changes my result from " Beatles, the" to " beatles, the". To solve that problem I just removed the .lower and added a second line of code to explicitly look for both cases.

if sArtist.lower().find('the') == 0:
    sArtist = sArtist.replace('the','')
    sArtist = sArtist.replace('The','')
    sArtist = sArtist + ", the"

So the problem I really need to solve is why am I replacing 'the' with <SPACE> instead of <NULL>. But if somebody has a better way to do this I would be glad for the education :)


There are 2 answers

Mark Tolonen On BEST ANSWER

One way:

>>> def reformat(artist,beg):
...   if artist.startswith(beg):
...     artist = artist[len(beg):] + ', ' + beg.strip()
...   return artist
...
>>> reformat('The Beatles','The ')
'Beatles, The'
>>> reformat('An Officer and a Gentleman','An ')
'Officer and a Gentleman, An'
>>>
unutbu On

Using

sArtist.replace('The','')

is dangerous. What happens if the artist's name is Theodore?
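
To make both effects concrete, here is a small sketch (the sample names are only illustrations). str.replace removes exactly the characters it is given, wherever they occur, so the space that followed 'The' stays behind, and a 'The' inside a longer word or in the middle of a name is removed too:

```python
# str.replace removes only the characters it is given, so the space
# that followed 'The' is left at the front of the result.
leftover = "The Beatles".replace("The", "")
print(repr(leftover))

# It also matches inside a word ...
inside_word = "Theodore".replace("The", "")
print(repr(inside_word))

# ... and in the middle of a name, not just at the start.
mid_name = "Question Mark and The Mysterians".replace("The", "")
print(repr(mid_name))
```

This is why the original code appears to swap 'the' for a space: nothing is inserted, the existing space is simply never removed.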

Perhaps use regex instead:

In [11]: import re
In [13]: re.sub(r'(?i)^(a|an|the) (.*)',r'\2, \1','The Beatles')
Out[13]: 'Beatles, The'
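
If the lowercase form from the question ("Beatles, the") is wanted, one possible variant wraps the same idea in a function. This is only a sketch and not part of the answer above: the name move_article is made up for illustration, and lowercasing the article is an assumption based on the output the question asks for.

```python
import re

# Leading article followed by whitespace, then the rest of the name.
# Requiring \s+ means "Theodore" is not treated as "The" + "odore".
_ARTICLE = re.compile(r'^(a|an|the)\s+(.*)$', re.IGNORECASE)

def move_article(name):
    """Move a leading article to the end, lowercased."""
    match = _ARTICLE.match(name)
    if match is None:
        return name  # no leading article; leave the name unchanged
    article, rest = match.groups()
    return f"{rest}, {article.lower()}"

print(move_article("The Beatles"))
print(move_article("the beatles"))
print(move_article("Theodore"))
```

Passing re.IGNORECASE to re.compile avoids the inline (?i) flag altogether, and the rest of the name keeps its original capitalization because nothing is lowercased except the article itself.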