Returning first number in function if keyword is met


I'm working with data and I have it set to spit out items I need. Example:

LOT OF 4  American motor vinegar 
Lot of (6) 808 metal/steel/G/N LWAP 
LOT 12 product number 57838290

What I want is to return the amount in each lot whenever 'lot' is found in the text, whether it is lowercase or capitalized. I think I have my code half built, but since the value isn't in a fixed location I don't know how to retrieve it. Also, the list above comes from a text string, so the numbers are not recognized as integers.

def auction(title): 
    for word in title.split(): 
        if word.startswith('lot'): 
            return   # not sure what to return (from the example the answer would be 4 6 and 12)

There are 4 answers

MCSH On

You can rewrite it as follows:

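def auction(title):
    found = False
    for word in title.split():
        # Start looking for a number once a word beginning with "lot" is seen.
        if word.upper().startswith('LOT'):
            found = True
        if found:
            # Strip parentheses so that "(6)" is read as 6.
            token = word.strip('()')
            if token.isdigit():
                return int(token)
    return None  # no lot size found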

The basis is the same as your own code: we set the boolean to True once we find a word starting with LOT (in upper or lower case). From then on we check whether each word is a number and, if it is, return its value.

Kristian Damian On

You can use a list comprehension to check whether the string needs to be parsed at all:

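def parse_the_string(the_string):
    the_string = the_string.upper()
    start = the_string.find("LOT")
    if start == -1:
        return ''  # no "LOT" in the text, so there is no lot size to read
    the_number = ''
    number_found = False
    # Collect the first run of digits that follows "LOT".
    for n in the_string[start:]:
        if n.isdigit():
            the_number += n
            number_found = True
        elif number_found:
            break
    return the_number

t = 'this is a lot of 10'
# Only parse when the string contains at least one digit.
if [e for e in t if e.isdigit()]:
    print(parse_the_string(t))  # the digits come back as a string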
dileep nandanam On

You can use a regular expression:

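import re

def auction(title):
    for word in title.split():
        # Compare case-insensitively so "LOT", "Lot" and "lot" all match.
        if word.lower().startswith('lot'):
            # Take the first run of digits found anywhere in the title.
            search_result = re.search(r'([0-9]+)', title)
            if search_result:
                return int(search_result.group(1))
    return None  # no lot size found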
cbare On

Some folks don't like regular expressions, but in cases like this, they're pretty handy. I might try something like this:

import re

inputs = [
    "LOT OF 4 CISCO AIRONET 4800 AIR-LM4800 DSSS WLAN PC CARD",
    "Lot of (6) CISCO AIRONET AIR-LAP1252AG-A-K9 DUAL BAND 802.11A/G/N LWAP",
    "LOT 12 Cisco Systems Aironet 1200 Wireless Access Point AIR-AP1231G-A-K9 MP21G",
    "CISCO AIRONET 4800 AIR-LM4800 DSSS WLAN PC CARD lot of 4",
    "Ocelot 4800 AIR-LM4800"]

patterns = [
    r'\blot(?:\s+of|)\s+(\d+)',
    r'\blot(?:\s+of|)\s+\((\d+)\)']

for a in inputs:
    for pattern in patterns:
        m = re.search(pattern, a, flags=re.IGNORECASE)
        if m:
            print("lot size = ", m.group(1))
            break
    else:
        # Runs only when no pattern matched (the inner loop never hit break).
        print("No lot size found!")

Outputs:

lot size =  4
lot size =  6
lot size =  12
lot size =  4
No lot size found!

The patterns here look a bit horrible, but they're just saying this: find the word 'lot', possibly followed (or not) by the word 'of' and then some digits. Or, in the second case some digits surrounded by literal parentheses.
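
As a rough sketch, the two patterns could also be folded into one by making the parentheses optional, and wrapped in a function that returns an integer as the question asks. The name `lot_size` and the combined pattern are my own illustration, not part of the code above.

```python
import re

# Hypothetical combined pattern: "lot", an optional "of", then digits
# with optional surrounding parentheses.
LOT_PATTERN = re.compile(r'\blot(?:\s+of)?\s+\(?(\d+)\)?', re.IGNORECASE)

def lot_size(title):
    """Return the lot size as an int, or None if no lot size is found."""
    m = LOT_PATTERN.search(title)
    return int(m.group(1)) if m else None

for title in ["LOT OF 4  American motor vinegar",
              "Lot of (6) 808 metal/steel/G/N LWAP",
              "LOT 12 product number 57838290",
              "Ocelot 4800 AIR-LM4800"]:
    print(title, "->", lot_size(title))
```

The single pattern is a little looser than the pair: it would also accept an unbalanced "(6". If that matters for your data, keeping two separate patterns is the stricter choice.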

Since this is free text you're parsing, you'll probably have some errors that will have to be corrected either by hand or by adding more patterns.
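
One way to handle those leftovers, sketched under my own assumptions, is to keep the patterns in a list and collect the titles that match none of them, so they can be reviewed by hand or used to decide which pattern to add next. The helper name `split_by_match` and the extra `lot:` pattern are hypothetical examples, not something taken from your data.

```python
import re

PATTERNS = [
    r'\blot(?:\s+of|)\s+(\d+)',        # "lot 12", "lot of 4"
    r'\blot(?:\s+of|)\s+\((\d+)\)',    # "lot of (6)"
    r'\blot\s*:\s*(\d+)',              # hypothetical extra pattern: "lot: 3"
]

def split_by_match(titles):
    """Return (sizes, unmatched): a dict of title -> int and a list of titles with no match."""
    sizes, unmatched = {}, []
    for title in titles:
        for pattern in PATTERNS:
            m = re.search(pattern, title, flags=re.IGNORECASE)
            if m:
                sizes[title] = int(m.group(1))
                break
        else:
            # No pattern matched; keep the title for manual review.
            unmatched.append(title)
    return sizes, unmatched

sizes, unmatched = split_by_match(["LOT 12 widgets", "Lot: 3 cables", "Lot x5 adapters"])
print(sizes)
print(unmatched)
```

Here the third title ends up in `unmatched`, which is the cue to either fix it by hand or write one more pattern.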