Python: Is my except clause indeed too broad?

2.5k views Asked by At

My code is working fine, but PyCharm warns about the except clause in the following being too broad. And honestly, it also smells to me as the wrong implementation.

When scraping HTML<tr>s in bulk, which I also have to add to the database or update, I don't know in advance if a game is finished or not. Also, a game that hasn't been finished has different HTML tags in some <td>s, that require different handling.

Basically, if we look at match_score_string and match_relevant_bit we are telling BeautifulSoup to find a <td> with a certain class. If the game is finished already, this <td> will have a score-time sc class, if not, it will have a score-time st class. If it has one, it won't have the other.

It's been a while since I wrote the try-except clauses, but if I recall correctly, the reason (I felt) I had to use them is because BS would throw an error and halt all operations when row.find fails to locate the HTML object.

Is PyCharm complaining justifiably?

# Get Match Score
try:
    match_score_string = row.find("td", class_="score-time sc").get_text()
    match_string_split = match_score_string.split(" - ")
    team_a_score = int(match_string_split[0])
    team_b_score = int(match_string_split[1])
    print(team_a_score)
    print(team_b_score)
except:
    team_a_score = None
    team_b_score = None

# Get Match URL
try:
   match_relevant_bit = row.find("td", class_="score-time sc")
   match_url = match_relevant_bit.find("a").get("href")
   match_url_done = match_url.rsplit('?JKLMN')
   match_url = match_url_done[0]
   match_finished = True

except:

   match_relevant_bit = row.find("td", class_="score-time st")
   match_url = match_relevant_bit.find("a").get("href")
   match_url_done = match_url.rsplit('?JKLMN')
   match_url = match_url_done[0]
   match_finished = False
2

There are 2 answers

6
mgilson On BEST ANSWER

Your except is certainly too broad. I won't dispute whether or not you need exception handling in these cases -- Exceptions are quite pythonic so use them wherever you feel they are appropriate. However, you haven't argued that you need your code to handle every exception imaginable. A bare except can mask all sorts of bugs. e.g. If you misspell a variable name at a later revision:

try:
    match_score_string = row.find("td", class_="score-time sc").get_text()
    match_string_split = match_score_string.split(" - ")
    team_a_score = int(match_string_spilt[0])  # oops
    team_b_score = int(match_string_split[1])
    print(team_a_score)
    print(team_b_score)
except:
    team_a_score = None
    team_b_score = None

Bare excepts also prevent a user from sending KeyboardInterrupt to terminate the program ...

These are just 2 examples why you don't want a bare except -- Since the number of programming errors that could get masked by this are near infinite, I don't have enough time to demonstrate all of them ;-).

Instead, you should specify exactly what exceptions you expect to encounter:

try:
    match_score_string = row.find("td", class_="score-time sc").get_text()
    match_string_split = match_score_string.split(" - ")
    team_a_score = int(match_string_spilt[0])  # oops
    team_b_score = int(match_string_split[1])
    print(team_a_score)
    print(team_b_score)
except AttributeError:  # If no row is found, accessing get_text fails.
    team_a_score = None
    team_b_score = None

This way, the code executes as you expect and you don't mask (too many) programming bugs. The next thing you want to do to prevent hiding bugs is to limit amount of code in your exception handler as much as possible.

2
Vasili Syrakis On

Try/except clauses are great. It sounds like you don't use them too often.

try/except blocks can help you avoid complicated if statements, or code that checks for a condition before proceeding which can sometimes make it unclear what you are trying to do (see what I did there :P).

There are a few exceptions that you could possibly drag out of your code, and some of them might be more likely than the others. See below:

try:
    # This first line could have an AttributeError in two places
    #     Maybe the row didn't parse quite correctly? find() won't work then.
    #     Maybe you didn't find any elements? get_text() won't work then.
    match_score_string = row.find("td", class_="score-time sc").get_text()
                            ^                                  ^
    # Same as here, you could get an AttributeError if match_score_string isn't a string.
    #     I would say that isn't very likely in this case,
    #     Especially if you handle the above line safely.
    match_string_split = match_score_string.split(" - ")

    # If either of these split strings contains no matches, you'll get an IndexError
    # If either of the strings at the index contain letters, you may get a ValueError
    team_a_score = int(match_string_split[0])
    team_b_score = int(match_string_split[1])

    # Theoretically, if you caught the exception from the above two lines
    #     you might get a NameError here, if the variables above never ended up being defined.
    print(team_a_score)
    print(team_b_score)
except:
    # Can't really imagine much going wrong here though.
    team_a_score = None
    team_b_score = None

As you might imagine, it may be more useful to catch several types of exceptions. It depends on the complexity of your inputs.

Multiple try/except blocks would also be of benefit here, specifically to handle the case where you catch an error and need to continue executing the rest of the code inside that block.