What are the best practices for handling missing fields in an API response?

I'm using the Google Books API to get details about books by their ISBNs.

An ISBN (International Standard Book Number) is a numeric commercial book identifier that is intended to be unique.

When I call the API with different ISBNs, the responses are not always the same, because some books have certain fields missing.

requests.get(f"https://www.googleapis.com/books/v1/volumes?q=isbn:{'8180315339'}").json()
requests.get(f"https://www.googleapis.com/books/v1/volumes?q=isbn:{'938733077X'}").json()

The two responses contain different numbers of fields.

I can use try/except to handle the errors, but then the loop continues to the next iteration, i.e. it calls the API with the next ISBN. How do I save the information that is available and put np.nan in the DataFrame where data is missing?


import pandas as pd
import requests

results = []
data = requests.get(f"https://www.googleapis.com/books/v1/volumes?q=isbn:{'938733077X'}").json()
# Loop through the items in the "items" field of the JSON data
for item in data['items']:
    # Extract the relevant fields from the item
    try:
        title = item['volumeInfo']['title']
        subtitle = item['volumeInfo']['subtitle']
        authors = item['volumeInfo']['authors']
        publisher = item['volumeInfo']['publisher']
        published_date = item['volumeInfo']['publishedDate']
        description = item['volumeInfo']['description']
        pageCount = item['volumeInfo']['pageCount']
        category = item['volumeInfo']['categories']
        imageS = item['volumeInfo']['imageLinks']['smallThumbnail']
        imageM = item['volumeInfo']['imageLinks']['thumbnail']
        language = item['volumeInfo']['language']
        textSnippet = item['searchInfo']['textSnippet']
    except KeyError:
        # A single missing key skips the whole item, which is the problem described above
        continue
    # Add the fields to the results list as a tuple
    results.append((title, subtitle, authors, publisher, published_date, description, pageCount, category, imageS, imageM, language, textSnippet))

# Create a DataFrame from the results list
df_ = pd.DataFrame(results, columns=['Title', 'Sub Title', 'Authors', 'Publisher', 'Published Date', 'Description', 'Page Count', 'Category', 'imageS', 'imageM', 'Language', 'Text'])

There are 2 answers

chepner (best answer)

First try to get item['volumeInfo'], and process the item only if that succeeds. Using operator.itemgetter will make the code much more compact as well.

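from operator import itemgetter


# Keys looked up in item["volumeInfo"]; itemgetter raises KeyError if any of them is absent.
extract = itemgetter("title",
                     "subtitle",
                     "authors",
                     "publisher",
                     "publishedDate",
                     "description",
                     "pageCount",
                     "categories",
                     "imageLinks",
                     "language")
get_thumbnails = itemgetter("smallThumbnail", "thumbnail")

for item in data["items"]:
    try:
        info = item["volumeInfo"]
    except KeyError:
        continue

    t = extract(info)
    # textSnippet lives under item["searchInfo"], not under volumeInfo
    snippet = item.get("searchInfo", {}).get("textSnippet")
    results.append(t[:8] + get_thumbnails(t[8]) + t[9:] + (snippet,))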
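
One caveat with itemgetter is that it raises KeyError as soon as a single requested key is absent, so on its own it does not fill gaps. A minimal sketch of one way around that, assuming missing fields should become NaN: merge each dict over a dict of NaN defaults before extracting. The names FIELDS, THUMBS, with_defaults and extract_row are hypothetical, and the sample item is made up for illustration. math.nan is used so that the block needs only the standard library; it is the same float value as np.nan.

```python
import math
from operator import itemgetter

# Hypothetical field lists; the names follow the volumeInfo keys used in the question.
FIELDS = ("title", "subtitle", "authors", "publisher", "publishedDate",
          "description", "pageCount", "categories", "language")
THUMBS = ("smallThumbnail", "thumbnail")

extract = itemgetter(*FIELDS)
get_thumbnails = itemgetter(*THUMBS)


def with_defaults(d, keys):
    """Return a copy of d in which every key in keys exists (NaN if it was missing)."""
    return {**dict.fromkeys(keys, math.nan), **d}


def extract_row(item):
    """Build one result tuple; absent fields become NaN instead of raising KeyError."""
    info = item.get("volumeInfo", {})
    links = info.get("imageLinks", {})
    snippet = item.get("searchInfo", {}).get("textSnippet", math.nan)
    main = extract(with_defaults(info, FIELDS))
    thumbs = get_thumbnails(with_defaults(links, THUMBS))
    # Same column order as the question: 8 fields, 2 images, language, text snippet
    return main[:8] + thumbs + main[8:] + (snippet,)


# Made-up item with only two fields present, for illustration.
sample = {"volumeInfo": {"title": "Example", "language": "en"}}
row = extract_row(sample)
```

In the loop, results.append(extract_row(item)) would then replace the try/except block, and every item yields a full 12-element tuple.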
Nick

Try using dict.get. If a field is missing, you get None instead of a KeyError:

title = item.get('volumeInfo', dict()).get('title')
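
Chaining .get for every field gets repetitive, and it returns None rather than NaN. A rough sketch of how the idea could be generalized, assuming NaN is the wanted placeholder: a small helper walks a path of nested keys and returns a default as soon as one is absent. The names MISSING, dig, COLUMNS and item_to_row are hypothetical, and the sample response is made up. math.nan is the same float value as np.nan, so pandas treats it as missing.

```python
import math

MISSING = math.nan  # same float value as np.nan


def dig(mapping, *keys, default=MISSING):
    """Follow keys through nested dicts; return default as soon as one is absent."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical mapping: DataFrame column name -> path inside one API item.
COLUMNS = {
    "Title": ("volumeInfo", "title"),
    "Sub Title": ("volumeInfo", "subtitle"),
    "Authors": ("volumeInfo", "authors"),
    "Publisher": ("volumeInfo", "publisher"),
    "Published Date": ("volumeInfo", "publishedDate"),
    "Description": ("volumeInfo", "description"),
    "Page Count": ("volumeInfo", "pageCount"),
    "Category": ("volumeInfo", "categories"),
    "imageS": ("volumeInfo", "imageLinks", "smallThumbnail"),
    "imageM": ("volumeInfo", "imageLinks", "thumbnail"),
    "Language": ("volumeInfo", "language"),
    "Text": ("searchInfo", "textSnippet"),
}


def item_to_row(item):
    """Turn one API item into a dict with every column present."""
    return {column: dig(item, *path) for column, path in COLUMNS.items()}


# Made-up response standing in for requests.get(...).json()
data = {"items": [{"volumeInfo": {
    "title": "Example",
    "imageLinks": {"thumbnail": "http://example.invalid/t.jpg"},
}}]}

# data.get("items", []) also covers a response that has no "items" key at all
results = [item_to_row(item) for item in data.get("items", [])]
```

Passing results to pd.DataFrame(results) should then give one row per item, with NaN in the cells whose fields were missing.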