I'm using isbnlib.meta, which pulls metadata (book title, author, year, publisher, etc.) when you enter an ISBN. I have a dataframe with 482,000 ISBNs (column title: isbn13). When I run the function, I get an error like NotValidISBNError, which stops the code in its tracks. What I want is for the code to simply skip any row that raises an error and move on to the next one.
Here is my code now:
list_df[0]['publisher_isbnlib'] = list_df[0]['isbn13'].apply(lambda x: isbnlib.meta(x).get('Publisher', None))
list_df[0]['yearpublished_isbnlib'] = list_df[0]['isbn13'].apply(lambda x: isbnlib.meta(x).get('Year', None))
#list_df[0]['language_isbnlib'] = list_df[0]['isbn13'].apply(lambda x: isbnlib.meta(x).get('Language', None))
list_df[0]
list_df[0] is the first 20,000 rows, since I'm trying to chunk through the dataframe. I've just manually entered this code 24 times to handle each chunk.
I attempted try: and except:, but all that happens is that the code stops and no metadata is returned.
Traceback:
---------------------------------------------------------------------------
NotValidISBNError Traceback (most recent call last)
<ipython-input-39-a06c45d36355> in <module>
----> 1 df['meta'] = df.isbn.apply(isbnlib.meta)
e:\Anaconda3\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
4198 else:
4199 values = self.astype(object)._values
-> 4200 mapped = lib.map_infer(values, f, convert=convert_dtype)
4201
4202 if len(mapped) and isinstance(mapped[0], Series):
pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
e:\Anaconda3\lib\site-packages\isbnlib\_ext.py in meta(isbn, service)
23 def meta(isbn, service='default'):
24 """Get metadata from Google Books ('goob'), Open Library ('openl'), ..."""
---> 25 return query(isbn, service) if isbn else {}
26
27
e:\Anaconda3\lib\site-packages\isbnlib\dev\_decorators.py in memoized_func(*args, **kwargs)
22 return cch[key]
23 else:
---> 24 value = func(*args, **kwargs)
25 if value:
26 cch[key] = value
e:\Anaconda3\lib\site-packages\isbnlib\_metadata.py in query(isbn, service)
18 if not ean:
19 LOGGER.critical('%s is not a valid ISBN', isbn)
---> 20 raise NotValidISBNError(isbn)
21 isbn = ean
22 # only import when needed
NotValidISBNError: (abc) is not a valid ISBN
Retrieve all of the metadata for each ISBN as a dict, then extract the specific values from that dict as a separate operation.
A try-except block is used to capture the error from invalid isbn values.
When the lookup fails, an empty dict, {}, is returned, because pd.json_normalize won't work with NaN or None.
pd.json_normalize is used to expand the dict returned from .meta.
Use pandas.DataFrame.rename to rename columns, and pandas.DataFrame.drop to delete columns.
For columns that contain lists, such as the 'Authors' column, use df_meta = df_meta.explode('Authors'); if there is more than one author, a new row will be created for each additional author in the list.
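The steps above can be sketched roughly as follows. This is a minimal offline sketch under stated assumptions, not a definitive implementation: fake_meta and its sample records are hypothetical stand-ins for isbnlib.meta so the example runs without network access, safe_meta is a helper name introduced here for illustration, and the broad except Exception is an assumption that could be narrowed to the NotValidISBNError shown in the traceback.

```python
import pandas as pd

# Hypothetical sample records, shaped like the dict that isbnlib.meta returns.
_FAKE_DB = {
    "9780000000001": {"Title": "Book A", "Authors": ["Ann", "Bob"],
                      "Publisher": "Pub A", "Year": "2001"},
    "9780000000002": {"Title": "Book B", "Authors": ["Cy"],
                      "Publisher": "Pub B", "Year": "2002"},
}


def fake_meta(isbn):
    """Hypothetical stand-in for isbnlib.meta: raises on unknown ISBNs."""
    if isbn not in _FAKE_DB:
        raise ValueError(f"({isbn}) is not a valid ISBN")
    return _FAKE_DB[isbn]


def safe_meta(isbn, lookup=fake_meta):
    """Return the metadata dict for one ISBN, or {} if the lookup fails.

    In real use, pass lookup=isbnlib.meta (assumption: isbnlib is installed).
    """
    try:
        return lookup(isbn) or {}
    except Exception:  # assumption: skip the row on any lookup error
        return {}


df = pd.DataFrame({"isbn13": ["9780000000001", "abc", "9780000000002"]})

# One lookup per row; invalid ISBNs yield {} instead of stopping the run.
df["meta"] = df["isbn13"].apply(safe_meta)

# Expand the dicts into columns; rows with {} become all-NaN rows.
df_meta = pd.json_normalize(df["meta"].tolist())

# Rename to the column names used in the question.
df_meta = df_meta.rename(columns={"Publisher": "publisher_isbnlib",
                                  "Year": "yearpublished_isbnlib"})

# Drop the raw dict column and attach the expanded columns by row position.
result = df.drop(columns="meta").join(df_meta)

# Optional: one row per author.
exploded = result.explode("Authors")
```

Because the lookup runs once per row and the fields are pulled out afterwards, the same pattern could be applied to each chunk in list_df with a for loop instead of pasting the code once per chunk.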