I have a spreadsheet with the below structure (Data starts from Column B. Col A is empty)
A B C D
Name city salary
Jennifer Boston 100
Andrew Pittsburgh 1000
Sarah LA 100
Grand Total 1200
I need to filter out the row with the grand total before loading it into the database.
For this, I'm reading the Grand Total as:
import xlrd
import pymssql
#open workbook
book = xlrd.open_workbook("C:\_Workspace\Test\MM.xls")
print( "The number of worksheets is", book.nsheets)
#for each row in xls file loop
#skip last row
last_row = curr_sheet.nrows
print(last_row)
print(curr_sheet.ncols)
skip_val = curr_sheet.cell(last_row,1).value
print( skip_val)
if skip_val == "Grand Total":
last_row = last_row - 1
else:
last_row = last_row
for rx in range(last_row):
print( curr_sheet.row(rx))
However, I'm getting the below error:
Traceback (most recent call last): File "C:\_Workspace\Test\xldb.py", line 26, in <module> skip_val = curr_sheet.cell(last_row,1).value File "c:\Python34\lib\site-packages\xlrd-0.9.3- >py3.4.egg\xlrd\sheet.py", line 399, in cell self._cell_types[rowx][colx], IndexError: list index out of range
I'm not able to figure out what is wrong with the syntax above. Hoping someone here can spot why its throwing the error.
Thanks much in advance, Bee
I think your problem is not accounting for the zero-based index.
last_row = curr_sheet.nrows
returns the number of rows in the worksheet, so accessing the last row requires:The first element in Python is indexed by 0, so the first element of a list
mylist
would bemylist[0]
. The last element is notmylist[len(mylist)]
, instead it'smylist[len(mylist)-1]
, which should be written asmylist[-1]
. You can therefore write the following: