openpyxl returning empty cell values for formula series

2.6k views Asked by At

Vitals: python 3.4.3 | openpyxl 2.2.3 | Excel 2013

Like everyone knows you can use Excel's fill handle to quickly set up a number series down a column [1,2,3,4,5,6,etc] and this works with formula as well [=sum(B1,C1), =sum(B2,C2), =sum(B3,C3), etc]. However when I attempt to use openpyxl to nab formula added using the fill handle in Excel, those cells are reported to be empty.

It didn't work with my original code so I created a simple script containing this:

for row in ws.rows:
  for cell in row:
    print('my value is:',cell.value)

And ran it against this tiny (aww) Excel file. I used the fill handle on A1 to populate A2-A5 with formulas. (I'm just showing the formula for clarity, what I actually see in column A are the results.)

#      A       B   C
1 =SUM(B1:C1)  1   2
2 =SUM(B2:C2)  10  20
3 =SUM(B3:C3)  20  25
4 =SUM(B4:C4)  90  42
5 =SUM(B5:C5)  64  4

This was the output in console:

my value is: =SUM(B1:C1)
my value is: 1
my value is: 2
my value is: =SUM(B2:C2)
my value is: 10
my value is: 20
my value is: =
my value is: 20
my value is: 25
my value is: =
my value is: 90
my value is: 42
my value is: =
my value is: 64
my value is: 4

openpyxl reads the formula in A1 and A2 but not A3-A5. I tried this snippet against a couple of simple number series and openpyxl didn't have a problem reading those cells...the issue just seems to be with formulas. Am I missing something obvious?

1

There are 1 answers

6
Felipe On BEST ANSWER

Update 3/23/16

This issue has been resolved in recent versions of openpyxl (2.3, and beyond, I believe) PR request


This is a known shortcoming of openpyxl (bug report). The short answer is that when you drag a formula down or across, Excel uses a "shared formula," which cannot be parsed by openpyxl. If you really need this functionality, you may want to try xlrd.


Edit:

This was bugging me, so here's a very hackish workaround (Tested on Python 2.7). First you have to get Robin Macharg's Excel formula parser from here. Save it somewhere in your Python path as xlparse.py. Now, run the code from this gist, and you should be able to use the get_formula function to get the formula for any cell. In most cases, the function simply returns cell.value, but whenever the function encounters a shared formula, it gives the correct value.

Note: The monkeypatch to openpyxl.reader.worksheet.Worksheet is so that openpyxl.load_workbook uses our patched class. Make sure to run this code before calling load_workbook. If you are creating the workbook a different way, you'll have to patch Worksheet in the appropriate place.