Pandas: use of aggregate with a MultiIndex

I have a question about the correct use of agg in pandas. The specific problem I am working on is in the field of finance and, more specifically, is to calculate a liquidity measure from the full limit order book.

My data contain the ask side of the order book (which represents how many shares people want to sell at a particular moment and at which price) and I want to calculate the hypothetical price for buying 50 shares at a specific moment in time. Assume for example that the order book for stock X at 9am looks like this:

import pandas as pd

example_data = pd.DataFrame({
    'price': [100.023, 100.031, 100.039, 100.109, 100.219],
    'avail_shares': [40, 1, 20, 23, 15],
    'midpoint': [99.996, 99.996, 99.996, 99.996, 99.996],
})

where price is the price at which shares are offered, avail_shares is the number of shares available at each price, and midpoint is the average of the best ask and best bid prices in the order book. To get a liquidity measure that takes into account that a large order can hit multiple price levels at once (i.e. 'walk the book'), I define the following cost-to-trade (ctt) function:

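def ctt_ask(dfrm, level=50):
    # Running total of shares available up to and including each price level.
    dfrm['cumshares'] = dfrm['avail_shares'].cumsum()
    dfrm['indicator'] = 0.0
    # Rows whose cumulative shares stay below the order size are weighted by cumshares.
    below = dfrm.cumshares < level
    dfrm.loc[below, 'indicator'] = dfrm.loc[below, 'cumshares']
    # The row where the order is completed is weighted by the shares still missing.
    partial = (dfrm.cumshares > level) & (dfrm.cumshares.shift(1) < level)
    dfrm.loc[partial, 'indicator'] = level - dfrm.cumshares.shift(1)
    liquidity_measure = ((dfrm.price - dfrm.midpoint) * dfrm.indicator).sum()
    return liquidity_measure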
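As a hand check of what the function computes on the example book with level=50, the arithmetic can be sketched as below. The variable names (spreads, weights, total) are only for illustration. The first two rows are weighted by their cumulative shares (40 and 41), exactly as the function is written, and the third row by the 9 shares still missing.

```python
# Distance of each of the first three ask prices from the midpoint.
spreads = [100.023 - 99.996, 100.031 - 99.996, 100.039 - 99.996]

# Weights as ctt_ask assigns them: cumshares (40, 41) for the rows below
# the order size, then the remainder 50 - 41 = 9 for the row that fills it.
weights = [40, 41, 50 - 41]

total = sum(s * w for s, w in zip(spreads, weights))
print(round(total, 2))
```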

The function works as expected on the example above (ctt_ask(example_data) yields 2.90), but my real dataset covers several stocks and many date-times, so it has a MultiIndex. When I use groupby and agg to apply the function to every stock/date-time combination (full_book_ask.groupby(level=[0,1]).agg(ctt_ask)), I get KeyError: 'avail_shares'. This is strange, because my actual dataset does have a column named avail_shares. I have also tried apply instead of agg, but that raises Exception: cannot handle a non-unique multi-index!. I can't figure out what I'm doing wrong here. Any input would be much appreciated!
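One likely explanation for the KeyError, offered as a sketch rather than a confirmed diagnosis: with this kind of grouping, agg generally calls the function once per column, so inside the function dfrm is a single Series and dfrm['avail_shares'] is not a column lookup. apply, by contrast, passes each group's whole sub-DataFrame. The example below is a minimal illustration under stated assumptions: the frame book, the index names stock, time and level, and the helper ctt_ask_group are all hypothetical stand-ins for the real dataset. The helper keeps the same weighting as ctt_ask but uses np.where instead of index-aligned assignment, which may also sidestep problems with duplicate index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical order book: two stocks at one timestamp, five ask levels each.
book = pd.DataFrame(
    {
        'price': [100.023, 100.031, 100.039, 100.109, 100.219] * 2,
        'avail_shares': [40, 1, 20, 23, 15] * 2,
        'midpoint': [99.996] * 10,
    },
    index=pd.MultiIndex.from_product(
        [['X', 'Y'], ['09:00'], range(5)],
        names=['stock', 'time', 'level'],  # assumed names
    ),
)


def ctt_ask_group(group, level=50):
    """Same weighting as ctt_ask, without modifying the group in place."""
    cum = group['avail_shares'].cumsum()
    prev = cum.shift(1)
    # Rows below the order size are weighted by cumulative shares.
    indicator = np.where(cum < level, cum, 0.0)
    # The row that completes the order is weighted by the shares still missing.
    indicator = np.where((cum > level) & (prev < level), level - prev, indicator)
    return ((group['price'] - group['midpoint']) * indicator).sum()


# apply receives each (stock, time) group as a DataFrame with all columns.
per_group = book.groupby(level=['stock', 'time']).apply(ctt_ask_group)
print(per_group)
```

The result is a Series with one cost-to-trade value per (stock, time) pair; for this made-up data both groups repeat the example book, so both values match the single-frame result.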
