I am running a large Python program for (Markowitz) portfolio optimization in finance. When I profile the code, 90% of the run time is spent calculating the portfolio return, which is done millions of times. What can I do to speed up my code? I have tried:
- vectorizing the calculation of returns: this made the code slower, from 1.5 ms to 3 ms
- using the function autojit from Numba to speed up the code: no change
See example below - any suggestions?
import numpy as np

def get_pf_returns(weights, asset_returns, horizon=60):
    '''
    Get portfolio returns: Calculates portfolio return for N simulations,
    assuming monthly rebalancing.

    Input
    -----
    weights: Portfolio weight for each asset
    asset_returns: Monthly returns for each asset, potentially many simulations
    horizon: Number of months (default 60)

    Returns
    -------
    Avg. annual portfolio return for each simulation at the end of 5 years
    '''
    pf = np.ones(asset_returns.shape[1])
    for t in np.arange(horizon):
        pf *= (1 + asset_returns[t, :, :].dot(weights))
    return pf ** (12.0 / horizon) - 1

def get_pf_returns2(weights, asset_returns):
    ''' Alternative: vectorized over all months at once '''
    return np.prod(1 + asset_returns.dot(weights), axis=0) ** (12.0 / 60) - 1

# Example
N, T, sims = 12, 60, 1000  # Settings
weights = np.random.rand(N)
weights *= 1 / np.sum(weights)  # Sample weights
asset_returns = np.random.randn(T, sims, N) / 100  # Sample returns

# Calculate portfolio risk/return
pf_returns = get_pf_returns(weights, asset_returns)
print(np.mean(pf_returns), np.std(pf_returns))

# Timer (IPython magic)
%timeit get_pf_returns(weights, asset_returns)
%timeit get_pf_returns2(weights, asset_returns)
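Outside IPython, where the %timeit magic is not available, the same comparison could be reproduced with the standard-library timeit module. The following is only a sketch: the two functions are restated in compact form so the block runs on its own, and the seed, the repeat counts, and the helper name `best_ms` are illustrative assumptions rather than part of the original code. It also checks that both versions return the same numbers before timing them.

```python
import timeit
import numpy as np

def get_pf_returns(weights, asset_returns, horizon=60):
    # Looped version: one (sims, N) x (N,) product per month
    pf = np.ones(asset_returns.shape[1])
    for t in range(horizon):
        pf *= 1 + asset_returns[t, :, :].dot(weights)
    return pf ** (12.0 / horizon) - 1

def get_pf_returns2(weights, asset_returns):
    # Vectorized version: one (T, sims, N) x (N,) product
    return np.prod(1 + asset_returns.dot(weights), axis=0) ** (12.0 / 60) - 1

N, T, sims = 12, 60, 1000
rng = np.random.default_rng(0)  # fixed seed, an arbitrary choice
weights = rng.random(N)
weights /= weights.sum()
asset_returns = rng.standard_normal((T, sims, N)) / 100

# Both versions should agree up to floating-point rounding
same_result = np.allclose(get_pf_returns(weights, asset_returns),
                          get_pf_returns2(weights, asset_returns))

def best_ms(func, number=20):
    # Hypothetical helper: best average time per call, in milliseconds
    runs = timeit.repeat(lambda: func(weights, asset_returns),
                         number=number, repeat=3)
    return 1000 * min(runs) / number

print("results agree:", same_result)
print("loop:       %.3f ms" % best_ms(get_pf_returns))
print("vectorized: %.3f ms" % best_ms(get_pf_returns2))
```

The absolute numbers will depend on the machine and the BLAS library NumPy is linked against, so they may not match the 1.5 ms and 3 ms quoted above.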
EDIT
Solution: Matmul was fastest on my machine:
def get_pf_returns(weights, asset_returns):
    return np.prod(1 + np.matmul(asset_returns, weights), axis=0) ** (12.0 / 60) - 1
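Note that this version hard-codes the 60-month horizon, whereas the original looped function took it as a parameter. If a different horizon is ever needed, one possible variant reads it from the array shape. This is a sketch only; the name `get_pf_returns_h` and the `horizon=None` default are assumptions for illustration, not part of the solution above.

```python
import numpy as np

def get_pf_returns_h(weights, asset_returns, horizon=None):
    # Hypothetical variant: use all available months unless told otherwise
    if horizon is None:
        horizon = asset_returns.shape[0]
    # (horizon, sims, N) x (N,) -> (horizon, sims) monthly growth factors
    growth = 1.0 + np.matmul(asset_returns[:horizon], weights)
    # Compound over months, then annualize
    return np.prod(growth, axis=0) ** (12.0 / horizon) - 1.0

# Small usage example with random sample data
rng = np.random.default_rng(1)
w = rng.random(12)
w /= w.sum()
r = rng.standard_normal((60, 1000, 12)) / 100
full = get_pf_returns_h(w, r)              # all 60 months
short = get_pf_returns_h(w, r, horizon=24)  # first 24 months only
```

The slice `asset_returns[:horizon]` is a view, so it should not add a copy of the input data.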
In my environment, `matmul` (`@`) has a modest time advantage over `einsum` and `dot`.

I think times are limited by the total number of calculations more than by the coding details. All of these pass the calculation to compiled numpy code. The fact that your original looped version is relatively fast probably has to do with the small number of loops (only 60) and memory management issues in the fuller `dot`.

And `numba` is probably not replacing the `dot` code.

So a tweak here or there might speed up your code by a factor of 2, but don't expect an order of magnitude improvement.
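The timing code for that comparison is not shown here, so the following is only a sketch of how the three variants might be written and compared. The function names `pf_matmul`, `pf_einsum`, and `pf_dot`, the einsum subscripts, the seed, and the repeat counts are all assumptions for illustration; the sample data follows the question's settings.

```python
import timeit
import numpy as np

def pf_matmul(weights, asset_returns):
    # (T, sims, N) @ (N,) -> (T, sims)
    return np.prod(1 + asset_returns @ weights, axis=0) ** (12.0 / 60) - 1

def pf_einsum(weights, asset_returns):
    # Same contraction over the asset axis, spelled out with einsum
    growth = 1 + np.einsum('tsn,n->ts', asset_returns, weights)
    return np.prod(growth, axis=0) ** (12.0 / 60) - 1

def pf_dot(weights, asset_returns):
    return np.prod(1 + asset_returns.dot(weights), axis=0) ** (12.0 / 60) - 1

N, T, sims = 12, 60, 1000
rng = np.random.default_rng(0)
weights = rng.random(N)
weights /= weights.sum()
asset_returns = rng.standard_normal((T, sims, N)) / 100

results = {f.__name__: f(weights, asset_returns)
           for f in (pf_matmul, pf_einsum, pf_dot)}

for f in (pf_matmul, pf_einsum, pf_dot):
    runs = timeit.repeat(lambda: f(weights, asset_returns), number=20, repeat=3)
    print("%-10s %.3f ms per call" % (f.__name__, 1000 * min(runs) / 20))
```

All three perform the same T * sims * N multiply-adds, which is consistent with the point above that the differences between them are likely to be modest.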