Say I have a pandas Series, and I want to take the mean of every set of 8 rows. I don't know the size of the Series in advance, and the index may not be 0-based. I currently have the following:
import numpy as np
import pandas as pd

N = 8
s = pd.Series(np.random.random(50 * N))
n_sets = s.shape[0] // N
# Start and stop positions of each block of N rows
split = ([m * N for m in range(n_sets)],
         [m * N for m in range(1, n_sets + 1)])
out_array = np.zeros(n_sets)
for i, (a, b) in enumerate(zip(*split)):
    out_array[i] = s.loc[s.index[a:b]].mean()
Is there a shorter way to do this?
You could try groupby, slicing the index into blocks of N (see the linked explanation of the slicing), and then use pd.Series.mean().
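A minimal sketch of that idea, since the answer's own snippet is not shown here. It assumes the groups should be built from row positions rather than index labels, so it also works when the index is not 0-based; the name group_ids and the shifted index are illustrative, not from the original answer.

```python
import numpy as np
import pandas as pd

N = 8
# Index deliberately starts at 100 to mimic a non-0-based index (assumption).
s = pd.Series(np.random.random(50 * N), index=np.arange(100, 100 + 50 * N))

# Positional group labels: 0 for the first N rows, 1 for the next N, and so on.
group_ids = np.arange(len(s)) // N

# One mean per block of N consecutive rows.
out = s.groupby(group_ids).mean()
```

Note that, unlike the loop in the question, this also returns a mean for a trailing partial block when the length is not a multiple of N; drop the last element if you don't want it.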
The difference is due to the number of decimals shown in each format. If you want only 8 decimals, as in the original out_array, you could map the elements with the round function.
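Something along these lines should do the rounding; it is a sketch, not necessarily the exact snippet from the answer. The names out, rounded and rounded_alt are illustrative, and the grouping by position is an assumption.

```python
import numpy as np
import pandas as pd

N = 8
s = pd.Series(np.random.random(50 * N))
out = s.groupby(np.arange(len(s)) // N).mean()

# Apply the built-in round to each group mean, keeping 8 decimals.
rounded = out.map(lambda x: round(x, 8))

# Alternative: pandas' own vectorised rounding.
rounded_alt = out.round(8)
```

Keep in mind that both variants change the stored values, not just how they are printed.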