I'm trying to select some elements in a Python list. The list represents a distribution of the sizes of some other elements, so it contains many repeated values.
After I find the average value of this list, I want to pick the elements whose value lies between a lower bound and an upper bound around that average. I can do that easily, but it selects too many elements (mainly because the distribution I have to work with is pretty much homogeneous). So I would like to be able to choose the bounds, but also limit the spread of the search to something like 5 elements below the average and 5 elements above.
I'll add my code (it is super simple).
    avg_lists = sum_lists / len(lists)
    num_list = len(lists)
    # Window size: about a tenth of the number of lists, forced to be even.
    if int(num_list / 10) % 2 == 0:
        window_size = int(num_list / 10)
    else:
        window_size = int(num_list / 10) - 1
    out_file = open('chosenLists', 'w+')
    chosen_lists = []
    for lst in lists:
        # Keep the lists whose length is within window_size of the average.
        if avg_lists - window_size <= len(lst) <= avg_lists + window_size:
            chosen_lists.append(lst)
            out_file.write("%s\n" % lst)
    out_file.close()
If you are allowed to use the median instead of the average, a simple solution is a function select that returns the n elements closest to the median.
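The code for select is not shown in this answer, so the following is only a minimal sketch of how such a function might look, not the original implementation. It assumes "closest to the median" means closest by position in sorted order, so the result is the window of n elements centred on the middle of the sorted data. The parameter names (items, n, key) are illustrative.

```python
def select(items, n, key=None):
    """Return the n items nearest the middle of the sorted data.

    Sketch only: "closest to the median" is interpreted by rank,
    i.e. a window of n items centred on the middle of the sorted list.
    """
    ordered = sorted(items, key=key)
    # Clamp n so that the window never exceeds the data.
    n = max(0, min(n, len(ordered)))
    # First index of a window of n items centred in the sorted list.
    start = (len(ordered) - n) // 2
    return ordered[start:start + n]


sizes = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
print(select(sizes, 4))  # [3, 3, 3, 4]

# For a list of lists compared by length, as in the question:
lists = [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
print(select(lists, 3, key=len))  # [[1, 2], [1, 2, 3], [1, 2, 3, 4]]
```

Because the window is taken by position rather than by value, it always returns exactly n elements (or all of them if there are fewer), which caps the spread even when many values are repeated. With n = 10 it would give roughly 5 elements below the median and 5 above.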