I have the following 606 x 274 table: see here
Goal:
For every date calculate lower and upper 20% percentiles and, based on the outcome, create 2 new variables, e.g. 'L' for "lower" and 'U' for "upper", which contain the ticker names as seen in the header of the table.
Step by step:
% Replace NaNs with 'empty' for the percentile calculation (error: input to be cell array)
T(cellfun(@isnan,T)) = {[]}
% Change date format
T.Date=[datetime(T.Date, 'InputFormat', 'eee dd-MMM-yyyy')];
% Take row by row
for row=1:606
% If Value is in upper 20% percentile create new variable 'U' that contains the according ticker names.
% If Value is in lower 20% percentile create new variable 'L' that contains the according ticker names.
end;
So far, experimenting with 'prctile' only yielded a numeric outcome, for a single column. Example:
Y = prctile(T.A2AIM,20,2);
Thanks for your help and ideas!
Generally speaking, if you have an array of numbers:
percentiles can be computed by first sorting the array and then attempting to access the index supplied in the percentile. So
prctile(a,20)
's functionality could in principle be replaced byHowever,
prctile
does a bit more of fancy magic interpolating the input vector to get a value that is less affected by array size. However, you can use the idea above to find the percentile splitting columns. If you chose to do it like I said above, what you want to do to get the headers that correspond to the 20% and 80% percentiles is to loop through the rows, remove the NaNs, get the indeces of the sort on the remaining values and get the particular index of the 20% or 80% percentile. Regrettably, I have an old version of Matlab that does not support tables so I couldn't verify if the header names are returned correctly, but the idea should be clear.If you want to use matlab's
prctile
, you would have to find the returned value's index doing something like this: