Excel - calculate quartile and median to generate box plot given a list of values and counts

1k views Asked by At

I have a rather simple task to achieve but I am not sure how to do that in excel.

I have a list of values and number of counts of each value (my variable is discrete). I am trying to calculate 1st and 3rd quartiles of the variable given the counts I have. Overall dataset will be too large to fit into excel in raw format so I am using matrix of value counts.

Example:

Value | Count
1 | 100000
2 | 300000
3 | 350000
4 | 100000
5 | 50000
6 | 1000
7 | 10

I need to generate 1st, 3rd quartile and median out of this dataset, but as far as I see excel's quartile accepts only raw values, not values and their counts.

2

There are 2 answers

0
XOR LX On BEST ANSWER

Assuming the table as you give it is in A1:B8 (with headers in row 1), for the median you can use the following array formula**:

=SUM(LOOKUP(INT(SUM(B2:B8)/2+{0.5,1}),MMULT(N(ROW(B2:B8)>=TRANSPOSE(ROW(B2:B8))),N(+B1:B7))+1,A2:A8))/2

Note the one offset range (B1:B7) here, where B1 is assumed, as mentioned, to contain a (text) header.

Edit: Re quartiles, that would depend on whether you are wishing to calculate these as per the Excel QUARTILE function or as per their 'standard' definition, i.e. as medians of the upper/lower half of the data (see here).

Regards

**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).

0
Pavel_V On

I dont think this is possible only using formulas. You can easily create formulas to have the separate lists to avoid problem of having more total lines than excel rows, but I dont know about any way to append these lists into one big list inside functions. MAybe someone else knows a way, but in the meantime there is another way in Excel:

What you can use is power query and power pivot. You start with the table that you have and add it as to Queries (Data - "Get & Transform Data" section - "From Table/Range". I have Excel 2016 so if you have a different version the path is most probably different or in older versions these tools are not available)

Now edit the query and add another column using List.Repeat to create your full dataset, expand the values and load to data model (when you are in Query Editor go "Home" - "Close & Load", click the small arrow down and select . "Close & Load To..." and select "Only Create Connection" and check "Add this data to the Data Model")

Inside the "Power Pivot" tab you can add a new Measure to calculate the quartile for this table. I used

quartile:=PERCENTILE.EXC(AllData[AllValues];0,25)

as I dont think there is really special quartile function.