Finding Most Common Word In A Tally/Ledger System

167 views Asked by At

I currently use the following array formula to find the most common word or number in a range, ignoring any blank cells:

{=(INDEX(D1:D10,MODE(IF((D1:D10<>"")*ISNA(MATCH(D1:D10,$A$1:$A1,0)),MATCH(D1:D10,D1:D10,0)))))}

I am now looking to do something slightly different. I still want to find the most common word or number in a range, however I now have 2 lists: the first is a list of 'positive' words/numbers and the second is a list of 'negative' words/numbers.

To illustrate using an example: the colour green appears in the 'positive' list 4 times and the colour blue appears twice in the 'positive list', but green appears 3 times in the 'negative' list and blue does not appear at all in the 'negative' list. Using the above formula on the first list would return green as the most common word. However I now want it to take into account that green is not the most common word given the combined lists (i.e. 4 positives - 3 negatives = 1 green, and 2 positives - 0 negatives = 2 blue).

In the below image, using the formula under each list shows green to be the most common word. I would like to combine these lists and cancel out any instances where the colour appears on both instances - so 3 of the greens on the positive list would be cancelled out with the 3 greens on the negative list, leaving only one left.

In essence, I suppose I am trying to create a tally or ledger of some kind where rather than numbers that add or subtract I have words whose frequency is added or subtracted.

enter image description here

Thanks for the help, and apologies if I haven't been too clear in the task!

2

There are 2 answers

8
XOR LX On BEST ANSWER

This should work:

=IF(SUMPRODUCT((MMULT(COUNTIF(OFFSET(B2:B11,,{0,1}),B2:B11),{1;-1})=MAX(MMULT(COUNTIF(OFFSET(B2:B11,,{0,1}),B2:B11),{1;-1})))/COUNTIF(B2:B11,B2:B11&""))>1,"No Favourite",INDEX(B2:B11,MATCH(MAX(MMULT(COUNTIF(OFFSET(B2:B11,,{0,1}),B2:B11),{1;-1})),MMULT(COUNTIF(OFFSET(B2:B11,,{0,1}),B2:B11),{1;-1}),0)))

And for non-contiguous, dynamically-defined ranges, assumed to be stored as Defined Names Positive and Negative, array formula**:

=IF(SUM((COUNTIF(Positive,Positive)-COUNTIF(Negative,Positive)=MAX(COUNTIF(Positive,Positive)-COUNTIF(Negative,Positive)))/COUNTIF(Positive,Positive&""))>1,"No Favourite",INDEX(Positive,MATCH(MAX(COUNTIF(Positive,Positive)-COUNTIF(Negative,Positive)),COUNTIF(Positive,Positive)-COUNTIF(Negative,Positive),0)))

Regards

**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).

8
Gary's Student On

First list your candidates in column D starting at D2

Then in E2 enter:

=COUNTIF(B$2:B$12,D2)-COUNTIF(C$2:C$12,D2)

and copy down.

Finally in F2 enter:

=INDEX(D:D,MATCH(MAX(E:E),E:E,0))

With your data:

demo