Frequency of elements in pairwise comparison of lists within a list in Python


I have a list of lists like this:

my_list_of_lists = [
    ['sparrow', 'sparrow', 'sparrow', 'junco', 'jay', 'robin'],
    ['sparrow', 'sparrow', 'junco', 'sparrow', 'robin', 'robin'],
    ['sparrow', 'sparrow', 'sparrow', 'sparrow', 'jay', 'robin'],
]

I would like to compare every pair of lists, position by position, like this:

#1 with 2
['sparrow','sparrow','sparrow','junco','jay','robin']
['sparrow','sparrow','junco', 'sparrow','robin','robin']

#1 with 3
['sparrow','sparrow','sparrow','junco','jay','robin']
['sparrow','sparrow','sparrow','sparrow','jay','robin']

#2 with 3
['sparrow','sparrow','junco', 'sparrow','robin','robin']
['sparrow','sparrow','sparrow','sparrow','jay','robin']

So the pairs for #1 with 2 are:

pairs = [('sparrow','sparrow'), ('sparrow','sparrow'), ('sparrow','junco'), ('junco','sparrow'), ('jay','robin'), ('robin','robin')]

I would like to get the counts and frequency of the pairs in each pairwise comparison:

pairs = [('sparrow','sparrow'), ('sparrow','sparrow'), ('sparrow','junco'), ('junco','sparrow'), ('jay','robin'), ('robin','robin')]

sparrowsparrow_counts = 2
juncosparrow_counts = 2
jayrobin_counts = 1
robinrobin_counts = 1

frequency_of_combos = {('sparrow', 'sparrow'): 0.333, ('sparrow', 'junco'): 0.333, ('jay', 'robin'): 0.167, ('robin', 'robin'): 0.167}

I've tried zipping, but I end up zipping all of the lists together (rather than two at a time) into tuples, and I'm stumped on the rest.

I think it's somewhat related to "How to calculate counts and frequencies for pairs in list of lists?", but I can't figure out how to apply this to my data.

There is 1 answer

Matt Blaha (best answer)

Zip the two lists, then filter out the pairs that don't match, and use collections.Counter to count them:

from collections import Counter

a = ['sparrow','sparrow','sparrow','junco','jay','robin']
b = ['sparrow','sparrow','junco', 'sparrow','robin','robin']
# zip(a, b) pairs the lists position by position; keep only the matching pairs.
c = Counter(pair for pair in zip(a, b) if pair[0] == pair[1])
print(c)


Counter({('sparrow', 'sparrow'): 2, ('robin', 'robin'): 1})

You seem to have the frequency part figured out, but that should clear up the use of zip and Counter.
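
For the full pairwise comparison described in the question, the same zip and Counter idea could be combined with itertools.combinations. What follows is only a rough sketch under a few assumptions: mismatched pairs are counted too (no filter), each pair is sorted so that ('sparrow', 'junco') and ('junco', 'sparrow') count as one combination (which is what the expected counts in the question suggest), and all lists have the same length. The helper name pair_frequencies and the results dictionary are made up for illustration.

```python
from collections import Counter
from itertools import combinations

my_list_of_lists = [
    ['sparrow', 'sparrow', 'sparrow', 'junco', 'jay', 'robin'],
    ['sparrow', 'sparrow', 'junco', 'sparrow', 'robin', 'robin'],
    ['sparrow', 'sparrow', 'sparrow', 'sparrow', 'jay', 'robin'],
]


def pair_frequencies(first, second):
    """Count position-wise pairs of two lists; return (counts, frequencies)."""
    # Assumption: order inside a pair does not matter, so each pair is sorted
    # before counting. Drop the sorted() call to keep ordered pairs instead.
    counts = Counter(tuple(sorted(pair)) for pair in zip(first, second))
    total = sum(counts.values())
    frequencies = {pair: n / total for pair, n in counts.items()}
    return counts, frequencies


# combinations(..., 2) yields each pair of lists once: 1-2, 1-3, 2-3.
# enumerate(..., start=1) keeps the 1-based list numbers used in the question.
results = {}
for (i, first), (j, second) in combinations(enumerate(my_list_of_lists, start=1), 2):
    results[(i, j)] = pair_frequencies(first, second)

for (i, j), (counts, frequencies) in results.items():
    print(f"#{i} with {j}")
    print(counts)
    print({pair: round(freq, 3) for pair, freq in frequencies.items()})
```

Each frequency is the pair's count divided by the number of positions compared, so the values for one comparison sum to 1. Note that zip stops at the shorter list, so lists of unequal length would be silently truncated.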