I just started learning Python and I am trying to count the frequency of character sequences in a segmentation (a text segmented into words). I have a problem with my count_seq function. In this function, the first loop is meant to fetch each segment of the segmentation (in_object), and the second loop counts the sequences. From what I have tried to change and understand, I think the problem lies in the for idx in range loop.
The problem is this: when I run the script I get the following error, which I do not understand: "TypeError: '<' not supported between instances of 'dict' and 'int'".

    def main():
        seq_length = 3
        freq_dict = count_seq(in_object, seq_length)
        print_freq(freq_dict)

    def count_seq(segmentation, seq_length):
        freq_dict = dict()
        for idx in range(len(segmentation)):
            freq_dict[idx] = dict()
            segments = segmentation[idx].get_content()
            for pos in range(len(segments)):
                char = segments[pos:pos+seq_length]
                if len(char) < seq_length:
                    continue
                freq_dict[char] = freq_dict.get(char, 0)+1
        return freq_dict

    def print_freq(freq_dict):
        """Reuse the contents of the function from exercise 2..."""
        for key in sorted(freq_dict, key=freq_dict.get, reverse=True):
            print("%s:%s" %(key, freq_dict[key]))

    if __name__ == "builtins":
        if in_objects:
            main()

This error happens when you call sorted. Items are sorted by comparing them against each other. Some of the values in your freq_dict are dicts and some are ints, which cannot be compared with each other (which is less: 7 or a dictionary? It doesn't make sense).

If you want to sort in this way, you need to make sure all values in freq_dict are of comparable types.

In this block you set a bunch of keys to an empty dict:

    for idx in range(len(segmentation)):
        freq_dict[idx] = dict()

Then in your next loop, you set another set of keys to ints.

What is your goal for freq_dict to be at the end?
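
If the goal is a single flat count of sequences across all segments, something like the following might work. This is only a sketch under assumptions: it takes an iterable of plain strings instead of your segmentation object (with your data you would pass the result of get_content() for each segment), and it swaps the plain dict for collections.Counter, which is my choice and not something from your code. It also reproduces the mixed-type sort failure so you can see where the TypeError comes from.

```python
from collections import Counter


def count_seq(segments, seq_length):
    """Count every character sequence of length seq_length across all segments.

    Assumption: `segments` is an iterable of plain strings. With the original
    segmentation object, pass seg.get_content() for each segment instead.
    """
    freq = Counter()
    for text in segments:
        # Stop early enough that every slice has exactly seq_length characters,
        # so no length check is needed inside the loop.
        for pos in range(len(text) - seq_length + 1):
            freq[text[pos:pos + seq_length]] += 1
    return freq


# Reproduce the original problem: values of mixed types cannot be ordered.
mixed = {0: dict(), "abc": 2}
try:
    sorted(mixed, key=mixed.get, reverse=True)
    sort_failed = False
except TypeError:
    sort_failed = True

# With only int values, sorting by frequency works.
freq_dict = count_seq(["banana", "bandana"], 3)
ranked = sorted(freq_dict, key=freq_dict.get, reverse=True)
for key in ranked:
    print("%s:%s" % (key, freq_dict[key]))
```

The key change is that nothing is stored under the integer index idx any more, so every value in the result is an int and sorted can compare them. If you actually want one count per segment, you would need a dict of dicts instead, and then sort each inner dict separately.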