I have this script that use a hash table:
#!/usr/bin/env perl
use strict; use warnings;
my $hash = {
'cat' => {
"félin" => '0.500000',
'chat' => '0.600000',
'chatterie' => '0.300000'
'chien' => '0.01000'
},
'rabbit' => {
'lapin' => '0.600000'
},
'canteen' => {
"ménagère" => '0.400000',
'cantine' => '0.600000'
}
};
my $text = "I love my cat and my rabbit canteen !\n";
foreach my $word (split "\s+", $text) {
print $word;
exists $hash->{$word}
and print "[" . join(";", keys %{ $hash->{$word} }) . "]";
print " ";
}
For now, I have this output:
I love my cat[chat;félin;chatterie;chien] and my rabbit[lapin] canteen[cantine;ménagère] !
I need to have the nbest key value according to the frequencies (stored in my hash). For example, I want to have the 3 best translations according to the frequencies like this:
I love my cat[chat;félin;chatterie] and my rabbit[lapin] canteen[cantine;ménagère] !
How can I change my code to take into account the frequencies of each values and also to print the nbest values ?
Thanks for your help.
The tidiest way to do this is to write a subroutine that returns the N most frequent translations for a given word. I have written
best_n
in the program below to do that. It usesrev_nsort_by
fromList::UtilsBy
to do the sort succinctly. It isn't a core module, and so may well need to be installed.I have also used an executable substitution to modify the string in-place.
output