I have the following approach for matching two strings.
library(stringdist)
qgrams('perimetrico','perimetrico peri',q=2)
   pe ri tr er im me o  et ic co  p
V1  1  2  1  1  1  1  0  1  1  1  0
V2  2  3  1  2  1  1  1  1  1  1  1
As far as I know, this is the standard way to count the number of occurrences of each q-gram.
stringdist('perimetrico','perimetrico peri', method='qgram', q=2)
5
But I am not comfortable with that result. Instead, I want to score the first string (row V1) by marking each q-gram as 1 if it occurs at least once and 0 otherwise, like this:
pe=1
ri=1
tr=1
er=1
im=1
me=1
o=0
et=1
ic=1
co=1
p=0
So the final result would be 9/11 ≈ 82% match.
Use apply (for each row) to compute the proportion of entries that are 0 and subtract that proportion from 1. You can then round the result (for 0.82), or multiply it by 100 and round (for 82 percent).