text mining with r library stringdist

Question

184 views Asked by lolo At 07 September 2017 at 21:37

I have the following code for comparing two strings.

library(stringdist)

qgrams('perimetrico','perimetrico peri',q=2)

   pe ri tr er im me o  et ic co  p
V1  1  2  1  1  1  1  0  1  1  1  0
V2  2  3  1  2  1  1  1  1  1  1  1

As far as I understand, this is the formal implementation for counting the number of occurrences:

stringdist('perimetrico','perimetrico peri', method='qgram', q=2)

5
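
For reference, the 5 is the sum of the absolute differences between the two rows of the bigram table above. A minimal sketch in base R, with the counts typed in by hand from that output rather than recomputed (the variable names `prof` and `qgram_dist` are only illustrative):

```r
# Bigram counts copied by hand from the qgrams() output above
grams <- c("pe", "ri", "tr", "er", "im", "me", "o ", "et", "ic", "co", " p")
prof <- rbind(
  V1 = c(1, 2, 1, 1, 1, 1, 0, 1, 1, 1, 0),
  V2 = c(2, 3, 1, 2, 1, 1, 1, 1, 1, 1, 1)
)
colnames(prof) <- grams

# q-gram distance: total absolute difference in counts between the two strings
qgram_dist <- sum(abs(prof["V1", ] - prof["V2", ]))
qgram_dist
```

Only "pe", "ri", "er", "o " and " p" differ between the rows, each by one, which gives 5.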

But I am not comfortable with that solution. That is why I want to count over the first result instead, marking each bigram as 1 if it occurs in the first string and 0 otherwise, in the following way:

pe=1
ri=1
tr=1
er=1
im=1
me=1
o=0
et=1
ic=1
co=1
p=0

So, the final result would be 9/11 = 82% match

There is 1 answer

**pogibas** · Accepted Answer · 2017-09-07T21:40:46+00:00

Use apply (over each row) to compute the proportion of bigrams whose count is 0, and subtract that proportion from 1.

library(stringdist)
foo <- qgrams('perimetrico','perimetrico peri',q=2)
apply(foo, 1, function(x) 1 - mean(x == 0))

       V1        V2 
0.8181818 1.0000000

Or you can round (for 0.82) and multiply by 100 (for 82 percent):

apply(foo, 1, function(x) round(1 - mean(x == 0), 2) * 100)

 V1  V2 
 82 100
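
The same calculation can be wrapped in a small helper. This is a minimal sketch, not part of stringdist: `match_share` is a hypothetical name, and the counts are typed in by hand from the output in the question so the block runs in base R on its own.

```r
# Hypothetical helper: for each row of a q-gram count matrix, return the
# share of columns (q-grams) with a non-zero count.
match_share <- function(profile) {
  rowMeans(profile > 0)
}

# Bigram counts copied by hand from the qgrams() output in the question
prof <- rbind(
  V1 = c(1, 2, 1, 1, 1, 1, 0, 1, 1, 1, 0),
  V2 = c(2, 3, 1, 2, 1, 1, 1, 1, 1, 1, 1)
)

shares <- match_share(prof)
round(shares * 100)
```

With stringdist loaded, the result of qgrams('perimetrico','perimetrico peri',q=2) can be passed to the helper directly. Note that the measure is not symmetric: V2 comes out at 100 because every bigram in the table occurs in the longer string, while V1 lacks the two bigrams that contain the space.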
