How can I find the best resemblance between one particular row and the rest of the rows in a dataframe?
Let me explain what I mean. Take a look at this data frame:
df <- structure(list(person = 1:5, var1 = c(1L, 5L, 2L, 2L, 5L), var2 = c(4L,
4L, 3L, 2L, 2L), var3 = c(5L, 4L, 4L, 3L, 1L)), .Names = c("person",
"var1", "var2", "var3"), class = "data.frame", row.names = c(NA,
-5L))
How can I find the best resemblance between person 1 (row 1) and the rest of the rows (persons) in the data frame? The output should be something like this: person 1 stays in row 1, and the remaining rows follow in order of decreasing resemblance. The similarity measure I want to use is cosine or Pearson. I tried to solve my problem with functions from the arules package, but they didn't fit my needs well.
Any ideas?
Another idea is to define the cosine function manually and apply it to your data frame, i.e.
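A minimal sketch of that approach in base R, assuming only `var1`, `var2` and `var3` should enter the comparison and `person` is just an identifier. The names `cosine_sim`, `m`, `sims`, `ord` and `result` are illustrative and not taken from the original answer:

```r
# Cosine similarity between two numeric vectors (hypothetical helper)
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Same data as in the question
df <- data.frame(
  person = 1:5,
  var1 = c(1L, 5L, 2L, 2L, 5L),
  var2 = c(4L, 4L, 3L, 2L, 2L),
  var3 = c(5L, 4L, 4L, 3L, 1L)
)

# Assumption: compare only the variable columns, not the person id
m <- as.matrix(df[, c("var1", "var2", "var3")])

# Similarity of every row to row 1
sims <- apply(m, 1, cosine_sim, y = m[1, ])

# Keep row 1 first, then the remaining rows by decreasing similarity
ord <- c(1, 1 + order(sims[-1], decreasing = TRUE))
result <- df[ord, ]
result
```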
which gives,