My TeamName column does not contain unique team names, so I need to identify the actual teams from the RaterID and RateeID columns. The data is dyadic and was collected round-robin style within each team: every row is one rater/ratee pair, so if an ID in the RaterID column also appears in the RateeID column, both people are on the same team. That overlap is the only way to tell teams apart. My idea was to create a new column that combines RaterID and RateeID and then derive a value from it (maybe using the rank function?) that distinguishes the teams. Since the data contains over 3000 teams, I thought I would first group_by TeamName, then examine the dyads within each group for common members, and create a new column that I could later paste together with TeamName to make a unique team ID. This is my first question on here, so hopefully I am articulating this well.
I am new to R and have no idea what to try.
df <- data.frame(
  RaterID  = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 5, 5, 6, 6, 8, 8, 9, 9, 10, 10),
  RateeID  = c(2, 3, 4, 1, 3, 4, 1, 2, 4, 6, 7, 5, 7, 9, 10, 8, 10, 8, 9),
  TeamName = c('A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A',
               'B', 'B', 'B', 'B', 'B', 'B')
)
# Group by TeamName to make computing the unique team ID easier on a large data set
library(dplyr)
df %>% group_by(TeamName)
Here is where I am lost. How do I write a function that, within each group (i.e. each TeamName), checks whether a RaterID also occurs in RateeID and, if so, gives those rows the same identifier? Perhaps with the rank function? I could then combine that identifier with TeamName to get a unique team ID.
My desired result is:
RaterID  RateeID  TeamName  UniqueTeamID
1        2        A         A1
1        3        A         A1
1        4        A         A1
2        1        A         A1
2        3        A         A1
2        4        A         A1
3        1        A         A1
3        2        A         A1
3        4        A         A1
5        6        A         A2
5        7        A         A2
6        5        A         A2
6        7        A         A2
8        9        B         B1
8        10       B         B1
9        8        B         B1
9        10       B         B1
10       8        B         B1
10       9        B         B1
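To make the requirement concrete: what I am describing seems to be a "connected groups" problem, where two IDs belong to the same team if they are linked by a chain of dyads within the same TeamName. Below is a rough base R sketch of that idea, not a tested solution for the full data. The helper name label_components is made up for illustration (it is not from any package), and the sketch assumes that IDs are comparable after converting them to character.

```r
# Example data from the question.
df <- data.frame(
  RaterID  = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 5, 5, 6, 6, 8, 8, 9, 9, 10, 10),
  RateeID  = c(2, 3, 4, 1, 3, 4, 1, 2, 4, 6, 7, 5, 7, 9, 10, 8, 10, 8, 9),
  TeamName = c(rep("A", 13), rep("B", 6))
)

# Hypothetical helper (not from any package): returns one integer per dyad,
# numbering the linked groups of IDs in order of first appearance.
label_components <- function(rater, ratee) {
  a <- as.character(rater)
  b <- as.character(ratee)
  ids <- unique(c(a, b))
  label <- setNames(seq_along(ids), ids)  # start with one label per person
  repeat {
    # Each dyad proposes the smaller of its two members' labels.
    low <- pmin(label[a], label[b])
    # Each person takes the smallest label proposed by any of their dyads.
    best <- tapply(c(low, low), c(a, b), min)
    updated <- label
    updated[names(best)] <- pmin(label[names(best)], as.vector(best))
    if (all(updated == label)) break  # nothing changed: groups are stable
    label <- updated
  }
  group <- label[a]
  match(group, unique(group))  # renumber groups as 1, 2, 3, ...
}

# Apply the helper separately within each TeamName and paste the result
# onto the team name.
df$UniqueTeamID <- NA_character_
for (team in unique(df$TeamName)) {
  rows <- df$TeamName == team
  comp <- label_components(df$RaterID[rows], df$RateeID[rows])
  df$UniqueTeamID[rows] <- paste0(team, comp)
}
```

On the example data this gives A1 for the dyads among IDs 1 to 4, A2 for IDs 5 to 7, and B1 for IDs 8 to 10, which matches my desired result. The same helper could presumably also be called inside group_by(TeamName) followed by mutate() in dplyr, but I have not checked how either version performs on 3000+ teams.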