I want to annotate a scatterplot with several points that have the same label. I want to label all of them (not just a part of them) but it is a mess with so many redundant labels. Is there any way to have one single label pointing to all the points with the same label with ggrepel::geom_text_repel?
I attach the simplest possible situation:
library(ggplot2)
library(ggrepel)

df <- data.frame(
  group = c("A", "A", "B", "B"),
  x = c(1, 2, 3, 4),
  y = c(2, 3, 4, 5)
)

# Every point gets its own (redundant) label
ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text_repel(aes(label = group), box.padding = 0.5, force = 4)
PS: @user2862862 asked the same question in 2019 in "One label for multiple points", but it never received a proper answer.
Here's an approach to achieve what you want. However, as @AllanCameron alluded to in the comments, the appropriateness of this approach is heavily data-dependent. I have included extra examples to illustrate potential issues.
This method involves computing the mean xy for each group, then creating two more dataframes: one for the lines (df1), and one for the labels (df2):
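As a minimal sketch of the steps just described, the following uses the question's df. It assumes dplyr is available, and the column names xend and yend, the grey segment colour, and the choice of geom_label() placed at the group mean are illustrative assumptions rather than the only way to do this:

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  group = c("A", "A", "B", "B"),
  x = c(1, 2, 3, 4),
  y = c(2, 3, 4, 5)
)

# df1: one row per point, with the group's mean position as the segment end
# (xend / yend are illustrative column names)
df1 <- df %>%
  group_by(group) %>%
  mutate(xend = mean(x), yend = mean(y)) %>%
  ungroup()

# df2: one row per group, located at the group mean, used for the single label
df2 <- df1 %>%
  distinct(group, xend, yend)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # one line from each point to its group's mean position
  geom_segment(data = df1, aes(xend = xend, yend = yend), colour = "grey50") +
  # one label per group, drawn at the mean position
  geom_label(data = df2, aes(x = xend, y = yend, label = group))

p
```

Each group ends up with a single label at its mean position and one segment per point leading to it. If a label overlaps nearby points, swapping geom_label() for ggrepel::geom_label_repel() may help, though that is untested here.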
Now consider these two other example dataframes:
Example1 and Example2 look OK, but your proposed method doesn't scale nicely to data like Example3. The lines for different groups cross, which makes the plot difficult to interpret. If your full data are more complex and contain lots of points, as in Example3, using colour (or shape) is much more effective at communicating what is going on in your data:
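A minimal sketch of the colour-based alternative, shown on the question's df rather than on the example data frames above. The point size is an arbitrary assumption:

```r
library(ggplot2)

df <- data.frame(
  group = c("A", "A", "B", "B"),
  x = c(1, 2, 3, 4),
  y = c(2, 3, 4, 5)
)

# Map group to colour so the legend carries the label once per group
p <- ggplot(df, aes(x, y, colour = group)) +
  geom_point(size = 3)

p
```

Adding shape = group inside aes() gives a second visual cue, which can help when the plot is printed in greyscale.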