I'm doing a simple word-frequency analysis on this data frame:
# A tibble: 22,959 x 4
word n proportion.A proportion.B
<chr> <int> <dbl> <dbl>
1 globe 1100 0.00792 NA
2 people 954 0.00687 NA
3 world 900 0.00648 NA
4 flag 719 0.00518 NA
5 american 646 0.00465 NA
6 program 634 0.00456 NA
7 travel 609 0.00438 NA
8 time 561 0.00404 NA
9 economic 556 0.00400 NA
10 sociology 529 0.00381 NA
# ... with 22,949 more rows
I'm trying to create a plot with geom_abline and geom_jitter that shows which words have a similar frequency in both text A and text B, using this code:
ggplot(frequency, aes(x = proportion.A, y = proportion.B, color = abs(proportion.B - proportion.A))) +
  geom_abline(color = "yellow", lty = 2) +
  geom_jitter(alpha = 0.1, size = 2.5, width = 0.3, height = 0.3) +
  xlim(0, 0.1) +
  ylim(0, 0.1) +
  geom_text(aes(label = word), check_overlap = TRUE, vjust = 1.5) +
  scale_color_gradient(limits = c(0, 0.01), low = "blue", high = "red")
When I run this, I only get the plot with the abline, along with these warning messages:
Warning messages:
1: Removed 22959 rows containing missing values (geom_point).
2: Removed 22959 rows containing missing values (geom_text).
I know this warning can appear when values fall outside the axis limits. I've tried scale_x_continuous/scale_y_continuous and scale_x_log10/scale_y_log10, but to no avail. Any idea where I need to look? Thank you.
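One check that may help narrow this down: every row printed above shows NA in proportion.B, and the number of rows removed (22959) equals the total row count, so it may be worth counting how many rows have both proportions present. The following is a minimal base R sketch of that check. It uses frequency_demo, a small made-up stand-in for the real frequency data, so the numbers it prints describe only the toy data.

```r
# Hypothetical stand-in for the `frequency` tibble (first three rows as printed above)
frequency_demo <- data.frame(
  word = c("globe", "people", "world"),
  n = c(1100L, 954L, 900L),
  proportion.A = c(0.00792, 0.00687, 0.00648),
  proportion.B = c(NA_real_, NA_real_, NA_real_)
)

# The two columns mapped to x and y in the plot
prop_cols <- c("proportion.A", "proportion.B")

# Number of missing values in each proportion column
na_counts <- colSums(is.na(frequency_demo[, prop_cols]))

# Number of rows where both proportions are present,
# i.e. rows that geom_jitter and geom_text could actually draw
plottable <- sum(complete.cases(frequency_demo[, prop_cols]))

print(na_counts)
print(plottable)
```

Running the same two calls on the real frequency data would show whether any row has a value in both columns. If plottable comes back as 0, the missing values are already in the data before plotting, and changing the axis scales would not be expected to help.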