Here is some data to work with.
df <- data.frame(x1=c(234,543,342,634,123,453,456,542,765,141,636,3000),x2=c(645,123,246,864,134,975,341,573,145,468,413,636))
If I plot these data, it will produce a simple scatter plot with an obvious outlier:
plot(df$x2,df$x1)
Then I can always write the code below to remove the y-axis outlier(s).
plot(df$x2,df$x1,ylim=c(0,800))
So my question is: Is there a way to exclude obvious outliers in scatterplots automatically? Like ouline=F
would do if I were to plot, say, boxplots for an example. To my knowledge, outline=F
doesn't work with scatterplots.
This is relevant because I have hundreds of scatterplots and I want to exclude all obvious outlying data points without setting ylim(...)
for each individual scatterplot.
You could write a function that returns the index of what you define as an obvious outlier. Then use that function to subset your data before plotting.
Here all observations with "a" exceeding 5 * median of "a" are excluded.