Test for a logistic distribution in R

2.1k views Asked by At

I have a set of data and I'd like to know whether this data set has a logistic distribution. When I made a histogram of my data set (see the histogram on http://imageshack.us/photo/my-images/593/histogram.png/) it seems to have a logistic distribution, but to be sure I'd like to test for a logistic distribution in R. So my question is: Is there a way to test your data for a logistic distribution and how do you do this?

Additional information: The data set consists of 8544 items. The data are horizontal distances in km between 2 geographical points.

Thanks for your attention

Sander

1

There are 1 answers

0
Greg Snow On

In R you can use the ks.test or chisq.test functions (and probably others) to test against a hypothesized distribution. Note that these tests (and others) are all rule out tests, a non-significant result does not guarentee that the data come from the given distribution, just that you cannot rule it out. Also note that with a sample size of 8544 these tests are likely to be way overpowered, meaning that it will have power to find slight meaningless differences and you are likely to reject the null hypothesis even though it is "close enough". Also the fact that you decided on a distribution based on looking at the data first could bias results.

Another approach that may give you a better feel for if a logistic distribution is "close enough" rather than exactly is to use the vis.test function in the TeachingDemos package (be sure to read the paper referenced in the help page to understand the test and what assumptions you are making).

Most importantly is understanding the science that leads to the data, does a logistic distribution make sense scientifically? what other distributions could be reasonble? Also understand what question(s) you are trying to answer with the data and what is the effect on those answers of the distribution (e.g. the CLT will let you use the normal to answer some questions, but not others, using a normal distribution even though the data comes from a logistic or something similar).