Ryan Joiner Normality Test P-value

I have tried many ways to calculate the Ryan-Joiner (RJ) p-value. Is there a method or formula for it? I found out how to calculate the RJ statistic, but I cannot find how to calculate the RJ p-value by hand. Minitab calculates it with some strategy, and I want to know how to calculate it manually.

Please help me with this.

There are 2 answers

Anderson Marcos Dias Canteli On

It appears that Minitab simply uses linear interpolation between the critical values to estimate the p-value. I tested this with 17 different datasets and all the results are similar, although not exactly the same. I believe the small differences are due to rounding of the test statistic reported by Minitab. When I calculate the test statistic myself and compare only the p-values, the results are always identical (rounded to the third decimal place).

For example, for N=40, the critical values are:

  • For alpha = 0.10, R_critical = 0.97670384
  • For alpha = 0.05, R_critical = 0.97148399
  • For alpha = 0.01, R_critical = 0.95968573

Now interpolate the test statistic between the critical values to find alpha. Since the test statistic is Ra = 0.968, the corresponding alpha lies between R_critical = 0.95968573 (alpha = 0.01) and R_critical = 0.97148399 (alpha = 0.05). I calculated the critical values with the equations given by @brittohalloran.

Then substitute the values into the linear interpolation equation (https://en.wikipedia.org/wiki/Linear_interpolation).

[Image: step-by-step calculation]
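
Written out for this example, the interpolation between the alpha = 0.01 and alpha = 0.05 critical values would look roughly like this. The subscripts "lo" and "hi" are only labels chosen here for the two bracketing points; they do not come from Minitab or the paper.

```latex
\alpha \approx \alpha_{lo} + (R_a - R_{lo}) \, \frac{\alpha_{hi} - \alpha_{lo}}{R_{hi} - R_{lo}}
       = 0.01 + (0.968 - 0.95968573) \, \frac{0.05 - 0.01}{0.97148399 - 0.95968573}
       \approx 0.0382
```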

The result is a value close to what you obtained (p_value = 0.040).

For a p-value below 1%, Minitab returns "p < 0.010", and for a p-value above 10%, it returns "p > 0.100".

The script below estimates the p-value (written in Python for convenience):

import numpy as np
from scipy import interpolate

def rj_p_value(statistic, n):
    alphas = np.array([0.10, 0.05, 0.01])
    criticals = np.array(
        [1.0071 - (0.1371 / np.sqrt(n)) - (0.3682 / n) + (0.7780 / np.square(n)), # alpha = 0.10
         1.0063 - (0.1288 / np.sqrt(n)) - (0.6118 / n) + (1.3505 / np.square(n)), # alpha = 0.05
         0.9963 - (0.0211 / np.sqrt(n)) - (1.4106 / n) + (3.1791 / np.square(n)), # alpha = 0.01
        ])
    f = interpolate.interp1d(criticals, alphas)
    if statistic > max(criticals):
        return "p > 0.100"
    elif statistic < min(criticals):
        return "p < 0.010"
    else:
        return float(f(statistic))  # plain float instead of a 0-d NumPy array

print(rj_p_value(statistic=0.968, n=40))
# prints 0.03818810764442493

brittohalloran On

The test statistic RJ needs to be compared to a critical value CV in order to make a determination of whether to reject or fail to reject the null hypothesis.

The value of CV depends on the sample size and the chosen significance level a, and the values are empirically derived: generate a large number of normally distributed datasets for each sample size n, calculate the RJ statistic for each, and take the 10th percentile of those RJ values as the CV for a = 0.10.
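
As a rough illustration of that simulation idea, here is a minimal sketch using only the Python standard library. It assumes the RJ statistic is the correlation between the sorted sample and normal scores computed as the inverse normal CDF of (i - 3/8) / (n + 1/4); the function names, the number of repetitions and the seed are all my own choices for illustration, not anything taken from Minitab or the paper.

```python
import random
from statistics import NormalDist


def normal_scores(n):
    """Approximate expected normal order statistics (assumed (i - 3/8)/(n + 1/4) form)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson_r(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rj_statistic(sample, scores=None):
    """Correlation between the sorted sample and the normal scores."""
    ordered = sorted(sample)
    if scores is None:
        scores = normal_scores(len(ordered))
    return pearson_r(ordered, scores)


def simulate_critical_value(n, alpha=0.10, reps=20000, seed=0):
    """Empirical alpha-quantile of RJ over `reps` simulated normal samples of size n."""
    rng = random.Random(seed)
    scores = normal_scores(n)  # computed once, reused for every simulated sample
    stats = sorted(
        rj_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], scores)
        for _ in range(reps)
    )
    return stats[int(alpha * reps)]


if __name__ == "__main__":
    # Should land near the fitted-formula value for n = 40, a = 0.10 (about 0.977);
    # the exact digits depend on the seed and the number of repetitions.
    print(simulate_critical_value(40, alpha=0.10))
```

Because this is a simulation, the result only approximates the fitted equations, and more repetitions give a steadier estimate.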

Sidenote: For some reason I'm seeing a 90% confidence level used many places for Ryan-Joiner, when a 95% confidence is commonly used for other normality tests. I'm not sure why.

I recommend reading the original Ryan-Joiner 1976 paper: https://www.additive-net.de/de/component/jdownloads/send/70-support/236-normal-probability-plots-and-tests-for-normality-thomas-a-ryan-jr-bryan-l-joiner

In that paper, the following critical value equations were empirically derived (written out in Python for convenience):

from math import sqrt

def rj_critical_value(n, a=0.10):
    """Empirical RJ critical value for sample size n at significance level a."""
    if a == 0.10:
        return 1.0071 - (0.1371 / sqrt(n)) - (0.3682 / n) + (0.7780 / n**2)
    elif a == 0.05:
        return 1.0063 - (0.1288 / sqrt(n)) - (0.6118 / n) + (1.3505 / n**2)
    elif a == 0.01:
        return 0.9963 - (0.0211 / sqrt(n)) - (1.4106 / n) + (3.1791 / n**2)
    else:
        raise ValueError("a must be one of [0.10, 0.05, 0.01]")

The RJ test statistic then needs to be compared to that critical value:

  • If RJ < CV, reject the null hypothesis: the determination is NOT NORMAL.
  • If RJ > CV, fail to reject the null hypothesis: the determination is NORMAL.

Minitab is going one step further - working backwards to determine the value of a at which CV == RJ. This value would be the p-value you're referencing in your original question.