I would like to know whether my code makes sense. A colleague sent it to me and I have been using it, but I have the feeling that some things are not correct.
In the code below, I ran Shapiro-Wilk and Levene tests to find out whether I could perform an ANOVA or had to use a non-parametric test (like Kruskal-Wallis).
The data are a subset of a column named Sediment_type, where I want to analyse just the data belonging to the "sandy" sediment, which is why I created the object subset_sandy. The Excel file is named "Morphology_Lab_T1_D57_AGB", and the factor I want to compare within subset_sandy is a column called Wave_type.
First step) Check for normality (Shapiro-Wilk) and homogeneity of variance (Levene) in the data:
library(car) # provides leveneTest()
subset_sandy <- subset(Morphology_Lab_T1_D57_AGB, Sediment_type == "Sandy")
subset_sandy$Wave_type <- as.factor(subset_sandy$Wave_type)
AN_sandy <- lm(Mean_L._leaf_length ~ Wave_type, data = subset_sandy)
AN_sandy
par(mfrow = c(2, 2))
plot(AN_sandy)
shapiro.test(residuals(AN_sandy)) # Normality (p = 0.2477)
leveneTest(residuals(AN_sandy), subset_sandy$Wave_type) # Homogeneity (p = 0.3243)
Second step) In this case, the p-value was higher than alpha in both tests, so I proceeded with a pairwise test (I guess this is equivalent to an ANOVA; please correct me if I am wrong):
pairwise_result_sandy <- pairwise.t.test(subset_sandy$Mean_L._leaf_length,
                                         subset_sandy$Wave_type,
                                         p.adjust.method = "bonferroni")
print("Pairwise comparisons for Sandy Sediment")
print(pairwise_result_sandy)
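For comparison, here is a minimal sketch of how I understand an omnibus one-way ANOVA followed by a post-hoc test would usually be run in R. It uses the built-in PlantGrowth data (weight by group) purely as a stand-in for my own data, and the object names fit_lm, fit_aov and tukey are made up for illustration. As far as I can tell, lm() and aov() fit the same model here, while pairwise.t.test() only gives the adjusted pairwise comparisons and not the overall F-test.

```r
# Stand-in data: PlantGrowth ships with base R (weight ~ group, three groups).
# This is an assumption for illustration only, not my Morphology data.
data(PlantGrowth)

# One-way model fitted two ways; both describe the same linear model
fit_lm  <- lm(weight ~ group, data = PlantGrowth)
fit_aov <- aov(weight ~ group, data = PlantGrowth)

# Omnibus ANOVA table: a single F-test for "any difference among groups"
anova(fit_lm)
summary(fit_aov)

# Post-hoc option 1: Tukey's HSD on the aov fit
tukey <- TukeyHSD(fit_aov)
tukey

# Post-hoc option 2: pairwise t-tests with Bonferroni-adjusted p-values
pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                p.adjust.method = "bonferroni")
```

If this reading is right, the pairwise step would normally come after the omnibus F-test rather than replace it.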
Doubts:
- My first doubt concerns the function lm(), which I have been using in the first step, since I realized it fits a linear regression. However, my colleague has been doing it this way for a long time.
- Is par(mfrow = c(2, 2)) important for my analysis? I do not understand what it means.
- I was told that p.adjust.method is not equivalent to a post-hoc test, but in my code I pass "bonferroni" to p.adjust.method. I would like to perform a post-hoc test too.
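Regarding par(mfrow = c(2, 2)), my current understanding is that it only changes the plot layout and has no effect on the statistics. A small sketch of what I think it does, again using the built-in PlantGrowth data as a stand-in (the names fit and old_par are hypothetical):

```r
# Stand-in model on built-in data (assumption for illustration only)
fit <- lm(weight ~ group, data = PlantGrowth)

# Split the plotting device into a 2 x 2 grid; par() returns the old settings
old_par <- par(mfrow = c(2, 2))

# plot() on an lm object draws four diagnostic plots by default,
# so they all fit on one page with the 2 x 2 layout
plot(fit)

# Restore the previous layout so later plots are not squeezed into the grid
par(old_par)
```

Without the par() call, the same four plots would simply be drawn one after another.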