I want to compare linear models based on dependent data.
In my real data, I sampled fish monthly in a river and recorded their length and weight (I call this the 'unstratified data'). I fitted a linear model to these data.
I then subsampled these data, stratifying by month and size class, and fitted a linear model to this 'stratified data' as well.
Now I want to test whether the parameters estimated from the unstratified and stratified data differ.
I wrote a script to illustrate the steps I followed. In this script, the parameters obtained from the two linear models were very similar. In my real data, however, they were not.
library(dplyr)
library(detectnorm)
# set months
date = seq(as.Date("2022-09-01"), as.Date("2023-08-01"), by = "1 month")
# create dataframe of unstratified data
dat.uns = NULL
for(data in date) {
  length = rnonnorm(n = 20, mean = 25, sd = 3, skew = 1, kurt = 2)
  df = data.frame(Date = data, Length = length)
  dat.uns = bind_rows(dat.uns, df)
}
dat.uns$Date = as.Date(dat.uns$Date, origin = "1970-01-01") # the for loop drops the Date class, leaving days since 1970-01-01
dat.uns$Weight = 0.001*dat.uns$Length.dat^3 # create weight data based on the cube law
dat.uns$Class = floor(dat.uns$Length.dat/2)*2 #create size classes
str(dat.uns)
# Linear regression for unstratified data
lwr.uns = lm(log(Weight) ~ log(Length.dat), data = dat.uns)
summary(lwr.uns)
# subsampling unstratified data by month and size class
dat.str = dat.uns %>%
  group_by(Date, Class) %>%
  sample_n(min(5, n()), replace = FALSE) %>% # at most 5 fish per month and size class
  ungroup()
# Linear regression for stratified data
lwr.str = lm(log(Weight) ~ log(Length.dat), data = dat.str)
summary(lwr.str)
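For reference, the kind of comparison I have in mind can be sketched as follows. This is a self-contained base R version, not my actual analysis: `rnorm` stands in for the skewed lengths, I add multiplicative noise to the cube law (an assumption, so that the fit is not perfect), and the names `sim`, `sim.str`, `fit.uns`, `fit.str` and `comparison` are made up for illustration. It places the estimates and the 95% confidence interval of the slope from both fits side by side, but it ignores the dependence between the two data sets, which is exactly the part I am unsure how to handle.

```r
# Illustrative sketch only: simulated values, base R, hypothetical object names.
set.seed(1)

# Simulate 12 months x 20 fish (rnorm is an assumed stand-in for the skewed lengths)
dates <- seq(as.Date("2022-09-01"), as.Date("2023-08-01"), by = "1 month")
sim <- data.frame(
  Date   = rep(dates, each = 20),
  Length = rnorm(length(dates) * 20, mean = 25, sd = 3)
)
# Cube law with multiplicative noise (assumption) so the regression is not a perfect fit
sim$Weight <- 0.001 * sim$Length^3 * exp(rnorm(nrow(sim), sd = 0.1))
sim$Class  <- floor(sim$Length / 2) * 2

# Stratified subsample: at most 5 fish per month x size class
strata <- split(seq_len(nrow(sim)), list(sim$Date, sim$Class), drop = TRUE)
keep   <- unlist(lapply(strata, function(idx) {
  idx[sample.int(length(idx), min(5, length(idx)))]
}))
sim.str <- sim[keep, ]

# Same log-log length-weight model on both data sets
fit.uns <- lm(log(Weight) ~ log(Length), data = sim)
fit.str <- lm(log(Weight) ~ log(Length), data = sim.str)

# Estimates and 95% confidence interval of the slope, side by side
comparison <- rbind(
  unstratified = c(coef(fit.uns), confint(fit.uns)["log(Length)", ]),
  stratified   = c(coef(fit.str), confint(fit.str)["log(Length)", ])
)
colnames(comparison) <- c("intercept", "slope", "slope.lwr", "slope.upr")
print(round(comparison, 3))
```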
Since the data are dependent (the stratified data are a subset of the unstratified data), what is the best way to compare the estimated parameters or fitted curves?
Thank you in advance for your assistance.