significant differences between means


Considering the picture below

[image]

Each value X can be identified by the indices X_g_s_d_h:

g = group, g=[1:5]
s = subject number (the number of subjects varies by group)
d = day number (the number of days varies by subject)
h = hour, h=[1:24]

So X_1_3_4_12 is the value of X at the

12th hour
of the 4th day
of the 3rd subject
of group 1

First I calculate the mean, hour by hour, over all the days of each subject. This removes the index d, and each subject is represented by a vector of 24 values.

X_g_s_h is the mean over the days of a subject.

Then I calculate the mean over all the subjects belonging to the same group, resulting in X_g_h. Each group is then represented by one vector of 24 values.

Then I calculate the mean over the hours for each group, resulting in X_g. Each group is now represented by a single value.
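The three averaging steps can be sketched in MATLAB as follows. This is only an illustration under assumptions that are not in the question: the data are stored in a hypothetical long-format matrix `rows` with columns [group, subject, day, hour, value], and the group sizes and day counts are made-up dummy numbers.

```matlab
% Hypothetical long-format data: one row per observation,
% columns are [group, subject, day, hour, value].
rows = [];
nSubj = [2 3];                     % assumed: subjects per group (unequal)
for g = 1:numel(nSubj)
    for s = 1:nSubj(g)
        nDays = randi([2 4]);      % assumed: days per subject (unequal)
        for d = 1:nDays
            rows = [rows; repmat([g s d], 24, 1), (1:24)', randn(24, 1)]; %#ok<AGROW>
        end
    end
end

% Step 1: mean over days -> one 24-value vector per subject (X_g_s_h).
[gsh, ~, idx] = unique(rows(:, [1 2 4]), 'rows');
Xgsh = accumarray(idx, rows(:, 5), [], @mean);

% Step 2: mean over subjects -> one 24-value vector per group (X_g_h).
[gh, ~, idx] = unique(gsh(:, [1 3]), 'rows');
Xgh = accumarray(idx, Xgsh, [], @mean);

% Step 3: mean over hours -> one value per group (X_g).
Xg = accumarray(gh(:, 1), Xgh, [], @mean);
```

Note that after step 3 each group is a single number, so X_g alone carries no within-group spread. A test has to work with the values that were averaged, for example the per-subject means.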

I would like to see if the means X_g are significantly different between the groups.

What is the proper way to test this?

P.S.

The number of subjects differs between groups, and the number of days differs between subjects. I have more than 2 groups.

Thanks


There are 2 answers

ASantosRibeiro (best answer)

OK, so I am posting an answer to summarize some of the problems you may run into.

Same subjects in both groups

Not averaging:

1-First, if we assume that you have a single measure that is repeated every hour for a certain number of days, and that it is independent of both the day and the hour, then you can reshape your matrix into one column per subject, per group, and perform a paired (repeated-measures) t-test.

2-If you cannot assume that your measure is independent of the hour, but it is independent of the day (say, the concentration of a drug after administration that completely vanishes before the next day's measurement), then you can perform a paired t-test for each hour (N hours), for a total of N tests.

3-If you cannot assume that your measure is independent of the day, but it is independent of the hour (say, a measure tied to the menstrual cycle, which we assume is stable within each day but varies between days), then you can perform a paired t-test for each day (M days), for a total of M tests.

4-If you cannot assume that your measure is independent of either the day or the hour, then you can perform a paired t-test for each combination of day and hour, for a total of N x M tests.
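As an illustration of case 2 (one paired test per hour), here is a minimal sketch. It assumes the Statistics Toolbox function ttest and a made-up layout in which each row is a subject and each column an hour; `A` and `B` are dummy data for the same subjects under the two conditions.

```matlab
% Dummy data (assumed layout): rows = subjects, columns = hours.
nSubj = 10;
nHours = 24;
A = randn(nSubj, nHours);          % per-subject values, condition A
B = randn(nSubj, nHours) + 0.5;    % same subjects, condition B

% One paired t-test per hour, N = nHours tests in total.
h = zeros(1, nHours);
p = zeros(1, nHours);
for k = 1:nHours
    [h(k), p(k)] = ttest(A(:, k), B(:, k));
end
```

Cases 3 and 4 follow the same pattern, looping over days or over day and hour pairs instead.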

Averaging:

In the cases where you cannot assume independence, you can average over the dependent variables, which removes that variance but also lowers your statistical power and limits the interpretation.

In case 2, you can average over the hours to get a mean concentration and perform a paired t-test, so you have only 1 test. Here you lose the information about how the measure changed from hour 1 to N, and you only test whether the mean concentration over the tested hours differs between groups.

In case 3, you can average over both hour and day and test whether, for example, the mean estrogen level is higher in one group than in the other, so you have only 1 test. Again you lose the information about how it changed between days.

In case 4, you can average over both hour and day, so you have only 1 test. Again you lose the information about how it changed between hours and days.
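The averaging variant could look like this. Again a sketch only: it assumes a made-up subjects x days x hours array per condition and the Statistics Toolbox function ttest.

```matlab
% Dummy data (assumed layout): subjects x days x hours, same subjects in A and B.
A = randn(10, 5, 24);
B = randn(10, 5, 24) + 0.5;

% Collapse hours and days: one mean per subject and condition.
mA = mean(mean(A, 3), 2);          % 10-by-1
mB = mean(mean(B, 3), 2);          % 10-by-1

% A single paired t-test on the per-subject means.
[h, p] = ttest(mA, mB);
```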

NOT same subjects in both groups

Paired tests are not possible. Follow the same approach but perform an unpaired test.
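For different subjects in each group, a sketch of the unpaired version on per-subject means might be the following. The group sizes are made up, and the 'Vartype','unequal' option (Welch's test, which does not assume equal variances) is my choice here, not something the answer prescribes.

```matlab
% Dummy per-subject means (assumed): 8 subjects in group A, 12 others in group B.
mA = randn(8, 1);
mB = randn(12, 1) + 0.5;

% Unpaired two-sample t-test; the groups may differ in size.
[h, p, ci, stats] = ttest2(mA, mB, 'Vartype', 'unequal');
```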

Kostya

You need to perform a statistical test of the null hypothesis H0 that the data in the different groups come from independent random samples from distributions with equal means. It is better to avoid the sequential 'mean' operations and simply regroup the data by g. If you assume normality and independence of the observations (as pointed out by @ASantosRibeiro), then you can perform a two-sample t-test with ttest2 (http://www.mathworks.nl/help/stats/ttest2.html):

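clear;
nG = 6;                      % number of groups (dummy value)
X = randn(nG,5,4,3);         % dummy data in g_s_d_h format
% Put all observations of group g in column g. reshape is column-major,
% so flatten the trailing dimensions first and then transpose.
Y = reshape(X,nG,[]).';

h = zeros(nG);               % h(i,j) = 1 means H0 is rejected at the 5% level
for i = 1 : nG
    for j = i+1 : nG
        h(i,j) = ttest2(Y(:,i),Y(:,j));
        h(j,i) = h(i,j);     % the result does not depend on the order
    end
end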

If you want to take into account different weights for the observations, you need to calculate the t-value yourself (e.g., see http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000126.htm)
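A rough sketch of such a hand computation is below. Everything in it is an assumption for illustration: the data are made up, the weight of each subject is taken to be the number of days recorded, and the weighted mean, variance and standard error follow one common convention that should be checked against the formulas on the linked page before relying on it. With all weights equal to 1 it reduces to the usual unequal-variance (Welch) t-test.

```matlab
% Per-subject means and weights (assumed: weight = number of days recorded).
xA = [5.1; 4.8; 5.6; 5.0];        wA = [3; 5; 2; 4];
xB = [5.9; 6.2; 5.7; 6.4; 6.0];   wB = [4; 4; 6; 3; 5];

% Weighted mean, variance with (n-1) denominator, and squared standard error.
wmean = @(x, w) sum(w .* x) / sum(w);
wvar  = @(x, w) sum(w .* (x - wmean(x, w)).^2) / (numel(x) - 1);
se2   = @(x, w) wvar(x, w) / sum(w);

% Unequal-variance t statistic and Satterthwaite degrees of freedom.
sA = se2(xA, wA);
sB = se2(xB, wB);
t  = (wmean(xA, wA) - wmean(xB, wB)) / sqrt(sA + sB);
df = (sA + sB)^2 / (sA^2 / (numel(xA) - 1) + sB^2 / (numel(xB) - 1));

% Two-sided p-value of the t distribution via the regularized incomplete beta.
p = betainc(df / (df + t^2), df / 2, 0.5);
```

Using betainc avoids a toolbox dependency; with the Statistics Toolbox, 2*tcdf(-abs(t), df) gives the same p-value.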