Wrong Denominator in Proportion of multiple factors using svyby svyciprop


I want to get proportions and confidence intervals from a contingency table I extracted from a complex sample survey. I am using svyby with svyciprop (package survey).

Problem: My code is not adding the numbers I expected in the denominator, so I am getting the 'wrong' proportion. I present my concrete example below. Any idea on how to solve this?

The proportion I would like to compute

In the following example, I want to estimate the 'Proportion of Men with actv_30==1 in each category of P040 in relation to ALL men.'

Let's create a contingency table and calculate the proportion 'outside' R.

ftable(svytable(~actv_30+v0302+P040, design = sample.pns13.18y)) #PNS 2013
                         P040    Maybe       No      Yes
actv_30          v0302                                
0                  Men         3465091 32738241  5663912
                   Women       3793623 20721490  5961574
1                  Men         2826317        0  6761130
                   Women       2594562        0  5525180

In this example the computation for each category of P040 would be (men with actv_30 == 1 in that category) / ALL men:

1.Men.Yes == 13.14% == 6761130 / (6761130 + 2826317 + 3465091 + 32738241 + 5663912)

or

1.Men.Maybe == 5.49% == 2826317 / (6761130 + 2826317 + 3465091 + 32738241 + 5663912)
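
The hand calculation above can be checked in base R. This is only a sketch that retypes the weighted counts for men from the ftable output; the object names men, all_men and wanted are made up for illustration and do not come from my survey code.

```r
# weighted counts for men, copied from the ftable output above
men <- matrix(
  c(3465091, 32738241, 5663912,   # actv_30 == 0
    2826317,        0, 6761130),  # actv_30 == 1
  nrow = 2, byrow = TRUE,
  dimnames = list(actv_30 = c("0", "1"), P040 = c("Maybe", "No", "Yes"))
)

# denominator: every man, regardless of actv_30 or P040
all_men <- sum(men)

# men with actv_30 == 1 in each category of P040, as a share of all men
wanted <- men["1", ] / all_men
round(100 * wanted, 2)  # Maybe 5.49, No 0.00, Yes 13.14
```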

The 'wrong' proportion I am getting

Here is my R code and its output. The problem is that my code computes the proportions WITHIN each category of P040. It computes 1.Men.Maybe / (1.Men.Maybe + 0.Men.Maybe), so each denominator is the sum over a single combination of v0302 and P040.

What I actually want is 1.Men.Maybe / ALL men, where the denominator is the sum over every cell for that category of v0302.

svyby(~factor( actv_30==1 ) ,
                   ~v0302+P040,
                   design = sample.pns13.18y ,
                   vartype="ci",
                   level = 0.95,
                   svyciprop)

            v0302  P040     factor(actv_30 == 1)    ci_l     ci_u
Men.Maybe     Men Maybe                     0.45     0.41    0.48
Women.Maybe Women Maybe                     0.41     0.38    0.44
Men.No        Men    No                     0.00     0.00    0.00
Women.No    Women    No                     0.00     0.00    0.00
Men.Yes       Men   Yes                     0.54     0.52    0.57
Women.Yes   Women   Yes                     0.48     0.46    0.51
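
The point estimates for men in this output can be reproduced from the weighted counts by dividing within each category of P040. A rough base R sketch, again retyping the counts by hand (men and within_p040 are illustrative names only):

```r
# weighted counts for men, copied from the ftable output
men <- matrix(
  c(3465091, 32738241, 5663912,   # actv_30 == 0
    2826317,        0, 6761130),  # actv_30 == 1
  nrow = 2, byrow = TRUE,
  dimnames = list(actv_30 = c("0", "1"), P040 = c("Maybe", "No", "Yes"))
)

# grouping by v0302 + P040 means each estimate uses only the men in one
# category of P040, so the denominator is the column total
within_p040 <- men["1", ] / colSums(men)
round(within_p040, 2)  # Maybe 0.45, No 0.00, Yes 0.54
```

This suggests the grouping formula passed to svyby is what fixes the denominator, not svyciprop itself.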

Any idea on how to solve this?

Answer by Anthony Damico (best answer)

be a dear and use a reproducible example

library(survey)
data(api)
## one-stage cluster sample
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

# you subset to all men, i subset to schools with yr.rnd == 'No'
y <- subset( dclus1 , yr.rnd == 'No' )

# you have actv_30 == 1 i have cnum == 1
# and your p040 is equivalent to stype
svymean(~factor(interaction( stype , cnum == 1 ) ),y)
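
since the question also asks for confidence intervals, here is a minimal sketch of one way to get them for every cell at once, assuming Wald-type intervals from confint() are acceptable. it reuses the api example; cell_means and cell_ci are just illustrative names.

```r
library(survey)
data(api)

# one-stage cluster sample, same design as above
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# restrict to the subpopulation that should form the denominator
y <- subset(dclus1, yr.rnd == "No")

# each stype x (cnum == 1) cell as a share of the whole subpopulation
cell_means <- svymean(~factor(interaction(stype, cnum == 1)), y)

# Wald-type 95% confidence intervals for those cell proportions
cell_ci <- confint(cell_means, level = 0.95)
cbind(estimate = coef(cell_means), cell_ci)
```

with the design from the question, the analogous call would presumably subset to v0302 == "Men" and use interaction(P040, actv_30 == 1). svyciprop offers other interval methods if Wald intervals behave badly near 0 or 1, but it works on one proportion at a time.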