Generating Variable in Stata?

I have a categorical variable and am trying to create a new variable that multiplies each response value by its frequency. For example:

      total |      Freq.
------------+-----------
          1 |          6
          2 |         12
          3 |          9
          5 |          5
          6 |         10

I would like a variable that holds, for each response, the value multiplied by its frequency (i.e. 1=6, 2=24, 3=27, etc.). I tried a few calculations using egen, but they did not work. Please let me know if anyone has any insight.

There are 2 answers

Eric HB

I think this example shows the general approach:

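sysuse auto, clear

* frequency of each category (count() ignores missing values of rep78)
bysort rep78: egen count_rep78 = count(rep78)
* category value times its frequency
gen freq_x_val = rep78*count_rep78

browse rep78 count_rep78 freq_x_val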

In this example, rep78 is the categorical variable.

Essentially, the bysort step creates a count variable that holds each category's frequency. You then multiply that count variable by the categorical variable and you're done.
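
To check the approach against the numbers in the question, here is a minimal sketch that rebuilds the posted table and applies the same two steps. It assumes the categorical variable is named total, as the tabulation header suggests; freq, count_total and freq_x_val are hypothetical names chosen for illustration.

```stata
clear

* rebuild the frequency table from the question
input total freq
1 6
2 12
3 9
5 5
6 10
end

* one observation per response, to mimic the raw data
expand freq
drop freq

* same two steps as above, applied to total
bysort total: egen count_total = count(total)
gen freq_x_val = total*count_total

* one row per category: mean of freq_x_val and the category frequency
tabulate total, summarize(freq_x_val) means freq
```

Because freq_x_val is constant within each category, its mean in every row is simply the value times the frequency: 6, 24, 27, 25 and 60 for the five categories.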

dimitriy

It's not clear whether you want the result stored in the original dataset or in a new one. This code does both:

clear

input catvar n
          1          6        
          2          12        
          3          9        
          5          5        
          6          10   
end

/* create fake catvar data: one observation per response */
expand n
drop n

/* store desired data in a variable in your data */
bysort catvar: gen sum = _N
replace sum = sum*catvar
list in 1/6, clean noobs
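/* c() is the pre-Stata 17 table syntax; in Stata 17 or later this should be
   table catvar, statistic(mean sum) statistic(frequency) */
table catvar, c(mean sum freq)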

/* or get a new dataset with desired data  */
contract catvar sum, freq(n)
list, clean noobs
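
A further case the question leaves open: if the data are already stored as one row per category with its frequency (an assumption, since the question does not say), no by-group step should be needed. A minimal sketch, reusing the variable names catvar, n and sum from the code above:

```stata
clear

* frequency table held directly as a dataset, one row per category
input catvar n
1 6
2 12
3 9
5 5
6 10
end

* category value times its frequency
gen sum = catvar*n
list, clean noobs
```

This produces the same catvar, sum and n values as the contract step, without expanding the data first.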