Replace values within group

125 views Asked by At

I have a following data where I am trying to replace income for years 1980 and 1981 with that of year 1979 (no change for year 1978) [for each state].

state   year    size    income
1   1978    1   1000
1   1978    1.5 100
1   1978    2   5000
1   1979    1   3779.736
1   1979    1.5 3779.736
1   1979    2   4878.414
1   1980    1   4290
1   1980    1.5 4290
1   1980    2   5537
1   1981    1   4729
1   1981    2   6111
2   1978    1   1000
2   1978    1.5 100
2   1978    2   5000
2   1979    1   1000
2   1979    1.5 2000
2   1979    2   3000
2   1980    1   2000
2   1980    1.5 3000
2   1980    2   1000
2   1981    1   4000
2   1981    2   5000

gen income1=income
replace income1=3779.736 if size==1 & year>1979 & state==1
replace income1=3778.736 if size==1.5 & year >1979 & year!=1981 & state==1
replace income1=4878.41 if size=2 & year>1979 & state==1

I can also do the same for other states. However, I was wondering whether there are other ways of doing the same thing. In my actual data, for each state (there are 50 states), I have 10 sizes (1,1,5,2,3,4,5,6,7,8,9,10) for years 1979 and 1980 and 9 sizes (1,2,3,4,5,6,7,8,9) for year 1981 and more than 10 variables

The expected output is as follows:

state   year    size    income
1   1978    1   1000
1   1978    1.5 100
1   1978    2   5000
1   1979    1   3779.736
1   1979    1.5 3779.736
1   1979    2   4878.414
1   1980    1   3779.736
1   1980    1.5 3779.736
1   1980    2   4878.414
1   1981    1   3779.736
1   1981    2   4878.414
2   1978    1   1000
2   1978    1.5 100
2   1978    2   5000
2   1979    1   1000
2   1979    1.5 2000
2   1979    2   3000
2   1980    1   1000
2   1980    1.5 2000
2   1980    2   3000
2   1981    1   1000
2   1981    2   3000

If the sizes were the same (say 1,1.5, and 3) for each year, I could simply do the following. However, sizes are not the same for each year.

gen inc1=income 
  local yr 1979 1980 1981 
  local mult 3 6 9
  local n: word count `mult'
  forvalues i=1/`n'{
  local a: word `i' of `yr'
  local b: word `i' of `mult'
bys state (year size): replace inc1=income[_n-`b'] if year>`a'
}
2

There are 2 answers

0
Roberto Ferrer On BEST ANSWER

A first approximation is

clear
set more off

input ///
state   year    size    income
1   1978    1   1000
1   1978    1.5 100
1   1978    2   5000
1   1979    1   3779.736
1   1979    1.5 3779.736
1   1979    2   4878.414
1   1980    1   4290
1   1980    1.5 4290
1   1980    2   5537
1   1981    1   4729
1   1981    2   6111
2   1978    1   1000
2   1978    1.5 100
2   1978    2   5000
2   1979    1   1000
2   1979    1.5 2000
2   1979    2   3000
2   1980    1   2000
2   1980    1.5 3000
2   1980    2   1000
2   1981    1   4000
2   1981    2   5000
end

list, sepby(state)

*----- what you want -----

bysort state size : egen inc2 = total(income * (year == 1979))

clonevar inc3 = income
replace inc3 = inc2 if year >= 1980

sort state year size
list, sepby(state)
1
AudioBubble On

Here's a start to one approach; I'm sure it could be improved upon.

. input year size income

          year       size     income
  1. 1978 1 1000
  2. 1978 2 2000
  3. 1978 3 3000
  4. 1979 1 1001
  5. 1979 2 2001
  6. 1979 3 3001
  7. 1980 1 1002
  8. 1980 2 2002
  9. 1980 3 3002
 10. end

. generate i79 = 0

. replace i79 = income if year==1979
(3 real changes made)

. egen repinc = max(i79), by(size)

. generate newincome = income

. replace newincome = repinc if year==1980
(3 real changes made)

. list, clean noobs

    year   size   income    i79   repinc   newinc~e  
    1978      1     1000      0     1001       1000  
    1978      2     2000      0     2001       2000  
    1978      3     3000      0     3001       3000  
    1979      1     1001   1001     1001       1001  
    1979      2     2001   2001     2001       2001  
    1979      3     3001   3001     3001       3001  
    1980      1     1002      0     1001       1001  
    1980      2     2002      0     2001       2001  
    1980      3     3002      0     3001       3001  

.