How do I generate age category? My PATIENT_YOB is given as 01jan1956 and I want to get exact age


I'm trying to use the following code, but it gives an error. My dates look like this:

01jan1986
05jan2001
07mar1983

and so on. I need to get the exact age for each of them.

gen agecat=1
if age 0-20==1
if age 21-40==2
if age 41-60==3
if age 61-64==4

There are 3 answers

JR96

To build off of Wouter's answer, you could do something like this to calculate the age to the tenth of a year:

clear
set obs 12
set seed 12352

global today = date("18Jun2021", "DMY")

* Sample Data
gen dob = runiformint(0,17000) // random Dates
format dob %td

* Create Age
gen age = round((ym(year(${today}),month(${today})) - ym(year(dob), month(dob)))/ 12,0.1)

* Correct age if dob in current month, but after today's date
replace age = age - 0.1 if (month(${today}) == month(dob)) & (day(dob) > day(${today}))

* age category
gen age_cat = cond(age <= 20, 1, cond(age <= 40, 2, cond(age <= 60, 3, cond(age <= 64, 4, .))))

The penultimate step is important: it decrements the age when the date of birth falls in the same month as the comparison date but the birthday has not yet occurred.


     +----------------------------+
     |       dob    age   age_cat |
     |----------------------------|
  1. | 30jan2004   17.4         1 |
  2. | 14aug1998   22.8         2 |
  3. | 06aug1998   22.8         2 |
  4. | 31aug1994   26.8         2 |
  5. | 27mar1990   31.3         2 |
     |----------------------------|
  6. | 12jun1968     53         3 |
  7. | 05may1964   57.1         3 |
  8. | 06aug1994   26.8         2 |
  9. | 21jun1989   31.9         2 |
 10. | 10aug1984   36.8         2 |
     |----------------------------|
 11. | 22oct2001   19.7         1 |
 12. | 03may1972   49.1         3 |
     +----------------------------+

Note that the decimal part is only approximate, as it is computed from the month of birth rather than the exact date of birth.
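If completed whole years are enough, the birthday check can compare both month and day instead of the month alone. The following is only a sketch under a few assumptions: it is run as a do-file (it uses a local macro and `///` line continuation), the reference date is the same 18 June 2021 used above, and `age_years` is just an illustrative variable name. The three dates of birth are the ones from the question, entered as Stata daily dates.

```stata
clear
input float dob
 9497
14980
 8466
end
format dob %td

* Reference date (an assumption; use whatever date suits your data)
local today = date("18Jun2021", "DMY")

* Completed years: difference in calendar years, minus 1 if the
* birthday has not yet occurred in the reference year
gen age_years = year(`today') - year(dob) - ///
    ((month(dob) > month(`today')) | ///
     (month(dob) == month(`today') & day(dob) > day(`today')))

list
```

Comparing month and day directly avoids building a date such as 29 February in a non-leap year. Recent Stata releases also appear to document an age() function for this purpose; check help for your version before relying on it.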

Wouter

Here's one way:

gen age_cat = cond(age <= 20, 1, cond(age <= 40, 2, cond(age <= 60, 3, cond(age <= 64, 4, .))))

You might also want to look into the cut() function of egen; see help egen.
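As a rough sketch of the egen route, assuming `age` holds whole years: with at(), each bin runs from one break up to but not including the next, so the breaks become 0, 21, 41, 61 and 65. The small dataset and the name `age_cat2` are made up for illustration.

```stata
clear
input age
 5
20
21
40
41
64
70
end

* icodes numbers the bins 0, 1, 2, ... so add 1 to get categories 1 to 4
egen age_cat2 = cut(age), at(0 21 41 61 65) icodes
replace age_cat2 = age_cat2 + 1

list
```

Ages outside the range covered by at(), here 65 and over, are left missing, which matches the final `.` in the cond() version. With fractional ages the two approaches differ at the boundaries (20.5 would fall in the first bin here), so the breaks would need adjusting.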

Nick Cox

You got some good advice in other answers, but this can be as simple as you want.

Consider this example, noting that presenting data as code we can run is a really helpful detail.

* Example generated by -dataex-. For more info, type help dataex
clear
input str9 sdate float dob
"01jan1986"  9497
"05jan2001" 14980
"07mar1983"  8466
end
format %td dob
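If only the string variable is available, as the question suggests, a numeric daily date can be built from it with date(). A minimal sketch, in which `dob2` is an illustrative name and the strings are assumed to be in day-month-year order:

```stata
clear
input str9 sdate
"01jan1986"
"05jan2001"
"07mar1983"
end

* date() parses the string according to the mask "DMY"
gen dob2 = date(sdate, "DMY")
format dob2 %td

list
```

The resulting values should agree with the `dob` values in the dataex listing above.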

The age at the end of 2020 is just 2020 minus the year of birth. Use any other year if it makes more sense.

. gen age = 2020 - year(dob)

. l

     +-----------------------------+
     |     sdate         dob   age |
     |-----------------------------|
  1. | 01jan1986   01jan1986    34 |
  2. | 05jan2001   05jan2001    19 |
  3. | 07mar1983   07mar1983    37 |
     +-----------------------------+

For 20-year bins, why not make them self-describing? With this code, 20, 40, etc. are the upper limits of each bin. (You might need to tweak that if you have children under 1 year old in your data.)

. gen age2 = 20 * ceil(age/20)

. l

     +------------------------------------+
     |     sdate         dob   age   age2 |
     |------------------------------------|
  1. | 01jan1986   01jan1986    34     40 |
  2. | 05jan2001   05jan2001    19     20 |
  3. | 07mar1983   07mar1983    37     40 |
     +------------------------------------+
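One possible version of the tweak for children under 1 year old is sketched below. It is an assumption about what such a tweak could look like, not part of the original answer: an age of 0 would otherwise land in a bin labelled 0, so the lowest bin is forced up to 20. The example ages are made up.

```stata
clear
input age
 0
19
20
21
37
end

* max() ignores missing arguments, so guard against missing ages explicitly
gen age2 = max(20, 20 * ceil(age/20)) if !missing(age)

list
```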

This paper is a review of rounding and binning using Stata.