I have the following data:
head(MS.data.in)
encounter_id patient_nbr race gender age weight admission_type_id
1 2278392 8222157 Caucasian Female [0-10) ? 6
2 149190 55629189 Caucasian Female [10-20) ? 1
3 64410 86047875 AfricanAmerican Female [20-30) ? 1
4 500364 82442376 Caucasian Male [30-40) ? 1
5 16680 42519267 Caucasian Male [40-50) ? 1
6 35754 82637451 Caucasian Male [50-60) ? 2
I would like to change the observations in the 'age' column by taking the upper bound of each interval (the digits after the hyphen), as shown below:
head(MS.data.in$age)
[1] 10 20 30 40 50 60
We can use sub to extract the values by matching characters up to the - (.*-), followed by one or more digits inside a capture group ((\\d+)), followed by characters until the end of the string (.*), and replacing the whole match with the backreference (\\1) of the capture group.
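The pattern described above can be put together as in the sketch below. The data frame here is a toy stand-in that reproduces only the 'age' column from the question, and wrapping the result in as.numeric is an assumption about the desired output type (the expected output in the question is printed without quotes, which suggests numbers rather than strings).

```r
# Toy stand-in for MS.data.in: only the 'age' column from the question is reproduced
MS.data.in <- data.frame(
  age = c("[0-10)", "[10-20)", "[20-30)", "[30-40)", "[40-50)", "[50-60)"),
  stringsAsFactors = FALSE
)

# ".*-"     matches everything up to and including the hyphen
# "(\\d+)"  captures the digits that follow (the upper bound)
# ".*"      matches the rest of the string
# "\\1"     replaces the whole match with the captured digits
# as.numeric is an assumption: drop it to keep the values as character strings
MS.data.in$age <- as.numeric(sub(".*-(\\d+).*", "\\1", MS.data.in$age))

head(MS.data.in$age)
# [1] 10 20 30 40 50 60
```

Because the capture group uses \\d+ rather than a fixed width, an interval with a three-digit upper bound such as [90-100) would yield 100.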