I have the following data:
head(MS.data.in)
  encounter_id patient_nbr            race gender     age weight admission_type_id
1      2278392     8222157       Caucasian Female  [0-10)      ?                 6
2       149190    55629189       Caucasian Female [10-20)      ?                 1
3        64410    86047875 AfricanAmerican Female [20-30)      ?                 1
4       500364    82442376       Caucasian   Male [30-40)      ?                 1
5        16680    42519267       Caucasian   Male [40-50)      ?                 1
6        35754    82637451       Caucasian   Male [50-60)      ?                 2
I would like to replace each value in the 'age' column with the upper bound of its interval (the number after the hyphen), as shown below:
head(MS.data.in$age)
[1] 10 20 30 40 50 60
We can use sub to extract the values: match any characters up to the hyphen (.*-), then the digits inside a capture group ((\\d+)), then the remaining characters to the end of the string (.*), and replace the whole match with the backreference to the capture group (\\1).
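The pattern described above can be sketched as follows. This is a minimal example, assuming the age values are stored as strings such as "[0-10)"; the vector age is a hypothetical stand-in for MS.data.in$age, and the as.numeric() conversion is an assumption based on the numeric output shown in the question.

```r
# Hypothetical sample mirroring the first six values of MS.data.in$age
age <- c("[0-10)", "[10-20)", "[20-30)", "[30-40)", "[40-50)", "[50-60)")

# ".*-"     matches everything up to and including the hyphen
# "(\\d+)"  captures the digits that follow (the upper bound)
# ".*"      matches the rest of the string
# "\\1"     replaces the whole match with the captured digits
upper <- as.numeric(sub(".*-(\\d+).*", "\\1", age))

upper
# [1] 10 20 30 40 50 60
```

Applied to the data frame, the same call would be assigned back to the column, for example MS.data.in$age <- as.numeric(sub(".*-(\\d+).*", "\\1", MS.data.in$age)). If age is a factor, sub coerces it to character first, so no separate conversion should be needed.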