Suppose I have a table of around 1M rows with the following format:
id paid_2000 paid_2001 paid_2002 paid_2003 censor_yr
1 10 20 10 20 2001
2 15 25 15 15 2003
What would be an efficient way to set each observation to NA where the paid year is greater than or equal to the censor year? In particular, I would like the table to look like this:
id paid_2000 paid_2001 paid_2002 paid_2003 censor_yr
1 10 NA NA NA 2001
2 15 25 15 NA 2003
Using dplyr:
Throw a %>% select chain at the bottom to move censor_yr back to the end. I can explain how it works if you'd like. It may or may not be easier to read than akrun's answers.
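A minimal sketch of one way to do this with dplyr, shown on the two example rows from the question. It is not necessarily the code originally posted: the use of tidyr for reshaping and the names df, result, year and paid are assumptions made for illustration. The idea is to reshape to long format so each row holds one (id, year) payment, blank out payments whose year is at or past censor_yr, reshape back to wide, and then use select to put censor_yr at the end again.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is used for the reshaping steps

# Example data from the question (df is a hypothetical name)
df <- data.frame(
  id        = c(1, 2),
  paid_2000 = c(10, 15),
  paid_2001 = c(20, 25),
  paid_2002 = c(10, 15),
  paid_2003 = c(20, 15),
  censor_yr = c(2001, 2003)
)

result <- df %>%
  # one row per id and year; strip the "paid_" prefix so year is "2000", "2001", ...
  pivot_longer(starts_with("paid_"),
               names_to = "year", names_prefix = "paid_",
               values_to = "paid") %>%
  # set the payment to NA when its year is >= the censor year
  mutate(paid = replace(paid, as.integer(year) >= censor_yr, NA)) %>%
  # back to one row per id, restoring the paid_YYYY column names
  pivot_wider(names_from = year, values_from = paid,
              names_prefix = "paid_") %>%
  # pivot_wider leaves censor_yr next to id, so move it back to the end
  select(id, starts_with("paid_"), censor_yr)

print(result)
```

On a table of around 1M rows the long format has one row per id and year, so memory use grows with the number of paid_ columns; whether that is acceptable depends on how many years there are.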