Add a prefix and a suffix only to some of the rownames


I have this dataframe:

df <- structure(list(Treatnent.state = c("PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy", 
"PRE Immune Checkpoint Blockade Therapy", "PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)", 
"PRE Immune Checkpoint Blockade Therapy (On dabrafenib+trametinib)"
), timepoint = c(-6, 0, 0, 0, 0, -1, 0, -3, -2, 0, 0, -1, 0, 
0), Patient = c(115, 148, 208, 208, 272, 39, 42, 422, 62, 208, 
208, 39, 42, 42)), class = "data.frame", row.names = c("115-031814          ", 
"148-6-5-14_S9       ", "208-3-11-15_S13     ", "208-9-10-14_S11     ", 
"272-121914          ", "39-3-31-14_S15      ", "42-10-17-14_S3      ", 
"422-092815          ", "62-10-2-13_S6       ", "MGH208_031115-1.bam ", 
"MGH208_031115-2.bam ", "MGH39_033114.bam    ", "MGH42_101714.bam    ", 
"MGH42_101714_1.bam  "))

with rownames:

 [1] "115-031814          " "148-6-5-14_S9       " "208-3-11-15_S13     " "208-9-10-14_S11     "
 [5] "272-121914          " "39-3-31-14_S15      " "42-10-17-14_S3      " "422-092815          "
 [9] "62-10-2-13_S6       " "MGH208_031115-1.bam " "MGH208_031115-2.bam " "MGH39_033114.bam    "
[13] "MGH42_101714.bam    " "MGH42_101714_1.bam  "

I want to add a prefix "X" and suffix ".bam", only for the rownames that don't start with MGH.

For example, the first row name, 115-031814, would become X115-031814.bam, while MGH208_031115-1.bam would not change at all.

Maël (best answer)

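Use grepl() to check whether each row name starts with "MGH", then ifelse() to paste "X" in front and ".bam" behind only the names that do not. I used trimws() because your row names are padded with trailing whitespace. The result is a character vector; assign it back to rownames(df) to update the data frame.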

ifelse(!grepl("^MGH", rownames(df)),
       paste0("X", trimws(rownames(df)), ".bam"),
       trimws(rownames(df)))

output

 [1] "X115-031814.bam"      "X148-6-5-14_S9.bam"   "X208-3-11-15_S13.bam"
 [4] "X208-9-10-14_S11.bam" "X272-121914.bam"      "X39-3-31-14_S15.bam" 
 [7] "X42-10-17-14_S3.bam"  "X422-092815.bam"      "X62-10-2-13_S6.bam"  
[10] "MGH208_031115-1.bam"  "MGH208_031115-2.bam"  "MGH39_033114.bam"    
[13] "MGH42_101714.bam"     "MGH42_101714_1.bam"
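
If the goal is to change the data frame itself, the new names have to be assigned back. Here is a minimal sketch of one way to do that, using startsWith() (base R 3.3.0 or later) instead of a regular expression. The small df below is only a stand-in with three of the rows from the question, and rn and is_mgh are names chosen for illustration.

```r
# Stand-in for the data frame in the question (three of its rows only)
df <- data.frame(
  Patient = c(115, 148, 208),
  row.names = c("115-031814          ",
                "148-6-5-14_S9       ",
                "MGH208_031115-1.bam ")
)

rn <- trimws(rownames(df))        # strip the whitespace padding once
is_mgh <- startsWith(rn, "MGH")   # TRUE for names that should stay as they are

# Add the prefix and suffix only where the name does not start with "MGH"
rn[!is_mgh] <- paste0("X", rn[!is_mgh], ".bam")

rownames(df) <- rn                # row names must stay unique for this to work
rownames(df)
# [1] "X115-031814.bam"     "X148-6-5-14_S9.bam"  "MGH208_031115-1.bam"
```

Trimming once and indexing with a logical vector avoids calling trimws() twice, but the ifelse() version above gives the same names.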