R Creating a Character Column from a Numeric Column w/o using For Loop

I am trying to create a column of characters based on an existing column of numbers, preferably without using a for loop. I have come up with a variety of ways to do this, but I keep feeling like I'm making this far more complicated than it needs to be.

Here is the dreaded and time-consuming for loop version (edited):

joe <- as.data.frame(matrix(round(runif(10,1,5),0),nrow=10,ncol=1))
chr <- function(n) { rawToChar(as.raw(n)) }
m = ncol(joe)+1
for (i in 1:nrow(joe)){
  joe[i,m] <- chr(joe[i,m-1]+64)
}
joe

    V1 V2
1   2  B
2   1  A
3   3  C
4   2  B
5   1  A
6   3  C
7   2  B
8   5  E
9   3  C
10  4  D

Yeah. That works. As expected. But for a real dataset, that operation will take a very long time.

How about using an ASCII conversion function like rawToChar?

joe <- as.data.frame(matrix(round(runif(10,1,5),0),nrow=10,ncol=1))
joe[,ncol(joe)+1] <- rawToChar(as.raw(64+joe[,1]))
joe

    V1        V2
1   1 ACCAACACDC
2   3 ACCAACACDC
3   3 ACCAACACDC
4   1 ACCAACACDC
5   1 ACCAACACDC
6   3 ACCAACACDC
7   1 ACCAACACDC
8   3 ACCAACACDC
9   4 ACCAACACDC
10  3 ACCAACACDC

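That totally didn't work. rawToChar collapses the whole raw vector into a single string instead of returning one character per element, so the same string gets recycled into every row. Is there anything that takes a numeric vector as input and outputs a character vector?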
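For what it's worth, base R has a few ways to get one character per element. The sketch below uses made-up example codes (not the data above) and assumes every value is in 1..5, so that adding 64 lands on the ASCII capital letters:

```r
# Example codes in 1..5, standing in for the V1 column (illustrative values only)
codes <- c(1, 3, 3, 1, 4)

# intToUtf8() with multiple = TRUE returns one string per code point
letters_utf8 <- intToUtf8(64 + codes, multiple = TRUE)

# Alternatively, split the single string that rawToChar() produces into characters
letters_split <- strsplit(rawToChar(as.raw(64 + codes)), "")[[1]]

# Or apply the chr() helper from the loop version element by element
chr <- function(n) rawToChar(as.raw(n))
letters_vapply <- vapply(64 + codes, chr, character(1))
```

Any of the three could be assigned straight to a new column, for example joe$V2 <- intToUtf8(64 + joe$V1, multiple = TRUE).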

Abandoning that approach for a bit, I was able to get two other ways to work... but they aren't particularly elegant, taking quite a few lines of code to implement. First the lookup table approach:

library("dplyr")
grades <- as.data.frame(matrix(seq(1,5,by=1),nrow=5,ncol=1))
grades <- cbind(grades, c("A","B","C","D","E"))
colnames(grades) <- c("num","ltr")
joe <- as.data.frame(matrix(round(runif(10,1,5),0),nrow=10,ncol=1))
colnames(joe) <- c("num")
left_join(joe, grades, by="num")

    num ltr
1    3   C
2    4   D
3    2   B
4    5   E
5    4   D
6    2   B
7    2   B
8    1   A
9    5   E
10   2   B

And now using factors and levels:

joe <- as.data.frame(matrix(round(runif(10,1,5),0),nrow=10,ncol=1))
# Fix the levels at 1:5 so the labels stay aligned even if a value never appears in the sample
joe$V2 <- factor(joe$V1, levels=1:5, labels=c("A","B","C","D","E"))
joe$V2 <- as.character(joe$V2)
joe
joe

   V1 V2
1   4  D
2   1  A
3   3  C
4   2  B
5   3  C
6   3  C
7   5  E
8   2  B
9   4  D
10  4  D
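One caveat with the factor route: as.factor() only creates levels for values that actually occur, so relabelling the levels afterwards can silently shift the letters when some value in 1..5 is missing from the sample. A small sketch with made-up values (not the data above) that compares relabelling after the fact with fixing the levels up front:

```r
# Illustrative input where 1 and 4 never occur
x <- c(2, 3, 5)

# Relabelling after as.factor(): the existing levels "2", "3", "5" take the first three labels
shifted <- as.factor(x)
levels(shifted) <- c("A", "B", "C", "D", "E")
shifted_chr <- as.character(shifted)

# Declaring levels and labels together keeps 2 -> B, 3 -> C, 5 -> E
aligned_chr <- as.character(factor(x, levels = 1:5, labels = c("A", "B", "C", "D", "E")))
```

With a sample of ten values drawn from five the problem will often not show up, which is what makes it easy to miss.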

So my question is really... are there other simpler and more elegant ways of doing this that I haven't thought of yet? Because it sure seems like I have invented some pretty complex ways to do a pretty simple operation.

Thanks in advance for your input.

There is 1 answer

C8H10N4O2 On BEST ANSWER

I am not really sure what you are trying to do, but just looking at your last example it looks like you are trying to do something like this:

set.seed(123) # good practice for reproducible answer
joe <- data.frame( V1 = sample.int(5,10,replace=TRUE) ) # simpler way
joe$V2 <- LETTERS[joe$V1]
joe
#    V1 V2
# 1   2  B
# 2   4  D
# 3   3  C
# 4   5  E
# 5   5  E
# 6   1  A
# 7   3  C
# 8   5  E
# 9   3  C
# 10  3  C
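Indexing LETTERS directly works because the codes happen to be the positions 1..5. If the codes were arbitrary numbers, the same vectorized idea could be sketched with match(); the keys, labels, and codes below are hypothetical and only there for illustration:

```r
# Hypothetical codes that are not 1..n, so LETTERS[codes] would not apply directly
codes  <- c(10, 30, 20, 30, 50)
keys   <- c(10, 20, 30, 40, 50)
labels <- c("A", "B", "C", "D", "E")

# match() returns the position of each code within keys;
# indexing labels by those positions performs the lookup in one vectorized step
result <- labels[match(codes, keys)]
```

Codes that do not appear in keys would come back as NA rather than raising an error.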