I have data in the following format:
Reg_No Subject
AA11 Physics
AA11 Chemistry
AA12 English
AA12 Maths
AA12 Physics
I am trying to transform this data into a row-wise layout, with one row per student:
Physics Chemistry
English Maths Physics
I know that each student can take a maximum of 8 subjects.
I am trying to create a matrix that stores the above data as variable-length rows (each student has a different number of subjects).
I have written the following code:
# read csv file
Term4 <- read.csv("Term4.csv")
# Find number of Students
Matrix_length <- length(unique(Term4$Reg_No))
# Uniquely store their reg number
Student <- unique(Term4$Reg_No)
# create matrix to be inserted as csv
out <- matrix(NA, nrow=Matrix_length , ncol=8) # max subjects = 8 so ncol =8
# iterate to get each reg number's subjects
for (n in 1:Matrix_length) {
  y <- Term4[Term4[,"Reg_No"] == Student[n],]$Subject
  # transpose Courses as a single column into row and insert it in the matrix
  out[n,] <- t(y)
}
I am getting the following error:
Error in out[n, ] <- t(y) :
number of items to replace is not a multiple of replacement length
Could anyone please tell me how to fix this error?
Thanks and Regards
reshape() can do this. It will generate a data.frame with as many columns as are necessary to cover all subjects.
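As a rough sketch of the reshape() approach, assuming the toy data from the question is typed in by hand rather than read from Term4.csv: the helper column idx is a name introduced here for illustration, and it numbers each student's subjects so that reshape() knows which wide column each subject belongs in.

```r
# Toy data copied from the question (stands in for read.csv("Term4.csv"))
Term4 <- data.frame(
  Reg_No  = c("AA11", "AA11", "AA12", "AA12", "AA12"),
  Subject = c("Physics", "Chemistry", "English", "Maths", "Physics"),
  stringsAsFactors = FALSE
)

# idx is a hypothetical helper column: it counts 1, 2, ... within each Reg_No
Term4$idx <- ave(seq_along(Term4$Reg_No), Term4$Reg_No, FUN = seq_along)

# Long to wide: one row per Reg_No, one Subject.<idx> column per position.
# Students with fewer subjects get NA in the trailing columns.
wide <- reshape(Term4, idvar = "Reg_No", timevar = "idx", direction = "wide")
print(wide)
```

The number of Subject columns follows the student with the most subjects in the data, so nothing has to be preallocated.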
The reason your code is failing is that you've preallocated your matrix with 8 columns, but the RHS of each assignment contains only as many subjects as the current student n has in the original data.frame. R rejects index-assignments whose target length is not divisible by the RHS length (for plain vectors this is only a warning, but for matrices it is an error; regardless, it's probably never the right thing to do).
In general, if you ever do need to carry out such a non-divisible assignment, you can do it by extending the RHS to sufficient length by appending NAs. This could be done with rep() and c(), but there's actually an elegant and easy way to do it using out-of-bounds indexing. Here's a demo:
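A minimal sketch of the out-of-bounds indexing idea, assuming the toy data from the question is typed in by hand: indexing a vector past its end returns NA for the missing positions, so y[1:8] always has length 8. The as.character() call is a precaution added here in case Subject was read in as a factor.

```r
# Toy data copied from the question (stands in for read.csv("Term4.csv"))
Term4 <- data.frame(
  Reg_No  = c("AA11", "AA11", "AA12", "AA12", "AA12"),
  Subject = c("Physics", "Chemistry", "English", "Maths", "Physics"),
  stringsAsFactors = FALSE
)

# Indexing beyond the end of a vector pads the result with NA
y <- c("Physics", "Chemistry")
padded <- y[1:8]   # positions 3 to 8 are out of bounds, so they come back as NA

# The same trick applied to the loop from the question
Student <- unique(Term4$Reg_No)
out <- matrix(NA_character_, nrow = length(Student), ncol = 8)
for (n in seq_along(Student)) {
  y <- as.character(Term4$Subject[Term4$Reg_No == Student[n]])
  # y[seq_len(ncol(out))] has exactly ncol(out) elements, padded with NA
  out[n, ] <- y[seq_len(ncol(out))]
}
print(out)
```

Because the RHS now always matches the row length, the "number of items to replace" error no longer occurs.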