Inserting variable-length rows into a 2D matrix in R


I have data of the following format

Reg_No     Subject
  AA11     Physics
  AA11   Chemistry
  AA12     English
  AA12       Maths
  AA12     Physics

I am trying to transform this data so that each student's subjects sit in a single row:

Physics   Chemistry
English       Maths   Physics

I know that each student can take a maximum of 8 subjects.

I am trying to create a matrix that can store the above data as variable-length rows (each student has a different number of subjects).

I have written the following code

# read csv file
Term4 <- read.csv("Term4.csv")
# Find number of Students
Matrix_length <- length(unique(Term4$Reg_No))
# Uniquely store their reg number
Student <- unique(Term4$Reg_No)
# create matrix to be inserted as csv
out <- matrix(NA, nrow=Matrix_length , ncol=8) # max subjects = 8 so ncol =8
# iterate to get each reg number's subjects
for (n in 1:Matrix_length) {
    y <- Term4[Term4[,"Reg_No"] == Student[n],]$Subject
    # transpose Courses as a single column into row and insert it in the matrix
    out[n,] <- t(y)
}

I am getting the following error

Error in out[n, ] <- t(y) :
    number of items to replace is not a multiple of replacement length

Could anyone please tell me how to fix this error?

Thanks and Regards

There is 1 answer

Answer by bgoldst

reshape() can do this:

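df <- data.frame(Reg_No=c('AA11','AA11','AA12','AA12','AA12'), Subject=c('Physics','Chemistry','English','Maths','Physics'));
## number each student's subjects 1, 2, 3, ... so reshape() knows which output column each one belongs in
df$time <- ave(seq_along(df$Reg_No), df$Reg_No, FUN=seq_along);
reshape(df, direction='wide', idvar='Reg_No', timevar='time');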
##   Reg_No Subject.1 Subject.2 Subject.3
## 1   AA11   Physics Chemistry      <NA>
## 3   AA12   English     Maths   Physics

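This will generate a data.frame with one row per student and as many Subject.N columns as the student with the most subjects needs; students with fewer subjects get NA in the remaining columns.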

Your code fails because you preallocated the matrix with 8 columns, but the RHS of each assignment contains only as many subjects as the current student n has in the original data.frame. R rejects index-assignments in which the number of elements being replaced is not a multiple of the RHS length (for plain vectors this is only a warning, with the RHS recycled, but for matrices it is an error; either way, it's almost never what you want).
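To see the difference between the two cases, here is a small sketch (the names v, m and r are only illustrative):

```r
## plain vector: 3 values are recycled into 8 slots, with only a warning
v <- rep(NA_character_, 8)
v[1:8] <- letters[1:3]   ## warning; v is now a b c a b c a b

## matrix row: the same mismatch is an error and the matrix is left unchanged
m <- matrix(NA_character_, 2, 8)
r <- try(m[1, ] <- letters[1:3], silent = TRUE)
inherits(r, 'try-error') ## TRUE
```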


In general, if you ever do need to carry out such a non-divisible assignment, you can extend the RHS to the required length by appending NAs. This could be done with rep() and c(), but there's an easier way using out-of-bounds indexing: indexing a vector past its end returns NA for the missing positions. Here's a demo:

m <- matrix(NA_character_,2,8);
m;
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] NA   NA   NA   NA   NA   NA   NA   NA
## [2,] NA   NA   NA   NA   NA   NA   NA   NA
m[1,] <- letters[1:3]; ## fails; indivisible
## Error in m[1, ] <- letters[1:3] :
##   number of items to replace is not a multiple of replacement length
m[2,] <- letters[1:3][1:ncol(m)]; ## works
m;
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
## [1,] NA   NA   NA   NA   NA   NA   NA   NA
## [2,] "a"  "b"  "c"  NA   NA   NA   NA   NA
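
Applied to the loop from the question, the same trick might look like the sketch below. It is only a sketch under a few assumptions: the data is built inline instead of being read from Term4.csv, Subject is converted to character in case it was read as a factor, and the limit of 8 subjects is taken from the question.

```r
## sample data standing in for read.csv("Term4.csv")
Term4 <- data.frame(Reg_No=c('AA11','AA11','AA12','AA12','AA12'),
                    Subject=c('Physics','Chemistry','English','Maths','Physics'))

Student <- as.character(unique(Term4$Reg_No))

## one row per student, 8 columns because the maximum is 8 subjects
out <- matrix(NA_character_, nrow=length(Student), ncol=8,
              dimnames=list(Student, NULL))

for (n in seq_along(Student)) {
    y <- as.character(Term4$Subject[Term4$Reg_No == Student[n]])
    ## pad y with NAs up to ncol(out) via out-of-bounds indexing;
    ## note that a student with more than 8 subjects would be silently truncated
    out[n, ] <- y[1:ncol(out)]
}

out[, 1:3]
```

The padded matrix always has 8 columns, whereas the reshape() approach only creates as many columns as the data actually needs.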