Converting character to factor using lapply with no melting


I have a list of character matrices and would like to convert two of the columns (lat, lon) to factor. I tried lapply for this and it works, but it also reshapes my data frames. I tried as.factor two ways: on just the two desired columns (not good, it returns all other columns as NA) and on the entire data frame, but reshaping occurs in both cases. I then tried to melt my list of matrices back to the original, desired shape, but thought it would be better to avoid creating the problem in the first place than to fix it after the fact. Any ideas on how to convert to factor without the reshaping occurring?

Attempt on just the cols:

ix <- 5:6
mytest[ix] <- lapply(mytest[ix], as.factor)

Attempt on the whole df:

lapply(mytest, as.factor)

Sample data:

mytest <- list(structure(c("study1", "study1", "study1", "study1", "study1", 
"study1", "study1", "study1", "study1", "study1", "study1", "study1", 
"study1", "study1", "study1", "58", "58", "58", "58", "58", "58", 
"58", "58", "58", "58", "58", "58", "58", "58", "58", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "321", 
"329", "323", "324", "61", "326", "6", "60", "49", "10", "7", 
"59", "57", "56", "11", "32.884720435", "32.8841969254545", "32.8835599674286", 
"32.88419565", "32.8837771221667", "32.88411147", "32.883244695", 
"32.8837003266667", "32.8838778530086", "32.8853723146154", "32.8027296698536", 
"32.9164754136842", "32.8853777533333", "32.8854051", "32.802755201875", 
"-117.24062533", "-117.240416713636", "-117.240532619714", "-117.24070002", 
"-117.24038866075", "-117.24022087", "-117.240140015", "-117.239834913333", 
"-117.240522195673", "-117.240133633077", "-117.210527201581", 
"-117.236141991053", "-117.24063566", "-117.23989078", "-117.210382870833"
), .Dim = c(15L, 6L), .Dimnames = list(NULL, c("study", "ID", 
"locDate", "locNumb", "meanLat", "meanLon"))), structure(c("Study2", 
"Study2", "Study2", "Study2", "Study2", "Study2", "Study2", "Study2", 
"Study2", "Study2", "Study2", "Study2", "Study2", "Study2", "59", 
"59", "59", "59", "59", "59", "59", "59", "59", "59", "59", "59", 
"59", "59", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"429", "418", "422", "432", "430", "426", "420", "354", "67", 
"419", "425", "427", "421", "428", "32.86543857", "32.867004565", 
"32.8694241808955", "32.8651107616667", "32.868857725", "32.8693627126536", 
"32.8696329253571", "32.86955278", "32.869014345", "32.8692111971429", 
"32.8694814566667", "32.8696187847619", "32.8698972233333", "32.868283279", 
"-117.254194355", "-117.25283091", "-117.25050148", "-117.254406255417", 
"-117.25133879", "-117.235585179972", "-117.250467514464", "-117.25014399", 
"-117.25006813", "-117.235456126857", "-117.235959423333", "-117.250773722857", 
"-117.250450876667", "-117.2512085715"), .Dim = c(14L, 6L), .Dimnames = list(
NULL, c("study", "ID", "locDate", "locNumb", "meanLat", "meanLon"
    ))))

There are 2 answers

Sven Hohenstein (best answer)

You can transform the list of two matrices with

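lapply(mytest, as.data.frame, stringsAsFactors = TRUE)

The result is a list of two data frames. All of their columns are factors. Before R 4.0.0, stringsAsFactors = TRUE was the default and could be left out. From R 4.0.0 on, the default is FALSE, so without the argument the columns stay character.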
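A quick way to check this is sketched below on a small made-up matrix (the object `mytest_toy` and its values are illustrative, not the data from the question). The stringsAsFactors = TRUE argument is passed explicitly so that the outcome does not depend on the default of the installed R version.

```r
# Toy list with one character matrix (hypothetical values)
mytest_toy <- list(
  matrix(c("study1", "study1", "32.88", "32.89"), nrow = 2,
         dimnames = list(NULL, c("study", "meanLat")))
)

# Convert each matrix to a data frame, turning character columns into factors
res <- lapply(mytest_toy, as.data.frame, stringsAsFactors = TRUE)

dim(res[[1]])                # the 2 x 2 shape of the matrix is kept
sapply(res[[1]], is.factor)  # reports, per column, whether it is a factor
```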

sparrow
# something  <- your data

The problem is that you're not dealing with data frames:

sapply(something, class)
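This is likely also where the apparent reshaping comes from. A minimal sketch on a made-up character matrix (the object `m` and its values are illustrative only): as.factor treats a matrix as one plain vector, so the dimensions are dropped.

```r
# Toy character matrix standing in for one element of the list (hypothetical values)
m <- matrix(c("a", "b", "1.5", "2.5"), nrow = 2,
            dimnames = list(NULL, c("study", "meanLat")))

class(m)           # a matrix, not a data frame
f <- as.factor(m)  # the matrix is handled as a plain character vector
dim(f)             # NULL: the 2 x 2 shape is gone
length(f)          # one long factor holding all cells
```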

So you need to convert your data into actual data frames:

something2 <- lapply(something, function(x) as.data.frame(x, stringsAsFactors = FALSE))

Note that if you don't mind the other variables also being converted to factors, you can pass stringsAsFactors = TRUE instead (this was the default before R 4.0.0) and you're done. I assumed, however, that you wanted to keep the other variables as characters. In that case, convert only the variables you want:

for (i in seq_along(something2)) {
  # seq_along() is safe even if the list is empty, unlike 1:length()
  something2[[i]]$meanLat <- factor(something2[[i]]$meanLat)
  something2[[i]]$meanLon <- factor(something2[[i]]$meanLon)
}

Now the two variables have been converted to factors. Let's check the first one:

str(something2[[1]])
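The same steps can also be written without an explicit loop. The following is only a sketch under stated assumptions: the list `something` here is a small made-up stand-in with the column names from the question, and `cols` is a helper name introduced for illustration.

```r
# Toy list of character matrices using column names from the question
# (the values are made up for illustration)
something <- list(
  matrix(c("study1", "study1", "32.88", "32.89", "-117.24", "-117.25"),
         nrow = 2, dimnames = list(NULL, c("study", "meanLat", "meanLon"))),
  matrix(c("Study2", "32.86", "-117.25"),
         nrow = 1, dimnames = list(NULL, c("study", "meanLat", "meanLon")))
)

cols <- c("meanLat", "meanLon")  # columns to turn into factors

something2 <- lapply(something, function(m) {
  df <- as.data.frame(m, stringsAsFactors = FALSE)  # keep all columns as character first
  df[cols] <- lapply(df[cols], factor)              # convert only the chosen columns
  df
})

str(something2[[1]])
```

Selecting the columns by name rather than by position (such as 5:6) keeps the code working if the column order changes.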