Having a strange issue here with apply and R 3.0.1.
I have a huge dataframe with text, numbers and logical values. The logical values are converted to chr when I use apply, but because R allows something like TRUE == "TRUE" that isn't a problem.
But apply seems to prepend a space to some logical values, and TRUE == " TRUE" is FALSE (while as.logical(" TRUE") returns NA). Of course, I could do
sapply(cuelist[,4],FUN=function(logicalvalue) as.logical(sub("^ +", "", logicalvalue)))
but that isn't nice and I still don't know why R does that.
df <- data.frame(test=c("a","b","<",">"),logi=c(TRUE,FALSE,FALSE,TRUE))
apply(df, MARGIN=1, function(listelement) print(listelement) )
Interestingly, in this example the spaces only appear at [2,1] and [2,4] of the result.
version _
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 3
minor 0.1
year 2013
month 05
day 16
svn rev 62743
language R
version.string R version 3.0.1 (2013-05-16)
nickname Good Sport
Edit: same behaviour on R version 2.15.0 (2012-03-30)
Edit2: My dataframe looks like this
> df
test logi
1 a FALSE
2 b FALSE
3 < TRUE
4 > TRUE
> str(df)
'data.frame': 4 obs. of 2 variables:
$ test: Factor w/ 4 levels "<",">","a","b": 3 4 1 2
$ logi: logi FALSE FALSE TRUE TRUE
In a way, the problem is with apply, but more appropriately, the problem is with as.matrix and how it handles logical values.
Here are a few examples to help elaborate on the query I had for Karl.
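As a rough sketch of the tests walked through in this answer: df1 is the data.frame from the question, while the contents of df2, df3 and df4 (and the names m1 to m4 and s1) are illustrative assumptions, not the original data. Since apply coerces a data.frame with as.matrix before doing anything else, the sketch looks at as.matrix directly.

```r
# df1: the data.frame from the question.
df1 <- data.frame(test = c("a", "b", "<", ">"),
                  logi = c(TRUE, FALSE, FALSE, TRUE))

# df2 (assumed values): "test" entries with a varying number of characters.
df2 <- data.frame(test = c("a", "bb", "<<<", ">>>>"),
                  logi = c(TRUE, FALSE, FALSE, TRUE))

# df3 (assumed values): an extra numeric column with numbers of varying width.
df3 <- data.frame(test = c("a", "b", "<", ">"),
                  num  = c(1, 10, 100, 1000),
                  logi = c(TRUE, FALSE, FALSE, TRUE))

# df4: "logi" is created as character up front.
df4 <- data.frame(test = c("a", "b", "<", ">"),
                  logi = as.character(c(TRUE, FALSE, FALSE, TRUE)),
                  stringsAsFactors = FALSE)

# Coerce each data.frame the same way apply() would.
m1 <- as.matrix(df1)
m2 <- as.matrix(df2)
m3 <- as.matrix(df3)
m4 <- as.matrix(df4)

# Print the matrices to inspect any padding in the "logi" and "num" columns.
print(m1)
print(m2)
print(m3)
print(m4)

# One possible reading of the sapply remark: convert column by column,
# which calls as.character() on each column instead of format().
s1 <- sapply(df1, as.character)
print(s1)
```

Whether padding shows up in m1 to m3 depends on how as.matrix formats non-character columns in the R version at hand; m4 and s1 should contain plain "TRUE" and "FALSE" either way.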
First, let's create four data.frames to do some tests on:
- Your data.frame, to demonstrate the behavior.
- A data.frame with a varying number of characters in the "test" column, to look into Karl's explanation of what's going on.
- A data.frame with some numbers, to help us start to understand what actually seems to be going on.
- A data.frame where your "logi" column is explicitly created with as.character.
Now, let's use as.matrix on each of them.
The first has a space before TRUE.
The second has a space before TRUE, but the "test" column remains unaffected. Hmm.
Ahh... The third has a space before TRUE and spaces before shorter numbers. So it seems that perhaps R is considering the underlying numeric value of TRUE and FALSE, but calculating the width from the number of characters in TRUE and FALSE. Again, the first "test" column remains unaffected.
Things seem fine with the fourth, where you tell R that the logi column is a character column.
For what it's worth, sapply doesn't seem to have that problem.
Update
In the R Public chat room, Joshua Ulrich points to format being the culprit. as.matrix uses as.vector for factors, which converts them to character (try str(as.vector(df1$test)) to see what I mean); for everything else, it uses format, but unfortunately it doesn't have an option to pass along any of the arguments of format, one of which is trim (which is set to FALSE by default).
Compare the following:
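A minimal sketch of that comparison, assuming df1 is the data.frame from the question (the names padded and trimmed are only for illustration):

```r
# df1: the data.frame from the question.
df1 <- data.frame(test = c("a", "b", "<", ">"),
                  logi = c(TRUE, FALSE, FALSE, TRUE))

# Default trim = FALSE: values are right-justified to a common width,
# so TRUE gets a leading space to match the width of FALSE.
padded <- format(df1$logi)
print(padded)
# " TRUE" "FALSE" "FALSE" " TRUE"

# trim = TRUE suppresses the leading blanks.
trimmed <- format(df1$logi, trim = TRUE)
print(trimmed)
# "TRUE"  "FALSE" "FALSE" "TRUE"

# The padded strings no longer convert back cleanly:
print(as.logical(padded))   # NA FALSE FALSE NA
print(as.logical(trimmed))  # TRUE FALSE FALSE TRUE
```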
So, how to sort of easily convert logical columns to character? Maybe something like this (though I would suggest creating a backup of your data first):
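One way this could look, as a sketch rather than the only approach (df is the data.frame from the question; df_backup and is_logi are names made up for this example):

```r
# The data.frame from the question.
df <- data.frame(test = c("a", "b", "<", ">"),
                 logi = c(TRUE, FALSE, FALSE, TRUE))

# Keep a copy of the original data before modifying it.
df_backup <- df

# Find the logical columns and convert only those to character.
is_logi <- sapply(df, is.logical)
df[is_logi] <- lapply(df[is_logi], as.character)

str(df)

# as.matrix() leaves character columns untouched, so apply() now sees
# "TRUE" and "FALSE" without padding.
print(apply(df, MARGIN = 1, function(row) row[["logi"]]))
```

Inside the function passed to apply, as.logical(row[["logi"]]) then gives back the logical value without any sub() clean-up.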