How to remove white space from a data frame in R, when importing from SPSS

I'm using read.spss in the "foreign" package to read in a .sav file to R.

This is survey data from an online survey. However, the results (via the SPSS file) contain long runs of white space in some fields (I assume from the text entry fields on the online form), and these appear in the output when I use write.csv.

For reference, this is the code I'm using:

dataset <- read.spss(file.choose(), to.data.frame=TRUE)

# write.csv() returns NULL and ignores `append` (with a warning), so both are dropped here
write.csv(dataset, file=file.choose(), na="NA", row.names=FALSE, fileEncoding="UTF-8")

Can I adjust this to replace the whitespace in the data frame with NA for my final csv output?

There are 3 answers

jmk On

Resolved: I discovered that using the memisc package and replacing my original read.spss call with

dataset <- as.data.set(spss.system.file(file.choose()))

or, for a portable file,

dataset <- as.data.set(spss.portable.file(file.choose()))

avoids importing the long whitespace-padded strings. More here: Read SPSS file into R

Anthony Damico On
# if your data.frame object is `x`
library(stringr)

# convert all factor columns to character
facs <- sapply( x , is.factor )
x[ facs ] <- sapply( x[ facs ] , as.character )

# trim all character columns,
# removing leading and trailing whitespace
chars <- sapply( x , is.character )
x[ chars ] <- sapply( x[ chars ] , str_trim )
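
The question also asks for the blank fields to end up as NA in the CSV, which trimming alone does not do. A minimal base-R sketch of that extra step follows; the data frame `x` and its columns are made up for illustration, and `trimws()` is used in place of `str_trim()` only to avoid the stringr dependency.

```r
# Hypothetical data standing in for the imported SPSS data frame
x <- data.frame(
  id      = 1:3,
  comment = factor(c("good   ", "      ", "  fine")),
  city    = c("Oslo  ", "", " Rome"),
  stringsAsFactors = FALSE
)

# Convert factor columns to character; lapply() returns one list element per column
facs <- vapply(x, is.factor, logical(1))
x[facs] <- lapply(x[facs], as.character)

# Trim leading/trailing whitespace, then replace empty strings with NA
chars <- vapply(x, is.character, logical(1))
x[chars] <- lapply(x[chars], function(col) {
  col <- trimws(col)
  col[which(col == "")] <- NA_character_
  col
})
```

After this, `write.csv(x, ..., na="NA")` writes the formerly blank fields as NA.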
fabou On

Little mistake, I guess:

x[ facs ] <- sapply( x[ facs ] , as.character )

should be:

x[ facs ] <- lapply( x[ facs ] , as.character )

lapply instead of sapply.

(I don't know why; I've only been learning R for a few days.)
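
One likely reason, sketched below on a made-up one-row data frame: sapply() simplifies its result to a vector or matrix when it can, while lapply() always returns a list with one element per column, which is the shape a data frame column assignment expects.

```r
# Hypothetical one-row data frame with two factor columns
x <- data.frame(a = factor("yes  "), b = factor("no "), n = 1)
facs <- sapply(x, is.factor)

s <- sapply(x[facs], as.character)  # simplified to a named character vector
l <- lapply(x[facs], as.character)  # list, one element per column

is.list(s)  # FALSE
is.list(l)  # TRUE

# Assigning the list keeps each column separate and of type character
x[facs] <- l
```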