SPSS Syntax to import RFC 4180 CSV file with escaped double quotes

404 views Asked by At

How do I read an RFC4180-standard CSV file into SPSS? Specifically, how to handle string values that have embedded double quotes which are (properly) escaped with a second double quote?

Here's one instance of a record with a problematic value:

2985909844,,3,3,3,3,3,3,1,2,2,"I recall an ad for ""RackSpace"", but I don't recall if this was here or in another page.",200,1,1,1,0,1,0,Often

The SPSS syntax I used is as follows:

GET DATA
  /TYPE=TXT
  /FILE="/Users/pieter/Work/Stackoverflow/2013_StackOverflowRecoded.csv"
  /IMPORTCASE=ALL
  /ARRANGEMENT=DELIMITED
  /DELCASE=LINE
  /FIRSTCASE=2
  /DELIMITERS=","
  /QUALIFIER='"'
  /VARIABLES=  ... list of column names...

The import succeeds, but gets off track and throws warnings after encountering such values.

2

There are 2 answers

1
mirirai On BEST ANSWER

I'm afraid this is a bug in SPSS and therefore not possible to solve.

You might want to ask the IBM Support team about this issue and post their answer here, if you find it helpful.

One Workaround would be to change the escaped double quotes in your *.csv file(s) to some other quote type. This should be only little work if you use an advanced text editor such as notepad++ or the "sed" command line tool on UNIX like operation systems.

2
JKP On

Trying an example in the current version of Statistics (22) doubled identifiers are handled correctly, however, if you generate the syntax with the Text Wizard, the fields are too short in the generated syntax, so you would need to increase the widths.