First of all, this is not a duplicate of this SO question.
I have a CSV file encoded in Shift-JIS.
This is my script to parse the file:
require 'csv'
str1 = '社員番号'
str2 = 'メールアドレス'
str1.force_encoding("Shift_JIS").encode!
str2.force_encoding("Shift_JIS").encode!
file=File.open("SyainInfo.csv", "r:Shift_JIS")
csv = CSV.read(file, headers: true)
p csv[str1]
p csv [str2]
But even after specifying the encoding, I am getting "invalid byte sequence in UTF-8 (ArgumentError)".
Any thoughts? My Ruby is 2.3.0.
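First of all, your encoding doesn't look right: force_encoding takes the bytes from str1 and interprets them as Shift JIS without changing them, whereas you probably want to convert the string to Shift JIS, which is what encode('Shift_JIS') does.
Next, you can pass a filename to CSV.read, so instead of opening the file yourself with File.open and handing the File object to CSV.read, you can just pass the filename along with an encoding option such as encoding: 'Shift_JIS'.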
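To make the difference between force_encoding and encode concrete, here is a small sketch that only looks at the bytes. It assumes the script itself is saved as UTF-8 (Ruby's default source encoding); the variable names are made up for illustration.

```ruby
# Assumes this source file is UTF-8, so the literal below holds UTF-8 bytes.
utf8 = '社員番号'

# force_encoding only relabels the existing bytes; nothing is converted.
# (dup keeps the original string untouched.)
relabelled = utf8.dup.force_encoding('Shift_JIS')

# encode transcodes the characters into real Shift JIS bytes.
converted = utf8.encode('Shift_JIS')

p utf8.bytesize        # size of the UTF-8 representation
p relabelled.bytesize  # unchanged, because only the label differs
p converted.bytesize   # smaller, because Shift JIS uses two bytes per kanji

# Converting back yields the original UTF-8 string again.
round_trip = converted.encode('UTF-8')
p round_trip == utf8
```

Only the string produced by encode can match a header that was read from a Shift JIS file, because only that one contains Shift JIS bytes.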
That said, you could either work with Shift JIS encoded strings, converting the header names with encode and reading the file with encoding: 'Shift_JIS'.
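A minimal sketch of the Shift JIS variant could look like this. Since I don't have your SyainInfo.csv, it writes a tiny stand-in file first; the sample row and the temp file are assumptions made purely so the snippet runs on its own.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for SyainInfo.csv, written as Shift JIS bytes.
sample = "社員番号,メールアドレス\n1001,taro@example.com\n"
file = Tempfile.new(['SyainInfo', '.csv'])
file.binmode
file.write(sample.encode('Shift_JIS'))
file.close
path = file.path

# Convert (not relabel) the header names to Shift JIS.
str1 = '社員番号'.encode('Shift_JIS')
str2 = 'メールアドレス'.encode('Shift_JIS')

# Pass the filename directly and tell CSV which encoding the data is in.
csv = CSV.read(path, headers: true, encoding: 'Shift_JIS')

ids   = csv[str1]  # column values, still Shift JIS encoded
mails = csv[str2]
p ids, mails
```

With this approach every string that comes out of the table is Shift JIS encoded, so anything you compare it against has to be converted as well.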
Or, and that's what I would do, you could work with UTF-8 strings by specifying a second encoding.
encoding: 'Shift_JIS:UTF-8' instructs CSV to read Shift JIS data and transcode it to UTF-8. It's equivalent to passing 'r:Shift_JIS:UTF-8' to File.open.
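Roughly, the UTF-8 variant might be sketched as follows. As before, the temp file and its single sample row are stand-ins I made up for SyainInfo.csv, not your actual data.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical stand-in for SyainInfo.csv, written as Shift JIS bytes.
sample = "社員番号,メールアドレス\n1001,taro@example.com\n"
file = Tempfile.new(['SyainInfo', '.csv'])
file.binmode
file.write(sample.encode('Shift_JIS'))
file.close
path = file.path

# Read Shift JIS from disk, transcode to UTF-8 while parsing.
csv = CSV.read(path, headers: true, encoding: 'Shift_JIS:UTF-8')

# Plain UTF-8 literals now work as header names.
ids   = csv['社員番号']
mails = csv['メールアドレス']
p ids, mails

# The same external:internal pair, given to File.open instead.
via_file = File.open(path, 'r:Shift_JIS:UTF-8') do |io|
  CSV.new(io, headers: true).read
end
p via_file['社員番号']
```

The advantage is that nothing else in the script needs to know the file was Shift JIS; all strings are ordinary UTF-8 from the moment they are parsed.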