Relatively new to the Hadoop world so apologies if this is a no-brainer but I haven't found anything on this on SO or elsewhere.
In short, I have an external table created in Hive that reads data from a folder of CSV files in HDFS. The issue is that while the first column has the string datatype it isn't reading the entire value for a values that contain a hyphen.
For example:
Hive table definition:
CREATE EXTERNAL TABLE BG (
`Id` string,
`Someint` int,
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/Path/To/Folder'
TBLPROPERTIES ("skip.header.line.count"="1");
Example Data:
506253-A,10
506254-C,19
506233,8
When I query the data in Hive or browse in HCat this is what I see:
Id Someint
506253 10
506254 19
506233 8
Any ideas? Thanks in advance!
What version of Hive are you using? On Amazon EMR, Hive version 0.13.1 I run your code and get the following