I have a bunch of Parquet files and I'm trying to load them into Hive with the following query:
CREATE EXTERNAL TABLE `events` (
-- ... fields ...
)
PARTITIONED BY (... partition columns ...)
STORED AS PARQUET
LOCATION '/path/to/parquet/files';
It creates the table, but it doesn't load the data at the location. Is there something wrong with the query?
The location you specify in the CREATE TABLE statement should be an HDFS location. CREATE TABLE does not load any files; it only registers the table's metadata. To load a local file into a partition, use LOAD DATA:
LOAD DATA LOCAL INPATH './local/file/path' OVERWRITE INTO TABLE invites PARTITION (key='value');
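For your events table, that would look something like the sketch below. The partition column name dt, its value, and the local path are placeholders; adjust them to your actual schema and data layout:

-- hypothetical example: dt and the local path are placeholders
LOAD DATA LOCAL INPATH '/local/path/to/events-2015-01-01.parquet'
OVERWRITE INTO TABLE events PARTITION (dt='2015-01-01');

Note that LOAD DATA copies the files as-is into the partition's directory; Hive applies the schema only on read, so the Parquet schema must match the table definition for queries to work.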
See the Hive manual for details. Alternatively, you can put the files into HDFS yourself, under one subdirectory per partition, with the
hadoop fs -copyFromLocal /path/in/linux /hdfs/path
command, then create the table and run MSCK REPAIR TABLE to register the partitions; again, see the manual for details.
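Put together, the second approach might look like the following sketch. All paths, the partition column dt, and its value are placeholders for your actual layout:

# create a key=value partition directory under the table location
hadoop fs -mkdir -p /path/to/parquet/files/dt=2015-01-01
# copy the local parquet files into that partition directory
hadoop fs -copyFromLocal /local/events/2015-01-01/*.parquet /path/to/parquet/files/dt=2015-01-01/

Then, in Hive, after creating the external table as above:

-- scan the table location and add any key=value subdirectories as partitions
MSCK REPAIR TABLE events;

MSCK REPAIR TABLE only picks up subdirectories whose names follow the key=value convention matching the table's partition columns, so name the directories accordingly.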