I am new to Hadoop. I have data in TSV format with 50 columns, and I need to store it in Hive. How can I create the table and load the data on the fly, using schema on read, without manually writing a CREATE TABLE statement?
Schema on read in hive for tsv format file
Asked by Priyanka Shekhawat
There are 2 answers
phaneendra kumar
You can use Hue:
http://gethue.com/hadoop-tutorial-create-hive-tables-with-headers-and/
Alternatively, with Spark you can infer the schema of the delimited file and then save the result as a Hive table.
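val df = spark.read
  .option("delimiter", "\t")      // tab-separated input
  .option("header", "true")       // first line holds the column names
  .option("inferSchema", "true")  // <-- HERE: Spark reads the data to guess column types
  .csv("/home/cloudera/Book1.csv")
// "book1" is a placeholder table name; spark must be built with enableHiveSupport()
df.write.saveAsTable("book1")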
Hive requires you to run a CREATE TABLE statement because the Hive metastore must be updated with a description of the data location you are going to query later.
Schema-on-read doesn't mean that you can query any file without first knowing metadata such as its storage location and storage format.
SparkSQL or Apache Drill, on the other hand, will let you infer the schema from a file, but you must still define the column types for a TSV if you don't want everything to be a string column (or coerced to unexpected types). Both of these tools can interact with a Hive metastore for "decoupled" storage of schema information.
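For the plain Hive route, the CREATE TABLE statement still has to exist, but with 50 columns it does not have to be typed by hand. The following is a minimal sketch, not an established tool: it assumes the first line of the TSV is a header, types every column as STRING (the "everything is a string" case mentioned above), and uses hypothetical names (`TsvToHiveDdl`, `my_db.my_table`, `/data/my_table`) purely for illustration.

```scala
object TsvToHiveDdl {

  /** Builds a CREATE EXTERNAL TABLE statement from a tab-separated header line.
    * Every column is declared as STRING; adjust the types by hand afterwards if needed.
    */
  def buildDdl(headerLine: String, table: String, location: String): String = {
    val delim = "\\t" // the two characters backslash and t, as Hive expects in the DDL

    val cols = headerLine.split("\t").map { name =>
      // Lower-case the name and replace anything that is not a letter, digit, or underscore
      val clean = name.trim.toLowerCase.replaceAll("[^a-z0-9_]", "_")
      s"  `$clean` STRING"
    }

    s"""CREATE EXTERNAL TABLE IF NOT EXISTS $table (
       |${cols.mkString(",\n")}
       |)
       |ROW FORMAT DELIMITED FIELDS TERMINATED BY '$delim'
       |STORED AS TEXTFILE
       |LOCATION '$location'
       |TBLPROPERTIES ('skip.header.line.count'='1')""".stripMargin
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical header; in practice, read the first line of the real file instead
    val header = "id\tname\tsignup date"
    println(buildDdl(header, "my_db.my_table", "/data/my_table"))
  }
}
```

The printed statement can be pasted into the Hive CLI or Beeline. Because the table is EXTERNAL and points at the directory that holds the TSV, no separate load step is needed, and the `skip.header.line.count` property tells Hive to ignore the header row when querying.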