Schedule and automate Sqoop import/export tasks

Question

3k views Asked by ableHercules At 08 June 2015 at 23:22

I have a Sqoop job that needs to import data from Oracle into HDFS.

The Sqoop command I'm using is:
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '1' AND rownum < 10001 AND \$CONDITIONS" --target-dir /test1 --fields-terminated-by '\t'

I re-run the same command again and again, changing only partitionid from 1 to 96, so I have to execute the sqoop import command manually 96 times. The ORDERS table contains millions of rows, and each row has a partitionid from 1 to 96. I need to import up to 10,000 rows (rownum < 10001) for each partitionid into HDFS.

Is there a way to do this without running each command by hand? How can I automate the Sqoop job?

There are 2 answers

Rajesh N On 09 June 2015 at 03:40

Use crontab for scheduling. The crontab documentation is available online, or you can run man crontab in a terminal.

Put your sqoop import command in a shell script and run that script from crontab.
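
As a rough sketch of the crontab approach, the snippet below builds a crontab entry that would run a wrapper script every day at 02:00 and append its output to a log file. The script path, log path, and schedule are hypothetical placeholders, not values from the question.

```shell
#!/bin/bash
# Hypothetical paths: replace with the real script and log locations.
SCRIPT=/home/hadoop/sqoop_import.sh
LOG=/home/hadoop/sqoop_import.log

# Crontab fields: minute hour day-of-month month day-of-week command
# "0 2 * * *" means 02:00 every day; stdout and stderr are appended to the log.
CRON_LINE="0 2 * * * $SCRIPT >> $LOG 2>&1"

# Print the entry so it can be reviewed before it is installed.
echo "$CRON_LINE"

# To install it while keeping existing entries, one common idiom is:
#   ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Cron starts jobs with a minimal environment, so the wrapper script may need to set PATH (and any Hadoop or Sqoop environment variables) itself before calling sqoop.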

**vijay kumar** · Accepted Answer · 2015-06-09T17:28:17+00:00

Run the script with the partition ID as its argument, for example partition 20: $ ./script.sh 20

ramisetty@HadoopVMbox:~/ramu$ cat script.sh
#!/bin/bash

# Partition ID to import, passed as the first argument
PART_ID=$1
# Each partition is written to its own directory under /test
TARGET_DIR_ID=$PART_ID
echo "PART_ID: $PART_ID TARGET_DIR_ID: $TARGET_DIR_ID"
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir "/test/$TARGET_DIR_ID" --fields-terminated-by '\t'

To import all partitions from 1 to 96 in a single run:

ramisetty@HadoopVMbox:~/ramu$ cat script_for_all.sh
#!/bin/bash

# Run one import per partition ID, each into its own directory under /test
for part_id in {1..96}; do
  PART_ID=$part_id
  TARGET_DIR_ID=$PART_ID
  echo "PART_ID: $PART_ID TARGET_DIR_ID: $TARGET_DIR_ID"
  sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir "/test/$TARGET_DIR_ID" --fields-terminated-by '\t'
done
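
The loop above stops reporting nothing if an individual import fails. A possible variant, sketched below, records which partitions failed and returns a non-zero status so that a scheduler can notice. The function names (build_query, import_partition, import_range), the SQOOP_BIN override, and the password file path are assumptions made for illustration. The sketch also assumes a Sqoop version that supports --password-file, and it passes -m 1 because a free-form --query import run with several mappers normally also needs --split-by, and the rownum limit would then apply to each mapper separately.

```shell
#!/bin/bash
# Sketch only: connection string, paths, and helper names are placeholders.

SQOOP_BIN="${SQOOP_BIN:-sqoop}"                    # override for a dry run, e.g. SQOOP_BIN=echo
CONNECT="jdbc:oracle:thin:@hostname:port/service"  # placeholder taken from the question
PASSWORD_FILE="/user/sqoop/.password"              # hypothetical path for --password-file

# Print the free-form query for one partition ID.
# \$CONDITIONS stays literal because Sqoop substitutes it itself.
build_query() {
  printf '%s' "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$1' AND rownum < 10001 AND \$CONDITIONS"
}

# Import a single partition into /test/<partition id> with one mapper.
import_partition() {
  "$SQOOP_BIN" import \
    --connect "$CONNECT" \
    --username sqoop \
    --password-file "$PASSWORD_FILE" \
    --query "$(build_query "$1")" \
    --target-dir "/test/$1" \
    --fields-terminated-by '\t' \
    -m 1
}

# Import every partition from $1 to $2 and collect the IDs that failed.
import_range() {
  part_id="$1"
  last="$2"
  failed=""
  while [ "$part_id" -le "$last" ]; do
    echo "Importing partition $part_id"
    if ! import_partition "$part_id"; then
      failed="$failed $part_id"
    fi
    part_id=$((part_id + 1))
  done
  if [ -n "$failed" ]; then
    echo "Failed partitions:$failed" >&2
    return 1
  fi
  return 0
}

# Run only when a range is given on the command line, e.g. ./import_range.sh 1 96
if [ "$#" -eq 2 ]; then
  import_range "$1" "$2"
fi
```

Saved as a script (the name import_range.sh is only an example), it would be run as ./import_range.sh 1 96. Setting SQOOP_BIN=echo prints the sqoop commands instead of executing them, which can help when checking the generated query for each partition.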
