schedule and automate sqoop import/export tasks

3k views Asked by At

I have a sqoop job which requires to import data from oracle to hdfs.

The sqoop query i'm using is
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '1' AND rownum < 10001 AND \$CONDITIONS" --target-dir /test1 --fields-terminated-by '\t'

I am re-running the same query again and again with change in partitionid from 1 to 96. so I should execute the sqoop import command manually 96 times. The table 'ORDERS' contains millions of rows and each row has a partitionid from 1 to 96. I need to import 10001 rows from each partitionid into hdfs.

Is there any way to do this? How to automate the sqoop job?

2

There are 2 answers

0
vijay kumar On BEST ANSWER

Run script : $ ./script.sh 20 //------- for 20th entry

ramisetty@HadoopVMbox:~/ramu$ cat script.sh
#!/bin/bash

PART_ID=$1
TARGET_DIR_ID=$PART_ID
echo "PART_ID:" $PART_ID  "TARGET_DIR_ID: "$TARGET_DIR_ID
sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir /test/$TARGET_DIR_ID --fields-terminated-by '\t'

For all 1 to 96 - single shot

ramisetty@HadoopVMbox:~/ramu$ cat script_for_all.sh
#!/bin/bash

for part_id in {1..96};
do
 PART_ID=$part_id
 TARGET_DIR_ID=$PART_ID
 echo "PART_ID:" $PART_ID  "TARGET_DIR_ID: "$TARGET_DIR_ID
 sqoop import --connect jdbc:oracle:thin:@hostname:port/service --username sqoop --password sqoop --query "SELECT * FROM ORDERS WHERE orderdate = To_date('10/08/2013', 'mm/dd/yyyy') AND partitionid = '$PART_ID' AND rownum < 10001 AND \$CONDITIONS" --target-dir /test/$TARGET_DIR_ID --fields-terminated-by '\t'
done
1
Rajesh N On

Use crontab for scheduling purposes. Crontab documentation can be found here or you could use man crontab in terminal.

Add your sqoop import command in shell script and execute this shell script using crontab.