Apache Falcon data backup


I am not able to back up data from one Hadoop cluster to another using Apache Falcon.

What methods are available for backing up data from one cluster to another?

Is a process entity or an Oozie workflow needed to back up data from one cluster to another using Apache Falcon?


There are 2 answers

Samarth Gupta On

Apache Falcon can back up data to another Hadoop cluster or to Amazon S3. Microsoft Azure support was planned, but I am not sure of its current status.

Data backup is done with the replication feature of a feed. Please refer to http://falcon.apache.org/FalconDocumentation.html#Replication for more details.

You will need to submit the cluster XMLs and one feed XML for replication (backup, in your case) to take place. The cluster XMLs hold the details of the clusters you want to copy data from and to.
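As a rough sketch of that submission step, the Python snippet below only assembles the Falcon CLI invocations (`falcon entity -type ... -submit -file ...`, followed by `-schedule -name ...` for the feed) without running them. The file names and the feed name are hypothetical placeholders, and the scheduling command is included on the assumption that the feed should start running right after submission.

```python
def falcon_backup_commands(cluster_files, feed_file, feed_name):
    """Build the Falcon CLI calls for a feed-replication backup.

    All file names and the feed name are supplied by the caller;
    nothing here contacts a Falcon server.
    """
    # Submit one cluster entity per Hadoop cluster (source and target).
    commands = [
        ["falcon", "entity", "-type", "cluster", "-submit", "-file", path]
        for path in cluster_files
    ]
    # Submit the feed that references both clusters.
    commands.append(
        ["falcon", "entity", "-type", "feed", "-submit", "-file", feed_file]
    )
    # Schedule the feed so that replication instances start running.
    commands.append(
        ["falcon", "entity", "-type", "feed", "-schedule", "-name", feed_name]
    )
    return commands


if __name__ == "__main__":
    # Hypothetical file and entity names, for illustration only.
    for cmd in falcon_backup_commands(
        ["primary-cluster.xml", "backup-cluster.xml"],
        "backup-feed.xml",
        "backupFeed",
    ):
        print(" ".join(cmd))
```

Each inner list can be passed to `subprocess.run(cmd, check=True)` on a host where the Falcon client is installed and configured.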

user7364171 On

Apache Falcon provides direct replication from one Hadoop cluster to another using feed replication. Define the clusters (one cluster entity per Hadoop cluster), then define a feed that references both clusters, marking one as type="source" and the other as type="target" (replication runs from the source cluster to the target cluster). Submit and schedule the feed, and replication will kick off.
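To make the source/target marking concrete, here is a minimal Python sketch that generates such a feed definition. The element layout follows the Falcon feed schema as I understand it, so check it against the entity specification for your Falcon version. The cluster names, feed name, data path, dates, frequency, retention and ACL values are all hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

FEED_NS = "uri:falcon:feed:0.1"  # namespace used by Falcon feed entities


def build_replication_feed(name, source, target, data_path,
                           start="2016-01-01T00:00Z",
                           end="2099-12-31T00:00Z"):
    """Return feed XML that replicates data_path from source to target."""
    feed = ET.Element("feed", {"name": name, "xmlns": FEED_NS})
    ET.SubElement(feed, "frequency").text = "days(1)"  # assumed daily feed
    ET.SubElement(feed, "timezone").text = "UTC"

    clusters = ET.SubElement(feed, "clusters")
    # One <cluster> per Hadoop cluster; the type attribute sets the
    # direction of replication (source -> target).
    for cluster_name, cluster_type in ((source, "source"), (target, "target")):
        cluster = ET.SubElement(
            clusters, "cluster", {"name": cluster_name, "type": cluster_type}
        )
        ET.SubElement(cluster, "validity", {"start": start, "end": end})
        # Placeholder retention policy; adjust to your backup needs.
        ET.SubElement(cluster, "retention",
                      {"limit": "days(30)", "action": "delete"})

    locations = ET.SubElement(feed, "locations")
    ET.SubElement(locations, "location", {"type": "data", "path": data_path})
    # Placeholder ownership and schema entries.
    ET.SubElement(feed, "ACL",
                  {"owner": "hdfs", "group": "hadoop", "permission": "0755"})
    ET.SubElement(feed, "schema", {"location": "/none", "provider": "none"})
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_replication_feed(
        "backupFeed", "primaryCluster", "backupCluster",
        "/data/events/${YEAR}-${MONTH}-${DAY}",
    ))
```

The cluster names passed in must match the `name` attributes of cluster entities that have already been submitted to Falcon.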