I need some help running a MapReduce program/job with the Cloudera Docker container.
I am using a high-spec Linux (ElementaryOS) laptop (24 GB RAM, i7 processor).
I installed the Cloudera Docker image, ran it, and was able to do the following without issues:
1. I get the # prompt and can run HDFS commands (hadoop fs -ls), though it doesn't return anything.
2. I can access the Hue editor.
3. I can run Cloudera Manager and start all services.
4. In my local environment (not inside the Docker container), I created a WordCount MapReduce program (JAR) and downloaded all of its Maven dependencies.
Now my questions are:
How do I submit this WordCount JAR to the running Docker container?
How do I run this MapReduce program/job (WordCount) against a text file uploaded to HDFS?
How to Execute MapReduce Job/JAR with Cloudera Quickstart Docker container
Asked by Srikanth (781 views)
If you start your container with port 8888 mapped to the host, you will be able to access Hue, which includes a file browser. That gives you an easy way to upload files to HDFS in your cluster.
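As a rough sketch, the port mapping can be done with the -p option of docker run. The image name, hostname, --privileged flag and start script below follow the run command in Cloudera's QuickStart Docker documentation; the --name option, the CONTAINER variable, the start_quickstart function and the DRY_RUN switch are my own additions for illustration.

```shell
#!/bin/sh
# Hypothetical container name (an assumption, not part of Cloudera's docs).
CONTAINER="${CONTAINER:-cdh}"

# Start the QuickStart container with Hue's port 8888 published on the host.
# With DRY_RUN=1 the command is only printed, not executed.
start_quickstart() {
    set -- docker run --name "$CONTAINER" --hostname=quickstart.cloudera \
        --privileged=true -t -i -p 8888:8888 \
        cloudera/quickstart /usr/bin/docker-quickstart
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

After calling start_quickstart, Hue should be reachable at http://localhost:8888 once the services inside the container have started.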
To launch a MapReduce job, you will need to copy your JAR into the container. Cloudera doesn't provide any volumes in its container (at least, none are documented here: http://www.cloudera.com/documentation/enterprise/latest/topics/quickstart_docker_container.html), so this can be challenging. You could try copying it in via scp.
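If scp is awkward to set up, one alternative (not the approach described above) is docker cp, which copies files from the host into a running container, followed by docker exec to run the Hadoop commands inside it. This is only a sketch: the container name cdh, the JAR path, the WordCount main class, the input file and the HDFS paths under /tmp/wordcount are all placeholders I made up, and the run and submit_wordcount helpers are hypothetical.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust to your own container, JAR and paths.
CONTAINER="${CONTAINER:-cdh}"             # name or ID shown by `docker ps`
JAR="${JAR:-target/wordcount.jar}"        # JAR built on the host
MAIN_CLASS="${MAIN_CLASS:-WordCount}"     # driver class inside the JAR
INPUT_FILE="${INPUT_FILE:-input.txt}"     # local text file to count
HDFS_IN="${HDFS_IN:-/tmp/wordcount/input}"
HDFS_OUT="${HDFS_OUT:-/tmp/wordcount/output}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

submit_wordcount() {
    # 1. Copy the JAR and the input file from the host into the container.
    run docker cp "$JAR" "$CONTAINER:/tmp/wordcount.jar" &&
    run docker cp "$INPUT_FILE" "$CONTAINER:/tmp/input.txt" &&
    # 2. Load the input file into HDFS.
    run docker exec "$CONTAINER" hadoop fs -mkdir -p "$HDFS_IN" &&
    run docker exec "$CONTAINER" hadoop fs -put /tmp/input.txt "$HDFS_IN" &&
    # 3. Run the job; the HDFS output directory must not exist yet.
    run docker exec "$CONTAINER" hadoop jar /tmp/wordcount.jar \
        "$MAIN_CLASS" "$HDFS_IN" "$HDFS_OUT" &&
    # 4. Print the result; the glob is quoted so that HDFS expands it.
    run docker exec "$CONTAINER" hadoop fs -cat "$HDFS_OUT/part-*"
}
```

Running submit_wordcount with DRY_RUN=1 prints the six commands without touching Docker, which is a cheap way to check the paths first. If the JAR's manifest already names the main class, the class argument to hadoop jar should be left out.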
I build some Cloudera containers myself, with one container per node type (masternode, datanode, edgenode), and I added a volume to the edgenode because it seemed like a good thing to provide. You can find my container on Docker Hub: https://hub.docker.com/r/loicmathieu/cloudera-cdh-edgenode/