My Scalding job is translated into 9 MapReduce jobs (m/r jobs). It's hard for me to tell which part of the code each m/r job corresponds to. Is there anything that could help me understand my job better?
//this has been copy&pasted from our internal wiki at Tapad. Feel free to share your experience!
Scalding can write the job graph in .dot format instead of running the job. This is triggered by passing the --tool.graph flag when you launch the job through Scalding's Tool.
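A minimal sketch of what launching the job in graph mode might look like. The jar path and job class below are hypothetical placeholders, and the sketch assumes a job started through com.twitter.scalding.Tool with hadoop on the PATH. It only prints the command unless RUN=1 is set, so it is safe to try anywhere.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own assembly jar and job class.
JAR="${JAR:-target/myjob-assembly.jar}"
JOB="${JOB:-com.example.MyJob}"

# Prefix the command with `echo` unless RUN=1, so nothing is launched by accident.
if [ "${RUN:-0}" = "1" ]; then runner=""; else runner="echo"; fi

# --tool.graph asks Scalding to write the .dot files instead of running the job.
# Any arguments your job requires (inputs, outputs) are still passed through "$@".
write_graph() {
  $runner hadoop jar "$JAR" com.twitter.scalding.Tool "$JOB" --hdfs --tool.graph "$@"
}

write_graph "$@"
```

Run it with RUN=1 on a machine that has Hadoop and your jar; the .dot files should appear in the directory you launched from.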
You should end up with two generated files ending in .dot. They are plain text files. One is a very detailed graph of all the Cascading functions used by your job. The other, whose name ends with _steps.dot, is a graph of the m/r jobs. Open them in your favorite editor and look for the nodes and the connections between them.
It's also possible to render the .dot files as PDF or PNG using Graphviz and its dot command.
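A short sketch of the rendering step, assuming Graphviz is installed and its dot command is on the PATH. The file example_steps.dot is a hypothetical stand-in created here so the commands can be tried without a real job; with a real job you would point dot at your own .dot files.

```shell
#!/bin/sh
# Create a tiny stand-in graph (hypothetical content, not real Scalding output).
cat > example_steps.dot <<'EOF'
digraph G {
  "step 1" -> "step 2";
  "step 2" -> "step 3";
}
EOF

if command -v dot >/dev/null 2>&1; then
  # -T selects the output format, -o the output file.
  dot -Tpdf example_steps.dot -o example_steps.pdf
  dot -Tpng example_steps.dot -o example_steps.png
else
  echo "Graphviz 'dot' not found; install the graphviz package first" >&2
fi
```

PDF tends to work better than PNG for the detailed graph, since you can zoom and search the text.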
Bonus tip: it can still be hard to figure out where each m/r job is in your code. If you add descriptions to your pipes (for example with withDescription in the typed API), they show up in the myjob_steps.dot file. Experiment with this function and regenerate the .dot file. In this case you don't need to generate a .pdf file: just open myjob_steps.dot in your favorite editor and search for the descriptions you added to mark up the code. You can find examples in the scalding repo.
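Instead of an editor you can also search from the command line. The sketch below uses a hypothetical stand-in file with made-up labels, since the exact layout of a real _steps.dot file depends on your Scalding and Cascading versions; the point is only that a description is plain text you can grep for.

```shell
#!/bin/sh
# Hypothetical stand-in for a steps file; real node labels will look different.
cat > example_desc_steps.dot <<'EOF'
digraph G {
  1 [label = "(1/2) join users with clicks"];
  2 [label = "(2/2) aggregate clicks per user"];
  1 -> 2;
}
EOF

# -n prints the line number of each match, so you can jump to the node in an editor.
grep -n "join users with clicks" example_desc_steps.dot
```

Short, unique descriptions make this search much more useful than generic ones like "join" or "group".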