In this presentation (slides 36 and 37), the author of Cascalog asserts that, given a data set of names and ages such as [name age], the query to return all records whose age is greater than the average age takes 300 lines of PIG.
Is this a valid assertion? How many lines of PIG is it really?
Or is the problem he's describing bigger than what I've described?
(Disclaimer: I'm a big fan of Nathan's work, of Clojure, and of Cascalog. I'm just trying to get some facts straight.)
You've misinterpreted what he says in the presentation. What he means is that the implementation of "average" in PIG is 300 lines of Java code, versus the 5 lines of Cascalog it takes when implemented with the predicate macro functionality. He wants to emphasize the power of composition.
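To illustrate the composition idea (not the actual Cascalog or PIG code), here is a rough Python sketch in which "average" is not written from scratch but built from two smaller aggregators, a sum and a count. All names here (`make_aggregator`, `total`, `count`, `average`, `older_than_average`) and the sample data are hypothetical and only for illustration.

```python
from functools import reduce


def make_aggregator(init, step):
    """Build an aggregator from an initial value and a step function."""
    return lambda values: reduce(step, values, init)


# Two small, reusable aggregators.
total = make_aggregator(0, lambda acc, x: acc + x)
count = make_aggregator(0, lambda acc, _: acc + 1)


def average(values):
    """Average defined purely by composing the sum and count aggregators."""
    values = list(values)  # materialize so both aggregators can scan it
    return total(values) / count(values)


def older_than_average(people):
    """Return the (name, age) records whose age exceeds the average age."""
    avg = average(age for _, age in people)
    return [(name, age) for name, age in people if age > avg]


# Made-up sample data in the [name age] shape from the question.
people = [("alice", 28), ("bob", 33), ("carol", 41), ("dave", 22)]
result = older_than_average(people)
```

The point is only that once sum and count exist, average is a couple of lines on top of them, which is roughly the argument being made for predicate macros.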
P.S.: Sorry for my bad English, I'm still learning ;-)