List comprehension using take

76 views Asked by At

I am having trouble understanding they way the next two lines of code work.

ghci> take 6 [[(i,j) | i<-[2,4]] | j<-[1,3,5]]
[[(2,1),(4,1)],[(2,3),(4,3)],[(2,5),(4,5)]]

ghci> take 6 [[(i,j)] | i<-[2,4], j<-[1,3,5]]
[[(2,1)],[(2,3)],[(2,5)],[(4,1)],[(4,3)],[(4,5)]]

I understand what take does but i dont understand the difference between the two lines and the order of the iterration and what the paranthesis do for the code. I am new at Haskell sorry if my question is easy.

2

There are 2 answers

1
willeM_ Van Onsem On

The difference is not in the take, the difference is the list comprehension. The first one has two list comprehensions, with j <- [1,3,5] the outer list comprehension:

--      vvvvvvvvvvvvvvvvvv inner list comprehension
take 6 [[(i,j) | i<-[2,4]] | j<-[1,3,5]]

whereas the latter has one list comprehension, since j <- [1,3,5] is the second "generator" in the list comprehsnion, that is the "inner" generator.

This thus means that in the first one, the i is the fastest changing variable, and j only changes after i is fully exhausted, whereas for the second one, it is the opposite way.

If we thus would convert it to some generator syntax in Python, it would look like:

# first example
def inner(j):
    for i in [2,4]:
        yield (i, j)

for j in [1,3,5]:
    yield inner(j)

# second example
for i in [2,4]:
    for j in [1,3,5]:
        yield [(i, j)]

so this means the order how the same elements are generated changes.

0
Ben On

Firstly, take 6 is totally irrelevant here. Neither of your list comprehensions evaluates to a list with more than 6 elements, so you're just getting the whole list in either case.

Your second example is structurally simpler, so I'm going to walk you through it first:

[[(i,j)] | i<-[2,4], j<-[1,3,5]]

Translating the pieces of this into English descriptions goes something like this:

  1. [ <expr> | ... ] means "A list of all possible elements of the form <expr>"
    • the ... part has to define the variables that are used in <expr> and what possibilities they can range over for this to be meaningful, of course
  2. In this case the <expr> is [(i, j)]; that is a list containing one element, where that element is a tuple/pair, and the 2 members of that pair are i and j. The parentheses in (i, j) are part of the tuple syntax, they have nothing to do with the list comprehension
  3. i <- [2, 4] means "where i is an element of the list [2, 4]
  4. j <- [1, 3, 5] means "where j is an element of the list [1, 3, 5]"
  5. The , between the i <- ... and j <- ... parts might simply translated "and", but more precisely it means that every possibility of the generator on the left will be paired with every possibility of the generator on the right.

So putting the whole thing together it's:

A list of all possible elements of the form [(i, j)], where i is an element of the list [2, 4] and where j is an element of the list [1, 3, 5].

One possible element is when i is 2 (the first element of the list it ranges over) and j is 1, so [(2, 1)] is an element of the list described by the whole list comprehension. Another possibility is when i is still 2, and j is 3, establishing that [(2, 3)] is an element of the final list. You should be able to see the pattern; all the elements of the list are going to be singleton lists containing a single pair (which is exactly what parts 1 & 2 of my translation told you).

The only thing my translation doesn't explicitly define is the order; what actually happens is that each possibility of the left most generator is held fixed while all possibilities of generator(s) to the right are explored, before moving on to the next element of the leftmost generator and restarting the generator(s) to the right. So in this case you get all the possible elements where i is 2 (and j is 1, then 3, then 5), followed by all the elements where i is 4 (and j is 1, then 3, then 5).


Your first example only uses the same translation concepts, but there are two expressions that look like [ <expr> | <generators> ], so you're going to be doing this sort of translation twice. The <expr> part of the first list comprehension is another list comprehension. So the elements of this list are going to be lists, like the other example, but they won't be singleton lists unless the inner generator has only one possibility. So that's immediately going to be a difference

[[(i,j) | i<-[2,4]] | j<-[1,3,5]]

Describing this in English would be something like:

  1. [[(i,j) | i<-[2,4]] | ...] means "A list of all possible elements of the form [(i, j) | i <- [2, 4]]
    1. [(i, j) | ... means "A list of all possible elements of the form (i, j)
    2. i <- [2, 4] means "where i is an element of the list [2, 4]
    3. Note that j is not defined in the inner list comprehension, but we are inside the list comprehension that does define a generator for j we have a variable j in scope.
  2. The j <- [1, 3, 5] part means "where j is an element of the list [1, 3, 5]

So we end up with something like:

A list of all possible elements that are (a list of all possible elements of the form (i, j) where i is an element of the list [2, 4]), where j is an element of the list [1, 3, 5].

Pretty clunky as an English sentence; that's why we do things like code and maths in better notation than English!

But still you should be able to reason from that. One possible element of the final list will be when j is 1, and that element will be "a list of all possible elements of the form (i, j) (i.e. (i, 1)) where i comes from [2, 4]. So that inner list of possibilities is going to be [(2, 1), (4, 1)].

Effectively, the first example I discussed (your second example) iterates over j "within" each iteration of i, while the other example iterates over i "within" each iteration of j. But you really should not think about this as syntax for changing the order of iteration, you've simply written two completely different expressions that result in completely different lists. They only look like they're lists of the same elements in a different order if you're just looking at the numbers and mentally filtering out the square brackets. One is a list of 6 lists, each of which has only 1 element, which is a pair. The other is a list of 3 lists, each of which has 2 elements, and each of those is a pair. The fact that the order in which i and j are considered is quite a minor difference compared to the very different structure of the sub-lists.

You should choose between the two expressions based on whether you want to generate sub-lists of possibilities, not to control the order of elements. If you want to swap the nesting order in the simpler single-generator expression, you can just swap the order of generators:

ghci> [ [(i,j)] | j <- [1, 3, 5], i <- [2, 4] ]
[[(2,1)],[(4,1)],[(2,3)],[(4,3)],[(2,5)],[(4,5)]]