How does compiler figure out fixed point of a functor and how cata work at leaf level?

658 views Asked by At

I feel like understanding the abstract concept of fixed point of a functor, however, I am still struggling to figure out the exact implementation of it and its catamorphism in Haskell.

For example, if I define, as according to the book of "Category Theory for Programers" -- page 359, the following algebra

-- (Int, LiftF e Int -> Int)

data ListF e a = NilF | ConsF e a

lenAlg :: ListF e Int -> Int
lenAlg (ConsF e n) -> n + 1
lenAlg NilF = 0

by definition of catamorphism, the following function could be applied to ListF's fixed point, which is a List, to calculate its length.

cata lenAlg :: [e] -> Int
cata lenAlg = lenAlg . fmap (cata lenAlg) . unFix

I have two confusions. First, how does Haskell compiler know that List is THE fixed point of ListF? I know conceptually it is, but how does the compiler know, i.e., what if we define another List' that is everything the same as List, I bet compiler does not automatically infer that List' is the fixed point of ListF too, or does it? (I'd be amazed).

Second, due to the recursive nature of cata lenAlg, it always tries to unFix the outer layer of data constructor to expose the inner layer of functor (is this interpretation of mine even correct by the way?). But, what if we are already at the leaf, how could we invoke this function call?

fmap (cata lenAlg) Nil

As an example, could someone help write an execution trace for the below function call to clarify?

cata lenAlg Cons 1 (Cons 2 Nil)

I am probably missing something that is obvious, however I hope this question still makes sense for other people that share similar confusions.


answer summary


@n.m. answered my first question by pointing out that in order for Haskell compiler to figure out Functor A is a fixed point of Functor B, we need to be explicit. In this case, it is

type List e = Fix (ListF e)

@luqui and @Will Ness pointed out that calling fmap (cata lenAlg) on the leaf, which is NilF in this case, will return NilF back, because of the definition of fmap.

-- f :: a -> b
fmap f (ConsF e a) = ConsF e (f b)
fmap f NilF        = NilF

I'd accept @n.m.'s answer as it directly addressed my first (bigger) question, but I do like all three answers. Thanks a lot for all your help!

3

There are 3 answers

0
n. m. could be an AI On BEST ANSWER

"List is the fixed point of ListF" is a fast-and-loose figure of speech. While fast and loose reasoning is morally correct, you always need to keep in mind the boring correct thing. Which is as follows:

any least fixpoint of ListF e is isomorphic to [e].

Now "the compiler" (that's a fast-and-loose word for "the Haskell language" by the way) is not aware of isomorphisms of this kind. You can write isomorphic types all day

data []    x = []   | (:)   x ([]    x)    -- the imaginary built-in definition of []
data ListA x = NilA | ConsA x (ListA x)
data ListB x = NilB | ConsB x (ListB x)
data ListC x = NilC | ConsC x (ListC x)

and the compiler will never treat them as "the same type". Nor will it know that the fixpoint of ListF e is the same as [e], or indeed what a fixpoint is. If you want to use these isomorphisms, you need to teach the compiler about them by writing some code (e.g. by defining instances of Data.Types.Isomorphic), and then specifying the isomorphism explicitly each time you want to use it!

Having this is mind, let's look at cata you have defined. The first thing that mets the eye is that the attempt at defining the type signature is a syntax error. Let's remove it and just define the function in GHCi (I changed the name of the parameter to f to avoid confusion, and fixed a few typos in the definition of ListF)

Main> data ListF e a = NilF | ConsF e a
Main> let lenAlg :: ListF e Int -> Int
Main|     lenAlg (ConsF e n) = n + 1
Main|     lenAlg NilF = 0
Main|
Main>
Main> :m + Data.Fix
Main Data.Fix> cata f = f . fmap (cata f) . unFix
Main Data.Fix> :t cata
cata :: Functor f => (f b -> b) -> Fix f -> b
Main Data.Fix> :t cata lenAlg
cata lenAlg :: Functor (ListF e) => Fix (ListF e) -> Int

This says that in order to use lenAlg, you need to:

  • define an instance of Functor for ListF e
  • use Fix (ListF e) (which is a fixpoint of ListF) explicitly. Wishing "the fixpoint of ListF" into existence just doesn't work. There's no magic.

So let's do this:

Main Data.Fix> instance Functor (ListF e) where
Main Data.Fix|     fmap f NilF = NilF
Main Data.Fix|     fmap f (ConsF e a) = ConsF e (f a)
Main Data.Fix>
Main Data.Fix> :t cata lenAlg
cata lenAlg :: Fix (ListF e) -> Int

Great, now we can calculate the length of a ListF-based Fix-wrapped list. But let's define a few helper definitions first.

Main Data.Fix> type List e = Fix (ListF e)
Main Data.Fix> nil = Fix (NilF)
Main Data.Fix> let infixr 7 ~~                   -- using an infix operator for cons
Main Data.Fix|     h ~~ t = Fix (ConsF h t)
Main Data.Fix|
Main Data.Fix>
Main Data.Fix> myList = (1::Int) ~~ 2 ~~ 3 ~~ 4 ~~ nil
Main Data.Fix> :t myList
myList :: Fix (ListF Int)

Here's our "list" (a member of a type that is isomorphic to [Int] to be precise). Let's cata lenAlg it:

Main Data.Fix> cata lenAlg myList
4

Success!

Bottom line: the compiler knows nothing,you need to explain it everything.

7
Will Ness On

No, unFix exposes the structure, and then fmap f applies a function f on it. If it's a Leaf, fmap f will do its thing that it is defined to do for leaves - i.e., nothing. It's fmap that "knows" i.e. is defined to handle each case, as usual in Haskell's case-based definitions.

Fix (ListF e) = ListF e (Fix (ListF e)) 
              = NilF | ConsF e (Fix (ListF e))

so renaming Fix (ListF e) to Listof e we get

Listof     e  = NilF | ConsF e (Listof e) 

Listof e is a recursive type. ListF e r is a non-recursive type. Fix makes a recursive type out of it. ListF e being a Functor means fmap works on the r "meat", ListF e being the "hard outer shell" of this "fruit".

data ListF e a = NilF | ConsF e a

newtype Fix f = X { unFix    :: (f       (Fix f)) }

-- unFix ::    Fix f          -> f       (Fix f        )
-- unFix (_ :: Fix (ListF e)) :: ListF e (Fix (ListF e))

lenAlg :: ListF e Int -> Int
lenAlg (ConsF e n) = n + 1
lenAlg NilF        = 0

instance Functor (ListF e) where
    -- fmap :: (a -> b) -> (ListF e a) -> (ListF e b)
    fmap f NilF        = NilF
    fmap f (ConsF e r) = ConsF e (f r)

cata :: (ListF e Int -> Int) -> Fix (ListF e) -> Int
cata lenAlg x = (lenAlg . fmap (cata lenAlg) . unFix) x
              =  lenAlg ( fmap (cata lenAlg) ( unFix  x ))
--------
        x       ::       Fix (ListF e)
        unFix x ::            ListF e (Fix (ListF e))
        fmap (cata lenAlg) :: ListF e (Fix (ListF e)) -> ListF e (Int)
              cata lenAlg  ::          Fix (ListF e)  ->          Int

        fmap (cata lenAlg)   (unFix  x              ) :: ListF e  Int 
 lenAlg (_                                            :: ListF e  Int ) :: Int

See? All the wires go to their proper places. fmap transforms the inner part r recursively, and then lenAlg algebra applies one last transformation, one last step in collapsing the structure where all its inner parts were already collapsed into a recursive result. Thus producing the final result.


As a concrete example, for a list of two numbers, 1 and 2, we have

-- newtype Fix f  = X { unFix :: (f      (Fix f         )) }
--             _\_______       ____\____      _\________
onetwo :: Fix (ListF Int) -- ~ ListF Int (Fix (ListF Int))
onetwo = X (ConsF 1
                  (X (ConsF 2
                            (X NilF))))
cata lenAlg :: Fix (ListF e) -> Int
cata lenAlg    onetwo
  = {- definition of cata   -}
    lenAlg . fmap (cata lenAlg) . unFix  $ onetwo
  = {- definition of onetwo -}
    lenAlg . fmap (cata lenAlg) . unFix  $ 
                           X (ConsF 1 (X (ConsF 2 (X NilF))))
  = {- definition of unFix  -}
    lenAlg . fmap (cata lenAlg) $ 
                              ConsF 1 (X (ConsF 2 (X NilF)))
  = {- definition of fmap   -}
    lenAlg $ ConsF 1 (cata lenAlg     (X (ConsF 2 (X NilF))))
  = {- definition of lenAlg -}
    (+ 1)  $          cata lenAlg     (X (ConsF 2 (X NilF)))
  = {- definition of cata   -}
    (+ 1)  $ lenAlg . fmap (cata lenAlg) . unFix $ 
                                       X (ConsF 2 (X NilF))
  = {- definition of unFix  -}
    (+ 1) $ lenAlg . fmap (cata lenAlg) $ ConsF 2 (X NilF)
  = {- definition of fmap   -}
    (+ 1) $ lenAlg $ ConsF 2 $ cata lenAlg        (X NilF)
  = {- definition of lenAlg -}
    (+ 1) $ (+ 1)  $           cata lenAlg        (X NilF)
  = {- definition of cata   -}
    (+ 1) $ (+ 1)  $ lenAlg . fmap (cata lenAlg) . unFix $ 
                                                  (X NilF)
  = {- definition of unFix  -}
    (+ 1) $ (+ 1)  $ lenAlg . fmap (cata lenAlg) $   NilF
  = {- definition of fmap   -}
    (+ 1) $ (+ 1)  $ lenAlg $                        NilF
  = {- definition of lenAlg -}
    (+ 1) $ (+ 1)  $ 0
  = (+ 1) $ 1
  = 2

Also,

squaringAlg :: ListF Int [Int] -> [Int]
squaringAlg (ConsF e r) = e*e : r
squaringAlg NilF        = []

filteringAlg :: (e -> Bool) -> ListF e [e] -> [e]
filteringAlg p (ConsF e r) | p e       = e : r
                           | otherwise = r
filteringAlg _ NilF                    = []

etc.

0
luqui On

The only way the compiler can know about the relationship between ListF e and [e] is if you tell it. You haven't provided enough context to answer exactly how, but I can infer that unFix has type

unFix :: [e] -> ListF e [e]

that is, it unrolls the top layer. unFix may be more general, for example in recursion-schemes a type family is used to associate data types with their underlying functors. But this is where the two types are connected.


As for your second question, you need to be clearer about when you have a list and when you have a ListF. They are completely different.

fmap (cata lenAlg) Nil

Here the functor you are mapping over is ListF e for whatever e you like. That is, it's this fmap:

fmap :: (a -> b) -> ListF e a -> ListF e b

If you implement instance Functor (ListF e) yourself (always a good exercise) and get it to compile, you will find that Nil must map to Nil, so the cata lenAlg didn't matter at all.


Let's look at the type of Cons 1 (Cons 2 Nil):

Nil                 :: ListF e a
Cons 2 Nil          :: ListF Int (ListF e a)
Cons 1 (Cons 2 Nil) :: ListF Int (ListF Int (ListF e a))

Something is awry here. The trouble is that we are forgetting to do the opposite of unFix to roll the ListF back up into a regular list. I will call this function

roll :: ListF e [e] -> [e]

Now we have

Nil                                      :: ListF e a
roll Nil                                 :: [e]
Cons 2 (roll Nil)                        :: ListF Int [Int]
roll (Cons 2 (roll Nil))                 :: [Int]
Cons 1 (roll (Cons 2 (roll Nil)))        :: ListF Int [Int]
roll (Cons 1 (roll (Cons 2 (roll Nil)))) :: [Int]

The types are staying nice and small now, it's a good sign. For the execution trace, let's just note that unFix . roll = id, however they work. It's important here to notice that

fmap f (Cons a b) = Cons a (f b)
fmap f Nil        = Nil

That is, fmap on ListF just applies the function on the "recursive part" of the type.

I'm going to switch the Cons case to lenAlg (ConsF e n) = 1 + n to make the trace a tiny bit more readable. Still kind of a mess, good luck.

cata lenAlg (roll (Cons 1 (roll (Cons 2 (roll Nil)))))
(lenAlg . fmap (cata lenAlg) . unFix) (roll (Cons 1 (roll (Cons 2 (roll Nil)))))
lenAlg (fmap (cata lenAlg) (unFix (roll (Cons 1 (roll (Cons 2 (roll Nil)))))))
lenAlg (fmap (cata lenAlg) (Cons 1 (roll (Cons 2 (roll Nil)))))
lenAlg (Cons 1 (cata lenAlg (roll (Cons 2 (roll Nil)))))
1 + cata lenAlg (roll (Cons 2 (roll Nil)))
1 + (lenAlg . fmap (cata lenAlg) . unFix) (roll (Cons 2 (roll Nil)))
1 + lenAlg (fmap (cata lenAlg) (unFix (roll (Cons 2 (roll Nil)))))
1 + lenAlg (fmap (cata lenAlg) (Cons 2 (roll Nil)))
1 + lenAlg (Cons 2 (cata lenAlg (roll Nil)))
1 + 1 + cata lenAlg (roll Nil)
1 + 1 + (lenAlg . fmap (cata lenAlg) . unFix) (roll Nil)
1 + 1 + lenAlg (fmap (cata lenAlg) (unFix (roll Nil)))
1 + 1 + lenAlg (fmap (cata lenAlg Nil))
1 + 1 + lenAlg Nil
1 + 1 + 0
2

See also my other answer about catamorphisms.