Is there any reason why the Profunctor instance of (->) defines both dimap and lmap/rmap?

132 views Asked by At

In the source code on Hackage I read this:

instance Profunctor (->) where
  dimap ab cd bc = cd . bc . ab
  {-# INLINE dimap #-}
  lmap = flip (.)
  {-# INLINE lmap #-}
  rmap = (.)
  {-# INLINE rmap #-}

but the default implementations of dimap/lmap/rmap for the Profunctor would require one to just define either both lmap and rmap, or dimap; defining all of them is unnecessary.

Is there a reason why they are all defined, instead?

1

There are 1 answers

3
K. A. Buhr On

As @FyodorSoikin comments, the intention was probably that the lmap and rmap hand-coded definitions would be more efficient than the default definitions based on dimap.

However, using the test program below, I tried defining the instance with all three of dimap/rmap/lmap, dimap only, and rmap/lmap only, and the core for the test functions l, r, and b was precisely the same in all three cases when compiled with -O2:

b = \ x -> case x of { I# x1 -> I# (+# 15# (*# 6# x1)) }
r = \ x -> case x of { I# x1 -> I# (+# 15# (*# 3# x1)) }
l = \ x -> case x of { I# x1 -> I# (+# (*# x1 2#) 5#) }

While it's possible that for more complicated examples the compiler will fail to optimize the default definitions of lmap f = dimap f id and rmap = dimap id, it strikes me as exceedingly unlikely, and so the hand-coded lmap and rmap don't make any difference.

I think the explanation is that even extremely skilled Haskell programmers like Edward Kmett still underestimate the compiler and perform unnecessary hand-optimizations of their code.

Update: In a comment, @4castle asked what happens without optimization. With the caveat that "because it improves -O0 code" doesn't strike me as a sound argument for anything, I took a look.

In unoptimized code, the explicit rmap definition produces better Core by avoiding an extra composition with id:

-- explicit `rmap`
r = . (let { ds = I# 3# } in \ ds1 -> * $fNumInt ds1 ds)
      (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds)

-- default `rmap`
r = . (let { ds = I# 3# } in \ ds1 -> * $fNumInt ds1 ds)
  (. (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds) id)

while the explicit lmap definition ends up producing Core that's about the same, or arguably worse.

-- explicit `lmap`
$clmap = \ @ a @ b1 @ c -> flip .
l = $clmap
      (let { ds = I# 2# } in \ ds1 -> * $fNumInt ds1 ds)
      (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds)

-- default `lmap`
l = . id
      (. (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds)
         (let { ds = I# 2# } in \ ds1 -> * $fNumInt ds1 ds))

As a consequence of the above definitions, the explicit dimap is better than the default because of the extra flip.

-- explicit `dimap`
b = . (let { ds = I# 3# } in \ ds1 -> * $fNumInt ds1 ds)
      (. (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds)
         (let { ds = I# 2# } in \ ds1 -> * $fNumInt ds1 ds))

-- default `dimap`
$clmap = \ @ a @ b1 @ c -> flip .
b = . ($clmap (let { ds = I# 2# } in \ ds1 -> * $fNumInt ds1 ds))
      (. (let { ds = I# 3# } in \ ds1 -> * $fNumInt ds1 ds))
      (let { ds = I# 5# } in \ ds1 -> + $fNumInt ds1 ds)

In another comment, @oisdk scolded me for my unrealistic test. I will point out that failure to inline recursion isn't really an issue here, since none of dimap, lmap, or rmap is recursive. In particular, simply "using" one of these in a recursive manner, like foo = foldr rmap id doesn't interfere with inlining or optimization, and the generated code for foo is the same with the explicit and default rmap.

Also, splitting the class/instance from the l/r definitions into separate modules makes no difference to my test program, nor does splitting it up into three modules, the class, the instance, and l/r, so it doesn't seem like inlining across module boundaries is much of a problem here.

For unspecialized polymorphic usage, I guess it'll come down to the Profunctor (->) dictionary that's generated. I see the following which seems to show that an explicit dimap with default lmap and rmap produces better code than the alternatives. The problem seems to be that flip (.) isn't being properly optimized here, so the explicit lmap definition is counterproductive.

-- explicit `dimap`, `rmap`, and `lmap`
$fProfunctor->
  = C:Profunctor $fProfunctor->_$cdimap $fProfunctor->_$clmap .
$fProfunctor->_$cdimap
  = \ @ a @ b @ c @ d ab cd bc x -> cd (bc (ab x))
$fProfunctor->_$clmap = \ @ a @ b @ c x y -> . y x

-- explicit `lmap`, `rmap`, default `dimap`
$fProfunctor->
  = C:Profunctor $fProfunctor->_$cdimap $fProfunctor->_$clmap .
$fProfunctor->_$cdimap
  = \ @ a @ b @ c @ d eta eta1 x eta2 -> eta1 (x (eta eta2))
$fProfunctor->_$clmap = \ @ a @ b @ c x y -> . y x

-- explicit `dimap`, default `lmap`, `rmap`
$fProfunctor->
  = C:Profunctor
      $fProfunctor->_$cdimap $fProfunctor->_$clmap $fProfunctor->1
$fProfunctor->_$cdimap
  = \ @ a @ b @ c @ d ab cd bc x -> cd (bc (ab x))
$fProfunctor->_$clmap = \ @ a @ b @ c eta bc x -> bc (eta x)
$fProfunctor->1 = \ @ b @ c @ a cd bc x -> cd (bc x)

If someone has an example where these explicit definitions generate better -O2 code, it would make a great alternative answer.

Here's my test program. I compiled with ghc -O2 Profunctor.hs -fforce-recomp -ddump-simpl -dsuppress-all -dsuppress-uniques.

module Profunctor where

class Profunctor p where
  dimap :: (a -> b) -> (c -> d) -> p b c -> p a d
  dimap f g = lmap f . rmap g
  {-# INLINE dimap #-}

  lmap :: (a -> b) -> p b c -> p a c
  lmap f = dimap f id
  {-# INLINE lmap #-}

  rmap :: (b -> c) -> p a b -> p a c
  rmap = dimap id
  {-# INLINE rmap #-}

instance Profunctor (->) where
  -- same core if dimap is commented out or if lmap/rmap are commented out
  dimap ab cd bc = cd . bc . ab
  lmap = flip (.)
  rmap = (.)

l :: Int -> Int
l = lmap (*2) (+5)

r :: Int -> Int
r = rmap (*3) (+5)

b :: Int -> Int
b = dimap (*2) (*3) (+5)