Why does empty function composition work in Haskell?

554 views Asked by At

I spent a long time without programming Haskell, and decided to get back into it by taking on a relatively advanced project. I'm trying to program a Neural Network from scratch by following this guide. I've scratched my head around some of his most esoteric approaches to simple problems like creating a network of weights and biases, but when it comes to this:

feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed input brain = foldl' (((relu <$>) . ) . zLayer) input brain

I don't understand what he does. More specifically, I don't understand why use two .'s in the function composition here. He uses (relu <$>) . ). This . followed by a parentheses doesn't make sense to me. I understand it represents function composition, and in this case, the function zLayer takes in a layer of neurons, which is of type ([Float], [[Float]]) and the output of the previous layer, which is of type [Float], and produces a new output, also of type [Float]. I understand he's applying the relu <$> function to the result of zLayer, which makes sense. That is, you want to fold the brain (which is nothing but a list of layers) by applying zLayer on a layer of the brain, then applying relu <$> on the result of that, and finally pass that as the input to the next layer.

The seemingly empty composition is what bugs me. What I described above should, to me, be implemented like this:

feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' (((sigmoid <$>) . computeLayer) inp brain

(I'm using the sigmoid function instead of the rectifier (ReLU), and computeLayer is just my implementation of zLayer.) Right? What I'm doing there is (supposedly) providing, as a function to foldl', this:

(sigmoid <$> (computeLayer))

When I add just .) between my . and computeLayer (and an open parentheses before of course), it works. Without them, this is the error:

net.hs:42:42: error:
    • Couldn't match type ‘[Float]’ with ‘Float’
      Expected type: [Float] -> ([Float], [[Float]]) -> Float
        Actual type: [Float] -> ([Float], [[Float]]) -> [Float]
    • In the second argument of ‘(.)’, namely ‘computeLayer’
      In the first argument of ‘foldl'’, namely
        ‘((sigmoid <$>) . computeLayer)’
      In the expression: foldl' ((sigmoid <$>) . computeLayer) inp brain
   |
42 | feed inp brain = foldl' ((sigmoid <$>) . computeLayer) inp brain
   |                                          ^^^^^^^^^^^^

Why does this seemingly empty composition of functions work?

This is the whole code so far, if it helps:

import System.Random
import Control.Monad
import Data.Functor

foldl' f z []     = z
foldl' f z (x:xs) = let z' = z `f` x
                    in seq z' $ foldl' f z' xs

sigmoid :: Float -> Float
sigmoid x = 1 / (1 + (exp 1) ** (-x))

-- Given a list, gives out a list of lists of length *each element of the list*
makeBiases :: [Int] -> Float -> [[Float]]
makeBiases x b = flip replicate b <$> x

-- Given a list, gives out, for each element X in the list, a list of length x + 1, of
-- x elements in any normal distribution
makeWeights :: [Int] -> Float -> [[[Float]]]
makeWeights xl@(_:xs) el = zipWith (\m n -> replicate n (replicate m el)) xl xs

-- Make initial biases and weights to give a list of tuples that corresponds to biases
-- and weights associated with each node in each layer
makeBrain :: [Int] -> Float -> Float -> [([Float], [[Float]])]
makeBrain (x:xs) b el = zip (makeBiases xs b)  (makeWeights (x:xs) el)

-- Given output of a layer, apply weights and sum for all nodes in a layer. For each list
-- of weights (each node has multiple inputs), there will be one output
sumWeightsL l wvs = sum . zipWith (*) l <$> wvs

-- Given output of a layer, apply weights to get tentative output of each node. Then
-- sum biases of each node to its output
computeLayer :: [Float] -> ([Float], [[Float]]) -> [Float]
computeLayer l (bs, wvs) = zipWith (+) bs (sumWeightsL l wvs)

feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' ((sigmoid <$>) . computeLayer) inp brain

main = do
  putStrLn "3 inputs, a hidden layer of 4 neurons, and 2 output neurons:"
  print $ feed [0.1, 0.2, 0.3] (makeBrain [3,4,2] 0 0.22)
1

There are 1 answers

1
K. A. Buhr On BEST ANSWER

As @Bergi notes, the expression ((relu <$>) . ) is not "empty function composition" but rather something called a "section". (In fact, in this case, it's a section nested inside another section.)

You've undoubtedly seen this before, even if you've forgotten what it was called and/or didn't realize it applied to the function composition operator (.), but just to remind you...

In Haskell, for any binary operator (like (+)), you can write a left or right "section":

(1+)  -- short for   \x -> 1+x
(+1)  -- short for   \x -> x+1

so that something like map (2*) mylist can be used to double every element of a list instead of having to write map (\x -> 2*x) mylist.

It works the same way for function composition (.) and the fmap operator (<$>), so:

((sigmoid <$>) . )

is short for:

\f -> (sigmoid <$>) . f

which is short for:

\f -> (\xs -> sigmoid <$> xs) . f

which you could eta expand to:

\f z -> (\xs -> sigmoid <$> xs) (f z)

and then simplify to:

\f z -> sigmoid <$> f z    :: (a -> [Float]) -> a -> [Float]

Note that, in contrast, the expression (sigmoid <$>) that you wanted to use in its place is equivalent to:

\xs -> sigmoid <$> xs    :: [Float] -> [Float]

which obviously isn't the same.

Anyway, all this means that the folded function:

(((sigmoid <$>) .) . computeLayer)

can be eta expanded and simplified as follows:

\acc x -> (((sigmoid <$>) .) . computeLayer) acc x
\acc x -> ((sigmoid <$>) .) (computeLayer acc) x
\acc x -> (\f z -> sigmode <$> f z) (computeLayer acc) x
\acc x -> sigmoid <$> (computeLayer acc) x
\acc x -> sigmoid <$> computeLayer acc x

and you can quickly verify that the modified definition:

feed inp brain = foldl' (\acc x -> sigmoid <$> computeLayer acc x) inp brain

typechecks and gives the same result in your program.

At the end of the day, your intuition was mostly okay. You wanted the folded function to be a composition of the sigmoid and computeLayer functions, but the fact that computeLayer takes two arguments, instead of one, means that simple composition doesn't work.

For your amusement, the following works too:

feed inp brain = foldl' (((.).(.)) (sigmoid <$>) computeLayer) inp brain