I spent a long time without programming in Haskell, and decided to get back into it by taking on a relatively advanced project: programming a neural network from scratch by following this guide. I've scratched my head over some of his more esoteric approaches to simple problems, like creating a network of weights and biases, but when it comes to this:
feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed input brain = foldl' (((relu <$>) . ) . zLayer) input brain
I don't understand what he does here. More specifically, I don't understand why he uses two .'s in the function composition: he writes ((relu <$>) . ). This . followed immediately by a closing parenthesis doesn't make sense to me. I understand that . represents function composition, and that in this case the function zLayer takes a layer of neurons, which is of type ([Float], [[Float]]), and the output of the previous layer, which is of type [Float], and produces a new output, also of type [Float]. I also understand that he's applying relu <$> to the result of zLayer, which makes sense. That is, you want to fold over the brain (which is nothing but a list of layers) by applying zLayer to a layer of the brain, then applying relu <$> to the result of that, and finally passing that as the input to the next layer.

The seemingly empty composition is what bugs me. What I described above should, to me, be implemented like this:
feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' ((sigmoid <$>) . computeLayer) inp brain
(I'm using the sigmoid function instead of the rectifier (ReLU), and computeLayer is just my implementation of zLayer.) Right? What I'm doing there is (supposedly) providing, as the function to foldl', this:

(sigmoid <$>) . computeLayer

When I add just the extra . and parentheses, turning it into (((sigmoid <$>) . ) . computeLayer), it works. Without them, this is the error:
net.hs:42:42: error:
    • Couldn't match type ‘[Float]’ with ‘Float’
      Expected type: [Float] -> ([Float], [[Float]]) -> Float
        Actual type: [Float] -> ([Float], [[Float]]) -> [Float]
    • In the second argument of ‘(.)’, namely ‘computeLayer’
      In the first argument of ‘foldl'’, namely
        ‘((sigmoid <$>) . computeLayer)’
      In the expression: foldl' ((sigmoid <$>) . computeLayer) inp brain
   |
42 | feed inp brain = foldl' ((sigmoid <$>) . computeLayer) inp brain
   |                          ^^^^^^^^^^^^
Why does this seemingly empty composition of functions work?
This is the whole code so far, if it helps:
import System.Random
import Control.Monad
import Data.Functor
foldl' f z [] = z
foldl' f z (x:xs) = let z' = z `f` x
in seq z' $ foldl' f z' xs
sigmoid :: Float -> Float
sigmoid x = 1 / (1 + (exp 1) ** (-x))
-- Given a list, gives out a list of lists of length *each element of the list*
makeBiases :: [Int] -> Float -> [[Float]]
makeBiases x b = flip replicate b <$> x
-- Given a list, gives out, for each element X in the list, a list of length x + 1, of
-- x elements in any normal distribution
makeWeights :: [Int] -> Float -> [[[Float]]]
makeWeights xl@(_:xs) el = zipWith (\m n -> replicate n (replicate m el)) xl xs
-- Make initial biases and weights to give a list of tuples that corresponds to biases
-- and weights associated with each node in each layer
makeBrain :: [Int] -> Float -> Float -> [([Float], [[Float]])]
makeBrain (x:xs) b el = zip (makeBiases xs b) (makeWeights (x:xs) el)
-- Given output of a layer, apply weights and sum for all nodes in a layer. For each list
-- of weights (each node has multiple inputs), there will be one output
sumWeightsL l wvs = sum . zipWith (*) l <$> wvs
-- Given output of a layer, apply weights to get tentative output of each node. Then
-- sum biases of each node to its output
computeLayer :: [Float] -> ([Float], [[Float]]) -> [Float]
computeLayer l (bs, wvs) = zipWith (+) bs (sumWeightsL l wvs)
feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' ((sigmoid <$>) . computeLayer) inp brain
main = do
putStrLn "3 inputs, a hidden layer of 4 neurons, and 2 output neurons:"
print $ feed [0.1, 0.2, 0.3] (makeBrain [3,4,2] 0 0.22)
As @Bergi notes, the expression ((relu <$>) . ) is not an "empty function composition" but rather something called a "section". (In fact, in this case, it's a section nested inside another section.) You've undoubtedly seen sections before, even if you've forgotten what they're called and/or didn't realize they applied to the function composition operator (.), but just to remind you: in Haskell, for any binary operator (like (+)), you can write a left or right "section":

(2+)  ==  \x -> 2 + x    -- left section
(+2)  ==  \x -> x + 2    -- right section

so that something like map (2*) mylist can be used to double every element of a list, instead of having to write map (\x -> 2*x) mylist.
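For example, here is a minimal, self-contained illustration of both kinds of section (my own toy definitions, not from the guide):

```haskell
-- Left and right sections are shorthand for one-argument lambdas.
double :: [Int] -> [Int]
double = map (2*)    -- left section: (2*) == \x -> 2 * x

halve :: [Double] -> [Double]
halve = map (/2)     -- right section: (/2) == \x -> x / 2

main :: IO ()
main = do
  print (double [1, 2, 3])  -- [2,4,6]
  print (halve [2, 4, 6])   -- [1.0,2.0,3.0]
```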
It works the same way for function composition (.) and the fmap operator (<$>), so:

((relu <$>) . )

is short for:

\f -> (relu <$>) . f

which is short for:

\f -> (\xs -> relu <$> xs) . f

which you could eta expand to:

\f -> \x -> ((\xs -> relu <$> xs) . f) x

and then simplify to:

\f x -> relu <$> f x
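You can check this equivalence directly. In the following sketch, relu and negateAll are my own stand-in definitions, not the guide's:

```haskell
-- relu as commonly defined; the guide's spelling may differ.
relu :: Float -> Float
relu = max 0

negateAll :: [Float] -> [Float]
negateAll = map negate

main :: IO ()
main = do
  -- The nested section, applied to a function and then an argument...
  print (((relu <$>) . ) negateAll [1, -2, 3])  -- [0.0,2.0,0.0]
  -- ...agrees with the simplified lambda.
  print (((relu <$>) . ) negateAll [1, -2, 3]
           == (\f x -> relu <$> f x) negateAll [1, -2, 3])  -- True
```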
Note that, in contrast, the expression (sigmoid <$>) that you wanted to use in its place is equivalent to:

\xs -> sigmoid <$> xs

which obviously isn't the same.
Anyway, all this means that the folded function:

(((sigmoid <$>) . ) . computeLayer)

can be eta expanded and simplified as follows:

(((sigmoid <$>) . ) . computeLayer)
== \inp -> ((sigmoid <$>) . ) (computeLayer inp)
== \inp -> (sigmoid <$>) . computeLayer inp
== \inp layer -> (sigmoid <$>) (computeLayer inp layer)
== \inp layer -> sigmoid <$> computeLayer inp layer

and you can quickly verify that the modified definition:

feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' (\out layer -> sigmoid <$> computeLayer out layer) inp brain

typechecks and gives the same result in your program.
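To see that the two spellings really agree, here is a self-contained check. It reuses your sigmoid, sumWeightsL, and computeLayer definitions, with a tiny one-layer brain of made-up numbers:

```haskell
import Data.List (foldl')

sigmoid :: Float -> Float
sigmoid x = 1 / (1 + exp (-x))  -- same function as the (exp 1) ** (-x) form

sumWeightsL :: [Float] -> [[Float]] -> [Float]
sumWeightsL l wvs = sum . zipWith (*) l <$> wvs

computeLayer :: [Float] -> ([Float], [[Float]]) -> [Float]
computeLayer l (bs, wvs) = zipWith (+) bs (sumWeightsL l wvs)

-- The section-based version from the guide...
feedSection :: [Float] -> [([Float], [[Float]])] -> [Float]
feedSection inp brain = foldl' (((sigmoid <$>) . ) . computeLayer) inp brain

-- ...and the eta-expanded lambda version.
feedLambda :: [Float] -> [([Float], [[Float]])] -> [Float]
feedLambda inp brain = foldl' (\out layer -> sigmoid <$> computeLayer out layer) inp brain

-- A tiny one-layer brain: two inputs, two nodes, made-up weights.
tinyBrain :: [([Float], [[Float]])]
tinyBrain = [([0, 0], [[0.5, 0.5], [0.25, 0.75]])]

main :: IO ()
main = print (feedSection [1, 1] tinyBrain == feedLambda [1, 1] tinyBrain)  -- True
```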
At the end of the day, your intuition was mostly okay. You wanted the folded function to be a composition of the sigmoid and computeLayer functions, but the fact that computeLayer takes two arguments, instead of one, means that simple composition doesn't work.
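For your amusement, a point-free variant also works. The following is a sketch of my own: it uses fmap three times, twice over the function functor (once per argument of computeLayer) and once over the result list:

```haskell
import Data.List (foldl')

sigmoid :: Float -> Float
sigmoid x = 1 / (1 + exp (-x))

sumWeightsL :: [Float] -> [[Float]] -> [Float]
sumWeightsL l wvs = sum . zipWith (*) l <$> wvs

computeLayer :: [Float] -> ([Float], [[Float]]) -> [Float]
computeLayer l (bs, wvs) = zipWith (+) bs (sumWeightsL l wvs)

-- (fmap . fmap . fmap) sigmoid computeLayer pushes sigmoid under the
-- two function arrows and then under the list, i.e. it equals
-- \inp layer -> sigmoid <$> computeLayer inp layer.
feed :: [Float] -> [([Float], [[Float]])] -> [Float]
feed inp brain = foldl' ((fmap . fmap . fmap) sigmoid computeLayer) inp brain

main :: IO ()
main = print (feed [1, 1] [([0], [[0.5, 0.5]])])
```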