I'm new to the Julia programming language, and still learning it by writing code that I've already written in Python (or, at least, tried out in Python).

There is an article which explains how to make a very simple neural network: https://medium.com/technology-invention-and-more/how-to-build-a-simple-neural-network-in-9-lines-of-python-code-cc8f23647ca1.

I tried the code from this article in Python, and it works fine. However, I haven't used linear algebra functions (like dot) in Python before. Now I'm trying to translate this code to Julia, but there are some things I can't understand. Here is my Julia code:

using LinearAlgebra

synaptic_weights = [-0.16595599, 0.44064899, -0.99977125]::Vector{Float64}

sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_derivative(x) = x * (1 -x)

function train(training_set_inputs, training_set_outputs, number_of_training_iterations)
    global synaptic_weights
    for (iteration) in 1:number_of_training_iterations
        output = think(training_set_inputs)

        error = training_set_outputs .- output

        adjustment = dot(transpose(training_set_inputs), error * sigmoid_derivative(output))

        synaptic_weights = synaptic_weights .+ adjustment
    end
end

think(inputs) = sigmoid(dot(inputs, synaptic_weights))

println("Random starting synaptic weights:")
println(synaptic_weights)

training_set_inputs = [0 0 1 ; 1 1 1 ; 1 0 1 ; 0 1 1]::Matrix{Int64}
training_set_outputs = [0, 1, 1, 0]::Vector{Int64}
train(training_set_inputs, training_set_outputs, 10000)

println("New synaptic weights after training:")
println(synaptic_weights)

println("Considering new situation [1, 0, 0] -> ?:")
println(think([1 0 0]))

I've already tried to initialize vectors (like synaptic_weights) as:

synaptic_weights = [-0.16595599 ; 0.44064899 ; -0.99977125]

However, the code is not working. More precisely, there are three things that are not clear to me:

  1. Do I initialize vectors and matrices in the right way (is it equivalent to what the original author does in Python)?
  2. In Python, the original author uses + and - operators where one operand is a vector and the other is a scalar. I'm not sure whether this means element-wise addition or subtraction in Python. For example, is (vector+scalar) in Python equal to (vector.+scalar) in Julia?
  3. When I try to run the Julia code above, I get the following error:

    ERROR: LoadError: DimensionMismatch("first array has length 12 which does not match the length of the second, 3.")
     [1] dot(::Array{Int64,2}, ::Array{Float64,1}) at C:\Users\julia\AppData\Local\Julia-1.0.3\share\julia\stdlib\v1.0\LinearAlgebra\src\generic.jl:702
     [2] think(::Array{Int64,2}) at C:\Users\Viktória\Documents\julia.jl:21
     [3] train(::Array{Int64,2}, ::Array{Int64,1}, ::Int64) at C:\Users\Viktória\Documents\julia.jl:11
     [4] top-level scope at none:0
    in expression starting at C:\Users\Viktória\Documents\julia.jl:28

    This error occurs when the function think(inputs) tries to compute the dot product of inputs and synaptic_weights. In this case, inputs is a 4x3 matrix and synaptic_weights is a 3x1 matrix (vector). I know that they can be multiplied, and the result will be a 4x1 matrix (vector). Doesn't this mean that their dot product can be computed?

    Anyway, that dot product can be computed in Python using the numpy package, so I guess there is a certain way that it can also be computed in Julia.

For the dot product, I also tried to make a function that takes a and b as arguments, and tries to compute their dot product: first, computes the product of a and b, then returns the sum of the result. I'm not sure whether it's a good solution, but the Julia code didn't produce the expected result when I used that function, so I removed it.

Can you help me with this code, please?

1 Answer

Bogumił Kamiński On Best Solutions

Here is the code adjusted to Julia:

sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_derivative(x) = x * (1 - x)
think(synaptic_weights, inputs) = sigmoid.(inputs * synaptic_weights)

function train!(synaptic_weights, training_set_inputs, training_set_outputs,
                number_of_training_iterations)
    for iteration in 1:number_of_training_iterations
        output = think(synaptic_weights, training_set_inputs)
        error = training_set_outputs .- output
        adjustment = transpose(training_set_inputs) * (error .* sigmoid_derivative.(output))
        # update the weights in place so the caller's array is modified
        synaptic_weights .+= adjustment
    end
end

synaptic_weights = [-0.16595599, 0.44064899, -0.99977125]
println("Random starting synaptic weights:")
println(synaptic_weights)

training_set_inputs = Float64[0 0 1 ; 1 1 1 ; 1 0 1 ; 0 1 1]
training_set_outputs = Float64[0, 1, 1, 0]
train!(synaptic_weights, training_set_inputs, training_set_outputs, 10000)

println("New synaptic weights after training:")
println(synaptic_weights)

println("Considering new situation [1, 0, 0] -> ?:")
println(think(synaptic_weights, Float64[1 0 0]))

There are multiple changes so if some of them are not clear to you please ask and I will expand on them.

The most important things I have changed:

  • do not use global variables as they will significantly slow down the performance
  • make all arrays have Float64 element type
  • in several places you need to do broadcasting with . (e.g. sigmoid and sigmoid_derivative functions are defined in such a way that they expect to get a number as an argument, therefore when we call them . is added after their name to trigger broadcasting)
  • use standard matrix multiplication * instead of dot
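
The points about array types, broadcasting, and matrix multiplication can be checked in isolation. The following is only a minimal sketch; the names A, w, p, v, and s and the numbers in them are made up for illustration and are not taken from the code above. It shows that [a, b, c] and [a; b; c] both build a Vector while [a b c] builds a 1x3 Matrix, that * performs the 4x3 by 3 product that NumPy's dot computes for these shapes, that Julia's dot expects two arrays of equal length and returns a scalar, and that a dotted operator such as .+ works element-wise (which is what NumPy does for vector + scalar).

```julia
using LinearAlgebra

# Commas and semicolons both build a column Vector; spaces build a 1x3 Matrix.
println([1.0, 2.0, 3.0] == [1.0; 2.0; 3.0])   # true
println(size([1.0 2.0 3.0]))                  # (1, 3)

A = Float64[0 0 1; 1 1 1; 1 0 1; 0 1 1]   # 4x3 matrix (illustrative inputs)
w = [0.5, -1.0, 2.0]                       # 3-element vector (illustrative weights)

# Matrix-vector product: one value per row of A.
p = A * w
println(p)            # [2.0, 1.5, 2.5, 1.0]

# dot is the inner product of two equally sized arrays, so it returns a scalar.
println(dot(w, w))    # 5.25

# dot(A, w) throws a DimensionMismatch: A has 12 elements and w has 3.

# Broadcasting: the leading dot applies the operation to every element.
v = w .+ 1.0                  # element-wise addition of a scalar
s = 1 ./ (1 .+ exp.(-p))      # the sigmoid applied to each element of p
println(v)                    # [1.5, 0.0, 3.0]
println(s)
```

Writing sigmoid.(p) with the scalar definition of sigmoid from the answer gives the same result as the explicit dotted expression for s.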

The code runs around 30x faster than the original implementation in Python. I have not squeezed out maximum performance from this code (as written, it does a lot of allocations that could be avoided), because that would require rewriting its logic a bit, and I guess you wanted a direct reimplementation.
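
To illustrate what avoiding those allocations might look like, here is a rough sketch. It is not the answer author's code: the function name train_prealloc! and the buffer names are hypothetical, and no timing claim is made for it. It assumes Float64 arrays, as in the answer, and uses mul! from LinearAlgebra to write each product into a preallocated buffer instead of creating a new array on every iteration.

```julia
using LinearAlgebra

sigmoid(x) = 1 / (1 + exp(-x))
sigmoid_derivative(x) = x * (1 - x)

# Hypothetical variant of train! that reuses two buffers across iterations.
function train_prealloc!(weights, inputs, outputs, iterations)
    output = similar(outputs)       # network output, then the scaled error
    adjustment = similar(weights)   # weight update for the current iteration
    for _ in 1:iterations
        mul!(output, inputs, weights)    # output = inputs * weights, in place
        output .= sigmoid.(output)
        # error .* sigmoid_derivative.(output), fused into one in-place pass
        output .= (outputs .- output) .* sigmoid_derivative.(output)
        mul!(adjustment, transpose(inputs), output)
        weights .+= adjustment
    end
    return weights
end

weights = [-0.16595599, 0.44064899, -0.99977125]
inputs = Float64[0 0 1; 1 1 1; 1 0 1; 0 1 1]
outputs = Float64[0, 1, 1, 0]
train_prealloc!(weights, inputs, outputs, 10000)

# Prediction for the new situation [1, 0, 0]; the value should be close to 1.
prediction = sigmoid(dot([1.0, 0.0, 0.0], weights))
println(prediction)
```

The update rule is the same as in train!; only the storage is reused, so the trained weights should match those of the direct version up to floating-point rounding.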