DimensionMismatch: matrix A has dimensions (100,7), matrix B has dimensions (10,7) with the following Julia code


I am new to Julia and I am having trouble getting this training loop to work. My inputs are the 7 segments of a digital clock display, and the target output is the digit the network is trying to learn: the network takes 7 inputs and produces a 10-element output vector. The training loop below is where the error occurs, but I cannot figure out how to reshape inputs or targetOutput so the dimensions match. (The shuffle loop is there to present the samples in a random order each epoch, in case you were wondering.)

# Define training data
    inputs = [
        0 1 1 0 0 0 0;  # 1
        1 1 0 1 1 0 1;  # 2
        1 1 1 1 0 0 1;  # 3
        0 1 1 0 0 1 1;  # 4
        1 0 1 1 0 1 1;  # 5
        1 0 1 1 1 1 1;  # 6
        1 1 1 0 0 0 0;  # 7
        1 1 1 1 1 1 1; # 8
        1 1 1 1 0 1 1;  # 9
        1 1 1 1 1 1 0;   # 0

]

    # Define corresponding labels
    targetOutput = [
        0 1 0 0 0 0 0 0 0 0;  # represents "1"
        0 0 1 0 0 0 0 0 0 0;  # represents "2"
        0 0 0 1 0 0 0 0 0 0;  # represents "3"
        0 0 0 0 1 0 0 0 0 0;  # represents "4"
        0 0 0 0 0 1 0 0 0 0;  # represents "5"
        0 0 0 0 0 0 1 0 0 0;  # represents "6"
        0 0 0 0 0 0 0 1 0 0;  # represents "7"
        0 0 0 0 0 0 0 0 1 0;  # represents "8"
        0 0 0 0 0 0 0 0 0 1;  # represents "9"
        1 0 0 0 0 0 0 0 0 0;  # represents "0"
]
    mse(x,y) = sum((x .- y).^2)/length(x) # MSE will be our loss function
    
    using Random
    Random.seed!(54321) # for reproducibility
    
    twoLayerNeuralNet = Network(Layer(7,100,ReLu), Layer(100,10)) 
        # instantiate a two-layer network
end
begin

    Flux.@functor Layer     # set the Layer-struct as being differentiable
    Flux.@functor Network   # set the Network-struct as being differentiable 
    
    parameters = Flux.params(twoLayerNeuralNet) 
        # obtain the parameters of the layers (recurses through network)
    
    optimizer = ADAM(0.05) # from Flux-library

    netOutput = [] # store output for plotting
    lossCurve = [] # store loss for plotting
    
    for i in 1:1000
        
        for j in shuffle(1:10) 
            target = reshape(targetOutput[j,:], (1, 10))
            # Calculate the gradients for the network parameters
            gradients = Flux.gradient(
                () -> mse(twoLayerNeuralNet(inputs[j, :]), target),
                Flux.params(twoLayerNeuralNet)
            )

            # Update the parameters using the gradients and optimizer settings.
            Flux.Optimise.update!(optimizer, Flux.params(twoLayerNeuralNet), gradients)

            # Log the performance for later plotting
            actualOutput = twoLayerNeuralNet(inputs)
            push!(netOutput, actualOutput)
            push!(lossCurve, mse(actualOutput, target))
        end
    end
end 
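One thing I noticed while debugging (shown here with stand-in random arrays, since the real output depends on the custom Layer type): if the network returns a length-10 vector for a single sample, comparing it against the 1×10 row produced by the `reshape` in the loop silently broadcasts to a 10×10 matrix inside the MSE, rather than raising an error.

```julia
# Same loss function as in the training loop above.
mse(x, y) = sum((x .- y) .^ 2) / length(x)

out    = rand(10)                  # stand-in for a length-10 network output
target = reshape(rand(10), 1, 10)  # 1×10 row, as built in the loop

# Broadcasting a (10,) vector against a (1,10) row yields a 10×10 matrix,
# so the MSE is computed over 100 pairwise differences instead of 10.
size(out .- target)       # (10, 10)

# Keeping the target as a plain vector avoids the unintended broadcast.
target_vec = vec(target)
size(out .- target_vec)   # (10,)
```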

I have tried transposing and reshaping, and selecting different rows, but without success.
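The mismatch itself can be reproduced with plain arrays. Assuming the first Layer stores a weight matrix of size (100, 7), as the error message suggests, multiplication only works when features run along the first dimension of the input, i.e. a length-7 vector or a 7×N matrix. Since `inputs` here has one sample per row (10×7), the batched call `twoLayerNeuralNet(inputs)` in the logging step needs a transpose:

```julia
W1 = randn(100, 7)         # stand-in for the first layer's weight matrix

inputs = rand(0:1, 10, 7)  # 10 samples × 7 segments, laid out as in the question

# W1 * inputs            # DimensionMismatch: (100,7) cannot multiply (10,7)
h = W1 * inputs'         # works: (100,7) * (7,10) gives a 100×10 result

size(h)                  # (100, 10)
```

With this layout each column of `h` is one sample, so the batched targets would correspondingly be `targetOutput'` (10×10, one column per sample).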
