what does that mean "vector augmented to 1"?


I am new to machine learning and statistics (I studied math at university, but that was about 10-12 years ago). Could you please explain the meaning of the following passage from page 4 (page 5 of the book) of this paper ( https://www.researchgate.net/publication/227612766_An_Empirical_Comparison_of_Machine_Learning_Models_for_Time_Series_Forecasting ):

The multilayer perceptron (often simply called neural network) is perhaps the most popular network architecture in use today both for classification and regression (Bishop [5]). The MLP is given as follows:

ŷ = v_0 + Σ_{j=1}^{N_H} v_j g(w_j^T x′)    (1)

where x′ is the input vector x, augmented with 1, i.e. x′ = (1, x^T)^T, w_j is the weight vector for the j-th hidden node, v_0, v_1, ..., v_{N_H} are the weights for the output node, and ŷ is the network output. The function g represents the hidden node output, and it is given in terms of a squashing function, for example (and that is what we used) the logistic function: g(u) = 1/(1 + exp(−u)). A related model in the econometrics literature is

For instance, suppose we have the vector x = [0.2, 0.3, 0.4, 0.5]. How do I transform it to get the vector x′ augmented with 1, i.e. x′ = (1, x)?
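In practice the augmentation is just prepending a constant 1 to the front of the vector, so the bias term can be folded into the weight vector. A minimal sketch in Python (NumPy; the variable names are illustrative):

```python
import numpy as np

x = np.array([0.2, 0.3, 0.4, 0.5])

# x' = (1, x^T)^T : prepend a constant 1 to the input vector
x_aug = np.concatenate(([1.0], x))

print(x_aug)  # [1.  0.2 0.3 0.4 0.5]
```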

1 Answer

Accepted answer, by Prune:

This is part of the isomorphism between matrices and systems of equations. What you have at the moment is a row of coefficients corresponding to a right-hand-side expression, such as

w1 = 0.2*x1 + 0.3*x2 + 0.4*x3 + 0.5*x4
w2 = ...
w3 = ...
w4 = ...

When we want to solve the system, we need to augment the matrix. This requires adding the coefficient of each w[n] variable; those coefficients are trivially all ones:

1*w1 = 0.2*x1 + 0.3*x2 + 0.4*x3 + 0.5*x4
1*w2 = ...
1*w3 = ...
1*w4 = ...

... and that is where we get the augmented matrix. When we identify the variables by position -- w by row, x by column -- what remains is the coefficients alone, in a neat matrix.
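Putting it together, equation (1) from the paper can be sketched end-to-end: augment the input with 1, push it through the logistic squashing function at each hidden node, then take the weighted sum at the output. This is a toy example; the weight values below are made up purely for the demo.

```python
import numpy as np

def logistic(u):
    # Squashing function g(u) = 1 / (1 + exp(-u))
    return 1.0 / (1.0 + np.exp(-u))

def mlp_output(x, W, v):
    # x: input vector
    # W: (N_H, len(x)+1) matrix whose rows are the hidden weight vectors w_j
    # v: (N_H+1,) output weights (v[0] is the bias v_0)
    x_aug = np.concatenate(([1.0], x))   # x' = (1, x^T)^T
    hidden = logistic(W @ x_aug)         # g(w_j^T x') for each hidden node j
    return v[0] + v[1:] @ hidden         # y_hat = v_0 + sum_j v_j g(w_j^T x')

x = np.array([0.2, 0.3, 0.4, 0.5])
W = np.zeros((3, 5))                     # 3 hidden nodes; zero weights so g(0) = 0.5
v = np.array([1.0, 2.0, 2.0, 2.0])
print(mlp_output(x, W, v))               # 1 + 3 * 2 * 0.5 = 4.0
```

With all hidden weights zero, each hidden node outputs g(0) = 0.5, which makes the expected output easy to verify by hand.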