I came across Andrew Ng's non-linear hypothesis lecture on neural networks, where an MCQ asked for the number of features for a 100×100 image of greyscale intensities.
The answer was 50 million, i.e. $5 \times 10^7$.
However, earlier, for a 50×50 pixel greyscale image, the number of features was 50×50 = 2,500, and for an RGB image it was 7,500.
Why would it be $5 \times 10^7$ instead of 10,000?
He does, however, say to include all quadratic terms ($x_i x_j$) as features.
The question is:
Suppose you are learning to recognize cars from 100×100 pixel images (grayscale, not RGB). Let the features be pixel intensity values. If you train logistic regression including all the quadratic terms ($x_i x_j$) as features, about how many features will you have?
And earlier he added that, if we were to use $x_i x_j$, we would end up with a total of about 3 million features. Still, I couldn't see what relation this is.
For a 50×50 pixel image, the answer is 3,128,750.
At first it is a combination:

$$C^2_n \text{ for } x_i x_j$$

And this:

$$n \text{ for } x_i^2$$

$$n \text{ for } x_i$$

Number of features $= C^2_n + n + n$.
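That sum can be evaluated in closed form (a worked check of the arithmetic):

$$C^2_n + n + n = \frac{n(n-1)}{2} + 2n = \frac{n^2 + 3n}{2}$$

For $n = 100 \times 100 = 10{,}000$ this gives $\frac{10{,}000^2 + 30{,}000}{2} = 50{,}015{,}000 \approx 5 \times 10^7$, and for $n = 50 \times 50 = 2{,}500$ it gives $3{,}128{,}750$, matching the counts quoted above.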
And the answer for a 100×100 pixel image is 50,015,000.
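A quick sanity check of that formula in Python (the function name is my own):

```python
from math import comb

def num_quadratic_features(n):
    # comb(n, 2): x_i * x_j pairs with i < j
    # + n: the squared terms x_i^2
    # + n: the original linear terms x_i
    return comb(n, 2) + n + n

print(num_quadratic_features(50 * 50))    # 50x50 greyscale  -> 3128750
print(num_quadratic_features(100 * 100))  # 100x100 greyscale -> 50015000
```

The 100×100 case comes out to 50,015,000, which is the "about $5 \times 10^7$" in the quiz answer.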