Can someone shade light on:
This means that the input into a network has 1 million dimensions. Even an aggressive reduction to 1,000 dimensions after the first layer means that we need 10^9 parameters.
I see input vector is 10^3. Output vector is of-course two. I don’t know for what sequential architecture we would need 10^9 param?
According to the text, the input vector is 1M (pixels) and the first (fully connected) layer reduces the number of dimensions to 1K (i.e. the layer has 1000 neurons) which sums up to 10^6 x 10^3 = 10^9 parameters.