Implementing a Multilayer Perceptron from Scratch

mli · November 27, 2018, 10:29pm

http://d2l.ai/chapter_multilayer-perceptrons/mlp-scratch.html

prasanth5reddy · April 7, 2019, 4:30am

We defined the network here from the inputs to the hidden layer and then to the output. My understanding is that when computing the gradients, we first compute them for the hidden layer and then for the input layer, as in the backpropagation algorithm. But here, are we directly computing the loss between the input layer and the output layer and then computing the gradients? Is this approach correct? I am confused here.

mouryarishik · April 7, 2019, 12:13pm

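Backpropagation has not gone away; what has changed is that we rarely derive and code it by hand. It is the algorithm that helped researchers optimize networks in the 1980s and 1990s, and it is simply the chain rule of derivatives applied from the output back toward the input.
Nowadays we calculate the gradient of the cost with respect to the parameters using automatic differentiation, which applies that same chain rule for us. The gradients therefore still pass through the hidden layer on their way back to the first layer's parameters; the loss is not connected directly to the input layer. In this approach the framework builds a computation graph during the forward pass and then walks it backward to calculate the gradients.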
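To make the computation-graph idea concrete, here is a minimal sketch of reverse-mode automatic differentiation on scalars. It is an illustration only, not how MXNet implements it: the `Value` class and the tiny one-hidden-unit network are hypothetical names made up for this example. Each operation records its parents and local derivatives, and `backward` applies the chain rule from the loss back to every parameter.

```python
class Value:
    """A scalar node in a computation graph (hypothetical helper, not a library API)."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents          # nodes this value was computed from
        self.local_grads = local_grads  # d(self)/d(parent), one entry per parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other), (other.data, self.data))

    def relu(self):
        slope = 1.0 if self.data > 0 else 0.0
        return Value(max(0.0, self.data), (self,), (slope,))

    def backward(self):
        # Order the graph so every node comes after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local in zip(node.parents, node.local_grads):
                parent.grad += local * node.grad


# A one-hidden-unit "MLP": input -> hidden (ReLU) -> output -> squared error.
x, target = Value(2.0), -1.0
w1, b1 = Value(0.5), Value(0.1)
w2, b2 = Value(-1.5), Value(0.2)

h = (w1 * x + b1).relu()        # forward pass through the hidden layer
y = w2 * h + b2                 # forward pass through the output layer
err = y + Value(-target)        # y - target
loss = err * err

loss.backward()                 # one call fills in the gradient of every parameter
print(w1.grad, b1.grad, w2.grad, b2.grad)
```

The gradient of `w1` is obtained by passing through `y` and `h` first, which is the layer-by-layer backward order of backpropagation, just without any hand-written derivative code.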

prasanth5reddy · April 7, 2019, 12:33pm

Alright! The last line makes sense. So instead of taking two layers at a time and computing their gradients backward by hand, this runs the entire network forward using the current parameters and then computes the gradients with respect to each parameter accordingly, right?

mouryarishik · April 7, 2019, 12:40pm

Yes. I’d suggest not worrying too much about it, because in practice you rarely have to calculate the gradients yourself. We need gradients only to run the gradient descent algorithm, which made deep learning what it is today. The more important part to get right is how we forward propagate.
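As a rough sketch of those two pieces, here is a forward pass for a one-hidden-layer MLP and a plain gradient descent update in pure Python. The function names (`dense`, `mlp_forward`, `sgd_step`) and all the numbers are made up for illustration; they are not taken from the book's code, and the gradients passed to `sgd_step` are assumed to come from automatic differentiation.

```python
def relu(values):
    # Element-wise max(0, v).
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    # weights holds one row per output unit: out_i = sum_j W[i][j] * x[j] + b[i].
    return [sum(w * xj for w, xj in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mlp_forward(x, W1, b1, W2, b2):
    # Input -> hidden layer with ReLU -> output layer.
    hidden = relu(dense(x, W1, b1))
    return dense(hidden, W2, b2)


def sgd_step(params, grads, lr):
    # Gradient descent: move each parameter against its gradient.
    return [p - lr * g for p, g in zip(params, grads)]


x = [1.0, 2.0]
W1, b1 = [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0]
W2, b2 = [[1.0, 2.0]], [0.5]

output = mlp_forward(x, W1, b1, W2, b2)
print(output)

# Placeholder gradients for two parameters, only to show the update rule.
updated = sgd_step([1.0, -2.0], [0.5, -1.0], lr=0.1)
print(updated)
```

Only the forward pass and the update rule are written by hand here; the gradient computation in between is the part the framework handles.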
