Linear regression implementation from scratch

http://d2l.ai/chapter_linear-networks/linear-regression-scratch.html

Re: exercise #5, I actually don’t understand why the reshape function is necessary. It seems like an implementation detail that leaked to the user API. What am I missing?

Edit: I figured out why reshaping is necessary by playing around with small examples (it comes down to broadcasting semantics). However, I ran into an issue: if I call attach_grad before reshaping my vector, the grad is None; if I call attach_grad after reshaping, I get the correct grad. Is that a bug, or is there a reason for it?