# Linear regression with dot product

#1

For educational purposes I want to have a linear regression example that is using `mx.sym.dot(X, w)` instead of
`mx.sym.FullyConnected(X, num_hidden=1)`, see code example below. Is there a way to do this?
I know I could do something similar with `nd` and autograd instead of `sym`, but then I would also have to implement SGD by hand, which is not what I am looking for …

``````
import numpy as np
import mxnet as mx

m = 1000
batch_size = 100
nVars = 4
data = np.random.normal(0,1, (m, nVars))
labels = -10 * data[:,0] + data[:,1]*np.pi + 5 * np.sqrt(abs(data[:,2])) - data[:,3] + np.random.normal(0,1, m)*2

train_iter = mx.io.NDArrayIter(data={'data':data}, label={'labels':labels}, batch_size=batch_size)

X = mx.sym.Variable('data', shape=(batch_size, nVars))
y = mx.sym.Variable('labels', shape=(batch_size))
w = mx.sym.var(name='theta', shape=(nVars), init=mx.initializer.Normal())

# this works as expected
fc = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)
yhat = mx.sym.LinearRegressionOutput(fc, label=y, name='yhat')
model = mx.mod.Module(symbol=yhat, data_names=['data'], label_names=['labels'])
train_iter.reset()
model.fit(train_iter, num_epoch=10)
pred = model.predict(train_iter).asnumpy().flatten()

# with this solution I cannot figure out how to make the optimizer improve w.
fc_dot = mx.sym.dot(X, w)
yhat_dot = mx.sym.LinearRegressionOutput(fc_dot, label=y, name='yhat_dot')
model_dot = mx.mod.Module(symbol=yhat_dot, data_names=['data'], label_names=['labels'])
train_iter.reset()
model_dot.fit(train_iter, num_epoch=10)
pred_dot = model_dot.predict(train_iter).asnumpy().flatten()

np.mean(pred_dot - labels)
``````

#2

For these two to be the same, you should disable the bias in the `FullyConnected` layer:
`fc = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1, no_bias=True)`
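To see why the bias matters, here is a minimal numpy sketch (not MXNet) of what the two symbols compute: a fully connected layer with `num_hidden=1` computes `X @ w + b`, so only with the bias removed does it reduce to the plain dot product.

``````python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))   # one batch: (batch_size, nVars)
w = rng.normal(size=(4, 1))     # weights for a single output unit
b = rng.normal(size=(1,))       # the bias FullyConnected adds by default

fc_with_bias = X @ w + b        # what FullyConnected(num_hidden=1) computes
fc_no_bias = X @ w              # what no_bias=True computes
dot = np.dot(X, w)              # the mx.sym.dot(X, w) equivalent

print(np.allclose(fc_no_bias, dot))    # True: identical
print(np.allclose(fc_with_bias, dot))  # False: off by b everywhere
``````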

#3

Or alternatively, add a bias term initialized to zero to your deconstructed example.

#4

Thanks for the quick reply! I am aware that `FullyConnected()` without a bias is the same as the dot product.

I am currently writing up a small document where I implement linear regression from scratch and with various ML tools. From scratch it is a simple dot product, and in TensorFlow I could do the same thing with a dot product. So, simply for consistency within this document, I was trying to implement it with a dot product in mxnet too.

(Instead of explaining to the reader: "Hey, did you happen to know that linear regression is basically a 1-layer fully connected neural net without an activation function, hence we use `FullyConnected()`." Which would come at a later point …)

#5

It works if you define the weight as

``````
w = mx.sym.var(name='theta', shape=(nVars, 1), init=mx.initializer.Normal())
``````
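A minimal numpy sketch of why the weight shape matters: with shape `(nVars,)` the dot product yields a 1-D vector of shape `(batch_size,)`, while `(nVars, 1)` yields a column of shape `(batch_size, 1)`, matching the `(batch, num_hidden)` output that `FullyConnected(num_hidden=1)` produces.

``````python
import numpy as np

batch_size, nVars = 100, 4
X = np.random.normal(size=(batch_size, nVars))

w_flat = np.random.normal(size=(nVars,))    # shape from the question
w_col = np.random.normal(size=(nVars, 1))   # shape suggested here

print(np.dot(X, w_flat).shape)   # (100,)   -- 1-D, drops the output axis
print(np.dot(X, w_col).shape)    # (100, 1) -- 2-D, like FullyConnected
``````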

#6

Indeed, I did not realize that the default in `FullyConnected()` is to add a bias term, and that it hence was NOT the same as my deconstructed example … Further, because the fit without the bias was so bad, my brain wrongly assumed there was no optimization going on at all.
For the sake of completeness, here is the code that finally does what I was looking for.

Thanks for the help everybody!

``````
import numpy as np
import mxnet as mx

nEpochs = 100
m = 1000
batch_size = 100
nVars = 4
data = np.random.normal(0,1, (m, nVars))
labels = -10 * data[:,0] + data[:,1]*np.pi + 5 * np.sqrt(abs(data[:,2])) - data[:,3] + np.random.normal(0,1, m)*2

train_iter = mx.io.NDArrayIter(data={'data':data}, label={'labels':labels}, batch_size=batch_size)

X = mx.sym.Variable('data', shape=(batch_size, nVars))
y = mx.sym.Variable('labels', shape=(batch_size))
w = mx.sym.var(name='theta', shape=(nVars, 1), init=mx.initializer.Normal())
b = mx.sym.var(name='bias', shape=(1), init=mx.initializer.Zero())