LARS implementation differs from the paper

The l2norm helper returns the squared norm (it does not take the sqrt), and the sqrt is applied later in `_get_lars`. Is this a bug or a feature?

The TensorFlow implementation matches what the paper describes.
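
To make the question concrete, here is a minimal sketch of why the placement of the sqrt can matter. It is not the library's code: `squared_l2`, `ratio_paper`, and `ratio_deferred_sqrt` are hypothetical names, and the deferred form (one sqrt over the whole ratio of squared norms) is only my assumption about what `_get_lars` does. The paper's ratio is written without the trust coefficient for brevity.

```python
import math

def squared_l2(v):
    # Sum of squares, i.e. an "l2norm" with the sqrt left out.
    return sum(x * x for x in v)

def ratio_paper(w, g, wd):
    # Ratio as written in the LARS paper: ||w|| / (||g|| + wd * ||w||)
    w_norm = math.sqrt(squared_l2(w))
    g_norm = math.sqrt(squared_l2(g))
    return w_norm / (g_norm + wd * w_norm)

def ratio_deferred_sqrt(w, g, wd):
    # ASSUMED form: squared norms combined first, a single sqrt at the end.
    w2 = squared_l2(w)
    g2 = squared_l2(g)
    return math.sqrt(w2 / (g2 + wd * w2))

w = [3.0, 4.0]  # example weights
g = [0.6, 0.8]  # example gradient

for wd in (0.0, 0.1):
    print(wd, ratio_paper(w, g, wd), ratio_deferred_sqrt(w, g, wd))
```

Under these assumptions the two forms agree when the weight decay is zero, since sqrt(a / b) = sqrt(a) / sqrt(b), but they diverge once wd > 0 because the sqrt no longer distributes over the sum in the denominator. If `_get_lars` has that shape, the result would differ from the paper whenever weight decay is enabled.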