Review Summary Variational inference: proof that SGD minimizes a potential together with an entropic regularization term. However, this potential differs from the loss used to compute the backpropagation gradients. The two are equal only if the gradient noise is isotropic (i.