Momentum
Large learning rates often lead to oscillation of weight changes and learning never completes, or the model converges to a solution that is not optimum. One way to allow faster learning without oscillation is to make the weight change a function of the previous weight change to provide a smoothing effect. The momentum factor determines the proportion of the last weight change that is added into the new weight change.
|