Better Generalization

Penalize large weights with weight regularization
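
A minimal sketch of weight regularization, assuming Keras as the framework; the layer sizes and the L2 coefficient 1e-4 are illustrative values, not from the source. The penalty is added to the training loss, so the optimizer trades data fit against weight magnitude.

    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    # The L2 penalty on this layer's weights is added to the training
    # loss, nudging the optimizer toward smaller weight values.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")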

Encourage sparse representations with activity regularization; see the sketch after the list below

  1. large activations may indicate an overfit model

  2. there is a tension between the expressiveness and the generalization of the learned features

  3. encourage small activations with an additional penalty on the layer outputs

  4. track the mean activation value to confirm the representation is growing sparser
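
A sketch of activity regularization under the same assumptions (Keras; illustrative sizes and an L1 coefficient of 1e-5). Note the penalty targets the layer's outputs, not its weights; the probe model at the end is one hypothetical way to track the mean activation from item 4.

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers, regularizers

    inputs = keras.Input(shape=(20,))
    # L1 on the *activations* drives many of them to exactly zero,
    # yielding a sparse learned representation.
    hidden = layers.Dense(64, activation="relu",
                          activity_regularizer=regularizers.l1(1e-5))(inputs)
    outputs = layers.Dense(1, activation="sigmoid")(hidden)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # Intermediate model exposing the penalized layer's outputs, so the
    # mean activation can be monitored as training progresses.
    probe = keras.Model(inputs, hidden)
    x = np.random.rand(32, 20).astype("float32")
    print("mean activation:", float(probe(x).numpy().mean()))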

Force small weights with weight constraints
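
A sketch of a weight constraint, again assuming Keras; the max-norm value of 3.0 is a common illustrative setting. Unlike a regularizer, a constraint is enforced directly rather than through the loss: weight vectors that exceed the limit are rescaled after each update.

    from tensorflow import keras
    from tensorflow.keras import layers
    from tensorflow.keras.constraints import MaxNorm

    # MaxNorm rescales each unit's incoming weight vector after every
    # gradient step so its L2 norm never exceeds 3.0.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu", kernel_constraint=MaxNorm(3.0)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")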

Decouple layers with dropout
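
A dropout sketch under the same assumptions; the rate of 0.5 is illustrative. Because half of the hidden activations are zeroed at random on each training step, no downstream unit can co-adapt to any single upstream unit.

    from tensorflow import keras
    from tensorflow.keras import layers

    # Dropout zeroes 50% of the previous layer's activations during
    # training only; at inference it passes values through unchanged.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")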

Promote robustness with noise
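
One way to inject noise, assuming Keras' GaussianNoise layer with an illustrative stddev of 0.1; noise at the input acts as a form of data augmentation and discourages the network from memorizing exact training samples.

    from tensorflow import keras
    from tensorflow.keras import layers

    # GaussianNoise adds zero-mean noise (stddev 0.1) to its inputs during
    # training only; at inference it behaves as an identity layer.
    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.GaussianNoise(0.1),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")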

Halt training at the right time with early stopping
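
An early-stopping sketch, same assumptions; the patience of 10 epochs and the synthetic data are illustrative placeholders. Training halts once validation loss stops improving, and the weights roll back to the best epoch seen.

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # Stop when val_loss has not improved for 10 consecutive epochs and
    # restore the weights from the best epoch.
    stopper = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True)

    x = np.random.rand(200, 20).astype("float32")   # placeholder data
    y = np.random.randint(0, 2, (200, 1)).astype("float32")
    model.fit(x, y, validation_split=0.2, epochs=200,
              callbacks=[stopper], verbose=0)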

