Dropout randomly sets neuron inputs/outputs to zero during training

Dropout (neural networks)

Dropout is a regularization technique used to prevent overfitting in neural networks by randomly disabling neurons during training. This randomness helps the network learn more robust features that are not reliant on specific neurons.

Example

With a dropout rate of 50%, each neuron is dropped independently with probability 0.5 on every training step. A dropped neuron's output is set to zero for that forward pass, and all neurons are active again at inference time.

Dropout reduces the risk of overfitting by ensuring that the neural network does not become overly reliant on any single neuron, promoting better generalization.
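The example above can be sketched in a few lines of plain Python. This is a minimal illustration, not a framework implementation: the function name `dropout`, the parameter `p_drop`, and the optional `rng` argument are assumptions made here. It also rescales surviving activations by 1/(1 - p_drop), a common convention known as inverted dropout that the text above does not mention.

```python
import random

def dropout(activations, p_drop=0.5, training=True, rng=None):
    """Inverted dropout over a list of activations (illustrative sketch)."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training:
        return list(activations)  # inference: every unit stays active
    rng = rng or random.Random()
    keep = 1.0 - p_drop
    # Each unit is zeroed with probability p_drop. Survivors are scaled by
    # 1/keep so the expected activation is the same as at inference time.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

hidden = [0.2, 1.5, -0.7, 0.9]
train_out = dropout(hidden, p_drop=0.5)      # some entries zeroed, rest doubled
eval_out = dropout(hidden, training=False)   # unchanged copy of hidden
```

Because a fresh random mask is drawn on every call, each training step effectively trains a different subnetwork.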

Related concepts

dropout works as regularization: it approximates an ensemble of subnetworks

Dropout randomly deactivates neurons during training, simulating an ensemble of subnetworks, thus preventing co-adaptation and improving generalization

ill-conditioned matrices cause numerical instability: small input changes → large output changes

Ill-conditioned matrices amplify input perturbations, leading to significant output variability
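A small worked case may make the amplification concrete. The sketch below solves a nearly singular 2x2 system twice, changing one right-hand-side entry by 0.0001. The helper `solve_2x2` and the specific numbers are chosen here for illustration only.

```python
def solve_2x2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] @ [x, y] = [r1, r2] using Cramer's rule."""
    det = a * d - b * c
    return ((r1 * d - b * r2) / det, (a * r2 - c * r1) / det)

# The two rows are almost parallel, so the determinant is tiny (about 1e-4).
base = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0)         # roughly (2, 0)
perturbed = solve_2x2(1.0, 1.0, 1.0, 1.0001, 2.0, 2.0001)  # roughly (1, 1)

# A change of 1e-4 in the input moved each solution component by about 1.
shift = max(abs(p - q) for p, q in zip(base, perturbed))
```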

AdaGrad's learning rate decays to zero

AdaGrad divides the learning rate by the square root of the accumulated squared gradients. Because that sum never decreases, the effective learning rate shrinks monotonically and can approach zero, stalling training
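The decay can be seen by tracking the effective step size for a single parameter. This is a rough sketch: the function name `adagrad_effective_lrs` and the default values for `lr` and `eps` are assumptions for illustration.

```python
import math

def adagrad_effective_lrs(grads, lr=0.1, eps=1e-8):
    """Return the effective learning rate AdaGrad would use at each step."""
    accum = 0.0
    rates = []
    for g in grads:
        accum += g * g  # running sum of squared gradients; never decreases
        rates.append(lr / (math.sqrt(accum) + eps))
    return rates

# With a constant gradient of 1.0 the rate falls off like lr / sqrt(t).
rates = adagrad_effective_lrs([1.0] * 100)
```

With constant gradients the denominator grows like the square root of the step count, so the decay is steady rather than exponential.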

gradient accumulation simulates larger batch sizes without more memory

Gradient accumulation splits a large batch into smaller micro-batches, sums their gradients over several forward and backward passes, and updates the weights only once, so peak memory is set by the micro-batch size rather than the full batch
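The equivalence with a full-batch gradient can be checked on a toy model. The sketch below assumes a one-parameter model y = w * x with mean squared error. The function names and data are made up for illustration.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error with respect to w for y = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch_size):
    """Accumulate micro-batch gradients into one full-batch gradient."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        # Weight each micro-batch mean by its share of the full batch,
        # so an uneven final micro-batch is handled correctly.
        total += grad_mse(w, mx, my) * len(mx) / n
    return total

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.9]
full = grad_mse(0.5, xs, ys)
accumulated = accumulated_grad(0.5, xs, ys, micro_batch_size=2)
# full and accumulated agree up to floating-point rounding.
```

Only one micro-batch needs to be held in memory at a time, which is where the saving comes from.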

log-loss / cross-entropy loss penalizes: confident wrong predictions more heavily

Log-loss is the negative log of the probability assigned to the true class, so the penalty grows without bound as a wrong prediction becomes more confident
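A quick numeric check shows how steeply the penalty rises. This is a minimal sketch for the binary case. The function name `log_loss` and the clipping constant `eps` are assumptions made here.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy for one example; y_true is 0 or 1."""
    p = min(max(p_pred, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

# True label is 1 in every case; only the predicted probability changes.
confident_right = log_loss(1, 0.99)
unsure = log_loss(1, 0.50)
confident_wrong = log_loss(1, 0.01)
# confident_wrong is far larger than unsure, which is larger than confident_right.
```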

learning rate warmup does: starts small to avoid early training instability

Learning rate warmup gradually increases the learning rate from zero (or a small value) to a predefined target over the first training steps, keeping early updates small while gradients are still noisy
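One common form is a linear ramp. The sketch below assumes a linear schedule. The name `warmup_lr` and the default values for `base_lr` and `warmup_steps` are illustrative, not taken from any particular library.

```python
def warmup_lr(step, base_lr=1e-3, warmup_steps=1000):
    """Linear warmup: ramp up to base_lr, then hold it constant."""
    if step < warmup_steps:
        # step is zero-based, so the first update uses base_lr / warmup_steps.
        return base_lr * (step + 1) / warmup_steps
    return base_lr

schedule = [warmup_lr(s) for s in (0, 499, 999, 5000)]
# The rate rises linearly during warmup and stays at base_lr afterwards.
```

In practice the constant phase is often replaced by a decay schedule, which this sketch leaves out.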
