weight initialization matters: Xavier/He init keeps activation variance ≈ 1 across layers

Weight initialization stabilizes learning by maintaining consistent activation variance

Image: Billy69150 (voir les conditions d'utilisation / see licensing below), CC BY-SA 4.0, via Wikimedia Commons

weight initialization matters: Xavier/He init keeps activation variance ≈ 1 across layers

Weight initialization stabilizes learning by maintaining consistent activation variance

Related concepts

One email a day: 5 concepts + the 5 stories that matter →

Swipe through 100 ML concepts daily

Open TickerNews