Saveliy Baturin
This paper shows that in overparameterized one-hidden-layer ReLU networks, the loss landscape becomes smoother and flatter as the network width increases, resulting in smaller energy gaps between local and global minima.
Researchers have explored how the 'loss landscape' of neural networks changes as these networks become larger, focusing on a simple type of network called a one-hidden-layer ReLU network. They found that when these networks have more parameters than necessary to fit the data (i.e., they are overparameterized), the landscape becomes smoother and flatter. This means that different parameter configurations achieving the same performance can be connected by paths in parameter space along which the loss barely increases. In practical terms, this could make it easier for gradient-based training to reach near-optimal solutions in large neural networks.
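To make the "connected with minimal increase in loss" idea concrete, below is a minimal, self-contained sketch (not code from the paper): it trains two one-hidden-layer ReLU networks of the same width from different random initializations on synthetic data, then measures the worst-case loss along the straight line between them in parameter space. The dataset, training loop, and widths are all illustrative assumptions; a smaller barrier is a rough proxy for the "smaller energy gaps" described above.

```python
# Minimal sketch (illustrative, not the paper's experiments): loss barrier
# along a linear path between two independently trained one-hidden-layer
# ReLU networks. All hyperparameters below are assumptions for demonstration.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data
X = rng.normal(size=(256, 10))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=256)

def init(width, d_in):
    # One-hidden-layer ReLU network: f(x) = relu(x W1) w2
    return {"W1": rng.normal(size=(d_in, width)) / np.sqrt(d_in),
            "w2": rng.normal(size=width) / np.sqrt(width)}

def predict(p, X):
    return np.maximum(X @ p["W1"], 0.0) @ p["w2"]

def mse(p):
    return float(np.mean((predict(p, X) - y) ** 2))

def train(p, steps=2000, lr=1e-2):
    # Plain full-batch gradient descent on the squared loss.
    for _ in range(steps):
        h = np.maximum(X @ p["W1"], 0.0)        # hidden activations
        err = h @ p["w2"] - y                   # residuals
        g_w2 = 2.0 * h.T @ err / len(y)
        g_h = np.outer(err, p["w2"]) * (h > 0)  # backprop through ReLU
        g_W1 = 2.0 * X.T @ g_h / len(y)
        p["w2"] -= lr * g_w2
        p["W1"] -= lr * g_W1
    return p

def barrier(p_a, p_b, n_points=21):
    # Maximum loss along the straight line between the two trained networks,
    # minus the loss at the endpoints: a rough proxy for the "energy gap".
    losses = []
    for t in np.linspace(0.0, 1.0, n_points):
        p_t = {k: (1 - t) * p_a[k] + t * p_b[k] for k in p_a}
        losses.append(mse(p_t))
    return max(losses) - max(losses[0], losses[-1])

for width in (8, 64, 512):
    a = train(init(width, X.shape[1]))
    b = train(init(width, X.shape[1]))
    print(f"width={width:4d}  barrier along linear path ~ {barrier(a, b):.4f}")
```

Note that a straight-line path is only one simple way to probe connectivity; the barrier measured this way is an upper bound on the true energy gap between the two minima, and the qualitative trend with width is what the smoothing claim concerns.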