
Tight Long-Term Tail Decay of (Clipped) SGD in Non-Convex Optimization

Source: arXiv

Aleksandar Armacki, Dragana Bajović, Dušan Jakovetić, Soummya Kar, Ali H. Sayed

cs.LG | math.OC | Feb 5, 2026 | 141 views

One-line Summary

This paper establishes tight long-term tail decay rates for SGD and clipped SGD in non-convex optimization, showing significantly faster decay than previously known results.

Plain-language Overview

This research investigates how the stochastic gradient descent (SGD) algorithm behaves over time, focusing on the likelihood of large errors occurring. While previous studies have looked at short-term error probabilities, this paper examines long-term behavior, which is more relevant for algorithms run over many iterations. The authors find that the probability of large errors decreases much faster over time than previously believed, especially for clipped SGD, a variant that caps the size of each gradient step to guard against heavy-tailed noise. This means the algorithm is more reliable over long runs than earlier analyses suggested.
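
To make the setting concrete, here is a minimal sketch of norm-clipped SGD on a toy non-convex objective, written in Python/NumPy. The clipping threshold `tau`, step size `eta`, and the Rosenbrock test function with Gaussian gradient noise are illustrative assumptions only; they are not the algorithmic configuration or noise model analyzed in the paper.

```python
import numpy as np

def clipped_sgd(grad_fn, x0, eta=1e-3, tau=1.0, n_iters=10_000, seed=0):
    """Norm-clipped SGD: if the stochastic gradient exceeds norm tau,
    rescale it to norm tau before taking the step."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_fn(x, rng)                # stochastic gradient estimate
        norm = np.linalg.norm(g)
        if norm > tau:                     # clip: rescale to norm tau
            g = g * (tau / norm)
        x = x - eta * g                    # standard SGD update
    return x

# Toy non-convex objective: Rosenbrock gradient plus additive noise
# (a heavier-tailed noise distribution could be substituted here).
def noisy_rosenbrock_grad(x, rng, noise_scale=0.5):
    a, b = 1.0, 100.0
    g = np.array([
        -2 * (a - x[0]) - 4 * b * x[0] * (x[1] - x[0] ** 2),
        2 * b * (x[1] - x[0] ** 2),
    ])
    return g + noise_scale * rng.standard_normal(2)

x_final = clipped_sgd(noisy_rosenbrock_grad, x0=[-1.5, 2.0],
                      eta=1e-3, tau=5.0, n_iters=50_000)
print("final iterate:", x_final)
```

The sketch only illustrates the update rule; the paper's contribution is the analysis of how quickly the probability of a large error at iteration t decays as t grows, not the algorithm itself.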

Technical Details