
Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning

Source: arXiv

Zixiang Di, Jinyi Han, Shuo Zhang, Ying Liao, Zhi Li, Xiaofeng Ji, Yongqi Wang, Zheming Yang, Ming Gao, Bingdong Li, Jie Wang

cs.LG | cs.AI | Feb 3, 2026

One-line Summary

Plausible Negative Samples (PNS) is a method that improves the reasoning capabilities of Large Language Models by generating high-quality incorrect responses to train on.

Plain-language Overview

Large Language Models (LLMs) can become better at reasoning by learning from incorrect examples, but not all wrong answers are equally helpful. The proposed method, Plausible Negative Samples (PNS), generates sophisticated incorrect responses that look and read like correct solutions except for the final answer. This focus on the quality of the negative examples gives the model more informative contrasts to learn from. On several mathematical reasoning benchmarks, PNS delivers significant performance gains over competing techniques.
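
The overview describes a data-construction recipe, which a small sketch can make concrete. The snippet below shows, under loose assumptions, how one might filter model-generated solutions for plausible negatives and pair them with correct ones for preference-style fine-tuning; the `Sample` structure, the plausibility test, and the DPO-style pairing are illustrative assumptions, not the authors' actual pipeline.

```python
# A minimal, hypothetical sketch of plausible-negative-sample construction
# for preference-style training. All names, fields, and filtering criteria
# below are illustrative assumptions, not the paper's implementation.

from dataclasses import dataclass


@dataclass
class Sample:
    prompt: str        # the problem statement
    reasoning: str     # chain-of-thought solution text
    final_answer: str  # extracted final answer


def is_plausible_negative(candidate: Sample, reference: Sample) -> bool:
    """A candidate is a 'plausible' negative if it resembles the correct
    solution but ends in the wrong answer (crude illustrative criteria)."""
    wrong_answer = candidate.final_answer != reference.final_answer
    # Rough proxy for "looks like a correct solution": comparable length.
    # The paper presumably applies a much stronger quality measure.
    ratio = len(candidate.reasoning) / max(len(reference.reasoning), 1)
    return wrong_answer and 0.5 < ratio < 2.0


def build_preference_pairs(positives: list[Sample],
                           candidates: list[Sample]) -> list[dict]:
    """Pair each correct solution with one plausible negative, producing
    (chosen, rejected) pairs for a preference objective such as DPO."""
    pairs = []
    for pos in positives:
        for neg in candidates:
            if neg.prompt == pos.prompt and is_plausible_negative(neg, pos):
                pairs.append({"prompt": pos.prompt,
                              "chosen": pos.reasoning,
                              "rejected": neg.reasoning})
                break  # keep one negative per positive for simplicity
    return pairs
```

The filter is the interesting part: negatives that pass it differ from correct solutions only in subtle ways, so training on the resulting pairs pushes the model to attend to where the reasoning actually goes wrong rather than to surface style.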

Technical Details