One Training Example Is All You Need for Reasoning
A single training example may be enough to improve an LLM's reasoning ability during post-training. In this post, I explain “Reinforcement Learning for Reasoning in Large Language Models with One Training Example” [Wang+ NeurIPS 2025], a study that applies reinforcement learning with just one training example. Intuitively, the main […]