Robotics · Paper Summary · 01/03/2026 · 6 min read

Offline Reinforcement Learning for Real-World Robotics

Learning robust manipulation policies from logged data without risky online exploration on physical hardware.

Kumar et al. · UC Berkeley / Google · CoRL 2025

Jon Bell

Research writer

Online reinforcement learning on real robots is slow and unsafe, because every exploratory action has to run on physical hardware. Offline RL instead learns a policy purely from previously collected trajectories, which sidesteps the cost and risk of live trial and error.

The distribution-shift problem

The central challenge is distribution shift: a learned policy may prefer actions that never appear in the dataset, and value estimates for those actions are unreliable because no logged outcome can correct them. The paper constrains the policy to stay close to the data distribution while still improving on the behavior that produced the data.
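As a rough illustration of how such a constraint can be expressed, the sketch below penalizes high Q-values on actions the dataset does not contain, in the spirit of the conservative Q-learning objective named in the citation. It assumes a small discrete action set and tabular Q-values; the function names, the `alpha` weight, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def conservative_penalty(q_rows, data_actions):
    """Mean gap between a soft maximum over all actions and the logged action.

    q_rows: list of per-state Q-value lists, one entry per action.
    data_actions: index of the action actually taken in each logged state.
    The gap is never negative, and it grows when an action that is
    not the logged one has a high Q-value.
    """
    gaps = [logsumexp(q) - q[a] for q, a in zip(q_rows, data_actions)]
    return sum(gaps) / len(gaps)


def conservative_loss(q_rows, data_actions, td_targets, alpha=1.0):
    """Squared TD error on logged transitions plus the weighted penalty.

    alpha is an assumed trade-off weight (hypothetical, not from the paper).
    """
    td = sum((q[a] - y) ** 2 for q, a, y in zip(q_rows, data_actions, td_targets))
    return 0.5 * td / len(td_targets) + alpha * conservative_penalty(q_rows, data_actions)


# Toy comparison: the logged action is index 0 in both cases.
penalty_unseen_high = conservative_penalty([[1.0, 1.0, 5.0]], [0])  # unlogged action looks best
penalty_logged_high = conservative_penalty([[5.0, 1.0, 1.0]], [0])  # logged action looks best
```

In the toy comparison, the penalty is much larger when an unlogged action carries the highest Q-value than when the logged action does. Minimizing it therefore pushes value estimates, and any policy derived from them, back toward actions the data supports.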

Field note: offline RL shines when you already have large logs from teleoperation or scripted controllers.

Results on manipulation

On a suite of grasping and stacking tasks, the conservative offline method matches online baselines without ever running the robot during training, which is a meaningful safety and cost improvement.

Citation

Kumar, A. et al. (2020). Conservative Q-Learning for Offline Reinforcement Learning. arXiv:2006.04779.
