Robust Autonomy Emerges from Self-Play

paper-review
autonomous-driving
Author

Daniel Pickem

Published

June 4, 2025

A few weeks ago I came across a paper titled “Robust Autonomy Emerges from Self-Play” in the TLDR newsletter (which is worth subscribing to if you want to stay on top of the latest news in AI, but that’s not the point of this post).

The paper was interesting for many reasons, not the least of which was a sentimental one. It was published by former colleagues of mine at Apple and appears to be the latest (and last?) public artifact of Apple's self-driving efforts.

Over ten authors contributed to the paper, but I’ve only had the pleasure of (tangentially) working with the following:

In my opinion, the approach in the paper stands out for several reasons:

My favorite part of this paper, though, was the conditioning input C to the model, which modulates the policy’s behavior and enables inference-time changes to agent behavior simply by changing the conditioning inputs. More aggressive driving? Modify the weights on some reward function components. Want truck behavior instead of passenger-vehicle behavior? Change the agent’s dimensions and dynamics through C and it will behave like a truck. A single model can be used to simulate a wide variety of agents and behaviors, which is a powerful capability for realistic agents in simulation (and possibly even for a policy that runs on-vehicle).
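To make the idea concrete, here is a toy sketch in Python. It is not the paper's architecture or code: the `Conditioning` fields, the `toy_policy` function, and the hand-picked weights are all hypothetical stand-ins for a learned policy that takes C as an extra input. The only point is that one fixed set of parameters yields different behavior when C changes at inference time.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Conditioning:
    """Hypothetical conditioning input C (field names are illustrative only)."""
    comfort_weight: float   # reward weight that penalizes harsh acceleration
    progress_weight: float  # reward weight that rewards making progress
    length_m: float         # agent length in meters
    width_m: float          # agent width in meters

    def as_vector(self) -> list[float]:
        return [self.comfort_weight, self.progress_weight,
                self.length_m, self.width_m]


# Fixed, hand-picked parameters standing in for learned network weights.
# The first two act on the observation, the last four on C.
_WEIGHTS = [0.05, -0.02, -0.8, 0.6, -0.1, -0.1]


def toy_policy(observation: list[float], c: Conditioning) -> float:
    """Toy stand-in for a policy conditioned on C.

    observation = [gap_to_lead_vehicle_m, own_speed_mps]
    Returns a normalized acceleration command in [-1, 1].
    """
    features = observation + c.as_vector()
    return math.tanh(sum(w * f for w, f in zip(_WEIGHTS, features)))


obs = [30.0, 10.0]  # 30 m gap to the lead vehicle, driving at 10 m/s

cautious = Conditioning(comfort_weight=1.0, progress_weight=0.5,
                        length_m=4.5, width_m=1.8)
# Same model, different C: shift the reward weights toward progress.
aggressive = replace(cautious, comfort_weight=0.2, progress_weight=1.5)
# Same model, different C: give the agent truck-like dimensions.
truck = replace(cautious, length_m=16.0, width_m=2.5)

for name, c in [("cautious", cautious), ("aggressive", aggressive), ("truck", truck)]:
    print(f"{name:>10}: acceleration command = {toy_policy(obs, c):+.2f}")
```

In this toy setup the aggressive configuration accelerates harder than the cautious one, and the truck configuration is the most conservative, even though `_WEIGHTS` never changes. In the paper the mapping from C to behavior is learned during self-play rather than hand-coded as it is here.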

I enjoyed reading the paper so much that I put together the following slide deck and presented it to an autonomous driving reading group.

Stay curious,

Daniel