A distribution-aware speculative decoding method targets rollout bottlenecks in RL post-training, with the team reporting up to 50% faster generation without reward loss. It is most relevant for builders optimizing agent training loops and inference throughput.

Rollout is the silent bottleneck in RL post-training. DAS fixes it with adaptive speculative decoding — up to 50% faster, zero degradation in reward quality.