Trajectory Models for Few-Step Diffusion (22 minute read)
This paper replaces standard diffusion denoising with conditional normalizing flows to get four-step image generation without giving up exact likelihood training. The…
This paper proposes Variational Linear Attention, an online least-squares formulation that stabilizes linear attention memory with an adaptive penalty matrix. It targets a core bottleneck in long-context transformers: reducing interference while keeping attention efficient.
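To make the least-squares framing concrete, here is a minimal sketch of reading a linear-attention memory as a ridge-regularized least-squares fit over (key, value) pairs. This is a generic illustration, not the paper's method: the function name, the batch (rather than online) solve, and the fixed scalar penalty `lam` standing in for the adaptive penalty matrix are all assumptions.

```python
import numpy as np

def ls_linear_attention(K, V, q, lam=1e-2):
    """Hypothetical sketch: fit memory M = argmin_M sum_i ||M k_i - v_i||^2
    + lam ||M||^2 over stored pairs, then read out M @ q.
    K: (n, d_k) keys, V: (n, d_v) values, q: (d_k,) query.
    lam is a fixed scalar stand-in for an adaptive penalty matrix."""
    d_k = K.shape[1]
    # Normal equations for the ridge problem: M^T = (K^T K + lam I)^{-1} K^T V
    A = K.T @ K + lam * np.eye(d_k)
    M_T = np.linalg.solve(A, K.T @ V)
    return M_T.T @ q  # read-out, shape (d_v,)
```

With a vanishing penalty and enough stored pairs, the fit recovers any linear map that generated the values; the penalty term is what damps interference when keys are nearly collinear.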
State space models are moving from a niche alternative to a credible transformer competitor, with tradeoffs that matter for long-context efficiency and scaling. The piece is a…
This paper tightens the evaluation of diffusion-based OOD detectors by controlling for backbone choice and test-time budget, then proposes sparse internal feature snapshots as a…
A tight read on the deals, papers, and policy filings worth your time. No takes, no roundups of other people's tweets.