
Researchers are uncovering a surprising link between diffusion and autoregressive models in the frequency domain, revealing new possibilities for image generation techniques and shedding light on their underlying similarities.
If you’ve been keeping up with recent advancements in machine learning, you might have noticed a growing interest in understanding how different model architectures relate to each other. One particularly intriguing insight is that diffusion models of images perform approximate autoregression in the frequency domain. This connection is not just theoretical but has practical implications for practitioners working with both diffusion and autoregressive models.
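To start, consider how each family of models generates data. Both can be seen as performing iterative refinement, but they split the work differently. An autoregressive model produces a sample one element at a time, each conditioned on the elements generated so far. A diffusion model starts from noise and denoises the whole image over many steps, so the sample goes from coarse to fine. The key insight is that, for images, this coarse-to-fine order corresponds to moving from low to high spatial frequencies, which is what makes diffusion resemble autoregression in the frequency domain.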
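In standard notation, the two forms of iterative refinement can be sketched as follows. The symbols here (x_i, N, alpha_t, sigma_t, epsilon) are conventional choices for illustration and are not taken from the original post.

```latex
% Autoregression: the joint distribution is factorised into one
% conditional per element, generated in a fixed order.
p(x) = \prod_{i=1}^{N} p(x_i \mid x_1, \dots, x_{i-1})

% Diffusion: a clean sample x_0 is corrupted with Gaussian noise at
% level t; sampling runs from high noise to low noise, denoising the
% whole sample a little at every step.
x_t = \alpha_t x_0 + \sigma_t \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, I)
```

In the first case each step commits to one new element; in the second each step refines every element at once.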
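To see why diffusion models perform approximate autoregression in the frequency domain, take a spectral perspective. Natural images concentrate most of their power at low spatial frequencies, with power falling off steadily as frequency increases, whereas Gaussian noise has a flat spectrum. Adding noise to an image therefore drowns out the high frequencies first and reaches the low frequencies only at high noise levels. Run in reverse, the denoising process settles the low frequencies first and fills in progressively higher frequencies at each step, with every step conditioned on what has already been generated. In that sense a diffusion model performs an approximate form of autoregression over frequency bands, even though it operates on pixels throughout.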
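The spectral argument can be checked numerically. The sketch below is a minimal illustration, not the code from the author's notebook: it uses a synthetic random field with a 1/f² power spectrum as a hypothetical stand-in for a natural image, and the helper names `radial_power_spectrum` and `synthetic_image` are made up for this example. It counts how many radial frequency bins of the image still rise above the flat spectrum of Gaussian noise at several noise levels.

```python
import numpy as np


def radial_power_spectrum(img):
    """Radially averaged power spectrum of a square 2-D array (DC excluded)."""
    n = img.shape[0]
    # Normalised so that white noise of variance s**2 has expected power s**2 per bin.
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2 / img.size
    yy, xx = np.indices(img.shape)
    radius = np.hypot(yy - n // 2, xx - n // 2).astype(int)
    sums = np.bincount(radius.ravel(), weights=power.ravel())
    counts = np.bincount(radius.ravel())
    # Keep integer radii from 1 up to just below the Nyquist frequency.
    return sums[1 : n // 2] / counts[1 : n // 2]


def synthetic_image(n, rng, exponent=2.0):
    """Hypothetical stand-in for a natural image: power falls off as 1/f**exponent."""
    f = np.hypot(np.fft.fftfreq(n)[:, None], np.fft.fftfreq(n)[None, :])
    f[0, 0] = 1.0  # placeholder to avoid dividing by zero at DC
    amplitude = f ** (-exponent / 2.0)
    amplitude[0, 0] = 0.0  # remove the mean
    # Shape white noise in the Fourier domain, then return to pixel space.
    img = np.fft.ifft2(amplitude * np.fft.fft2(rng.standard_normal((n, n)))).real
    return img / img.std()  # unit variance, so noise levels are comparable


rng = np.random.default_rng(0)
signal_spectrum = radial_power_spectrum(synthetic_image(128, rng))

# Under the normalisation above, Gaussian noise with standard deviation s
# has a flat expected spectrum at s**2.
sigmas = [0.5, 1.0, 2.0, 4.0, 8.0]
visible = [int(np.sum(signal_spectrum > s**2)) for s in sigmas]
for s, v in zip(sigmas, visible):
    print(f"noise std {s:>4}: {v} of {signal_spectrum.size} frequency bins above the noise floor")
```

Because the signal spectrum decays with frequency, the bins that survive at high noise levels are the low-frequency ones, so the count shrinks as the noise grows. Replacing `synthetic_image` with a real grayscale photo loaded as a float array would be the natural next experiment, though that is not verified here.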

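The same spectral analysis is not limited to images and can be repeated for sound. How well the picture carries over depends on whether the audio signal's power is concentrated at low frequencies in the way it is for natural images.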
One of the interesting aspects of this connection is the concept of an "unstable equilibrium":
The connection between diffusion models and autoregressive models in the frequency domain is more than just a theoretical curiosity. It provides a new perspective on how these models work and opens up possibilities for cross-pollination of ideas and techniques. For practitioners, this means that insights from one domain can be applied to improve or innovate in the other.
To explore this further, I’ve created a Python notebook using Google Colab. This notebook includes all the code used to produce the plots and animations discussed in this post. I encourage you to try it out, modify the parameters, and see how the models behave.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
3 September 2024