
Hugging Face's Modular Diffusers lets developers mix and match pre-built blocks to create custom AI pipelines, streamlining the process of tailoring models for specific applications without reinventing the wheel.
Hugging Face has introduced a new approach to building diffusion pipelines called Modular Diffusers. This framework allows developers to construct complex workflows by composing reusable blocks, making it easier to tailor models to specific needs without starting from scratch. This article will guide you through the basics of Modular Diffusers, how to use pre-built blocks, and how to create custom ones.
If you're familiar with Hugging Face's DiffusionPipeline, the transition to Modular Diffusers is straightforward. Here’s a quick example using the FLUX.2 Klein 4B model:
```python
import torch
from diffusers import ModularPipeline

# Define the modular pipeline (model weights are not loaded yet)
pipe = ModularPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-4B"
)

# Load the model weights and configure dtype, quantization, etc.
pipe.load_components(torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Generate an image
image = pipe(
    prompt="a serene landscape at sunset",
    num_inference_steps=4,
).images[0]
image.save("output.png")
```
This code produces the same output as a standard DiffusionPipeline, but under the hood, it uses composable blocks. Each block is a self-contained unit that can be mixed and matched to create custom pipelines.
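To make the idea of composable blocks concrete, here is a toy sketch in plain Python. It does not use the diffusers API: `ToyState`, `ToyBlock`, `ToySequential`, and the three stand-in blocks are hypothetical names invented for illustration, and the "encoder" and "denoiser" are trivial arithmetic rather than real models. The only point is the shape of the pattern: each block reads from and writes to a shared state, and a sequence of blocks is itself a block.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyState:
    """Shared values that blocks read from and write to (hypothetical)."""
    values: Dict[str, Any] = field(default_factory=dict)


class ToyBlock:
    """A self-contained step in the pipeline (hypothetical base class)."""

    def __call__(self, state: ToyState) -> ToyState:
        raise NotImplementedError


class EncodePrompt(ToyBlock):
    def __call__(self, state: ToyState) -> ToyState:
        # Stand-in for a text encoder: represent each word by its length.
        words = state.values["prompt"].split()
        state.values["embedding"] = [len(word) for word in words]
        return state


class Denoise(ToyBlock):
    def __call__(self, state: ToyState) -> ToyState:
        # Stand-in for a denoising loop: halve a number once per step.
        latent = float(sum(state.values["embedding"]))
        for _ in range(state.values["num_inference_steps"]):
            latent /= 2
        state.values["latent"] = latent
        return state


class Decode(ToyBlock):
    def __call__(self, state: ToyState) -> ToyState:
        # Stand-in for a decoder: format the latent as a string.
        state.values["image"] = f"image<{state.values['latent']:.3f}>"
        return state


class ToySequential(ToyBlock):
    """Runs child blocks in order; being a block itself, it can be nested."""

    def __init__(self, blocks: List[ToyBlock]):
        self.blocks = blocks

    def __call__(self, state: ToyState) -> ToyState:
        for block in self.blocks:
            state = block(state)
        return state


def run_toy_pipeline(prompt: str, num_inference_steps: int) -> Dict[str, Any]:
    pipeline = ToySequential([EncodePrompt(), Denoise(), Decode()])
    state = ToyState({"prompt": prompt, "num_inference_steps": num_inference_steps})
    return pipeline(state).values


result = run_toy_pipeline("a serene landscape at sunset", 4)
print(result["image"])
```

Because every block shares the same call signature, replacing `EncodePrompt` with a different encoder leaves `Denoise` and `Decode` untouched, which is the property the modular design relies on.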
One of the key features of Modular Diffusers is the ability to create and use custom blocks. To create a custom block, you subclass DiffusionBlock and implement the necessary methods. Here’s a basic example:
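```python
from diffusers import DiffusionBlock  # base class as named here; check the docs for your diffusers version

class CustomTextEncoder(DiffusionBlock):
    def __init__(self, model_name: str):
        super().__init__()
        # SomeTextEncoderModel is a placeholder for the encoder class you wrap
        self.model = SomeTextEncoderModel.from_pretrained(model_name)

    def forward(self, prompt: str):
        # Encode the prompt with the wrapped model
        return self.model.encode(prompt)
```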
Once you have your custom block, you can integrate it into a modular pipeline:
```python
import torch
from diffusers import ModularPipeline

# Create a pipeline and register the custom block under the "text_encoder" name
pipe = ModularPipeline()
pipe.add_block("text_encoder", CustomTextEncoder(model_name="your-model"))
pipe.load_components(torch_dtype=torch.bfloat16)
pipe.to("cuda")
image = pipe(
    prompt="a serene landscape at sunset",
    num_inference_steps=4,
).images[0]
image.save("output.png")
```
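The effect of registering a custom block under an existing name can be sketched without any model weights. The following toy example is plain Python and not the diffusers API: `ToyPipeline`, its `add_block` method, and the encoder and denoiser functions are hypothetical stand-ins. It assumes that registering a block under a name that already exists replaces the old block while keeping its position in the pipeline.

```python
class ToyPipeline:
    """Hypothetical pipeline that runs named blocks in insertion order."""

    def __init__(self):
        self.blocks = {}  # dicts preserve insertion order in Python 3.7+

    def add_block(self, name, block):
        # Assigning to an existing key replaces the block but keeps its slot.
        self.blocks[name] = block

    def __call__(self, **inputs):
        state = dict(inputs)
        for block in self.blocks.values():
            # Each block returns the new values it contributes to the state.
            state.update(block(state))
        return state


def default_text_encoder(state):
    # Stand-in encoder: lowercase the prompt and split it into tokens.
    return {"embedding": state["prompt"].lower().split()}


def custom_text_encoder(state):
    # Custom variant: same idea, but drops a few stop words first.
    stop_words = {"a", "at", "the"}
    tokens = state["prompt"].lower().split()
    return {"embedding": [t for t in tokens if t not in stop_words]}


def denoiser(state):
    # Stand-in denoiser: its output depends only on the "embedding" key.
    return {"latent": len(state["embedding"]) * state["num_inference_steps"]}


pipe = ToyPipeline()
pipe.add_block("text_encoder", default_text_encoder)
pipe.add_block("denoiser", denoiser)
before = pipe(prompt="a serene landscape at sunset", num_inference_steps=4)

# Swap in the custom encoder; the denoiser block is left as it was.
pipe.add_block("text_encoder", custom_text_encoder)
after = pipe(prompt="a serene landscape at sunset", num_inference_steps=4)

print(before["latent"], after["latent"])
```

The denoiser never needs to know which encoder produced the embedding; it depends only on the key it reads from the shared state.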
### Modular Repositories
Hugging Face has also introduced **Modular Repositories** to facilitate sharing and collaboration. These repositories contain pre-built blocks that you can easily integrate into your pipelines. You can find a growing collection of modular components on the Hugging Face Hub.
To use a block from a modular repository:
```python
import torch
from diffusers import ModularPipeline

# Define the pipeline and load a block from a modular repository
pipe = ModularPipeline()
pipe.load_block("text_encoder", "some-user/some-repo")

# Load other components and run inference
pipe.load_components(torch_dtype=torch.bfloat16)
pipe.to("cuda")
image = pipe(
    prompt="a serene landscape at sunset",
    num_inference_steps=4,
).images[0]
image.save("output.png")
```
The Modular Diffusers ecosystem is enriched by community contributions. Users can share their custom pipelines and blocks, making it easier for others to build on top of existing work. This collaborative approach accelerates innovation and ensures that the framework remains versatile and adaptable.
To explore community pipelines, browse the modular repositories on the Hugging Face Hub and load the blocks you need with the load_block method.

For those who prefer a visual workflow, Modular Diffusers integrates with Mellon, a node-based interface. Mellon lets you drag and drop blocks to build complex workflows without writing code.
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
6 March 2026