[docs] Modular diffusers #11931

Open: stevhliu wants to merge 12 commits into main from the modular-diffusers branch

Conversation

@stevhliu (Member) commented on Jul 15, 2025

Draft for a quickstart that should ideally briefly summarize everything a developer needs to know about Modular Diffusers without referencing other resources.

edit: expanding scope into other modular docs as well with additions like API references

(bit of a brain dump at the moment, not ready for review yet!)

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@stevhliu changed the title from "[docs] Modular diffusers quickstart" to "[docs] Modular diffusers" on Jul 16, 2025
@stevhliu force-pushed the modular-diffusers branch from e445322 to 0f5330b on July 18, 2025

@stevhliu (Member, Author) commented:

Ok, I think I have an initial version of the refactored modular docs @huggingface/diffusers!

  • Added an API section for modular here. Let me know if there is anything useful that is missing!
  • A bit undecided about the Quickstart/End-to-end doc at the moment. The Quickstart tries to cover everything through the lens of implementing Differential Diffusion. The End-to-end example focuses more on a 4-step process for implementing Differential Diffusion. I think the Quickstart is more comprehensive, but I'm not sure and would appreciate your feedback!
  • Split Guiders out into a separate doc.
  • There were many practical examples scattered throughout that I've omitted for the time being because they made the docs really long. I'm thinking of adding a "Recipes" section to showcase these practical examples.

@stevhliu force-pushed the modular-diffusers branch from 2aa3ef4 to 5ee815b on July 30, 2025
@stevhliu marked this pull request as ready for review on July 30, 2025
@stevhliu requested a review from yiyixuxu on July 30, 2025

@yiyixuxu (Collaborator) left a comment:

thanks a lot @stevhliu!


[Differential Diffusion](https://differential-diffusion.github.io/) differs from standard image-to-image in its `prepare_latents` and `denoise` blocks. All the other blocks can be reused, but you'll need to modify these two.

Create placeholder `PipelineBlocks` for `prepare_latents` and `denoise` by copying and modifying the existing ones.

Collaborator comment:

I think the flow here is we start with rewriting these two special blocks and then assemble them.
If we are going with this flow, we don't need to create placeholders here (my initial process was to create the complete structure first, hence the placeholders; I found it more efficient that way but less intuitive, so what you did here is more beginner friendly, so let's go with this).

We can just jump to the prepare_latents section, and in the denoise section, first go over the block's structure, then how to rewrite the custom sub-block, and finally assemble everything back together into a custom denoise before assembling the entire pipeline.

Member comment:

I agree. But not opposed to showing the denoise block structure early for illustration (without the placeholders)

```py
dd_pipeline = dd_auto_blocks.init_pipeline("YiYiXu/modular-demo-auto", collection="diffdiff")
dd_pipeline.load_default_components(torch_dtype=torch.float16)
```

Collaborator comment:

let's link to the full implementation for this example here https://huggingface.co/YiYiXu/modular-diffdiff/blob/main/block.py

If a variable is modified in `block_state` but not declared as an `intermediate_outputs`, it won't be added to [`~modular_pipelines.PipelineState`].

Collaborator comment:

Suggested change
If a variable is modified in `block_state` but not declared as an `intermediate_outputs`, it won't be added to [`~modular_pipelines.PipelineState`].
If a new variable is added in `block_state` but not declared as an `intermediate_outputs`, it won't be added to [`~modular_pipelines.PipelineState`].

a bit confusing here I think, lol but it is what it is

@DN6 is doing some refactoring in #11969; we will be able to simplify things a lot and make this section of the doc more intuitive in a future PR
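
For readers following this thread, here is a minimal sketch of the behavior being discussed. The block, its parameters, and the imports are illustrative assumptions (it presumes `PipelineBlock`, `InputParam`, and `OutputParam` are exposed under `diffusers.modular_pipelines`, plus the `get_block_state`/`set_block_state` pattern quoted elsewhere in this review), not the doc's actual example.

```py
import torch
from diffusers.modular_pipelines import InputParam, OutputParam, PipelineBlock


class ScaleLatentsStep(PipelineBlock):
    # illustrative block: scales the latents by a user-provided factor
    model_name = "stable-diffusion-xl"

    @property
    def inputs(self):
        return [InputParam("scale", default=1.0)]

    @property
    def intermediate_inputs(self):
        return [InputParam("latents")]

    @property
    def intermediate_outputs(self):
        # without this declaration, `scaled_latents` would stay local to
        # block_state and never be written back to PipelineState
        return [OutputParam("scaled_latents", type_hint=torch.Tensor)]

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        block_state.scaled_latents = block_state.latents * block_state.scale
        self.set_block_state(state, block_state)
        return components, state
```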

@@ -12,83 +12,63 @@ specific language governing permissions and limitations under the License.

# PipelineBlock

<Tip warning={true}>

Collaborator comment:

if we don't want to show a complete example of a pipeline block here, maybe we can link to some source code? https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/stable_diffusion_xl/encoders.py

* No need to call `self.get_block_state()` or `self.set_block_state()`
## Loop blocks

A loop block is a [`~modular_pipelines.PipelineBlock`], but the `__call__` method behaves differently.

Collaborator comment:

can we point out that the loop block is used to compose the self.loop_step in the LoopWrapper example above?


Finally, assemble your loop by adding the block(s) to the wrapper:
Use the [`~modular_pipelines.LoopSequentialPipelineBlocks.from_blocks_dict`] method to add the loop block to the loop wrapper to create [`~modular_pipelines.LoopSequentialPipelineBlocks`].

Collaborator comment:

maybe we can mention that this loop takes an initial value for x and adds 1 each iteration;
and in the next example, it will add 2 at each iteration -
just so that it's easier to understand how the loop wrapper and loop block work together
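
For context, a rough sketch of the kind of toy example being referenced: the wrapper owns the loop and calls `self.loop_step` once per iteration, and `from_blocks_dict` composes the loop block into it. Property names such as `loop_inputs` and the exact `loop_step` signature are approximations of the interface described in the doc, not confirmed API.

```py
from diffusers.modular_pipelines import (
    InputParam,
    LoopSequentialPipelineBlocks,
    OutputParam,
    PipelineBlock,
)


class LoopWrapper(LoopSequentialPipelineBlocks):
    # the wrapper defines the loop itself and delegates each iteration to self.loop_step
    model_name = "test"

    @property
    def loop_inputs(self):
        return [InputParam("num_steps", default=5), InputParam("x", default=0)]

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        for i in range(block_state.num_steps):
            # loop_step runs the composed loop block(s) for this iteration
            components, block_state = self.loop_step(components, block_state, i=i)
        self.set_block_state(state, block_state)
        return components, state


class AddOneBlock(PipelineBlock):
    # the loop block works on block_state directly; no get/set_block_state needed
    model_name = "test"

    @property
    def intermediate_outputs(self):
        return [OutputParam("x")]

    def __call__(self, components, block_state, i):
        block_state.x += 1  # starts from the initial x and adds 1 each iteration
        return components, block_state


# compose the loop block into the wrapper
loop = LoopWrapper.from_blocks_dict({"add_one": AddOneBlock})
```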


🧪 **Experimental Feature**: Modular Diffusers is an experimental feature we are actively developing. The API may be subject to breaking changes.
The main difference is to include an expected `output` argument in the pipeline.

Collaborator comment:

for running the pipeline, the main difference is just the `output` argument - we are eliminating this difference soon in #11944

so the main difference is really from the loading (a modular pipeline does not load components by default); since the code example here also includes loading, I think we should briefly mention it here as well.
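
To illustrate both points together, a hedged sketch: it assumes a `blocks` object assembled as in the guide, reuses the `init_pipeline`/`load_default_components` calls quoted above, and the exact return shape of the call may differ.

```py
import torch

# a modular pipeline does not load components by default: it is created from
# blocks first, and the components are loaded in a separate, explicit step
pipeline = blocks.init_pipeline("YiYiXu/modular-demo-auto")
pipeline.load_default_components(torch_dtype=torch.float16)

# for running it, the remaining difference is naming which output to return
image = pipeline(prompt="an astronaut riding a horse on mars", output="images")[0]
```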


When we create a `SequentialPipelineBlocks` from this preset, it instantiates each block class into actual block objects. Its `sub_blocks` attribute now contains these instantiated objects:
## Adding blocks

Collaborator comment:

can we add a transition from ModularPipeline to blocks here?

it feels like something is missing: it is a bit sudden and unclear why we are talking about adding blocks right after introducing the ModularPipeline

we can just say that we need blocks to create a pipeline, and that you can write your own blocks or use the official ones from diffusers (the case here)

and we really need to talk about how to create these ready-to-use blocks (if we want to talk about it in a different section, we can add a link here)
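
As one possible transition, something along these lines could bridge ModularPipeline and blocks (a sketch; it assumes `SequentialPipelineBlocks.from_blocks_dict` and that the `IMAGE2IMAGE_BLOCKS` preset is importable from the SDXL modular module, and the repo id is just an example):

```py
from diffusers.modular_pipelines import SequentialPipelineBlocks
from diffusers.modular_pipelines.stable_diffusion_xl import IMAGE2IMAGE_BLOCKS

# a ModularPipeline is built from blocks: blocks you write yourself, or
# ready-made ones from diffusers, such as this image-to-image preset
blocks = SequentialPipelineBlocks.from_blocks_dict(IMAGE2IMAGE_BLOCKS)
print(blocks.sub_blocks)  # the preset's block classes, now instantiated

# the assembled blocks are then turned into a runnable pipeline
pipeline = blocks.init_pipeline("YiYiXu/modular-demo-auto")
```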


## Loading custom guiders

Guiders that are already saved on the Hub with a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.

Collaborator comment:

Suggested change
Guiders that are already saved on the Hub with a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.
Guiders that are already saved on the Hub and listed in a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.

@pcuenca (Member) left a comment:

Looking good! I just went through the Overview and Quickstart for now.

In the overview section, I'm missing a bit of context about what this provides. We could perhaps state that this is a way to implement new diffusion pipelines based on existing library components, making it possible for the community to use our model with diffusers without having to open a PR to diffusers. (Assuming that's the overall goal).


<Tip warning={true}>
> [!WARNING]
> ⚠︎ Modular Diffusers is still under active development and it's API may change.

Member comment:

Suggested change
> ⚠︎ Modular Diffusers is still under active development and it's API may change.
> ⚠︎ Modular Diffusers is still under active development and its API may change.


**Assemble Like LEGO®**: You can mix and match between blocks in flexible ways. This allows you to write dedicated blocks unique to specific workflows, and then assemble different blocks into a pipeline that can be used more conveniently for multiple workflows.
- A [quickstart](./quickstart) start for implementing an example workflow with Modular Diffusers.

Member comment:

Suggested change
- A [quickstart](./quickstart) start for implementing an example workflow with Modular Diffusers.
- A [quickstart](./quickstart) guide for implementing an example workflow with Modular Diffusers.


## ModularPipeline

Member comment:

From here on, it doesn't seem to match the structure in the toc tree.

- [`LoopSequentialPipelineBlocks`] is a multi-block that runs iteratively and is designed for iterative workflows.
- [`AutoPipelineBlocks`] is a collection of blocks for different workflows and it selects which block to run based on the input. It is designed to conveniently package multiple workflows into a single pipeline.

[Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow. Start with the `IMAGE2IMAGE_BLOCKS` preset, a collection of `ModularPipelineBlocks` for image-to-image generation.

Member comment:

Suggested change
[Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow. Start with the `IMAGE2IMAGE_BLOCKS` preset, a collection of `ModularPipelineBlocks` for image-to-image generation.
This may still sound too abstract, but we can usually get started with block _presets_ provided by modular diffusers. In this case, [Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow, so we can adopt the `IMAGE2IMAGE_BLOCKS` preset, a collection of `ModularPipelineBlocks` for image-to-image generation.

Comment on lines +32 to +41
```py
IMAGE2IMAGE_BLOCKS = InsertableDict([
    ("text_encoder", StableDiffusionXLTextEncoderStep),
    ("image_encoder", StableDiffusionXLVaeEncoderStep),
    ("input", StableDiffusionXLInputStep),
    ("set_timesteps", StableDiffusionXLImg2ImgSetTimestepsStep),
    ("prepare_latents", StableDiffusionXLImg2ImgPrepareLatentsStep),
    ("prepare_add_cond", StableDiffusionXLImg2ImgPrepareAdditionalConditioningStep),
    ("denoise", StableDiffusionXLDenoiseStep),
    ("decode", StableDiffusionXLDecodeStep)
])
```

Member comment:

I don't understand this, are we overwriting the preset definition? I think we may just want to print IMAGE2IMAGE_BLOCKS to see what it contains:

```py
>>> print(IMAGE2IMAGE_BLOCKS)
InsertableDict([
  0: ('text_encoder', <class 'diffusers.modular_pipelines.stable_diffusion_xl.encoders.StableDiffusionXLTextEncoderStep'>),
  1: ('image_encoder', <class 'diffusers.modular_pipelines.stable_diffusion_xl.encoders.StableDiffusionXLVaeEncoderStep'>),
  2: ('input', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLInputStep'>),
  3: ('set_timesteps', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgSetTimestepsStep'>),
  4: ('prepare_latents', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgPrepareLatentsStep'>),
  5: ('prepare_add_cond', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgPrepareAdditionalConditioningStep'>),
  6: ('denoise', <class 'diffusers.modular_pipelines.stable_diffusion_xl.denoise.StableDiffusionXLDenoiseStep'>),
  7: ('decode', <class 'diffusers.modular_pipelines.stable_diffusion_xl.decoders.StableDiffusionXLDecodeStep'>)
])
```

Member comment:

Also some brief commentary might be helpful, like the denoise block being made with a loop block. This comes later in the guide, but we could still anticipate a bit so the pieces start to click.


### IP-Adapter

Stable Diffusion XL already has a preset IP-Adapter block that you can use and doesn't require any changes to the existing Differential Diffusion pipeline.

Member comment:

Suggested change
Stable Diffusion XL already has a preset IP-Adapter block that you can use and doesn't require any changes to the existing Differential Diffusion pipeline.
Stable Diffusion XL already has an IP-Adapter block preset that you can use, and it doesn't require any changes to work with the existing Differential Diffusion pipeline we created.

Stable Diffusion XL already has a preset IP-Adapter block that you can use and doesn't require any changes to the existing Differential Diffusion pipeline.

```py
from diffusers.modular_pipelines.stable_diffusion_xl.encoders import StableDiffusionXLAutoIPAdapterStep
```

Member comment:

We are not using a "preset" in the same sense as in the previous instances, just a block.
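
For what that could look like in practice, a tentative sketch; it assumes a `dd_blocks` object assembled earlier in the guide and an `InsertableDict`-style `insert(name, block, index)` method on `sub_blocks`, both of which are assumptions rather than confirmed API:

```py
from diffusers.modular_pipelines.stable_diffusion_xl.encoders import StableDiffusionXLAutoIPAdapterStep

# drop the ready-made IP-Adapter block in front of the existing diff-diff blocks;
# the Differential Diffusion blocks themselves stay untouched
dd_blocks.sub_blocks.insert("ip_adapter", StableDiffusionXLAutoIPAdapterStep(), 0)
```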


### AutoPipelineBlocks

The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the Differential Diffusion.

Member comment:

Suggested change
The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the Differential Diffusion.
The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the standard Differential Diffusion implementation.
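
For illustration, the selection mechanism could be sketched like this: trigger inputs decide which sub-block runs, and `None` marks the default branch. The `SDXLDiffDiff*` class names are placeholders for the custom denoise blocks written earlier in the guide (see the linked block.py), not guaranteed names.

```py
from diffusers.modular_pipelines import AutoPipelineBlocks


class DiffDiffAutoDenoiseStep(AutoPipelineBlocks):
    # which sub-block runs is chosen from the inputs that were passed:
    # control_image -> ControlNet denoise, otherwise the plain diff-diff denoise
    block_classes = [SDXLDiffDiffControlNetDenoiseStep, SDXLDiffDiffDenoiseStep]
    block_names = ["controlnet_denoise", "denoise"]
    block_trigger_inputs = ["control_image", None]
```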


### ControlNet

Stable Diffusion XL already has a preset ControlNet block that can readily be used.

Member comment:

Same comments as in the ip-adapter case.


```py
components = ComponentsManager()

diffdiff_pipeline = ModularPipeline.from_pretrained("YiYiXu/modular-diffdiff-0704", trust_remote_code=True, components_manager=components, collection="diffdiff")
```

Member comment:

Suggested change
diffdiff_pipeline = ModularPipeline.from_pretrained("YiYiXu/modular-diffdiff-0704", trust_remote_code=True, components_manager=components, collection="diffdiff")
diffdiff_pipeline = ModularPipeline.from_pretrained(
    "YiYiXu/modular-diffdiff-0704",
    trust_remote_code=True,
    components_manager=components,
    collection="diffdiff"
)

For more clarity. Also, a comment about trust_remote_code could be helpful.
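
Something like the following might address both points; it assumes `ComponentsManager` and `ModularPipeline` are importable from `diffusers.modular_pipelines` as elsewhere in the doc:

```py
from diffusers.modular_pipelines import ComponentsManager, ModularPipeline

components = ComponentsManager()

diffdiff_pipeline = ModularPipeline.from_pretrained(
    "YiYiXu/modular-diffdiff-0704",
    # trust_remote_code executes custom block code downloaded from this Hub repo;
    # only enable it for repositories whose code you have reviewed and trust
    trust_remote_code=True,
    components_manager=components,
    collection="diffdiff",
)
```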
