[docs] Modular diffusers #11931
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from e445322 to 0f5330b
Ok, I think I have an initial version of the refactored modular docs @huggingface/diffusers!
Force-pushed from 2aa3ef4 to 5ee815b
thanks a lot @stevhliu!
[Differential Diffusion](https://differential-diffusion.github.io/) differs from standard image-to-image in its `prepare_latents` and `denoise` blocks. All the other blocks can be reused, but you'll need to modify these two.

Create placeholder `PipelineBlocks` for `prepare_latents` and `denoise` by copying and modifying the existing ones.
I think the flow here is that we start by rewriting these two special blocks and then assemble them.

If we go with this flow, we don't need to create placeholders here. (My initial process was to create the complete structure first, hence the placeholders; I found it more efficient that way but less intuitive. What you did here is more beginner friendly, so let's go with this.)

We can just jump to the `prepare_latents` section. In the `denoise` section, first go over the block's structure, then how to rewrite the custom sub-block, and finally assemble everything back together into a custom `denoise` before assembling the entire pipeline.
I agree. But I'm not opposed to showing the `denoise` block structure early for illustration (without the placeholders).
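A rough sketch of the flow being described, assuming the preset can be copied and its entries reassigned like a dict; the `SDXLDiffDiff*` class names are placeholders for the rewritten blocks, and the exact import paths may differ:

```py
from diffusers.modular_pipelines import SequentialPipelineBlocks
from diffusers.modular_pipelines.stable_diffusion_xl import IMAGE2IMAGE_BLOCKS

# Start from the image-to-image preset and swap in the two rewritten blocks.
# SDXLDiffDiffPrepareLatentsStep / SDXLDiffDiffDenoiseStep are placeholder names
# for the custom blocks written in the sections above.
DIFFDIFF_BLOCKS = IMAGE2IMAGE_BLOCKS.copy()
DIFFDIFF_BLOCKS["prepare_latents"] = SDXLDiffDiffPrepareLatentsStep
DIFFDIFF_BLOCKS["denoise"] = SDXLDiffDiffDenoiseStep

# Assemble the entire pipeline from the modified preset.
dd_blocks = SequentialPipelineBlocks.from_blocks_dict(DIFFDIFF_BLOCKS)
```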
```py
dd_pipeline = dd_auto_blocks.init_pipeline("YiYiXu/modular-demo-auto", collection="diffdiff")
dd_pipeline.load_default_components(torch_dtype=torch.float16)
```
Let's link to the full implementation for this example here: https://huggingface.co/YiYiXu/modular-diffdiff/blob/main/block.py
Suggested change:

```diff
- If a variable is modified in `block_state` but not declared as an `intermediate_outputs`, it won't be added to [`~modular_pipelines.PipelineState`].
+ If a new variable is added in `block_state` but not declared as an `intermediate_outputs`, it won't be added to [`~modular_pipelines.PipelineState`].
```
A bit confusing here I think, lol, but it is what it is.

@DN6 is doing some refactoring in #11969; we will be able to simplify things a lot and make this section of the docs more intuitive in a future PR.
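To illustrate the rule for readers, here is a standalone toy (not the diffusers implementation): only variables declared as `intermediate_outputs` are copied from the block's local state back into the shared pipeline state.

```py
from types import SimpleNamespace

pipeline_state = {"latents": "old_latents"}
declared_outputs = ["latents"]  # the block's intermediate_outputs

# the block works on a local copy of the state
block_state = SimpleNamespace(**pipeline_state)
block_state.latents = "new_latents"       # existing, declared -> propagated
block_state.scratch = "temporary_value"   # new, undeclared -> stays local to the block

# only declared outputs are written back
for name in declared_outputs:
    pipeline_state[name] = getattr(block_state, name)

print(pipeline_state)  # {'latents': 'new_latents'} -- no 'scratch' key
```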
# PipelineBlock

<Tip warning={true}>
If we don't want to show a complete example of a pipeline block here, maybe we can link to some source code? https://github.com/huggingface/diffusers/blob/main/src/diffusers/modular_pipelines/stable_diffusion_xl/encoders.py
* No need to call `self.get_block_state()` or `self.set_block_state()`

## Loop blocks

A loop block is a [`~modular_pipelines.PipelineBlock`], but the `__call__` method behaves differently.
Can we point out that the loop block is used to compose the `self.loop_step` in the `LoopWrapper` example above?
Finally, assemble your loop by adding the block(s) to the wrapper:

Use the [`~modular_pipelines.LoopSequentialPipelineBlocks.from_blocks_dict`] method to add the loop block to the loop wrapper to create [`~modular_pipelines.LoopSequentialPipelineBlocks`].
Maybe we can mention that this loop takes an initial value for `x` and adds 1 at each iteration, and that in the next example it adds 2 at each iteration, just so that it's easier to understand how the loop wrapper and loop block work together.
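Something along these lines, as a standalone toy rather than the actual `LoopSequentialPipelineBlocks` API: the wrapper drives the iterations and each loop block contributes the per-iteration step. The `AddOneBlock`, `AddTwoBlock`, and `LoopWrapper` names are illustrative only.

```py
# Toy model of how a loop wrapper composes loop blocks: the wrapper owns the
# loop, and each block's step is called once per iteration on a shared state.
class AddOneBlock:
    def loop_step(self, state):
        state["x"] += 1  # this block adds 1 at each iteration

class AddTwoBlock:
    def loop_step(self, state):
        state["x"] += 2  # the next example's block adds 2 at each iteration

class LoopWrapper:
    def __init__(self, blocks, num_iterations):
        self.blocks = blocks
        self.num_iterations = num_iterations

    def __call__(self, state):
        for _ in range(self.num_iterations):
            for block in self.blocks.values():
                block.loop_step(state)
        return state

state = LoopWrapper({"add_one": AddOneBlock()}, num_iterations=3)({"x": 0})
print(state)  # {'x': 3} -- swap in AddTwoBlock to get {'x': 6}
```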
🧪 **Experimental Feature**: Modular Diffusers is an experimental feature we are actively developing. The API may be subject to breaking changes.

The main difference is to include an expected `output` argument in the pipeline.
For running the pipeline, the main difference is just the `output` argument, and we are eliminating this difference soon in #11944. So the main difference is really in the loading (a modular pipeline does not load components by default). Since the code example here also includes loading, I think we should briefly mention it here as well.
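A minimal sketch of what that brief mention could look like; `blocks` is assumed to have been built earlier, the repo id is reused from the examples in this PR, and the `"images"` output name is an assumption for illustration.

```py
import torch

# unlike a standard DiffusionPipeline, a ModularPipeline does not load
# components by default, so load them explicitly
pipeline = blocks.init_pipeline("YiYiXu/modular-demo-auto")
pipeline.load_default_components(torch_dtype=torch.float16)

# running it requires saying which output you want back
images = pipeline(prompt="an astronaut riding a horse on mars", output="images")
```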
When we create a `SequentialPipelineBlocks` from this preset, it instantiates each block class into actual block objects. Its `sub_blocks` attribute now contains these instantiated objects:

## Adding blocks
Can we add a transition from `ModularPipeline` to blocks here? It feels like something is missing: it is a bit sudden and unclear why we are talking about adding blocks right after introducing the `ModularPipeline`. We can just say that we need blocks to create a pipeline, and that you can write your own blocks or use the official ones from diffusers (the case here). And we really need to talk about how to create these ready-to-use blocks (if we want to cover that in a different section, we can add a link here).
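For example, the transition could be sketched with names already used elsewhere in this guide; the exact import path for the preset is an assumption.

```py
from diffusers.modular_pipelines import SequentialPipelineBlocks
from diffusers.modular_pipelines.stable_diffusion_xl import IMAGE2IMAGE_BLOCKS  # official, ready-to-use blocks

# a ModularPipeline is created *from* blocks: write your own, or start from a preset
blocks = SequentialPipelineBlocks.from_blocks_dict(IMAGE2IMAGE_BLOCKS)
pipeline = blocks.init_pipeline("YiYiXu/modular-demo-auto")  # repo id reused from the examples in this PR
```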
## Loading custom guiders

Guiders that are already saved on the Hub with a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.
Suggested change:

```diff
- Guiders that are already saved on the Hub with a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.
+ Guiders that are already saved on the Hub and listed in a `modular_model_index.json` file are considered a `from_pretrained` component now instead of a `from_config` component.
```
Looking good! I just went through the Overview and Quickstart for now.
In the overview section, I'm missing a bit of context about what this provides. We could perhaps state that this is a way to implement new diffusion pipelines based on existing library components, making it possible for the community to use our model with diffusers without having to open a PR to diffusers. (Assuming that's the overall goal).
<Tip warning={true}>
> [!WARNING]
> ⚠︎ Modular Diffusers is still under active development and it's API may change.
Suggested change:

```diff
- > ⚠︎ Modular Diffusers is still under active development and it's API may change.
+ > ⚠︎ Modular Diffusers is still under active development and its API may change.
```
**Assemble Like LEGO®**: You can mix and match between blocks in flexible ways. This allows you to write dedicated blocks unique to specific workflows, and then assemble different blocks into a pipeline that can be used more conveniently for multiple workflows.

- A [quickstart](./quickstart) start for implementing an example workflow with Modular Diffusers.
Suggested change:

```diff
- - A [quickstart](./quickstart) start for implementing an example workflow with Modular Diffusers.
+ - A [quickstart](./quickstart) guide for implementing an example workflow with Modular Diffusers.
```
## ModularPipeline
From here on, it doesn't seem to match the structure in the toc tree.
- [`LoopSequentialPipelineBlocks`] is a multi-block that runs iteratively and is designed for iterative workflows.
- [`AutoPipelineBlocks`] is a collection of blocks for different workflows and it selects which block to run based on the input. It is designed to conveniently package multiple workflows into a single pipeline.

[Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow. Start with the `IMAGE2IMAGE_BLOCKS` preset, a collection of `ModularPipelineBlocks` for image-to-image generation.
Suggested change:

```diff
- [Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow. Start with the `IMAGE2IMAGE_BLOCKS` preset, a collection of `ModularPipelineBlocks` for image-to-image generation.
+ This may still sound too abstract, but we can usually get started with block _presets_ provided by modular diffusers. In this case, [Differential Diffusion](https://differential-diffusion.github.io/) is an image-to-image workflow, so we can adopt the `IMAGE2IMAGE_BLOCKS` preset – a collection of `ModularPipelineBlocks` for image-to-image generation.
```
```py
IMAGE2IMAGE_BLOCKS = InsertableDict([
    ("text_encoder", StableDiffusionXLTextEncoderStep),
    ("image_encoder", StableDiffusionXLVaeEncoderStep),
    ("input", StableDiffusionXLInputStep),
    ("set_timesteps", StableDiffusionXLImg2ImgSetTimestepsStep),
    ("prepare_latents", StableDiffusionXLImg2ImgPrepareLatentsStep),
    ("prepare_add_cond", StableDiffusionXLImg2ImgPrepareAdditionalConditioningStep),
    ("denoise", StableDiffusionXLDenoiseStep),
    ("decode", StableDiffusionXLDecodeStep)
])
```
I don't understand this, are we overwriting the preset definition? I think we may just want to print `IMAGE2IMAGE_BLOCKS` to see what it contains:

```py
>>> print(IMAGE2IMAGE_BLOCKS)
InsertableDict([
  0: ('text_encoder', <class 'diffusers.modular_pipelines.stable_diffusion_xl.encoders.StableDiffusionXLTextEncoderStep'>),
  1: ('image_encoder', <class 'diffusers.modular_pipelines.stable_diffusion_xl.encoders.StableDiffusionXLVaeEncoderStep'>),
  2: ('input', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLInputStep'>),
  3: ('set_timesteps', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgSetTimestepsStep'>),
  4: ('prepare_latents', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgPrepareLatentsStep'>),
  5: ('prepare_add_cond', <class 'diffusers.modular_pipelines.stable_diffusion_xl.before_denoise.StableDiffusionXLImg2ImgPrepareAdditionalConditioningStep'>),
  6: ('denoise', <class 'diffusers.modular_pipelines.stable_diffusion_xl.denoise.StableDiffusionXLDenoiseStep'>),
  7: ('decode', <class 'diffusers.modular_pipelines.stable_diffusion_xl.decoders.StableDiffusionXLDecodeStep'>)
])
```
Also, some brief commentary might be helpful, like the `denoise` block being made with a loop block. This comes later in the guide, but we could still anticipate a bit so the pieces start to click.
### IP-Adapter

Stable Diffusion XL already has a preset IP-Adapter block that you can use and doesn't require any changes to the existing Differential Diffusion pipeline.
Suggested change:

```diff
- Stable Diffusion XL already has a preset IP-Adapter block that you can use and doesn't require any changes to the existing Differential Diffusion pipeline.
+ Stable Diffusion XL already has an IP-Adapter block preset that you can use, and it doesn't require any changes to work with the existing Differential Diffusion pipeline we created.
```
```py
from diffusers.modular_pipelines.stable_diffusion_xl.encoders import StableDiffusionXLAutoIPAdapterStep
```
We are not using a "preset" in the same sense as in the previous instances, just a block.
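Roughly what that could look like, assuming the blocks collection built earlier is an `InsertableDict` with an `insert(key, value, index)` method; `DIFFDIFF_BLOCKS` is the placeholder name used in the earlier sketch.

```py
from diffusers.modular_pipelines.stable_diffusion_xl.encoders import StableDiffusionXLAutoIPAdapterStep

# insert the IP-Adapter block (a single block, not a preset) at the front of
# the existing Differential Diffusion blocks
DIFFDIFF_BLOCKS.insert("ip_adapter", StableDiffusionXLAutoIPAdapterStep, 0)
```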
### AutoPipelineBlocks

The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the Differential Diffusion.
Suggested change:

```diff
- The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the Differential Diffusion.
+ The Differential Diffusion, IP-Adapter, and ControlNet workflows can be bundled into a single [`ModularPipeline`] by using [`AutoPipelineBlocks`]. This allows automatically selecting which sub-blocks to run based on the inputs like `control_image` or `ip_adapter_image`. If none of these inputs are passed, then it defaults to the standard Differential Diffusion implementation.
```
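A rough sketch of the bundling idea; the `block_classes`/`block_names`/`block_trigger_inputs` attribute names and the `SDXLDiffDiff*` class names are assumptions for illustration, not copied from the source.

```py
from diffusers.modular_pipelines import AutoPipelineBlocks

class DiffDiffAutoDenoiseStep(AutoPipelineBlocks):
    # candidate sub-blocks, matched against the trigger inputs below
    block_classes = [SDXLDiffDiffControlNetDenoiseStep, SDXLDiffDiffDenoiseStep]
    block_names = ["controlnet_denoise", "denoise"]
    # run the ControlNet variant only when `control_image` is passed;
    # None marks the default (standard Differential Diffusion) block
    block_trigger_inputs = ["control_image", None]
```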
### ControlNet

Stable Diffusion XL already has a preset ControlNet block that can readily be used.
Same comments as in the IP-Adapter case.
```py
components = ComponentsManager()

diffdiff_pipeline = ModularPipeline.from_pretrained("YiYiXu/modular-diffdiff-0704", trust_remote_code=True, components_manager=components, collection="diffdiff")
```
Suggested change:

```diff
- diffdiff_pipeline = ModularPipeline.from_pretrained("YiYiXu/modular-diffdiff-0704", trust_remote_code=True, components_manager=components, collection="diffdiff")
+ diffdiff_pipeline = ModularPipeline.from_pretrained(
+     "YiYiXu/modular-diffdiff-0704",
+     trust_remote_code=True,
+     components_manager=components,
+     collection="diffdiff"
+ )
```
For more clarity. Also, a comment about `trust_remote_code` could be helpful.
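For instance, something like the following; the comment reflects the general meaning of `trust_remote_code` on the Hub.

```py
diffdiff_pipeline = ModularPipeline.from_pretrained(
    "YiYiXu/modular-diffdiff-0704",
    # downloads and executes the custom block code (block.py) from the Hub repo,
    # so only enable this for repositories you trust
    trust_remote_code=True,
    components_manager=components,
    collection="diffdiff",
)
```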
Draft for a quickstart that should ideally briefly summarize everything a developer needs to know about Modular Diffusers without referencing other resources.
edit: expanding scope into other modular docs as well with additions like API references
(bit of a brain dump at the moment, not ready for review yet!)