Commit 478df93

Authored by sayakpaul and stevhliu
[docs] clarify the mapping between Transformer2DModel and finegrained variants. (#11947)
* clarify the mapping between Transformer2DModel and finegrained variants.
* Update src/diffusers/pipelines/dit/pipeline_dit.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
1 parent 18c8f10 commit 478df93

3 files changed, +26 -2 lines changed

src/diffusers/pipelines/dit/pipeline_dit.py

Lines changed: 3 additions & 1 deletion
@@ -46,7 +46,9 @@ class DiTPipeline(DiffusionPipeline):
 
     Parameters:
         transformer ([`DiTTransformer2DModel`]):
-            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
+            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         vae ([`AutoencoderKL`]):
             Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
         scheduler ([`DDIMScheduler`]):

src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Lines changed: 3 additions & 1 deletion
@@ -256,7 +256,9 @@ class PixArtAlphaPipeline(DiffusionPipeline):
             Tokenizer of class
             [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
         transformer ([`PixArtTransformer2DModel`]):
-            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         scheduler ([`SchedulerMixin`]):
             A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """

src/diffusers/pipelines/pixart_alpha/pipeline_pixart_sigma.py

Lines changed: 20 additions & 0 deletions
@@ -185,6 +185,26 @@ def retrieve_timesteps(
 class PixArtSigmaPipeline(DiffusionPipeline):
     r"""
     Pipeline for text-to-image generation using PixArt-Sigma.
+
+    This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
+    library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
+
+    Args:
+        vae ([`AutoencoderKL`]):
+            Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
+        text_encoder ([`T5EncoderModel`]):
+            Frozen text-encoder. PixArt-Alpha uses
+            [T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
+            [t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
+        tokenizer (`T5Tokenizer`):
+            Tokenizer of class
+            [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
+        transformer ([`PixArtTransformer2DModel`]):
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
+        scheduler ([`SchedulerMixin`]):
+            A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """
 
     bad_punct_regex = re.compile(
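
The docstrings above say the checkpoints were published with `Transformer2DModel` in `transformer/config.json`, while the pipelines document `DiTTransformer2DModel` and `PixArtTransformer2DModel`. The following is a minimal sketch of how such a legacy name could be resolved to a fine-grained class. It is not the library's implementation: the helper `resolve_transformer_class`, the `LEGACY_REMAP` table, and the assumption that the config's `norm_type` value (`ada_norm_zero` for DiT, `ada_norm_single` for PixArt) selects the variant are all illustrative and not confirmed by this commit.

```python
import json

# Hypothetical remapping table: legacy class name -> {norm_type: fine-grained class}.
# The norm_type values are assumptions about the published configs.
LEGACY_REMAP = {
    "Transformer2DModel": {
        "ada_norm_zero": "DiTTransformer2DModel",
        "ada_norm_single": "PixArtTransformer2DModel",
    }
}


def resolve_transformer_class(config: dict) -> str:
    """Return the fine-grained class name for a transformer config dict.

    Falls back to the published `_class_name` when no remapping applies.
    """
    published = config["_class_name"]
    variants = LEGACY_REMAP.get(published)
    if variants is None:
        # Not a legacy name, so keep what the config says.
        return published
    return variants.get(config.get("norm_type"), published)


# Stand-ins for the relevant fields of transformer/config.json (values assumed).
dit_config = json.loads('{"_class_name": "Transformer2DModel", "norm_type": "ada_norm_zero"}')
pixart_config = json.loads('{"_class_name": "Transformer2DModel", "norm_type": "ada_norm_single"}')

print(resolve_transformer_class(dit_config))     # DiTTransformer2DModel
print(resolve_transformer_class(pixart_config))  # PixArtTransformer2DModel
```

Under these assumptions, a config that names `Transformer2DModel` still resolves to the class the pipeline documents, which is why the docstrings say the mismatch can be ignored.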
