3 files changed: +26 -2 lines changed

@@ -46,7 +46,9 @@ class DiTPipeline(DiffusionPipeline):
 
     Parameters:
         transformer ([`DiTTransformer2DModel`]):
-            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents.
+            A class conditioned `DiTTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/facebook/DiT-XL-2-256/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         vae ([`AutoencoderKL`]):
             Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
         scheduler ([`DDIMScheduler`]):
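
For context, a minimal sketch (not part of this diff) of why the config mismatch is harmless when loading the DiT checkpoint referenced above; assumes the facebook/DiT-XL-2-256 repo, a recent diffusers release, and a CUDA device:

```python
import torch
from diffusers import DiTPipeline

# The checkpoint config still says "Transformer2DModel", but the pipeline loads the
# weights into the class-conditional DiTTransformer2DModel; the name difference is
# exactly the documented, ignorable mismatch.
pipe = DiTPipeline.from_pretrained("facebook/DiT-XL-2-256", torch_dtype=torch.float16).to("cuda")
print(type(pipe.transformer).__name__)  # DiTTransformer2DModel

# Class-conditional generation is unchanged.
class_ids = pipe.get_label_ids(["white shark"])
image = pipe(class_labels=class_ids, num_inference_steps=25).images[0]
image.save("dit_sample.png")
```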

@@ -256,7 +256,9 @@ class PixArtAlphaPipeline(DiffusionPipeline):
             Tokenizer of class
             [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
         transformer ([`PixArtTransformer2DModel`]):
-            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents.
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
         scheduler ([`SchedulerMixin`]):
             A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """

@@ -185,6 +185,26 @@ def retrieve_timesteps(
 class PixArtSigmaPipeline(DiffusionPipeline):
     r"""
     Pipeline for text-to-image generation using PixArt-Sigma.
+
+    This model inherits from [`DiffusionPipeline`]. Check the superclass documentation for the generic methods the
+    library implements for all the pipelines (such as downloading or saving, running on a particular device, etc.)
+
+    Args:
+        vae ([`AutoencoderKL`]):
+            Variational Auto-Encoder (VAE) Model to encode and decode images to and from latent representations.
+        text_encoder ([`T5EncoderModel`]):
+            Frozen text-encoder. PixArt-Alpha uses
+            [T5](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5EncoderModel), specifically the
+            [t5-v1_1-xxl](https://huggingface.co/PixArt-alpha/PixArt-alpha/tree/main/t5-v1_1-xxl) variant.
+        tokenizer (`T5Tokenizer`):
+            Tokenizer of class
+            [T5Tokenizer](https://huggingface.co/docs/transformers/model_doc/t5#transformers.T5Tokenizer).
+        transformer ([`PixArtTransformer2DModel`]):
+            A text conditioned `PixArtTransformer2DModel` to denoise the encoded image latents. Initially published as
+            [`Transformer2DModel`](https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/blob/main/transformer/config.json#L2)
+            in the config, but the mismatch can be ignored.
+        scheduler ([`SchedulerMixin`]):
+            A scheduler to be used in combination with `transformer` to denoise the encoded image latents.
     """
 
     bad_punct_regex = re.compile(
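
For completeness, a rough end-to-end usage sketch matching the components the new PixArt-Sigma docstring lists (vae, text_encoder, tokenizer, transformer, scheduler); assumes the PixArt-alpha/PixArt-Sigma-XL-2-1024-MS repo from the linked config, a CUDA device, and a diffusers version that ships PixArtSigmaPipeline:

```python
import torch
from diffusers import PixArtSigmaPipeline

# All five documented components (VAE, T5 text encoder, tokenizer, PixArtTransformer2DModel,
# scheduler) are loaded together; the transformer is still stored as "Transformer2DModel"
# in the checkpoint config, which is the ignorable mismatch noted in the docstring.
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    prompt="An astronaut riding a green horse",
    num_inference_steps=20,
    guidance_scale=4.5,
).images[0]
image.save("pixart_sigma_sample.png")
```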