
stable-diffusion-2-depth training problems #11925

Answered by asomoza
Henry-Bi asked this question in Q&A

I think you're on the right track. I don't really know whether MM-DiT models are good for super-resolution tasks. I don't normally use them because they're really slow and resource hungry; upscaling with them is even slower and needs a very high-end GPU, and right now the benefits aren't great enough to justify it.

What I do know is that the current SOTA model for super resolution is called SUPIR and it's based on SDXL, so a U-Net. There is also a model called Stable Cascade that worked really well with a small latent, which is then upscaled in the second stage by a second U-Net.

The other solution right now is to train or use a Tile ControlNet and do a tiled img2img over an image …
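To make the tiling part of that idea concrete, here is a minimal sketch of how an image could be split into overlapping tiles before each one is passed through img2img. It only computes the crop boxes; the function name `tile_boxes`, the 512-pixel tile size, and the 64-pixel overlap are assumptions for illustration and do not come from the answer above. The actual Tile ControlNet call is left as a comment because its setup depends on the model and pipeline you choose.

```python
def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the whole image.

    Neighbouring tiles share at least `overlap` pixels so that seams can be
    blended after each tile has been processed. All defaults are assumptions.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")

    def starts(size):
        # A single tile is enough when the image is smaller than the tile.
        if size <= tile:
            return [0]
        stride = tile - overlap
        positions = list(range(0, size - tile, stride))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in tile_boxes(1024, 512):
        # Hypothetical step: crop the image to `box`, run the tiled img2img
        # pass on the crop, then paste or blend the result back at `box`.
        print(box)
```

The boxes are plain 4-tuples in the same left, top, right, bottom order that common image libraries use for cropping, so they can be fed straight into a crop and paste loop. Blending the overlapping regions (for example with a linear ramp) is a separate step that this sketch does not cover.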

Replies: 1 comment 5 replies

Answer selected by Henry-Bi
2 participants