ComfyUI Update: Stable Video Diffusion on 8GB VRAM with 25 frames and more
Stable Video Diffusion, LCM models, Kohya Deep Shrink, ZSNR V Prediction Models, And More
Stable Video Diffusion
ComfyUI now supports the new Stable Video Diffusion image-to-video model. With ComfyUI you can generate 1024x576 videos 25 frames long on a GTX 1080 with 8GB of VRAM. I can confirm it also works on my AMD 6800XT with ROCm on Linux.
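As a rough illustration, an SVD image-to-video workflow can be submitted through ComfyUI's HTTP API as a JSON prompt. The sketch below builds such a prompt as a Python dict; the node class names and input names follow the example SVD workflow, but the checkpoint and image filenames are placeholders, and the sampler/decode nodes are omitted, so treat this as a shape reference rather than a complete workflow.

```python
import json

def build_svd_prompt(init_image="input.png",
                     ckpt="svd.safetensors",
                     width=1024, height=576, frames=25):
    """Sketch of an API-format prompt for SVD image-to-video.

    Filenames are placeholders; adapt them to your installation.
    """
    return {
        # Loads the SVD checkpoint (model, CLIP vision encoder, VAE).
        "1": {"class_type": "ImageOnlyCheckpointLoader",
              "inputs": {"ckpt_name": ckpt}},
        # The still image the video is generated from.
        "2": {"class_type": "LoadImage",
              "inputs": {"image": init_image}},
        # Turns the image into conditioning plus an initial latent batch
        # of `frames` frames at the requested resolution.
        "3": {"class_type": "SVD_img2vid_Conditioning",
              "inputs": {"width": width, "height": height,
                         "video_frames": frames, "motion_bucket_id": 127,
                         "fps": 6, "augmentation_level": 0.0,
                         "clip_vision": ["1", 1], "init_image": ["2", 0],
                         "vae": ["1", 2]}},
        # A KSampler and VAEDecode node would follow; omitted here.
    }

prompt = build_svd_prompt()
payload = json.dumps({"prompt": prompt})  # body for POST /prompt
```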
LCM
LCM (Latent Consistency Model) models can be sampled in very few steps. Recently, LoRAs have been released that convert regular SDXL and SD1.x models into LCM models.
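In practice this means chaining a LoraLoader node after the checkpoint loader and sampling with few steps at low CFG. The fragment below sketches that in API format; the checkpoint and LoRA filenames are placeholders, and typical LCM settings (around 4 steps, CFG near 1-2, the "lcm" sampler) are assumptions you should tune.

```python
def build_lcm_fragment(ckpt="sdxl_base.safetensors",
                       lora="lcm_lora_sdxl.safetensors"):
    """Sketch: convert a regular model to LCM via a LoRA, then sample fast.

    Filenames are placeholders; the KSampler is shown with only the
    LCM-relevant inputs (seed, conditioning, latent, etc. are omitted).
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": ckpt}},
        # The LCM LoRA patches the base model so it can be sampled
        # in very few steps.
        "2": {"class_type": "LoraLoader",
              "inputs": {"lora_name": lora,
                         "strength_model": 1.0, "strength_clip": 1.0,
                         "model": ["1", 0], "clip": ["1", 1]}},
        # LCM models want very few steps and a low CFG value.
        "3": {"class_type": "KSampler",
              "inputs": {"model": ["2", 0],
                         "steps": 4, "cfg": 1.5,
                         "sampler_name": "lcm",
                         "scheduler": "sgm_uniform"}},
    }
```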
Kohya Deep Shrink
The _for_testing->PatchModelAddDownscale node adds a downscale to the UNet that can be scheduled so it only applies during the first timesteps of sampling. This lets you generate consistent images at higher resolutions without having to do a second pass.
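The scheduling idea can be illustrated with a toy sketch (this is not ComfyUI's implementation): during the first fraction of sampling the tensor entering a chosen UNet block is downscaled, and afterwards it passes through at full resolution. The `end_percent` cutoff below is an illustrative value.

```python
def downscale_2x(grid):
    """2x average-pool a 2D list-of-lists standing in for a latent."""
    h, w = len(grid), len(grid[0])
    return [[(grid[y][x] + grid[y][x + 1]
              + grid[y + 1][x] + grid[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

def maybe_shrink(grid, progress, end_percent=0.35):
    """Shrink only during the early part of sampling.

    `progress` runs from 0.0 (first step) to 1.0 (last step);
    `end_percent` is a hypothetical cutoff for the scheduled downscale.
    """
    if progress < end_percent:
        return downscale_2x(grid)
    return grid

latent = [[float(x + y) for x in range(8)] for y in range(8)]
early = maybe_shrink(latent, 0.1)   # early step: downscaled to 4x4
late = maybe_shrink(latent, 0.9)    # late step: full 8x8 resolution
```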
Support for ZSNR V Prediction Models
The new ModelSamplingDiscrete node lets you mark a model as v_prediction with zsnr so it can be sampled properly.
The new RescaleCFG node implements the CFG rescale algorithm from the ZSNR paper and should also be used with these models to sample them properly.
To use a ZSNR v_pred model, load it with the regular checkpoint loader node, then chain in the ModelSamplingDiscrete node with v_prediction and zsnr selected. You can then add the RescaleCFG node.
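The rescale itself is simple: after classifier-free guidance, the result's standard deviation is scaled back toward that of the conditional prediction, then blended with the unrescaled result. Below is a minimal 1-D sketch of that formula; the real node operates on latent tensors (per-channel statistics), and the `multiplier` default is an illustrative value.

```python
import statistics

def rescale_cfg(cond, uncond, scale=7.5, multiplier=0.7):
    """Sketch of the CFG rescale from the ZSNR paper, on 1-D lists.

    cond/uncond: conditional and unconditional model predictions.
    multiplier: blend factor (phi in the paper); 0.7 is illustrative.
    """
    # Standard classifier-free guidance.
    cfg = [u + scale * (c - u) for c, u in zip(cond, uncond)]
    # Rescale the guided result's std back to the conditional std.
    ratio = statistics.pstdev(cond) / statistics.pstdev(cfg)
    rescaled = [x * ratio for x in cfg]
    # Blend rescaled and plain CFG outputs.
    return [multiplier * r + (1.0 - multiplier) * x
            for r, x in zip(rescaled, cfg)]
```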