Latest Research Papers
2025-01-22
arXiv
Accelerate High-Quality Diffusion Models with Inner Loop Feedback
The paper introduces Inner Loop Feedback (ILF), a method that speeds up diffusion model inference by using a lightweight module to predict future features in the denoising process. The approach reduces runtime while maintaining high-quality results and is effective for both class-to-image and text-to-image generation. ILF's performance is validated with metrics including FID, CLIP score, and qualitative comparisons.
We propose Inner Loop Feedback (ILF), a novel approach to accelerate diffusion models' inference. ILF trains a lightweight module to predict future features in the denoising process by leveraging the outputs of a chosen diffusion backbone block at a given time step. This approach exploits two key intuitions: (1) the outputs of a given block at adjacent time steps are similar, and (2) performing partial computations for a step imposes a lower burden on the model than skipping the step entirely. Our method is highly flexible, since we find that the feedback module itself can simply be a block from the diffusion backbone, with all settings copied. Its influence on the diffusion forward pass can be tempered by a learnable scaling factor initialized to zero. We train this module using distillation losses; however, unlike some prior work where a full diffusion backbone serves as the student, our method freezes the backbone and trains only the feedback module.
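The abstract describes the module design closely enough to sketch it: a copy of one backbone block wrapped in a zero-initialized learnable scale, trained by distillation against the frozen backbone. The sketch below is one reading of that description, not the authors' code; the names `FeedbackModule` and `distillation_step`, the residual way the scale is applied, and the plain MSE objective are all assumptions (the paper's distillation losses may differ, and a real DiT block also takes conditioning inputs that are omitted here).

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedbackModule(nn.Module):
    """Predicts the chosen backbone block's features at the next denoising
    step from its features at the current step (hypothetical sketch)."""

    def __init__(self, backbone_block: nn.Module):
        super().__init__()
        # Reuse the backbone block's architecture and settings verbatim.
        self.block = copy.deepcopy(backbone_block)
        # Zero-initialized learnable scale: at the start of training the
        # module leaves the features, and thus the diffusion forward pass,
        # unchanged.
        self.scale = nn.Parameter(torch.zeros(1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Assumed residual form; the exact way the scale enters is a guess.
        return feats + self.scale * self.block(feats)


def distillation_step(feedback: FeedbackModule,
                      feats_t: torch.Tensor,
                      feats_next: torch.Tensor,
                      optimizer: torch.optim.Optimizer) -> float:
    """One training step. `feats_t` and `feats_next` are the frozen backbone's
    block outputs at the current and next time step, computed under
    torch.no_grad(); only the feedback module receives gradients."""
    pred_next = feedback(feats_t)
    # Toy stand-in for the paper's distillation losses.
    loss = F.mse_loss(pred_next, feats_next)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```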
While many efforts to optimize diffusion models focus on achieving acceptable image quality in extremely few steps (1-4 steps), our emphasis is on matching best-case results (typically achieved in 20 steps) while significantly reducing runtime. ILF achieves this balance effectively, demonstrating strong performance for both class-to-image generation with a diffusion transformer (DiT) and text-to-image generation with the DiT-based PixArt-alpha and PixArt-sigma. The quality of ILF's 1.7x-1.8x speedups is confirmed by FID, CLIP score, CLIP Image Quality Assessment, ImageReward, and qualitative comparisons. Project information is available at https://mgwillia.github.io/ilf.
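The abstract's second intuition, doing partial computation for a step rather than skipping it entirely, suggests a sampling loop roughly like the one below. This is only a guess at how the pieces fit together at inference time: the alternation schedule, the split point, the `backbone` methods (`forward_from_block`, `forward_with_block_output`), and the `scheduler_step` callable are all hypothetical, not the project's actual API.

```python
import torch


@torch.no_grad()
def sample_with_ilf(backbone, feedback, scheduler_step, x, timesteps,
                    feedback_every=2):
    """Denoising loop that alternates full backbone steps with partially
    computed steps resuming from the feedback module's prediction."""
    cached_feats = None
    for i, t in enumerate(timesteps):
        if cached_feats is not None and i % feedback_every == 1:
            # Partial step: skip the layers up to the chosen block and resume
            # the forward pass from the predicted features.
            eps = backbone.forward_from_block(cached_feats, t)      # hypothetical
            cached_feats = None
        else:
            # Full step: run the whole backbone, keep the chosen block's
            # output, and let the feedback module predict the next step's
            # features from it.
            eps, feats = backbone.forward_with_block_output(x, t)   # hypothetical
            cached_feats = feedback(feats)
        x = scheduler_step(x, eps, t)  # standard denoising update (e.g. DDIM)
    return x
```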