Stable Diffusion is a latent diffusion model for image generation. Version 1 combines an autoencoder that downsamples images by a factor of 8 with an 860M-parameter UNet, and it uses cross-attention as its conditioning mechanism, which supports multi-modal training. Perceptual compression moves the diffusion process into the autoencoder's lower-dimensional latent space, which reduces computational cost and makes the model practical on consumer-grade GPUs. The model's...
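
To make the effect of a downsampling factor of 8 concrete, here is a minimal arithmetic sketch of how much smaller the latent is than the input image. The factor comes from the text above. The 512x512 RGB input, the 4 latent channels, and the helper names `latent_shape` and `compression_ratio` are assumptions made for illustration, not details stated here.

```python
def latent_shape(height, width, factor=8, latent_channels=4):
    """Return (channels, height, width) of the latent for a given image size.

    factor=8 is the downsampling factor from the text; latent_channels=4
    is an assumed value used only for this illustration.
    """
    if height % factor or width % factor:
        raise ValueError("image sides must be divisible by the downsampling factor")
    return (latent_channels, height // factor, width // factor)


def compression_ratio(height, width, image_channels=3, factor=8, latent_channels=4):
    """Ratio of values in the pixel-space image to values in the latent."""
    c, h, w = latent_shape(height, width, factor, latent_channels)
    return (image_channels * height * width) / (c * h * w)


# Assumed example input: a 512x512 RGB image.
shape = latent_shape(512, 512)
ratio = compression_ratio(512, 512)
print(shape)  # (4, 64, 64)
print(ratio)  # 48.0
```

Under these assumptions the UNet works on 48 times fewer values per image than it would in pixel space, which is one way to see where the reduced computational cost comes from.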