In this section, we implement the forward process of diffusion models using the formula:
$$ x_t = \sqrt{\bar\alpha_t} x_0 + \sqrt{1 - \bar\alpha_t} \epsilon \quad \text{where}~ \epsilon \sim \mathcal{N}(0, \mathbf{I}) $$
Key variables:
- $x_0$: the clean input image.
- $x_t$: the noisy image at timestep $t$.
- $\bar\alpha_t$: the cumulative product of the noise-schedule coefficients; it shrinks as $t$ grows, so larger $t$ means more noise.
- $\epsilon$: standard Gaussian noise with the same shape as $x_0$.

Steps:
1. Sample noise $\epsilon \sim \mathcal{N}(0, \mathbf{I})$.
2. Scale the clean image by $\sqrt{\bar\alpha_t}$, scale the noise by $\sqrt{1 - \bar\alpha_t}$, and add them to obtain $x_t$.
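The steps above can be sketched as follows. This is a minimal NumPy stand-in for the actual PyTorch implementation; the function name `forward` matches the section, but the linear beta schedule and the helper names are assumptions.

```python
import numpy as np

def forward(x0, t, alphas_cumprod, rng=np.random.default_rng(0)):
    """Sample x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps, eps

# Assumed DDPM-style linear beta schedule with T = 1000 steps.
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)
```

At small $t$, $\bar\alpha_t \approx 1$ and the output stays close to $x_0$; at $t$ near $T$, the output is nearly pure noise.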
In this section, Gaussian blur is applied to noisy images generated from the forward process, as a classical baseline for evaluating denoising quality. Steps:
1. Use the forward() function to generate images at different noise levels.
2. Apply Gaussian blur (kernel_size=5) to each noisy image to attempt to denoise it.
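A separable Gaussian blur of the kind used here can be sketched in plain NumPy (the actual project presumably uses torchvision's blur; this self-contained version and its default sigma are assumptions):

```python
import numpy as np

def gaussian_kernel1d(size=5, sigma=1.0):
    """Normalized 1-D Gaussian kernel of the given size."""
    x = np.arange(size) - size // 2
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, size=5, sigma=1.0):
    """Blur a 2-D image by filtering rows then columns, with reflect padding."""
    k = gaussian_kernel1d(size, sigma)
    pad = size // 2
    p = np.pad(img, pad, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
```

Blurring removes high-frequency noise but also destroys high-frequency image detail, which is why this baseline performs poorly compared to the learned denoiser.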
In this section, the goal is to denoise images in one step by predicting noise using a UNet model and reconstructing the original image based on the given formula:
$$ x_t = \sqrt{\bar\alpha_t} x_0 + \sqrt{1 - \bar\alpha_t} \epsilon $$
Steps:
1. Generate noisy images at several timesteps using the forward() function.
2. Predict the noise $\epsilon$ in each noisy image with the pretrained UNet.
3. Recover an estimate of the clean image by inverting the forward equation:
$$ x_0 = \frac{x_t - \sqrt{1 - \bar\alpha_t} \epsilon}{\sqrt{\bar\alpha_t}} $$
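The reconstruction formula is a direct inversion of the forward equation. A minimal sketch, where `eps_pred` stands in for the UNet's noise prediction:

```python
import numpy as np

def one_step_denoise(xt, eps_pred, abar_t):
    """Invert x_t = sqrt(abar_t) x0 + sqrt(1 - abar_t) eps for x0."""
    return (xt - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
```

If the predicted noise exactly matched the noise that was added, this would recover $x_0$ exactly; in practice the UNet's estimate is imperfect, so one-step denoising is blurry at high noise levels.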
In this section, the iterative denoising process is performed by gradually refining the noisy image using the formula:
$$ x_{t'} = \frac{\sqrt{\bar\alpha_{t'}}\beta_t}{1 - \bar\alpha_t} x_0 + \frac{\sqrt{\alpha_t}(1 - \bar\alpha_{t'})}{1 - \bar\alpha_t} x_t + v_\sigma $$
Steps:
1. Starting from a noisy image $x_t$, predict the noise with the UNet and form the clean-image estimate $x_0$.
2. Use the formula above to step from timestep $t$ to an earlier timestep $t'$.
3. Repeat until $t = 0$, gradually refining the image.
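One update of this recurrence can be sketched as below. The function name is hypothetical, and the relations $\alpha_t = \bar\alpha_t / \bar\alpha_{t'}$ and $\beta_t = 1 - \alpha_t$ for a (possibly strided) schedule are assumptions; the `noise` argument stands in for the $v_\sigma$ term.

```python
import numpy as np

def ddpm_update(xt, x0_est, t, tp, alphas, alphas_cumprod, betas, noise=0.0):
    """Step from timestep t to an earlier timestep tp using the posterior-mean
    formula: x_tp = c0 * x0_est + ct * xt + noise."""
    abar_t, abar_tp = alphas_cumprod[t], alphas_cumprod[tp]
    coef_x0 = np.sqrt(abar_tp) * betas[t] / (1.0 - abar_t)
    coef_xt = np.sqrt(alphas[t]) * (1.0 - abar_tp) / (1.0 - abar_t)
    return coef_x0 * x0_est + coef_xt * xt + noise
```

A sanity check: in the noise-free case $x_t = \sqrt{\bar\alpha_t}\, x_0$, the update returns exactly $\sqrt{\bar\alpha_{t'}}\, x_0$, i.e. the same image at the lower noise level.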
In this section, we generate images from random noise by applying the iterative denoising process guided by a text prompt.
Steps:
1. Sample a pure-noise image $x_T \sim \mathcal{N}(0, \mathbf{I})$.
2. Run the iterative denoising procedure from the highest noise level down to $t = 0$, conditioning each noise prediction on the text prompt.
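The full sampling loop, sketched with a generic `eps_model` callback standing in for the prompt-conditioned UNet (the function name, argument order, and the omission of the $v_\sigma$ noise term are simplifying assumptions):

```python
import numpy as np

def sample(eps_model, alphas, alphas_cumprod, betas, timesteps, shape,
           rng=np.random.default_rng(0)):
    """Generate an image from pure noise by repeating:
    predict eps -> estimate x0 -> step from t down to the next timestep tp."""
    x = rng.standard_normal(shape)
    for t, tp in zip(timesteps[:-1], timesteps[1:]):
        abar_t, abar_tp = alphas_cumprod[t], alphas_cumprod[tp]
        eps = eps_model(x, t)  # stand-in for the prompt-conditioned UNet
        x0 = (x - np.sqrt(1.0 - abar_t) * eps) / np.sqrt(abar_t)
        x = (np.sqrt(abar_tp) * betas[t] / (1.0 - abar_t)) * x0 \
            + (np.sqrt(alphas[t]) * (1.0 - abar_tp) / (1.0 - abar_t)) * x
    return x
```

`timesteps` is a strictly decreasing list ending at 0, so the image is progressively refined from pure noise to a sample.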
Classifier-Free Guidance (CFG) improves image quality by enhancing the conditional noise estimation based on the formula:
$$ \epsilon = \epsilon_u + \gamma (\epsilon_c - \epsilon_u) $$
Steps:
1. Run the UNet twice per timestep to obtain an unconditional noise estimate $\epsilon_u$ and a prompt-conditioned estimate $\epsilon_c$.
2. Combine them with the formula above, using a guidance scale $\gamma > 1$ to push the estimate further in the conditional direction.
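The combination itself is one line; a sketch (the default $\gamma = 7$ is an assumption, not a value taken from this project):

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, gamma=7.0):
    """eps = eps_u + gamma * (eps_c - eps_u); gamma > 1 extrapolates past the
    conditional estimate, gamma = 1 recovers plain conditional sampling."""
    return eps_uncond + gamma * (eps_cond - eps_uncond)
```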
Steps:
1. Add noise to the input image using the forward() function.
2. Run the iterative denoising procedure on the noised image.
In this section, specific text prompts are used to guide the image generation process. The starting noise level controls how much of the original image's features are retained: the more noise is added, the more freedom the model has to reshape the image toward the prompt. Steps:
1. Add noise to the input image at several noise levels using the forward() function.
2. Denoise each noised image while conditioning on the chosen text prompt.
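The trade-off can be made concrete through the signal-to-noise ratio of the forward process: $\sqrt{\bar\alpha_t} / \sqrt{1 - \bar\alpha_t}$ falls as $t$ grows, so starting the denoiser from a higher $t$ leaves less of the original image for it to preserve. A small sketch (the linear schedule is an assumption):

```python
import numpy as np

# SNR of x_t = sqrt(abar_t) x0 + sqrt(1 - abar_t) eps as a function of t.
betas = np.linspace(1e-4, 0.02, 1000)
abar = np.cumprod(1.0 - betas)
for t in (100, 400, 800):
    snr = np.sqrt(abar[t]) / np.sqrt(1.0 - abar[t])
    print(f"t={t}: signal/noise = {snr:.3f}")
```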
In this section, we create visual anagrams by averaging noise estimations from two different prompts, one for the image and one for its flipped version.
Key formulas:
$$ \epsilon_1 = \text{UNet}(x_t, t, p_1) $$
$$ \epsilon_2 = \text{flip}(\text{UNet}(\text{flip}(x_t), t, p_2)) $$
$$ \epsilon = \frac{\epsilon_1 + \epsilon_2}{2} $$
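One denoising step of the anagram procedure, sketched with a generic `eps_model` callback standing in for the prompt-conditioned UNet (names are hypothetical):

```python
import numpy as np

def anagram_eps(eps_model, xt, t, prompt_a, prompt_b):
    """Average the noise estimate for prompt A on x_t with the flipped-back
    estimate for prompt B on the vertically flipped image."""
    e1 = eps_model(xt, t, prompt_a)
    e2 = np.flipud(eps_model(np.flipud(xt), t, prompt_b))
    return 0.5 * (e1 + e2)
```

Because the second estimate is computed on the flipped image and then flipped back, the averaged noise steers the sample toward prompt A right-side up and prompt B upside down simultaneously.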
In this section, we generate hybrid images by combining the low-frequency and high-frequency components of two images based on different prompts.
Key formulas:
$$ \epsilon_1 = \text{UNet}(x_t, t, p_1), \quad \epsilon_2 = \text{UNet}(x_t, t, p_2) $$
$$ \epsilon = f_\text{lowpass}(\epsilon_1) + f_\text{highpass}(\epsilon_2) $$
Steps:
1. Compute a noise estimate for each of the two prompts.
2. Combine the low-frequency component of one estimate with the high-frequency component of the other.
3. Run the iterative denoising loop with the combined estimate, so the result reads as one prompt from afar and the other up close.
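The frequency split can be sketched with an FFT-based low-pass filter (the actual project may use a Gaussian blur as the low-pass; the FFT mask and the `cutoff` parameter here are assumptions):

```python
import numpy as np

def lowpass(img, cutoff=0.1):
    """Keep spatial frequencies below `cutoff` (fraction of Nyquist) by
    masking the centered 2-D FFT."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    mask = (yy / (h / 2.0))**2 + (xx / (w / 2.0))**2 <= cutoff**2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

def hybrid_eps(e1, e2, cutoff=0.1):
    """eps = lowpass(eps_1) + highpass(eps_2), with highpass = identity - lowpass."""
    return lowpass(e1, cutoff) + (e2 - lowpass(e2, cutoff))
```

Since the high-pass is defined as the residual of the low-pass, the two components partition each estimate's frequency content exactly.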