Diffusion Model

Architecture

Simple Definition

The AI architecture behind most modern image and video generators, and many audio generators: it learns to reverse a noise-adding process in order to generate content.

Full Explanation

Diffusion models have two parts. A fixed forward process gradually adds Gaussian noise to training data until it is indistinguishable from pure noise, and a neural network is trained to reverse that process one step at a time. During generation, the model starts from random noise and iteratively denoises it into a coherent image or audio clip. Stable Diffusion, DALL-E 3, Sora, and most modern image/video generators use diffusion-based architectures; Midjourney is generally understood to as well, though the company has not published its architecture.
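The noising and denoising loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not how any particular product is implemented: the linear noise schedule, the function names (make_schedule, add_noise, sample), and the DDPM-style update rule are choices made for this example. In place of a trained neural network it uses a hypothetical "oracle" noise predictor that already knows the single target data point, which keeps the sketch runnable without any training.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumption; other schedules are common)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # fraction of signal variance left at step t
    return betas, alphas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: jump straight from clean data x0 to its noisy version at step t."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

def sample(predict_noise, shape, betas, alphas, alpha_bars, rng):
    """Reverse process: start from pure noise and denoise step by step."""
    x = rng.standard_normal(shape)
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, t)  # a trained network would go here
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # Fresh noise is added at every step except the last one.
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

rng = np.random.default_rng(0)
betas, alphas, alpha_bars = make_schedule()

# Toy "dataset" with a single point, so the ideal noise prediction is known exactly.
target = np.array([1.0, -2.0, 0.5])

def oracle_predict_noise(x, t):
    # Hypothetical stand-in for a trained denoiser: solves the forward equation for eps.
    return (x - np.sqrt(alpha_bars[t]) * target) / np.sqrt(1.0 - alpha_bars[t])

result = sample(oracle_predict_noise, target.shape, betas, alphas, alpha_bars, rng)
print("generated:", result)
```

With the oracle predictor, the final denoising step lands back on target up to floating-point error, which makes the loop easy to sanity-check. In a real model, predict_noise would be a neural network trained to predict eps from the noisy samples produced by add_noise.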

Related Terms

Transformer

The neural network architecture that underpins nearly all modern large language models, introduced by Google researchers in 2017.

Multimodal AI

AI systems that can process and generate multiple types of data — text, images, audio, video, and code.

Last verified: 2026-03-30