📡 Model Update Tracker
The AI changelog you wish existed. Every release, price change, and deprecation — in one place.
25
Total events tracked
6
Price changes
2
Deprecations
March 30, 2026
Last updated
Claude 3.7 Sonnet Released
Anthropic — Claude 3.7 Sonnet
Introduces extended thinking mode with visible chain-of-thought reasoning. 200k token context window. Outperforms GPT-4o on SWE-bench coding tasks and complex multi-step reasoning.
GPT-4.1 Released
OpenAI — GPT-4.1
Dramatically improved coding performance, reaching 54.6% on SWE-bench Verified. 1M token context window. Replaces GPT-4o as OpenAI's flagship model.
Gemini 2.0 Flash Pricing Cut 50%
Google — Gemini 2.0 Flash
Google slashes Gemini 2.0 Flash pricing by 50%, now at $0.075/M input and $0.30/M output tokens. Remains one of the fastest and most cost-effective models available.
Llama 3.3 70B Released — Open Source
Meta — Llama 3.3 70B
Meta releases Llama 3.3 70B, an open-source model that matches GPT-4o performance on key benchmarks at a fraction of the cost. Supports 128k context and multilingual tasks.
o3-mini Released — Affordable Reasoning
OpenAI — OpenAI o3-mini
OpenAI launches o3-mini, a cost-efficient reasoning model. Three effort levels (low/medium/high). Outperforms o1 on math and science benchmarks at significantly lower cost.
DeepSeek V3 Launches at $0.27/M Input Tokens
DeepSeek — DeepSeek V3
DeepSeek V3 debuts with aggressive pricing at $0.27/M input and $1.10/M output tokens. Delivers GPT-4o-class performance, triggering a global AI pricing response.
Gemini 2.0 Pro Released
Google — Gemini 2.0 Pro
Google's most capable Gemini model yet. Native multimodal reasoning, 2M token context, improved agentic capabilities, and tighter Workspace integration.
GPT-4o Context Window Increased to 128k
OpenAI — GPT-4o
OpenAI doubles the GPT-4o context window from 64k to 128k tokens across all API tiers, with no pricing change. Enables longer document processing without chunking.
Claude 3.5 Haiku Released — Fast & Affordable
Anthropic — Claude 3.5 Haiku
Anthropic's fastest and most affordable Claude model. Processes 200k token context at sub-second latency. Matches Claude 3 Opus performance at 10x lower cost.
Mistral Large 2 Pricing Reduced 30%
Mistral AI — Mistral Large 2
Mistral AI reduces Mistral Large 2 pricing by 30%, now at $2/M input and $6/M output tokens. Aims to compete directly with GPT-4o on cost-performance ratio.
GPT-4 Deprecated — Migration Required
OpenAI — GPT-4
OpenAI officially deprecates GPT-4 (gpt-4-0314, gpt-4-0613). All API calls must migrate to GPT-4o or GPT-4 Turbo. Legacy endpoints return errors after this date.
Llama 3.2 Multimodal Models Released
Meta — Llama 3.2
Meta releases Llama 3.2 with 11B and 90B vision variants. First open-source Llama models with native image understanding. 128k context. Released under Meta Llama 3.2 license.
OpenAI o1 Released — Reasoning-First Model
OpenAI — OpenAI o1
OpenAI launches o1, a model trained to reason before responding. Thinks through problems for up to 30 seconds. Tops AIME, GPQA, and Codeforces rankings. Available in ChatGPT Plus.
Claude 3.5 Sonnet with Computer Use
Anthropic — Claude 3.5 Sonnet
Anthropic releases the upgraded Claude 3.5 Sonnet with computer use capability in public beta. Can control desktop apps, browse the web, and write/execute code autonomously.
Gemini 1.5 Pro Reaches 2M Token Context
Google — Gemini 1.5 Pro
Google expands Gemini 1.5 Pro's context window to 2 million tokens — the largest context window of any production LLM. Enables processing of entire codebases or books in a single call.
GPT-4o Pricing Cut 50%
OpenAI — GPT-4o
OpenAI halves GPT-4o pricing to $5/M input and $15/M output tokens. Responds to increased competition from Gemini and Claude. Batch API discounts increased to 50% off.
Mistral Large 2 Released
Mistral AI — Mistral Large 2
Mistral AI releases Mistral Large 2 with 128k context, improved multilingual support (13 languages), and strong coding capabilities. Benchmarks competitive with Claude 3.5 Sonnet.
Llama 3.1 405B Released — Largest Open Source LLM
Meta — Llama 3.1 405B
Meta releases Llama 3.1 405B, the largest open-source model ever. 128k context, competitive with GPT-4o on MMLU. Available for self-hosting and via major cloud providers.
Claude 3 Haiku Deprecated
Anthropic — Claude 3 Haiku
Anthropic deprecates the original Claude 3 Haiku in favour of Claude 3.5 Haiku. Existing API users receive a 30-day migration window. Pricing remains the same for the new version.
Gemini 1.5 Flash at $0.075/M Input Tokens
Google — Gemini 1.5 Flash
Google prices Gemini 1.5 Flash at $0.075/M input and $0.30/M output tokens for prompts under 128k. Makes production-scale deployments of fast multimodal AI cost-accessible.
GPT-4o Structured Outputs Becomes Generally Available
OpenAI — GPT-4o
OpenAI makes Structured Outputs GA, allowing JSON schema-constrained responses with 100% reliability. Eliminates the need for custom parsing layers in production applications.
Gemini 2.0 Flash Experimental Released
Google — Gemini 2.0 Flash Experimental
Google releases Gemini 2.0 Flash in experimental access, introducing native audio output, real-time streaming, and agentic tool-use. Marks start of the Gemini 2.0 era.
Claude 3.5 Sonnet — Initial Release
Anthropic — Claude 3.5 Sonnet (v1)
Anthropic's first 3.5-series model. Topped the LMSYS chatbot arena for multiple months. Best-in-class coding at the time, surpassing GPT-4o on HumanEval and SWE-bench.
OpenAI Realtime API Released
OpenAI — Multiple
OpenAI launches the Realtime API for low-latency speech-to-speech conversations. Enables building voice agents without a separate STT+TTS pipeline. Priced at $0.06/min input audio.
Claude 3 Haiku Price Reduction — 80% Cut
Anthropic — Claude 3 Haiku
Anthropic reduces Claude 3 Haiku pricing by 80%, to $0.25/M input and $1.25/M output. One of the most aggressive price cuts in LLM history, establishing Haiku as the budget leader.
Showing 25 of 25 events. Data last updated March 30, 2026.