📡 Model Update Tracker

The AI changelog you wish existed. Every release, price change, and deprecation — in one place.

Total events tracked: 25

Price changes: 6

Deprecations: 2

Last updated: March 30, 2026

March 2026 | Release | High impact

Claude 3.7 Sonnet Released

Anthropic — Claude 3.7 Sonnet

Introduces extended thinking mode with visible chain-of-thought reasoning. 200k token context window. Outperforms GPT-4o on SWE-bench coding tasks and complex multi-step reasoning.

March 2026 | Release | High impact

GPT-4.1 Released

OpenAI — GPT-4.1

Dramatically improved coding performance, reaching 54.6% on SWE-bench Verified. 1M token context window. Replaces GPT-4o as OpenAI's flagship model.

February 2026 | Pricing | Medium impact

Gemini 2.0 Flash Pricing Cut 50%

Google — Gemini 2.0 Flash

Google slashes Gemini 2.0 Flash pricing by 50%, now at $0.075/M input and $0.30/M output tokens. Remains one of the fastest and most cost-effective models available.
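As a rough illustration of what per-million-token pricing means for a bill, here is a minimal sketch that applies the two rates quoted above. The helper name `estimate_cost` and the example workload are hypothetical, and the calculation ignores any tiering, caching, or batch discounts a provider may apply.

```python
from decimal import Decimal

# Rates quoted in the entry above, in US dollars per one million tokens.
INPUT_PRICE_PER_M = Decimal("0.075")
OUTPUT_PRICE_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the dollar cost of a workload (hypothetical helper)."""
    input_cost = INPUT_PRICE_PER_M * input_tokens / TOKENS_PER_M
    output_cost = OUTPUT_PRICE_PER_M * output_tokens / TOKENS_PER_M
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 500k output tokens.
cost = estimate_cost(2_000_000, 500_000)
print(f"${cost:.2f}")  # prints $0.30
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift.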

February 2026 | Release | High impact

Llama 3.3 70B Released — Open Source

Meta — Llama 3.3 70B

Meta releases Llama 3.3 70B, an open-source model that matches GPT-4o performance on key benchmarks at a fraction of the cost. Supports 128k context and multilingual tasks.

January 2026 | Release | High impact

o3-mini Released — Affordable Reasoning

OpenAI — OpenAI o3-mini

OpenAI launches o3-mini, a cost-efficient reasoning model. Three effort levels (low/medium/high). Outperforms o1 on math and science benchmarks at significantly lower cost.

January 2026 | Pricing | High impact

DeepSeek V3 Launches at $0.27/M Input Tokens

DeepSeek — DeepSeek V3

DeepSeek V3 debuts with aggressive pricing at $0.27/M input and $1.10/M output tokens. Delivers GPT-4o-class performance, triggering a global AI pricing response.

December 2025 | Release | High impact

Gemini 2.0 Pro Released

Google — Gemini 2.0 Pro

Google's most capable Gemini model yet. Native multimodal reasoning, 2M token context, improved agentic capabilities, and tighter Workspace integration.

December 2025 | API Change | Medium impact

GPT-4o Context Window Increased to 128k

OpenAI — GPT-4o

OpenAI doubles the GPT-4o context window from 64k to 128k tokens across all API tiers, with no pricing change. Enables longer document processing without chunking.

November 2025 | Release | Medium impact

Claude 3.5 Haiku Released — Fast & Affordable

Anthropic — Claude 3.5 Haiku

Anthropic's fastest and most affordable Claude model. Processes 200k token context at sub-second latency. Matches Claude 3 Opus performance at 10x lower cost.

November 2025 | Pricing | Medium impact

Mistral Large 2 Pricing Reduced 30%

Mistral AI — Mistral Large 2

Mistral AI reduces Mistral Large 2 pricing by 30%, now at $2/M input and $6/M output tokens. Aims to compete directly with GPT-4o on cost-performance ratio.

October 2025 | Deprecated | High impact

GPT-4 Deprecated — Migration Required

OpenAI — GPT-4

OpenAI officially deprecates GPT-4 (gpt-4-0314, gpt-4-0613). All API calls must migrate to GPT-4o or GPT-4 Turbo. Legacy endpoints return errors after this date.
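One way to handle a migration like this in application code is to route model names through a small lookup before each request. The sketch below is only an illustration: the two deprecated snapshot names come from the entry above, while the replacement identifier `"gpt-4o"` and the helper `resolve_model` are assumptions to check against the provider's current model list.

```python
# Deprecated snapshot names are taken from the entry above. The replacement
# identifier "gpt-4o" is an assumed API name for GPT-4o; confirm it before use.
DEPRECATED_MODELS = {
    "gpt-4-0314": "gpt-4o",
    "gpt-4-0613": "gpt-4o",
}


def resolve_model(requested: str) -> str:
    """Return the requested model, or its replacement if it is deprecated."""
    return DEPRECATED_MODELS.get(requested, requested)


print(resolve_model("gpt-4-0613"))  # prints gpt-4o
print(resolve_model("gpt-4o"))      # prints gpt-4o
```

Keeping the mapping in one place means a later deprecation needs a one-line change rather than a search through every call site.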

October 2025 | Release | High impact

Llama 3.2 Multimodal Models Released

Meta — Llama 3.2

Meta releases Llama 3.2 with 11B and 90B vision variants. First open-source Llama models with native image understanding. 128k context. Released under Meta Llama 3.2 license.

September 2025 | Release | High impact

OpenAI o1 Released — Reasoning-First Model

OpenAI — OpenAI o1

OpenAI launches o1, a model trained to reason before responding. Thinks through problems for up to 30 seconds. Tops AIME, GPQA, and Codeforces rankings. Available in ChatGPT Plus.

August 2025 | Release | High impact

Claude 3.5 Sonnet with Computer Use

Anthropic — Claude 3.5 Sonnet

Anthropic releases the upgraded Claude 3.5 Sonnet with computer use capability in public beta. Can control desktop apps, browse the web, and write/execute code autonomously.

July 2025 | API Change | High impact

Gemini 1.5 Pro Reaches 2M Token Context

Google — Gemini 1.5 Pro

Google expands Gemini 1.5 Pro's context window to 2 million tokens — the largest context window of any production LLM. Enables processing of entire codebases or books in a single call.

June 2025 | Pricing | High impact

GPT-4o Pricing Cut 50%

OpenAI — GPT-4o

OpenAI halves GPT-4o pricing to $5/M input and $15/M output tokens. Responds to increased competition from Gemini and Claude. Batch API discounts increased to 50% off.

May 2025 | Release | Medium impact

Mistral Large 2 Released

Mistral AI — Mistral Large 2

Mistral AI releases Mistral Large 2 with 128k context, improved multilingual support (13 languages), and strong coding capabilities. Benchmarks competitive with Claude 3.5 Sonnet.

April 2025 | Release | High impact

Llama 3.1 405B Released — Largest Open Source LLM

Meta — Llama 3.1 405B

Meta releases Llama 3.1 405B, the largest open-source model ever. 128k context, competitive with GPT-4o on MMLU. Available for self-hosting and via major cloud providers.

March 2025 | Deprecated | Low impact

Claude 3 Haiku Deprecated

Anthropic — Claude 3 Haiku

Anthropic deprecates the original Claude 3 Haiku in favour of Claude 3.5 Haiku. Existing API users receive a 30-day migration window. Pricing remains the same for the new version.

February 2025 | Pricing | Medium impact

Gemini 1.5 Flash at $0.075/M Input Tokens

Google — Gemini 1.5 Flash

Google prices Gemini 1.5 Flash at $0.075/M input and $0.30/M output tokens for prompts under 128k. Makes production-scale deployments of fast multimodal AI cost-accessible.

January 2025 | API Change | Medium impact

GPT-4o Structured Outputs Becomes Generally Available

OpenAI — GPT-4o

OpenAI makes Structured Outputs GA, allowing JSON schema-constrained responses with 100% reliability. Eliminates the need for custom parsing layers in production applications.
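For readers who have not used schema-constrained responses, the sketch below shows roughly what such a request fragment can look like. It only builds the payload locally and makes no API call. The `response_format` layout reflects the Chat Completions shape as best understood here and should be checked against OpenAI's documentation; the schema name `release_event`, its fields, and the sample reply are hypothetical.

```python
import json

# Hypothetical schema describing one tracker entry.
event_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "model": {"type": "string"},
        "category": {"type": "string"},
    },
    "required": ["vendor", "model", "category"],
    "additionalProperties": False,
}

# Assumed request fragment for a schema-constrained response; verify the
# exact field names against the provider's API reference before relying on it.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "release_event",
        "strict": True,
        "schema": event_schema,
    },
}

# A reply that follows the schema can be loaded directly with json.loads.
sample_reply = '{"vendor": "OpenAI", "model": "GPT-4o", "category": "API Change"}'
event = json.loads(sample_reply)
missing = [key for key in event_schema["required"] if key not in event]
print(missing)  # prints []
```

Even with a constrained response, a cheap check like the `missing` list above is a reasonable guard at the boundary between the model and the rest of the application.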

December 2024 | Release | High impact

Gemini 2.0 Flash Experimental Released

Google — Gemini 2.0 Flash Experimental

Google releases Gemini 2.0 Flash in experimental access, introducing native audio output, real-time streaming, and agentic tool use. Marks the start of the Gemini 2.0 era.

November 2024 | Release | High impact

Claude 3.5 Sonnet — Initial Release

Anthropic — Claude 3.5 Sonnet (v1)

Anthropic's first 3.5-series model. Topped the LMSYS chatbot arena for multiple months. Best-in-class coding at the time, surpassing GPT-4o on HumanEval and SWE-bench.

October 2024 | API Change | Medium impact

OpenAI Realtime API Released

OpenAI — Multiple

OpenAI launches the Realtime API for low-latency speech-to-speech conversations. Enables building voice agents without a separate STT+TTS pipeline. Priced at $0.06/min input audio.

September 2024 | Pricing | High impact

Claude 3 Haiku Price Reduction — 80% Cut

Anthropic — Claude 3 Haiku

Anthropic reduces Claude 3 Haiku pricing by 80%, to $0.25/M input and $1.25/M output. One of the most aggressive price cuts in LLM history, establishing Haiku as the budget leader.

Showing 25 of 25 events. Data last updated March 30, 2026.