Key Takeaways
- Claude Sonnet 5 landed June 30, scoring 63.2% on SWE-bench Pro at introductory pricing of $2/$10 per million input/output tokens. That is close to Opus 4.8 performance at 40% of Opus's $5/$25 price (60% once standard $3/$15 pricing starts). It's now the best Claude most people can actually use, with Mythos 5 and Fable 5 still suspended under a US export-control order.
- GPT-5.6 previewed June 26 in three tiers: Sol (frontier reasoning), Terra (balanced), and Luna (cost-efficient). It adds two new reasoning modes, but access is restricted to government-vetted partners.
- Gemini 3.5 Flash went GA at Google I/O 2026 ($1.50/$9.00 per million tokens). Gemini Omni launched simultaneously as Google's first natively multimodal model.
- Open-source surged: DeepSeek V4-Pro (Apr 24), MiniMax M3 (Jun 1 — first open-weight triple-frontier model), GLM-5.2 (Jun 16), and Kimi K2.7 Code (Jun 12) all shipped.
- Stanford AI Index 2026: the US-China model gap has narrowed to ~2.7 points, which the report treats as effectively closed. US private AI investment hit $285.9B, 23x China's $12.4B, but Chinese models now match US counterparts on several key benchmarks.
The first half of 2026 has delivered an unprecedented wave of large language model releases: over 50 frontier and open-weight models shipped between January and June, with every major lab pushing upgrades within weeks of each other. If you blinked, you missed more AI model launches than shipped in all of 2023.
This is your complete field guide to every major launch, ranked by real-world impact, benchmarked where it matters, and contextualized with the Stanford HAI AI Index 2026 report that dropped in April.
June 2026: The Hottest Month Yet
Anthropic — Claude Sonnet 5 (June 30)
On June 30, Anthropic released Claude Sonnet 5, and it might be the most consequential launch of the month, not just for its specs but for its timing. The US Commerce Department ordered Anthropic to suspend Fable 5 and Mythos 5 on June 12 under a national security export-control directive, which leaves Sonnet 5 as the newest Claude generation that users can access.
The benchmarks tell the story: Sonnet 5 lands between Sonnet 4.6 and Opus 4.8 on most metrics, usually closer to Opus (SWE-bench Pro, where it sits near the midpoint, is the exception), and it edges ahead of Opus 4.8 on knowledge work (GDPval-AA v2).
| Benchmark | Sonnet 5 | Sonnet 4.6 | Opus 4.8 |
|---|---|---|---|
| Agentic coding (SWE-bench Pro) | 63.2% | 58.1% | 69.2% |
| Agentic coding (Terminal-Bench 2.1) | 80.4% | 67.0% | 82.7% |
| Reasoning w/o tools (Humanity's Last Exam) | 43.2% | 34.6% | 49.8% |
| Reasoning w/ tools (Humanity's Last Exam) | 57.4% | 46.8% | 57.9% |
| Computer use (OSWorld-Verified) | 81.2% | 78.5% | 83.4% |
| Knowledge work (GDPval-AA v2) | 1,618 | 1,395 | 1,615 |
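To make "between Sonnet 4.6 and Opus 4.8" concrete, here is a minimal Python sketch that computes how much of the Sonnet 4.6 to Opus 4.8 gap Sonnet 5 closes on each benchmark. The scores are copied from the table above; the `gap_closed` metric and the function names are our own illustration, not something Anthropic publishes.

```python
# Scores copied from the table above, in the order
# (Sonnet 4.6, Sonnet 5, Opus 4.8).
SCORES = {
    "SWE-bench Pro": (58.1, 63.2, 69.2),
    "Terminal-Bench 2.1": (67.0, 80.4, 82.7),
    "HLE (no tools)": (34.6, 43.2, 49.8),
    "HLE (with tools)": (46.8, 57.4, 57.9),
    "OSWorld-Verified": (78.5, 81.2, 83.4),
    "GDPval-AA v2": (1395, 1618, 1615),
}


def gap_closed(sonnet_46: float, sonnet_5: float, opus_48: float) -> float:
    """Fraction of the Sonnet 4.6 -> Opus 4.8 gap that Sonnet 5 covers.

    0.0 means no better than Sonnet 4.6, 1.0 means level with Opus 4.8,
    and anything above 1.0 means ahead of Opus 4.8.
    """
    return (sonnet_5 - sonnet_46) / (opus_48 - sonnet_46)


def gap_table() -> dict:
    """Gap fraction per benchmark, rounded to two decimals."""
    return {name: round(gap_closed(*row), 2) for name, row in SCORES.items()}


if __name__ == "__main__":
    for name, fraction in gap_table().items():
        print(f"{name:<20} {fraction:.2f}")
```

On these numbers Sonnet 5 closes a bit under half the gap on SWE-bench Pro, more than half on every other benchmark, and slightly more than all of it on GDPval-AA v2.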
At a standard price of $3/$15 per million input/output tokens (introductory $2/$10 through August 31), Sonnet 5 is 40% cheaper than Opus 4.8 ($5/$25). The catch: it uses an updated tokenizer that produces 1.0x to 1.35x as many tokens for the same input, depending on content. The introductory pricing is designed to keep the transition roughly cost-neutral until September.
Safety note: Anthropic deliberately did not train Sonnet 5 on cybersecurity tasks; its partial-success rate on exploit-generation tests was higher than Sonnet 4.6's. Anthropic directs security researchers to Opus 4.8 instead. We covered this space in our earlier article on GPT-5.5-Cyber: OpenAI's New Cybersecurity Model and Patch the Planet.
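The interaction between list price and tokenizer inflation is easy to check with a few lines of Python. This is a rough sketch under two assumptions of ours: that the quoted 1.35x figure is a worst case, and that it applies equally to input and output tokens. The function name `effective_prices` is hypothetical.

```python
# Prices in USD per million tokens as (input, output), from the paragraph above.
INTRO = (2.00, 10.00)
STANDARD = (3.00, 15.00)
OPUS_48 = (5.00, 25.00)

# Tokenizer inflation range quoted above. Treating 1.35 as a hard
# upper bound is an assumption.
INFLATION_RANGE = (1.0, 1.35)


def effective_prices(prices, inflation):
    """Cost per million tokens measured in the OLD tokenizer's units.

    If the same text now yields `inflation` times as many tokens, each
    million old-style tokens costs `inflation` times the list price.
    """
    return tuple(round(p * inflation, 2) for p in prices)


if __name__ == "__main__":
    for label, prices in (("intro", INTRO), ("standard", STANDARD)):
        for factor in INFLATION_RANGE:
            inp, out = effective_prices(prices, factor)
            print(f"{label:<9} x{factor:<5} -> ${inp:.2f} / ${out:.2f}")
```

Under those assumptions, worst-case inflation puts the introductory rate at an effective $2.70/$13.50, still below the $3/$15 standard list price. At standard pricing the same worst case works out to $4.05/$20.25, so the real discount against Opus 4.8's $5/$25 can be narrower than the headline 40% if Opus token counts are unchanged.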
OpenAI — GPT-5.6 Sol, Terra & Luna (June 26 Preview)
OpenAI responded to the June frenzy with its most tiered launch yet: GPT-5.6 arrives in three variants.
| Variant | Target Use Case | Pricing (per M tokens) | Reasoning Mode |
|---|---|---|---|
| Sol | Frontier reasoning, research | $5/$30 | Ultra reasoning |
| Terra | Balanced, everyday use | $2/$10 | Max reasoning |
| Luna | Cost-sensitive, high-throughput | $1/$6 | Standard reasoning |
The bigger story here is access control. OpenAI previewed GPT-5.6 exclusively to "government-vetted trusted partners" and US-allied organizations. This marks a new paradigm where frontier model access is gated by geopolitics, not payment. We covered this dynamic extensively in GPT-5.6 Sol, Terra & Luna: OpenAI's Next-Gen Model Family and the Government-Gated AI Era.
Google DeepMind — Gemini 3.5 Flash & Gemini Omni (May 19)
At Google I/O 2026, Google launched Gemini 3.5 Flash — priced at $1.50/$9.00 per million tokens with a 1 million token context window. It's the cheapest frontier-tier model currently shipping at that context length, with particularly strong multimodal reasoning performance. Gemini Omni is Google's first natively multimodal model, trained from scratch on text, images, audio, and video simultaneously — early benchmarks show it outperforms GPT-5.5 and Claude Opus 4.7 on audiovisual comprehension tasks by 8-12%.
We did a full head-to-head: Google Gemini 3.5 Flash vs GPT-5.5/5.6: The Great AI Model Showdown of 2026.
Q2 2026: The Breakneck Pace
April 2026 — The Foundation Wave
| Date | Model | Lab | Significance |
|---|---|---|---|
| Apr 7 | GLM-5.1 | Z.ai | Open-source Chinese frontier model |
| Apr 8 | Muse Spark | Meta | Meta's first image/video generation model |
| Apr 16 | Claude Opus 4.7 | Anthropic | Set new SOTA on reasoning benchmarks |
| Apr 17 | Grok 4.3 Beta | xAI | Real-time knowledge + reasoning upgrade |
| Apr 21 | Kimi K2.6 | Moonshot AI | 1M token context open-source model |
| Apr 23 | GPT-5.5 / 5.5-Pro | OpenAI | Major speed improvement over GPT-5.4 |
| Apr 24 | DeepSeek V4-Pro/Flash | DeepSeek | Open-weight MoE, 370B total params |
| Apr 29 | Mistral Medium 3.5 | Mistral | European frontier model, 256K context |
DeepSeek V4 deserves special attention. The Chinese lab's V4-Pro model uses Mixture-of-Experts with ~370B total parameters (37B active per token) and matches GPT-5.5 on several coding benchmarks while being fully open-weight — you can run it locally or self-host.
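Before planning to run it locally, it helps to estimate how much memory the weights alone would need. The sketch below is back-of-the-envelope arithmetic in Python using the parameter counts quoted above. The precision options are generic assumptions of ours, not formats DeepSeek has confirmed, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Parameter counts quoted in the paragraph above.
TOTAL_PARAMS = 370e9   # all experts combined
ACTIVE_PARAMS = 37e9   # parameters used per token

# Bytes per parameter at common precisions (generic assumption; the
# formats actually published for V4-Pro may differ).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


def active_fraction() -> float:
    """Share of the model's parameters that each token actually uses."""
    return ACTIVE_PARAMS / TOTAL_PARAMS


if __name__ == "__main__":
    for precision, size in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, size)
        print(f"{precision:<10} ~{gb:,.0f} GB of weights")
    print(f"active per token: {active_fraction():.0%}")
```

In a typical MoE deployment every expert has to stay resident, so memory scales with the 370B total while per-token compute scales with the 37B active. Even at 4-bit precision the weights come to roughly 185 GB, which makes "run it locally" a multi-GPU or large-memory workstation job.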
May 2026 — Anthropic's Peak (Before the Fall)
| Date | Model | Significance |
|---|---|---|
| May 19 | Gemini 3.5 Flash + Gemini Omni | Google's multimodal offensive |
| May 28 | Claude Opus 4.8 | #1 AI Index score (61.4) — first model above 60 |
Claude Opus 4.8 held the #1 global ranking for just 14 days before the export-control crisis sidelined Anthropic's top-tier models. It remains the most capable model you can pay for, scoring 69.2% on SWE-bench Pro and 82.7% on Terminal-Bench 2.1.
June 2026 — Open-Source Renaissance
| Date | Model | Significance |
|---|---|---|
| Jun 1 | MiniMax M3 | First open-weight triple-frontier model |
| Jun 9 | Claude Fable 5 | Released then suspended Jun 12 |
| Jun 12 | Kimi K2.7 Code | Specialized coding model, open-weight |
| Jun 16 | GLM-5.2 | Open-source, competitive with GPT-5.5 |
| Jun 26 | GPT-5.6 Sol/Terra/Luna | Three-tier government-gated preview |
| Jun 30 | Claude Sonnet 5 | Best available Claude, near-Opus performance |
Open-Source: The Silent Revolution
While frontier labs battle for benchmark supremacy, the open-source ecosystem has arguably made the most practical progress in 2026.
MiniMax M3 (June 1) is the first open-weight model to deliver three frontier capabilities simultaneously: text reasoning, multimodal understanding, and audio processing — all in a single model. Released on Hugging Face and GitHub, it's the most ambitious open-source release of the year.
DeepSeek V4 (April 24) — open-weight, MoE architecture, competitive with GPT-5.5 on coding, available in both Pro (high-quality reasoning) and Flash (speed-optimized) variants. DeepSeek continues to be the price-performance king of the open-source world.
GLM-5.2 (June 16) from Z.ai is the strongest Chinese open-source model on English and Chinese benchmarks combined, scoring competitively with GPT-5.5 on MMLU-Pro while requiring significantly less compute for inference.
Kimi K2.7 Code (June 12) from Moonshot AI matches GPT-5.5 on SWE-bench Lite while being fully open-weight — particularly strong on Chinese-language coding documentation.
The 2026 AI Model Comparison Table
| Model | Release | Input Price (per M tokens) | Output Price (per M tokens) | Context | Key Strength |
|---|---|---|---|---|---|
| Claude Opus 4.8 | May 28 | $5 | $25 | 200K | Highest raw intelligence (Index: 61.4) |
| Claude Sonnet 5 | Jun 30 | $2 intro / $3 standard | $10 intro / $15 standard | 1M | Best value: near-Opus at 40% of the cost (intro pricing) |
| GPT-5.6 Sol | Jun 26 (preview) | $5 | $30 | 200K | Ultra reasoning, government-gated |
| GPT-5.6 Terra | Jun 26 (preview) | $2 | $10 | 200K | Balanced, Max reasoning mode |
| GPT-5.6 Luna | Jun 26 (preview) | $1 | $6 | 200K | Cost-efficient standard reasoning |
| Gemini 3.5 Flash | May 19 | $1.50 | $9.00 | 1M | Cheapest frontier model, fast multimodal |
| DeepSeek V4-Pro | Apr 24 | Free (open-weight) | Free (open-weight) | 128K | Best open-source coding model |
| MiniMax M3 | Jun 1 | Free (open-weight) | Free (open-weight) | 256K | Triple-frontier (text+vision+audio) |
| GLM-5.2 | Jun 16 | Free (open-weight) | Free (open-weight) | 128K | Strongest Chinese open-source model |
| Kimi K2.7 Code | Jun 12 | Free (open-weight) | Free (open-weight) | 1M | Best open coding for Chinese frameworks |
| GPT-5.5 | Apr 23 | $3 | $15 | 200K | Solid mid-tier frontier |
| Mistral Medium 3.5 | Apr 29 | €2 | €10 | 256K | Best European frontier model |
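Per-token prices are hard to compare by eye, so here is a small Python sketch that turns the table into a monthly bill for a given workload. It covers only the USD-priced API models. It also assumes, unrealistically, that every model tokenizes the same text into the same number of tokens, which ignores the Sonnet 5 tokenizer change. The function names are our own.

```python
# USD per million tokens as (input, output), copied from the table above.
# Open-weight models and the EUR-priced Mistral row are left out.
PRICES = {
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 5 (intro)": (2.00, 10.00),
    "Claude Sonnet 5 (standard)": (3.00, 15.00),
    "GPT-5.6 Sol": (5.00, 30.00),
    "GPT-5.6 Terra": (2.00, 10.00),
    "GPT-5.6 Luna": (1.00, 6.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "GPT-5.5": (3.00, 15.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a workload, given raw token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_by_cost(input_tokens: int, output_tokens: int) -> list:
    """(cost, model) pairs sorted from cheapest to most expensive."""
    return sorted(
        (workload_cost(model, input_tokens, output_tokens), model)
        for model in PRICES
    )


if __name__ == "__main__":
    # Hypothetical monthly workload: 50M input tokens, 10M output tokens.
    for cost, model in rank_by_cost(50_000_000, 10_000_000):
        print(f"${cost:>7.2f}  {model}")
```

For that example workload the range runs from $110 (GPT-5.6 Luna) to $550 (GPT-5.6 Sol), with Gemini 3.5 Flash at $165 and Sonnet 5 at $200 on intro pricing. The GPT-5.6 tiers are preview-only, so Flash is the cheapest option here that is generally available.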
The Stanford AI Index 2026: Key Findings
The Stanford HAI AI Index 2026 Report (published April 2026) provides essential context for understanding this model release frenzy:
1. The US-China Gap Has Effectively Closed. The AI model performance gap between the US and China has narrowed to just ~2.7 points on the Chatbot Arena leaderboard. Models from Chinese labs — DeepSeek, MiniMax, Z.ai GLM, Moonshot AI Kimi — now match or exceed US counterparts on several key benchmarks. The report concludes the gap is "effectively closed."
2. Investment Divergence. US private AI investment hit a staggering $285.9 billion in 2025 — more than 23 times China's $12.4 billion. However, China gets significantly more capability per dollar through aggressive open-source strategy and efficient architectures.
3. Transparency Collapse. Transparency scores among frontier AI developers dropped from 58/100 to 40/100 in the past year. Labs are publishing less about training data, architectures, and safety evaluations — a trend the report calls "the transparency recession."
4. Organizational AI Adoption Hits 88%. 88% of organizations now report using AI in at least one business function, up from 72% the previous year. Agentic AI — autonomous systems executing multi-step workflows — is the fastest-growing adoption category.
5. Model Proliferation Accelerates. The number of notable AI models released globally doubled from 2025 to 2026. The count of over 50 significant models in the first half alone is this guide's own tally, since the report was published in April.
Source: Stanford HAI AI Index 2026 Report
The Big Questions
Who's Actually Winning?
- Raw intelligence: Claude Opus 4.8 — #1 AI Index score (61.4)
- Practical value: Claude Sonnet 5, near-Opus at 40% of Opus 4.8's price on intro pricing (60% at standard)
- Open-source: DeepSeek V4-Pro for coding, MiniMax M3 for multimodal
- Speed + price: Gemini 3.5 Flash — fastest frontier at lowest price
- Government access: GPT-5.6 Sol — but you can't have it
What Happens With the Export Controls?
The suspension of Claude Fable 5 and Mythos 5 under a US Commerce Department national security order has created a bizarre situation: Anthropic's best models are inaccessible to all customers worldwide. OpenAI's GPT-5.6 preview is likewise restricted to "government-vetted trusted partners," creating a two-tier access system for frontier AI that mirrors geopolitical alliances. Anthropic says its discussions with the administration, which could restore access, also covered the Sonnet 5 launch itself, but no date is confirmed.
What's Next?
- GPT-5.6 public release: Likely late July / August
- Gemini 3.5 Pro: Promised "next month" at Google I/O in May, so it is due any day now
- Claude Fable 5 return: Negotiations ongoing, no confirmed date
- DeepSeek V5: Rumored for late Q3 2026
- Next Meta Llama generation: Expected by end of 2026
How to Choose Your Model in Mid-2026
- For coding and agentic workflows: Claude Sonnet 5 — $2/$10 intro pricing, near-Opus coding scores, full Anthropic ecosystem
- For raw reasoning power: Claude Opus 4.8 — still the king for complex math and science
- For high-throughput production: Gemini 3.5 Flash — $1.50/$9.00 with 1M context
- For cost-sensitive self-hosting: DeepSeek V4-Pro or MiniMax M3, with no per-token API fees, though you pay for your own hardware and hosting
Published July 1, 2026. All benchmarks and pricing accurate as of publication date. The AI landscape changes weekly — check the sources below for latest updates.
External Sources:
- Stanford HAI AI Index 2026 Report — Comprehensive US-China AI landscape analysis
- AI Model Release Tracker — Continuously updated release timeline
- Artificial Analysis — Model Intelligence Index — Independent benchmark leaderboard
Further Reading on GetYourDozAi:
- Google Gemini 3.5 Flash vs GPT-5.5/5.6: The Great AI Model Showdown of 2026
- GPT-5.6 Sol, Terra & Luna: OpenAI's Next-Gen Model Family and the Government-Gated AI Era
- GPT-5.5-Cyber: OpenAI's New Cybersecurity Model and Patch the Planet
- What is RAG? Retrieval-Augmented Generation Explained Simply (2026)