PROMPTHACKER.AI
Issue 112 | New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray

This week, the AI landscape saw a flurry of major releases, including xAI's Grok 4.20 Beta, Alibaba's Qwen 3.5, Zhipu AI's GLM-5, and Chile's Latam-GPT. Anthropic also unveiled Claude Sonnet 4.6, while Google previewed Gemini 3.1 Pro, setting new performance benchmarks.

4 min read
Quick Scan

Format: Weekly newsletter issue
Audience: Executives using AI at work
Time: 4 min read
Topic: New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray

Welcome byte

This week, the AI landscape saw a flurry of major releases, including xAI's Grok 4.20 Beta, Alibaba's Qwen 3.5, Zhipu AI's GLM-5, and Chile's Latam-GPT. Anthropic also unveiled Claude Sonnet 4.6, while Google previewed Gemini 3.1 Pro, setting new performance benchmarks.

Our analysis delivers four quick hits on these critical developments, three deep dives into the most impactful launches, and several executive-grade AI prompts for immediate application in your organization.

01 / Quick Hits

Grok 4.20 Beta (xAI)

xAI launched Grok 4.20 Beta, a four-agent collaborative system with a 2M-token context window and a generation speed of 235 tokens per second. Its multi-agent synthesis reduced the hallucination rate to 4.2%.

Executive angle: Evaluate Grok 4.20 Beta for high-volume, truth-sensitive content generation, especially with its 2M token context window for complex synthesis tasks.
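
To put the quoted 235 tokens/second in practical terms, here is a rough back-of-the-envelope sketch. It assumes a steady generation rate and ignores time to first token, network latency, and any multi-agent coordination overhead, none of which the figures above cover. The function name and the sample output sizes are hypothetical.

```python
# Rough estimate only: assumes a constant generation rate.
# 235 tokens/second is the figure quoted in this issue; everything else is illustrative.
GENERATION_SPEED_TOKENS_PER_SEC = 235


def generation_seconds(output_tokens: int,
                       tokens_per_second: float = GENERATION_SPEED_TOKENS_PER_SEC) -> float:
    """Return the approximate seconds needed to generate `output_tokens` tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Hypothetical output sizes: a short memo, a long report, a very long draft.
    for tokens in (500, 5_000, 50_000):
        print(f"{tokens:>6,} output tokens -> about {generation_seconds(tokens):.1f} s")
```

Real-world throughput will vary with load and prompt size, so treat the result as a ceiling for planning rather than a guarantee.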

Qwen 3.5 (Alibaba)

Alibaba released Qwen 3.5 (397B parameters, 17B active), an open-weight model that beats Qwen3-Max on benchmarks while running 19x faster and costing 60% less.

Executive angle: Explore Qwen 3.5 for open-weight deployments requiring high performance, speed, and cost efficiency in general reasoning and coding.

GLM-5 (Zhipu AI)

Zhipu AI launched GLM-5, a 128B open-source model optimized for coding and agentic tasks, competitive with proprietary models.

Top Updates

Upgrade to PromptHacker Premium for the full archive of executive AI playbooks.

Productivity Gem

Gemini 3.1 Pro: New Benchmark Ceiling at Half the Cost of Its Predecessor

94.3% GPQA Diamond. 13 of 16 benchmarks led. $2/M input tokens. The strongest cost-per-unit-of-reasoning argument in the frontier model market.
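
To see what $2 per million input tokens means for a budget, here is a minimal sketch of the arithmetic. Only the $2/M input price comes from this issue. The token volumes are hypothetical, and output-token pricing is not covered here, so this is the input side of the bill only.

```python
# Assumption: the $2 per million input tokens figure is from this issue.
# The token volumes below are hypothetical, and output-token costs are not included.
INPUT_PRICE_PER_MILLION_USD = 2.00


def input_cost_usd(input_tokens: int,
                   price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Return the input-side cost in USD for a given number of input tokens."""
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical volumes: one long document, one million tokens, a month of heavy use.
    for tokens in (50_000, 1_000_000, 250_000_000):
        print(f"{tokens:>12,} input tokens -> ${input_cost_usd(tokens):,.2f}")
```

Swap in your own monthly token volume to get a first-pass input budget before requesting a full quote.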

Put it to work

Health Tip

A Smarter Health Log: Use Claude Projects as Your Persistent Health Journal

Tools: Claude Projects

The Prompt

This is not medical advice. Consult a qualified healthcare professional before making changes to your health routine.

Deep dive ->

06 / Kids Tip | Ages 8-16

AI health tip

Build a Health Log That Gets Smarter Over Time: Set Up Claude Projects as Your Persistent Health Journal

Three prompts and a 15-minute setup that turn Claude Projects into a longitudinal health-tracking system, with monthly synthesis reviews that get more accurate as the log grows.

Step-by-step guide

Kids Tip

Presidents Day Debate: AI Literacy Activity for Kids

Tools: Any LLM

Try Asking

Core lesson:

Deep dive ->

This issue revealed a dynamic AI landscape, with Sonnet 4.6 setting new cost-performance benchmarks, Gemini 3.1 Pro pushing raw power, and an array of open-source models expanding access and innovation.

The executives who act this week will gain a competitive edge by leveraging these new models for developer productivity, cost efficiency, and culturally relevant AI deployments.

Forward this to one executive you respect. They will thank you next week.

Kids AI project

Presidents Day Debate: Which President Had the Hardest Job? A Post-Holiday AI Literacy Activity for Ages 8 to 16

How to teach children to evaluate AI arguments - not just accept them - using the one topic every kid spent at least part of Monday thinking about.

Try the activity

Wrap Up

About the author

Pierre Bradshaw, Founder, PromptHacker.ai

Pierre has spent 25+ years building growth systems across fintech, real estate, lending, campaigns, and AI workflows, with $1.5B+ in client value delivered.

If you have any questions or comments about "New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray," feel free to reach out. I'd love to hear from you.

Contact Pierre
Free weekly briefing

Three deep dives. Four useful moves. One email worth opening.

PromptHacker turns the AI firehose into practical next steps for work, health, family, and everything time keeps trying to steal.