New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray

01 Start here

Welcome byte

This week, the AI landscape saw a flurry of major releases, including xAI's Grok 4.20 Beta, Alibaba's Qwen 3.5, Zhipu AI's GLM-5, and Chile's Latam-GPT. Anthropic also unveiled Claude Sonnet 4.6, while Google previewed Gemini 3.1 Pro, setting new performance benchmarks.

02 Fast scan

Quick Hits

Our analysis delivers four quick hits on these critical developments, three deep dives into the most impactful launches, and several executive-grade AI prompts for immediate application in your organization.

Grok 4.20 Beta (xAI)

xAI launched Grok 4.20 Beta, a four-agent collaborative system with a 2M-token context window and a generation speed of 235 tokens per second. xAI credits multi-agent synthesis with cutting the hallucination rate to 4.2%.

Executive angle: Evaluate Grok 4.20 Beta for high-volume, truth-sensitive content generation, especially with its 2M token context window for complex synthesis tasks.
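
To put the quoted 235 tokens per second in practical terms, here is a minimal Python sketch that converts an output length into an estimated wait time. The function name and the sample output sizes are illustrative assumptions, and the estimate treats the quoted rate as steady, which real workloads may not match.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float = 235.0) -> float:
    """Estimate seconds needed to generate output_tokens at a steady rate.

    The 235 tokens/second default is the figure quoted above for Grok 4.20 Beta;
    everything else here is a hypothetical helper for back-of-envelope planning.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    # Sample output sizes are assumptions chosen for illustration only.
    for tokens in (500, 2_000, 10_000):
        print(f"{tokens:>6} tokens: about {generation_seconds(tokens):.1f} s")
```

At that rate, a 2,000-token draft would take roughly 8.5 seconds of generation time, before any network or queuing overhead.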

Qwen 3.5 (Alibaba)

Alibaba released Qwen 3.5 (397B parameters, 17B active), an open-weight model that beats Qwen3-Max on benchmarks while running 19x faster and costing 60% less.

Executive angle: Explore Qwen 3.5 for open-weight deployments requiring high performance, speed, and cost efficiency in general reasoning and coding.

GLM-5 (Zhipu AI)

Zhipu AI launched GLM-5, a 128B open-source model optimized for coding and agentic tasks, competitive with proprietary models.

03 Main moves

Top Updates

Upgrade to PromptHacker Premium for the full archive of executive AI playbooks.

04 Reuse this

Productivity Gem

Gemini 3.1 Pro: New Benchmark Ceiling at Half the Cost of Its Predecessor

94.3% on GPQA Diamond. Leads 13 of 16 benchmarks. $2 per million input tokens. The strongest cost-per-unit-of-reasoning argument in the frontier model market.
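
For budgeting, the quoted input price translates directly into a per-request figure. The Python sketch below is a rough estimate only: the function name and the sample request size are illustrative assumptions, and output-token pricing is left out because it is not quoted here.

```python
def input_cost_usd(input_tokens: int, usd_per_million: float = 2.0) -> float:
    """Estimate the input-side cost of a request in US dollars.

    The $2 per million input tokens default is the figure quoted above for
    Gemini 3.1 Pro; output-token charges are not included in this estimate.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    # A 250,000-token input is an assumed example size, not a figure from the issue.
    print(f"${input_cost_usd(250_000):.2f}")
```

Under these assumptions, a 250,000-token input would cost about $0.50 on the input side.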

05 Better week

Health Tip

A Smarter Health Log: Use Claude Projects as Your Persistent Health Journal

Tools: Claude Projects

The Prompt

This is not medical advice. Consult a qualified healthcare professional before making changes to your health routine.

AI health tip

Build a Health Log That Gets Smarter Over Time: Set Up Claude Projects as Your Persistent Health Journal

Three prompts and a 15-minute setup that turn Claude Projects into a longitudinal health tracking system, with monthly synthesis reviews that get more accurate as the log grows.

06 Family AI

Kids Tip

Presidents Day Debate: AI Literacy Activity for Kids

Tools: Any LLM

Try Asking

Core lesson:

This issue revealed a dynamic AI landscape, with Sonnet 4.6 setting new cost-performance benchmarks, Gemini 3.1 Pro pushing raw power, and an array of open-source models expanding access and innovation.

The executives who act this week will gain a competitive edge by leveraging these new models for developer productivity, cost efficiency, and culturally relevant AI deployments.

Forward this to one executive you respect. They will thank you next week.

Kids AI project

Presidents Day Debate: Which President Had the Hardest Job? A Post-Holiday AI Literacy Activity for Ages 8 to 16

How to teach children to evaluate AI arguments, not just accept them, using the one topic every kid spent at least part of Monday thinking about.

07 Closing note