Quick Scan
What matters today
This week, the AI landscape saw a flurry of major releases, including xAI's Grok 4.20 Beta, Alibaba's Qwen 3.5, Zhipu AI's GLM-5, and Chile's Latam-GPT. Anthropic also unveiled Claude Sonnet 4.6, while Google previewed Gemini 3.1 Pro.
Format: Weekly newsletter issue
Audience: Executives using AI at work
Time: 4 min read
Topic: New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray
01 Start here
Welcome byte
This week, the AI landscape saw a flurry of major releases, including xAI's Grok 4.20 Beta, Alibaba's Qwen 3.5, Zhipu AI's GLM-5, and Chile's Latam-GPT. Anthropic also unveiled Claude Sonnet 4.6, while Google previewed Gemini 3.1 Pro, setting new performance benchmarks.
02 Fast scan
Quick Hits
Our analysis delivers four quick hits on these critical developments, three deep dives into the most impactful launches, and several executive-grade AI prompts for immediate application in your organization.
#1 Grok 4.20 Beta (xAI)
xAI launched Grok 4.20 Beta, a 4-agent collaborative system with a 2M token context window and a generation speed of 235 tokens/second. Multi-agent synthesis reduced its hallucination rate to 4.2%.
Executive angle: Evaluate Grok 4.20 Beta for high-volume, truth-sensitive content generation, especially with its 2M token context window for complex synthesis tasks.
#2 Qwen 3.5 (Alibaba)
Alibaba released Qwen 3.5 (397B parameters, 17B active), an open-weight model that beats Qwen3-Max on benchmarks while running 19x faster and costing 60% less.
Executive angle: Explore Qwen 3.5 for open-weight deployments requiring high performance, speed, and cost efficiency in general reasoning and coding.
#3 GLM-5 (Zhipu AI)
Zhipu AI launched GLM-5, a 128B open-source model optimized for coding and agentic tasks, competitive with proprietary models.
03 Main moves
Top Updates
Upgrade to PromptHacker Premium for the full archive of executive AI playbooks.
04 Reuse this
Productivity Gem
Gemini 3.1 Pro: New Benchmark Ceiling at Half the Cost of Its Predecessor
94.3% GPQA Diamond. 13 of 16 benchmarks led. $2/M input tokens. The strongest cost-per-unit-of-reasoning argument in the frontier model market.
05 Better week
Health Tip
A Smarter Health Log: Use Claude Projects as Your Persistent Health Journal
Tools: Claude Projects
The Prompt
This is not medical advice. Consult a qualified healthcare professional before making changes to your health routine.
AI health tip
Build a Health Log That Gets Smarter Over Time: Set Up Claude Projects as Your Persistent Health Journal
Three prompts and a 15-minute setup that turn Claude Projects into a longitudinal health tracking system - with monthly synthesis reviews that get more accurate as the log grows.
06 Family AI
Kids Tip
Presidents Day Debate: AI Literacy Activity for Kids
Tools: Any LLM
Try Asking
Core lesson:
This issue revealed a dynamic AI landscape, with Sonnet 4.6 setting new cost-performance benchmarks, Gemini 3.1 Pro pushing raw power, and an array of open-source models expanding access and innovation.
The executives who act this week will gain a competitive edge by leveraging these new models for developer productivity, cost efficiency, and culturally relevant AI deployments.
Forward this to one executive you respect. They will thank you next week.
Kids AI project
Presidents Day Debate: Which President Had the Hardest Job? A Post-Holiday AI Literacy Activity for Ages 8 to 16
How to teach children to evaluate AI arguments - not just accept them - using the one topic every kid spent at least part of Monday thinking about.
About the author
Pierre Bradshaw, Founder, PromptHacker.ai
Pierre has spent 25+ years building growth systems across fintech, real estate, lending, campaigns, and AI workflows, with $1.5B+ in client value delivered.
If you have any questions or comments about "New Models Reshape the AI Landscape: Sonnet Overtakes Opus, Gemini 3.1 Pro Sets New Benchmarks, and Grok 4.20 Enters the Fray," feel free to reach out. I'd love to hear from you.
Contact Pierre