Build a Three-Tier Subagent Workflow With GPT-5.4 Nano, Mini, and Standard

The three-tier cost architecture: when to use Nano, Mini, and Standard in the same workflow

March 18, 2026 5 min read

Key points

The Opportunity
The Routing Architecture
Workflow Example 1: Customer Support Ticket Triage
Workflow Example 2: Weekly Contract Review Summary
Implementing Without External Orchestration

What You'll Learn

The three-tier cost architecture: when to use Nano, Mini, and Standard in the same workflow
The verbatim routing prompt that classifies tasks before sending them to the right model
Two complete workflow examples with cost breakdowns showing the savings versus single-model approaches
How to implement the intake-analysis-output pattern without any external orchestration tools

The Opportunity

The release of GPT-5.4 Nano and Mini on March 17 created a complete three-tier cost architecture for AI workflows. Nano costs $0.20 per million input tokens. Mini costs $0.75 per million input tokens with a 400K context window. Standard costs $2.50 per million input tokens with a 1M context window.

The cost difference is significant enough to be an engineering requirement, not just a budget preference. Running every task through Standard when 70% of tasks do not require it means overpaying on most of your API spend: moving that 70% to Nano alone would cut the input-token bill by roughly 64%, assuming tasks of similar size. The multi-tier approach routes each task to the cheapest model capable of handling it.

The Routing Architecture

The three-tier subagent workflow has a single routing step that determines which model handles each incoming task. The router runs on Nano, the cheapest tier, and classifies the task into one of three buckets before any analysis begins.

The Routing Prompt (run on GPT-5.4 Nano):

You are a task classifier. Analyze the incoming task request and classify it into one of three tiers based on complexity and context requirements.

TIER-1 (NANO): Simple classification, tagging, routing, short summarization under 500 words, binary decisions, metadata extraction.

TIER-2 (MINI): Document analysis up to 200 pages, draft generation, research synthesis, question answering over a defined document set, meeting notes analysis, ticket resolution.

TIER-3 (STANDARD): Complex multi-step reasoning, M&A or legal document synthesis across hundreds of files, high-stakes client-facing output, tasks requiring domain expert quality review.

Incoming task: [TASK_DESCRIPTION]

Respond with only: TIER-1, TIER-2, or TIER-3

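The prompt itself runs in under 200 input tokens. At Nano pricing, that is at most about $0.00004 per routing call (200 tokens x $0.20/M), plus the tokens of the task description. The routing step is effectively free.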
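A router that answers in free text needs a defensive parser, because small models sometimes add punctuation or extra words. The following is a minimal sketch, not a definitive implementation: `parse_tier` and `routing_call_cost` are hypothetical helper names, and falling back to TIER-3 on an unreadable reply is an assumption made here (it favors quality over cost), not something the routing prompt specifies.

```python
import re

# Matches exactly TIER-1, TIER-2, or TIER-3 anywhere in the reply.
TIER_PATTERN = re.compile(r"\bTIER-([123])\b")

NANO_PRICE_PER_M = 0.20  # USD per million input tokens, as quoted in this article


def parse_tier(reply: str, default: str = "TIER-3") -> str:
    """Extract the tier label from the router's reply.

    Assumption: an unparseable reply is escalated to the most capable tier.
    """
    match = TIER_PATTERN.search(reply.upper())
    return f"TIER-{match.group(1)}" if match else default


def routing_call_cost(input_tokens: int = 200) -> float:
    """Input-token cost in USD of one routing call on Nano."""
    return input_tokens * NANO_PRICE_PER_M / 1_000_000


print(parse_tier(" tier-2.\n"))
print(f"${routing_call_cost():.5f} per routing call")
```

Escalating on a bad reply costs more per miss, but it avoids silently sending a hard task to the weakest model.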

Workflow Example 1: Customer Support Ticket Triage

An enterprise support team receives 500 tickets per day. Current workflow: every ticket is analyzed by GPT-5.3 Mini, which extracts the issue type and urgency and drafts a response.

Three-tier workflow:

Nano routing step: Classify each ticket into Simple (password reset, basic product question) vs. Complex (billing dispute, technical escalation, policy exception). Cost: 500 tickets x ~300 tokens x $0.20/M = $0.03/day
Nano resolution layer: Simple tickets (estimated 340 of 500) get Nano-generated responses using a structured template. Cost: 340 x ~600 tokens x $0.20/M = $0.04/day
Mini resolution layer: Complex tickets (160 of 500) get Mini analysis and draft responses. Cost: 160 x ~800 tokens x $0.75/M = $0.10/day

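Total daily cost: $0.17, counting input tokens only. A single-model baseline that runs all 500 tickets through Standard costs $0.60/day, which implies about 480 input tokens per ticket at $2.50/M. Savings: 72%.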
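The arithmetic above can be reproduced in a few lines. This is a sketch that uses only the figures quoted in this example (ticket counts, approximate tokens per call, input-token prices); the `cost` helper is a hypothetical name, and the $0.60 baseline is taken from the article as given rather than recomputed.

```python
NANO, MINI = 0.20, 0.75  # USD per million input tokens


def cost(calls: int, tokens_per_call: int, price_per_m: float) -> float:
    """Daily input-token cost in USD for one layer of the workflow."""
    return calls * tokens_per_call * price_per_m / 1_000_000


routing = cost(500, 300, NANO)          # classify every ticket
nano_resolution = cost(340, 600, NANO)  # simple tickets
mini_resolution = cost(160, 800, MINI)  # complex tickets

total = routing + nano_resolution + mini_resolution
baseline = 0.60  # all tickets on Standard, figure from the article
savings = 1 - total / baseline

print(f"routing ${routing:.2f}, nano ${nano_resolution:.2f}, mini ${mini_resolution:.2f}")
print(f"total ${total:.2f}/day, savings {savings:.0%}")
```

Output tokens are not included, so real bills will be somewhat higher on every tier.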

Workflow Example 2: Weekly Contract Review Summary

A procurement team reviews vendor contracts weekly. Currently, a legal assistant manually extracts key terms and deadlines from each contract, then drafts a summary report for the CPO.

Three-tier workflow:

Nano routing step: Classify contracts by length: short (under 20 pages), medium (20 to 80 pages), long (over 80 pages). Route accordingly.
Mini analysis layer: Short and medium contracts (up to 80 pages, approximately 64K tokens). Extract parties, term dates, renewal provisions, liability caps, and unusual terms. Cost per contract: ~70K tokens x $0.75/M = $0.05
Standard analysis layer: Long contracts (80+ pages, when synthesis requires cross-document reasoning). Load complete contract plus prior contracts for comparison. Cost per contract: ~200K tokens x $2.50/M = $0.50

For a weekly review of 20 contracts (15 medium, 5 long):

Total cost: (15 x $0.05) + (5 x $0.50) = $3.25/week. Time savings: 4 to 6 hours of manual extraction work replaced.
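The length-based routing and the weekly total can be sketched as follows. The function names and the sample page counts (40 and 120 pages) are illustrative assumptions; the 80-page cutoff, token estimates, and prices come from the example above.

```python
# Per-contract input cost in USD, from the token estimates in this example.
COST_PER_CONTRACT = {
    "mini": 70_000 * 0.75 / 1_000_000,       # short and medium contracts
    "standard": 200_000 * 2.50 / 1_000_000,  # long contracts plus prior contracts
}


def contract_tier(pages: int) -> str:
    """Route by page count: up to 80 pages goes to Mini, longer goes to Standard."""
    return "mini" if pages <= 80 else "standard"


def weekly_cost(page_counts: list) -> float:
    """Total input-token cost in USD for one week's batch of contracts."""
    return sum(COST_PER_CONTRACT[contract_tier(p)] for p in page_counts)


# Hypothetical batch: 15 medium contracts and 5 long ones.
batch = [40] * 15 + [120] * 5
print(f"${weekly_cost(batch):.2f}/week")
```

Without rounding the Mini cost down to $0.05 per contract, the total comes to about $3.29 rather than $3.25. The difference is immaterial next to the hours of manual work replaced.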

Implementing Without External Orchestration

The three-tier system does not require LangChain, AutoGen, or any orchestration framework. It requires a small script or a ChatGPT custom workflow that:

Takes the incoming task description as input
Sends it to the Nano routing prompt
Routes based on the TIER response
Passes the full task to the appropriate model tier
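The four steps above can be sketched as a single function. This is a minimal sketch under stated assumptions: the model identifiers are hypothetical placeholders, and `call_model` is an injected function standing in for whatever API client you use, so nothing here depends on a specific SDK. The stub `fake_call_model` exists only so the example runs offline.

```python
from typing import Callable, Tuple

# Hypothetical model identifiers; substitute the names your provider uses.
MODEL_FOR_TIER = {
    "TIER-1": "gpt-5.4-nano",
    "TIER-2": "gpt-5.4-mini",
    "TIER-3": "gpt-5.4",
}


def route_and_run(
    task: str,
    routing_prompt: str,
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Classify the task on the cheapest model, then run it on the chosen tier."""
    # Steps 1 and 2: insert the task into the routing prompt and send it to Nano.
    prompt = routing_prompt.replace("[TASK_DESCRIPTION]", task)
    reply = call_model(MODEL_FOR_TIER["TIER-1"], prompt)
    # Step 3: route on the reply; unknown replies escalate (an assumption).
    tier = reply.strip().upper()
    if tier not in MODEL_FOR_TIER:
        tier = "TIER-3"
    # Step 4: pass the full task to the selected model.
    return tier, call_model(MODEL_FOR_TIER[tier], task)


def fake_call_model(model: str, prompt: str) -> str:
    """Offline stand-in for a real API call."""
    if "Respond with only" in prompt:
        return "TIER-1" if "password" in prompt.lower() else "TIER-2"
    return f"[{model}] handled: {prompt}"


SHORT_PROMPT = "Incoming task: [TASK_DESCRIPTION] Respond with only: TIER-1, TIER-2, or TIER-3"
tier, answer = route_and_run("Reset the password for a locked account", SHORT_PROMPT, fake_call_model)
print(tier, answer)
```

Passing the client in as a function keeps the routing logic testable without network access, and swapping providers means changing only that one function.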

For teams not running code, a manual version works in ChatGPT: before running any AI task, first ask ChatGPT (using the routing prompt above) which tier the task falls into, then open the matching model. This adds about 30 seconds to task setup and, by this article's estimate, cuts cost by about 60% on average.

Action Steps

Start with the routing prompt. Copy it and run it on your next 20 incoming AI tasks. Track which tier each falls into. This gives you a realistic distribution before you rebuild any workflow.
Identify your top-volume repetitive workflow. Customer support triage, contract extraction, email classification, report summarization. This is your first multi-tier implementation target.
Calculate the current monthly cost of running that workflow entirely on Standard. Apply the three-tier distribution from your test (step 1) to get the projected new cost.
Implement the routing step first - before rebuilding the analysis steps. A routing layer on top of your existing workflow is additive and low-risk.
Benchmark output quality across all three tiers for your specific use case before deploying to production. Nano will fail on tasks that require nuanced judgment. That is expected: the routing step should catch those tasks before they reach Nano, so a Nano failure in production points to a routing error to fix.
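The first and third steps above (measure the tier distribution, then project the new cost) can be sketched as follows. The helper names and the sample figures (20 labeled tasks, a $1,000 monthly Standard bill) are illustrative assumptions. The projection also assumes that tasks use a similar number of tokens in every tier and ignores the small routing cost.

```python
from collections import Counter

# USD per million input tokens, as quoted in this article.
PRICE_PER_M = {"TIER-1": 0.20, "TIER-2": 0.75, "TIER-3": 2.50}


def tier_shares(labels: list) -> dict:
    """Fraction of test tasks that the router assigned to each tier."""
    counts = Counter(labels)
    return {tier: counts[tier] / len(labels) for tier in PRICE_PER_M}


def projected_monthly_cost(current_standard_cost: float, shares: dict) -> float:
    """Scale an all-Standard bill by the blended price implied by the shares."""
    blended = sum(PRICE_PER_M[tier] * share for tier, share in shares.items())
    return current_standard_cost * blended / PRICE_PER_M["TIER-3"]


# Hypothetical result of running the routing prompt on 20 tasks.
labels = ["TIER-1"] * 12 + ["TIER-2"] * 6 + ["TIER-3"] * 2
shares = tier_shares(labels)
projected = projected_monthly_cost(1000.0, shares)
print(shares)
print(f"projected ${projected:.2f}/month")
```

Twenty tasks is a small sample, so treat the projection as a rough planning number and rerun it once you have a few hundred routed tasks.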

Bottom line

The useful move is to run one narrow test this week, then keep only the workflow that saves time, improves a decision, or gives your team clearer output. Treat the announcement as raw material, not the win itself.

About the author

Pierre Bradshaw Founder, PromptHacker.ai

Pierre has spent 25+ years building growth systems across fintech, real estate, lending, campaigns, and AI workflows, with machine-learning work dating back to 2012.

If you have any questions or comments about this article, feel free to reach out. I'd love to hear from you.