Automate Document Analysis Pipelines with Claude 3 Sonnet
A reusable prompt template turns any contract, report, or submission into structured JSON automatically via the Amazon Bedrock API.
What matters today
A reusable prompt template turns any contract, report, or submission into structured JSON automatically via the Amazon Bedrock API.
Key points
- The Core Prompt Template
- Why This Works
- Bedrock API Configuration
- Document Types and Expected Accuracy
- Scaling the Pipeline
What You'll Learn
- A reusable prompt template for structured document extraction
- How to configure Claude 3 Sonnet via the Bedrock API for production use
- Which document types respond best to this approach
Every executive function processes documents that follow predictable patterns: contracts, vendor proposals, board reports, customer submissions. Claude 3 Sonnet, called through the Amazon Bedrock API, can convert any of these into structured JSON automatically.
Manual document review is one of the highest-cost, lowest-leverage activities in most organizations. A legal ops team spending 40 hours per week reviewing contracts is a team that could be redirected to negotiation, strategy, or risk management. The pipeline described here handles the extraction layer.
This article includes a verbatim prompt template, configuration notes for the Bedrock API call, and guidance on which document types produce the highest-accuracy outputs.
SUBSCRIBER BREAK -- Premium Content Below
The Core Prompt Template
Paste this into your Bedrock API call as the system or user message, replacing the bracketed fields with your document type and required fields:
You are a document analyst. I will provide you with a [CONTRACT / REPORT / SUBMISSION]. Extract the following fields and return them as valid JSON: { "document_type": "", "key_parties": [], "effective_date": "", "termination_conditions": [], "financial_terms": [], "risk_flags": [], "summary": "" } If a field is not present in the document, return null for that field. Do not infer values not explicitly stated. Return only valid JSON with no surrounding explanation. Document: [PASTE DOCUMENT TEXT HERE]
Why This Works
The explicit null instruction is the most important element. Without it, Claude will generate plausible-sounding values for missing fields, which is dangerous in legal and financial contexts. "If not present, return null" forces the model to distinguish between "I found this" and "I could not find this."
The "Return only valid JSON" instruction eliminates the conversational wrapper that Claude sometimes adds before structured output. This makes the output directly parseable by downstream systems without text cleaning.
Bedrock API Configuration
- Model: anthropic.claude-3-sonnet-20240229-v1:0
- max_tokens: 2048 (sufficient for most document summaries; increase to 4096 for very long contracts)
- temperature: 0 (deterministic output for extraction tasks; do not use higher values)
- anthropic_version: bedrock-2023-05-31
Document Types and Expected Accuracy
High accuracy (90%+): Standard commercial contracts, vendor proposals with clear pricing tables, structured reports with defined sections, regulatory filings with standardized formats.
Medium accuracy (70-85%): Emails and correspondence, technical specifications, and informal board memos.
Lower accuracy (use with human review): Handwritten documents after OCR, legal documents with extensive cross-references, and documents where key terms appear only in exhibits not included in the text.
Scaling the Pipeline
- Store incoming documents in an S3 bucket (one prefix per document type).
- Trigger a Lambda function on new object creation.
- The Lambda reads the document text, constructs the Bedrock API call, and writes the JSON output to a second S3 prefix.
- Connect the output bucket to your data warehouse or document management system for downstream use.
At Haiku pricing ($0.25/M input tokens), processing 10,000 standard contracts costs approximately $25-50 depending on document length.
Three deep dives. Four useful moves. One email worth opening.
PromptHacker turns the AI firehose into practical next steps for work, health, family, and everything time keeps trying to steal.