Choose the right model for the right job — capabilities, cost and context window at a glance.
| Task / Metric | GPT-4.1 (OpenAI) | GPT-5 Chat (OpenAI) | GPT-5 Auto (OpenAI) | GPT-5 Reasoning (OpenAI) | O4 mini RL FT (OpenAI) | Sonnet 4.5 (Anthropic) | Sonnet 4.6 (Anthropic) | Opus 4.5 (Anthropic) |
|---|---|---|---|---|---|---|---|---|
| Status | Default | Stable | Preview | Preview | Exp. | Stable | Stable | Exp. |
| Quick Q&A | ■ | ■ | ◆ | · | ▲ | ■ | ■ | ◆ |
| Coding | ■ | ■ | ■ | ■ | ■ | ■ | ■ | ◆ |
| Long Writing | ◆ | ■ | ◆ | ▲ | · | ■ | ■ | ■ |
| ⬡ Deep Reasoning | ▲ | ◆ | ■ | ■ | ■ | ◆ | ◆ | ■ |
| ∑ Data / Math | ◆ | ◆ | ■ | ■ | ■ | ◆ | ◆ | ■ |
| Summarise Doc | ■ | ■ | ◆ | ▲ | ▲ | ■ | ■ | ■ |
| Speed |  |  |  |  |  |  |  |  |
| Input $/1M | $2.00 | $10.00 | Varies | $15.00 | TBD | $3.00 | $3.00 | $15.00 |
| Output $/1M | $8.00 | $30.00 | Varies | $60.00 | TBD | $15.00 | $15.00 | $75.00 |
| Context Window | 1M | 1M | 1M | 1M | 128K | 200K | 200K | 200K |
Prices per 1 million tokens via API. Subscription/UI usage is absorbed in your plan. Output tokens cost more — keep responses concise to save budget.
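The table's prices make the maths easy to sketch. The snippet below is a back-of-envelope cost estimator using the per-million-token rates above; the model keys are shorthand from this guide, not official API identifiers, and prices change, so always check the vendor's current pricing page.

```python
# Per-million-token prices from the table above: (input $/1M, output $/1M).
# Illustrative only -- verify against current vendor pricing.
PRICES = {
    "gpt-4.1": (2.00, 8.00),
    "gpt-5-chat": (10.00, 30.00),
    "gpt-5-reasoning": (15.00, 60.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.5": (15.00, 75.00),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single API call."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A 2,000-token prompt with a 1,000-token answer on GPT-4.1:
# input 2000 * 2 / 1e6 = $0.004, output 1000 * 8 / 1e6 = $0.008,
# i.e. the shorter answer is where the savings are.
```

Note how the output rate is 4-5x the input rate on every model, which is why capping response length matters more than trimming prompts.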
Start with GPT-4.1 (your default). Only switch if the task genuinely needs more depth. Most daily tasks don't require GPT-5 Reasoning or Opus.
Sonnet 4.6: same price as 4.5, newer model. If your picker shows both, choose 4.6; fall back to 4.5 only if 4.6 has an issue.
Opus 4.5 and RL FT O4 mini are experimental: high capability, but outputs may be inconsistent. Always review before sending to a client.
Auto-mode is convenient, but if it routes a simple question to Reasoning, you pay Reasoning prices. Use it only when you genuinely don't know the task's complexity.
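One way to avoid surprise Reasoning bills is a small routing guard in your own tooling: default to the cheap model and only escalate deliberately. This is a sketch under assumptions; the model names are this guide's shorthand, and the 200-word threshold is an arbitrary heuristic you'd tune for your workload.

```python
def pick_model(prompt: str, needs_reasoning: bool = False) -> str:
    """Route requests to the cheapest model that plausibly fits the task.

    Escalate to Reasoning only on an explicit flag; let Auto handle
    long prompts whose complexity is genuinely unknown.
    """
    if needs_reasoning:
        return "gpt-5-reasoning"
    # Heuristic (assumption): short prompts rarely need deep reasoning.
    if len(prompt.split()) < 200:
        return "gpt-4.1"
    return "gpt-5-auto"
```

The point of the guard is that escalation becomes an explicit decision in your code, not something the vendor's router decides on your budget's behalf.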
Across all models, output is the expensive part. Tell the model: "Be concise" or "Respond in under 300 words" on simple tasks to cut API spend significantly.
GPT-5 Auto and GPT-5 Reasoning are in Preview; pricing, behaviour, and availability can change without notice. Don't build client-deliverable pipelines on Preview-only models yet.
⚠ IMPORTANT: CMMC is a certification of your business practices, not your AI tool. However, any cloud tool used to process CUI must meet FedRAMP Moderate or higher under DFARS 252.204-7012. The table below shows what each vendor has certified — and which deployment tier is required.
| Framework | OpenAI / ChatGPT | Claude / Anthropic | M365 Copilot | Grok / xAI |
|---|---|---|---|---|
| SOC 2 Type II | ✓ Enterprise/Team/API only<br>Free & Plus: NOT covered | ✓ API & Claude for Work<br>Free/Pro: NOT covered | ✓ GCC / GCC High / Commercial<br>Covered under M365 SOC 2 | ~ Business & Enterprise tier only<br>Consumer: NOT covered |
| HIPAA / BAA | ✓ Enterprise + API (BAA available)<br>Must sign BAA explicitly | ✓ Claude for Work + API (BAA)<br>Or via Bedrock/Vertex BAA | ✓ All M365 tiers with BAA<br>GCC High recommended | — No BAA available<br>NOT suitable for PHI |
| FedRAMP (Moderate+) | ~ Via Azure OpenAI (FedRAMP High)<br>Direct OpenAI API: NOT FedRAMP | ✓ Claude for Gov (C4G): FedRAMP High<br>Also via Bedrock GovCloud IL4/5 | ~ GCC: FedRAMP High ✓<br>Commercial: NOT FedRAMP for CUI | — No FedRAMP authorization<br>(DoD pilot separate, not certified) |
| NIST 800-171 / CMMC L2 | ~ Azure OpenAI (FedRAMP) meets req.<br>Consumer API: do NOT use for CUI | ✓ C4G or Bedrock GovCloud meets<br>FedRAMP Moderate req. for CUI | ~ GCC High: CMMC L2/L3 ✓<br>Commercial M365: NOT compliant | — No CMMC compliance path<br>Do NOT use for CUI/FCI |
| ISO 27001 | ✓ 27001 + 27017 + 27018 + 27701<br>API & Enterprise tiers | ✓ SOC 2 Type II attested<br>ISO certs via AWS/GCP infrastructure | ✓ 27001 + 27017 + 27018<br>All M365 tiers | — No ISO certification<br>Enterprise tier only has SOC 2 |
| GDPR / CCPA | ✓ DPA available · EU data residency options via Azure EU regions | ✓ DPA available · GDPR compliant<br>Most privacy-forward by default | ✓ Full GDPR / CCPA compliance<br>EU data boundary available | ~ GDPR/CCPA claimed (Business+)<br>DPC (Ireland) investigation ongoing |
| DoD IL4 / IL5 | ~ Via Azure Government OpenAI<br>Not via direct OpenAI API | ✓ Bedrock GovCloud: IL4 + IL5<br>AWS Secret region: IL6 | ~ GCC High: IL5 ✓<br>Copilot in GCC High: limited feature set | — No IL authorization<br>(DoD GenAI.mil pilot ≠ certification) |
**OpenAI / ChatGPT:** Enterprise/API tier covers SOC 2 Type II, HIPAA BAA, and ISO 27001, which is solid for most enterprise use. For CUI/CMMC, you must deploy via Azure OpenAI Government, not the direct API. Free and Plus tiers have zero compliance coverage; never use them for client data.
**Claude / Anthropic:** Strongest compliance path of the group: SOC 2 Type II, HIPAA BAA, FedRAMP High via Claude for Government (C4G), DoD IL4/5 via Bedrock GovCloud, and IL6 via AWS Secret. For CMMC CUI work, use C4G or Bedrock, not claude.ai directly.
**M365 Copilot:** Tier matters enormously. Commercial M365 is NOT CMMC compliant. GCC = FedRAMP High, CMMC L2 (non-ITAR). GCC High = FedRAMP High, DFARS 7012, CMMC L2/L3, ITAR. Copilot on GCC High inherits those authorizations, but its feature set is reduced versus commercial.
**Grok / xAI:** Enterprise tier has only SOC 2 + GDPR/CCPA. No FedRAMP, no HIPAA BAA, no CMMC compliance path, no IL authorization. The DoD GenAI.mil pilot is a political deployment, not a certified authorization, and an EU DPC investigation is ongoing. Do not use Grok for any regulated client data.
- **No compliance coverage:** free/consumer tiers of any AI tool (ChatGPT Free/Plus, Claude Free/Pro, Grok consumer).
- **SOC 2 only:** ChatGPT Enterprise, Claude for Work API, Grok Business/Enterprise. Adequate for most commercial MSP work, not for government CUI.
- **FedRAMP Moderate+ / CMMC-eligible:** Claude for Government (C4G), Claude via Bedrock GovCloud, Azure OpenAI Government, M365 GCC or GCC High.
- **DoD IL5 / ITAR:** M365 GCC High, Claude via Bedrock GovCloud IL5, Azure OpenAI GovCloud only.
Remember: CMMC certifies your organisation's controls, not the tool. A FedRAMP-authorized tool is required infrastructure — but your SSP, policies, and audit evidence are what get certified.
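The tier groupings above can be encoded as a simple policy check so engineers don't have to memorise the matrix. This is an illustrative sketch, not audit evidence: the classification labels and deployment keys are shorthand invented for this example, and your own SSP and tooling inventory are the source of truth.

```python
# Hypothetical mapping of data classification -> deployments cleared for it,
# mirroring the tier summary above. Adapt the keys to your own inventory.
ALLOWED_DEPLOYMENTS = {
    # Commercial client data: SOC 2-covered business tiers and above.
    "commercial-client": {
        "chatgpt-enterprise", "claude-for-work", "grok-enterprise",
        "azure-openai-gov", "c4g", "bedrock-govcloud", "m365-gcc", "m365-gcc-high",
    },
    # CUI: FedRAMP Moderate+ / CMMC-eligible deployments only.
    "cui": {
        "azure-openai-gov", "c4g", "bedrock-govcloud", "m365-gcc", "m365-gcc-high",
    },
    # ITAR / IL5: GCC High, Bedrock GovCloud IL5, Azure OpenAI GovCloud only.
    "itar": {
        "m365-gcc-high", "bedrock-govcloud-il5", "azure-openai-govcloud",
    },
}

def is_allowed(classification: str, deployment: str) -> bool:
    """True if the deployment is cleared for this data classification."""
    return deployment in ALLOWED_DEPLOYMENTS.get(classification, set())
```

A check like this belongs in whatever gateway or wrapper your technicians use to reach AI tools, so the "wrong tier for CUI" mistake is caught in code rather than in an audit.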