The AI Trust Framework

AI, Honestly — Episode 001 Companion

The AI Trust Framework: A Starting Point

A practical guide for organizations deploying AI — built around one question: how do you know when to trust it?

Published March 2026  ·  aihonestly.com  ·  Free to share and adapt with attribution

Amazon's SVP sent a mandatory email to his engineering org in March 2026. The subject, in plain language: AI-assisted code had been causing high-stakes outages, and they hadn't built the right protocols for when to trust it and when to add a human checkpoint.

Amazon is not a cautionary tale about AI being bad. It is a clear signal: deploy AI without a trust calibration framework, and you end up building one after an incident instead of before it.

"The question isn't whether to trust AI. It's whether you've built the protocols to know when to trust it — and when to take the controls back."

— Kyle, Episode 001  ·  AI, Honestly

This document is a starting point. It is not a compliance framework or a regulatory guide. It is a set of five components that any organization can implement — the equivalent of the aviation industry's checklist system, built for AI-assisted decisions. Use it as a foundation and adapt it to your context.

Component 1 of 5

Decision Classification

Sort every AI-assisted decision before deployment, not after the first incident.

The first job is simple but almost nobody does it: classify every decision your organization might make with AI assistance by the stakes involved and how reversible the outcome is. This determines which tier of human oversight is required.

Tier 1  ·  Low stakes / Reversible
AI Decides: No Review Required
The AI can act autonomously. The consequence of an error is small, easily corrected, and doesn't create downstream risk.
Examples: content recommendations, search ranking, auto-categorization of low-value items, scheduling suggestions

Tier 2  ·  Medium stakes / Mostly reversible
AI Recommends: Human Confirms
The AI produces a recommendation or draft. A human reviews and approves before it takes effect. The human can catch errors before they propagate.
Examples: code deployment (Amazon's new policy), customer communications, contract summaries, pricing adjustments within bands

Tier 3  ·  High stakes / Difficult to reverse
AI Informs: Human Decides
The AI provides analysis and options. The human makes the final decision. AI output is input to the decision, not the decision itself.
Examples: hiring decisions, safety-critical system changes, medical flagging, significant financial decisions, legal document review

Tier 4  ·  Highest stakes / Irreversible
AI Not Involved
Some decisions should not involve AI at this stage of development. The blast radius of an error is too large, and the evidence base for AI reliability in this specific context does not yet exist.
Examples: (define these for your organization: which decisions are simply off the table?)
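
The tier assignment above can be sketched as a small lookup. This is a minimal illustration, not part of the framework itself: the `Level` ratings, the `classify` function, and the rule that the stricter of the two ratings decides the tier are all assumptions, and your organization may weigh stakes and reversibility differently.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical 1-4 rating, used for both stakes and irreversibility."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    HIGHEST = 4


# Oversight mode for each tier, taken from the tier definitions above.
OVERSIGHT = {
    1: "AI decides, no review required",
    2: "AI recommends, human confirms",
    3: "AI informs, human decides",
    4: "AI not involved",
}


def classify(stakes: Level, irreversibility: Level) -> int:
    """Return a tier from 1 to 4. Assumption: the stricter rating wins."""
    return int(max(stakes, irreversibility))


# Example: medium stakes, easily reversed, so a human confirms.
tier = classify(Level.MEDIUM, Level.LOW)
print(tier, OVERSIGHT[tier])
```

Taking the stricter rating is a deliberately conservative choice: a low-stakes decision that cannot be undone still lands in a tier with human involvement.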

The Amazon Lesson

Amazon was implicitly letting junior engineers work at Tier 1 on decisions that should have been Tier 2. The fix, requiring senior engineer sign-off on AI-assisted code, is a Tier 2 classification applied retroactively. The framework exists so that classification happens before deployment, not after the outage.

Component 2 of 5

Trust Calibration Criteria

For any given AI use case, answer these six questions before deployment.

Tier classification tells you what oversight level is needed. Trust Calibration Criteria tell you whether your current evidence base actually justifies the trust level you're extending. These should be answered by the people deploying the AI — not assumed.

Question: What is the evidence base for this AI's performance in this specific domain?
Why it matters: General capability benchmarks don't predict performance on your specific data and decisions.
Red flags: "It performs well generally" without domain-specific testing

Question: What are the known failure modes, and how does the system signal when it's in one?
Why it matters: AI systems that perform confidently in failure modes are the most dangerous. The signal is the gap.
Red flags: No known failure characterization; system provides no uncertainty signal

Question: What is the blast radius if this AI is wrong?
Why it matters: Small errors in low-blast-radius decisions are learning. Large errors in high-blast-radius decisions are incidents.
Red flags: Blast radius is large, poorly defined, or dependent on downstream systems

Question: How reversible is the decision if the AI output is wrong?
Why it matters: Reversibility determines how much tolerance you have for error, and how quickly you need to catch mistakes.
Red flags: Decision is irreversible or reversal is costly and time-consuming

Question: Who on the human side is equipped to evaluate the AI's output?
Why it matters: "User error" is often "system deployed to users who couldn't evaluate it." The human checkpoint is only meaningful if the human can actually check.
Red flags: Human reviewers lack the expertise to meaningfully evaluate AI output

Question: How will you know if trust calibration needs to change?
Why it matters: Trust calibration is not a one-time decision. It requires a feedback loop to update as conditions change.
Red flags: No monitoring, no incident reporting, no scheduled review
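
One way to keep the six answers from being assumed is to record them as explicit yes/no fields and list whatever comes back "no". The sketch below is illustrative only: the `CalibrationReview` class and its field names are hypothetical, and a real review would attach evidence to each answer rather than a bare boolean.

```python
from dataclasses import dataclass, fields


@dataclass
class CalibrationReview:
    """One yes/no answer per calibration question (hypothetical field names)."""
    has_domain_specific_evidence: bool
    failure_modes_characterized: bool
    blast_radius_defined: bool
    decision_reversible: bool
    reviewers_can_evaluate: bool
    feedback_loop_in_place: bool


def red_flags(review: CalibrationReview) -> list[str]:
    """Return the name of every criterion that was answered 'no'."""
    return [f.name for f in fields(review) if not getattr(review, f.name)]


review = CalibrationReview(
    has_domain_specific_evidence=True,
    failure_modes_characterized=False,
    blast_radius_defined=True,
    decision_reversible=True,
    reviewers_can_evaluate=True,
    feedback_loop_in_place=False,
)
print(red_flags(review))
```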

Component 3 of 5

Human Override Protocol

Define exactly when a human takes control — before anyone needs to do it under pressure.

The aviation industry's single most important contribution to autopilot safety wasn't the autopilot. It was the precise definition of when the pilot is flying and when the machine is flying — and exactly what the handoff between those states looks like.

Your AI deployments need the same thing. The override protocol answers three questions:

1. Under what conditions does a human override the AI?

Define specific triggers — not "when something seems wrong" but precisely: when output falls outside a defined confidence range, when certain data conditions are present, when a decision crosses a defined threshold. Make it concrete enough that any person in the role can apply it without judgment calls under pressure.

2. Who has override authority, and are they in the room?

Override authority should sit with the person who has both the expertise to evaluate the AI's output and the organizational standing to act on that evaluation. This is often not the same person using the AI. Define it explicitly, including escalation paths.

3. What does the handoff look like?

When a human takes control from an AI, what happens? Is there a documented transition? Does the AI system log the override? Is there a review of why the override happened? This is how you turn individual override events into institutional learning.

Don't leave vague

  • "Use your judgment"
  • "If something looks off"
  • "When in doubt, escalate"
  • Undefined confidence thresholds
  • No escalation path specified

Make it concrete

  • Specific confidence score thresholds
  • Named decision conditions that trigger review
  • Named individuals with override authority
  • Defined escalation chain
  • Logged transition with timestamp and reason
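
The "make it concrete" items can be written down as data instead of prose. The following is a rough sketch under stated assumptions: `OverridePolicy`, the confidence and amount thresholds, and the role names are all hypothetical placeholders, and the numbers are not recommendations.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OverridePolicy:
    """Hypothetical override triggers for one deployment."""
    min_confidence: float              # below this, a human takes over
    max_amount: float                  # above this, a human takes over
    override_authority: str            # named role or person
    escalation_chain: tuple[str, ...]  # who is next if the authority is absent


def override_reasons(policy: OverridePolicy, confidence: float, amount: float) -> list[str]:
    """Return every trigger that fired; an empty list means the AI output stands."""
    reasons = []
    if confidence < policy.min_confidence:
        reasons.append(f"confidence {confidence:.2f} below {policy.min_confidence:.2f}")
    if amount > policy.max_amount:
        reasons.append(f"amount {amount:.2f} above {policy.max_amount:.2f}")
    return reasons


def log_override(policy: OverridePolicy, reasons: list[str]) -> dict:
    """Build the handoff record: who took control, when, and why."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "overridden_by": policy.override_authority,
        "reasons": reasons,
    }


policy = OverridePolicy(0.90, 10_000.0, "senior-reviewer", ("team-lead", "director"))
reasons = override_reasons(policy, confidence=0.72, amount=2_500.0)
if reasons:
    print(log_override(policy, reasons))
```

Because the triggers are plain comparisons, anyone in the role can apply them under pressure, and each logged record carries the reason needed for later review.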

Component 4 of 5

Incident Attribution

Define how you'll assign responsibility before the incident happens — not after.

When an AI-assisted decision leads to a bad outcome, there are three possible attributions: AI error, human error, or system design error. How you attribute the failure determines whether you build a protocol or just blame the person who was closest to it.

Amazon's "user error, not AI" framing of its Kiro tool incidents shows what happens without this framework. When you attribute failures to user error, you repair user behavior. When you attribute them to AI error, you repair the model. When you attribute them to system design, you repair the deployment model. Getting the attribution right is how you learn the right lesson.

Attribution type: AI Error
Definition: The AI produced an output that was incorrect, incomplete, or confidently wrong on a well-defined task within its specified domain.
Response: Update model, adjust training data, flag failure mode in calibration criteria, raise tier if needed

Attribution type: Human Error
Definition: The AI produced a reasonable output. The human reviewer failed to evaluate it correctly, given the tools and expertise available to them.
Response: Review training, review access to appropriate expertise, assess whether human oversight was meaningful

Attribution type: System Design Error
Definition: The AI produced output it was designed to produce. The human reviewer behaved as expected. The system was deployed in a way that made failures likely: wrong tier, missing override protocol, mismatched expertise.
Response: Revise deployment model, reassign tier, implement or strengthen override protocol. This is the most common and most underdiagnosed category.

The Key Question

After every incident: "If we deploy this system identically tomorrow, is the same outcome likely?" If yes, you have a system design problem, regardless of how you've attributed individual errors. The protocol should prevent recurrence — not just document what happened.
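
The three definitions and the key question can be sketched as a short decision rule. Treat this as one possible reading, not the framework's official procedure: the `attribute` function, its three boolean inputs, and the order of the checks are assumptions made for illustration.

```python
from enum import Enum


class Attribution(Enum):
    AI_ERROR = "AI error"
    HUMAN_ERROR = "Human error"
    SYSTEM_DESIGN_ERROR = "System design error"


def attribute(ai_output_sound: bool, reviewer_equipped: bool,
              likely_to_recur_unchanged: bool) -> Attribution:
    """Classify an incident from three yes/no findings (hypothetical rule)."""
    # The key question comes first: if an identical deployment would likely
    # fail the same way again, the deployment model is the problem.
    if likely_to_recur_unchanged:
        return Attribution.SYSTEM_DESIGN_ERROR
    if not ai_output_sound:
        return Attribution.AI_ERROR
    if reviewer_equipped:
        return Attribution.HUMAN_ERROR
    # Sound output, but the reviewer lacked the tools or expertise to check it.
    return Attribution.SYSTEM_DESIGN_ERROR


result = attribute(ai_output_sound=True, reviewer_equipped=False,
                   likely_to_recur_unchanged=False)
print(result.value)
```

Checking recurrence first reflects the point above that a repeatable outcome is a design problem regardless of how the individual error is labeled.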

Component 5 of 5

Audit and Evolution

Trust calibration is not a one-time decision. It requires a feedback loop.

AI systems change. Data distributions shift. New failure modes surface. The risk environment evolves. A trust framework that was accurate at deployment will drift from reality unless it's actively maintained.

Minimum Viable Audit Requirements

Log AI-assisted decisions at Tier 2 and above. At minimum: timestamp, decision type, AI output, human action, outcome. This is how you build the dataset you need to evaluate whether trust calibration is holding.

Schedule a trust calibration review at least annually, and after any significant incident. Re-answer the six Trust Calibration Criteria questions for each deployment. Have conditions changed? Has your evidence base been updated?

Track override rates. If humans are overriding the AI frequently, the system may have more autonomy than the evidence supports and may need a higher tier. If humans never override, they may not be reviewing meaningfully, or the system may have improved enough to warrant a lower tier.

Conduct attribution analysis quarterly. Review recent incidents, apply the attribution framework, and look for patterns. One system design error is an incident. Three system design errors in the same deployment is a signal that the framework needs adjustment.

Document tier reclassifications. When a deployment moves up or down a tier, record why. This is how you build institutional knowledge about how trust calibration evolves over time — and how you make the case for future decisions.
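
The logging and override-rate requirements above can be sketched with a single record type. This is a minimal sketch under assumptions: the `DecisionRecord` fields mirror the minimum log fields listed above, while the `"accepted"`/`"overridden"` values and the 30% threshold are hypothetical placeholders to be tuned per deployment.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    """Minimum log entry for a Tier 2+ decision."""
    timestamp: datetime
    decision_type: str
    ai_output: str
    human_action: str  # assumed values: "accepted" or "overridden"
    outcome: str


def override_rate(records: list[DecisionRecord]) -> float:
    """Fraction of logged decisions where the human overrode the AI."""
    if not records:
        return 0.0
    overridden = sum(1 for r in records if r.human_action == "overridden")
    return overridden / len(records)


def review_signal(rate: float, high: float = 0.30) -> str:
    """Turn an override rate into a prompt for the calibration review."""
    if rate == 0.0:
        return "no overrides: check that review is meaningful"
    if rate > high:
        return "frequent overrides: revisit the tier assignment"
    return "within expected range"


now = datetime.now(timezone.utc)
records = [
    DecisionRecord(now, "pricing", "raise 2%", "accepted", "ok"),
    DecisionRecord(now, "pricing", "raise 9%", "overridden", "ok"),
    DecisionRecord(now, "pricing", "hold", "accepted", "ok"),
    DecisionRecord(now, "pricing", "lower 1%", "accepted", "ok"),
]
rate = override_rate(records)
print(f"{rate:.0%}", review_signal(rate))
```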

Quick Reference

The Pre-Deployment Checklist

Before any AI-assisted decision system goes live, answer every item below.

AI Trust Framework — Pre-Deployment Review

Complete before any new AI-assisted workflow goes live. Re-run after significant changes.

Component 1 — Decision Classification

This deployment has been assigned a Tier (1–4) based on stakes and reversibility

The tier assignment is documented and the rationale is recorded

Stakeholders who will use this system know what tier it is and what that means

Component 2 — Trust Calibration

We have domain-specific performance evidence (not just general benchmarks)

Known failure modes are documented, and the system's uncertainty signal is characterized

Blast radius of an error is defined and acceptable for the assigned tier

Human reviewers have the expertise to meaningfully evaluate AI output

Component 3 — Override Protocol

Override triggers are defined specifically (not "use your judgment")

Override authority is assigned to a named role with appropriate expertise

Override events are logged with timestamp and reason

Component 4 — Incident Attribution

Attribution framework is in place — team knows the difference between AI error, human error, and system design error

Incident reporting process exists and is accessible to everyone who uses this system

Component 5 — Audit and Evolution

Decision logging is in place for Tier 2 and above

First calibration review is scheduled (no later than 12 months after deployment)

Override rate tracking is configured