Behavioral Orchestration
A framework for how AI should behave in safety-critical environments — when to surface, when to wait, when to escalate, when to stay quiet.
Context & origin
It didn't begin as a framework — it began as fifteen years of the same problem in different forms: how AI, automation, and ambient intelligence should behave when they share a surface with a person whose attention is finite and whose decisions carry weight. At Google it was governing behavior across a product surface used by billions; at Amazon Lab126, ambient devices and the Widget Drawer surfacing without demanding attention; at 42dot, in-vehicle AI deciding whether to suggest, confirm, escalate, or stay silent where the cost of error is measured in seconds. I named and structured the synthesis into a whitepaper, written independently after my most recent role — before I had the chance to build the product version.
Four scenarios · Four outcomes
For each scenario, the system reads an inbound event along five dimensions, then returns a decision and the reason for it.
What the system reads
- Event importance
- Urgency
- Driver activity
- Attention available
- Reversibility
The framework’s job is not to deliver. It’s to decide whether to.
The Decision Moment
Same event. Two contexts. Two outcomes.
In the article's vocabulary: criticality ≈ what the event is worth + how urgent it is; workload ≈ what the person is doing + how much attention the moment can bear; timing/authority ≈ what's reversible.
Context A · Surface
→ Surface
Ephemeral banner. Voice reply offered. Dismissible.
Context B · Suppress
→ Suppress, defer
Held silently. Re-surfaces at next low-workload window. Driver never interrupted.
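The two contexts above can be sketched in a few lines of Python. This is a minimal illustration, not the whitepaper's implementation: the `Event` and `Orchestrator` names, the 0.0 to 1.0 workload scale, the `LOW_WORKLOAD` threshold, and the rule that C4 events always surface are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field

# Assumed cut-off for a "low-workload window"; not a value from the whitepaper.
LOW_WORKLOAD = 0.4


@dataclass
class Event:
    label: str
    criticality: int  # 1..4, mirroring levels C1..C4


@dataclass
class Orchestrator:
    # Events that were suppressed and are waiting for a calmer moment.
    deferred: deque = field(default_factory=deque)

    def decide(self, event: Event, workload: float) -> str:
        """Return 'surface' or 'suppress' for one event in one context."""
        # Assumption: safety-critical events (C4) surface regardless of workload.
        if event.criticality >= 4 or workload <= LOW_WORKLOAD:
            return "surface"
        self.deferred.append(event)  # held silently, person not interrupted
        return "suppress"

    def on_workload_change(self, workload: float) -> list[Event]:
        """Release held events once workload drops to a low-workload window."""
        if workload > LOW_WORKLOAD:
            return []
        released = list(self.deferred)
        self.deferred.clear()
        return released


orchestrator = Orchestrator()
message = Event("incoming message", criticality=2)  # hypothetical event

context_a = orchestrator.decide(message, workload=0.2)  # 'surface'
context_b = orchestrator.decide(message, workload=0.9)  # 'suppress'
resurfaced = orchestrator.on_workload_change(0.3)       # [message]
```

The point of the sketch is that suppression is a return value with its own follow-up path, not the absence of a decision: the same event takes either branch depending only on the context it arrives in.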
The framework
A core lifecycle — Predict → Suggest → Confirm → Act → Learn — where each stage is a designable surface with its own bound: prediction by humility, suggestion by timing, confirmation by authority, action by reversibility, learning by consent. Underneath sit six behavioral primitives:
- Criticality — impact and urgency, on a four-level scale.
- Confidence — how sure the system is, and what it may do at each level.
- Timing — whether the moment suits the interaction, given workload.
- Workload — the person's current cognitive demand, from environmental, behavioral, and intent signals.
- Authority — who has the right to act, and whether confirmation is required.
- Suppression — the explicit decision not to surface, and when deferred information returns.
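One way to make the lifecycle and the six primitives concrete is to write them down as types. The sketch below is illustrative only: the stage-to-bound mapping comes from the lifecycle description above, while the field types, the 0.0 to 1.0 ranges, and the choice to loop Learn back into Predict are assumptions, not definitions from the whitepaper.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    PREDICT = "predict"
    SUGGEST = "suggest"
    CONFIRM = "confirm"
    ACT = "act"
    LEARN = "learn"


# What bounds each stage, as stated in the lifecycle description.
STAGE_BOUNDS = {
    Stage.PREDICT: "humility",
    Stage.SUGGEST: "timing",
    Stage.CONFIRM: "authority",
    Stage.ACT: "reversibility",
    Stage.LEARN: "consent",
}


@dataclass(frozen=True)
class Primitives:
    """The six behavioral primitives; field types are assumed for illustration."""
    criticality: int          # 1..4, the four-level scale
    confidence: float         # assumed range 0.0..1.0
    timing_ok: bool           # does the moment suit the interaction?
    workload: float           # assumed range 0.0..1.0, current cognitive demand
    needs_confirmation: bool  # authority: must the person confirm first?
    suppress: bool            # explicit decision not to surface


def next_stage(stage: Stage) -> Stage:
    """Advance one step; assumes Learn feeds back into Predict."""
    order = list(Stage)
    return order[(order.index(stage) + 1) % len(order)]
```

Keeping suppression as an explicit field, rather than inferring it from the absence of output, mirrors the framework's stance that not surfacing is itself a decision.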
These primitives feed a Criticality Framework that standardizes how aggressively information surfaces:
| Level | Type | Surface behavior |
|---|---|---|
| C1 | Informational | Silent badge, logged for later |
| C2 | Relevant | Ephemeral banner, soft chime |
| C3 | Time-sensitive | Dynamic card, voice interaction |
| C4 | Safety-critical | Full takeover, audio and haptic |
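The table translates directly into a lookup. The sketch below encodes the four levels and their surface behaviors as listed; the names `Criticality`, `SURFACE_BEHAVIOR`, and `behavior_for` are hypothetical, chosen for this example.

```python
from enum import IntEnum


class Criticality(IntEnum):
    C1 = 1  # Informational
    C2 = 2  # Relevant
    C3 = 3  # Time-sensitive
    C4 = 4  # Safety-critical


# Surface behavior per level, copied from the table above.
SURFACE_BEHAVIOR = {
    Criticality.C1: ("silent badge", "logged for later"),
    Criticality.C2: ("ephemeral banner", "soft chime"),
    Criticality.C3: ("dynamic card", "voice interaction"),
    Criticality.C4: ("full takeover", "audio and haptic"),
}


def behavior_for(level: Criticality) -> str:
    """Describe how an event at this level is surfaced."""
    primary, secondary = SURFACE_BEHAVIOR[level]
    return f"{level.name}: {primary}, {secondary}"


print(behavior_for(Criticality.C2))  # C2: ephemeral banner, soft chime
```

Using an `IntEnum` keeps the levels ordered, so a rule such as "anything at or above C3 may use voice" can be written as a plain comparison (`level >= Criticality.C3`).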
Beyond the vehicle
The car forced the synthesis, but the framework isn't about cars. The same primitives govern a surgical assistant deciding whether to surface a finding mid-procedure, a factory agent deciding whether to interrupt an operator, a home robot deciding whether to speak, a coding agent deciding whether to act or wait. The domain changes; the behavioral question doesn't.
Outcome
The framework gives a team three things a feature-by-feature approach can't: a shared vocabulary for the decisions a system makes, a governable layer above any single surface, and a posture that treats restraint as designed behavior rather than a missing feature.
The bet underneath it: the AI systems people trust over time are the ones that know when to stay quiet.
The whitepaper is available on request.
Request access: pejmon00@gmail.com