A visual essay · 24 verified papers

The Sidecar Oracle

Why fast models need a slow mind riding beside them, and the research that supports each piece of it.

Every arXiv link verified against the arXiv API · papers embedded below


[Diagram: System 1 (on the clock): your agent, which talks in <800ms. System 2 (off the clock): SUPAFONE_LABS.]

The sidecar: it doesn't drive, it doesn't slow the bike down — it watches the road and taps the rider's shoulder.

Honesty first: no single paper coined a "sidecar oracle," and Sakana AI, whose adjacent work appears in threads four and five, has not published this architecture. What follows is the assembled evidence: five research threads that each validate one component, plus a sixth section on production systems that already take this shape.

1 · A fast talker, a slow reasoner

The model on the latency clock cannot also be the model that deliberates. Kahneman's dual-process theory, built as an agent.

2024 · Google DeepMind

Agents Thinking Fast and Slow: A Talker-Reasoner Architecture

A fast conversational "Talker" paired with a slower "Reasoner" that plans and maintains the beliefs the Talker acts on. The closest published precedent for this product.

In Supafone Labs: your voice agent is the Talker; the oracle is the Reasoner; the belief state is the shared memory.
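
The split can be sketched in a few lines. This is a minimal illustration, not the product's or the paper's implementation: `BeliefState`, `talker_reply`, and `reasoner_update` are hypothetical names, and the keyword check stands in for a real reasoning model.

```python
from dataclasses import dataclass, field


@dataclass
class BeliefState:
    """Shared memory: written by the Reasoner, read by the Talker."""
    facts: dict = field(default_factory=dict)
    guidance: str = ""


def talker_reply(user_text: str, beliefs: BeliefState) -> str:
    """Fast path: answers immediately from whatever beliefs exist right now."""
    if beliefs.guidance:
        return f"{beliefs.guidance} (re: {user_text})"
    return f"Got it: {user_text}"


def reasoner_update(transcript: list[str], beliefs: BeliefState) -> None:
    """Slow path: runs off the latency clock and revises the shared beliefs.

    The keyword match is a placeholder for an actual deliberating model.
    """
    if any("refund" in turn.lower() for turn in transcript):
        beliefs.facts["intent"] = "refund"
        beliefs.guidance = "Let me check your refund options"


beliefs = BeliefState()
transcript = ["I want a refund for my order"]

first = talker_reply(transcript[0], beliefs)     # Talker answers before the Reasoner has run
reasoner_update(transcript, beliefs)             # Reasoner catches up in the background
second = talker_reply("Can you help?", beliefs)  # the next turn reads the updated beliefs
```

The Talker never waits on the Reasoner. It only reads whatever the belief state holds when its turn arrives.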

2020 · IBM Research

Thinking Fast and Slow in AI

The standard position paper mapping System 1 / System 2 onto AI design: fast reactive components, slow deliberative supervision.

2 · Supervision must be external

A model grading its own live output — with no outside signal — often makes things worse. That's the argument for a second mind, not a longer prompt.

2023 · ICLR 2024 · the key negative result

Large Language Models Cannot Self-Correct Reasoning Yet

Intrinsic self-correction without external feedback frequently degrades performance. External feedback is what makes correction work.

In Supafone Labs: the supervisor is a separate model with separate context and ground-truth tool results — exactly the external signal this paper says is required.
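
As a rough sketch of what an external signal can look like, the supervisor can compare what the agent said against what a tool actually returned. The function name, field names, and data below are illustrative assumptions, not a documented interface.

```python
def disputed_fields(agent_claims: dict, tool_results: dict) -> list[str]:
    """List the fields where the agent's statement contradicts the tool output.

    Fields the tool did not report are skipped: there is no ground truth for them.
    """
    return sorted(
        key
        for key, claimed in agent_claims.items()
        if key in tool_results and tool_results[key] != claimed
    )


# What the agent told the caller versus what a (hypothetical) booking tool returned.
agent_claims = {"appointment_time": "3pm", "price": 40, "location": "Main St"}
tool_results = {"appointment_time": "4pm", "price": 40}

disputes = disputed_fields(agent_claims, tool_results)
```

The check uses information the agent's own context may not contain, which is what separates it from self-grading.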

2023 · CMU / AI2

Self-Refine: Iterative Refinement with Self-Feedback

Iterated feedback-and-refine loops improve outputs across tasks. In the paper the model critiques its own drafts; the supervisor applies the same refine-from-feedback mechanism from outside.

2023 · Northeastern / MIT

Reflexion: Language Agents with Verbal Reinforcement Learning

Agents that store verbal reflections on failures and improve later attempts — the memory-of-mistakes pattern behind post-call reports.

3 · A second model checking the first works

Across math, reasoning, and safety: a dedicated checker consistently catches what the generator misses — even when the checker is smaller.

2021 · OpenAI

Training Verifiers to Solve Math Word Problems

A separate verifier ranking a generator's outputs beats just making the generator bigger — the original generator/verifier split.
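
The generator/verifier split reduces to a simple selection rule: sample several candidates, score each with the verifier, keep the best. The sketch below assumes a toy arithmetic checker in place of a trained verifier; `best_of_n` and `toy_verifier` are hypothetical names.

```python
from typing import Callable


def best_of_n(candidates: list[str], verifier: Callable[[str], float]) -> str:
    """Return the candidate the verifier scores highest."""
    return max(candidates, key=verifier)


def toy_verifier(answer: str) -> float:
    """Stand-in for a trained verifier: 1.0 if the answer equals 12 * 7, else 0.0."""
    try:
        return 1.0 if int(answer) == 12 * 7 else 0.0
    except ValueError:
        return 0.0  # not parseable as an integer, so it cannot be verified


# Four sampled answers to "What is 12 * 7?" from a hypothetical generator.
samples = ["74", "84", "eighty-four", "96"]
chosen = best_of_n(samples, toy_verifier)
```

The generator only needs to produce the right answer once among its samples. The verifier's job is to recognize it.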

2023 · OpenAI

Let's Verify Step by Step

Supervising each step of a process beats judging only the final outcome.

In Supafone Labs: the oracle judges every turn of the call, not just the end-of-call summary.
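
A small sketch of the difference, assuming each turn already has a quality score between 0 and 1. The scores, threshold, and function names are illustrative only.

```python
from typing import Optional


def outcome_passes(turn_scores: list[float], threshold: float = 0.5) -> bool:
    """Outcome-only judging: look at the final turn and nothing else."""
    return turn_scores[-1] >= threshold


def first_failing_turn(turn_scores: list[float], threshold: float = 0.5) -> Optional[int]:
    """Per-turn judging: index of the first turn below threshold, or None."""
    for index, score in enumerate(turn_scores):
        if score < threshold:
            return index
    return None


# A call that recovers by the end but went wrong on its second turn.
scores = [0.9, 0.2, 0.8, 0.9]

passed_overall = outcome_passes(scores)
bad_turn = first_failing_turn(scores)
```

Here the call passes an end-of-call check while the per-turn check still pinpoints turn index 1 as the place something went wrong.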

2025 · OpenAI

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation

A weaker LLM monitoring a stronger reasoning model's chain-of-thought catches reward hacking in agentic coding environments.

In Supafone Labs: the oracle can be a small, cheap model and still catch a bigger agent's failures.

2023 · Meta

Shepherd: A Critic for Language Model Generation

A purpose-trained 7B critic that critiques other models' outputs at quality competitive with far larger judges.

4 · Models overseeing models, at inference time

Including Sakana AI's adjacent work: models that correct and build on each other, with no retraining, can outperform any single model working alone.

2025 · Sakana AI

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

AB-MCTS (open-sourced as TreeQuest): multiple frontier models cooperate at inference time, correcting each other's attempts. The combination beat every individual model on ARC-AGI-2.

In Supafone Labs: the same principle at call time — two models on one problem beat one model alone.

2025 · Sakana AI

Reinforcement Learning Teachers of Test Time Scaling

A "teacher" model trained not to solve tasks but to guide a student model — Sakana's closest work to a dedicated helper-model role.

2025 · Sakana AI · Institute of Science Tokyo

Transformer²: Self-adaptive LLMs

A two-pass inference framework: a dispatcher identifies the task, then RL-trained "expert" vectors reweight the model's singular values in real time — the model adapts itself per request, beating LoRA with fewer parameters.

In Supafone Labs: adaptation at inference time, not training time — the same philosophy as whisper injection.

2024 · Sakana AI

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Includes an automated LLM reviewer critiquing another model's generated papers with near-human accuracy. v2 (2025) adds agentic tree search.

2023 · MIT / Google

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Model instances critiquing each other's answers measurably improves factuality — disagreement between models is signal.

2018 · OpenAI

AI safety via debate

Agents exposing each other's flaws for a judge — the foundational adversarial-oversight framing.

2022 · Anthropic

Constitutional AI: Harmlessness from AI Feedback

Models critiquing and revising outputs against explicit written principles — the pattern behind operator guardrails enforced by a critic.

5 · Prompts that improve from measured feedback

Score real outcomes, hand them to a critic, get a better prompt, version it. The standing-directive loop is this lineage, run on live calls.

2025 · Sakana AI · UBC

Darwin Gödel Machine: Open-Ended Evolution of Self-Improving Agents

Agents that rewrite their own code, keep the variants that empirically perform better, and improve open-endedly — SWE-bench 20.0% → 50.0%, Polyglot 14.2% → 30.7%. The strongest evidence yet that agents improve when graded against real outcomes.

In Supafone Labs: the same evolutionary shape at prompt scale — versioned standing directives, kept only when measured call scores improve.
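
The keep-only-if-better rule can be sketched as follows. This is an assumption-laden illustration: `DirectiveVersion` and `consider` are hypothetical names, and comparing mean call scores is just one possible acceptance test.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DirectiveVersion:
    version: int
    text: str
    mean_score: float


def consider(history: list[DirectiveVersion], candidate_text: str,
             call_scores: list[float]) -> bool:
    """Append the candidate as a new version only if it beats the current one."""
    candidate_mean = mean(call_scores)
    if history and candidate_mean <= history[-1].mean_score:
        return False  # measured worse or equal: discard, current version stays live
    history.append(DirectiveVersion(len(history) + 1, candidate_text, candidate_mean))
    return True


history: list[DirectiveVersion] = []
kept_first = consider(history, "Be concise.", [0.6, 0.7])
kept_second = consider(history, "Be thorough.", [0.5, 0.6])                    # scores lower
kept_third = consider(history, "Confirm the order number first.", [0.8, 0.9])  # scores higher
```

Rejected candidates never enter the history, so the latest entry is always the best-measured directive so far.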

2023 · Google DeepMind

Large Language Models as Optimizers

OPRO: an LLM iteratively proposes better prompts from a trajectory of scored attempts.

In Supafone Labs: /v1/optimizer/improve is OPRO over your post-call reports — scores in, better standing directive out.
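
The core of an OPRO-style step is the meta-prompt: earlier attempts are listed with their scores, lowest first, and the optimizer model is asked for something that would score higher. The sketch below shows only that formatting step. It does not model the request or response shape of `/v1/optimizer/improve`, which the text above does not specify, and `build_meta_prompt` is a hypothetical helper.

```python
def build_meta_prompt(trajectory: list[tuple[str, float]], keep: int = 3) -> str:
    """Format the `keep` best-scoring directives, lowest score first, as an optimizer prompt."""
    best = sorted(trajectory, key=lambda pair: pair[1])[-keep:]
    entries = [f"text: {text}\nscore: {score:.2f}" for text, score in best]
    return (
        "Here are previous directives with their scores, lowest first.\n\n"
        + "\n\n".join(entries)
        + "\n\nWrite a new directive that would score higher."
    )


# Illustrative (directive, mean call score) pairs; real scores would come from post-call reports.
trajectory = [
    ("Answer quickly.", 0.41),
    ("Confirm the caller's name first.", 0.66),
    ("Repeat the booking back before ending.", 0.78),
    ("Apologize often.", 0.30),
]

meta_prompt = build_meta_prompt(trajectory)
```

The resulting string would be sent to whichever model acts as the optimizer, and its proposal would then be scored on real calls before joining the trajectory.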

2023 · Stanford

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Prompts as parameters optimized against metrics, not hand-tuned strings.

2024 · Stanford

TextGrad: Automatic "Differentiation" via Text

Natural-language feedback backpropagated as "textual gradients" through compound AI systems.

2023 · Microsoft

Automatic Prompt Optimization with "Gradient Descent" and Beam Search

ProTeGi: critiques of concrete errors act as gradients that edit the prompt.

6 · The industry already puts a model beside the model

Production guardrail systems converged on the same shape: a separate runtime component watching the live model's traffic.

2023 · NVIDIA

NeMo Guardrails: A Toolkit for Controllable and Safe LLM Applications with Programmable Rails

A programmable runtime layer between the user and the LLM enforcing dialogue rails.

2023 · Meta

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

A dedicated safeguard model classifying a conversational model's traffic in real time.

In Supafone Labs: the same runtime position — but instead of blocking, it coaches.
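
The contrast between blocking and coaching can be sketched with a toy rule set. Everything here is an illustrative assumption: the phrase matching stands in for a safeguard model, and the function names and whisper wording are hypothetical.

```python
from typing import Optional


def find_violations(reply: str, banned_phrases: set[str]) -> list[str]:
    """Banned phrases that appear in the reply (case-insensitive), sorted."""
    lowered = reply.lower()
    return sorted(p for p in banned_phrases if p.lower() in lowered)


def blocking_guard(reply: str, banned_phrases: set[str]) -> Optional[str]:
    """Blocking style: suppress the whole reply when any rule is violated."""
    return None if find_violations(reply, banned_phrases) else reply


def coaching_guard(reply: str, banned_phrases: set[str]) -> tuple[str, Optional[str]]:
    """Coaching style: the reply still goes out; a note is returned for the next turn."""
    violations = find_violations(reply, banned_phrases)
    if not violations:
        return reply, None
    return reply, "Avoid: " + ", ".join(violations) + ". Correct this on your next turn."


reply = "We guarantee delivery by tomorrow."
banned = {"guarantee"}

blocked = blocking_guard(reply, banned)             # the caller hears nothing
delivered, whisper = coaching_guard(reply, banned)  # the caller hears the reply, the agent gets a note
```

Blocking protects the current turn at the cost of silence. Coaching keeps the conversation moving and relies on the agent to correct itself on the next turn.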

Fast models need slow supervisors. Supervision must be external. A second model catches what the first can't. And prompts should improve from measured outcomes.
That's the sidecar oracle.
