// COMING 2026

Prompt Recovery

A novel about building AI systems that actually work. Think The Goal and The Phoenix Project — but for teams shipping large language models into production.

by Michael John Peña


agentOS-prod — tmux
incident 0:ops-triage* 1:agents 2:eval 3:logs sarah@autoscale   02:47 AM
▸ pane 0 — ops triage
$ kubectl get pods -n agentOS-prod
NAME                   READY   STATUS             RESTARTS
agentOS-router-7b4f9   0/1     CrashLoopBackOff   47
agentOS-llm-worker-3   0/1     OOMKilled          12
agentOS-eval-runner    1/1     Running (lying)
agentOS-gateway-a2c1   1/1     Running            0

$ cat /var/log/billing-alert.log | tail -3
[CRIT] OpenAI spend: $47,231.89 / 24h (budget: $2,000)
[WARN] Token burn rate: 4.2M tok/min — 3x normal
[INFO] Cost anomaly detection triggered at 11:47 PM

$ ./recover.sh --plan --agents=all
▸ Loading recovery playbook...
▸ 25 chapters. 90 days. One shot.
$
▸ pane 1 — agent swarm
$ agentctl status --watch
┌─ Agent Fleet ──────────────┐
│ router     ✗ crashed (47x) │
│ planner    ⚠ looping       │
│ retriever  ✓ healthy       │
│ executor   ⚠ throttled     │
│ evaluator  ✗ lying         │
└────────────────────────────┘
↻ refreshing in 5s...
▸ pane 2 — live tail
$ stern -n agentOS-prod --since 5m
02:44:12 planner  → "Retrying prompt... attempt 94"
02:44:13 router   → panic: nil pointer dereference
02:45:01 executor → rate limit hit (429)
02:46:58 eval     → assert failed: "accuracy" > 0.7
02:47:02 gateway  → Sarah connected from 10.0.1.42
02:47:03 gateway  → "Let's fix this."
// The Premise

Day one.
Everything is already on fire.

Sarah Chen is a seasoned engineering leader who has just been hired to run the AI platform at AutoScale — a fast-growing startup whose crown jewel, AgentOS, is held together by one exhausted engineer, duct tape, and good intentions.

Sarah's Slack notification sounded at 11:47 PM. Then again at 11:48. By 11:50, her phone was buzzing with the intensity of a trapped bee. She knew what that meant: production was on fire, and she was about to do something reckless about it.

She untangled herself from the couch, dislodging Kernel from her lap and earning a look of betrayal that only a cat could deliver with such precision.

— Chapter 1: Into the Fire

She has 90 days before the board pulls the plug. What follows is a crash course in building AI systems that survive contact with reality — told through the lens of one team's fight to turn chaos into something they can be proud of.

// What You'll Learn

Real engineering. Real consequences.

Every chapter embeds production-grade AI engineering concepts inside a story you can't put down.

🪟

Context Window Architecture

Why your prompts break at scale and how to design context as a contract, not an afterthought.

🛡️

Guardrails & Safety

Prompt injection, jailbreaks, and the layered defense patterns that keep AI systems from going off the rails.

📊

Evaluation That Doesn't Lie

Moving beyond vibes-based testing. Building eval frameworks that catch failures before your customers do.

🔁

Agent Orchestration

Multi-agent systems, cascade failures, circuit breakers, and the patterns that make AI agents reliable.

👁️

Observability & Cost Control

Tracing LLM calls, spotting the $47,000 Tuesday before it happens, and building dashboards that matter.

⚖️

AI Ethics in Practice

Not theory — the messy, real-world moments when technically correct recommendations have devastating human consequences.

// The Structure

Three acts. Twenty-five chapters.

A ninety-day journey from inherited chaos to production confidence.

Act I — Days 1–30

Inheriting Chaos

Sarah discovers what's broken: runaway costs, a single point of failure, shadow agents nobody owns, evaluations that lie, and a team on the edge of burnout.

Chapters 1–8

Act II — Days 31–72

Building the Foundation

With the clock ticking, the team rebuilds around three principles: context is your contract, reliability through orchestration, deploy with humility.

Chapters 9–18

Act III — Days 73–90

Trial by Fire

The enterprise demo. A crisis of values. A walkout. And the beginning of everything that comes after.

Chapters 19–25

// Who It's For

If you've ever been paged at 2 AM over an AI system, this book is for you.

// If You Liked

📕 The Goal + 📕 The Phoenix Project + 🤖 AI Engineering + ⚙️ Release It! + 🧠 The Alignment Problem

…then Prompt Recovery was written for you.

// Stay in the Loop

Get notified when it launches.

No spam. Just a single email on launch day — plus an optional early chapter preview for subscribers.