Cosmon

Cosmon is a stateless CLI that gives AI coding agents a persistent identity, a typed lifecycle, and crash-recovery — so you can run ten Claude sessions in parallel without losing track of who is doing what.

No daemon. No database server. No scheduler process. JSON files on disk are the source of truth. One binary, cs tackle / cs done.

This site has two pages, each written for one person standing at one door.

Public-Release Readiness — for the maker whose private galaxy grew up inside a client engagement, and who is afraid that one git push leaks something they can never take back. It is the checklist that walks the galaxy across the one-way door without leaving anything private on the other side.
Modes de pilotage — for the maker who just learned they can say "auto-pilot," and is equal parts excited and terrified of runaway. It draws the fence, the gates, and the leash, so you can hand over the cockpit knowing exactly how much rope the robot has.

This very site is built by the same pipeline the first page describes — mdBook, then Cloudflare Pages, then a Playwright pass that checks the diagrams actually rendered. Cosmon eats its own dog food.

Public-Release Readiness

Your galaxy is about to cross a one-way door. This page is the checklist that walks it across without leaving anything private on the other side.

You built something inside a client engagement. It works. Now you want to open it to the world — but the git history is full of a client's name, an endorser's email, cross-references to private tooling. You are afraid that one git push leaks something you can never take back.

This page is for you. Not "people releasing software" — you, standing at the door, holding a history you're a little ashamed to show, wanting a way across that doesn't burn you.

Here is the picture for the whole page:

flowchart LR
    subgraph private ["Private side — stays behind"]
        G["Your galaxy<br/>+ a backpack:<br/>names, emails,<br/>private tools"]
        ARCH["Frozen archive<br/>(read-only,<br/>full history)"]
    end
    subgraph membrane ["The membrane — nine guards"]
        M["A · B · C · D · E<br/>F · G · H · I"]
    end
    subgraph public ["Public side — what a stranger may see"]
        CLEAN["Your galaxy,<br/>clean.<br/>Backpack left<br/>behind."]
    end
    G -- "crosses" --> M
    M -- "waved through" --> CLEAN
    G -. "backpack drops off" .-> ARCH

A galaxy passes through a membrane. The backpack of private things — names, emails, the orchestrator's private tools — stays behind, frozen and archived. Nothing is thrown away. It just doesn't cross. That is the single most reassuring fact on this page, so it lives in the first screen, not buried in a footnote: you do not lose your history. You leave it home, safe, on the private side.

A checklist of nine phases reads as nine chances to fail. A membrane-crossing reads as one journey with nine guards who each say "clear." Same content, opposite feeling. You arrive scared; the page's job is to turn that fear into earned confidence. So the spine is the crossing, and each guard checks one thing and waves the galaxy through.

The lesson before the list: cosmon crossed its own membrane

Before any guard, one true story — because it teaches the thing that surprises everyone.

Cosmon's engine was about ninety percent built. Typed state machines, crash-recovery, the whole orchestration core — solid, tested, proven in daily use. By every instinct it was ready.

It was not ready. The front door leaked.

There was no AGENTS.md — so a stranger's coding agent landing in the repo had no honest map of where to look.
CLAUDE.md was a real file, not a symlink — so the same instructions lived in two places, already drifting apart.
The handbook pointed a visitor's agent straight at cs and .cosmon/state/ — cosmon's own private orchestrator. A stranger can't run that. The front door was sending guests into the owner's private workshop.
And client names were sitting in pixels — inside committed screenshots — where a text search for the client's name finds nothing. The scrub that reads text walks right past an image.

Here is the counter-intuition, and it is the whole reason this page exists:

The defects do not live in the code. They live in the first five minutes of a stranger's trust.

The engine was excellent. The front door wasn't ready. That gap — between a proven core and an un-crossed membrane — is exactly what the nine guards are built to close. Keep cosmon's own story in mind as you read them: every guard below is a lesson cosmon learned by failing it first.

The nine guards at the membrane

Here is the whole journey on one screen. Read it once, top to bottom, before any detail. The table is the index. The guards are the chapters.

Guard	Checks	Cleared when
A	Is anything private still inside?	Public log carries no red-list term; frozen archive exists
B	Are all the lights green, for the right reason?	Code gates green; local checks mirror every CI gate
C	What does a stranger see in their first five minutes?	Zero open BLOCKERs in the entry-point pre-mortem
D	Does your front door send strangers into your private house?	No public pointer to a tool a stranger can't run
E	Does the public site render cleanly?	HTTPS live, visual QA clean (this page passed it)
F	Can you ship a version with one safe gesture?	The documented release recipe is the real one
G	Are your names claimed?	Package names reserved; nothing internal is publishable
H	Are you promising only what you deliver?	No over-promise; every gap is a named roadmap item
I	Is the origin story true for the public repo?	The making-of reads true to someone holding only the public repo

Three of these run in order, a hard spine. The scrub (A) comes first, because nothing else is safe until the private things are out. The stranger's-eyes pre-mortem (C) and the front-door scrub (D) follow, because they read the cleaned repo. The rest, CI (B), the doc site (E), the release recipe (F), namespaces (G), standards honesty (H), and the making-of (I), are a parallel finishing fan: independent guards that can all check at once once the spine holds.

flowchart LR
    A["A · scrub<br/>private out"] --> C["C · stranger's<br/>five minutes"]
    C --> D["D · front-door<br/>scrub"]
    D --> CROSS(["membrane<br/>crossed"])
    B["B · CI green"] --> CROSS
    E["E · site<br/>renders"] --> CROSS
    F["F · one-gesture<br/>release"] --> CROSS
    G["G · names<br/>claimed"] --> CROSS
    H["H · honest<br/>promises"] --> CROSS
    I["I · true<br/>making-of"] --> CROSS
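
The ordering can be sketched as a small dependency graph. This is one reading of the prose, not cosmon's own scheduler: the map below assumes the finishing fan waits for the end of the spine (D), and the names GUARD_DEPENDENCIES and guard_waves are hypothetical.

```python
from graphlib import TopologicalSorter

# Assumption: each guard maps to the guards that must clear before it.
# Spine: A -> C -> D. The fan (B, E, F, G, H, I) waits for the spine to hold.
GUARD_DEPENDENCIES = {
    "A": set(),
    "C": {"A"},
    "D": {"C"},
    **{guard: {"D"} for guard in "BEFGHI"},
}


def guard_waves(dependencies):
    """Group guards into waves; every guard in a wave can be checked at once."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # guards whose prerequisites cleared
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(guard_waves(GUARD_DEPENDENCIES), start=1):
        print(f"wave {number}: {' '.join(wave)}")
```

The first three waves hold one guard each (the spine); the last wave holds the six finishing guards together.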

Now, each guard.

Guard A — Is anything private still inside?

This is the guard you came here scared of, so it comes first.

Make a red list — every name, email, client, and private tool that must never appear in public. Then prove the public copy contains none of them: git log -p of the public copy turns up zero red-list terms.
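
A minimal sketch of that proof, assuming you have already captured the text of git log -p from the public copy into a string. The function name find_red_list_hits and the sample terms are hypothetical. It reads text only, so committed screenshots still need a separate look.

```python
import re
from typing import Iterable


def find_red_list_hits(log_text: str, red_list: Iterable[str]) -> list[tuple[int, str, str]]:
    """Return (line_number, matched_text, line) for every red-list term found.

    Matching is case-insensitive and literal: terms are escaped, not treated
    as regular expressions.
    """
    # Longest terms first so a short term never shadows a longer one.
    terms = sorted((t for t in red_list if t.strip()), key=len, reverse=True)
    if not terms:
        return []
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((number, match.group(0).lower(), line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical log excerpt and red list, for illustration only.
    sample = "Author: Jane <jane@acme-client.example>\n+ call the ACME Corp API\n"
    for number, term, line in find_red_list_hits(sample, ["acme corp", "jane@acme-client.example"]):
        print(f"line {number}: {term!r} in {line!r}")
```

An empty result is the "zero red-list terms" the guard asks for; any hit names the line to go and look at.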

Two ways across, depending on how the private stuff is spread:

A few stray names, scattered thinly → rewrite them out with git filter-repo (token-replace plus a mailmap for the emails). The history keeps its shape; only the private words are gone.
Smeared across hundreds of commits (cosmon's neighbour oxymake had the client's fingerprints in 411 places) → you can't pick them out one by one. Squash the whole history to a single commit. The public repo starts fresh, clean, at commit one.

Either way — and this is the reassurance, made concrete — the full original history lives on in a private, archived, frozen repo (<your-project>-archived-history, then gh repo archive to make it read-only). That is the backpack from the first picture. It stays behind. It is not deleted. You can always open it; a stranger never can.

The invariant, in plain words: never git push from the working repo that still holds the full history. The public copy comes from a clean clone — one with no private commits to leak in the first place.

Operator-only. Guard A's destructive work — filter-repo, squash, archive — is proposed by the fleet and pulled by your hand. Never automatic. The robot prepares the surgery; you make the cut.

Cleared when: the public log is clean and the frozen archive exists.

Guard B — Are all the lights green, for the right reason?

You glance at CI and every check is red. Panic? Look closer.

A private repo on a free plan runs out of Actions minutes. When that happens, every job fails — not because your code broke, but because of a billing message dressed up to look like a code failure. The trap is to pay to turn the lights green on a repo you're about to make public.

Don't. Actions is free and unlimited on a public repo. The red vanishes the moment you flip public. So the real check is one step back: on the last commit before the minutes ran out, were the code gates green?

While you're here, close the matching gap: your local Definition of Done must mirror every CI gate. Oxymake found out the hard way that its CI ran cargo doc with warnings denied (-D warnings) but its local checklist didn't: a red light invisible from the maker's desk. If CI checks it, your local recipe checks it too.
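
The mirror rule is a set difference. Here is a hedged sketch, assuming you can list the gate commands from your CI config and from your local recipe by hand; the function name and the gate strings are illustrative, not taken from any real project.

```python
def gates_missing_locally(ci_gates, local_gates):
    """Return the CI gates that the local checklist does not run."""

    def normalise(command):
        # Collapse runs of whitespace so formatting differences don't matter.
        return " ".join(command.split())

    local = {normalise(gate) for gate in local_gates}
    return sorted({normalise(gate) for gate in ci_gates} - local)


if __name__ == "__main__":
    # Hypothetical gate lists, loosely modelled on the oxymake story.
    ci = ["cargo test", "cargo clippy", "cargo fmt --check", "cargo doc (warnings denied)"]
    local = ["cargo test", "cargo clippy", "cargo fmt  --check"]
    print(gates_missing_locally(ci, local))
```

An empty list means the local recipe mirrors CI. Anything listed is a red light you can't see from your desk yet.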

Cleared when: code gates green on the last good commit, and local checks mirror every CI gate.

Guard C — What does a stranger see in their first five minutes?

This is the guard that catches what cosmon's own story warned about: the defects live in the first five minutes of trust, not in the code.

Run a pre-mortem panel — cosmon's reusable deep-think formula, a roomful of personas reading the repo as a first-time visitor — through two pairs of eyes:

The user who wants to try it: About → README → install → first run → docs. Does the path hold, or does it break at step three?
The developer who wants to contribute: architecture, invariants, ADRs — and the path to plug in their own agent system, not yours.

The panel hands back a punch-list, sorted BLOCKER / MAJOR / MINOR, each item pinned to a file and a line. The verdict cosmon and oxymake both got is the one you should expect: the engine is usually excellent and proven; the front door is what isn't ready.

Cleared when: zero open BLOCKERs before the flip.
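
The punch-list and its clearing rule can be sketched as follows. The Finding shape is an assumption: the source says only that each item has a severity and is pinned to a file and a line, so every field name here is hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    BLOCKER = 0
    MAJOR = 1
    MINOR = 2


@dataclass(frozen=True)
class Finding:
    severity: Severity
    file: str
    line: int
    note: str
    resolved: bool = False


def punch_list(findings):
    """Sort BLOCKER first, then MAJOR, then MINOR; ties by file and line."""
    return sorted(findings, key=lambda f: (f.severity, f.file, f.line))


def guard_c_cleared(findings):
    """Guard C clears when no BLOCKER is still open."""
    return not any(f.severity is Severity.BLOCKER and not f.resolved for f in findings)


if __name__ == "__main__":
    findings = [
        Finding(Severity.MINOR, "README.md", 40, "typo in install step"),
        Finding(Severity.BLOCKER, "AGENTS.md", 1, "file is missing"),
    ]
    for f in punch_list(findings):
        print(f"{f.severity.name:8} {f.file}:{f.line} {f.note}")
    print("cleared:", guard_c_cleared(findings))
```

Open MAJOR and MINOR items do not hold the flip in this sketch; only an unresolved BLOCKER does, which matches the clearing rule above.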

Guard D — Does your front door send strangers into your private house?

The sharpest guard, and the one most makers don't know they need.

Your repo's agent-facing surface — AGENTS.md — is a transport layer. It points at the one true source of instructions (man <tool>, <tool> help, CONTRIBUTING.md) and never paraphrases it. Paraphrase, and the two copies drift; a year later they contradict each other and nobody knows which is true. CLAUDE.md is a symlink to AGENTS.md — same file, one place.

The invariant, as one image: a stranger's agent must never be handed the keys to your private workshop. No public pointer to cs, to cosmon, to .cosmon/state/ — tools a stranger cannot run. Cosmon's handbook once pointed a visitor's agent at cs help; oxymake's did the same. Both were sending guests into a room only the owner can enter. That is exactly the leak this guard closes.

Cleared when: every instruction lives in exactly one place, and there is no public pointer to a tool a stranger can't run.
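
Both halves of this guard can be checked mechanically. A minimal sketch, assuming the files live at the repo root: PRIVATE_POINTERS and the default list of public files are hypothetical and should come from your own red list.

```python
import re
from pathlib import Path

# Assumption: patterns for tools a stranger cannot run. Adjust to your repo.
PRIVATE_POINTERS = [r"\.cosmon/state/", r"\bcs help\b"]


def front_door_problems(repo, private_pointers=PRIVATE_POINTERS,
                        public_files=("AGENTS.md", "README.md")):
    """List front-door defects: a missing map, a copied CLAUDE.md, private pointers."""
    repo = Path(repo)
    problems = []
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.is_file():
        problems.append("AGENTS.md is missing")
    if not claude.is_symlink():
        problems.append("CLAUDE.md is not a symlink")
    elif claude.resolve() != agents.resolve():
        problems.append("CLAUDE.md does not point at AGENTS.md")
    for name in public_files:
        path = repo / name
        if not path.is_file():
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            for pattern in private_pointers:
                if re.search(pattern, line):
                    problems.append(f"{name}:{number}: points at a private tool ({pattern})")
    return problems


if __name__ == "__main__":
    for problem in front_door_problems("."):
        print(problem)
```

An empty list is the clearing condition in miniature: one place for instructions, and no pointer a stranger can't follow.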

Guard E — Does the public site render cleanly?

Build the doc site (mdBook, or the equivalent), then deploy it to Cloudflare Pages on your project's domain. Point the apex at the site with a proxied CNAME, which leaves the existing MX and SPF records in place, so your email keeps working.

Then check the rendering with Playwright, the way a real browser would see it:

The mermaid diagrams render as pictures — not shown as raw code.
No real console errors.
Syntax highlighting is correct in the code blocks.

This is also cosmon eating its own dog food: the page you are reading right now passed this exact guard. The same pipeline that built this site is the one Guard E describes.

Cleared when: the site serves over HTTPS on the target domain, and the visual QA is clean.

Guard F — Can you ship a version with one gesture, safely?

A release should be one turnkey recipe — just tag-release X.Y.Z — that does the whole dance for you:

Guards: working tree clean, you're on main, the tag doesn't already exist, the CHANGELOG carries this version.
Preflight: test, clippy, fmt, doc — all green.
Ship: regenerate the man pages, bump the version, commit, annotated tag, push — and CI builds the release.
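
The first step, the guards, can be sketched as a pure function. This is not the real tag-release recipe: it assumes the caller has already gathered the output of git status --porcelain, the current branch name, and the list of existing tags, and it accepts both X.Y.Z and vX.Y.Z as tag spellings because the source does not say which one is used.

```python
import re


def release_guard_failures(version, *, tree_status, branch, existing_tags, changelog):
    """Return the reasons a release of `version` must not start (empty means go)."""
    failures = []
    if tree_status.strip():  # any porcelain output means uncommitted changes
        failures.append("working tree is not clean")
    if branch != "main":
        failures.append(f"on branch {branch!r}, not main")
    if {version, f"v{version}"} & set(existing_tags):
        failures.append(f"a tag for {version} already exists")
    # Match the exact version, so 1.2.3 is not satisfied by 1.2.30 or 0.1.2.3.
    exact_version = rf"(?<![\d.]){re.escape(version)}(?!\.?\d)"
    if not re.search(exact_version, changelog):
        failures.append(f"CHANGELOG has no entry for {version}")
    return failures


if __name__ == "__main__":
    print(release_guard_failures(
        "1.2.3",
        tree_status="",
        branch="main",
        existing_tags=["v1.2.2"],
        changelog="## [1.2.3]\n- fix the front door\n",
    ))
```

Returning every failure at once, rather than stopping at the first, lets you fix them in one pass before you try again.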

The invariant: this recipe runs only from a clean-history clone. Its own git push would leak the private history otherwise. (Guard A's clean clone is the soil this recipe must grow in.)

Cleared when: the documented recipe is the real recipe — no drift between what the docs say and what the command does — and a tag can never ship an untested binary.

Guard G — Are your names claimed?

On crates.io, npm, and PyPI, reserve your name before someone else takes it.

A placeholder package with publish = true holds the name; every internal library is marked publish = false so nothing internal can ever leak out through a stray publish. Two steps, never rushed: reserve first (a placeholder at v0.0.0), then publish the real package at release. The publish itself is a manual, one-shot gesture — never wired into CI, where a mis-fire would push the wrong thing to the world.

Cleared when: cargo package / npm pack --dry-run / python -m build pass, and the names are claimable.

Guard H — Are you promising only what you deliver?

If your galaxy aligns to a standard (a research project might align to FAIR4RS), climb an honesty ladder: promise only what actually ships today. Frame everything else as a named roadmap item — not a vague "we support X" that a careful reader will catch as false.

A false compliance claim is the one defect that erodes trust permanently. Better to under-promise and name the gap.

Cleared when: there is no over-promise, and every gap is a named roadmap item.

Guard I — Is the origin story true for the public repo?

If you squashed the history (Guard A), the public repo no longer holds the long trail of how it was built. So when you write the making-of, tell it accurately for the repo a stranger actually holds. The erased traces live in the private archive — don't claim they're "still in the repo" when they aren't.

Cleared when: the making-of reads true to someone holding only the public copy.

The membrane crossed — and what stays in your hands

Recall the first picture. On the public side: your galaxy, clean. On the private side: the frozen backpack, every private thing accounted for, nothing thrown away.

Now the honest line that earns long-term trust:

The fleet prepares and proposes every guard. But the gestures you can never take back stay under your finger, not the robot's.

The irreversible, operator-only gestures:

Flip the repo to public.
cargo publish (and the npm / PyPI equivalents).
Push a release tag.
Delete or recreate a repo.
Change DNS.

The robot walks the galaxy all the way to the door, checks every guard, and tells you each one is clear. You open the door. That division — robot prepares, human commits the irreversible act — is the same discipline you'll meet on the next page, Modes de pilotage, where the robot runs free inside a fence but never opens the four gates alone.

Modes de pilotage

Auto-pilot is a dog on a long leash in a fenced yard: free inside the fence, stopped at four gates it will never open alone, and a leash you can tug at any second.

You just learned you can say "auto-pilot," and the system will start doing things on its own. You are excited — and a little terrified. The fear has a name: runaway. If I hand over the cockpit, what stops it from doing something I'd never have allowed?

This page makes the leash visible. Here is the one picture for the whole thing:

flowchart TB
    subgraph yard ["The fenced yard — free movement inside"]
        DOG["The dog runs free:<br/>tidies, fetches,<br/>patrols, takes notes"]
    end
    LEASH["The leash in your hand<br/>— tug any second —"] -. "holds" .-> DOG
    DOG --> GATE1["Gate · new direction"]
    DOG --> GATE2["Gate · real fork"]
    DOG --> GATE3["Gate · can't be undone"]
    DOG --> GATE4["Gate · rewiring the house"]
    GATE1 -. "sits and barks for you" .-> YOU(["You decide"])
    GATE2 -. "sits and barks for you" .-> YOU
    GATE3 -. "sits and barks for you" .-> YOU
    GATE4 -. "sits and barks for you" .-> YOU

Inside the fence, the dog runs free: it fetches, it tidies, it patrols, and you don't micromanage each step. But the fence is real. At the edge there are four gates the dog will never push open on its own; it sits and barks for you instead. And you always hold the leash: any one of three tugs freezes the dog mid-stride.

Bounded autonomy is free movement inside a real fence, with an always-present leash. That single image carries the whole page. We'll walk it in order: the fence, what runs free inside, the four gates, the leash, and how a good co-pilot talks to you.

Switching it on is always your hand

Auto-pilot never turns itself on. It starts only when you say it: "auto-pilot," "je te passe le cockpit" ("I'm handing you the cockpit"), or "mode automatique" ("automatic mode").

The dog doesn't unclip its own leash. You do.

This matters more than it looks. It means the default state is always you driving. Silence is not consent. The cockpit stays yours until you hand it over with a conscious word.

Inside the fence: what the dog does without asking

Once you've switched it on, here is the free-running zone — the things the robot does inside the fence without stopping to ask. Each one is a small, safe, reversible chore:

It tidies up finished work it finds lying around. Completed jobs get harvested and closed out, and it reports back in plain words what it did.
It picks up the next clearly-marked job — but only when the path is obvious and already inside the scope you set. (The next ready item on the hot shelf, nothing surprising.)
It moves a job from the shelf to the desk when the thing it was waiting on finally arrives. (Warm → hot, once the blocker clears.)
It jots a sticky-note for a small follow-up it noticed — without starting it. A note on the board, not a new project launched behind your back.
It writes a short line in the logbook when something taught a lesson — a brief chronicle entry, not a long essay.
It glances at the pictures after a build to check they look right — a visual spot-check of the screenshots and rendered output.

Inside the fence, the robot tidies, picks up obvious chores, and takes notes. It never opens a gate.

Notice what's not here: nothing irreversible, nothing new-in-direction, nothing that rewires the system. Those are gates, not chores.

The four gates the dog never opens alone

These are the fence edges. At each one, the dog sits and barks instead of pushing through — every time, even on auto-pilot.

A new direction. A whole new galaxy, or a feature outside the job you posed. The robot doesn't decide where the work goes.
A real fork in the road. A design choice with genuine trade-offs, where two good answers pull apart. That's a decision with your name on it.
Anything it can't take back. Push, reset, delete, uninstall — and the big irreversible gestures from the previous page: flip-to-public, cargo publish, push a tag, change DNS. If it can't be undone, the robot stops.
Changing the house's wiring. Installing a background service, editing the rules the robot itself obeys, touching shared configs. The robot doesn't rewire the house it lives in.

If it can't be undone, isn't inside the posed scope, or rewires the system — the robot stops and asks, every time, even on auto-pilot.
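
The four gates read naturally as four yes/no questions about a proposed action. A sketch under that assumption; the ProposedAction fields are hypothetical, and how a real robot would answer each question is outside this example.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    NEW_DIRECTION = "new direction"
    REAL_FORK = "real fork"
    IRREVERSIBLE = "can't be undone"
    REWIRING = "rewiring the house"


@dataclass(frozen=True)
class ProposedAction:
    description: str
    in_posed_scope: bool = True
    has_real_tradeoffs: bool = False
    reversible: bool = True
    rewires_system: bool = False


def gates_hit(action):
    """Return every gate this action would have to pass through."""
    gates = []
    if not action.in_posed_scope:
        gates.append(Gate.NEW_DIRECTION)
    if action.has_real_tradeoffs:
        gates.append(Gate.REAL_FORK)
    if not action.reversible:
        gates.append(Gate.IRREVERSIBLE)
    if action.rewires_system:
        gates.append(Gate.REWIRING)
    return gates


def may_run_unattended(action):
    """Inside the fence only: any gate at all means stop and ask."""
    return not gates_hit(action)


if __name__ == "__main__":
    tidy = ProposedAction("close out a finished job")
    tag = ProposedAction("push a release tag", reversible=False)
    print(may_run_unattended(tidy), [g.value for g in gates_hit(tag)])
```

The defaults describe a chore inside the fence, so an action has to be marked as risky to reach a gate. A stricter variant would flip the defaults and make the caller prove an action is safe.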

The leash: three ways to freeze it instantly

You always hold the leash. There are three tugs, and any one of them freezes the dog where it stands:

You speak. Any message from you, at any moment, pauses it. This is the simplest tug of all: just say something.
You drop a stone in the yard. touch ~/.cosmon/autopilot.off. The robot checks for that stone at every decision point; stone present, it sits down and waits.
It trips three times on the same rock. Three failures in a row on the same job, and it stops itself and waits for you. The robot knows when it's stuck.

The leash is always in your hand — a word, a stone, or its own stumble freezes it.
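
The three tugs amount to one check the robot could run at every decision point. A minimal sketch: the stop-file path comes from the text above, while the function name and the way failures are counted are assumptions.

```python
from pathlib import Path

STOP_FILE = Path.home() / ".cosmon" / "autopilot.off"  # the "stone in the yard"
MAX_CONSECUTIVE_FAILURES = 3


def leash_reason(*, operator_spoke, consecutive_failures, stop_file=STOP_FILE):
    """Return why the robot must freeze, or None if it may keep going."""
    if operator_spoke:
        return "operator spoke"
    if Path(stop_file).exists():
        return "stop file present"
    if consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
        return "three failures in a row on the same job"
    return None


if __name__ == "__main__":
    reason = leash_reason(operator_spoke=False, consecutive_failures=0)
    print("keep going" if reason is None else f"freeze: {reason}")
```

The check only reads state and never writes it, so running it at every decision point costs almost nothing and cannot itself cause harm.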

How the robot talks to you — three rules of a good co-pilot

Bounded autonomy is half of it. The other half is good manners — the way the robot talks to you so the back-and-forth never wastes your attention. Three rules, each a picture.

One question, one decision

The robot never asks "should we do X now, or wait for Y first?" — that's two questions wearing one coat, and you can't answer it cleanly. It asks one thing, proposes a default, and lets you say yes / no / not yet (the iMessage 1-2-3-later pattern).

A door with one handle, not a menu.

If two decisions are genuinely needed, it knocks twice — two clean doors, not one tangled one.

It re-knocks at the doorway

A question asked earlier can get lost in a busy session — you move on, the moment passes, the work stalls waiting on an answer you never saw. So the robot asks again — same words, same choices — at the next natural pause, so you recognise it instantly.

A child who asked "are you asleep?" and, getting no answer, asks again at the door rather than giving up.

It speaks in pictures, not jargon

Every reply uses concrete images and short sentences — the picture a smart 8-year-old would grasp — not dense tables of technical terms. Trees, kitchens, paper planes, a dog on a leash.

Showing you a drawing of the kitchen instead of reading you the recipe's ingredient codes.

This voice has a name: the Feynman register. The name describes how the robot speaks to you; it is a promise about its voice, not a switch you flip or a mode you select. You can't "turn it on." It's simply how a good co-pilot talks. (This whole site is written in it.)

Telling you a secret before you send it, not after

One rule gets its own section because it's sharp and easy to get wrong.

Suppose something you're about to send out — an email, a message — must stay private. The robot folds that "please don't forward" into the same message — never as an anxious second message a minute later.

You seal the "confidential" stamp onto the envelope before you mail it. Running down the street after the postman shouting "don't show anyone!" only makes the letter look nervous — and the recipient has already read it.

Two honest moves only: stamp it before sending, or send it freely. Never tighten the secret after the fact. The embargo travels with the artefact or not at all.

Calibrated trust

Recall the dog and the fence one last time.

Auto-pilot is not "the robot takes over," and it is not "you watch it like a hawk." It is a fence you can see, four gates it always knocks on, and a leash you always hold.

That's the whole deal. You switch it on knowing exactly how much rope it has — not blind faith, not paralysis, but calibrated trust. The dog runs free inside the yard, sits and barks at the four gates, and freezes the instant you tug the leash.
