Open source · Apache-2.0

Task in. Accountable change in production out.

sembl-stack runs the whole pipeline around your coding agent: a spec becomes declared bounds, an agent writes inside a disposable sandbox, the sembl gate judges the real diff against those bounds, a PASS merges and deploys, and a post-deploy gate confirms the deployment is healthy or rolls it back. Every stage is a swappable adapter behind one typed contract. The stack doesn't promise the model writes better code; it guarantees the process around the model is correct, recorded, and can't be talked out of its verdict.

Watch a run ↓

sembl-stack loop task.yaml

A scripted replay of a real loop - including the part vendors don't show you: the first attempt gets BLOCKED and retried. Run it yourself: quickstart →

The stage map

Nine stages. One wire between them.

Each stage consumes typed artifacts and produces typed artifacts - that's the whole interface. A stage doesn't know or care what implements its neighbors.

L0 Protocol & hub (we own): The typed artifact contract and the stage Protocol every layer plugs into.

L1 Repo intel (adapter): Code-graph context for the task - symgraph, codebase-memory, or nothing.

L2 Spec → bounds (we own): The task becomes a bounds contract: editable paths, forbidden areas, churn budget.

L3 Execute (adapter): An agent writes the change. Claude Code, Aider, OpenCode - the stack never grades its own executor.

L4 Sandbox (adapter): The change happens in a disposable clone. A bad diff can't hurt the real tree.

L5 Verify (the gate): The sembl gate judges the real diff against the bounds: PASS / WARN / BLOCK, deterministically.

L5.5 Review (adapter): Advisory code-quality pass on the same diff - bring your own agent-CLI reviewer (or CodeRabbit). A signal, never a gate.

L6 Orchestrate (we own): The loop: plan → execute → gate, retry on BLOCK, trace everything.

L6.5 Merge (we own): Only a PASS verdict bound to this exact change can merge. The MergeRecord says who let it through, and why.

L7 / L8 Deploy & verify in prod (the gate): Ship, then gate production itself: health and payload checks confirm the delivery - or trigger a rollback.
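
The "artifacts in, artifacts out" interface can be sketched with Python's typing.Protocol. This is a minimal illustration, not sembl-stack's actual API: Artifact, Stage, run, SpecToBounds, MockExecutor, and run_pipeline are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Artifact:
    """Hypothetical typed artifact: a kind tag plus a payload."""
    kind: str
    payload: dict


class Stage(Protocol):
    """Hypothetical stage contract: consume artifacts, produce artifacts."""
    name: str

    def run(self, inputs: dict[str, Artifact]) -> dict[str, Artifact]: ...


class SpecToBounds:
    name = "spec-to-bounds"

    def run(self, inputs: dict[str, Artifact]) -> dict[str, Artifact]:
        # Turn the task's declared paths into a bounds artifact.
        task = inputs["task"]
        return {"bounds": Artifact("bounds", {"editable": task.payload["paths"]})}


class MockExecutor:
    name = "execute"

    def run(self, inputs: dict[str, Artifact]) -> dict[str, Artifact]:
        # Pretend to edit the first editable path; no model is called.
        path = inputs["bounds"].payload["editable"][0]
        return {"change": Artifact("change", {"files": [path]})}


def run_pipeline(stages: list[Stage], seed: dict[str, Artifact]) -> dict[str, Artifact]:
    """Run stages in order; each one only sees the shared artifact store."""
    store = dict(seed)
    for stage in stages:
        store.update(stage.run(store))
    return store


store = run_pipeline(
    [SpecToBounds(), MockExecutor()],
    {"task": Artifact("task", {"paths": ["src/app.py"]})},
)
```

Neither stage class imports or references the other; swapping MockExecutor for another class with the same run signature leaves run_pipeline unchanged.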

The artifact contract

Stages talk in artifacts, not in prose.

Every hand-off in the pipeline is a typed, serialized artifact in the run store. That's what makes stages swappable - and what makes a run auditable after the fact: .sembl/runs/<id>/ holds the complete paper trail of how a change reached production.

Task: what was asked - the input to everything
Bounds: the contract: editable paths, forbidden areas, churn budget
Change: the real unified diff the executor produced
Verdict: PASS / WARN / BLOCK, with reasons, bound to the exact diff it judged
MergeRecord: what merged, under which verdict, with the binding check's outcome
Delivery: what deployed where - the input to the post-deploy gate
Trace: every stage transition, timed
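
To make the Bounds and Verdict artifacts concrete, here is a minimal sketch of a deterministic check. The field names, the glob matching, and the rule that an over-budget diff yields WARN rather than BLOCK are assumptions for illustration; the real sembl gate's rules may differ.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Bounds:
    """Hypothetical bounds contract."""
    editable: tuple[str, ...]
    forbidden: tuple[str, ...] = ()
    churn_budget: int = 200  # max changed lines; illustrative value


@dataclass(frozen=True)
class Verdict:
    status: str  # "PASS", "WARN", or "BLOCK"
    reasons: tuple[str, ...]


def judge(changed: dict[str, int], bounds: Bounds) -> Verdict:
    """Judge a diff summary (file path -> changed line count) against bounds."""
    block, warn = [], []
    for path in sorted(changed):  # sorted so reasons come out in a stable order
        if any(fnmatch(path, pat) for pat in bounds.forbidden):
            block.append(f"forbidden: {path}")
        elif not any(fnmatch(path, pat) for pat in bounds.editable):
            block.append(f"outside editable paths: {path}")
    churn = sum(changed.values())
    if churn > bounds.churn_budget:
        # Assumption: exceeding the churn budget warns instead of blocking.
        warn.append(f"churn {churn} exceeds budget {bounds.churn_budget}")
    if block:
        return Verdict("BLOCK", tuple(block + warn))
    if warn:
        return Verdict("WARN", tuple(warn))
    return Verdict("PASS", ())


bounds = Bounds(editable=("src/*",), forbidden=("src/secrets/*",), churn_budget=10)
```

The same inputs always produce the same verdict and the same reasons, which is the property the word "deterministically" points at.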

a run, on disk

.sembl/runs/2ca41f/
├─ task.json          # what was asked
├─ bounds.json        # the declared contract
├─ change.json        # the actual diff
├─ verdict.json       # the gate's judgement + subject binding
├─ merge-record.json  # what shipped, and under whose PASS
└─ trace.json         # the timeline
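
Because the run store is plain JSON files, a run can be audited with a few lines of standard-library Python. The sketch below assumes only the file names shown in the tree above; the JSON fields written in the demo ("title", "status") and the load_run helper are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# File names taken from the run layout above.
ARTIFACT_FILES = (
    "task.json",
    "bounds.json",
    "change.json",
    "verdict.json",
    "merge-record.json",
    "trace.json",
)


def load_run(run_dir: Path) -> tuple[dict[str, dict], list[str]]:
    """Return (parsed artifacts by file name, names of artifacts not on disk)."""
    found: dict[str, dict] = {}
    missing: list[str] = []
    for name in ARTIFACT_FILES:
        path = run_dir / name
        if path.is_file():
            found[name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            missing.append(name)
    return found, missing


# Demo against a throwaway directory; the JSON contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / ".sembl" / "runs" / "2ca41f"
    run_dir.mkdir(parents=True)
    (run_dir / "task.json").write_text(json.dumps({"title": "demo task"}), encoding="utf-8")
    (run_dir / "verdict.json").write_text(json.dumps({"status": "BLOCK"}), encoding="utf-8")
    found, missing = load_run(run_dir)
```

In this demo a blocked run has no merge-record.json, so the missing list itself shows how far the change got.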

Swappable by construction

Change a layer with one line. Not one refactor.

The stack takes no side in the agent wars. If a better executor ships next month, you swap it in sembl.stack.yaml and the rest of the pipeline doesn't notice. The same goes for the sandbox, the repo-intel layer, and the deploy target.

sembl.stack.yaml

execute:  claude      # → aider | opencode | yours
sandbox:  clone       # disposable worktree
intel:    symgraph
gate:     sembl       # the one layer that never lies
deploy:   vercel
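
One way a one-line swap can work is a registry that maps each layer's configured name to an adapter factory. The sketch below is an assumption about the general pattern, not sembl-stack's implementation: REGISTRY, register, build_stack, and the two executor classes are hypothetical, and the config is passed as an already-parsed dict.

```python
from typing import Callable

# layer name -> adapter name -> factory; hypothetical structure
REGISTRY: dict[str, dict[str, Callable[[], object]]] = {}


def register(layer: str, name: str):
    """Class decorator that files an adapter under (layer, name)."""
    def decorator(factory):
        REGISTRY.setdefault(layer, {})[name] = factory
        return factory
    return decorator


@register("execute", "claude")
class ClaudeExecutor:
    label = "claude"


@register("execute", "aider")
class AiderExecutor:
    label = "aider"


def build_stack(config: dict[str, str]) -> dict[str, object]:
    """Instantiate one adapter per configured layer."""
    stack = {}
    for layer, name in config.items():
        try:
            factory = REGISTRY[layer][name]
        except KeyError:
            raise ValueError(f"no adapter {name!r} registered for layer {layer!r}") from None
        stack[layer] = factory()
    return stack


stack = build_stack({"execute": "aider"})
```

Changing "aider" to "claude" in the dict is the only edit needed to swap executors; nothing that consumes the stack has to change.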

We own exactly three things

The artifact contract + stage Protocol, the gate (L5 and the post-deploy L8), and the glue that lets layers be replaced. Everything else is deliberately someone else's best-in-class tool behind an interface.

Executors are interchangeable

Three ship today - Claude Code, Aider, OpenCode - and adding one is an adapter, not a fork. The gate treats them all identically: it only ever sees the diff.

claude · aider · opencode · mock (no keys)

Presets to start from

just-gate gates any diff with nothing but sembl installed. gate+sandbox runs the whole loop with a mock executor and no API keys. full-loop is the real thing.

The accountable spine

A verdict is bound to the change it judged.

Most agent pipelines stop at "the check passed." sembl-stack's spine goes further: a verdict can only ship the exact change it was issued for.

Verdicts carry their subject

Every verdict is stamped with the SHA-256 and file set of the diff it judged. apply recomputes the hash and refuses a verdict that wasn't issued for that patch; merge refuses if the merge would ship files the verdict never saw.
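
The binding described above can be sketched with hashlib. This is an illustration under assumptions: the exact bytes sembl hashes and the shape of its verdict are not specified here, so BoundVerdict, issue_verdict, binds_to, and covers are hypothetical names, and hashing the UTF-8 diff text is a guess at the scheme.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundVerdict:
    """Hypothetical verdict stamped with the subject it judged."""
    status: str
    diff_sha256: str
    files: frozenset[str]


def _digest(diff_text: str) -> str:
    return hashlib.sha256(diff_text.encode("utf-8")).hexdigest()


def issue_verdict(status: str, diff_text: str, files: list[str]) -> BoundVerdict:
    """Stamp the verdict with the hash and file set of the diff it judged."""
    return BoundVerdict(status, _digest(diff_text), frozenset(files))


def binds_to(verdict: BoundVerdict, diff_text: str) -> bool:
    """Apply-time check: recompute the hash and compare."""
    return _digest(diff_text) == verdict.diff_sha256


def covers(verdict: BoundVerdict, merge_files: list[str]) -> bool:
    """Merge-time check: the merge may not ship files the verdict never saw."""
    return set(merge_files) <= verdict.files


diff = "--- a/src/app.py\n+++ b/src/app.py\n@@ -1 +1 @@\n-old\n+new\n"
verdict = issue_verdict("PASS", diff, ["src/app.py"])
```

Any edit to the patch after the verdict was issued, even a single added character, changes the digest and makes binds_to return False.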

BLOCK means blocked

A BLOCK verdict is never applied and never merged - the loop retries the executor instead. Overrides exist (--skip-binding-check), but they're recorded in the MergeRecord forever. There is no quiet way past the gate.
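
The retry-on-BLOCK behavior can be sketched as a small loop. The function below is a hypothetical illustration: run_loop, its callable parameters, and the max_attempts cap are assumptions, not the stack's orchestrator.

```python
from typing import Callable, Optional


def run_loop(
    execute: Callable[[int], str],
    gate: Callable[[str], str],
    max_attempts: int = 3,  # assumed cap; the real limit is not stated here
) -> tuple[Optional[str], str, list[tuple[int, str]]]:
    """Ask the executor for a diff, gate it, and retry while the gate says BLOCK."""
    trace: list[tuple[int, str]] = []
    for attempt in range(1, max_attempts + 1):
        diff = execute(attempt)
        status = gate(diff)
        trace.append((attempt, status))
        if status != "BLOCK":
            return diff, status, trace
    # Every attempt was blocked: nothing is returned for apply or merge.
    return None, "BLOCK", trace


def fake_execute(attempt: int) -> str:
    # First attempt strays into a forbidden area, later attempts stay in bounds.
    return "edit src/secrets/key.py" if attempt == 1 else "edit src/app.py"


def fake_gate(diff: str) -> str:
    return "BLOCK" if "secrets" in diff else "PASS"


diff, status, trace = run_loop(fake_execute, fake_gate)
```

The blocked diff never leaves the loop; only a diff with a non-BLOCK verdict is handed back to the caller.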

Production is gated too

After deploy, the L8 gate checks the live delivery - health and payload, deterministically - and triggers a rollback when it fails. Proven live against a real deployment, not a diagram.
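
A post-deploy check of this kind can be sketched as a pure function over an injected fetcher, which keeps it deterministic and testable without a network. Everything named here is hypothetical: post_deploy_gate, the fetch signature, the URLs, and the idea that the payload check is a substring match are assumptions for illustration.

```python
from typing import Callable

# fetch(url) -> (HTTP status code, response body); injected so tests need no network
Fetch = Callable[[str], tuple[int, str]]


def post_deploy_gate(
    fetch: Fetch,
    health_url: str,
    payload_url: str,
    expected_marker: str,
    rollback: Callable[[], None],
) -> str:
    """Return "PASS" if the live delivery looks right, else roll back and return "BLOCK"."""
    health_code, _ = fetch(health_url)
    payload_code, body = fetch(payload_url)
    healthy = health_code == 200
    payload_ok = payload_code == 200 and expected_marker in body
    if healthy and payload_ok:
        return "PASS"
    rollback()
    return "BLOCK"


rolled_back: list[str] = []


def good_fetch(url: str) -> tuple[int, str]:
    return 200, "build 2ca41f"


def bad_fetch(url: str) -> tuple[int, str]:
    return 503, ""


ok = post_deploy_gate(good_fetch, "https://example.test/health", "https://example.test/",
                      "2ca41f", lambda: rolled_back.append("rollback"))
failed = post_deploy_gate(bad_fetch, "https://example.test/health", "https://example.test/",
                          "2ca41f", lambda: rolled_back.append("rollback"))
```

In a real deployment the fetcher would wrap an HTTP client and the rollback callable would call the deploy adapter; both are left abstract here.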

The gate at the center is Sembl - and it stands alone.

If you don't want the whole factory, take just the gate: pip install sembl gives you the deterministic PASS / WARN / BLOCK verdict in CI, pre-commit, or over MCP. The stack is what you graduate to when you want the gate wired through merge, deploy, and production.

Meet the gate →

Install

From zero to a gated loop in four commands.

Or in one: run bare sembl-stack in your repo to start a guided run, with agent and keys shown with live status, the task written in plain English, and the loop streaming in your terminal. The commands below are the same machinery, scriptable. Full detail is in the documentation.

Install & scaffold

pip install sembl-stack sembl

# scaffold sembl.stack.yaml + task.yaml from a preset
sembl-stack init
sembl-stack doctor   # config-aware preflight

Run the loop

# plan → execute → gate → retry-on-BLOCK
sembl-stack loop task.yaml

sembl-stack runs             # inspect the run store
sembl-stack apply <run-id>   # apply the accepted patch