Cascade Chat Has Moved to an ADLC

A while back I built an IRC client and mentioned, mostly in passing, that the real reason was to have something non-trivial to point at the next time someone insists AI can only crank out greenfield toy apps. Cascade Chat is that something. What I never actually wrote down is how it gets built, and the how has settled into enough of a routine that I've started calling it a process: an Agentic Development Lifecycle, or ADLC, mostly because it needed a name.

The shape of it is simple enough. I write a spec, an agent turns it into a plan, the plan gets verified before it's ever a PR, and then the PR gets put through the wringer before anything ships. The only places my hands are on the work are the very front and the very back: ideation, architecture, and the call that something is good enough to release. The whole middle runs without me.

And before anyone reaches for the word slop, none of this holds up if the testing underneath it is for show. I've said it before and I'll keep saying it: if your coverage was weak before you let an agent loose in the repo, it's going to be weak a great deal faster now. The agent inherits your bar. It won't raise it for you.

Spec, then plan

It starts with me writing a spec. It doesn't need to be a novel, just a plain statement of what I want, how it should behave, and the constraints I'm not willing to bend on. The agent reads it and comes back with a plan: the steps it'll take, the files it expects to touch, what it intends to test. Then it goes and does the work, and I'm not standing over its shoulder the whole way.

I write the what. It writes the how. That split is the entire trick, and it only holds up if the what is genuinely clear. Hand over a vague, half-awake prompt and you'll get back a plan you're redoing by lunch. The spec is where the actual thinking lives; everything past it is just typing. The people who skip that part and then gripe that the agent "doesn't understand them" are, frankly, reviewing their own prompt.

The rules live in the repo

The spec is the part I write fresh every time. It isn't the only thing the agent reads, though. Sitting at the root of the project are a CLAUDE.md and an agents.md, and they get pulled into the context on every change whether I think to point at them or not.

That's where the standing orders live. How the Go is laid out and which file a Wails binding belongs in. That a database change means editing the schema and regenerating, not hand-writing a query. That errors get wrapped on the way up. That nothing is finished until the tests and the checks are all green. I wrote those down once. Every agent that touches this repo, today or six months from now, gets handed the same set before it does a single thing.

It's why, by the time a change is sitting in a PR, the tests are already written and the checks already pass. I didn't have to ask for any of it this round; the repo asked on my behalf, the way it does every time. The spec covers what I want out of this one change. Those two files cover what I want out of all of them, and writing that down was the highest-leverage hour I've put into the whole setup.

Watching it work

This next part is the one I'm a little proud of. Before a change is allowed anywhere near a PR it gets verified, and wherever it makes sense, that verification is something I can sit and look at.

Cascade Chat has a fully working local environment, and when Playwright runs, it isn't prodding at mocks. It builds a headless copy of the app, stands up a real Ergo server in Docker Compose, and drives the genuine client against a genuine IRC connection. The frontend, the Go backend, the SQLite layer, the server on the far end, all of it is live. If something in that chain is broken, the test trips over it the same way a person at the keyboard would.

So when a panel won't open, or a layout has slid a few pixels, or a connection quietly fails to come back after a restart, there's a screenshot of it waiting for me before I've read a line of the diff. I'm not going to pretend this replaces unit tests. It catches the kind of bug where every individual piece passes on its own and the assembled thing still faceplants, and in a client like this that's most of the bugs worth catching.

The same rig pulls double duty. The screenshots in the docs come straight out of these tests, shot from a real peer talking to that same disposable Ergo server, and they're regenerated whenever the suite runs. So every image in the documentation is the client honestly doing the thing it claims to do. There's no folder of lovingly posed marketing shots that stopped being accurate three releases ago, which, let's be honest, is where screenshots in a README usually end up.

Defense in layers

Once it's a PR the tests stop being polite. Go unit tests, table-driven, over the IRC core and the storage layer. Vitest on the frontend. The Playwright suite again, this time in CI. I lean hardest on the Go side because IRC is deceptively stateful and its bugs are mean little things: a capability you forgot to negotiate, a race in the event bus, a write buffer that didn't flush before shutdown. That class of problem shows up in a test or it shows up in front of a user, and I have a very firm preference between those two.

More than one at a time

The other thing that falls out of working this way is that I'm hardly ever running just one agent. Once a change is mostly hands-off, sitting and watching a single one defeats the entire point.

So each agent gets its own git worktree. Its own checkout, its own branch, its own copy of the tree to make a mess in, all hanging off the same repo. One can be reworking the connection settings panel while another tears into the storage layer, and they're not reaching across each other to clobber the same files. When one wraps up, its branch goes through the same verification and the same PR gauntlet as anything I touched by hand. Running three at once doesn't soften the bar for any of them. It just means three things are clearing it at the same time.

The catch is that my day is mostly review now. I'm bouncing between branches reading diffs instead of sitting inside one of them writing code, and that took a while to get used to. It's the trade I'd take every time, though. Reviewing three good changes is a better use of me than typing out one.

What's left for me

So what's actually mine in all this? Three things, give or take.

I decide what gets built. I decide how the big pieces fit together. And I decide when something is genuinely fit to ship.

Ideation is the one that caught me off guard, because it's gotten bigger. I'd have bet that handing the typing to an agent would shrink the part where I sit around dreaming up what to build, and it did the exact opposite. When building a thing stops being expensive, you quit talking yourself out of ideas just because they sound like a slog. You just try them. Architecture matters for the same reason it always has, except the stakes are sharper now: the agent will build faithfully inside whatever structure I give it, so a bad structure just gets me bad architecture, delivered on time and fully tested.

The release call stays with me because an agent genuinely doesn't know what "ready" means. It can tell me the suite is green. It can't tell me whether the thing is any good. That's taste, and taste is the one piece of this I have zero interest in handing off.

Everything else, the code, the tests, the refactors, the hundred small judgment calls about naming and edge cases, is handled. Which doesn't get me off the hook for knowing my own codebase, to be clear. I still read it, I still know how it fits together. It just means the hours I spend in here are the ones that actually needed a human in the chair.

If you want to see what falls out the other end of all this, the unstable builds are up on GitHub. I run them before anyone else does; the release candidates are the project eating its own cooking until I trust a build enough to call it stable. Pull one down and poke at it. What you're holding is just what a normal week of this process spits out, and the whole thing is BSD-licensed, so if any of it looks worth lifting, lift it.