gi[t]hings.done

2026-03-04

In my last blog post, I mentioned a little tool to help me quickly run the same prompt and process across multiple coding agents and models. Turns out I stumbled onto my new favorite way to interact with coding agents - not interacting at all.

The Problem

My journey through the rapid evolution of agentic coding always had a clear trend towards fewer, but more focused interactions. And while terminal chats were a huge leap over smarter autocompletes and next-edit suggestions, I still felt glued to that chat interface and was wasting time watching tokens fly by. The magic faded rather quickly. Also, the chat interfaces make some basic things hard:

- jumping back to the previous version
- navigating within the "plan" document
- giving focused feedback to the agent

Turns out I already use two great tools that can cover that: git and Neovim 🤯. Also, I own a very expensive keyboard with a very sophisticated layout, so I'm determined to squeeze every ounce of clickety-clackety pleasure out of it. Talking to Claude Code via Wispr Flow? Hard pass!

So I went on a trip to cobble together a little workflow that focuses on files and git commits instead. And it eventually became githingsdone.

Example workflow

1. Create a plan

First, I create a TODO.md containing the rough requirements. Sometimes I also just paste in the link to a GitHub issue. Whatever the agent can handle.

- add team management page
- paginated listing
- filter by role
- add users (email, real name, roles)
- block/unblock users

2. Run `gtd`

Then I run gtd. It will first commit my requirements as a 🌱-commit, to keep full history. Then it will kick off the planning agent and create a plan with work packages, action items that can be ticked off, and questions about any uncertainties.

# Team management

## Action Items

- [ ] Create a new route `/admin/team`
- [ ] Implement listing with pagination
- [ ] Add filters
- [ ] "Add user" dialog
- [ ] "Block user" with confirmation dialog

## Open questions

- Should the filter form automatically submit on change?
- Should filters persist to the URL?

The result is committed as a 🤖-commit.
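To make the shape of that plan concrete, here is a minimal TypeScript sketch that reads a TODO.md like the one above into action items and open questions. It is not taken from gtd; `parsePlan`, `ActionItem`, and `Plan` are hypothetical names, and it assumes the plan uses `## ` section headings, `- [ ]` checkboxes, and an "Open questions" section exactly as shown.

```typescript
// Hypothetical sketch, not gtd's actual code.
interface ActionItem {
  text: string;
  done: boolean;
}

interface Plan {
  actionItems: ActionItem[];
  openQuestions: string[];
}

function parsePlan(markdown: string): Plan {
  const plan: Plan = { actionItems: [], openQuestions: [] };
  let section = "";

  for (const line of markdown.split("\n")) {
    // Track the current "## ..." section (the "# ..." title does not match).
    const heading = line.match(/^##\s+(.*)$/);
    if (heading) {
      section = (heading[1] ?? "").trim().toLowerCase();
      continue;
    }

    // Checkbox items: "- [ ] text" is open, "- [x] text" is done.
    const checkbox = line.match(/^- \[( |x|X)\] (.*)$/);
    if (checkbox) {
      plan.actionItems.push({
        text: (checkbox[2] ?? "").trim(),
        done: checkbox[1] !== " ",
      });
      continue;
    }

    // Plain top-level bullets count as questions only under "Open questions".
    const bullet = line.match(/^- (.*)$/);
    if (bullet && section === "open questions") {
      plan.openQuestions.push((bullet[1] ?? "").trim());
    }
  }
  return plan;
}
```

Counting the entries with `done === false` gives a quick answer to "is there still work left in this plan?".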

3. Review and give feedback

I open TODO.md in my editor, answer questions, and add comments to items that I have different opinions on.

## Action Items

- [ ] Create a new route `/admin/team`
- [ ] Implement listing with pagination
- [ ] Add filters
- [ ] "Add user" dialog
  > make sure all operations use optimistic UI patterns
- [ ] "Block user" with confirmation dialog

## Open questions

- Should the filter form automatically submit on change?
  > never do this, it's an accessibility nightmare
- Should filters persist to the URL?
  > yes

After running gtd again, it will commit my feedback as 🤦 and re-run the planning agent to adapt the plan. I repeat that until I'm satisfied and confident that the genie will do a decent job.
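The feedback format above is simple enough to pick apart mechanically. The following TypeScript sketch pairs each indented `>` comment with the bullet it sits under. In the workflow described here the planning agent reads the file itself, so this is only an illustration of the convention; `extractFeedback` and `Feedback` are hypothetical names.

```typescript
// Hypothetical sketch, not gtd's actual code.
interface Feedback {
  item: string;    // the bullet the comment is attached to
  comment: string; // the quoted reviewer note
}

function extractFeedback(markdown: string): Feedback[] {
  const feedback: Feedback[] = [];
  let currentItem: string | null = null;

  for (const line of markdown.split("\n")) {
    // A top-level bullet, with or without a checkbox, starts a new item.
    const bullet = line.match(/^- (?:\[[ xX]\] )?(.*)$/);
    if (bullet) {
      currentItem = (bullet[1] ?? "").trim();
      continue;
    }

    // An indented "> ..." line is feedback on the current item.
    const quote = line.match(/^\s+>\s?(.*)$/);
    if (quote && currentItem !== null) {
      feedback.push({ item: currentItem, comment: (quote[1] ?? "").trim() });
      continue;
    }

    // Any other non-blank line (for example a heading) ends the current item.
    if (line.trim() !== "") currentItem = null;
  }
  return feedback;
}
```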

4. Build

Once the plan is solid, I run gtd again. It will kick off a ralph loop that iterates through all work packages and builds them. At the end of each package it will run the test suite and feed any errors back into the agent, repeating until everything is green.
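The build step boils down to a nested loop: one pass per work package, and inside it a retry loop that keeps handing test failures back to the agent. Here is a rough TypeScript sketch of that idea. It is not gtd's implementation: `buildPackages`, `RunAgent`, and `RunTests` are hypothetical, the agent and test runner are injected as plain synchronous functions, and the `maxAttempts` cap is an assumption added here so the sketch cannot spin forever.

```typescript
// Hypothetical sketch, not gtd's actual code.
interface TestResult {
  ok: boolean;
  output: string; // test runner output, fed back to the agent on failure
}

type RunAgent = (prompt: string) => void;
type RunTests = () => TestResult;

function buildPackages(
  packages: string[],
  runAgent: RunAgent,
  runTests: RunTests,
  maxAttempts = 5, // assumption: a safety cap, not mentioned in the post
): void {
  for (const pkg of packages) {
    let prompt = `Implement this work package: ${pkg}`;
    let green = false;

    for (let attempt = 1; attempt <= maxAttempts && !green; attempt++) {
      runAgent(prompt);
      const result = runTests();
      green = result.ok;
      if (!green) {
        // Feed the failures back into the next agent run.
        prompt = `Tests failed for "${pkg}". Fix these errors:\n${result.output}`;
      }
    }

    if (!green) {
      throw new Error(`Package "${pkg}" still failing after ${maxAttempts} attempts`);
    }
  }
}
```

Injecting `runAgent` and `runTests` keeps the loop itself testable without a real agent or test suite.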

5. Review

Now I open Neovim and start review.nvim to review all files that have been changed since my 🌱-commit. General testing feedback goes right into TODO.md, while code feedback can be left as simple // TODO: ... comments where they are relevant. Running gtd will again commit my plain feedback as 🤦 and start a new planning session that will remove my comments and turn them into new items in TODO.md. I can do this as often as I want and re-run gtd to incorporate more feedback. If it is run without any more changes, it will start the build/test loop again for all the new items.
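In this workflow the planning agent is what removes the review comments and turns them into plan items. Purely to illustrate the convention, here is a deterministic TypeScript sketch that does the same thing for `// TODO:` comments in a single file. `extractTodoComments` and `TodoComment` are hypothetical names, and the regex is deliberately naive: it would also match `// TODO:` inside a string literal.

```typescript
// Hypothetical sketch, not gtd's actual code.
interface TodoComment {
  line: number; // 1-based line number in the original source
  text: string;
}

function extractTodoComments(source: string): { todos: TodoComment[]; cleaned: string } {
  const todos: TodoComment[] = [];
  const kept: string[] = [];

  source.split("\n").forEach((line, index) => {
    // Naive match: everything before "// TODO:" and the comment text after it.
    const match = line.match(/^(.*?)\s*\/\/\s*TODO:\s*(.*)$/);
    if (!match) {
      kept.push(line);
      return;
    }
    const code = match[1] ?? "";
    todos.push({ line: index + 1, text: (match[2] ?? "").trim() });
    // Keep code that preceded a trailing comment; drop comment-only lines.
    if (code.trim() !== "") kept.push(code);
  });

  return { todos, cleaned: kept.join("\n") };
}
```

Each entry in `todos` could then be appended to TODO.md as a new `- [ ]` item.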

6. Cleanup

After all is done and I have no more testing or code feedback, I run gtd one last time. This will create a 🧹-cleanup commit that just removes TODO.md. Now I would open a pull request and let CI do its thing.
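Taken together, the six steps mean a single command picks its next action from the state of the repository. The TypeScript sketch below is one guess at that decision, reconstructed only from the steps above and not from gtd's source. `nextPhase`, `RepoState`, and the phase names are hypothetical, and how gtd really detects each condition is an assumption.

```typescript
// Hypothetical sketch, not gtd's actual code.
type Phase =
  | "seed-and-plan"       // commit requirements as 🌱, then plan (🤖)
  | "feedback-and-replan" // commit feedback as 🤦, then re-plan
  | "build"               // run the build/test loop over open items
  | "cleanup";            // remove TODO.md in a 🧹 commit

interface RepoState {
  hasUncommittedChanges: boolean; // edits to TODO.md or new // TODO: comments
  gtdHasRunBefore: boolean;       // assumption: detectable from earlier gtd commits
  openActionItems: number;        // unchecked "- [ ]" entries in TODO.md
}

function nextPhase(state: RepoState): Phase {
  if (state.hasUncommittedChanges) {
    // Fresh requirements on the first run, human feedback on later runs.
    return state.gtdHasRunBefore ? "feedback-and-replan" : "seed-and-plan";
  }
  // Nothing new from the human: build what is left, or tidy up.
  return state.openActionItems > 0 ? "build" : "cleanup";
}
```

The sketch ignores edge cases such as running the command with no TODO.md at all.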

You might ask what happens with all those colorful emoji commits? I squash them, because I don't want to be pulled into discussions with my coworkers 😅. But they actually give me a nice history in the PR on how a feature evolved.

Under the hood

In the beginning this started out as an agent-agnostic tool that works with both Claude Code and Opencode (for the sake of pitting those two against each other with different models). But at some point I switched to Pi, a minimalistic coding agent of questionable fame as the agent inside OpenClaw. gtd will respect any Pi configuration that you already have on your system, like skills, plugins, and models. Thanks to Pi, it is compatible with pretty much all subscriptions out there. Even Claude Max, although the legal situation is a little unclear. Pi uses Anthropic's official agent-sdk, which automatically picks up my subscription, but Anthropic stated that anything not Claude Code is against their terms of service. If they block me, OpenAI has a new customer 🤷‍♂️.

Vibed all the way down

Another interesting aspect of this project is that I vibed all the way down. Barely ever looked at any code, just had the agent churning in the background. And my main takeaway is that there is no way I would ever do this with anything even remotely connected to my salary.

At some point in the middle I even hit a wall where the agent couldn't make any more changes without completely breaking half of the tool. It also deleted 45 minutes of manual code review without committing it 🤦‍♂️. Very frustrating. But I managed to recover from that corner. And help came from a very unlikely place. I added cucumber.js and tasked Claude to create scenarios for all use- and edge cases. Not a single Gherkin scenario was written by me, and the agent did not even run the scenarios itself. The mere existence of clear user stories made the agent suddenly perform way better. Not all heroes wear capes.

What's next?

I have been using gtd for the last couple of weeks for all my production work, and it has been a blast. Feel free to install it with npm install -g githingsdone, try it, and leave feedback in the GitHub repository. But to be honest, even if I remain the only user, it's worth it. We really live in the age of personal software. After a little more testing, I plan to integrate a couple more features:

- An "exploration" phase for tasks that require more research (the emoji will be 🤠, but I do not know more yet)
- A "learning" (🎓) feature that looks for recurring patterns in my feedback or test fixes and integrates them into AGENTS.md
- An AI-based review step before my manual review. Currently blocked by the lack of a robot-facepalm emoji.

I did try to add some of those, but Claude was not able to handle this on its own any more. Too much complexity and even with scenarios, it always broke a lot. Looks like I will eventually have to roll up my sleeves and actually look at that code first.