gi[t]hings.done
In my last blog post, I mentioned a little tool to help me quickly run the same prompt and process across multiple coding agents and models. Turns out I stumbled onto my new favorite way to interact with coding agents - not interacting at all.
The Problem
My journey through the rapid evolution of agentic coding always had a clear trend towards fewer, but more focused interactions. And while terminal chats were a huge leap over smarter autocompletes and next-edit suggestions, I still felt glued to that chat interface, wasting time watching tokens fly by. The magic faded rather quickly. Also, the chat interfaces make some basic things hard:
Turns out I already use two great tools that can cover that! git and Neovim 🤯. Also, I own a very expensive keyboard with a very sophisticated layout, so I'm determined to squeeze every ounce of clickety-clackety pleasure out of it. Talking to Claude Code via Wispr Flow? Hard pass!
So I went on a trip to cobble together a little workflow that focuses on files and git commits instead. And it eventually became githingsdone.
Example workflow
1. Create a plan
First, I create a TODO.md containing the rough requirements. Sometimes I also
just paste in the link to a GitHub issue. Whatever the agent can handle.
- add team management page
- paginated listing
- filter by role
- add users (email, real name, roles)
- block/unblock users
2. Run gtd
Then I run gtd. It first commits my requirements as a 🌱-commit to keep the
full history. Then it kicks off the planning agent, which creates a plan with
work packages, action items that can be ticked off, and questions for any
uncertainties.
# Team management
## Action Items
- [ ] Create a new route `/admin/team`
- [ ] Implement listing with pagination
- [ ] Add filters
- [ ] "Add user" dialog
- [ ] "Block user" with confirmation dialog
## Open questions
- Should the filter form automatically submit on change?
- Should filters persist to the URL?
The result is committed as a 🤖-commit.
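To make the shape of such a plan concrete, here is a minimal Python sketch that reads a TODO.md-style file and sorts its lines into pending items, finished items, and open questions. It is purely illustrative and says nothing about how gtd handles the file internally; `Plan` and `parse_plan` are hypothetical names, and the `[x]` marker for finished items is an assumption based on common Markdown checkboxes.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Plan:
    done: list[str] = field(default_factory=list)
    pending: list[str] = field(default_factory=list)
    questions: list[str] = field(default_factory=list)


# Matches "- [ ] text" (pending) and "- [x] text" (done) checkbox items.
CHECKBOX = re.compile(r"^\s*- \[( |x|X)\] (.+)$")


def parse_plan(markdown: str) -> Plan:
    """Split a TODO.md-style plan into done items, pending items and questions."""
    plan = Plan()
    section = ""
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Remember which second-level section we are in.
            section = line[3:].strip().lower()
            continue
        match = CHECKBOX.match(line)
        if match:
            target = plan.pending if match.group(1) == " " else plan.done
            target.append(match.group(2).strip())
        elif section == "open questions" and line.lstrip().startswith("- "):
            plan.questions.append(line.lstrip()[2:].strip())
    return plan
```

Counting `plan.pending` is enough to tell whether there is still work left for the build step.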
3. Review and give feedback
I open TODO.md in my editor, answer questions, and add comments to items that
I have different opinions on.
## Action Items
- [ ] Create a new route `/admin/team`
- [ ] Implement listing with pagination
- [ ] Add filters
- [ ] "Add user" dialog
> make sure all operations use optimistic UI patterns
- [ ] "Block user" with confirmation dialog
## Open questions
- Should the filter form automatically submit on change?
> never do this, it's an accessibility nightmare
- Should filters persist to the URL?
> yes
After running gtd again, it commits my feedback as a 🤦-commit and re-runs the
planning agent to adapt the plan. I repeat that until I'm satisfied and
confident that the genie will do a decent job.
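The feedback convention is simple enough to sketch: a `>` line belongs to the list item directly above it. The Python snippet below pairs each comment with its item. It is an illustration under that assumption, not gtd's implementation (in practice the planning agent presumably just reads the file), and `collect_feedback` is a hypothetical name.

```python
import re


def collect_feedback(markdown: str) -> dict[str, list[str]]:
    """Map each list item to the '>' comments written directly beneath it."""
    feedback: dict[str, list[str]] = {}
    current_item = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("- "):
            # Drop the bullet and an optional "[ ]" / "[x]" checkbox prefix.
            current_item = re.sub(r"^- (\[[ xX]\] )?", "", line)
        elif line.startswith(">") and current_item is not None:
            feedback.setdefault(current_item, []).append(line[1:].strip())
        elif line.startswith("#"):
            # A heading ends the current item, so stray comments are ignored.
            current_item = None
    return feedback
```

Items without a comment simply do not show up in the result, which keeps the output focused on what actually needs replanning.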
4. Build
Once the plan is solid, I run gtd again. It kicks off a Ralph loop that
iterates through all work packages and builds them. At the end of each package
it runs the test suite and feeds any errors back into the agent, repeating
until everything is green.
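The build step boils down to a retry loop per package. Here is a rough Python sketch of that idea, with the agent and the test suite passed in as plain callables. All names (`build_package`, `build_all`, `run_agent`, `run_tests`) are hypothetical, and the `max_attempts` cap is my own assumption so the sketch cannot spin forever; the description above only says "until everything is green".

```python
from typing import Callable, Iterable, Optional


def build_package(
    package: str,
    run_agent: Callable[[str, Optional[str]], None],
    run_tests: Callable[[], Optional[str]],
    max_attempts: int = 5,  # assumption: a safety cap, not part of the description
) -> int:
    """Run the agent on one package until the tests pass; return attempts used."""
    errors: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        run_agent(package, errors)  # errors is None on the first attempt
        errors = run_tests()        # None means the suite is green
        if errors is None:
            return attempt
    raise RuntimeError(f"{package!r} still failing after {max_attempts} attempts")


def build_all(
    packages: Iterable[str],
    run_agent: Callable[[str, Optional[str]], None],
    run_tests: Callable[[], Optional[str]],
) -> dict[str, int]:
    """Build every package in order and record how many attempts each took."""
    return {p: build_package(p, run_agent, run_tests) for p in packages}
```

Passing the previous test output back into `run_agent` is the important bit: each retry starts from the concrete failure instead of from scratch.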
5. Review
Now I open Neovim and start review.nvim to review all files that have
changed since my 🌱-commit. General testing feedback goes right into TODO.md;
code feedback can be left as simple // TODO: ... comments where they are
relevant. Running gtd again commits my plain feedback as a 🤦-commit and starts a
new planning session that removes my comments and turns them into new items
in TODO.md. I can do this as often as I want and re-run gtd to incorporate
more feedback. If it is run without any further changes, it starts the
build/test loop again for all the new items.
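For a feel of what "turning comments into items" involves, here is a small Python sketch that finds `// TODO: ...` comments in a set of files and reports where they are. It is only an illustration: how gtd and the planning agent actually pick up these comments is not described here, and `find_todo_comments` is a hypothetical helper that takes file contents as a plain dict.

```python
import re

# Matches "// TODO: some text" anywhere on a line and captures the text.
TODO_COMMENT = re.compile(r"//\s*TODO:\s*(.+)$")


def find_todo_comments(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line number, text) for every '// TODO: ...' comment."""
    found = []
    for path, content in files.items():
        for number, line in enumerate(content.splitlines(), start=1):
            match = TODO_COMMENT.search(line)
            if match:
                found.append((path, number, match.group(1).strip()))
    return found
```

Each tuple carries enough context (file, line, text) to become a new action item and to remove the comment afterwards.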
6. Cleanup
After all is done and I have no more testing or code feedback, I run gtd
one last time. This creates a 🧹-cleanup commit that just removes TODO.md.
Now I would open a pull request and let CI do its thing.
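Put together, the single gtd command behaves like a small state machine: what it does depends on the state of the working tree. The Python sketch below is my reading of the six steps above, not the tool's real logic. `Step` and `next_step` are hypothetical names, and the `NOTHING` case for a missing TODO.md is an assumption.

```python
from enum import Enum


class Step(Enum):
    PLAN = "commit the edits, then (re)run the planning agent"
    BUILD = "run the build/test loop for unchecked items"
    CLEANUP = "remove TODO.md in a cleanup commit"
    NOTHING = "no TODO.md, nothing to do"  # assumption, not described above


def next_step(
    todo_exists: bool, has_uncommitted_changes: bool, unchecked_items: int
) -> Step:
    """Decide what a run should do, based on the state of the working tree."""
    if not todo_exists:
        return Step.NOTHING
    if has_uncommitted_changes:
        # New requirements or feedback: commit them and plan again.
        return Step.PLAN
    if unchecked_items > 0:
        # The plan is unchanged but has open items: build them.
        return Step.BUILD
    # Nothing left to plan or build: clean up.
    return Step.CLEANUP
```

That is why the same command can be run over and over: the files and the git history carry all the state.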
You might ask what happens with all those colorful emoji commits? I squash them, because I don't want to be pulled into discussions with my coworkers. But they actually give me a nice history in the PR on how a feature evolved.
Under the hood
In the beginning this started out as an agent-agnostic tool that worked with both Claude Code and Opencode (for the sake of pitting those two, with different models, against each other). But at some point I switched to Pi, a minimalistic coding agent of questionable fame as the agent inside OpenClaw. gtd respects any Pi configuration that you already have on your system, like skills, plugins, and models. Thanks to Pi, it is compatible with pretty much all subscriptions out there. Even Claude Max, although the legal situation is a little unclear. Pi uses Anthropic's official agent-sdk, which automatically picks up my subscription, but they stated that anything other than Claude Code is against their terms of service. If they block me, OpenAI has a new customer 🤷‍♂️.
Vibed all the way down
Another interesting aspect of this project is that I vibed all the way down. Barely ever looked at any code, just had the agent churning in the background. And my main takeaway is that there is no way I would ever do this with anything even remotely connected to my salary.
At some point in the middle I even hit a wall where the agent couldn't make any more changes without completely breaking half of the tool. It also deleted 45 minutes of manual code review without committing it 🤦‍♂️ Very frustrating. But I managed to get out of that corner, and help came from a very unlikely place. I added cucumber.js and tasked Claude with creating scenarios for all use cases and edge cases. Not a single Gherkin scenario was written by me, and the agent did not even run the scenarios itself. The mere existence of clear user stories made the agent suddenly perform way better. Not all heroes wear capes.
What's next?
I have been using gtd for the last couple of weeks for all my production work, and
it has been a blast. Feel free to install it with npm install -g githingsdone,
try it, and leave feedback in the GitHub repository. But to be honest, even if
I remain the only user, it's worth it. We really live in the age of personal
software. After a little more testing, I plan to integrate a couple more
features:
I did try to add some of those, but Claude was not able to handle this on its own anymore. Too much complexity, and even with scenarios it always broke a lot. Looks like I will eventually have to roll up my sleeves and actually look at that code first.