Draft
Notes on Coding Agents
I've been using coding agents (like Claude Code/Cursor) very extensively for the
past few months and wanted to share some notes.
(Unexpected) Great Use Cases
Open source workflows
- Let the agent create minimal reproductions of a (suspected) bug in an open
source project + open an issue with detailed instructions and context
- Let the agent help maintain forks (e.g. by rebasing the Git history)
- Let the agent verify bug reports and turn bug reports into test cases + fixes
Product management
- Let the agent maintain/create GitHub issues
Development workflows
- Investigate bugs / issues and create a detailed report
Other thoughts
- Testing is everything
- Regression testing beyond correctness
- Architecting/structuring tests will be as important as the code itself.
Tests need to be maintainable over time and should be as orthogonal as
possible.
- Do things that compound
- I'm pretty sure containers will play a bigger role again, to
isolate development environments for agents
- We'll probably need to figure out a better version control story to
collaborate with agents (Dagger could be an interesting
building block)
- We might also benefit from better code review/diffing tools
- There should be an improved version of Markdown
- Basically what TypeScript was to JavaScript
- Ideas:
- Symbolic referencing (jump to definition, refactoring, etc.)
- Principled engineering is more important than ever to make sure the agent
doesn't do something stupid
- The boundaries between local dev, CI and prod will blur
- Some aspects are still very scary (e.g. security/correctness/performance
implications)
- If something is hard to do, instead of throwing spaghetti at the wall, tell
the agent to write a script to iterate on it until it works
- API design: A useful pattern is to ask the AI how it wishes the API had looked. This often results in a more intuitive/elegant API.
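The "write a script to iterate on it until it works" point above can be sketched roughly as follows. This is a minimal illustration, not a fixed recipe: `iterate_until_passing`, `check_cmd`, and `fix_step` are hypothetical names, and the "fix" step stands in for whatever the agent does between attempts.

```python
import subprocess


def iterate_until_passing(check_cmd, fix_step, max_attempts=5):
    """Run check_cmd repeatedly, calling fix_step(output) after each failure.

    Returns the 1-based attempt number that passed, or None if none did.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(check_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        # Hand the failure output to whatever makes the next change
        # (an agent call, a parameter tweak, a generated patch, ...).
        fix_step(result.stdout + result.stderr)
    return None
```

The attempt cap matters: it keeps the loop from running forever when the agent is going in circles, and the captured output gives each attempt something concrete to react to.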
UI work
- Coding agents so far "can't see"
- Are only trained on Tailwind web stuff, not on historical application UIs
(e.g. iTunes, Windows 95, etc.)
Perspective: What can we afford to do now that wouldn't have been viable before agents?
- Examples
- Other ideas
- Collecting data about own workflows
- e.g. what leads mostly to merge conflicts?
- what pollutes the AI context unnecessarily?
Tools I tried
- Claude Code
- Cursor
- OpenAI Codex
- Limitations:
- Can't resume previous conversations
- Easy to accidentally press Ctrl+C, which cancels the conversation, and then
you can't resume it
- CLI very buggy
- Can't see how much usage is still left for the current usage allowance
- Observations:
- Very strong results so far. Even better than Opus with Claude Code.
- Conductor
- Only works on local machine
- Catnip
- Benefits:
- Portable (even works from a web browser, including mobile web)
- Isolated in containers
- Can even set resource usage per container
- Limitations:
- Docker setup: SSH tunnel / Nix / authed tools (gh, ...)
- Home CLAUDE.md file
Building machines
- Workflow idea: Drawing machine architecture diagrams (e.g. in TLDraw or ASCII
art) to express intent of workflow
Building custom agent workflow systems
- Use cases
- Strategic debugging and root cause analysis (tree of possible root causes)
- Hierarchical agent architecture
- Poor man's version:
- Use cursor-terminal "cd path/to/project && claude 'solve problem XYZ'" to
create n new terminals with a new agent each
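A rough sketch of the "poor man's version" above: build one command per sub-task so each gets its own terminal and agent. The only thing taken from the notes is the `cursor-terminal "<shell command>"` shape; `agent_commands` is a hypothetical helper, and I'm assuming `cursor-terminal` accepts the whole shell command as a single argument.

```python
import shlex


def agent_commands(project_path, tasks):
    """Build one cursor-terminal invocation per task, as argument lists."""
    commands = []
    for task in tasks:
        # Quote path and prompt so spaces or quotes in them survive the shell.
        inner = f"cd {shlex.quote(project_path)} && claude {shlex.quote(task)}"
        commands.append(["cursor-terminal", inner])
    return commands
```

Each returned list could then be handed to `subprocess.Popen` to open the terminals. Quoting the prompt is the main point here, since task descriptions tend to contain apostrophes and other shell-unfriendly characters.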
Open questions / challenges
- How to let the agent use/work with long-running processes?
- I've been looking into process-compose but have been running into
some issues
- Related aspects:
- Long-running processes
- Port clashes -> force explicit ports and coordinate on a per-machine level
(e.g. in ~/.config/ports)
- Restarting
- Dependencies
- Logs -> canonical log files
- Maybe also use docker compose to run the agent in a container?
- How to make sure the agent strictly (!) follows the rules (e.g. CLAUDE.md)?
- How to let the agent "see" so it can actually confidently do pixel-perfect
UI work?
- How to let the agent use OpenTelemetry (OTel) traces?
- Agents tend to make things more complicated than they need to be
- They create a lot of stuff
- Agents hitting walls and going in circles
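For the port-clash point under long-running processes above, one possible shape for per-machine coordination is a small shared registry that hands each named service a stable port. This is only a sketch under assumptions: the notes mention ~/.config/ports but not a format, so the JSON file, the `claim_port` name, and the 3000-3999 default range are all made up for illustration.

```python
import json
from pathlib import Path


def claim_port(name, registry_path, start=3000, end=4000):
    """Return a stable port for `name`, recording it in a shared JSON registry."""
    path = Path(registry_path).expanduser()
    ports = json.loads(path.read_text()) if path.exists() else {}
    if name in ports:
        # Already assigned: always hand back the same port.
        return ports[name]
    taken = set(ports.values())
    for port in range(start, end):
        if port not in taken:
            ports[name] = port
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(json.dumps(ports, indent=2, sort_keys=True))
            return port
    raise RuntimeError(f"no free port in range {start}-{end - 1}")
```

Something like `claim_port("web", "~/.config/ports")` would then give every agent on the machine the same answer for the same service. The sketch does no file locking and does not check whether the port is actually free on the host, so concurrent agents could still race; a real version would need both.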
Good resources