Johannes Schickling
Draft

Notes on Coding Agents

I've been using coding agents (such as Claude Code and Cursor) extensively for the past few months and want to share some notes.

(Unexpected) Great Use Cases

Open source workflows

  • Let the agent create minimal reproductions of a (suspected) bug in an open source project + open an issue with detailed instructions and context
  • Let the agent help maintain forks (e.g. by rebasing the Git history)
  • Let the agent verify bug reports and turn bug reports into test cases + fixes

Product management

  • Let the agent maintain/create GitHub issues

Development workflows

  • Investigate bugs / issues and create a detailed report

Other thoughts

  • Testing is everything
    • Regression testing beyond correctness
      • Performance
  • I'm pretty sure containers will play a bigger role again, this time to isolate development environments for agents
  • We'll probably need to figure out a better version control story to collaborate with agents (Dagger could be an interesting building block)
  • We might also benefit from better code review/diffing tools
  • There should be an improved version of Markdown
    • Basically what TypeScript was to JavaScript
    • Ideas:
      • Symbolic referencing (jump to definition, refactoring, etc.)
  • Principled engineering is more important than ever to make sure the agent doesn't do something stupid
  • The boundaries between local dev, CI and prod will blur
  • Some aspects are still very scary (e.g. security/correctness/performance implications)
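
To make the "regression testing beyond correctness" point concrete, here is a minimal sketch of a performance regression check. The helper names, the 20% tolerance and the idea of comparing against a stored baseline are all assumptions for illustration, not a specific tool's API.

```python
import statistics
import time


def measure_median_seconds(fn, runs=5):
    """Return the median wall-clock time (in seconds) of calling fn() `runs` times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def check_performance(measured_s, baseline_s, tolerance=0.2):
    """Return True if the measured time is at most `tolerance` slower than the baseline.

    `tolerance=0.2` (hypothetical default) allows a 20% slowdown before failing.
    """
    return measured_s <= baseline_s * (1 + tolerance)
```

In a test suite, the agent would run something like `check_performance(measure_median_seconds(work), baseline_s)` and fail the test when it returns False. Where the baseline comes from (a committed file, a previous CI run) is left open here.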

UI work

  • Coding agents so far "can't see"
  • They seem to be trained mostly on Tailwind-style web UIs, not on historical application UIs (e.g. iTunes, Windows 95)

Open questions / challenges

  • How to let the agent use/work with long-running processes?
    • I've been looking into process-compose but have been running into some issues
    • Related aspects:
      • Long-running processes
      • Port clashes -> force explicit ports and coordinate on per-machine level (e.g. in ~/.config/ports)
      • Restarting
      • Dependencies
      • Logs -> canonical log files
    • Maybe also use docker compose to run the agent in a container?
  • How to make sure the agent strictly (!) follows the rules (e.g. CLAUDE.md)?
  • How to let the agent "see" so it can actually confidently do pixel-perfect UI work?
  • How to let the agent use OpenTelemetry (OTel) traces?
  • Agents tend to make things more complicated than they need to be
  • They create a lot of stuff
    • Needs to be cleaned up
  • Agents hitting walls and going in circles

Good resources