I filed bug reports against GitHub Agentic Workflows (gh-aw), a repo with thousands of issues and over two hundred automated agentic workflows. I never opened a pull request. Most of the bugs were fixed and merged into main, all implemented by agents, all reviewed and approved by a human maintainer.
The bottleneck in open source is shifting from writing code to diagnosing failures. This is what that shift looks like in practice.
How it started
At the first GitHub Dev Club meetup in Canada, Ian Reay presented a 4-level AI maturity model for the software development lifecycle, from Level 0 (fully manual) to Level 3 (AI-delegated, humans review exceptions). That night, I built prd-to-prod: a pipeline on top of gh-aw that takes product requirements, decomposes them into issues, implements each as a PR, reviews its own code, merges, and loops until everything ships. No human writes a line of application code.
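The shape of that loop can be sketched roughly as follows. This is a minimal illustration, not the actual prd-to-prod or gh-aw API; every function name here is a hypothetical placeholder.

```python
# Hedged sketch of the prd-to-prod loop described above. All names
# are illustrative placeholders, not the real pipeline's API.

def decompose(prd: str) -> list[str]:
    # In the real pipeline an agent decomposes the PRD into issues;
    # here we fake it by splitting on lines for illustration.
    return [line.strip() for line in prd.splitlines() if line.strip()]

def review(pr: str) -> bool:
    # Placeholder for the agent's self-review of its own PR.
    return True

def run_pipeline(prd: str) -> list[str]:
    shipped = []
    backlog = decompose(prd)
    while backlog:                       # loop until everything ships
        issue = backlog.pop(0)
        pr = f"PR for: {issue}"          # agent implements the issue as a PR
        if review(pr):
            shipped.append(pr)           # merge
        else:
            backlog.append(issue)        # rework and retry
    return shipped
```

The key property is the loop condition: the pipeline keeps cycling through decompose, implement, review, merge until the backlog is empty, with no human writing application code at any step.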
Over several days and thousands of workflow runs, the pipeline surfaced dozens of distinct failure modes in gh-aw itself. Most were my fault, bugs in my own orchestration layer. But 19 turned out to be genuine platform issues.
How the bugs were found
The pipeline is built with an AI-native development flow. Claude Code and Codex do the execution. I’m the orchestrator and architect. When something broke, I would command Claude Code or Codex to investigate. Often they’d trace the fault back to my orchestration. But sometimes the root cause was in gh-aw itself, and once I realized that, I pushed the library harder and harder to find more.
The bug-filing process was itself agentic. One agent would determine the fault was in gh-aw and draft the issue. A second agent would fact-check the diagnosis, verifying the file paths and line numbers against the actual source at a specific git SHA. Then I'd file it. One bug per issue, every time, per Peli's advice.
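The second agent's job reduces to a mechanical check: do the cited file and line actually exist in the tree at the pinned SHA? A minimal sketch of that gate, with hypothetical types standing in for the agents' output:

```python
# Hedged sketch of the two-stage filing check: one agent drafts,
# a second verifies the citations before the human files the issue.
# The Draft type and function names are illustrative, not a real schema.
from dataclasses import dataclass

@dataclass
class Draft:
    title: str
    file_path: str   # path the diagnosis cites
    line_no: int     # line number the diagnosis cites
    sha: str         # git SHA the citations are pinned to

def fact_check(draft: Draft, source_tree: dict[str, list[str]]) -> bool:
    """Verify the cited file exists and the line number is in range
    in the source tree checked out at the pinned SHA."""
    lines = source_tree.get(draft.file_path)
    return lines is not None and 1 <= draft.line_no <= len(lines)

def file_issues(drafts: list[Draft],
                source_tree: dict[str, list[str]]) -> list[Draft]:
    # One bug per issue: only drafts that survive the fact-check get filed.
    return [d for d in drafts if fact_check(d, source_tree)]
```

Pinning to a SHA matters because the fix agent on the other side will read the same revision; a citation that drifts off `main` is worse than no citation.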
You can see what one of these looks like: #19017 — Permanently deferred safe-output items do not fail the workflow. Observed failure, root cause traced to specific files and line numbers, git SHA, production evidence from a 4-agent concurrent pipeline, proposed fix scoped to two approaches.
The model
What I stumbled into is a contribution model the gh-aw project was designed to enable. The repository runs on its own platform: specialized agentic workflows handle everything from issue triage to CI diagnosis to code implementation. When a well-formed issue arrives, an agent picks it up, implements the fix, and opens a PR. The maintainer reviews, directs, and approves. Then it ships.
Deep research on your side produces good issues, and good issues get fixed fast.
This inverts the traditional OSS model. In classic open source, the contributor who writes the best patch gets their bug fixed. Here, the contributor who writes the best issue does.
What separates the fixes that shipped from the ones that didn’t
The pattern across all 19 filings is clear: the more decision-free the fix, the faster the agent resolves it.
Issue #18980, a push failure misattributed as a patch failure, was filed at 4:20 AM UTC on March 1. The agent picked it up, opened two PRs, and the maintainer merged both. The fix shipped 71 minutes later. I wasn’t watching. I came back to find it closed.
Two other issues, #19606 and #19607, were filed in the same minute and resolved independently, in parallel, in 75 and 83 minutes. Each of the three had a single root cause, a clear code path, and enough context that the fix required no design decisions.
Bugs that required architectural judgment still wait for a human, even when the diagnostic quality was identical. A bug in how the platform handles diverged branches requires a design decision about when to rebase. An enhancement to auto-merge gating needs a filtering mechanism designed, not just a line patched. The agent can’t make those calls. The maintainer can, but it takes time.
The human role in this loop isn’t rubber-stamping. In several fixes, the maintainer actively shaped the outcome during review. On one concurrency bug, I proposed a narrow conditional fallback. The agent implemented it. The maintainer redirected: make this a universal fallback, not a special case. The agent then reworked the fix across every event-type expression in the file. The final implementation was broader and more robust than what I proposed, shaped by the maintainer’s architectural judgment in conversation with the agent.
On another PR, the Copilot AI reviewer caught a compile error the initial implementation introduced, a scoping issue with a variable declaration. The agent diagnosed and fixed it in one round. The review conversation between human, AI reviewer, and implementing agent produced a better fix than any one alone.
What this means
If you’re a contributor, diagnosis is the new bottleneck skill. Writing a well-formed issue is what gets a bug fixed, not writing code. Diagnosis means running a platform in production, noticing when something breaks, tracing the failure to root cause, and describing it with enough precision that the fix is obvious: file paths, line numbers, git SHAs, reproduction evidence, a proposed minimal fix.
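That checklist is mechanical enough to lint. A toy sketch of such a check, with patterns chosen for illustration rather than taken from any real gh-aw schema:

```python
# Hypothetical "agent-ready" linter for a bug report. The required
# fields mirror the checklist above; the regexes are illustrative,
# not a real gh-aw or GitHub issue-template convention.
import re

REQUIRED = {
    "file_path":    re.compile(r"\S+\.(go|py|ts)\b"),   # a concrete source path
    "git_sha":      re.compile(r"\b[0-9a-f]{7,40}\b"),  # a pinned revision
    "line_number":  re.compile(r"\bL?\d+\b"),           # an exact line
    "proposed_fix": re.compile(r"(?i)proposed fix"),    # a minimal-fix section
}

def is_agent_ready(issue_body: str) -> list[str]:
    """Return the checklist items the issue body is still missing."""
    return [name for name, pattern in REQUIRED.items()
            if not pattern.search(issue_body)]
```

An empty result means the report is, in the terms used earlier, closer to decision-free: everything an implementing agent needs is already on the page.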
If you’re a maintainer, high-quality issues become directly actionable inventory for agents. The bottleneck shifts from “who has time to review this PR” to “is this issue clear enough to hand to an agent.” Maintainers who build agent infrastructure can convert good issues into merged fixes at a rate impossible with human implementers alone.
For open source more generally, this lowers the implementation barrier. The traditional model required contributors to understand the codebase well enough to write a correct patch, navigate the build system, write tests, and survive code review. This model requires understanding the codebase well enough to diagnose the problem precisely. That’s a high bar, but a different one, and it lets more people in.
This is the new shape of open source: the user does the agentic reasoning to build the issue, and the maintainer assigns it to an agent. Open source is about to bloom.
The constraint on open-source velocity is moving from coding capacity to diagnosis quality. The projects that benefit most will be the ones that invest in agent infrastructure and attract contributors who invest in diagnostic depth. That combination is what makes a 71-minute turnaround from bug report to merged fix possible.