AI-Driven E2E Testing
A five-phase workflow that uses AI to drive end-to-end testing across multi-framework monorepos, turning testing from a manual bottleneck into an agent-assisted pipeline.
What it is
Viking (@vikingmute) developed a reproducible workflow for AI-assisted E2E testing in TinyShip, a monorepo containing Next.js, Nuxt.js, and TanStack Start with both PostgreSQL and SQLite support. Three frameworks times two databases gives six framework-database combinations to check for each feature change, which makes thorough manual testing impractical.
The workflow has five phases: Spec → Code → Verify → Test → Green.
Why it matters
Multi-framework monorepos multiply test complexity combinatorially. A single feature change can affect multiple frameworks, databases, and deployment targets. Traditional manual QA cannot cover the explosion of combinations. AI-driven E2E testing treats the test matrix as a structured problem that agents can systematically explore.
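
As a rough illustration of that matrix, the Python sketch below enumerates the framework-database combinations described for TinyShip. The identifier strings and the `build_matrix` helper are hypothetical names chosen for this example, not TinyShip's actual configuration.

```python
from itertools import product

# Frameworks and databases named in the text. The identifier strings are
# illustrative assumptions, not TinyShip's real configuration keys.
FRAMEWORKS = ["nextjs", "nuxt", "tanstack-start"]
DATABASES = ["postgresql", "sqlite"]


def build_matrix(frameworks, databases):
    """Return every (framework, database) pair as a list of tuples."""
    return list(product(frameworks, databases))


if __name__ == "__main__":
    matrix = build_matrix(FRAMEWORKS, DATABASES)
    for framework, database in matrix:
        print(f"{framework} + {database}")
    print(f"{len(matrix)} combinations")
```

Adding another axis, such as deployment targets, would multiply the count again, which is why the matrix is better generated than maintained by hand.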
Key points
- Spec phase: Define the feature behavior in natural language or structured format before writing code.
- Code phase: Implement the feature across all affected frameworks.
- Verify phase: Use agent-assisted verification to confirm the implementation matches the spec.
- Test phase: Run E2E tests across all framework-database combinations.
- Green phase: All combinations pass before merge.
- Combinatorial explosion is the core challenge: TinyShip's six combinations per feature change represent a class of problems that grow faster than manual QA capacity.
- Template-driven new feature development: The workflow provides a concrete template for E2E-driven feature development with AI assistance, not just test execution.
- Agent-browser as QA replacement: For multi-framework repos, agent-browser tools can replace manual QA by systematically navigating applications and verifying behavior.
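
The phase order and the Green gate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's tooling: `Phase`, `next_phase`, `run_test_phase`, and `is_green` are hypothetical names, and `run_e2e` stands in for whatever actually runs one combination's E2E suite and reports pass or fail.

```python
from enum import Enum
from typing import Callable, Dict, Iterable, Tuple

Combo = Tuple[str, str]  # (framework, database)


class Phase(Enum):
    SPEC = 1
    CODE = 2
    VERIFY = 3
    TEST = 4
    GREEN = 5


def next_phase(current: Phase) -> Phase:
    """Advance one step through Spec -> Code -> Verify -> Test -> Green."""
    if current is Phase.GREEN:
        raise ValueError("Green is the final phase")
    return Phase(current.value + 1)


def run_test_phase(
    combos: Iterable[Combo],
    run_e2e: Callable[[Combo], bool],  # hypothetical runner: True means pass
) -> Dict[Combo, bool]:
    """Run the E2E suite once per combination and record each result."""
    return {combo: run_e2e(combo) for combo in combos}


def is_green(results: Dict[Combo, bool]) -> bool:
    """Merge gate: at least one combination ran and every one passed."""
    return bool(results) and all(results.values())
```

With a stub runner such as `lambda combo: True`, `is_green(run_test_phase(combos, runner))` returns `True`; a single failing combination flips it to `False`, which matches the rule that all combinations pass before merge.
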
Open questions
- Does this workflow generalize beyond monorepos to microservices or distributed systems?
- How does the agent handle flaky tests or non-deterministic UI behavior?
- What is the cost trade-off between agent-driven E2E testing and traditional automated test suites?
Prompts for witness
- Where in your current projects is manual testing the bottleneck? Would a Spec→Code→Verify→Test→Green workflow help?
- If you had to test six combinations for every feature change, what would you automate first?
Related
- Agent-Driven QA — autonomous QA agent pattern
- Self-Verification Loops — verification as harness component
- Claude Code Overview — agent-assisted development workflows