Part 1: I'm Building an AI Agent to Write My Unit Tests
Hey AI community! 👋
Like many of you, I've spent countless hours writing unit tests. It's one of the most critical parts of building reliable software, but it can also be a real grind. As I've been diving deeper into the world of AI Agents, I thought: what if I could automate this?
So, I started a tiny project to build my own AI agent to handle it. This is my journey of learning in public, and I wanted to share the first version with you all.
What I've Built So Far: The "Dev Engineer" Agent
The first phase is a simple but functional "Dev Engineer" agent. The concept is straightforward:
- You give it a Python source file.
- It gives you back a `test_<filename>.py` file with unit tests, ready to run with `pytest`.
Under the hood, it's a Python script that uses LangChain to orchestrate the logic and an OpenAI LLM to generate the tests. It's a simple but powerful starting point.
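To make that concrete, here's a minimal sketch of the Phase 1 flow. The prompt, the gpt-4o model choice, and the generate_tests helper are all illustrative, not the repo's actual code:

```python
# Minimal sketch of the Phase 1 flow (illustrative, not the repo's code):
# read a module, ask the LLM for pytest tests, write test_<name>.py next to it.
from pathlib import Path

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a senior Python developer. Write pytest unit tests for the "
     "module you are given. Reply with valid Python code only."),
    ("human", "Module {module_name}:\n\n{source}"),
])

def generate_tests(source_path: str) -> Path:
    src = Path(source_path)
    llm = ChatOpenAI(model="gpt-4o", temperature=0)  # needs OPENAI_API_KEY set
    reply = (prompt | llm).invoke(
        {"module_name": src.stem, "source": src.read_text()}
    )
    out = src.with_name(f"test_{src.name}")  # calculator.py -> test_calculator.py
    out.write_text(reply.content)
    return out

if __name__ == "__main__":
    print(f"Wrote {generate_tests('calculator.py')}")
```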
The Big Picture: An Autonomous Testing Team
This is just the beginning. The ultimate goal isn't just to generate tests, but to create a collaborative team of AI agents that can ensure code quality autonomously. The vision is to build a "QA Engineer" agent that will work alongside the "Dev Engineer" in a feedback loop:
- The Dev Agent writes the tests.
- The QA Agent runs them, checks for failures, and analyzes code coverage.
- If anything is wrong, the QA Agent sends feedback to the Dev Agent.
- The Dev Agent corrects the tests and sends them back.
- ...and so on, until we have a robust, passing test suite (sketched in code right below).
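In plain Python, the loop I'm picturing looks roughly like this. DevAgent and QAAgent here are hypothetical placeholders for the eventual agents, not classes from the repo:

```python
# Hypothetical shape of the feedback loop; the agent interfaces are
# placeholders, not code from the repo.
def review_loop(dev_agent, qa_agent, source_file, target_coverage=0.8, max_rounds=5):
    tests = dev_agent.write_tests(source_file)
    for _ in range(max_rounds):
        report = qa_agent.run_tests(tests)  # runs pytest, measures coverage
        if report.passed and report.coverage >= target_coverage:
            return tests  # robust, passing suite -> we're done
        # failures and coverage gaps flow back to the Dev Agent as feedback
        tests = dev_agent.fix_tests(tests, feedback=report.feedback)
    raise RuntimeError(f"No passing suite after {max_rounds} rounds")
```

The max_rounds cap matters: without it, two LLM agents can happily ping-pong forever.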
Let's Build This Together! (Call for Collaboration)
This project is my personal learning playground, but I believe it has the potential to become a genuinely useful tool for the community. That's where you come in.
I'm building this completely open source, and I would love for you to get involved. Whether you're an AI expert or just a curious developer, there are plenty of ways to contribute:
- Check out the code and give feedback.
- Suggest new features or improvements.
- Tackle an open issue or a task from the roadmap below.
I've laid out a clear plan for where the project is headed. Take a look and see if anything sparks your interest!
🚀 Project Roadmap
This project is under active development. Below is a summary of my progress and a look at what's ahead. Contributions are highly encouraged!
✅ Phase 1: Core Test Generation Engine (MVP)
- [x] Develop "Dev Engineer" Agent: A core agent capable of generating unit tests from a single Python source file.
- [x] LLM Integration: Connect the agent to a foundational LLM (e.g., GPT-4o, Llama 3) to power code generation.
- [x] Basic CLI: A simple command-line interface to input a file and receive the generated test file.
🎯 Phase 2: Multi-Agent Collaboration & Feedback Loop
- [ ] Introduce "QA Engineer" Agent: Develop a second agent responsible for reviewing, validating, and executing the generated tests.
- [ ] Implement Test Execution Tool: Create a secure tool for the QA Agent to programmatically run pytest, capture results, and parse code coverage reports (a rough sketch follows this list).
- [ ] Establish Collaborative Framework (CrewAI): Refactor the agent logic into a Crew to manage the feedback loop, allowing the Dev Agent to fix tests based on the QA Agent's feedback until a target coverage is achieved.
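For the test execution tool, my current thinking is to shell out to pytest and read back machine-readable results. This sketch assumes pytest plus the pytest-cov plugin (whose --cov-report=json flag writes a coverage.json file); the function name and return shape are my own:

```python
# Sketch of the QA Agent's execution tool: run pytest with coverage in a
# subprocess and return structured results. Assumes pytest + pytest-cov.
import json
import subprocess
from pathlib import Path

def run_pytest_with_coverage(test_file: str) -> dict:
    result = subprocess.run(
        ["pytest", test_file, "--cov", "--cov-report=json"],
        capture_output=True, text=True, timeout=300,  # don't hang the crew
    )
    totals = json.loads(Path("coverage.json").read_text())["totals"]
    return {
        "passed": result.returncode == 0,
        "coverage": totals["percent_covered"] / 100,
        "feedback": result.stdout,  # failure tracebacks for the Dev Agent
    }
```

And since the tests being executed are LLM-generated code, the "secure" part of this item likely means running it inside a sandbox (e.g., a throwaway Docker container) rather than directly on the host.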
🏗️ Phase 3: API-First Architecture & State Management
- [ ] Expose via API: Wrap the agent crew in a FastAPI application to make it accessible as a service (a rough sketch follows this list).
- [ ] Job State Management: Integrate Redis or a database to manage the state of long-running jobs, allowing for asynchronous operation.
- [ ] Containerization: Create a Dockerfile and docker-compose.yml to ensure a consistent and reproducible environment for the entire application stack.
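To give a feel for Phase 3, here's a hypothetical API surface: submit code, get a job ID, poll for the result. The in-memory dict is a stand-in for Redis, and the endpoints are a sketch, not a final design:

```python
# Hypothetical Phase 3 API: submit source code, then poll the job status.
# The in-memory dict stands in for Redis/a database.
import uuid

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()
jobs: dict[str, dict] = {}

def run_crew(job_id: str, source_code: str) -> None:
    # ... invoke the Dev/QA agent crew here ...
    jobs[job_id]["status"] = "done"

@app.post("/jobs")
def create_job(source_code: str, background: BackgroundTasks) -> dict:
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "running"}
    background.add_task(run_crew, job_id, source_code)  # runs after response
    return {"job_id": job_id}

@app.get("/jobs/{job_id}")
def get_job(job_id: str) -> dict:
    return jobs.get(job_id, {"status": "unknown"})
```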
✨ Future Vision
- [ ] LLMOps & Observability: Integrate with tools like LangSmith to trace, debug, and evaluate the performance of the agent interactions.
- [ ] IDE Integration: Develop a VSCode extension for a seamless developer experience right within the editor.
- [ ] Multi-Language Support: Expand capabilities beyond Python to include other languages like JavaScript/TypeScript and Go.
- [ ] Automated Code Refactoring: Empower the Dev Agent to suggest fixes in the source code itself, not just the tests.
You can find the repository with all the code for Phase 1 here:
👉 https://github.com/herchila/unittest-ai-agent
What do you think? What other developer chores do you wish you could automate with AI? Let me know in the comments below!
See you!