Alejandro Crosa

Building My AI Code Review Clone: Automating Architectural Opinions with Claude

TL;DR: I built a GitHub Action that reviews PRs on my behalf, enforcing my architectural opinions automatically. When someone requests my review, my AI clone checks the code against my documented rules and leaves feedback. Here's how I built it and how you can too.

The Problem

As a tech lead, I found myself leaving the same code review comments over and over:

  • "Don't build JSON manually, use a serializer"
  • "This needs a transaction wrapper"
  • "Use the design system helpers, not custom Tailwind"
  • "Move this logic out of the controller"
  • etc, etc.

These are my opinions—patterns I've learned matter for maintainability. But repeating them is tedious, and I'm not always available when PRs need review.

The solution: Capture my architectural opinions in code and let an AI enforce them.

The Key Insight: Human in the Loop

Here's what makes this different from "let AI review all code":

The human in the loop is me.

The bot doesn't replace my judgment—it encodes it. Every rule in the system came from a review I did manually. The AI is just saving me from repeating myself.

And here's the powerful part: for code that matches patterns I've already approved dozens of times (like clean admin view changes using our design system), the bot can auto-approve. I've already reviewed that pattern. I trust it. Why make the author wait for me to click a button?

This isn't about removing humans from code review. It's about removing repetitive humans from code review—so the actual human (me) can focus on the novel, interesting, judgment-heavy parts.

How It Works

The system has two parts:

  1. Historical Context: Download my past PR comments to understand my review patterns
  2. Automated Review: A GitHub Action that triggers when my review is requested

When someone adds me as a reviewer, my AI clone:

  1. Reads the PR diff
  2. Checks changes against my documented rules
  3. Leaves a review comment (or approves, for simple admin changes)

Part 1: Mining Your Historical Comments

Before building the automation, I wanted to analyze my historical review patterns. Here's a Python script to export all your PR comments from a repository:

Prerequisites

# Create a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install requests

# Set your GitHub token (needs repo access)
export GITHUB_TOKEN="ghp_your_token_here"

The Export Script

import os
import sys
import time
import requests
from datetime import datetime, timezone

API_ROOT = "https://api.github.com"
REPO = "your-org/your-repo"
USERNAME = "your-github-username"

def get_session():
    token = os.environ.get("GITHUB_TOKEN")
    if not token:
        print("Error: GITHUB_TOKEN environment variable is not set.", file=sys.stderr)
        sys.exit(1)

    s = requests.Session()
    s.headers.update({
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2022-11-28",
        "User-Agent": "github-comment-exporter"
    })
    return s


def handle_rate_limit(r):
    if r.status_code != 403:
        return
    remaining = r.headers.get("X-RateLimit-Remaining")
    reset = r.headers.get("X-RateLimit-Reset")
    if remaining == "0" and reset is not None:
        reset_ts = int(reset)
        sleep_for = max(0, reset_ts - int(time.time()) + 5)
        print(f"Rate limit hit. Sleeping for {sleep_for} seconds...", file=sys.stderr)
        time.sleep(sleep_for)


def paginate(session, url, params=None):
    while True:
        r = session.get(url, params=params)
        if r.status_code == 403:
            handle_rate_limit(r)
            r = session.get(url, params=params)
        r.raise_for_status()
        yield r
        links = r.links
        if "next" in links:
            url = links["next"]["url"]
            params = None
        else:
            break


def get_all_prs(session):
    """Fetch all PRs from the repository."""
    url = f"{API_ROOT}/repos/{REPO}/pulls"
    params = {"state": "all", "per_page": 100}
    for r in paginate(session, url, params=params):
        for pr in r.json():
            yield pr


def get_review_comments(session, pr_number):
    """Fetch inline review comments (on specific lines of code)."""
    url = f"{API_ROOT}/repos/{REPO}/pulls/{pr_number}/comments"
    for r in paginate(session, url, params={"per_page": 100}):
        for c in r.json():
            yield c


def get_issue_comments(session, pr_number):
    """Fetch general PR comments (not on specific lines)."""
    url = f"{API_ROOT}/repos/{REPO}/issues/{pr_number}/comments"
    for r in paginate(session, url, params={"per_page": 100}):
        for c in r.json():
            yield c


def quote_block(text):
    lines = (text or "").replace("\r\n", "\n").split("\n")
    return "\n".join("> " + line for line in lines)


def main():
    session = get_session()
    print(f"Fetching PRs from {REPO}...", file=sys.stderr)

    output_file = "github_comments.md"
    max_comments = 500  # Adjust as needed
    comment_count = 0
    pr_count = 0

    with open(output_file, "w", encoding="utf-8") as f:
        now = datetime.now(timezone.utc).isoformat()
        f.write(f"# GitHub Comments - {REPO}\n\n")
        f.write(f"_Generated at {now}_\n\n")
        f.write(f"_User: {USERNAME}_\n\n\n\n")
        f.flush()

        entries = []

        for pr in get_all_prs(session):
            if comment_count >= max_comments:
                break

            pr_count += 1
            pr_number = pr["number"]
            pr_title = pr.get("title") or ""
            pr_html_url = pr.get("html_url") or f"https://github.com/{REPO}/pull/{pr_number}"

            print(f"  PR #{pr_number}: {pr_title[:50]}...", file=sys.stderr)

            # Review comments (inline on diff)
            for c in get_review_comments(session, pr_number):
                if comment_count >= max_comments:
                    break
                if not c.get("user") or c["user"].get("login") != USERNAME:
                    continue

                created_at = c.get("created_at") or ""
                body = c.get("body") or ""
                path = c.get("path") or ""
                diff_hunk = c.get("diff_hunk") or ""

                snippet = []
                snippet.append(f"### PR: [{pr_title}]({pr_html_url})")
                snippet.append(f"**File**: `{path}`")
                snippet.append(f"**Created**: `{created_at}`")
                snippet.append("")
                if diff_hunk:
                    snippet.append("**Diff:**")
                    snippet.append("```diff")
                    snippet.append(diff_hunk.replace("\r\n", "\n"))
                    snippet.append("```")
                    snippet.append("")
                snippet.append("**Comment:**")
                snippet.append(quote_block(body))
                snippet.append("\n\n")

                entries.append((created_at, "\n".join(snippet)))
                comment_count += 1
                print(f"    Found comment #{comment_count}", file=sys.stderr)

            # Issue comments (general PR comments)
            if comment_count < max_comments:
                for c in get_issue_comments(session, pr_number):
                    if comment_count >= max_comments:
                        break
                    if not c.get("user") or c["user"].get("login") != USERNAME:
                        continue

                    created_at = c.get("created_at") or ""
                    body = c.get("body") or ""

                    snippet = []
                    snippet.append(f"### PR: [{pr_title}]({pr_html_url})")
                    snippet.append(f"**Created**: `{created_at}`")
                    snippet.append("")
                    snippet.append("**Comment:**")
                    snippet.append(quote_block(body))
                    snippet.append("\n\n")

                    entries.append((created_at, "\n".join(snippet)))
                    comment_count += 1
                    print(f"    Found comment #{comment_count}", file=sys.stderr)

        # Sort by date and write
        entries.sort(key=lambda x: x[0] or "")
        for _, snippet in entries:
            f.write(snippet)
            f.flush()

    print(f"\nProcessed {pr_count} PRs", file=sys.stderr)
    print(f"Found {comment_count} comments by {USERNAME}", file=sys.stderr)
    print(f"Wrote to {output_file}", file=sys.stderr)


if __name__ == "__main__":
    main()

Running the Export

# Set your configuration
export GITHUB_TOKEN="your_token"

# Edit the script to set REPO and USERNAME
# Then run:
python export_github_comments.py

This generates a github_comments.md file with all your historical review comments, including the diff context. Use this to:

  • Identify patterns in your feedback
  • Extract rules for your AI reviewer
  • Build training examples

You can feed this .md file to Gemini or any reasoning LLM, ask it to extract the rules it should enforce on your behalf, and then review the extracted rules manually.
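Before handing the file to an LLM, a quick frequency pass can surface your most-repeated feedback. Here's a minimal sketch; it assumes the blockquote format the export script above produces (comment bodies prefixed with `> `):

```python
from collections import Counter

def top_comment_openers(markdown_text, n=5):
    """Tally the most common first lines of quoted comment bodies.

    The export script writes each comment body as a "> "-prefixed
    blockquote, so the first line of each quote block is a rough
    proxy for the feedback's topic.
    """
    openers = []
    in_quote = False
    for line in markdown_text.splitlines():
        if line.startswith("> "):
            if not in_quote:
                openers.append(line[2:].strip().lower())
            in_quote = True
        else:
            in_quote = False
    return Counter(openers).most_common(n)

# Tiny inline sample; in practice, pass the contents of github_comments.md.
sample = """**Comment:**
> Use a serializer here

**Comment:**
> Use a serializer here

**Comment:**
> Wrap this in a transaction
"""
print(top_comment_openers(sample))
```

Anything that shows up more than a couple of times is a strong candidate for a rule.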

Part 2: The GitHub Action

Here's the workflow that triggers my AI clone when my review is requested:

.github/workflows/arch-review.yml

name: Your Architecture Review

on:
  pull_request:
    types: [review_requested]

jobs:
  arch-review:
    # Only run when I'm the requested reviewer
    if: github.event.requested_reviewer.login == 'your-username'
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # `gh pr comment` / `gh pr review` need write access
      issues: write         # general PR comments go through the issues API
      id-token: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      - name: Run Architecture Review
        id: arch-review
        uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_OAUTH_TOKEN }}
          prompt: |
            REPO: ${{ github.repository }}
            PR NUMBER: ${{ github.event.pull_request.number }}

            ## Your Architecture Review

            You are reviewing this PR on behalf of the team lead, enforcing their
            architectural opinions and code standards.

            1. First, use `gh pr diff ${{ github.event.pull_request.number }}`
               to see which files were changed.

            2. Based on the directories with changes, apply the relevant rules:

            

            ### ADMIN VIEWS
            - Use design system helpers, not custom Tailwind CSS
            - Check permissions via admin_permissions.rb
            - Use type formatters (:currency, :number, :percentage)

            ### DATABASE
            - Wrap multi-step SQL in BEGIN/COMMIT transactions
            - Never manually delete records with dependencies
            - Prefer manual scripts over migrations for data fixes

            ### CONTROLLERS
            - Keep controllers thin—logic belongs in models/services
            - Use serializers, not manual JSON building
            - Use ENV.fetch() to fail early on missing config

            ### BACKEND
            - Follow Resource/Manager/DAO pattern
            - Exceptional branches need observability, not just logs

            

            ## Review Instructions

            1. Check each changed file against applicable rules
            2. Be specific: cite the rule and suggest the fix
            3. Be concise but thorough

            Format:
            - Brief summary (1-2 sentences)
            - Violations grouped by category with file:line
            - Positive observations
            - Sign as "Your review bot 🤖"

            ## Submitting the Review

            Default: Leave a comment
            ```
            gh pr comment ${{ github.event.pull_request.number }} --body "REVIEW"
            ```

            Admin-only changes with no violations: May approve
            ```
            gh pr review ${{ github.event.pull_request.number }} --approve --body "REVIEW"
            ```

          claude_args: '--allowed-tools "Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh pr review:*)"'

Make sure to add the rules you extracted into that file; the rules above are just placeholders.

Key Design Decisions

1. Trigger on review_requested

on:
  pull_request:
    types: [review_requested]

The workflow only runs when someone explicitly requests my review, not on every PR. It respects the normal PR workflow.

2. Conditional execution

if: github.event.requested_reviewer.login == 'your-username'

Only triggers for my username. You can create similar workflows for other team members with their own rules.

3. Scoped tool access

claude_args: '--allowed-tools "Bash(gh pr comment:*),Bash(gh pr diff:*),..."'

The AI can only run specific gh commands—it can't modify code or access arbitrary files.
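Each `Bash(gh pr comment:*)` entry is a prefix pattern: the `:*` lets any arguments follow the named command. To illustrate the semantics (this is just a model of the behavior, not the action's actual implementation):

```python
# Hypothetical model of the allowlist: a command is permitted only if
# it starts with one of the named `gh` invocations.
ALLOWED_PREFIXES = ("gh pr comment", "gh pr diff", "gh pr view", "gh pr review")

def is_allowed(command):
    return command.strip().startswith(ALLOWED_PREFIXES)

print(is_allowed("gh pr diff 42"))     # True
print(is_allowed("git push --force"))  # False
```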

4. Context-aware rules

The prompt checks which directories changed and applies relevant rules. Admin changes get admin rules; backend changes get backend rules.
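In pseudocode, that dispatch looks something like this (the path prefixes here are hypothetical; swap in your repo's layout):

```python
# Hypothetical mapping from directory prefixes to rule sections.
RULES_BY_AREA = {
    "app/views/admin/":  "ADMIN VIEWS",
    "db/":               "DATABASE",
    "app/controllers/":  "CONTROLLERS",
    "app/services/":     "BACKEND",
}

def applicable_rule_sections(changed_paths):
    """Return the rule sections that apply to the changed files."""
    sections = {
        section
        for path in changed_paths
        for prefix, section in RULES_BY_AREA.items()
        if path.startswith(prefix)
    }
    return sorted(sections)

print(applicable_rule_sections([
    "app/controllers/orders_controller.rb",
    "db/migrate/20240101_add_index.rb",
]))  # ['CONTROLLERS', 'DATABASE']
```

Keeping this mapping in the prompt (rather than code) means the AI can also use judgment for files that don't match any prefix.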

5. Graduated responses

  • Most PRs: Leave a comment (doesn't block merging)
  • Some PRs can auto-approve (speeds up simple changes)

Part 3: The Feedback Loop

The real power is the feedback loop:

Review PR manually → Notice pattern → Add rule → AI enforces it

Example Evolution

Week 1: I notice developers keep building JSON manually in controllers.

Action: Add rule:

- Use serializers, not manual JSON building with map/OpenStruct

Week 2: I see N+1 queries in a PR the AI missed.

Action: Add rule:

- Use includes/joins to avoid N+1 queries

Week 3: Someone commits a logging statement with PII.

Action: Add rule:

- Sensitive params must be in filter_parameter_logging.rb

Each time I give feedback, I ask: "Should my AI clone know this?"

If yes, I update the rules. The AI gets smarter. I repeat myself less.


Setting Up Your Own

1. Create a GitHub Personal Access Token

Go to GitHub Personal Access Tokens → Fine-grained tokens → Generate new token

Required permissions:

  • pull-requests: read & write
  • contents: read
  • issues: read

2. Add Repository Secrets

In your repo: Settings → Secrets → Actions

Add:

  • CLAUDE_OAUTH_TOKEN - Your Claude Code OAuth token
  • YOUR_GITHUB_TOKEN - The PAT from step 1 (if using gh commands)

3. Create the Workflow File

Copy the YAML above to .github/workflows/your-name-review.yml

Customize:

  • Change your-username to your GitHub username
  • Update the rules to match your opinions
  • Adjust which file patterns trigger which rules

4. Document Your Rules

Keep rules in the workflow or in separate AGENTS.md files that the AI can read. Be specific:

❌ "Write clean code"
✅ "Use ActiveModel::Serializer instead of building JSON with map"

5. Iterate

After each manual review, ask yourself:

  • Did I give feedback the AI should know?
  • Was there a pattern I keep repeating?

Update the rules. Your clone evolves with you.

Conclusion

Your architectural opinions are valuable. They shouldn't be trapped in your head, repeated manually PR after PR.

By documenting your rules and automating enforcement:

  1. You scale your influence across the codebase
  2. Junior devs get instant feedback on patterns
  3. You focus on the interesting problems

The best part? Every time you learn something new, you can teach your AI clone. It's like having an apprentice that never forgets.

Start small: Pick 5 rules you repeat often. Automate those. Add more as you go.

This post was written by a human who got tired of saying "use a serializer" for the hundredth time.

Resources