# Magpie

Multi-AI adversarial code review tool. Multiple AI models independently review your PR, debate their findings, then a code-aware verifier audits each issue against the actual codebase.

## Core Concepts

- **Code-Aware Review**: CLI-based reviewers (Claude Code, Codex, Gemini CLI) read the actual source files via tools — not just the diff text. They can grep for callers, read surrounding context, and verify their findings before reporting.
- **Multi-Dimensional Review**: Beyond correctness/security, reviewers check compatibility (rolling upgrade risks, breaking changes), feature interaction (shared state, cross-feature conflicts), and extensibility.
- **Natural Adversarial**: Different AI models tend to disagree, so debating their findings acts as built-in cross-validation.
- **Integrated Verify+Audit**: After issues are extracted, a tool-equipped verifier reads the actual code to confirm each issue, filter false positives, and re-calibrate severity — all within magpie's pipeline.
- **Fair Debate Model**: All reviewers in the same round see identical information — no unfair advantage from execution order.
- **Parallel Execution**: Same-round reviewers run concurrently for faster reviews.

## Supported AI Providers

| Provider | Type | Description |
|----------|------|-------------|
| `claude-code` | CLI | Claude Code CLI (uses your subscription, no API key) |
| `codex-cli` | CLI | OpenAI Codex CLI (uses your subscription, no API key) |
| `gemini-cli` | CLI | Gemini CLI (uses Google account login, no API key) |
| `opencode-cli` | CLI | OpenCode CLI — runs any model (typically via OpenRouter) as a code-aware agent (requires backing provider's API key) |
| `qwen-code` | CLI | Alibaba Qwen Code CLI (uses OAuth login, no API key) |
| `claude-*` | API | Anthropic API (requires ANTHROPIC_API_KEY) |
| `gpt-*` | API | OpenAI API (requires OPENAI_API_KEY) |
| `gemini-*` | API | Google Gemini API (requires GOOGLE_API_KEY) |
| `minimax` | API | MiniMax API (requires MINIMAX_API_KEY) |
| `openrouter/*` | API | OpenRouter API, OpenAI-compatible (requires OPENROUTER_API_KEY) |
| `mock` | Debug | Mock provider for testing (no API key, see [Debug Mode](#debug-mode)) |

**Recommended**: Use the CLI providers (`claude-code`, `codex-cli`, `gemini-cli`, `qwen-code`). They run on your existing subscription or account login and don't require API keys.

### Custom API Endpoints

All API providers support a custom `base_url` for connecting to compatible third-party services (Azure OpenAI, Ollama, vLLM, one-api, etc.):

```yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    base_url: https://my-ollama-server:11434/v1
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
    base_url: https://my-proxy.example.com
```
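
The `${OPENAI_API_KEY}` placeholders above are resolved from the environment when the config is loaded. A minimal sketch of that kind of expansion follows; the function name `expandEnv` and the choice to throw on an unset variable are assumptions for illustration, not Magpie's actual behavior.

```typescript
// Hypothetical sketch: expand ${VAR} placeholders in a config value.
// `env` is passed in explicitly so the function stays easy to test.
function expandEnv(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      // Assumption: fail loudly instead of sending an empty API key.
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}
```

Values without a placeholder, such as a literal `base_url`, pass through unchanged.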

### OpenRouter

OpenRouter exposes hundreds of models through a single OpenAI-compatible API. Magpie routes any model whose ID starts with `openrouter/` through OpenRouter:

```yaml
providers:
  openrouter:
    api_key: ${OPENROUTER_API_KEY}
    # base_url: https://openrouter.ai/api/v1  # optional, this is the default

reviewers:
  sonnet:
    model: openrouter/anthropic/claude-3.5-sonnet
    prompt: |
      ...
  llama:
    model: openrouter/meta-llama/llama-3-70b-instruct
    prompt: |
      ...
```

The portion after `openrouter/` is sent to OpenRouter verbatim, so use any model ID listed at https://openrouter.ai/models.
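
The prefix rules from the provider table can be sketched as a small lookup. This is an illustrative sketch only: `routeModel` and the `Route` shape are hypothetical names, and the order of the checks is an assumption chosen so that exact CLI IDs win over the `claude-*` and `gemini-*` API patterns.

```typescript
// Hypothetical sketch of prefix-based model routing; not Magpie's actual code.
type ProviderKind = "cli" | "api" | "debug";

interface Route {
  provider: string; // provider name as used in the table and config
  kind: ProviderKind;
  model: string;    // model ID handed to the provider
}

const CLI_IDS = ["claude-code", "codex-cli", "gemini-cli", "qwen-code"];

const API_PREFIXES: Array<[string, string]> = [
  ["claude-", "anthropic"],
  ["gpt-", "openai"],
  ["gemini-", "google"],
];

function routeModel(modelId: string): Route {
  if (modelId === "mock") return { provider: "mock", kind: "debug", model: modelId };
  // Exact CLI IDs are checked first so `claude-code` is not treated as `claude-*`.
  if (CLI_IDS.indexOf(modelId) !== -1) return { provider: modelId, kind: "cli", model: modelId };
  if (modelId === "minimax") return { provider: "minimax", kind: "api", model: modelId };
  const openrouterPrefix = "openrouter/";
  if (modelId.startsWith(openrouterPrefix)) {
    // Everything after the prefix is forwarded verbatim.
    return { provider: "openrouter", kind: "api", model: modelId.slice(openrouterPrefix.length) };
  }
  for (const [prefix, provider] of API_PREFIXES) {
    if (modelId.startsWith(prefix)) return { provider, kind: "api", model: modelId };
  }
  throw new Error(`No provider matches model "${modelId}"`);
}
```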

### OpenCode CLI

Requests routed through `openrouter/*` reach the model purely as a chat completion: the reviewer sees only the diff and prompt and cannot read source files. To get a code-aware agent on top of OpenRouter (or any other backing provider), use the `opencode-cli` provider, which wraps the [OpenCode](https://opencode.ai/) CLI:

```yaml
providers:
  openrouter:
    api_key: ${OPENROUTER_API_KEY}

reviewers:
  sonnet-agent:
    model: opencode-cli:openrouter/anthropic/claude-sonnet-4
    prompt: |
      ...
```

The portion after `opencode-cli:` is passed verbatim to opencode's `-m provider/model` flag. Reviewers run with a read-only tool allowlist (Read, Grep, Glob, plus `gh`/`git`/`rg`) — matching the claude-code provider's permissions. API keys from `providers.openrouter.api_key` (and `anthropic`/`openai`/`google` if configured) are forwarded into opencode's environment, so you don't need a second copy of your keys.
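
As a rough sketch of the model parsing and key forwarding described above: the helper name `buildOpencodeInvocation` is hypothetical, and the environment variable names are taken from the provider table in this README. Which variables opencode itself reads is an assumption here, so treat the mapping as illustrative.

```typescript
// Hypothetical sketch; names and env mapping are illustrative, not Magpie's API.
interface ProviderConfig { api_key?: string }
type Providers = Record<string, ProviderConfig | undefined>;

interface OpencodeInvocation {
  args: string[];              // arguments for the model flag only
  env: Record<string, string>; // API keys forwarded to the child process
}

// Config provider name to env var, following the provider table above.
const KEY_ENV_VARS: Record<string, string> = {
  openrouter: "OPENROUTER_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
};

function buildOpencodeInvocation(model: string, providers: Providers): OpencodeInvocation {
  const prefix = "opencode-cli:";
  if (!model.startsWith(prefix)) {
    throw new Error(`Expected a model starting with "${prefix}", got "${model}"`);
  }
  const env: Record<string, string> = {};
  for (const name of Object.keys(KEY_ENV_VARS)) {
    const key = providers[name]?.api_key;
    // Only configured keys are forwarded.
    if (key) env[KEY_ENV_VARS[name]] = key;
  }
  // The remainder is passed verbatim as `-m provider/model`.
  return { args: ["-m", model.slice(prefix.length)], env };
}
```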

## Installation

```bash
# Clone the repo
git clone https://github.com/liliu-z/magpie.git
cd magpie

# Install dependencies
npm install

# Build
npm run build

# Global install (optional)
npm link
```

## Quick Start

```bash
# Initialize config file (interactive)
magpie init

# Or with defaults
magpie init -y

# Navigate to the repo you want to review
cd your-repo

# Start review (PR number)
magpie review 12345

# Or with full URL
magpie review https://github.com/owner/repo/pull/12345

# Start a discussion on any topic
magpie discuss "Should we use microservices or monolith?"
```

## Configuration

Config file is located at `~/.magpie/config.yaml`:

```yaml
# AI Providers
providers:
  minimax:
    api_key: your-minimax-api-key   # or set MINIMAX_API_KEY env var
    base_url: https://custom-endpoint.example.com/v1  # optional: custom API endpoint

# Default settings
defaults:
  max_rounds: 5           # Maximum debate rounds
  output_format: markdown
  check_convergence: true  # Stop early when consensus reached
  language: en             # Output language (e.g., 'zh', 'en', 'ja')

# Reviewers - same perspective, different models
reviewers:
  claude:
    model: claude-code
    prompt: |
      You are a senior engineer reviewing this PR. Be precise and evidence-based.
      Review dimensions: Correctness, Security, Compatibility (rolling upgrade,
      breaking changes), Feature Interaction (shared state, cross-feature conflicts),
      Extensibility, Architecture, Performance & Resources.
      Use Read/Grep tools to verify findings against actual code.

  codex:
    model: codex-cli
    prompt: |
      # Same dimensions as above

# Analyzer - PR analysis (before debate)
analyzer:
  model: claude-code
  prompt: |
    Analyze this PR and provide:
    1. What this PR does
    2. Architecture/design decisions
    3. Affected interfaces/APIs (flag breaking changes)
    4. Compatibility risks (rolling upgrade, serialization changes)
    5. Feature interaction risks (callers, shared state)
    6. Suggested review focus (specific files + line ranges)

# Summarizer - final conclusion + verify+audit
summarizer:
  model: claude-code
  prompt: |
    You are a neutral technical reviewer. Based on the full reviewer discussion, provide:
    1. Points of consensus
    2. Points of disagreement
    3. Recommended action items
    4. Overall assessment

# Context Gatherer - system context before review (optional)
contextGatherer:
  enabled: true              # Enable/disable context gathering
  model: claude-code         # Optional: defaults to analyzer model
  callChain:
    maxDepth: 2              # How deep to trace call chains
    maxFilesToAnalyze: 20    # Max files to analyze for call chains
  history:
    maxDays: 30              # Look back period for related PRs
    maxPRs: 10               # Max related PRs to include
  docs:
    patterns:                # Doc files to include for context
      - docs
      - README.md
      - ARCHITECTURE.md
      - DESIGN.md
    maxSize: 50000           # Max total size of doc content
```

## CLI Options

```bash
magpie review [pr-number|url] [options]

Options:
  -c, --config <path>       Path to config file
  -r, --rounds <number>     Maximum debate rounds (default: 5)
  -i, --interactive         Interactive mode (pause between turns, Q&A)
  -o, --output <file>       Output to file
  -f, --format <format>     Output format (markdown|json)
  --no-converge             Disable convergence detection (enabled by default)
  -l, --local               Review local uncommitted changes
  -b, --branch [base]       Review current branch vs base (default: main)
  --files <files...>        Review specific files
  --reviewers <ids>         Comma-separated reviewer IDs (e.g., claude-code,gemini-cli)
  -a, --all                 Use all configured reviewers (skip selection)
  --git-remote <remote>     Git remote for PR URL detection (default: origin)
  --skip-context            Skip context gathering phase
  --no-post                 Skip post-processing (GitHub comment flow)
  --no-conclusion           Skip final conclusion generation (for bot/CI use)
  --fail-fast               Abort the entire review immediately if any reviewer fails
  --plan-only               Generate review plan without executing
  --reanalyze               Force re-analyze features (ignore cache)

  # Repository Review Options
  --repo                    Review entire repository
  --path <path>             Subdirectory to review (with --repo)
  --ignore <patterns...>    Patterns to ignore (with --repo)
  --quick                   Quick mode: only architecture overview
  --deep                    Deep mode: full analysis without prompts
  --list-sessions           List all review sessions
  --session <id>            Resume specific session by ID
  --export <file>           Export completed review to markdown
```

### Discuss Command

```bash
magpie discuss [topic] [options]

Options:
  -c, --config <path>       Path to config file
  -r, --rounds <number>     Maximum debate rounds (default: 5)
  -i, --interactive         Interactive mode (follow-up Q&A after conclusion)
  -o, --output <file>       Output to file
  -f, --format <format>     Output format (markdown|json)
  --no-converge             Disable convergence detection
  --reviewers <ids>         Comma-separated reviewer IDs
  -a, --all                 Use all configured reviewers
  -d, --devil-advocate      Add a Devil's Advocate to challenge consensus
  --fail-fast               Abort the entire discussion immediately if any reviewer fails
  --list                    List all discuss sessions
  --resume <id>             Resume a discuss session with follow-up question
```

### Reviewer Selection

By default, Magpie prompts you to select reviewers interactively:

```bash
# Interactive selection (default)
magpie review 12345

# Select reviewers from config:
#   1. claude-code
#   2. codex-cli
#   3. gemini-cli
# Enter numbers separated by commas (e.g., 1,2): 1,3
```

You can also specify reviewers directly:

```bash
# Use all configured reviewers
magpie review 12345 --all
magpie review 12345 -a

# Specify reviewers by ID
magpie review 12345 --reviewers claude-code,gemini-cli
```

### Review Modes

```bash
# Review a GitHub PR (number or URL)
magpie review 12345
magpie review https://github.com/owner/repo/pull/12345

# Review local uncommitted changes (staged + unstaged)
magpie review --local

# Review current branch vs main
magpie review --branch

# Review current branch vs specific base
magpie review --branch develop

# Review specific files
magpie review --files src/foo.ts src/bar.ts
```

### Repository Review

Review an entire repository with feature-based analysis:

```bash
# Full repository review (interactive)
magpie review --repo

# Quick mode: architecture overview only
magpie review --repo --quick

# Deep analysis (no prompts)
magpie review --repo --deep

# Review specific subdirectory
magpie review --repo --path src/api

# List/resume sessions
magpie review --list-sessions
magpie review --session abc123

# Export completed review
magpie review --export review-report.md
```

Repository review includes:
- AI-powered feature detection (identifies logical modules)
- Session persistence (pause/resume reviews)
- Focus area selection (security, performance, architecture, etc.)
- Progress saving between runs

### Topic Discussion

Discuss any technical topic with multiple AI reviewers through adversarial debate:

```bash
# Basic discussion
magpie discuss "Should we use microservices or monolith for our new project?"

# From a file (supports markdown)
magpie discuss /path/to/architecture-proposal.md

# With Devil's Advocate to challenge consensus
magpie discuss "Is Kubernetes overkill for our scale?" -d

# Interactive mode for follow-up Q&A
magpie discuss "How should we handle database migrations?" -i

# List all discuss sessions
magpie discuss --list

# Resume a previous discussion with follow-up
magpie discuss --resume abc123 "What about rollback strategies?"
```

Discussion features:
- **Multi-perspective analysis**: Different AI models debate the topic from their unique viewpoints
- **Devil's Advocate mode** (`-d`): Adds a dedicated contrarian to stress-test ideas
- **Session persistence**: Save/resume discussions for multi-session deep dives
- **Language matching**: Automatically responds in the same language as your topic (Chinese/English)
- **Interactive follow-up**: Continue the discussion with additional questions
- **Project context**: Optionally loads project-specific context for relevant discussions

## Workflow

```
1. Context Gathering (if enabled)
   │  Collects: affected modules, related PRs, call chains
   │  Supports: Go, C++, Python, Java, Scala, TS/JS, Rust, Proto
   ↓
2. Analyzer analyzes PR
   │  Outputs: summary, interface changes, compatibility risks,
   │           interaction risks, specific review focus areas
   ↓
3. [Interactive] Post-analysis Q&A (ask specific reviewers)
   ↓
4. Multi-round debate
   ├─ Round 1: All reviewers give INDEPENDENT opinions (parallel)
   │           CLI reviewers fetch diff + read code via tools
   │           ↓
   ├─ Convergence check: Did reviewers reach consensus?
   │           ↓
   ├─ Round 2+: Reviewers see ALL previous rounds (parallel)
   │            Cross-validate findings, challenge weak arguments
   │            ↓
   └─ ... (repeat until max rounds or convergence)
   ↓
5. Structurizer extracts issues into structured JSON
   ↓
6. Verify+Audit (tool-equipped)
   │  For each issue: Read/Grep actual code to verify
   │  Filters: false positives, by-design patterns, pre-existing issues
   │  Re-calibrates severity based on evidence
   ↓
7. [Optional] Summarizer produces final conclusion (--no-conclusion to skip)
```
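
Steps 5 and 6 can be pictured as a filter over structured issues. The sketch below is an assumption about the data shapes: the `Issue` and `Audit` fields, the severity levels, and the verdict names are hypothetical and only mirror the filter categories listed above.

```typescript
// Hypothetical shapes; Magpie's real JSON schema may differ.
type Severity = "critical" | "major" | "minor";
type Verdict = "confirmed" | "false_positive" | "by_design" | "pre_existing";

interface Issue { id: string; file: string; line: number; severity: Severity; summary: string }
interface Audit { issueId: string; verdict: Verdict; severity?: Severity }

// Keep confirmed issues, drop filtered ones, and apply re-calibrated severity.
function applyAudit(issues: Issue[], audits: Audit[]): Issue[] {
  const byId = new Map<string, Audit>();
  for (const audit of audits) byId.set(audit.issueId, audit);

  const kept: Issue[] = [];
  for (const issue of issues) {
    const audit = byId.get(issue.id);
    if (!audit) {
      // Assumption: issues the verifier did not rule on are kept unchanged.
      kept.push(issue);
      continue;
    }
    if (audit.verdict !== "confirmed") continue;
    kept.push({ ...issue, severity: audit.severity ?? issue.severity });
  }
  return kept;
}
```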

### Fair Debate Model

Magpie uses a fair debate model where:

- **Round 1**: Each reviewer gives an independent opinion without seeing the other reviewers' output
- **Round 2+**: Each reviewer sees ALL previous rounds' messages
- **Same-round fairness**: All reviewers in the same round see identical information
- **Parallel execution**: Same-round reviewers run concurrently (faster reviews)

This ensures no reviewer has an unfair advantage from execution order.
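
One way to get this property is to hand every reviewer the same snapshot of the history and only append the new messages once the whole round has finished. The sketch below illustrates that idea; `Message`, `Reviewer`, and `runRound` are hypothetical names, not Magpie's internal API.

```typescript
// Hypothetical sketch of a fair, parallel debate round.
interface Message { round: number; reviewer: string; text: string }

interface Reviewer {
  id: string;
  review: (history: ReadonlyArray<Message>) => Promise<string>;
}

async function runRound(round: number, reviewers: Reviewer[], history: Message[]): Promise<Message[]> {
  // Copy the history once so every reviewer sees identical input,
  // no matter which reviewer finishes first.
  const snapshot: ReadonlyArray<Message> = history.slice();
  // All reviewers start concurrently; Promise.all preserves input order.
  const texts = await Promise.all(reviewers.map((r) => r.review(snapshot)));
  return texts.map((text, i) => ({ round, reviewer: reviewers[i].id, text }));
}
```

The caller appends the returned messages to the history only after the round completes, so nothing produced in round N is visible to another reviewer before round N+1.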

## Features

### Context Gathering

Before the review begins, Magpie automatically gathers system-level context to help reviewers understand the broader impact of changes:

- **Affected Modules**: Identifies which parts of the system are impacted (core, moderate, low)
- **Related PRs**: Finds relevant past PRs from project history
- **Call Chain Analysis**: Traces how changed code connects to the rest of the system (supports Go, C++, Python, Java, Scala, TypeScript/JavaScript, Rust, Proto)

```
┌─ System Context ─────────────────────────────────────────┐
│ Affected Modules:                                        │
│   • [core] src/orchestrator - Main review orchestration  │
│   • [moderate] src/config - Configuration handling       │
│                                                          │
│ Related PRs:                                             │
│   • #42 - Added streaming support                        │
│   • #38 - Refactored provider interface                  │
└──────────────────────────────────────────────────────────┘
```

Use `--skip-context` to skip this phase for a single run, or configure it in the `contextGatherer` section of the config file.
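
To make the `callChain.maxDepth` and `callChain.maxFilesToAnalyze` settings concrete, here is a sketch of a depth-limited breadth-first walk over a caller graph. The graph input, the function name `traceCallChain`, and the reading of `maxFilesToAnalyze` as a cap on discovered files are all assumptions for illustration.

```typescript
// Hypothetical sketch: `callers` maps a file to the files that call into it.
function traceCallChain(
  changed: string[],
  callers: Map<string, string[]>,
  maxDepth: number,
  maxFilesToAnalyze: number,
): string[] {
  const visited = new Set<string>(changed);
  const result: string[] = [];
  let frontier = changed;

  // Each loop iteration moves one hop further from the changed files.
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const caller of callers.get(file) ?? []) {
        if (visited.has(caller)) continue;
        if (result.length >= maxFilesToAnalyze) return result;
        visited.add(caller);
        result.push(caller);
        next.push(caller);
      }
    }
    frontier = next;
  }
  return result;
}
```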

### Session Persistence

Reviewers that support sessions maintain context across debate rounds, reducing token usage.

| Provider | Session Support | Notes |
|----------|-----------------|-------|
| `claude-code` | Yes | Full session with explicit ID |
| `codex-cli` | Yes | Full session with explicit ID |
| `qwen-code` | Yes | Full session with explicit ID |
| `minimax` | Yes | Conversation history maintained |
| `gemini-cli` | No | Uses full context each round |
| Other API providers | No | Uses full context each round |
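
The token saving comes from what has to be re-sent each round. The sketch below contrasts the two cases under a simplifying assumption: a session-capable provider already remembers the prompt and earlier rounds, so only the previous round's messages are sent. `Turn` and `buildRoundInput` are hypothetical names.

```typescript
// Hypothetical sketch of per-round input for session and non-session providers.
interface Turn { round: number; reviewer: string; text: string }

function buildRoundInput(
  supportsSession: boolean,
  basePrompt: string,
  history: Turn[],
  currentRound: number,
): string {
  const format = (t: Turn) => `[${t.reviewer}] (round ${t.round}) ${t.text}`;
  if (supportsSession) {
    // Assumption: the session holds everything up to the previous round.
    return history.filter((t) => t.round === currentRound - 1).map(format).join("\n");
  }
  // Without a session, the prompt and the full history are sent every round.
  return [basePrompt, ...history.map(format)].join("\n");
}
```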

### Parallel Execution

All reviewers in the same round execute concurrently. Results are collected and displayed after all reviewers complete:

```
⠋ Round 1: All reviewers thinking (parallel)...
   ↓ (all reviewers running simultaneously)
[claude-code]: First review...
[gemini-cli]: First review...
   ↓
⠋ Checking convergence...
   ↓
⠋ Round 2: All reviewers thinking (parallel)...
```

### Post-Analysis Q&A (Interactive Mode)

In interactive mode (`-i`), after analysis you can ask specific reviewers questions before the debate begins:

```bash
magpie review 12345 -i

# After analysis...
💡 You can ask specific reviewers questions before the debate begins.
   Format: @reviewer_id question (e.g., @claude What about security?)
   Available: @claude
   Available: @gemini
❓ Ask a question or press Enter to start debate: @claude What about the error handling?
```
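
The `@reviewer_id question` format is simple to parse. A minimal sketch, assuming that an empty line or an unknown reviewer ID yields no question (the function name `parseQuestion` is hypothetical):

```typescript
// Hypothetical sketch of parsing "@reviewer_id question" input.
interface TargetedQuestion { reviewerId: string; question: string }

function parseQuestion(input: string, available: string[]): TargetedQuestion | null {
  // "@" + reviewer ID without whitespace, then the question text.
  const match = /^@(\S+)\s+(.+)$/.exec(input.trim());
  if (!match) return null;
  const reviewerId = match[1];
  const question = match[2].trim();
  // Assumption: questions for unknown reviewers are rejected.
  if (available.indexOf(reviewerId) === -1) return null;
  return { reviewerId, question };
}
```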

### Convergence Detection

Convergence detection is enabled by default: the debate ends automatically once reviewers reach consensus on key points, which saves tokens.

```bash
# Convergence detection enabled by default
magpie review 12345

# Disable convergence detection
magpie review 12345 --no-converge
```

Set `defaults.check_convergence: false` in config to disable by default.
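
The interaction between `max_rounds` and convergence detection amounts to an early-exit loop. The sketch below is synchronous for brevity and leaves the consensus test as a caller-supplied predicate, since how Magpie judges consensus is not specified here; `runDebate` and `DebateOptions` are hypothetical names.

```typescript
// Hypothetical sketch of the debate loop with optional early exit.
interface DebateOptions { maxRounds: number; checkConvergence: boolean }

function runDebate<T>(
  options: DebateOptions,
  runRound: (round: number, previous: T[]) => T,
  hasConverged: (rounds: T[]) => boolean,
): T[] {
  const rounds: T[] = [];
  for (let round = 1; round <= options.maxRounds; round++) {
    // Each round receives a copy of all earlier rounds.
    rounds.push(runRound(round, rounds.slice()));
    // With --no-converge (checkConvergence = false) the loop always runs to maxRounds.
    if (options.checkConvergence && hasConverged(rounds)) break;
  }
  return rounds;
}
```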

### Failure Handling

By default, Magpie is **resilient**: if a single reviewer fails (network error, rate limit, model unavailable), the round continues with the surviving reviewers and only aborts if *all* reviewers fail. The failed reviewer's slot shows `[Review failed: ...]` and is excluded from subsequent rounds.

Use `--fail-fast` to flip to strict mode — any single reviewer failure (or context-gathering failure) immediately terminates the entire flow with an error:

```bash
# Strict mode: abort the moment anything fails
magpie review 12345 --fail-fast
magpie discuss "Should we use microservices?" --fail-fast
```

Useful when you want to guarantee every configured reviewer participated, or when you're debugging provider/auth issues and don't want failures swallowed.
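
The difference between the two modes can be sketched as follows. This is an illustration under assumptions, not Magpie's implementation: `runResilientRound`, `Outcome`, and `FallibleReviewer` are hypothetical names.

```typescript
// Hypothetical sketch of resilient vs. fail-fast reviewer execution.
type Outcome =
  | { id: string; ok: true; text: string }
  | { id: string; ok: false; error: string };

interface FallibleReviewer { id: string; review: () => Promise<string> }

async function runResilientRound(reviewers: FallibleReviewer[], failFast: boolean): Promise<Outcome[]> {
  const outcomes = await Promise.all(
    reviewers.map((r): Promise<Outcome> =>
      r.review().then(
        (text): Outcome => ({ id: r.id, ok: true, text }),
        (err: unknown): Outcome => {
          // Strict mode: rethrow so the whole round rejects on the first failure.
          if (failFast) throw err;
          return { id: r.id, ok: false, error: String(err) };
        },
      ),
    ),
  );
  // Resilient mode still aborts when nobody produced a review.
  if (outcomes.every((o) => !o.ok)) throw new Error("All reviewers failed");
  return outcomes;
}
```

A caller would then carry only the outcomes with `ok: true` into the next round, which matches the "excluded from subsequent rounds" behavior described above.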

### Markdown Rendering

All output (analysis, reviewer comments, final conclusion) is rendered as formatted markdown in the terminal: headers, bold text, tables, and code blocks all display correctly.

### Token Usage Tracking

Displays token usage and estimated cost after each review:

```
── Token Usage (Estimated) ──
  analyzer       88 in     438 out
  claude      4,776 in   1,423 out
  gemini      6,069 in     664 out
  summarizer    505 in     322 out
──────────────────────────────────
  Total      11,438 in   2,847 out  ~$0.1429
```
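
The totals row is a sum over roles, and the cost is an estimate derived from token counts. The sketch below uses a single flat price for every role, which is a simplification: the `Price` values are placeholders supplied by the caller, and real prices differ per model.

```typescript
// Hypothetical sketch of aggregating token usage into a cost estimate.
interface Usage { role: string; inputTokens: number; outputTokens: number }

// Prices in USD per million tokens; supplied by the caller.
interface Price { inputPerMillion: number; outputPerMillion: number }

function summarizeUsage(usages: Usage[], price: Price): { input: number; output: number; cost: number } {
  let input = 0;
  let output = 0;
  for (const u of usages) {
    input += u.inputTokens;
    output += u.outputTokens;
  }
  const cost = (input * price.inputPerMillion + output * price.outputPerMillion) / 1e6;
  return { input, output, cost };
}
```

Fed the four rows shown above, this yields the same totals of 11,438 input and 2,847 output tokens; the dollar figure depends entirely on the prices passed in.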

### Cold Jokes

While waiting for AI reviewers, enjoy programmer jokes:

```
⠋ claude is thinking... | Why do programmers confuse Halloween and Christmas? Because Oct 31 = Dec 25
```

Disable them via config if you prefer a quieter spinner:

```yaml
defaults:
  show_jokes: false
```

### Post-Review Discussion Phase (Interactive Mode)

In interactive mode (`-i`), after the debate concludes, you can enter a **discussion phase** to chat with any role (reviewers, analyzer, or summarizer) before the comment posting step:

- Pick any role by number to start a conversation
- Each role maintains a persistent session with full PR context and its original review analysis
- Use `/skip` to exit the entire discussion phase
- Useful for clarifying issues, asking follow-up questions, or getting deeper insights before deciding which comments to post

```
  Available roles:
    [1] claude-code
    [2] gemini-cli
    [3] analyzer
    [4] summarizer

  Pick a role by number (or Enter to exit discussion):
```

### Post-Processing (PR Review)

After the debate concludes, Magpie extracts structured issues and lets you review them one by one:

- **Comment style prompt**: Before the issue loop, you can provide style instructions (e.g., "be concise", "use Chinese") that apply to all generated comments
- **Progress tracking**: Shows running tally of posted/edited/discussed/skipped issues
- **Per-issue actions**:
  - **Post** (`p`) — Posts as an inline comment on the exact PR line
  - **Edit** (`e`) — Edit the comment before posting
  - **Discuss** (`d`) — Start a multi-turn discussion with any role (reviewer/analyzer/summarizer)
  - **Skip** (`s`) — Skip this issue
  - **Quit** (`q`) — Stop processing remaining issues
- **`/skip` and `/drop`**: During discussion, type `/skip` or `/drop` to abandon the current issue
- **Inline comments**: Each issue is posted as an individual inline comment on the specific line in the PR diff. Falls back to a regular PR comment if the line is not in the diff.
- **Auto-explain**: When you choose to discuss, the reviewer automatically explains the issue in detail first (where the problem is, why it's a problem, how to fix it) before you start asking questions.
- **Comment regeneration**: After discussion, the reviewer generates a revised comment. You can post it, post the original, edit, regenerate with new instructions, or skip.
- **`--no-post`**: Use this flag to skip the entire post-processing flow and just see the review output.
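
The inline-comment fallback described above can be sketched as a lookup against the lines present in the PR diff. The types and the function name `planComment` are hypothetical, and prefixing the fallback body with the file location is an assumption made so the comment stays actionable.

```typescript
// Hypothetical sketch of choosing between an inline and a regular PR comment.
interface ReviewIssue { file: string; line: number; body: string }

type CommentTarget =
  | { kind: "inline"; file: string; line: number; body: string }
  | { kind: "general"; body: string };

// `diffLines` maps each changed file to the line numbers present in the diff.
function planComment(issue: ReviewIssue, diffLines: Map<string, Set<number>>): CommentTarget {
  const lines = diffLines.get(issue.file);
  if (lines && lines.has(issue.line)) {
    return { kind: "inline", file: issue.file, line: issue.line, body: issue.body };
  }
  // The line is not part of the diff, so fall back to a regular PR comment.
  return { kind: "general", body: `${issue.file}:${issue.line}\n\n${issue.body}` };
}
```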

### Debug Mode

Use the mock provider to test Magpie workflows without real AI calls:

```bash
# Enable mock mode globally (all models become mock)
# In config: mock: true

# Or use mock as a model name
# reviewers:
#   test-reviewer:
#     model: mock
#     prompt: "test prompt"

# Environment variables
MAGPIE_MOCK_RESPONSE="fixed response text"   # Return fixed text
MAGPIE_MOCK_FILE=/path/to/response.txt       # Return content from file
MAGPIE_MOCK_DELAY=100                         # Delay between words in ms (default: 50)

# Example: run the review flow against the mock reviewer defined above
MAGPIE_MOCK_DELAY=50 magpie review 123 --reviewers test-reviewer
```
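
How the three environment variables might combine is sketched below. The precedence (fixed text before file), the placeholder default text, and the names `planMockResponse` and `MockPlan` are assumptions; only the variable names and the 50 ms default come from the comments above. The file reader is injected so the sketch has no runtime dependencies.

```typescript
// Hypothetical sketch of resolving mock provider settings from the environment.
type Env = Record<string, string | undefined>;

interface MockPlan { words: string[]; delayMs: number }

function planMockResponse(env: Env, readFile: (path: string) => string): MockPlan {
  // Assumption: a fixed response takes precedence over a response file.
  const text =
    env.MAGPIE_MOCK_RESPONSE ??
    (env.MAGPIE_MOCK_FILE ? readFile(env.MAGPIE_MOCK_FILE) : "mock response");

  const parsed = Number(env.MAGPIE_MOCK_DELAY);
  const hasDelay = env.MAGPIE_MOCK_DELAY !== undefined && Number.isFinite(parsed) && parsed >= 0;

  return {
    // The response is emitted word by word, with `delayMs` between words.
    words: text.split(/\s+/).filter((w) => w.length > 0),
    delayMs: hasDelay ? parsed : 50,
  };
}
```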

## Development

```bash
# Run in dev mode
npm run dev -- review 12345

# Run tests
npm test

# Build
npm run build
```

## License

ISC