OpenCode: Add OpenCode as a new provider

The OpenCode provider runs a variety of models inside an agent harness that can gather more information from the codebase as required (as with claude-code, codex, or gemini-cli). This is an alternative to using OpenRouter directly, where the API provider behaves like a chatbot and cannot gather any context beyond what it was handed.
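
To illustrate the distinction, here is a minimal sketch of how a model ID could be routed to the agent harness versus the plain API. It assumes the `opencode-cli:` and `openrouter/` prefixes described in the README changes in this commit; `classifyModel` and the `Route` type are hypothetical names for illustration, not the actual code in `providers/factory.ts`.

```typescript
// Hypothetical sketch: names are illustrative, not Magpie's real factory code.
type Route =
  | { kind: 'opencode-cli'; opencodeModel: string }     // agent harness, can read the codebase
  | { kind: 'openrouter-api'; openrouterModel: string } // chat completion, sees only the prompt
  | { kind: 'other'; model: string }

const OPENCODE_PREFIX = 'opencode-cli:'
const OPENROUTER_PREFIX = 'openrouter/'

function classifyModel(model: string): Route {
  if (model.startsWith(OPENCODE_PREFIX)) {
    // The remainder is kept verbatim (per the README, it is what opencode's -m flag receives).
    return { kind: 'opencode-cli', opencodeModel: model.slice(OPENCODE_PREFIX.length) }
  }
  if (model.startsWith(OPENROUTER_PREFIX)) {
    // The remainder is the model ID as listed by OpenRouter.
    return { kind: 'openrouter-api', openrouterModel: model.slice(OPENROUTER_PREFIX.length) }
  }
  return { kind: 'other', model }
}
```

Checking the `opencode-cli:` prefix first matters in this sketch, because the remainder of such an ID may itself start with `openrouter/`.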
Allow specifying tmp dir when preparing prompt
2026-05-29 16:19:13 -07:00 · 2026-05-29 16:19:13 -07:00 · 2026-05-28 10:24:02 -07:00 · 2026-05-28 10:22:55 -07:00 · 2026-05-28 10:22:53 -07:00 · 2026-05-28 10:22:51 -07:00
25 changed files with 1474 additions and 202 deletions
@@ -1,14 +1,15 @@
 # Magpie

-Multi-AI adversarial PR review tool. Let different AI models review your code like Linus Torvalds, generating more comprehensive reviews through debate.
+Multi-AI adversarial code review tool. Multiple AI models independently review your PR, debate their findings, then a code-aware verifier audits each issue against the actual codebase.

 ## Core Concepts

-- **Same Perspective, Different Models**: All reviewers use the same prompt (Linus-style), but are powered by different AI models
-- **Natural Adversarial**: Differences between models naturally create disagreements and debates
-- **Anti-Sycophancy**: Explicitly tells AI they're debating with other AIs, preventing mutual agreement bias
-- **Fair Debate Model**: All reviewers in the same round see identical information - no unfair advantage from execution order
-- **Parallel Execution**: Same-round reviewers run concurrently for faster reviews
+- **Code-Aware Review**: CLI-based reviewers (Claude Code, Codex, Gemini CLI) read the actual source files via tools — not just the diff text. They can grep for callers, read surrounding context, and verify their findings before reporting.
+- **Multi-Dimensional Review**: Beyond correctness/security, reviewers check compatibility (rolling upgrade risks, breaking changes), feature interaction (shared state, cross-feature conflicts), and extensibility.
+- **Natural Adversarial**: Different AI models naturally create disagreements and cross-validation through debate.
+- **Integrated Verify+Audit**: After issues are extracted, a tool-equipped verifier reads the actual code to confirm each issue, filter false positives, and re-calibrate severity — all within magpie's pipeline.
+- **Fair Debate Model**: All reviewers in the same round see identical information — no unfair advantage from execution order.
+- **Parallel Execution**: Same-round reviewers run concurrently for faster reviews.

 ## Supported AI Providers

@@ -17,11 +18,13 @@ Multi-AI adversarial PR review tool. Let different AI models review your code li
 | `claude-code` | CLI | Claude Code CLI (uses your subscription, no API key) |
 | `codex-cli` | CLI | OpenAI Codex CLI (uses your subscription, no API key) |
 | `gemini-cli` | CLI | Gemini CLI (uses Google account login, no API key) |
+| `opencode-cli` | CLI | OpenCode CLI — runs any model (typically via OpenRouter) as a code-aware agent (requires backing provider's API key) |
 | `qwen-code` | CLI | Alibaba Qwen Code CLI (uses OAuth login, no API key) |
 | `claude-*` | API | Anthropic API (requires ANTHROPIC_API_KEY) |
 | `gpt-*` | API | OpenAI API (requires OPENAI_API_KEY) |
 | `gemini-*` | API | Google Gemini API (requires GOOGLE_API_KEY) |
 | `minimax` | API | MiniMax API (requires MINIMAX_API_KEY) |
+| `openrouter/*` | API | OpenRouter API, OpenAI-compatible (requires OPENROUTER_API_KEY) |
 | `mock` | Debug | Mock provider for testing (no API key, see [Debug Mode](#debug-mode)) |

 **Recommended**: Use CLI providers (claude-code, codex-cli, gemini-cli, qwen-code) - they're free with your subscriptions and don't require API keys.
@@ -40,6 +43,47 @@ providers:
    base_url: https://my-proxy.example.com
 ```

+### OpenRouter
+
+OpenRouter exposes hundreds of models through a single OpenAI-compatible API. Magpie routes any model whose ID starts with `openrouter/` through OpenRouter:
+
+```yaml
+providers:
+  openrouter:
+    api_key: ${OPENROUTER_API_KEY}
+    # base_url: https://openrouter.ai/api/v1  # optional, this is the default
+
+reviewers:
+  sonnet:
+    model: openrouter/anthropic/claude-3.5-sonnet
+    prompt: |
+      ...
+  llama:
+    model: openrouter/meta-llama/llama-3-70b-instruct
+    prompt: |
+      ...
+```
+
+The portion after `openrouter/` is sent to OpenRouter verbatim, so use any model ID listed at https://openrouter.ai/models.
+
+### OpenCode CLI
+
+Models routed through `openrouter/*` reach the model purely as a chat completion — the reviewer sees only the diff and prompt and cannot read source files. To get a code-aware agent on top of OpenRouter (or any other backing provider), use the `opencode-cli` provider, which wraps the [OpenCode](https://opencode.ai/) CLI:
+
+```yaml
+providers:
+  openrouter:
+    api_key: ${OPENROUTER_API_KEY}
+
+reviewers:
+  sonnet-agent:
+    model: opencode-cli:openrouter/anthropic/claude-sonnet-4
+    prompt: |
+      ...
+```
+
+The portion after `opencode-cli:` is passed verbatim to opencode's `-m provider/model` flag. Reviewers run with a read-only tool allowlist (Read, Grep, Glob, plus `gh`/`git`/`rg`) — matching the claude-code provider's permissions. API keys from `providers.openrouter.api_key` (and `anthropic`/`openai`/`google` if configured) are forwarded into opencode's environment, so you don't need a second copy of your keys.
+
 ## Installation

 ```bash
@@ -102,33 +146,30 @@ reviewers:
  claude:
    model: claude-code
    prompt: |
-      You are a senior engineer reviewing this PR. Be direct and concise like Linus Torvalds,
-      but constructive rather than harsh.
+      You are a senior engineer reviewing this PR. Be precise and evidence-based.
+      Review dimensions: Correctness, Security, Compatibility (rolling upgrade,
+      breaking changes), Feature Interaction (shared state, cross-feature conflicts),
+      Extensibility, Architecture, Performance & Resources.
+      Use Read/Grep tools to verify findings against actual code.

-      Focus on:
-      1. **Correctness** - Will this code work? Edge cases?
-      2. **Security** - Any vulnerabilities? Input validation?
-      3. **Architecture** - Does this fit the overall design? Any coupling issues?
-      4. **Simplicity** - Is this the simplest solution? Over-engineering?
-
-  gemini:
-    model: gemini-cli
+  codex:
+    model: codex-cli
    prompt: |
-      # Same as above...
+      # Same dimensions as above

 # Analyzer - PR analysis (before debate)
 analyzer:
  model: claude-code
  prompt: |
-    You are a senior engineer providing PR context analysis.
    Analyze this PR and provide:
    1. What this PR does
    2. Architecture/design decisions
-    3. Purpose
-    4. Trade-offs
-    5. Things to note
+    3. Affected interfaces/APIs (flag breaking changes)
+    4. Compatibility risks (rolling upgrade, serialization changes)
+    5. Feature interaction risks (callers, shared state)
+    6. Suggested review focus (specific files + line ranges)

-# Summarizer - final conclusion
+# Summarizer - final conclusion + verify+audit
 summarizer:
  model: claude-code
  prompt: |
@@ -177,6 +218,8 @@ Options:
  --git-remote <remote>     Git remote for PR URL detection (default: origin)
  --skip-context            Skip context gathering phase
  --no-post                 Skip post-processing (GitHub comment flow)
+  --no-conclusion           Skip final conclusion generation (for bot/CI use)
+  --fail-fast               Abort the entire review immediately if any reviewer fails
  --plan-only               Generate review plan without executing
  --reanalyze               Force re-analyze features (ignore cache)

@@ -206,6 +249,7 @@ Options:
  --reviewers <ids>         Comma-separated reviewer IDs
  -a, --all                 Use all configured reviewers
  -d, --devil-advocate      Add a Devil's Advocate to challenge consensus
+  --fail-fast               Abort the entire discussion immediately if any reviewer fails
  --list                    List all discuss sessions
  --resume <id>             Resume a discuss session with follow-up question
 ```
@@ -324,24 +368,33 @@ Discussion features:
 ```
 1. Context Gathering (if enabled)
   │  Collects: affected modules, related PRs, call chains
+   │  Supports: Go, C++, Python, Java, Scala, TS/JS, Rust, Proto
   ↓
 2. Analyzer analyzes PR
+   │  Outputs: summary, interface changes, compatibility risks,
+   │           interaction risks, specific review focus areas
   ↓
 3. [Interactive] Post-analysis Q&A (ask specific reviewers)
   ↓
 4. Multi-round debate
   ├─ Round 1: All reviewers give INDEPENDENT opinions (parallel)
-   │           No reviewer sees others' responses yet
+   │           CLI reviewers fetch diff + read code via tools
   │           ↓
   ├─ Convergence check: Did reviewers reach consensus?
   │           ↓
   ├─ Round 2+: Reviewers see ALL previous rounds (parallel)
-   │            Each reviewer responds to others' points
-   │            Same-round reviewers see identical information
+   │            Cross-validate findings, challenge weak arguments
   │            ↓
   └─ ... (repeat until max rounds or convergence)
   ↓
-5. Summarizer produces final conclusion from full debate history
+5. Structurizer extracts issues into structured JSON
+   ↓
+6. Verify+Audit (tool-equipped)
+   │  For each issue: Read/Grep actual code to verify
+   │  Filters: false positives, by-design patterns, pre-existing issues
+   │  Re-calibrates severity based on evidence
+   ↓
+7. [Optional] Summarizer produces final conclusion (--no-conclusion to skip)
 ```

 ### Fair Debate Model
@@ -363,7 +416,7 @@ Before the review begins, Magpie automatically gathers system-level context to h

 - **Affected Modules**: Identifies which parts of the system are impacted (core, moderate, low)
 - **Related PRs**: Finds relevant past PRs from project history
-- **Call Chain Analysis**: Traces how changed code connects to the rest of the system
+- **Call Chain Analysis**: Traces how changed code connects to the rest of the system (supports Go, C++, Python, Java, Scala, TypeScript, Rust, Proto)

 ```
 ┌─ System Context ─────────────────────────────────────────┐
@@ -436,6 +489,20 @@ magpie review 12345 --no-converge

 Set `defaults.check_convergence: false` in config to disable by default.

+### Failure Handling
+
+By default, Magpie is **resilient**: if a single reviewer fails (network error, rate limit, model unavailable), the round continues with the surviving reviewers and only aborts if *all* reviewers fail. The failed reviewer's slot shows `[Review failed: ...]` and is excluded from subsequent rounds.
+
+Use `--fail-fast` to flip to strict mode — any single reviewer failure (or context-gathering failure) immediately terminates the entire flow with an error:
+
+```bash
+# Strict mode: abort the moment anything fails
+magpie review 12345 --fail-fast
+magpie discuss "Should we use microservices?" --fail-fast
+```
+
+Useful when you want to guarantee every configured reviewer participated, or when you're debugging provider/auth issues and don't want failures swallowed.
+
 ### Markdown Rendering

 All outputs (analysis, reviewer comments, final conclusion) are rendered with proper markdown formatting in terminal - headers, bold, tables, code blocks all display correctly.
@@ -462,6 +529,13 @@ While waiting for AI reviewers, enjoy programmer jokes:
 ⠋ claude is thinking... | Why do programmers confuse Halloween and Christmas? Because Oct 31 = Dec 25
 ```

+Disable them via config if you prefer a quieter spinner:
+
+```yaml
+defaults:
+  show_jokes: false
+```
+
 ### Post-Review Discussion Phase (Interactive Mode)

 In interactive mode (`-i`), after the debate concludes, you can enter a **discussion phase** to chat with any role (reviewers, analyzer, or summarizer) before the comment posting step:
@@ -199,6 +199,7 @@ interface DiscussOptions {
  list?: boolean
  resume?: string
  devilAdvocate?: boolean
+  failFast?: boolean
 }

 async function runDiscussion(
@@ -230,6 +231,7 @@ async function runDiscussion(
  const isSoloDiscussion = reviewers.length === 1
  const maxRounds = isSoloDiscussion ? 1 : parseInt(options.rounds, 10)
  const checkConvergence = !isSoloDiscussion && options.converge !== false && (config.defaults.check_convergence !== false)
+  const showJokes = config.defaults.show_jokes !== false

  const summarizer: Reviewer = {
    id: 'summarizer',
@@ -302,6 +304,7 @@ async function runDiscussion(
    checkConvergence,
    language: lang,
    interruptState,
+    failFast: !!options.failFast,
    onWaiting: (reviewerId) => {
      flushBuffer()
      if (spinnerRef.spinner) spinnerRef.spinner.stop()
@@ -320,29 +323,30 @@ async function runDiscussion(
                   `${reviewerId} is thinking`

      const updateSpinner = () => {
-        const joke = getRandomJoke()
-        if (spinnerRef.spinner) {
+        if (!spinnerRef.spinner) return
+        const jokeSuffix = showJokes ? ` ${chalk.dim(`| ${getRandomJoke()}`)}` : ''
        if (spinnerRef.parallelStatuses && isParallelRound) {
          const round = parseInt(reviewerId.split('-')[1])
          const statusLine = formatParallelStatus(round, spinnerRef.parallelStatuses)
-            spinnerRef.spinner.text = `${statusLine} ${chalk.dim(`| ${joke}`)}`
+          spinnerRef.spinner.text = `${statusLine}${jokeSuffix}`
        } else {
-            spinnerRef.spinner.text = `${baseLabel}... ${chalk.dim(`| ${joke}`)}`
-          }
+          spinnerRef.spinner.text = `${baseLabel}...${jokeSuffix}`
        }
      }

      spinnerRef.parallelStatuses = null
      spinnerRef.spinner = ora({ text: `${baseLabel}...`, discardStdin: false }).start()
      updateSpinner()
+      if (showJokes) {
        spinnerRef.interval = setInterval(updateSpinner, 8000)
+      }
    },
    onParallelStatus: (round, statuses) => {
      spinnerRef.parallelStatuses = statuses
      if (spinnerRef.spinner) {
-        const joke = getRandomJoke()
+        const jokeSuffix = showJokes ? ` ${chalk.dim(`| ${getRandomJoke()}`)}` : ''
        const statusLine = formatParallelStatus(round, statuses)
-        spinnerRef.spinner.text = `${statusLine} ${chalk.dim(`| ${joke}`)}`
+        spinnerRef.spinner.text = `${statusLine}${jokeSuffix}`
      }
    },
    onMessage: (reviewerId, chunk) => {
@@ -443,6 +447,7 @@ export const discussCommand = new Command('discuss')
  .option('--reviewers <ids>', 'Comma-separated reviewer IDs')
  .option('-a, --all', 'Use all reviewers')
  .option('-d, --devil-advocate', "Add a Devil's Advocate to challenge consensus")
+  .option('--fail-fast', 'Abort the entire discussion immediately if any reviewer fails')
  .option('--list', 'List all discuss sessions')
  .option('--resume <id>', 'Resume a discuss session')
  .action(async (topic: string | undefined, options: DiscussOptions) => {
@@ -79,6 +79,7 @@ export const initCommand = new Command('init')
            if (r.provider === 'anthropic') envVars.add('ANTHROPIC_API_KEY')
            if (r.provider === 'openai') envVars.add('OPENAI_API_KEY')
            if (r.provider === 'google') envVars.add('GOOGLE_API_KEY')
+            if (r.provider === 'openrouter') envVars.add('OPENROUTER_API_KEY')
          })
          envVars.forEach(v => console.log(`  - ${v}`))
        }
@@ -3,7 +3,7 @@ import chalk from 'chalk'
 import ora from 'ora'
 import { execSync } from 'child_process'
 import { loadConfig } from '../config/loader.js'
-import { createProvider } from '../providers/factory.js'
+import { createProvider, isCliModel } from '../providers/factory.js'
 import { DebateOrchestrator } from '../orchestrator/orchestrator.js'
 import type { Reviewer, ReviewerStatus } from '../orchestrator/types.js'
 import { createInterface } from 'readline'
@@ -55,6 +55,8 @@ export const reviewCommand = new Command('review')
  .option('--export <file>', 'Export completed review to markdown')
  .option('--skip-context', 'Skip context gathering phase')
  .option('--no-post', 'Skip post-processing (GitHub comment flow)')
+  .option('--no-conclusion', 'Skip final conclusion generation (bot mode)')
+  .option('--fail-fast', 'Abort the entire review immediately if any reviewer (or context gatherer) fails')
  .action(async (pr: string | undefined, options) => {
    const spinner = ora('Loading configuration...').start()

@@ -224,10 +226,35 @@ export const reviewCommand = new Command('review')
          }
        }

-        // Pre-fetch PR diff and info so all reviewers (including API-only models) get the code
-        let prDiff = ''
+        // Fetch PR metadata (title/body) — always needed
        let prTitle = ''
        let prBody = ''
+        try {
+          const prInfo = JSON.parse(execSync(`gh pr view ${prUrl} --json title,body`, { encoding: 'utf-8', timeout: 30000 }))
+          prTitle = prInfo.title || ''
+          prBody = prInfo.body || ''
+        } catch {
+          // Non-fatal: reviewers can still work without metadata
+        }
+
+        // Check if all reviewers (+ analyzer and summarizer) are CLI-based.
+        // CLI providers can fetch diff and read code themselves via tools.
+        // API providers need the diff pre-fetched and embedded in the prompt.
+        const allModels = [
+          ...Object.values(config.reviewers).map(r => r.model),
+          config.analyzer.model,
+          config.summarizer.model,
+        ]
+        const allCli = allModels.every(m => isCliModel(m))
+
+        let prPrompt: string
+        if (allCli) {
+          // CLI mode: reviewers fetch diff and read code themselves
+          console.log(chalk.dim(`  CLI-only reviewers detected — reviewers will fetch diff and read code directly`))
+          prPrompt = `Please review ${prUrl}.\n\nTitle: ${prTitle}\n\nDescription:\n${prBody}\n\nYou have full access to the repository. Use \`gh pr diff ${prUrl}\` to get the diff, then use Read/Grep tools to examine the actual source files for context. Review every changed file and function systematically.`
+        } else {
+          // API mode: pre-fetch diff and embed in prompt
+          let prDiff = ''
          let diffTruncationNote = ''
          try {
            prDiff = execSync(`gh pr diff ${prUrl}`, { encoding: 'utf-8', timeout: 60000, maxBuffer: 10 * 1024 * 1024 })
@@ -270,17 +297,11 @@ export const reviewCommand = new Command('review')
              console.error(chalk.yellow(`Warning: Could not pre-fetch PR diff: ${errMsg.slice(0, 100)}`))
            }
          }
-        try {
-          const prInfo = JSON.parse(execSync(`gh pr view ${prUrl} --json title,body`, { encoding: 'utf-8', timeout: 30000 }))
-          prTitle = prInfo.title || ''
-          prBody = prInfo.body || ''
-        } catch {
-          // Non-fatal: reviewers can still work with just the diff
-        }

-        const prPrompt = prDiff
+          prPrompt = prDiff
            ? `Please review ${prUrl}.\n\nTitle: ${prTitle}\n\nDescription:\n${prBody}${diffTruncationNote}\n\nHere is the PR diff:\n\n\`\`\`diff\n${prDiff}\`\`\`\n\nAnalyze these changes and provide your feedback. You already have the complete diff above — do NOT attempt to fetch it again.`
            : `Please review ${prUrl}. Get the PR details and diff using any method available to you, then analyze the changes.`
+        }

        target = {
          type: 'pr',
@@ -361,6 +382,16 @@ export const reviewCommand = new Command('review')
        systemPrompt: config.analyzer.prompt
      }

+      // Create auditor (final judge). Uses config.audit if present; else falls back
+      // to summarizer (caller side just passes undefined so the orchestrator default kicks in).
+      const auditor: Reviewer | undefined = config.audit
+        ? {
+            id: 'auditor',
+            provider: createProvider(config.audit.model, config),
+            systemPrompt: config.audit.prompt
+          }
+        : undefined
+
      // Create context gatherer (if enabled)
      let contextGatherer: ContextGatherer | undefined
      const contextEnabled = !options.skipContext && (config.contextGatherer?.enabled !== false)
@@ -382,6 +413,7 @@ export const reviewCommand = new Command('review')
      const maxRounds = isSoloReview ? 1 : parseInt(options.rounds, 10)
      // Convergence: disable for solo review; otherwise default from config, CLI can override with --no-converge
      const checkConvergence = !isSoloReview && options.converge !== false && (config.defaults.check_convergence !== false)
+      const showJokes = config.defaults.show_jokes !== false

      console.log()
      console.log(chalk.bgBlue.white.bold(` ${target.label} Review `))
@@ -433,6 +465,8 @@ export const reviewCommand = new Command('review')
        checkConvergence,
        language: config.defaults.language,
        interruptState,
+        skipConclusion: options.conclusion === false,
+        failFast: !!options.failFast,
        onWaiting: (reviewerId) => {
          // Flush previous reviewer's buffer before showing spinner
          flushBuffer()
@@ -460,31 +494,31 @@ export const reviewCommand = new Command('review')

          // Show spinner with a joke (and parallel status if available)
          const updateSpinner = () => {
-            const joke = getRandomJoke()
-            if (spinnerRef.spinner) {
+            if (!spinnerRef.spinner) return
+            const jokeSuffix = showJokes ? ` ${chalk.dim(`| ${getRandomJoke()}`)}` : ''
            if (spinnerRef.parallelStatuses && isParallelRound) {
              const round = parseInt(reviewerId.split('-')[1])
              const statusLine = formatParallelStatus(round, spinnerRef.parallelStatuses)
-                spinnerRef.spinner.text = `${statusLine} ${chalk.dim(`| ${joke}`)}`
+              spinnerRef.spinner.text = `${statusLine}${jokeSuffix}`
            } else {
-                spinnerRef.spinner.text = `${baseLabel}... ${chalk.dim(`| ${joke}`)}`
-              }
+              spinnerRef.spinner.text = `${baseLabel}...${jokeSuffix}`
            }
          }

          spinnerRef.parallelStatuses = null  // Reset for new waiting phase
          spinnerRef.spinner = ora({ text: `${baseLabel}...`, discardStdin: false }).start()
          updateSpinner()
-          // Update joke every 15 seconds
+          if (showJokes) {
            spinnerRef.interval = setInterval(updateSpinner, 15000)
+          }
        },
        onParallelStatus: (round, statuses) => {
          spinnerRef.parallelStatuses = statuses
          // Immediately update spinner to show new status
          if (spinnerRef.spinner) {
-            const joke = getRandomJoke()
+            const jokeSuffix = showJokes ? ` ${chalk.dim(`| ${getRandomJoke()}`)}` : ''
            const statusLine = formatParallelStatus(round, statuses)
-            spinnerRef.spinner.text = `${statusLine} ${chalk.dim(`| ${joke}`)}`
+            spinnerRef.spinner.text = `${statusLine}${jokeSuffix}`
          }
        },
        onMessage: (reviewerId, chunk) => {
@@ -606,7 +640,7 @@ export const reviewCommand = new Command('review')
            console.log(marked(fixMarkdown(context.summary)))
          }
        }
-      }, contextGatherer)
+      }, contextGatherer, auditor)

      const result = await orchestrator.runStreaming(target.label, target.prompt)

@@ -9,7 +9,7 @@ export interface ReviewerOption {
  model: string
  description: string
  needsApiKey: boolean
-  provider?: 'anthropic' | 'openai' | 'google'
+  provider?: 'anthropic' | 'openai' | 'google' | 'openrouter'
 }

 export const AVAILABLE_REVIEWERS: ReviewerOption[] = [
@@ -34,6 +34,14 @@ export const AVAILABLE_REVIEWERS: ReviewerOption[] = [
    description: 'Uses your Gemini CLI (Google account, no API key needed)',
    needsApiKey: false
  },
+  {
+    id: 'opencode-cli',
+    name: 'OpenCode (via OpenRouter)',
+    model: 'opencode-cli:openrouter/anthropic/claude-3.5-sonnet',
+    description: 'Runs any OpenRouter model as a code-aware agent via the OpenCode CLI (requires OPENROUTER_API_KEY)',
+    needsApiKey: true,
+    provider: 'openrouter'
+  },
  {
    id: 'claude-api',
    name: 'Claude Sonnet 4.5',
@@ -57,6 +65,14 @@ export const AVAILABLE_REVIEWERS: ReviewerOption[] = [
    description: 'Uses Google AI API (requires GOOGLE_API_KEY)',
    needsApiKey: true,
    provider: 'google'
+  },
+  {
+    id: 'openrouter',
+    name: 'OpenRouter (Claude 3.5 Sonnet)',
+    model: 'openrouter/anthropic/claude-3.5-sonnet',
+    description: 'Uses OpenRouter API (requires OPENROUTER_API_KEY). Change the model field to any OpenRouter-supported ID.',
+    needsApiKey: true,
+    provider: 'openrouter'
  }
 ]

@@ -98,6 +114,7 @@ export function generateConfig(selectedReviewerIds: string[]): string {
  const needsAnthropic = selectedReviewers.some(r => r.provider === 'anthropic')
  const needsOpenai = selectedReviewers.some(r => r.provider === 'openai')
  const needsGoogle = selectedReviewers.some(r => r.provider === 'google')
+  const needsOpenrouter = selectedReviewers.some(r => r.provider === 'openrouter')

  // Build providers section
  let providersSection = '# AI Provider API Keys (use environment variables)\nproviders:'
@@ -116,7 +133,13 @@ export function generateConfig(selectedReviewerIds: string[]): string {
  google:
    api_key: \${GOOGLE_API_KEY}`
  }
-  if (!needsAnthropic && !needsOpenai && !needsGoogle) {
+  if (needsOpenrouter) {
+    providersSection += `
+  openrouter:
+    api_key: \${OPENROUTER_API_KEY}
+    # base_url: https://openrouter.ai/api/v1  # optional, this is the default`
+  }
+  if (!needsAnthropic && !needsOpenai && !needsGoogle && !needsOpenrouter) {
    providersSection += ' {}'  // Empty providers if only CLI tools are used
  }

@@ -142,6 +165,7 @@ defaults:
  max_rounds: 5
  output_format: markdown
  check_convergence: true  # Stop early when reviewers reach consensus
+  show_jokes: true  # Show rotating programmer jokes in the spinner while waiting

 ${reviewersSection}

@@ -83,6 +83,10 @@ function validateConfig(config: MagpieConfig): void {
  }
  validateReviewerConfig('analyzer', config.analyzer)

+  if (config.audit) {
+    validateReviewerConfig('audit', config.audit)
+  }
+
  // Warn (don't throw) if API keys look empty — CLI providers don't need them
  if (!config.providers) return
  for (const [name, prov] of Object.entries(config.providers)) {
@@ -15,6 +15,7 @@ export interface DefaultsConfig {
  check_convergence: boolean
  language?: string  // Output language (e.g., 'zh', 'en', 'ja')
  diff_exclude?: string[]  // Glob patterns for files to exclude from diff (e.g., '*.pb.go', '*generated*')
+  show_jokes?: boolean  // Show rotating programmer jokes in spinner text while waiting (default: true)
 }

 export interface ContextGathererConfigOptions {
@@ -41,13 +42,16 @@ export interface MagpieConfig {
    google?: ProviderConfig
    'claude-code'?: { enabled: boolean }
    'codex-cli'?: { enabled: boolean }
+    'opencode-cli'?: { enabled: boolean }
    'qwen-code'?: { enabled: boolean }
    minimax?: ProviderConfig
+    openrouter?: ProviderConfig
  }
  mock?: boolean
  defaults: DefaultsConfig
  reviewers: Record<string, ReviewerConfig>
  summarizer: ReviewerConfig
  analyzer: ReviewerConfig
+  audit?: ReviewerConfig  // Omniscient final judge; falls back to summarizer if absent
  contextGatherer?: ContextGathererConfigOptions
 }
@@ -3,35 +3,86 @@ import { spawnSync } from 'child_process'
 import type { RawReference } from '../types.js'

 /**
- * Extract function/class names from diff
+ * Common keywords to exclude from symbol extraction (language-spanning)
+ */
+const STOP_SYMBOLS = new Set([
+  // JS/TS
+  'get', 'set', 'new', 'for', 'if', 'do', 'var', 'let', 'const', 'return',
+  'else', 'case', 'break', 'continue', 'switch', 'while', 'try', 'catch',
+  'throw', 'typeof', 'void', 'delete', 'import', 'export', 'default', 'from',
+  'async', 'await', 'yield', 'class', 'extends', 'super', 'this',
+  // Go
+  'func', 'type', 'struct', 'interface', 'map', 'chan', 'range', 'defer',
+  'select', 'nil', 'err', 'error', 'string', 'bool', 'int', 'int32', 'int64',
+  'uint', 'uint32', 'uint64', 'float32', 'float64', 'byte', 'rune', 'len',
+  'cap', 'make', 'append', 'copy', 'close', 'panic', 'recover', 'println',
+  'true', 'false', 'init', 'main',
+  // C/C++
+  'void', 'int', 'char', 'bool', 'auto', 'long', 'short', 'unsigned',
+  'signed', 'float', 'double', 'size_t', 'nullptr', 'static', 'const',
+  'virtual', 'override', 'inline', 'explicit', 'template', 'typename',
+  'namespace', 'using', 'public', 'private', 'protected',
+  // Proto
+  'message', 'service', 'rpc', 'enum', 'oneof', 'optional', 'repeated',
+  'required', 'reserved', 'returns', 'option',
+  // Python
+  'def', 'self', 'cls', 'None', 'True', 'False', 'pass', 'with', 'lambda',
+  // Java/Scala
+  'public', 'private', 'protected', 'static', 'final', 'abstract', 'synchronized',
+  'val', 'var', 'object', 'trait', 'extends', 'with', 'override',
+])
+
+/**
+ * Extract function/class/struct names from diff (multi-language)
 */
 export function extractSymbolsFromDiff(diff: string): string[] {
  const symbols: Set<string> = new Set()

-  // Match function definitions: function name(, async function name(, const name = (, etc.
-  const functionPatterns = [
+  const patterns: RegExp[] = [
+    // JS/TS: function name(, async function name(
    /^\+.*(?:function|async function)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(/gm,
+    // JS/TS: const name = (, const name = async (
    /^\+.*(?:const|let|var)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*(?:async\s*)?\(/gm,
+    // JS/TS: const name = (...) =>
    /^\+.*(?:const|let|var)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*(?:async\s*)?\([^)]*\)\s*=>/gm,
+    // JS/TS: class Name
+    /^\+.*class\s+([a-zA-Z_][a-zA-Z0-9_]*)/gm,
+    // JS/TS: method definitions in classes
+    /^\+\s+(?:async\s+)?([a-zA-Z_][a-zA-Z0-9_]*)\s*\([^)]*\)\s*[:{]/gm,
+    // JS/TS: export declarations
+    /^\+.*export\s+(?:const|let|var|function|class|async function)\s+([a-zA-Z_][a-zA-Z0-9_]*)/gm,
+    // Go: func Name(, func (receiver) Name(
+    /^\+.*func\s+(?:\([^)]*\)\s+)?([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
+    // Go: type Name struct/interface
+    /^\+.*type\s+([A-Z][a-zA-Z0-9_]*)\s+(?:struct|interface)\b/gm,
+    // C/C++: return-type FunctionName(
+    /^\+.*(?:void|int|bool|char|auto|Status|string|std::string|size_t|int32_t|int64_t|uint32_t|uint64_t|float|double)\s+([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
+    // C/C++: ClassName::MethodName(
+    /^\+.*([A-Z][a-zA-Z0-9_]*)::\s*([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
+    // C/C++: class/struct Name
+    /^\+.*(?:class|struct)\s+([A-Z][a-zA-Z0-9_]*)/gm,
+    // Proto: message Name, service Name, rpc Name
+    /^\+\s*(?:message|service)\s+([A-Z][a-zA-Z0-9_]*)/gm,
+    /^\+\s*rpc\s+([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
+    // Python: def name(, class Name
+    /^\+\s*def\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(/gm,
+    /^\+\s*class\s+([A-Z][a-zA-Z0-9_]*)/gm,
+    // Java/Scala: public/private type Name(
+    /^\+\s*(?:public|private|protected)?\s*(?:static\s+)?(?:def|void|int|boolean|String|long|double|float|[A-Z][a-zA-Z0-9_<>]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)\s*\(/gm,
  ]

-  // Match class definitions
-  const classPattern = /^\+.*class\s+([a-zA-Z_][a-zA-Z0-9_]*)/gm
-
-  // Match method definitions in classes
-  const methodPattern = /^\+\s+(?:async\s+)?([a-zA-Z_][a-zA-Z0-9_]*)\s*\([^)]*\)\s*[:{]/gm
-
-  // Match exported names
-  const exportPattern = /^\+.*export\s+(?:const|let|var|function|class|async function)\s+([a-zA-Z_][a-zA-Z0-9_]*)/gm
-
-  for (const pattern of [...functionPatterns, classPattern, methodPattern, exportPattern]) {
+  for (const pattern of patterns) {
    let match
    while ((match = pattern.exec(diff)) !== null) {
-      const name = match[1]
-      // Filter out common keywords and short names
-      if (name && name.length > 2 && !['get', 'set', 'new', 'for', 'if', 'do'].includes(name)) {
+      // For C++ ClassName::MethodName pattern, capture both parts
+      const name = match[2] || match[1]
+      if (name && name.length > 2 && !STOP_SYMBOLS.has(name)) {
        symbols.add(name)
      }
+      // Also add the class name for Class::Method patterns
+      if (match[2] && match[1] && match[1].length > 2 && !STOP_SYMBOLS.has(match[1])) {
+        symbols.add(match[1])
+      }
    }
  }
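
A minimal sketch of the two-capture handling in the loop above, not the project's implementation. `STOP_SYMBOLS_SKETCH`, `extractSymbolsSketch`, and the sample diff are hypothetical names introduced for illustration. Only two of the patterns are reproduced, and the class capture is anchored with `\b` so that the whole identifier before `::` is captured.

```typescript
// Hypothetical stop list; the real STOP_SYMBOLS set is defined elsewhere in the file.
const STOP_SYMBOLS_SKETCH = new Set<string>(['get', 'set', 'new', 'for'])

function extractSymbolsSketch(diff: string): string[] {
  const patterns: RegExp[] = [
    // Go: func Name( or func (receiver) Name(  (one capture group)
    /^\+.*func\s+(?:\([^)]*\)\s+)?([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
    // C/C++: ClassName::MethodName(  (two capture groups)
    /^\+.*\b([A-Z][a-zA-Z0-9_]*)::\s*([A-Z][a-zA-Z0-9_]*)\s*\(/gm,
  ]
  const symbols = new Set<string>()
  const keep = (name: string | undefined): void => {
    if (name && name.length > 2 && !STOP_SYMBOLS_SKETCH.has(name)) symbols.add(name)
  }
  for (const pattern of patterns) {
    let match: RegExpExecArray | null
    while ((match = pattern.exec(diff)) !== null) {
      // Prefer the method name when the pattern has two groups...
      keep(match[2] || match[1])
      // ...and also record the class name for Class::Method matches.
      if (match[2]) keep(match[1])
    }
  }
  return Array.from(symbols).sort()
}

const sampleDiff = [
  '+func (s *Server) HandleRequest(ctx context.Context) error {',
  '+Status SegmentWriter::Flush() {',
  ' func Untouched() {}', // context line (no leading "+"), so it is ignored
].join('\n')

console.log(extractSymbolsSketch(sampleDiff))
```

For the sample input this yields `Flush`, `HandleRequest`, and `SegmentWriter`; the context line contributes nothing because every pattern requires a leading `+`.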

@@ -53,7 +104,8 @@ export function findReferences(symbols: string[], cwd: string = process.cwd()):
        '-n', '-H', '--no-heading',
        '-F',
        '-e', symbol,
-        '--type', 'ts', '--type', 'js',
+        '--type-add', 'code:*.{go,cpp,cc,cxx,h,hpp,hxx,c,py,java,scala,ts,tsx,js,jsx,rs,proto,cs}',
+        '--type', 'code',
      ], { cwd, encoding: 'utf-8', maxBuffer: 5 * 1024 * 1024 })

      const output = result.stdout || ''
@@ -105,16 +105,44 @@ export function deduplicateIssues(

 /**
 * Extract suggested review focus areas from analyzer output.
- * Looks for a "## Suggested Review Focus" section with bullet points.
+ * Matches the focus section heading in several flavors:
+ *   - "## Suggested Review Focus" (English heading)
+ *   - "## 建议的 review 重点" (Chinese heading with space)
+ *   - "## 建议的review重点" (Chinese heading no space)
+ *   - "**建议的 review 重点**" (bold variant)
+ *   - "**Suggested Review Focus**" (English bold variant)
+ * Reads until the next heading (a "#"-style heading or a standalone **bold** line) or the end of the input.
 */
 export function parseFocusAreas(analysis: string): string[] {
-  const match = analysis.match(/## Suggested Review Focus\s*\n([\s\S]*?)(?=\n##|\n*$)/)
+  // Heading pattern: either a markdown heading (##) or a standalone bold line (**...**)
+  // Title text matches Chinese or English variants.
+  const titlePattern = '(?:Suggested\\s+Review\\s+Focus|建议的\\s*review\\s*重点)'
+  // Optional leading numbering like "6.", "6、", "6）" before the title (analyzer may inline-number sections).
+  const numberPrefix = '(?:\\d+[\\.、\\)）]\\s*)?'
+  const headingRegex = new RegExp(
+    // Either: a line starting with 1-6 "#" characters (optional number prefix), then the title (optionally wrapped in **)
+    // Or: a standalone bold line **title** (with optional number prefix inside)
+    `(?:^|\\n)(?:#{1,6}\\s*${numberPrefix}\\*{0,2}${titlePattern}\\*{0,2}|\\*\\*${numberPrefix}${titlePattern}\\*\\*)[^\\n]*\\n([\\s\\S]*?)(?=\\n#{1,6}\\s|\\n\\*\\*[^\\n*]+\\*\\*\\s*\\n|$)`,
+    'i'
+  )
+  const match = analysis.match(headingRegex)
  if (!match) return []

-  const lines = match[1].trim().split('\n')
-  return lines
-    .map(line => line.replace(/^[-*]\s*/, '').trim())
-    .filter(line => line.length > 0)
+  const body = match[1].trim()
+  if (!body) return []
+
+  // Pull out lines that look like bulleted items.
+  // Supported markers: "-", "*", "•", "·", "①" through "⑳", and ASCII digits followed by ".", ")", "、", or full-width "）".
+  const bulletRegex = /^\s*(?:[-*•·]|[①-⑳]|[\d]+[\.、\)）])\s+/u
+  const lines = body.split('\n')
+  const items: string[] = []
+  for (const raw of lines) {
+    if (!bulletRegex.test(raw)) continue
+    const stripped = raw.replace(bulletRegex, '').trim()
+    if (stripped.length === 0) continue
+    items.push(stripped)
+  }
+  return items
 }

 const STOP_WORDS = new Set(['the', 'a', 'in', 'of', 'is', 'to', 'and', 'for', 'with', 'this', 'that', 'it'])
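
A small sketch of how the bullet filter in `parseFocusAreas` behaves on mixed input. The helper name `bulletItems` and the sample body are hypothetical; the regular expression is copied from `bulletRegex` above, and heading detection is left out.

```typescript
// Same marker set as bulletRegex in parseFocusAreas.
const bulletSketch = /^\s*(?:[-*•·]|[①-⑳]|[\d]+[\.、\)）])\s+/u

function bulletItems(body: string): string[] {
  const items: string[] = []
  for (const raw of body.split('\n')) {
    // Lines without a recognized marker followed by whitespace are skipped.
    if (!bulletSketch.test(raw)) continue
    const stripped = raw.replace(bulletSketch, '').trim()
    if (stripped.length > 0) items.push(stripped)
  }
  return items
}

const sampleBody = [
  'Intro sentence that is not a bullet.',
  '- Locking order in flush()',
  '2) Error path when the segment is empty',
  '③ Rollback compatibility',
  '1.no space after the marker',
].join('\n')

console.log(bulletItems(sampleBody))
```

Only the three properly marked lines survive. The intro sentence has no marker, and the last line is dropped because the pattern requires whitespace after the marker.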
@@ -79,16 +79,20 @@ export class DebateOrchestrator {
  private taskPrompt: string = ''  // Original task prompt (contains PR number, etc.)
  private lastSeenIndex: Map<string, number> = new Map()  // Track what each reviewer has seen

+  private auditor: Reviewer  // Final judge. Falls back to summarizer if not configured.
+
  constructor(
    reviewers: Reviewer[],
    summarizer: Reviewer,
    analyzer: Reviewer,
    options: OrchestratorOptions,
-    contextGatherer?: ContextGatherer
+    contextGatherer?: ContextGatherer,
+    auditor?: Reviewer
  ) {
    this.reviewers = reviewers
    this.summarizer = summarizer
    this.analyzer = analyzer
+    this.auditor = auditor || summarizer
    this.contextGatherer = contextGatherer || null
    this.options = options
  }
@@ -187,7 +191,14 @@ Reviews from Round ${roundsCompleted}:
 ${messagesText}

 First, provide a brief reasoning (2-3 sentences) explaining your judgment.
-Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED`
+
+Output your verdict on the LAST LINE with EXACTLY this format (no punctuation, no extra words):
+
+CONVERGED
+
+or
+
+NOT_CONVERGED`

    const messages: Message[] = [{ role: 'user', content: prompt }]
    const response = await this.summarizer.provider.chat(
@@ -197,8 +208,10 @@ Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED

    // Parse response - extract verdict from last line, rest is reasoning
    const lines = response.trim().split('\n')
-    const lastLine = lines[lines.length - 1].trim().toUpperCase()
-    const verdict = lastLine.split(/\s+/)[0]
+    const lastLine = lines[lines.length - 1].trim()
+    // Strip all non-letter characters, uppercase, then drop a leading "VERDICT" label so that
+    // "CONVERGED.", "Verdict: converged", "**CONVERGED**" all work.
+    const verdict = lastLine.replace(/[^A-Za-z_]/g, '').toUpperCase().replace(/^VERDICT/, '')
    const isConverged = verdict === 'CONVERGED'

    // Extract reasoning (everything except the last line)
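
A hedged sketch of last-line verdict normalization, useful for checking which model outputs are accepted. `parseVerdict` and its `UNKNOWN` result are hypothetical; the orchestrator itself only compares against `CONVERGED`. This variant also removes a leading `VERDICT` label after stripping punctuation, which is an assumption about what the model might prepend.

```typescript
type VerdictToken = 'CONVERGED' | 'NOT_CONVERGED' | 'UNKNOWN'

function parseVerdict(response: string): VerdictToken {
  const lines = response.trim().split('\n')
  const lastLine = (lines[lines.length - 1] ?? '').trim()
  // Keep letters and underscores only, uppercase, then drop an optional leading label.
  const token = lastLine
    .replace(/[^A-Za-z_]/g, '')
    .toUpperCase()
    .replace(/^VERDICT/, '')
  if (token === 'CONVERGED' || token === 'NOT_CONVERGED') return token
  return 'UNKNOWN'
}

console.log(parseVerdict('Both reviewers agree on the root cause.\nCONVERGED.'))
console.log(parseVerdict('Open disagreement on severity.\n**NOT_CONVERGED**'))
console.log(parseVerdict('Verdict: converged'))
console.log(parseVerdict('I am not sure.'))
```

Treating anything other than an exact `CONVERGED` token as not converged keeps the failure mode conservative: an unparseable last line leads to another debate round rather than an early stop.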
@@ -298,15 +311,24 @@ Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED

      this.checkInterrupt()

-      // Get final conclusion directly from conversation history
-      const finalConclusion = await this.getFinalConclusion()
+      let finalConclusion = ''
+      let verifiedConclusion: string | undefined

-      // Verify the conclusion against the actual PR/code
-      const verifiedConclusion = await this.verifyConclusion(finalConclusion)
+      if (!this.options.skipConclusion) {
+        finalConclusion = await this.getFinalConclusion()
+      }

      // End summarizer session for clean JSON extraction call
      this.summarizer.provider.endSession?.()
-      const parsedIssues = await this.extractIssues()
+      let parsedIssues = await this.extractIssues()
+
+      if (parsedIssues.length > 0) {
+        parsedIssues = await this.verifyIssues(parsedIssues)
+      }
+
+      if (finalConclusion && !this.options.skipConclusion) {
+        verifiedConclusion = await this.verifyConclusion(finalConclusion)
+      }

      return {
        prNumber: label,
@@ -352,6 +374,9 @@ Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED
              const diff = this.extractDiffFromPrompt(prompt)
              this.gatheredContext = await this.contextGatherer!.gather(diff, label, 'main')
            } catch (error) {
+              if (this.options.failFast) {
+                throw new Error(`Context gathering failed (fail-fast): ${error instanceof Error ? error.message : String(error)}`)
+              }
              logger.warn('Context gathering failed:', error)
            }
          })()
@@ -489,6 +514,10 @@ Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED
                duration: (endTime - startTime) / 1000
              }
              this.options.onParallelStatus?.(round, statuses)
+              if (this.options.failFast) {
+                // Re-throw so Promise.all rejects immediately and aborts the whole flow
+                throw new Error(`Reviewer ${reviewer.id} failed in round ${round} (fail-fast): ${err instanceof Error ? err.message : String(err)}`)
+              }
              logger.warn(`Reviewer ${reviewer.id} failed in round ${round}:`, err)
              return { reviewer, fullResponse: '', inputText: '', failed: true as const, error: err }
            }
@@ -536,18 +565,34 @@ Then on the LAST line, respond with EXACTLY one word: CONVERGED or NOT_CONVERGED
      }

      this.checkInterrupt()
-      this.options.onWaiting?.('summarizer')
-      const finalConclusion = await this.getFinalConclusion()

-      // Verify the conclusion against the actual PR/code
-      this.options.onWaiting?.('verifier')
-      const verifiedConclusion = await this.verifyConclusion(finalConclusion)
+      let finalConclusion = ''
+      let verifiedConclusion: string | undefined
+
+      if (!this.options.skipConclusion) {
+        this.options.onWaiting?.('summarizer')
+        finalConclusion = await this.getFinalConclusion()
+      }

      // End summarizer session before structurization so it gets a clean,
      // non-session call. The session context (convergence + conclusion) would
      // pollute the JSON extraction and --resume ignores custom system prompts.
      this.summarizer.provider.endSession?.()
-      const parsedIssues = await this.extractIssues()
+      let parsedIssues = await this.extractIssues()
+
+      // Verify+Audit: check each issue against actual code using tools.
+      // This replaces both the old text-only verifyConclusion and the
+      // downstream audit step in li-bot.
+      if (parsedIssues.length > 0) {
+        this.options.onWaiting?.('verifier')
+        parsedIssues = await this.verifyIssues(parsedIssues)
+      }
+
+      // Legacy: if conclusion was generated and skipConclusion is false,
+      // also verify conclusion text (for CLI interactive mode)
+      if (finalConclusion && !this.options.skipConclusion) {
+        verifiedConclusion = await this.verifyConclusion(finalConclusion)
+      }

      return {
        prNumber: label,
@@ -609,9 +654,32 @@ ${contextSection}${focusSection}${callChainSection}Here is the analysis:

 ${this.analysis}

-You are [${currentReviewerId}]. Review EVERY changed file and EVERY changed function/block — do not skip any.
-For each change, check: correctness, security, performance, error handling, edge cases, maintainability.
-If you reviewed a file and found no issues, say so briefly. Do not stop early.${this.langSuffix}`
+You are [${currentReviewerId}]. Review the PR systematically.
+
+For every issue you raise, you MUST include:
+1. The specific \`file:line\` — only lines inside diff hunks (lines outside hunks are wasted, GitHub can't anchor them)
+2. A quote of the offending code (1-3 lines max)
+3. The concrete failure scenario — what input or state triggers it, what happens, what the user/system experiences as a result
+4. A self-assessed severity (use these definitions exactly):
+   - critical = data corruption, security hole, guaranteed crash on common input
+   - high     = will trigger under realistic conditions, observable user-facing breakage
+   - medium   = edge case with plausible trigger, missing error handling
+   - low      = code quality, minor concern
+   - nitpick  = style-only preference (won't be posted)
+
+DO NOT REPORT:
+- Build script / CI polish (LD_PATH ordering, include order, dead asserts in build helpers, etc.)
+- Missing comments / docstrings unless load-bearing for correctness
+- "Forward-compat risk" / "if someone later adds X" without a concrete trigger
+- Dead code unless it carries real risk
+- Style preferences (naming, formatting, brace style)
+- Issues outside the diff hunk unless severity >= high
+- Theoretically-correct-but-impossible cases (e.g., int64 * byte_width overflow on 64-bit systems)
+
+If a file has nothing meaningful wrong, skip it. Do NOT produce filler.
+Brevity is a feature — 5 well-evidenced issues > 20 weak ones.
+
+Use \`gh pr diff\` and Read/Grep to verify your claims before reporting.${this.langSuffix}`

      return [{ role: 'user', content: prompt }]
    }
@@ -654,21 +722,20 @@ If you reviewed a file and found no issues, say so briefly. Do not stop early.${

      return [{
        role: 'user',
-        content: `You are [${currentReviewerId}]. Here's what others said in the previous round:\n\n${newContent}\n\nDo three things:\n1. Continue your own exhaustive review — are there changed files or functions you haven't covered yet? Cover them now.\n2. Point out what the other reviewers MISSED — which files or changes did they skip or gloss over?\n3. Respond to their points — agree where valid, challenge where you disagree.${this.langSuffix}`
+        content: `You are [${currentReviewerId}]. Here's what others said in the previous round:\n\n${newContent}\n\nDo this:\n1. If the others' findings are correct and you have nothing substantive to add, say "I agree with [reviewer]'s findings, no additional issues." That is a fine outcome — do not pad.\n2. If you disagree with any of their claims, challenge with code evidence — quote the line that disproves their concern.\n3. ONLY add new issues if they are concrete (file:line + code quote + failure scenario) AND genuinely missed by the others. Do not manufacture issues to look productive — padding hurts review quality.${this.langSuffix}`
      }]
    }

    // Non-session mode: full context with all previous rounds
    const debateContext = `You are [${currentReviewerId}] in a code review debate with [${otherReviewerIds.join('], [')}].
-Your shared goal: find ALL real issues in the code — leave nothing uncovered.
+Your shared goal: converge on the real issues — quality over quantity.

 IMPORTANT:
 - You are [${currentReviewerId}], the other reviewer${otherReviewerIds.length > 1 ? 's are' : ' is'} [${otherReviewerIds.join('], [')}]
- Continue your own exhaustive review — cover any changed files or functions you haven't addressed yet
- Point out what others MISSED — which files or changes did they skip or gloss over?
- Challenge weak arguments - don't agree just to be polite
- Acknowledge good points and build on them
- If you disagree, explain why with evidence`
+- If the others' findings are correct and you have nothing substantive to add, say "I agree with [reviewer]'s findings, no additional issues." That is a fine outcome — do not pad.
+- If you disagree with any claim, challenge with code evidence — quote the line that disproves the concern.
+- ONLY add new issues if they are concrete (file:line + code quote + failure scenario) AND genuinely missed by the others. Do not manufacture issues to look productive — padding hurts review quality.
+- Acknowledge good points and build on them.`

    let prompt = `Task: ${this.taskPrompt}

@@ -770,11 +837,11 @@ Output ONLY a JSON block (no other text):
  "issues": [
    {
      "severity": "critical|high|medium|low|nitpick",
-      "category": "security|performance|error-handling|style|correctness|architecture",
+      "category": "correctness|security|performance|concurrency|resource-leak|error-handling|build|testing|documentation|architecture|compatibility|style",
      "file": "path/to/file",
      "line": 42,
      "title": "One-line summary",
-      "description": "Detailed markdown explanation (see rules below)",
+      "description": "Concise explanation for GitHub PR comment (see rules below)",
      "suggestedFix": "Brief one-line fix summary",
      "raisedBy": ["reviewer-id-1", "reviewer-id-2"]
    }
@@ -784,11 +851,18 @@ Output ONLY a JSON block (no other text):

 Rules:
 - Include every issue mentioned by any reviewer
- The "description" field will be posted as a GitHub PR comment. Make it comprehensive markdown covering: (1) What the problem is, (2) Why it matters (impact/risk), (3) The original problematic code quoted in a code block, (4) The suggested fix shown as code, (5) Why the fix is correct
+- "description" field — write this as if you were a senior engineer leaving an inline PR comment. Must capture: (1) WHAT — the problem with a brief code quote (1-3 lines) anchored at line, (2) WHY — what makes this a bug / what assumption is broken / what invariant is violated (this is critical — audit will judge against this), (3) FAILURE — concrete scenario that triggers it and what the user/system experiences, (4) FIX — suggested fix if non-obvious. 1-3 sentences total, no boilerplate headers, no severity labels, no "raised by [X]" metadata. Plain prose only.
+- "category" MUST be one of the 12 values listed above. Choose the closest match, do not invent new categories.
 - If multiple reviewers mention the same issue, list all their IDs in raisedBy
 - Use the exact reviewer IDs: ${reviewerIds}
- If a file path or line number is mentioned, include it; otherwise omit the field
- Severity: critical = blocks merge, high = should fix, medium = worth fixing, low = minor, nitpick = style only${changedFilesConstraint}${this.options.language ? `\n- Write the "title", "description", and "suggestedFix" fields in ${this.options.language}. Keep JSON keys and severity/category values in English.` : ''}`
+- "line" field: REQUIRED for every issue. If the reviewer's text doesn't pin a specific line but anchors to a function or block, look at the diff hunk and pick the most representative line yourself. If you genuinely cannot anchor an issue to any line in the diff hunk, DROP that issue (don't emit it). Issues without lines cannot be posted as inline comments and waste reader attention.
+- Severity — use the rubric exactly. Do NOT bias systematically low or high:
+  critical = data corruption, security hole, guaranteed crash on common input
+  high     = will trigger under realistic conditions, observable user-facing breakage
+  medium   = edge case with plausible trigger, missing error handling
+  low      = code quality, minor concern
+  nitpick  = style-only preference
+  If the reviewer's reasoning supports a higher severity, use the higher one.${changedFilesConstraint}${this.options.language ? `\n- Write the "title", "description", and "suggestedFix" fields in ${this.options.language}. Keep JSON keys and severity/category values in English.` : ''}`

    const systemPrompt = 'You extract structured issues from code review text. Output only valid JSON.'
    const chatOpts = { disableTools: true }
@@ -898,4 +972,190 @@ Then provide your **Verified Final Conclusion** that:
    this.trackTokens('summarizer', prompt + (systemPrompt || ''), response)
    return response
  }
+
+  /**
+   * Audit (omniscient final judge): for every reviewer-flagged issue, verify against
+   * actual code (Read/Grep/Glob + `gh pr diff`); rewrite weak descriptions; drop false
+   * positives; add issues reviewers missed (especially cross-file pattern repetition).
+   * Returns the post-audit issue list.
+   */
+  private async verifyIssues(issues: MergedIssue[]): Promise<MergedIssue[]> {
+    const issuesText = issues.map((iss, i) =>
+      `### Issue ${i} [severity: ${iss.severity}] [category: ${iss.category}]\nfile: ${iss.file}${iss.line ? `:${iss.line}` : ''}\ntitle: ${iss.title}\ndescription: ${iss.description}${iss.suggestedFix ? `\nsuggestedFix: ${iss.suggestedFix}` : ''}`
+    ).join('\n\n')
+
+    // Optional repo-specific conventions file at ~/.magpie/house-rules/<owner>_<repo>.md.
+    // Parse owner/repo from the PR URL embedded in taskPrompt.
+    let houseRules = ''
+    try {
+      const { readFileSync, existsSync } = await import('fs')
+      const { join } = await import('path')
+      const { homedir } = await import('os')
+      const repoMatch = this.taskPrompt.match(/github\.com\/([^/\s]+)\/([^/\s]+)\/pull\//)
+      if (repoMatch) {
+        const owner = repoMatch[1]
+        const repo = repoMatch[2]
+        const hrPath = join(homedir(), '.magpie', 'house-rules', `${owner}_${repo}.md`)
+        if (existsSync(hrPath)) {
+          houseRules = readFileSync(hrPath, 'utf-8').trim()
+          logger.info(`Audit using house-rules from ${hrPath}`)
+        }
+      }
+    } catch { /* no house rules — that's fine */ }
+
+    const prompt = `${this.taskPrompt}
+
+You have access to Read, Grep, Glob, and Bash. Run \`gh pr diff\` (the URL is in the task above) to see the actual changes, then Read the touched files. **Read the code before judging — never guess.**
+
+## Issues raised by reviewers
+
+${issuesText}
+${houseRules ? `\n## Repository conventions (MUST respect — these override reviewer claims)\n\n${houseRules}\n` : ''}
+## Your job
+
+### Task 1: Verify each issue above
+
+For every numbered issue, decide a verdict:
+
+- **keep** — issue is real and the description is fine as-is. You may adjust severity.
+- **rewrite** — issue is real but the description is weak (machine-sounding, missing evidence, vague, or includes decoration). Write a clean replacement.
+- **drop** — false positive. Must give a \`reason\` (one of):
+  * \`codebase-convention\` — violates repo idiom (e.g. AssertInfo throws, doesn't abort; assert in writer_c.cpp is invariant not input validation)
+  * \`pre-existing\` — not introduced by this PR and unrelated to PR touch
+  * \`theoretically-correct-but-impossible\` — true in theory but real-world impossible (e.g. int64*byte_width overflow on 64-bit)
+  * \`style-out-of-scope\` — pure style, PR doesn't touch that concern
+  * \`false-claim\` — reviewer misread the code
+
+For every keep/rewrite you MUST include \`evidence\` quoting the actual code you Read (file:line + the line itself).
+
+### Task 2: Find issues reviewers MISSED
+
+After verifying, scan the diff yourself:
+
+a) **Coverage** — did reviewers skip files or functions in the diff? Read what they didn't.
+b) **Cross-file pattern repetition** — for every kept/rewritten issue, grep the entire diff for the same pattern in other files. New occurrence = new issue.
+c) **Architecture** — does this fix break an abstraction, introduce coupling, violate a pattern visible elsewhere?
+d) **Orthogonal interactions** — grep callers/consumers of touched interfaces; flag any module that should be updated together.
+
+New issues use \`verdict: "new"\`. Same evidence rules apply.
+
+## Output JSON (only this — no narrative, no preamble)
+
+\`\`\`json
+{
+  "verifiedIssues": [
+    {
+      "verdict": "keep" | "rewrite" | "drop" | "new",
+      "originalIndex": 0,
+      "file": "internal/...",
+      "line": 42,
+      "severity": "critical" | "high" | "medium" | "low" | "nitpick",
+      "category": "correctness",
+      "body": "Plain prose, 1-3 sentences.",
+      "evidence": "at file.cpp:118 saw \`if (!p) goto cleanup\` — confirms ...",
+      "reason": "codebase-convention"
+    }
+  ]
+}
+\`\`\`
+
+## Hard rules
+
+- For verdict=keep/rewrite/drop: \`originalIndex\` is REQUIRED (references the issue number above).
+- For verdict=keep/rewrite/new: \`file\` + \`line\` + \`severity\` + \`category\` + \`body\` + \`evidence\` are REQUIRED. \`line\` MUST be inside a diff hunk — run \`gh pr diff\` and verify.
+- For verdict=drop: \`reason\` is REQUIRED. Other fields ignored.
+- For verdict=keep: \`body\` may be omitted (signals "original description is fine"). If you set it, that replaces the original.
+- \`body\` must be plain prose. NO emoji decorations, NO \`[meta]\` tags, NO "Severity: X" labels, NO "raised by Y" suffix. Write as a senior engineer would when leaving an inline comment.
+- No evidence = no issue. Don't ship anything you didn't verify with code reads.
+- If repository conventions above conflict with a reviewer claim, conventions win.
+- Every issue must appear in \`verifiedIssues\` (every keep/rewrite/drop + any new).${this.langSuffix}`
+
+    const messages: Message[] = [{ role: 'user', content: prompt }]
+    const systemPrompt = this.withLang(this.auditor.systemPrompt)
+
+    try {
+      const response = await this.auditor.provider.chat(messages, systemPrompt)
+      this.trackTokens('verifier', prompt + (systemPrompt || ''), response)
+
+      // Parse the audit result
+      const jsonMatch = response.match(/```json\s*([\s\S]*?)\s*```/)
+      const jsonStr = jsonMatch?.[1] || response
+      const match = jsonStr.match(/\{[\s\S]*"verifiedIssues"\s*:\s*\[[\s\S]*?\]\s*\}/)
+      if (!match) {
+        logger.warn('Audit returned unparseable output; keeping original issues')
+        return issues
+      }
+
+      const parsed = JSON.parse(match[0])
+      if (!Array.isArray(parsed.verifiedIssues)) {
+        logger.warn('Audit verifiedIssues field is not an array; keeping originals')
+        return issues
+      }
+
+      type V = {
+        verdict: 'keep' | 'rewrite' | 'drop' | 'new'
+        originalIndex?: number
+        file?: string
+        line?: number
+        severity?: MergedIssue['severity']
+        category?: string
+        body?: string
+        evidence?: string
+        reason?: string
+      }
+
+      const result: MergedIssue[] = []
+      const droppedOrigIdx = new Set<number>()
+      let dropCount = 0, rewriteCount = 0, newCount = 0
+
+      for (const v of parsed.verifiedIssues as V[]) {
+        if (v.verdict === 'drop') {
+          if (typeof v.originalIndex === 'number') droppedOrigIdx.add(v.originalIndex)
+          dropCount++
+          continue
+        }
+        if (v.verdict === 'new') {
+          if (!v.file || typeof v.line !== 'number' || !v.body || !v.evidence) continue
+          result.push({
+            severity: (v.severity || 'low'),
+            category: v.category || 'general',
+            file: v.file,
+            line: v.line,
+            title: v.body.split(/[.!?\n]/)[0].slice(0, 100),
+            description: v.body,
+            raisedBy: ['auditor'],
+            descriptions: [v.body],
+            verdict: 'new',
+            body: v.body,
+            evidence: v.evidence
+          })
+          newCount++
+          continue
+        }
+        // keep or rewrite
+        if (typeof v.originalIndex !== 'number' || v.originalIndex < 0 || v.originalIndex >= issues.length) {
+          continue
+        }
+        const orig = issues[v.originalIndex]
+        if (droppedOrigIdx.has(v.originalIndex)) continue  // already dropped, skip duplicate
+        const merged: MergedIssue = {
+          ...orig,
+          severity: v.severity || orig.severity,
+          file: v.file || orig.file,
+          line: typeof v.line === 'number' ? v.line : orig.line,
+          verdict: v.verdict,
+          body: v.body,  // undefined for keep-no-change is fine
+          evidence: v.evidence
+        }
+        if (v.verdict === 'rewrite') rewriteCount++
+        result.push(merged)
+      }
+
+      logger.info(`Audit: ${result.length - newCount} kept/rewritten (${rewriteCount} rewrites), ${dropCount} dropped, ${newCount} new`)
+      return result
+    } catch (err) {
+      logger.warn('Audit failed; returning original issues:', err)
+      return issues
+    }
+  }
 }
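
A reduced sketch of the verdict-merging loop in `verifyIssues`, assuming simplified types. `IssueSketch`, `VerdictSketch`, `applyVerdicts`, and the sample data are hypothetical and omit the file, line, and evidence fields. It illustrates one property of the loop as written: the result is built only from the auditor's entries, so an original issue that the auditor never mentions does not appear in the output.

```typescript
type IssueSketch = { title: string; severity: string; description: string; body?: string }
type VerdictSketch = {
  verdict: 'keep' | 'rewrite' | 'drop' | 'new'
  originalIndex?: number
  severity?: string
  body?: string
}

function applyVerdicts(issues: IssueSketch[], verdicts: VerdictSketch[]): IssueSketch[] {
  const result: IssueSketch[] = []
  for (const v of verdicts) {
    if (v.verdict === 'drop') continue
    if (v.verdict === 'new') {
      if (!v.body) continue // new issues without text are discarded
      result.push({ title: v.body.slice(0, 100), severity: v.severity ?? 'low', description: v.body, body: v.body })
      continue
    }
    // keep or rewrite: must reference a valid original issue
    const idx = v.originalIndex
    if (typeof idx !== 'number' || idx < 0 || idx >= issues.length) continue
    const orig = issues[idx]
    if (!orig) continue
    const merged: IssueSketch = { ...orig, severity: v.severity ?? orig.severity }
    if (v.body !== undefined) merged.body = v.body
    result.push(merged)
  }
  return result
}

const sampleIssues: IssueSketch[] = [
  { title: 'A', severity: 'medium', description: 'first' },
  { title: 'B', severity: 'low', description: 'second' },
  { title: 'C', severity: 'low', description: 'third' },
]
const sampleVerdicts: VerdictSketch[] = [
  { verdict: 'keep', originalIndex: 0, severity: 'high' },
  { verdict: 'drop', originalIndex: 1 },
  // no entry for index 2: issue C is silently absent from the result
  { verdict: 'new', severity: 'medium', body: 'Found by the auditor.' },
]

console.log(applyVerdicts(sampleIssues, sampleVerdicts).map(i => i.title))
```

With the sample data the result holds two issues: `A` with its severity raised to `high`, and the auditor's new issue. A caller that wants unmentioned originals preserved would need to add them back explicitly.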
@@ -56,6 +56,8 @@ export interface OrchestratorOptions {
  onPostAnalysisQA?: () => Promise<{ target: string; question: string } | null>
  onContextGathered?: (context: GatheredContext) => void  // Context gathering complete callback
  interruptState?: { interrupted: boolean }  // External interrupt signal (e.g., Ctrl+C)
+  skipConclusion?: boolean  // Skip getFinalConclusion + old verifyConclusion (bot mode)
+  failFast?: boolean  // Abort the entire flow as soon as any reviewer (or context gatherer) fails
 }

 /** Structured issue from a reviewer */
@@ -83,4 +85,10 @@ export interface ReviewerOutput {
 export interface MergedIssue extends ReviewIssue {
  raisedBy: string[]       // reviewer IDs who found this issue
  descriptions: string[]   // each reviewer's description
+
+  // Populated by the audit stage (verifyIssues). Absent if audit didn't run.
+  verdict?: 'keep' | 'rewrite' | 'drop' | 'new'
+  body?: string            // Audit-authored post text (replaces description for posting). Plain prose.
+  evidence?: string        // Audit's cited code reference (file:line + quote)
+  auditReason?: string     // For verdict=drop: drop reason category
 }
@@ -4,6 +4,12 @@ import { CliSessionHelper } from './session-helper.js'
 import { preparePromptForCli } from '../utils/prompt-file.js'
 import { withRetry } from '../utils/retry.js'

+// Tools magpie reviewers are pre-approved to use without prompting.
+// Read-only file/code access plus the specific Bash commands needed
+// to inspect PRs (gh), git history, and search (rg). General Bash,
+// Edit, and Write are intentionally NOT included.
+const ALLOWED_TOOLS = 'Read,Grep,Glob,Bash(gh:*),Bash(git:*),Bash(rg:*)'
+
 export class ClaudeCodeProvider implements AIProvider {
  name = 'claude-code'
  private cwd: string
@@ -37,17 +43,29 @@ export class ClaudeCodeProvider implements AIProvider {
    const prompt = this.session.shouldSendFullHistory()
      ? this.session.buildPrompt(messages, systemPrompt)
      : this.session.buildPromptLastOnly(messages)
+    try {
      const result = await withRetry(() => this.runClaude(prompt, systemPrompt, options))
      this.session.markMessageSent()
      return result
+    } catch (err) {
+      this.session.start(this.session.sessionName)
+      throw err
+    }
  }

  async *chatStream(messages: Message[], systemPrompt?: string): AsyncGenerator<string, void, unknown> {
    const prompt = this.session.shouldSendFullHistory()
      ? this.session.buildPrompt(messages, systemPrompt)
      : this.session.buildPromptLastOnly(messages)
+    try {
      yield* this.runClaudeStream(prompt, systemPrompt)
      this.session.markMessageSent()
+    } catch (err) {
+      // Reset to a fresh session ID so the next round doesn't try to --resume
+      // or --session-id a dead/stuck session
+      this.session.start(this.session.sessionName)
+      throw err
+    }
  }

  // Spawn env: clear CLAUDECODE to avoid nested session detection when run from Claude Code
@@ -62,8 +80,7 @@ export class ClaudeCodeProvider implements AIProvider {

    return new Promise((resolve, reject) => {
      // Build args based on session state
-      // Use --dangerously-skip-permissions to allow network access (e.g., gh commands)
-      const args = ['-p', '-', '--dangerously-skip-permissions']
+      const args = ['-p', '-', '--effort', 'xhigh', '--allowed-tools', ALLOWED_TOOLS]
      if (this.cliModel) {
        args.push('--model', this.cliModel)
      }
@@ -125,9 +142,11 @@ export class ClaudeCodeProvider implements AIProvider {
  private async *runClaudeStream(prompt: string, systemPrompt?: string): AsyncGenerator<string, void, unknown> {
    const { prompt: stdinPrompt, cleanup } = preparePromptForCli(prompt)

-    // Build args based on session state
-    // Use --dangerously-skip-permissions to allow network access (e.g., gh commands)
-    const args = ['-p', '-', '--dangerously-skip-permissions']
+    // Build args based on session state.
+    // Use --output-format stream-json --verbose so that tool activity (Read, Bash, etc.)
+    // produces stdout events, preventing the inactivity timeout from killing Claude
+    // while it's actively investigating code.
+    const args = ['-p', '-', '--allowed-tools', ALLOWED_TOOLS, '--effort', 'xhigh', '--output-format', 'stream-json', '--verbose']
    if (this.cliModel) {
      args.push('--model', this.cliModel)
    }
@@ -153,6 +172,7 @@ export class ClaudeCodeProvider implements AIProvider {
    let done = false
    let error: Error | null = null
    let lastActivity = Date.now()
+    let lineBuf = ''

    // Timeout checker - kill if no activity for too long
    const timeoutChecker = this.timeout > 0 ? setInterval(() => {
@@ -173,13 +193,30 @@ export class ClaudeCodeProvider implements AIProvider {

    child.stdout.on('data', (data) => {
      lastActivity = Date.now()
-      const chunk = data.toString()
+      // Parse stream-json: each line is a JSON event.
+      // Every event (tool_use, tool_result, assistant, etc.) updates lastActivity.
+      // We only yield the final result text to the caller.
+      lineBuf += data.toString()
+      let idx
+      while ((idx = lineBuf.indexOf('\n')) !== -1) {
+        const line = lineBuf.slice(0, idx).trim()
+        lineBuf = lineBuf.slice(idx + 1)
+        if (!line) continue
+        try {
+          const event = JSON.parse(line)
+          if (event.type === 'result' && typeof event.result === 'string') {
+            const chunk = event.result
            if (resolveNext) {
              resolveNext({ chunk })
              resolveNext = null
            } else {
              chunks.push(chunk)
            }
+          }
+        } catch {
+          // Not valid JSON, ignore
+        }
+      }
    })

    let stderrOutput = ''
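The line-buffering approach in the hunk above can be illustrated in isolation. This is a minimal sketch, not the provider's actual code: `createNdjsonParser`, `StreamEvent`, and the sample events are hypothetical names introduced for illustration. It assumes, as the diff comments state, that each stdout line is one JSON event and that only `result` events carry the final text.

```typescript
type StreamEvent = { type?: string; result?: unknown }

// Hypothetical helper: buffers raw stdout chunks and emits one parsed
// event per complete line, regardless of how the chunks were split.
function createNdjsonParser(onEvent: (event: StreamEvent) => void) {
  let lineBuf = ''
  return {
    feed(data: string): void {
      lineBuf += data
      let idx: number
      while ((idx = lineBuf.indexOf('\n')) !== -1) {
        const line = lineBuf.slice(0, idx).trim()
        lineBuf = lineBuf.slice(idx + 1)
        if (!line) continue
        let event: unknown
        try {
          event = JSON.parse(line)
        } catch {
          continue // Not valid JSON (e.g. a stray log line): skip it.
        }
        if (event && typeof event === 'object') onEvent(event as StreamEvent)
      }
    },
    // Text that is still waiting for its terminating newline.
    pending(): string {
      return lineBuf
    },
  }
}

const results: string[] = []
const parser = createNdjsonParser((event) => {
  if (event.type === 'result' && typeof event.result === 'string') {
    results.push(event.result)
  }
})

// A JSON event split across two chunks is only parsed once it is complete.
parser.feed('{"type":"assistant"}\n{"type":"res')
parser.feed('ult","result":"LGTM"}\nnot json\n')
```

In the real provider every chunk also refreshes `lastActivity`, so intermediate events keep the inactivity timeout from firing even though they are never yielded to the caller.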
@@ -41,21 +41,38 @@ export class CodexCliProvider implements AIProvider {
    const prompt = this.sessionEnabled && !this.session.shouldSendFullHistory()
      ? this.session.buildPromptLastOnly(messages)
      : this.session.buildPrompt(messages, systemPrompt)
+    try {
      const result = await withRetry(() => this.runCodex(prompt))
      this.session.markMessageSent()
      return result
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
  }

  async *chatStream(messages: Message[], systemPrompt?: string): AsyncGenerator<string, void, unknown> {
    const prompt = this.sessionEnabled && !this.session.shouldSendFullHistory()
      ? this.session.buildPromptLastOnly(messages)
      : this.session.buildPrompt(messages, systemPrompt)
+    try {
      yield* this.runCodexStream(prompt)
      this.session.markMessageSent()
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
  }

  private buildArgs(): string[] {
-    const baseArgs = ['--json', '--dangerously-bypass-approvals-and-sandbox']
+    // workspace-write (not read-only) because codex's read-only sandbox
+    // also blocks network, which breaks `gh pr diff` for reviewers.
+    const baseArgs = [
+      '--json',
+      '--sandbox', 'workspace-write',
+      '-c', 'approval_policy="never"',
+      '-c', 'sandbox_workspace_write.network_access=true',
+    ]
    if (this.cliModel) {
      baseArgs.push('--model', this.cliModel)
    }
@@ -7,6 +7,7 @@ import { ClaudeCodeProvider } from './claude-code.js'
 import { CodexCliProvider } from './codex-cli.js'
 import { GeminiCliProvider } from './gemini-cli.js'
 import { GeminiProvider } from './gemini.js'
+import { OpencodeCliProvider } from './opencode-cli.js'
 import { QwenCodeProvider } from './qwen-code.js'
 import { MiniMaxProvider } from './minimax.js'
 import { MockProvider } from './mock.js'
@@ -14,9 +15,20 @@ import { checkCliBinary } from './cli-check.js'

 // Parse CLI model string: 'gemini-cli:gemini-2.5-pro' → { provider: 'gemini-cli', cliModel: 'gemini-2.5-pro' }
 // Plain 'gemini-cli' → { provider: 'gemini-cli', cliModel: undefined }
-const CLI_PROVIDERS = ['claude-code', 'codex-cli', 'gemini-cli', 'qwen-code'] as const
+const CLI_PROVIDERS = ['claude-code', 'codex-cli', 'gemini-cli', 'opencode-cli', 'qwen-code'] as const
 type CliProviderName = typeof CLI_PROVIDERS[number]

+const OPENROUTER_PREFIX = 'openrouter/'
+const DEFAULT_OPENROUTER_BASE_URL = 'https://openrouter.ai/api/v1'
+
+// OpenRouter model IDs look like 'openrouter/<vendor>/<model>',
+// e.g. 'openrouter/anthropic/claude-3.5-sonnet'. The prefix routes to
+// the OpenAI client (OpenRouter is OpenAI-compatible); the rest is the
+// model ID the OpenRouter API expects.
+function stripOpenRouterPrefix(model: string): string {
+  return model.slice(OPENROUTER_PREFIX.length)
+}
+
 export function parseCliModel(model: string): { provider: string; cliModel?: string } {
  for (const cli of CLI_PROVIDERS) {
    if (model === cli) {
@@ -29,7 +41,16 @@ export function parseCliModel(model: string): { provider: string; cliModel?: str
  return { provider: model }
 }

-export function getProviderForModel(model: string): 'anthropic' | 'openai' | 'google' | 'claude-code' | 'codex-cli' | 'gemini-cli' | 'qwen-code' | 'minimax' | 'mock' {
+/** Check if a model string maps to a CLI-based provider (has tool access / can read files) */
+export function isCliModel(model: string): boolean {
+  const { provider } = parseCliModel(model)
+  return (CLI_PROVIDERS as readonly string[]).includes(provider)
+}
+
+export function getProviderForModel(model: string): 'anthropic' | 'openai' | 'google' | 'claude-code' | 'codex-cli' | 'gemini-cli' | 'opencode-cli' | 'qwen-code' | 'minimax' | 'mock' | 'openrouter' {
+  if (model.startsWith(OPENROUTER_PREFIX)) {
+    return 'openrouter'
+  }
  const { provider } = parseCliModel(model)
  if ((CLI_PROVIDERS as readonly string[]).includes(provider)) {
    return provider as CliProviderName
@@ -86,11 +107,41 @@ export function createProvider(model: string, config: MagpieConfig): AIProvider
    return new QwenCodeProvider({ cliModel })
  }

+  // OpenCode CLI is the one CLI provider that needs upstream API keys —
+  // it routes to OpenRouter (or another provider) for the actual model call.
+  // We forward whatever keys magpie already has configured.
+  if (providerName === 'opencode-cli') {
+    checkCliBinary('opencode', 'OpenCode')
+    return new OpencodeCliProvider({ cliModel, config })
+  }
+
  // Mock provider for debug mode — no API key needed
  if (providerName === 'mock') {
    return new MockProvider()
  }

+  // OpenRouter is OpenAI-compatible: route through the OpenAI client,
+  // strip the 'openrouter/' prefix from the model, and point at OpenRouter's API.
+  if (providerName === 'openrouter') {
+    const openRouterModel = stripOpenRouterPrefix(model).trim()
+    if (!openRouterModel) {
+      throw new Error(`Invalid OpenRouter model "${model}": must include a model ID after "${OPENROUTER_PREFIX}" (e.g. "openrouter/anthropic/claude-3.5-sonnet").`)
+    }
+    const providerConfig = config.providers['openrouter']
+    const apiKey = providerConfig?.api_key || process.env.OPENROUTER_API_KEY || ''
+    if (!apiKey) {
+      throw new Error('OpenRouter API key is required. Set OPENROUTER_API_KEY env var or providers.openrouter.api_key in config.')
+    }
+    // NOTE: the returned provider's `.name` will be 'openai', not 'openrouter',
+    // because OpenRouter requests are dispatched through the OpenAI client.
+    // Logs/UI keyed on provider name will show 'openai' for OpenRouter traffic.
+    return new OpenAIProvider({
+      apiKey,
+      model: openRouterModel,
+      baseURL: providerConfig?.base_url || DEFAULT_OPENROUTER_BASE_URL,
+    })
+  }
+
  // MiniMax uses API key from config or env
  if (providerName === 'minimax') {
    const providerConfig = config.providers['minimax']
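The routing order in the factory hunks above matters: the `openrouter/` prefix is checked before CLI parsing, yet a string such as `opencode-cli:openrouter/...` still goes to the CLI provider because it does not start with the prefix. The sketch below illustrates this under stated assumptions: `routeModel` and the `Route` type are hypothetical, and the `cli:model` split is inferred from the comment on `parseCliModel`, since its full body is not shown in the diff.

```typescript
const CLI_PROVIDERS = ['claude-code', 'codex-cli', 'gemini-cli', 'opencode-cli', 'qwen-code'] as const
const OPENROUTER_PREFIX = 'openrouter/'

type Route =
  | { kind: 'openrouter'; apiModel: string }
  | { kind: 'cli'; provider: string; cliModel?: string }
  | { kind: 'other'; model: string }

// Hypothetical helper combining the prefix check and the CLI parsing.
function routeModel(model: string): Route {
  if (model.startsWith(OPENROUTER_PREFIX)) {
    // Everything after the prefix is the model ID sent to the OpenRouter API.
    const apiModel = model.slice(OPENROUTER_PREFIX.length).trim()
    if (!apiModel) {
      throw new Error(`Invalid OpenRouter model "${model}": missing model ID`)
    }
    return { kind: 'openrouter', apiModel }
  }
  for (const cli of CLI_PROVIDERS) {
    if (model === cli) return { kind: 'cli', provider: cli }
    // Assumed form 'cli:model', per the comment on parseCliModel.
    if (model.startsWith(cli + ':')) {
      return { kind: 'cli', provider: cli, cliModel: model.slice(cli.length + 1) }
    }
  }
  return { kind: 'other', model }
}

const direct = routeModel('openrouter/anthropic/claude-3.5-sonnet')
const viaCli = routeModel('opencode-cli:openrouter/anthropic/claude-sonnet-4')
```

Here `direct` is dispatched through the OpenAI-compatible client with the stripped ID, while `viaCli` keeps the full `openrouter/...` string as the CLI model, which is what the agent harness expects.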
@@ -41,17 +41,27 @@ export class GeminiCliProvider implements AIProvider {
    const prompt = this.sessionEnabled && !this.session.shouldSendFullHistory()
      ? this.session.buildPromptLastOnly(messages)
      : this.session.buildPrompt(messages, systemPrompt)
+    try {
      const result = await withRetry(() => this.runGemini(prompt))
      this.session.markMessageSent()
      return result
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
  }

  async *chatStream(messages: Message[], systemPrompt?: string): AsyncGenerator<string, void, unknown> {
    const prompt = this.sessionEnabled && !this.session.shouldSendFullHistory()
      ? this.session.buildPromptLastOnly(messages)
      : this.session.buildPrompt(messages, systemPrompt)
+    try {
      yield* this.runGeminiStream(prompt)
      this.session.markMessageSent()
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
  }

  private runGemini(prompt: string): Promise<string> {
@@ -135,6 +145,7 @@ export class GeminiCliProvider implements AIProvider {
    let error: Error | null = null
    let lastActivity = Date.now()
    let lineBuf = ''  // Buffer for NDJSON line parsing
+    let stderrBuf = ''

    // Timeout checker - kill if no activity for too long
    const timeoutChecker = this.timeout > 0 ? setInterval(() => {
@@ -146,7 +157,8 @@ export class GeminiCliProvider implements AIProvider {
        }, 5000)
        forceKill.unref()
        done = true
-        error = new Error(`Gemini CLI timed out after ${this.timeout / 1000}s of inactivity`)
+        const stderr = stderrBuf.trim()
+        error = new Error(`Gemini CLI timed out after ${this.timeout / 1000}s of inactivity${stderr ? ': ' + stderr.slice(-500) : ''}`)
        if (resolveNext) {
          resolveNext({ chunk: null })
        }
@@ -186,8 +198,10 @@ export class GeminiCliProvider implements AIProvider {
      }
    })

-    child.stderr.on('data', (_data) => {
+    child.stderr.on('data', (data) => {
      lastActivity = Date.now()  // Activity on stderr also counts
+      stderrBuf += data.toString()
+      if (stderrBuf.length > 10000) stderrBuf = stderrBuf.slice(-10000)
    })

    child.on('close', (code) => {
@@ -208,7 +222,8 @@ export class GeminiCliProvider implements AIProvider {
      }
      done = true
      if (code !== 0 && !error) {
-        error = new Error(`Gemini CLI exited with code ${code}`)
+        const stderr = stderrBuf.trim()
+        error = new Error(`Gemini CLI exited with code ${code}${stderr ? ': ' + stderr.slice(-500) : ''}`)
      }
      if (resolveNext) {
        resolveNext({ chunk: null })
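The bounded stderr capture added in the Gemini hunks above can be shown as a standalone sketch. `createStderrTail` is a hypothetical helper, not part of the PR; it assumes the same policy the diff uses, namely keeping only the most recent characters and appending a trimmed tail to the error message when stderr is non-empty.

```typescript
// Hypothetical helper: keeps only the most recent `maxChars` characters of stderr.
function createStderrTail(maxChars: number) {
  let buf = ''
  return {
    append(data: string): void {
      buf += data
      // Drop the oldest characters so memory stays bounded on chatty CLIs.
      if (buf.length > maxChars) buf = buf.slice(-maxChars)
    },
    // Suffix for an error message; empty when stderr was blank.
    suffix(limit: number): string {
      const stderr = buf.trim()
      return stderr ? ': ' + stderr.slice(-limit) : ''
    },
    size(): number {
      return buf.length
    },
  }
}

const tail = createStderrTail(10)
tail.append('0123456789')
tail.append('ABCDE')
const message = `CLI exited with code 1${tail.suffix(4)}`
```

Taking the tail rather than the head is a deliberate choice in this sketch: the last lines a CLI prints before exiting are usually the ones that explain the failure.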
@@ -0,0 +1,355 @@
+import { spawn } from 'child_process'
+import type { AIProvider, Message, CliProviderOptions, ChatOptions } from './types.js'
+import type { MagpieConfig } from '../config/types.js'
+import { CliSessionHelper } from './session-helper.js'
+import { preparePromptForCli } from '../utils/prompt-file.js'
+import { withRetry } from '../utils/retry.js'
+
+// Read-only tool allowlist for opencode reviewers, mirroring claude-code's
+// ALLOWED_TOOLS. Injected via the OPENCODE_CONFIG_CONTENT env var so we don't
+// touch the user's own opencode.json. With --dangerously-skip-permissions,
+// explicit "deny" entries still block — unspecified categories auto-allow,
+// which keeps us forward-compatible with new opencode tools.
+//
+// IMPORTANT — bash rule order: opencode applies the LAST matching pattern,
+// not the most specific one. The catch-all `'*': 'deny'` MUST come first,
+// followed by the specific allows, or every gh/git/rg call gets denied and
+// opencode drops the bash tool from the model's available tool list entirely.
+const PERMISSION_CONFIG = JSON.stringify({
+  $schema: 'https://opencode.ai/config.json',
+  permission: {
+    read: 'allow',
+    grep: 'allow',
+    glob: 'allow',
+    list: 'allow',
+    todowrite: 'allow',
+    edit: 'deny',
+    task: 'deny',
+    webfetch: 'deny',
+    websearch: 'deny',
+    // Prompts over the size threshold (>100KB) are written to a file by
+    // preparePromptForCli. We pass tmpDir: this.cwd so that file lives inside
+    // --dir <cwd>, which lets external_directory stay denied: a prompt
+    // injection cannot trick the reviewer into reading ~/.ssh, /etc/passwd,
+    // or anything else outside the repo.
+    external_directory: 'deny',
+    bash: {
+      '*': 'deny',
+      'gh *': 'allow',
+      'git *': 'allow',
+      'rg *': 'allow',
+    },
+  },
+})
+
+// Magpie provider key → opencode env var. Forwarded so the user only needs
+// to configure each key once (in magpie's config) rather than also exporting
+// it to opencode's environment.
+const API_KEY_FORWARDS: Array<{ env: string; providerKey: 'openrouter' | 'anthropic' | 'openai' | 'google' }> = [
+  { env: 'OPENROUTER_API_KEY', providerKey: 'openrouter' },
+  { env: 'ANTHROPIC_API_KEY', providerKey: 'anthropic' },
+  { env: 'OPENAI_API_KEY', providerKey: 'openai' },
+  { env: 'GOOGLE_API_KEY', providerKey: 'google' },
+]
+
+export interface OpencodeCliProviderOptions extends CliProviderOptions {
+  /** MagpieConfig is needed so we can forward API keys to opencode's env. */
+  config?: MagpieConfig
+}
+
+export class OpencodeCliProvider implements AIProvider {
+  name = 'opencode-cli'
+  private cwd: string
+  private timeout: number  // ms, 0 = no timeout
+  private cliModel?: string
+  private config?: MagpieConfig
+  private session = new CliSessionHelper()
+  // Like codex-cli: opencode generates its own session id and returns it in
+  // the first response's event stream. We never pre-generate one — that
+  // would risk telling opencode to "continue" a session it has never seen.
+  private sessionEnabled = false
+
+  get sessionId() { return this.session.sessionId }
+
+  constructor(options?: OpencodeCliProviderOptions) {
+    this.cwd = process.cwd()
+    this.timeout = 15 * 60 * 1000  // 15 minutes
+    this.cliModel = options?.cliModel
+    this.config = options?.config
+  }
+
+  setCwd(cwd: string) {
+    this.cwd = cwd
+  }
+
+  startSession(name?: string): void {
+    this.sessionEnabled = true
+    this.session.start(name)
+    this.session.sessionId = undefined  // Captured from the first response, not pre-generated
+  }
+
+  endSession(): void {
+    this.sessionEnabled = false
+    this.session.end()
+  }
+
+  async chat(messages: Message[], systemPrompt?: string, _options?: ChatOptions): Promise<string> {
+    const prompt = this.session.shouldSendFullHistory()
+      ? this.session.buildPrompt(messages, systemPrompt)
+      : this.session.buildPromptLastOnly(messages)
+    try {
+      const result = await withRetry(() => this.runOpencode(prompt))
+      this.session.markMessageSent()
+      return result
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
+  }
+
+  async *chatStream(messages: Message[], systemPrompt?: string): AsyncGenerator<string, void, unknown> {
+    const prompt = this.session.shouldSendFullHistory()
+      ? this.session.buildPrompt(messages, systemPrompt)
+      : this.session.buildPromptLastOnly(messages)
+    try {
+      yield* this.runOpencodeStream(prompt)
+      this.session.markMessageSent()
+    } catch (err) {
+      if (this.sessionEnabled) this.startSession(this.session.sessionName)
+      throw err
+    }
+  }
+
+  private spawnEnv(): NodeJS.ProcessEnv {
+    const env: NodeJS.ProcessEnv = { ...process.env, OPENCODE_CONFIG_CONTENT: PERMISSION_CONFIG }
+    if (this.config) {
+      for (const { env: envKey, providerKey } of API_KEY_FORWARDS) {
+        const pc = this.config.providers[providerKey] as { api_key?: string } | undefined
+        if (pc?.api_key) {
+          env[envKey] = pc.api_key
+        }
+      }
+    }
+    return env
+  }
+
+  private buildArgs(): string[] {
+    // opencode run reads stdin and concatenates with positional args, so we
+    // can deliver the prompt via stdin like the other CLI providers.
+    // --dangerously-skip-permissions auto-allows unspecified categories;
+    // explicit "deny" entries in PERMISSION_CONFIG still block.
+    const args = [
+      'run',
+      '--format', 'json',
+      '--dir', this.cwd,
+      '--dangerously-skip-permissions',
+    ]
+    if (this.cliModel) {
+      args.push('-m', this.cliModel)
+    }
+    // Pass the captured session id on follow-up turns. We never use
+    // --continue (which resumes opencode's globally-last session and would
+    // race when multiple magpie reviewers run concurrently), and we never
+    // pass an unseen id on the first turn (opencode generates the id).
+    if (this.sessionEnabled && this.session.sessionId && !this.session.isFirstMessage) {
+      args.push('--session', this.session.sessionId)
+    }
+    return args
+  }
+
+  // Event schema (verified against opencode 1.15.11):
+  //   {type:"step_start", sessionID:"ses_...", part:{...}}
+  //   {type:"text",       sessionID:"ses_...", part:{type:"text", text:"..."}}
+  // Each model turn emits one consolidated `text` event — no streaming deltas.
+  // Tool-use events are ignored for text extraction.
+  private extractEventText(event: unknown): string {
+    if (!event || typeof event !== 'object') return ''
+    const e = event as { type?: unknown; sessionID?: unknown; part?: { type?: unknown; text?: unknown } }
+
+    if (this.sessionEnabled && !this.session.sessionId && typeof e.sessionID === 'string') {
+      this.session.sessionId = e.sessionID
+    }
+
+    if (e.type === 'text' && e.part?.type === 'text' && typeof e.part.text === 'string') {
+      return e.part.text
+    }
+    return ''
+  }
+
+  private parseJsonOutput(output: string): string {
+    let text = ''
+    for (const line of output.split('\n')) {
+      const trimmed = line.trim()
+      if (!trimmed) continue
+      try {
+        text += this.extractEventText(JSON.parse(trimmed))
+      } catch {
+        // not JSON — ignore
+      }
+    }
+    return text
+  }
+
+  private runOpencode(prompt: string): Promise<string> {
+    // Write the spilled prompt file inside --dir <cwd> so the read tool can
+    // reach it; external_directory: 'deny' would otherwise block /tmp paths.
+    const { prompt: stdinPrompt, cleanup } = preparePromptForCli(prompt, { tmpDir: this.cwd })
+
+    return new Promise((resolve, reject) => {
+      const args = this.buildArgs()
+      const child = spawn('opencode', args, {
+        cwd: this.cwd,
+        stdio: ['pipe', 'pipe', 'pipe'],
+        env: this.spawnEnv(),
+      })
+
+      let output = ''
+      let error = ''
+
+      child.stdout.on('data', (data) => {
+        output += data.toString()
+      })
+
+      child.stderr.on('data', (data) => {
+        error += data.toString()
+      })
+
+      child.on('close', (code) => {
+        cleanup()
+        if (code !== 0) {
+          reject(new Error(`OpenCode CLI exited with code ${code}: ${error}`))
+        } else {
+          resolve(this.parseJsonOutput(output).trim())
+        }
+      })
+
+      child.on('error', (err) => {
+        cleanup()
+        reject(new Error(`Failed to run opencode CLI: ${err.message}`))
+      })
+
+      child.stdin.on('error', () => {})  // Ignore stdin errors (e.g. EPIPE if opencode exits before reading the prompt)
+      child.stdin.write(stdinPrompt)
+      child.stdin.end()
+    })
+  }
+
+  private async *runOpencodeStream(prompt: string): AsyncGenerator<string, void, unknown> {
+    // Write the spilled prompt file inside --dir <cwd> so the read tool can
+    // reach it; external_directory: 'deny' would otherwise block /tmp paths.
+    const { prompt: stdinPrompt, cleanup } = preparePromptForCli(prompt, { tmpDir: this.cwd })
+
+    const args = this.buildArgs()
+    const child = spawn('opencode', args, {
+      cwd: this.cwd,
+      stdio: ['pipe', 'pipe', 'pipe'],
+      env: this.spawnEnv(),
+    })
+
+    const chunks: string[] = []
+    let resolveNext: ((value: { chunk: string | null }) => void) | null = null
+    let done = false
+    let error: Error | null = null
+    let lastActivity = Date.now()
+    let lineBuf = ''
+    let stderrOutput = ''
+
+    const timeoutChecker = this.timeout > 0 ? setInterval(() => {
+      if (Date.now() - lastActivity > this.timeout) {
+        child.kill('SIGTERM')
+        const forceKill = setTimeout(() => {
+          try { child.kill('SIGKILL') } catch {}
+        }, 5000)
+        forceKill.unref()
+        done = true
+        error = new Error(`OpenCode CLI timed out after ${this.timeout / 1000}s of inactivity`)
+        if (resolveNext) {
+          resolveNext({ chunk: null })
+        }
+      }
+    }, 10000) : null
+
+    const pushChunk = (chunk: string) => {
+      if (!chunk) return
+      if (resolveNext) {
+        resolveNext({ chunk })
+        resolveNext = null
+      } else {
+        chunks.push(chunk)
+      }
+    }
+
+    child.stdout.on('data', (data) => {
+      lastActivity = Date.now()
+      lineBuf += data.toString()
+      let idx
+      while ((idx = lineBuf.indexOf('\n')) !== -1) {
+        const line = lineBuf.slice(0, idx).trim()
+        lineBuf = lineBuf.slice(idx + 1)
+        if (!line) continue
+        try {
+          const event = JSON.parse(line) as Record<string, unknown>
+          const piece = this.extractEventText(event)
+          if (piece) pushChunk(piece)
+        } catch {
+          // Not JSON, ignore
+        }
+      }
+    })
+
+    child.stderr.on('data', (data) => {
+      lastActivity = Date.now()
+      stderrOutput += data.toString()
+    })
+
+    child.on('close', (code) => {
+      cleanup()
+      if (timeoutChecker) clearInterval(timeoutChecker)
+      if (lineBuf.trim()) {
+        try {
+          const event = JSON.parse(lineBuf.trim()) as Record<string, unknown>
+          const piece = this.extractEventText(event)
+          if (piece) pushChunk(piece)
+        } catch {}
+      }
+      done = true
+      if (code !== 0 && !error) {
+        error = new Error(`OpenCode CLI exited with code ${code}${stderrOutput.trim() ? ': ' + stderrOutput.trim().slice(-500) : ''}`)
+      }
+      if (resolveNext) {
+        resolveNext({ chunk: null })
+      }
+    })
+
+    child.on('error', (err) => {
+      cleanup()
+      if (timeoutChecker) clearInterval(timeoutChecker)
+      done = true
+      error = new Error(`Failed to run opencode CLI: ${err.message}`)
+      if (resolveNext) {
+        resolveNext({ chunk: null })
+      }
+    })
+
+    child.stdin.on('error', () => {})  // Ignore stdin errors (e.g. EPIPE if opencode exits before reading the prompt)
+    child.stdin.write(stdinPrompt)
+    child.stdin.end()
+
+    while (!done || chunks.length > 0) {
+      if (chunks.length > 0) {
+        yield chunks.shift()!
+      } else if (!done) {
+        const result = await new Promise<{ chunk: string | null }>((resolve) => {
+          resolveNext = resolve
+        })
+        if (result.chunk !== null) {
+          yield result.chunk
+        }
+      }
+    }
+
+    if (error) {
+      throw error
+    }
+  }
+}
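The bash rule-order note in `PERMISSION_CONFIG` is easier to see with a toy evaluator. The sketch below is a simplified model of the "last matching pattern wins" behavior described in that comment, not opencode's actual implementation: `patternToRegExp` and `decide` are hypothetical, and the only wildcard assumed is `*` matching any run of characters.

```typescript
type Action = 'allow' | 'deny'

// Turn a pattern where `*` matches any run of characters into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
  return new RegExp('^' + escaped.replace(/\\\*/g, '.*') + '$')
}

// Assumed semantics: rules are checked in insertion order and the LAST match wins.
function decide(rules: Record<string, Action>, command: string): Action | undefined {
  let decision: Action | undefined
  for (const [pattern, action] of Object.entries(rules)) {
    if (patternToRegExp(pattern).test(command)) decision = action
  }
  return decision
}

// Catch-all first, specific allows after: the order used in PERMISSION_CONFIG.
const catchAllFirst: Record<string, Action> = {
  '*': 'deny',
  'gh *': 'allow',
  'git *': 'allow',
  'rg *': 'allow',
}

// Catch-all last: it overrides every earlier allow.
const catchAllLast: Record<string, Action> = {
  'gh *': 'allow',
  '*': 'deny',
}

const allowedGh = decide(catchAllFirst, 'gh pr diff 12')
const deniedRm = decide(catchAllFirst, 'rm -rf build')
const shadowedGh = decide(catchAllLast, 'gh pr diff 12')
```

Under this model `allowedGh` is `'allow'` and `deniedRm` is `'deny'`, whereas `shadowedGh` is `'deny'` because the trailing catch-all matches last, which is the failure mode the comment warns about.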
@@ -37,17 +37,27 @@ export class QwenCodeProvider implements AIProvider {
    const prompt = this.session.shouldSendFullHistory()
      ? this.session.buildPrompt(messages, systemPrompt)
      : this.session.buildPromptLastOnly(messages)
+    try {
      const result = await withRetry(() => this.runQwen(prompt, systemPrompt, options))
      this.session.markMessageSent()
      return result
+    } catch (err) {
+      this.session.start(this.session.sessionName)
+      throw err
+    }
  }

  async *chatStream(messages: Message[], systemPrompt?: string): AsyncGenerator<string, void, unknown> {
    const prompt = this.session.shouldSendFullHistory()
      ? this.session.buildPrompt(messages, systemPrompt)
      : this.session.buildPromptLastOnly(messages)
+    try {
      yield* this.runQwenStream(prompt, systemPrompt)
      this.session.markMessageSent()
+    } catch (err) {
+      this.session.start(this.session.sessionName)
+      throw err
+    }
  }

  private runQwen(prompt: string, systemPrompt?: string, options?: ChatOptions): Promise<string> {
@@ -23,14 +23,25 @@ export interface PreparedPrompt {
  cleanup: () => void
 }

-export function preparePromptForCli(prompt: string): PreparedPrompt {
+export interface PreparePromptOptions {
+  /**
+   * Directory to materialize the prompt file in when it exceeds the size
+   * threshold. Defaults to os.tmpdir(). Override when the consuming CLI
+   * cannot read outside a specific root — e.g. opencode-cli denies reads
+   * outside its --dir, so the prompt file must live inside the repo.
+   */
+  tmpDir?: string
+}
+
+export function preparePromptForCli(prompt: string, options?: PreparePromptOptions): PreparedPrompt {
  if (Buffer.byteLength(prompt, 'utf-8') <= PROMPT_SIZE_THRESHOLD) {
    return { prompt, cleanup: () => {} }
  }

  registerExitHandler()

-  const tmpFile = join(tmpdir(), `magpie_prompt_${Date.now()}_${Math.random().toString(36).slice(2)}.txt`)
+  const dir = options?.tmpDir ?? tmpdir()
+  const tmpFile = join(dir, `magpie_prompt_${Date.now()}_${Math.random().toString(36).slice(2)}.txt`)
  writeFileSync(tmpFile, prompt, 'utf-8')
  activeTempFiles.add(tmpFile)

@@ -102,7 +102,7 @@ describe('deduplicateIssues', () => {
 })

 describe('parseFocusAreas', () => {
-  it('should extract focus areas from analysis', () => {
+  it('should extract focus areas from English analysis', () => {
    const analysis = `## What this PR does
 Some analysis here.

@@ -114,6 +114,37 @@ Some analysis here.
    const focus = parseFocusAreas(analysis)
    expect(focus).toHaveLength(3)
    expect(focus[0]).toContain('Security')
+    expect(focus[1]).toContain('Performance')
+    expect(focus[2]).toContain('Error handling')
+  })
+
+  it('should extract focus areas from Chinese analysis', () => {
+    const analysis = `## 这个 PR 做了什么
+一些分析内容。
+
+## 建议的 review 重点
+- 安全性：登录处理函数的输入校验
+- 性能：新增的数据库查询
+- 错误处理：异步路径缺少 try/catch`
+
+    const focus = parseFocusAreas(analysis)
+    expect(focus).toHaveLength(3)
+    expect(focus[0]).toContain('安全性')
+    expect(focus[1]).toContain('性能')
+    expect(focus[2]).toContain('错误处理')
+  })
+
+  it('should support bold-heading variant with Chinese title', () => {
+    const analysis = `**建议的 review 重点**
+1. src/auth.ts 的鉴权改动
+2. 新增的并发逻辑
+
+其他段落...`
+
+    const focus = parseFocusAreas(analysis)
+    expect(focus).toHaveLength(2)
+    expect(focus[0]).toContain('src/auth.ts')
+    expect(focus[1]).toContain('并发')
  })

  it('should return empty array if no focus section', () => {
@@ -63,4 +63,26 @@ describe('DebateOrchestrator resilience', () => {
    await expect(orchestrator.runStreaming('test', 'Review this code'))
      .rejects.toThrow('All reviewers failed')
  })
+
+  it('should abort immediately when failFast is enabled and any reviewer fails', async () => {
+    const goodProvider = makeProvider('good', 'LGTM, no issues found.')
+    const badProvider = makeFailingProvider('bad')
+
+    const reviewers = [
+      makeReviewer('good-reviewer', goodProvider),
+      makeReviewer('bad-reviewer', badProvider),
+    ]
+    const summarizer = makeReviewer('summarizer', makeProvider('sum', 'Final conclusion.'))
+    const analyzer = makeReviewer('analyzer', makeProvider('analyzer', 'Analysis done.'))
+
+    const orchestrator = new DebateOrchestrator(reviewers, summarizer, analyzer, {
+      maxRounds: 1,
+      interactive: false,
+      checkConvergence: false,
+      failFast: true,
+    })
+
+    await expect(orchestrator.runStreaming('test', 'Review this code'))
+      .rejects.toThrow(/bad-reviewer.*fail-fast/)
+  })
 })
@@ -50,7 +50,6 @@ describe('DebateOrchestrator', () => {
    expect(result.prNumber).toBe('123')
    expect(result.analysis).toBe('PR analysis result')
    expect(result.messages.length).toBe(4) // 2 reviewers * 2 rounds
-    expect(result.summaries.length).toBe(2)
    expect(result.finalConclusion).toBe('Final conclusion')
  })

@@ -1,9 +1,13 @@
 // tests/providers/factory.test.ts
-import { describe, it, expect } from 'vitest'
+import { describe, it, expect, vi, afterEach } from 'vitest'
 import { createProvider, getProviderForModel } from '../../src/providers/factory.js'
 import type { MagpieConfig } from '../../src/config/types.js'

 describe('Provider Factory', () => {
+  afterEach(() => {
+    vi.unstubAllEnvs()
+  })
+
  const mockConfig: MagpieConfig = {
    providers: {
      anthropic: { api_key: 'ant-key' },
@@ -39,6 +43,17 @@ describe('Provider Factory', () => {
    it('should return codex-cli for codex-cli model', () => {
      expect(getProviderForModel('codex-cli')).toBe('codex-cli')
    })
+
+    it('should return opencode-cli for opencode-cli model (with and without :model suffix)', () => {
+      expect(getProviderForModel('opencode-cli')).toBe('opencode-cli')
+      expect(getProviderForModel('opencode-cli:openrouter/anthropic/claude-sonnet-4')).toBe('opencode-cli')
+    })
+
+    it('should return openrouter for openrouter/ prefixed models', () => {
+      expect(getProviderForModel('openrouter/anthropic/claude-3.5-sonnet')).toBe('openrouter')
+      expect(getProviderForModel('openrouter/meta-llama/llama-3-70b-instruct')).toBe('openrouter')
+      expect(getProviderForModel('openrouter/openai/gpt-4o')).toBe('openrouter')
+    })
  })

  describe('createProvider', () => {
@@ -76,6 +91,16 @@ describe('Provider Factory', () => {
      expect(provider.name).toBe('codex-cli')
    })

+    it('should create opencode-cli provider with no extra config', () => {
+      const provider = createProvider('opencode-cli', mockConfig)
+      expect(provider.name).toBe('opencode-cli')
+    })
+
+    it('should create opencode-cli provider with a model suffix', () => {
+      const provider = createProvider('opencode-cli:openrouter/anthropic/claude-sonnet-4', mockConfig)
+      expect(provider.name).toBe('opencode-cli')
+    })
+
    it('should pass base_url through to API providers', () => {
      const configWithBaseUrl: MagpieConfig = {
        ...mockConfig,
@@ -95,5 +120,28 @@ describe('Provider Factory', () => {
      const provider = createProvider('claude-sonnet-4-20250514', mockConfig)
      expect(provider.name).toBe('anthropic')
    })
+
+    it('should create openrouter provider (via openai client) with api key from config', () => {
+      const configWithOpenrouter: MagpieConfig = {
+        ...mockConfig,
+        providers: { ...mockConfig.providers, openrouter: { api_key: 'or-key' } }
+      }
+      const provider = createProvider('openrouter/anthropic/claude-3.5-sonnet', configWithOpenrouter)
+      // OpenRouter is routed through the OpenAI client, so .name === 'openai'
+      expect(provider.name).toBe('openai')
+    })
+
+    it('should pick up OPENROUTER_API_KEY env var when config is absent', () => {
+      vi.stubEnv('OPENROUTER_API_KEY', 'env-or-key')
+      const provider = createProvider('openrouter/anthropic/claude-3.5-sonnet', mockConfig)
+      expect(provider.name).toBe('openai')
+    })
+
+    it('should throw when OpenRouter has no api key configured', () => {
+      vi.stubEnv('OPENROUTER_API_KEY', '')
+      expect(() =>
+        createProvider('openrouter/anthropic/claude-3.5-sonnet', mockConfig)
+      ).toThrow(/OpenRouter API key/)
+    })
  })
 })
@@ -1,15 +1,21 @@
 import { describe, it, expect, vi } from 'vitest'
 import { OpenAIProvider } from '../../src/providers/openai'
+import { createProvider } from '../../src/providers/factory'
+import type { MagpieConfig } from '../../src/config/types'

 let lastConstructorOptions: Record<string, unknown> = {}
+let lastCreateOptions: Record<string, unknown> = {}

 vi.mock('openai', () => ({
  default: class MockOpenAI {
    chat = {
      completions: {
-        create: vi.fn().mockResolvedValue({
+        create: vi.fn().mockImplementation((opts: Record<string, unknown>) => {
+          lastCreateOptions = opts
+          return Promise.resolve({
            choices: [{ message: { content: 'Mock response' } }]
          })
+        })
      }
    }
    constructor(options: Record<string, unknown>) {
@@ -40,3 +46,49 @@ describe('OpenAIProvider', () => {
    expect(lastConstructorOptions.baseURL).toBeUndefined()
  })
 })
+
+describe('OpenRouter via OpenAI client', () => {
+  const baseConfig: MagpieConfig = {
+    providers: {},
+    defaults: { max_rounds: 3, output_format: 'markdown' },
+    reviewers: {},
+    summarizer: { model: 'openrouter/anthropic/claude-3.5-sonnet', prompt: '' },
+    analyzer: { model: 'openrouter/anthropic/claude-3.5-sonnet', prompt: '' }
+  }
+
+  it('strips the openrouter/ prefix from the model and defaults baseURL to OpenRouter', async () => {
+    const config: MagpieConfig = {
+      ...baseConfig,
+      providers: { openrouter: { api_key: 'or-key' } }
+    }
+    const provider = createProvider('openrouter/anthropic/claude-3.5-sonnet', config)
+    expect(lastConstructorOptions.apiKey).toBe('or-key')
+    expect(lastConstructorOptions.baseURL).toBe('https://openrouter.ai/api/v1')
+
+    // Invoke chat() so the stripped model reaches chat.completions.create
+    await provider.chat([{ role: 'user', content: 'hi' }])
+    expect(lastCreateOptions.model).toBe('anthropic/claude-3.5-sonnet')
+  })
+
+  it('honors a custom base_url from config and forwards the stripped model', async () => {
+    const config: MagpieConfig = {
+      ...baseConfig,
+      providers: {
+        openrouter: { api_key: 'or-key', base_url: 'https://my-openrouter-proxy.example.com/v1' }
+      }
+    }
+    const provider = createProvider('openrouter/meta-llama/llama-3-70b-instruct', config)
+    expect(lastConstructorOptions.baseURL).toBe('https://my-openrouter-proxy.example.com/v1')
+
+    await provider.chat([{ role: 'user', content: 'hi' }])
+    expect(lastCreateOptions.model).toBe('meta-llama/llama-3-70b-instruct')
+  })
+
+  it('throws when the model is just "openrouter/" with no ID after it', () => {
+    const config: MagpieConfig = {
+      ...baseConfig,
+      providers: { openrouter: { api_key: 'or-key' } }
+    }
+    expect(() => createProvider('openrouter/', config)).toThrow(/must include a model ID/)
+  })
+})
@@ -0,0 +1,109 @@
+// Verifies the JSON event parser against captured opencode 1.15.11 output.
+// The schema is internal to opencode; if it changes, these tests fail loudly
+// rather than the provider silently returning empty reviewer responses.
+import { describe, it, expect } from 'vitest'
+import { OpencodeCliProvider } from '../../src/providers/opencode-cli.js'
+
+// Real captures from `opencode run --format json -m openrouter/openai/gpt-4o-mini`.
+const STEP_START_EVENT = '{"type":"step_start","timestamp":1780089625130,"sessionID":"ses_abc","part":{"id":"prt_1","messageID":"msg_1","sessionID":"ses_abc","type":"step-start"}}'
+const TEXT_EVENT = '{"type":"text","timestamp":1780089625396,"sessionID":"ses_abc","part":{"id":"prt_2","messageID":"msg_1","sessionID":"ses_abc","type":"text","text":"ok","time":{"start":1780089625131,"end":1780089625393}}}'
+
+// Access private parser methods. They're pure logic and worth testing directly;
+// extracting them into a separate module just for visibility would be churn.
+type ParserHandle = {
+  extractEventText(event: unknown): string
+  parseJsonOutput(output: string): string
+}
+function asParser(p: OpencodeCliProvider): ParserHandle {
+  return p as unknown as ParserHandle
+}
+
+describe('OpencodeCliProvider parser', () => {
+  describe('extractEventText', () => {
+    it('returns the text from a text-part event', () => {
+      const parser = asParser(new OpencodeCliProvider())
+      expect(parser.extractEventText(JSON.parse(TEXT_EVENT))).toBe('ok')
+    })
+
+    it('returns empty for a step_start event', () => {
+      const parser = asParser(new OpencodeCliProvider())
+      expect(parser.extractEventText(JSON.parse(STEP_START_EVENT))).toBe('')
+    })
+
+    it('returns empty for an unknown event type', () => {
+      const parser = asParser(new OpencodeCliProvider())
+      expect(parser.extractEventText({ type: 'tool.use', sessionID: 'ses_abc', tool: 'read' })).toBe('')
+    })
+
+    it('returns empty for non-object inputs', () => {
+      const parser = asParser(new OpencodeCliProvider())
+      expect(parser.extractEventText(null)).toBe('')
+      expect(parser.extractEventText('text')).toBe('')
+      expect(parser.extractEventText(42)).toBe('')
+    })
+  })
+
+  describe('parseJsonOutput', () => {
+    it('concatenates text across multiple events, ignoring others', () => {
+      const provider = new OpencodeCliProvider()
+      const output = [
+        STEP_START_EVENT,
+        '{"type":"text","sessionID":"ses_abc","part":{"type":"text","text":"hello "}}',
+        '{"type":"tool.use","sessionID":"ses_abc"}',
+        '{"type":"text","sessionID":"ses_abc","part":{"type":"text","text":"world"}}',
+      ].join('\n')
+      expect(asParser(provider).parseJsonOutput(output)).toBe('hello world')
+    })
+
+    it('skips blank lines and malformed JSON', () => {
+      const provider = new OpencodeCliProvider()
+      const output = [
+        '',
+        'not valid json',
+        TEXT_EVENT,
+        '{ partial',
+        '',
+      ].join('\n')
+      expect(asParser(provider).parseJsonOutput(output)).toBe('ok')
+    })
+
+    it('returns empty when no text events are present', () => {
+      const provider = new OpencodeCliProvider()
+      expect(asParser(provider).parseJsonOutput(STEP_START_EVENT)).toBe('')
+    })
+  })
+
+  describe('session id capture', () => {
+    it('does not capture sessionID when sessions are disabled', () => {
+      const provider = new OpencodeCliProvider()
+      asParser(provider).parseJsonOutput(TEXT_EVENT)
+      expect(provider.sessionId).toBeUndefined()
+    })
+
+    it('captures sessionID from the first event after startSession', () => {
+      const provider = new OpencodeCliProvider()
+      provider.startSession('reviewer-1')
+      expect(provider.sessionId).toBeUndefined()  // not pre-generated
+      asParser(provider).parseJsonOutput(STEP_START_EVENT)
+      expect(provider.sessionId).toBe('ses_abc')
+    })
+
+    it('does not overwrite a captured sessionID with a later event', () => {
+      const provider = new OpencodeCliProvider()
+      provider.startSession('reviewer-1')
+      asParser(provider).parseJsonOutput(STEP_START_EVENT)
+      const laterEvent = '{"type":"text","sessionID":"ses_different","part":{"type":"text","text":"x"}}'
+      asParser(provider).parseJsonOutput(laterEvent)
+      expect(provider.sessionId).toBe('ses_abc')
+    })
+
+    it('clears sessionID on endSession', () => {
+      const provider = new OpencodeCliProvider()
+      provider.startSession('reviewer-1')
+      asParser(provider).parseJsonOutput(TEXT_EVENT)
+      expect(provider.sessionId).toBe('ses_abc')
+      provider.endSession()
+      expect(provider.sessionId).toBeUndefined()
+    })
+  })
+})
@@ -1,5 +1,7 @@
 import { describe, it, expect } from 'vitest'
-import { existsSync } from 'fs'
+import { existsSync, mkdtempSync, rmSync } from 'fs'
+import { tmpdir } from 'os'
+import { join, dirname } from 'path'
 import { preparePromptForCli } from '../../src/utils/prompt-file.js'

 describe('preparePromptForCli', () => {
@@ -23,4 +25,23 @@ describe('preparePromptForCli', () => {
    result.cleanup()
    expect(existsSync(tmpPath)).toBe(false)
  })
+
+  it('writes the spilled prompt into the supplied tmpDir', () => {
+    const customDir = mkdtempSync(join(tmpdir(), 'magpie-tmpdir-test-'))
+    try {
+      const largePrompt = 'y'.repeat(200 * 1024)
+      const result = preparePromptForCli(largePrompt, { tmpDir: customDir })
+
+      const pathMatch = result.prompt.match(/\/.*magpie_prompt_\S+/)
+      expect(pathMatch).toBeTruthy()
+      const tmpPath = pathMatch![0]
+      expect(dirname(tmpPath)).toBe(customDir)
+      expect(existsSync(tmpPath)).toBe(true)
+
+      result.cleanup()
+      expect(existsSync(tmpPath)).toBe(false)
+    } finally {
+      rmSync(customDir, { recursive: true, force: true })
+    }
+  })
 })
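
The parser behaviour pinned down by the `OpencodeCliProvider` tests above can be approximated by a short stand-alone sketch. It is written from the test expectations only and is not the provider's actual code. The `isRecord` helper and the `{ text, sessionId }` return shape are assumptions: the tests show the real `parseJsonOutput` returning a string and keeping the session ID on the provider instance, and only after `startSession` has been called.

```typescript
// Hypothetical sketch of a newline-delimited JSON event parser.
// Names and return shape are illustrative, not taken from the provider.

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Returns the text carried by a "text" event, or '' for anything else
// (other event types, non-objects, missing or non-string text).
function extractEventText(event: unknown): string {
  if (!isRecord(event) || event.type !== 'text') return ''
  const part = event.part
  if (!isRecord(part) || part.type !== 'text') return ''
  return typeof part.text === 'string' ? part.text : ''
}

// Walks the output line by line, concatenating text events and remembering
// the first sessionID seen. Blank and malformed lines are skipped.
function parseJsonOutput(output: string): { text: string; sessionId?: string } {
  let text = ''
  let sessionId: string | undefined
  for (const line of output.split('\n')) {
    const trimmed = line.trim()
    if (trimmed === '') continue
    let event: unknown
    try {
      event = JSON.parse(trimmed)
    } catch {
      continue // malformed line: ignore it rather than fail the whole run
    }
    if (sessionId === undefined && isRecord(event) && typeof event.sessionID === 'string') {
      sessionId = event.sessionID // first ID wins; later events do not overwrite it
    }
    text += extractEventText(event)
  }
  return { text, sessionId }
}
```

Skipping unparseable lines keeps a single stray log line from discarding an otherwise valid reviewer response, which matches the "skips blank lines and malformed JSON" test.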
Author	SHA1	Message	Date
tgrosinger	2163ea45d2	OpenCode: Add OpenCode as a new provider The OpenCode provider allows using a variety of models with an agent harness that can gather more information from the codebase as required (like with claude-code, codex, or gemini-cli). This is an alternative to using OpenRouter directly, where the api provider is more like a chatbot and cannot gather any additional context beyond what was handed to it.	2026-05-29 16:19:13 -07:00
tgrosinger	e4790ac77e	Allow specifying tmp dir when preparing prompt	2026-05-29 16:19:13 -07:00
tgrosinger	f642e58070	Claude: Use default xhigh effort With opus-4.8, Claude defaults to "high". Bump up one level for review.	2026-05-28 10:24:02 -07:00
tgrosinger	d666f7e08b	Codex: Restrict permissions	2026-05-28 10:22:55 -07:00
tgrosinger	9e7989671a	Add a flag to disable jokes	2026-05-28 10:22:53 -07:00
tgrosinger	a8578beacd	OpenRouter: Add OpenRouter as a new provider	2026-05-28 10:22:51 -07:00
tgrosinger	823333a4f5	Claude: Remove dangerously-skip-permissions Instead, hard-code a list of allowed tools for claude that gives it general read access.	2026-05-27 20:26:09 -07:00
Li Liu	cafd72bcd5	fix: claude-code provider explicitly passes --effort max settings.json effortLevel="max" gets silently demoted to "xhigh" by the schema validator; the CLI flag form is honored correctly. Pass --effort max explicitly so every claude-code invocation (reviewer / analyzer / summarizer / audit) actually runs at the real max effort tier rather than the demoted xhigh tier from settings.json fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 00:20:32 +00:00
Li Liu	d2ec538dbf	feat: audit reads house-rules from ~/.magpie/house-rules/<owner>_<repo>.md Per-repo conventions now live at a stable user-level path instead of being read from cwd. Audit extracts owner/repo from the PR URL in taskPrompt and looks up ~/.magpie/house-rules/<owner>_<repo>.md. Works for both bot mode and CLI mode without anyone needing to stage files into the worktree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 00:07:39 +00:00
Li Liu	30be792070	fix: assign verifyIssues return value back to parsedIssues	2026-05-26 23:23:11 +00:00
Li Liu	e3fd28c0f0	feat: omniscient audit + tightened reviewer/structurizer/analyzer prompts Major changes: 1. Audit (verifyIssues) rewrite — now THE final judge instead of a severity-recalibrator: - Inputs structured issues + Task line with PR URL (audit fetches diff itself via `gh pr diff` + Read/Grep/Glob). Does NOT consume reviewer chat or pre-stuffed diff. - Output schema extended: verdict (keep/rewrite/drop/new), body, evidence, reason. - Can DELETE false positives (not just downgrade), REWRITE weak descriptions, ADD missed issues — especially cross-file pattern repetition. - Optional .magpie-house-rules.md picked up from cwd as authoritative repo conventions. - New config block `audit:` with claude-opus-4-7 + max effort by default. 2. Reviewer prompts (Round 1 + Round 2): - Add severity vocabulary at reviewer stage (was only at structurizer before). - Add reverse rubric: do NOT report build script polish, missing comments, forward- compat hypotheticals, style preferences, theoretical-but-impossible cases. - Require file:line + code quote + failure scenario for every issue. - Drop "Review EVERY file / don't stop early" — brevity over completeness. - Round 2: drop "find what others MISSED" anti-pattern; agreeing is fine. 3. Structurizer: - line field now REQUIRED (drop issues that can't be anchored to a hunk line). - Description must capture WHY + FAILURE scenario + FIX (so audit has basis to verify). - Drop "STRICT — choose LOWER" severity bias. 4. Analyzer: add 6th "建议的 review 重点" ("Suggested review focus") section; parseFocusAreas now matches English + Chinese headings, with-/no-space, bold variant; handles `-` `*` `•` `·` ①-⑳ `1.`/`1、`/`1)` bullets. 5. Convergence judge: fix parse bug (verdict swallowed by trailing punctuation); explicit one-word verdict format constraint. Schema additions: - MergedIssue: verdict, body, evidence, auditReason - MagpieConfig: audit?: ReviewerConfig Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 22:34:33 +00:00
Li Liu	6862947368	test: drop obsolete summaries assertion after summary-step removal The per-reviewer summary step was removed in `0f03726`, dropping the summaries field from DebateResult, but this test still asserted on it and failed. Remove the stale assertion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 14:30:06 -07:00
Li Liu	629ed8b00e	feat: add --fail-fast option to abort review/discuss on any reviewer failure By default the orchestrator is resilient: a single reviewer (or context gatherer) failure is logged and the round continues with the survivors, aborting only when all reviewers fail. The new --fail-fast flag flips to strict mode — any reviewer or context-gathering failure re-throws immediately and terminates the whole flow. Wired through the review and discuss commands via OrchestratorOptions.failFast, with a regression test and README docs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-26 14:30:06 -07:00
Li Liu	da7097c1b6	fix: use stream-json output to prevent false inactivity timeout Claude CLI in -p mode only outputs text to stdout when generating the final response. During tool-heavy reviews (reading files, running commands), no stdout/stderr is produced, causing the 900s inactivity timeout to kill actively-working Claude processes. Switch runClaudeStream to --output-format stream-json --verbose, which emits JSON events for every tool call, tool result, and assistant message. This keeps lastActivity alive during tool execution. The final result text is extracted from the "result" event. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-05-06 12:43:28 +00:00
Li Liu	20d5434e13	fix: reset CLI session on error to prevent stale session reuse When chat/chatStream throws (timeout, crash, etc.), the session ID was left intact, causing subsequent rounds to --resume a dead session and fail with "Session ID already in use". Now all 4 CLI providers reset to a fresh session ID on error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-28 03:04:41 +00:00
Li Liu	577121675c	docs: update README for code-aware review pipeline Reflect the new architecture: code-aware reviewers, integrated verify+audit, multi-language context gathering, --no-conclusion flag, and updated review dimensions (compatibility, extensibility, feature interaction). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 11:01:51 +00:00
Li Liu	afaa4d8f90	feat: major review pipeline overhaul — code-aware reviewers, integrated verify+audit - Reviewers now fetch diff and read code themselves (CLI providers) instead of receiving pre-embedded diff text. Enables verification of issues against actual code context during review. - Merge audit into magpie as verifyIssues() with Read/Grep/Glob tools, replacing the separate downstream audit step in li-bot. - Add --no-conclusion flag to skip summarize step (bot mode). - Context gatherer: support Go/C++/Proto/Python/Java/Scala symbol extraction and multi-language grep (was JS/TS only). - Structurizer: standardize categories to 12 enums, add strict severity definitions, simplify description template for direct GitHub posting. - Add isCliModel() helper to detect CLI vs API providers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 10:59:19 +00:00
Li Liu	a28009101b	fix: capture gemini CLI stderr for diagnostics Stream mode was discarding stderr (_data), making it impossible to diagnose exit code 1 failures. Now buffers stderr (capped at 10KB) and appends the last 500 chars to error messages on crash or timeout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-24 06:29:13 +00:00
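
The house-rules lookup described in commit d2ec538dbf can be pictured with a small sketch. It assumes the task prompt contains a GitHub PR URL of the form `https://github.com/<owner>/<repo>/pull/<n>`. The function name `houseRulesPath` and its `home` parameter are hypothetical, and the real audit code may extract the owner and repo differently.

```typescript
import { homedir } from 'node:os'
import { join } from 'node:path'

// Hypothetical helper: derives ~/.magpie/house-rules/<owner>_<repo>.md
// from the first GitHub PR URL found in the task prompt.
// Returns undefined when no PR URL is present.
function houseRulesPath(taskPrompt: string, home: string = homedir()): string | undefined {
  const match = taskPrompt.match(/github\.com\/([^\/\s]+)\/([^\/\s]+)\/pull\/\d+/)
  if (!match) return undefined
  const [, owner, repo] = match
  return join(home, '.magpie', 'house-rules', `${owner}_${repo}.md`)
}
```

Because the path depends only on the PR URL and the user's home directory, the same file is found whether the tool runs from a worktree or from a bot process, which is the motivation the commit message gives for moving away from a cwd-relative file.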