The OpenCode provider allows using a variety of models with an agent
harness that can gather more information from the codebase as required
(like with claude-code, codex, or gemini-cli).
This is an alternative to using OpenRouter directly, where the API
provider behaves more like a chatbot and cannot gather any additional
context beyond what was handed to it.
settings.json effortLevel="max" gets silently demoted to "xhigh" by the
schema validator; the CLI flag form is honored correctly. Pass --effort max
explicitly so every claude-code invocation (reviewer / analyzer / summarizer /
audit) actually runs at the max effort tier rather than the demoted
xhigh tier produced by the settings.json fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-repo conventions now live at a stable user-level path instead of being
read from cwd. Audit extracts owner/repo from the PR URL in taskPrompt and
looks up ~/.magpie/house-rules/<owner>_<repo>.md. Works for both bot mode
and CLI mode without anyone needing to stage files into the worktree.
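A minimal sketch of the lookup described above, assuming the PR URL appears verbatim in taskPrompt. The function name, the regex, and the homeDir parameter are illustrative assumptions, not the actual audit code:

```typescript
// Hypothetical helper: derive the house-rules path from a PR URL in the prompt.
// Returns null when the prompt contains no GitHub PR URL.
function houseRulesPath(taskPrompt: string, homeDir: string): string | null {
  const match = taskPrompt.match(/github\.com\/([^\/\s]+)\/([^\/\s]+)\/pull\/\d+/);
  if (!match) return null;
  const owner = match[1];
  const repo = match[2];
  // Mirrors the ~/.magpie/house-rules/<owner>_<repo>.md layout from the message.
  return `${homeDir}/.magpie/house-rules/${owner}_${repo}.md`;
}
```

Because the path depends only on the URL and the home directory, the same lookup should behave identically in bot mode and CLI mode regardless of cwd.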
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The per-reviewer summary step was removed in 0f03726, dropping the
summaries field from DebateResult, but this test still asserted on it
and failed. Remove the stale assertion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
By default the orchestrator is resilient: a single reviewer (or context
gatherer) failure is logged and the round continues with the survivors,
aborting only when all reviewers fail.
The new --fail-fast flag switches to strict mode: any reviewer or
context-gathering failure is re-thrown immediately and terminates the
whole flow. It is wired through the review and discuss commands via
OrchestratorOptions.failFast, with a regression test and README docs.
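The two modes can be sketched roughly as follows. Reviewer, RoundOutcome, and runRound are hypothetical names for illustration; only the failFast option itself comes from this change:

```typescript
interface Reviewer {
  id: string;
  run: () => Promise<string>;
}

interface RoundOutcome {
  reviews: { id: string; text: string }[];
  failures: { id: string; error: unknown }[];
}

// Runs all reviewers in parallel. In resilient mode a failure is recorded and
// the round continues; in fail-fast mode the first failure rejects the round.
async function runRound(reviewers: Reviewer[], failFast: boolean): Promise<RoundOutcome> {
  const outcome: RoundOutcome = { reviews: [], failures: [] };
  await Promise.all(
    reviewers.map(async (reviewer) => {
      try {
        outcome.reviews.push({ id: reviewer.id, text: await reviewer.run() });
      } catch (error) {
        if (failFast) throw error; // strict mode: re-throw immediately
        outcome.failures.push({ id: reviewer.id, error });
      }
    }),
  );
  // Even in resilient mode there is nothing to continue with if nobody survived.
  if (reviewers.length > 0 && outcome.reviews.length === 0) {
    throw new Error("all reviewers failed");
  }
  return outcome;
}
```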
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Claude CLI in -p mode only outputs text to stdout when generating the
final response. During tool-heavy reviews (reading files, running
commands), no stdout/stderr is produced, causing the 900s inactivity
timeout to kill actively-working Claude processes.
Switch runClaudeStream to --output-format stream-json --verbose, which
emits JSON events for every tool call, tool result, and assistant
message. This keeps lastActivity alive during tool execution. The final
result text is extracted from the "result" event.
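A rough sketch of the per-line handling, assuming each stdout line is one JSON object with a type field and that the final event carries its text in a string result field. StreamState and handleStreamLine are hypothetical names, not the actual runClaudeStream internals:

```typescript
interface StreamState {
  lastActivity: number; // timestamp of the most recent output
  resultText: string | null; // final answer, once the "result" event arrives
}

// Processes one line of stream-json output. Any non-empty line counts as
// activity, so tool calls and tool results keep the inactivity timer fresh.
function handleStreamLine(line: string, state: StreamState, now: number): void {
  const trimmed = line.trim();
  if (trimmed === "") return;
  state.lastActivity = now;

  let event: unknown;
  try {
    event = JSON.parse(trimmed);
  } catch {
    return; // not valid JSON: still activity, but nothing to extract
  }
  if (typeof event === "object" && event !== null) {
    const e = event as { type?: unknown; result?: unknown };
    if (e.type === "result" && typeof e.result === "string") {
      state.resultText = e.result;
    }
  }
}
```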
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When chat/chatStream throws (timeout, crash, etc.), the session ID was
left intact, causing subsequent rounds to --resume a dead session and
fail with "Session ID already in use". Now all 4 CLI providers reset
to a fresh session ID on error.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reviewers now fetch diff and read code themselves (CLI providers)
instead of receiving pre-embedded diff text. Enables verification
of issues against actual code context during review.
- Merge audit into magpie as verifyIssues() with Read/Grep/Glob tools,
replacing the separate downstream audit step in li-bot.
- Add --no-conclusion flag to skip summarize step (bot mode).
- Context gatherer: support Go/C++/Proto/Python/Java/Scala symbol
extraction and multi-language grep (was JS/TS only).
- Structurizer: standardize categories to 12 enums, add strict severity
definitions, simplify description template for direct GitHub posting.
- Add isCliModel() helper to detect CLI vs API providers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stream mode was discarding stderr (_data), making it impossible to
diagnose exit code 1 failures. Now buffers stderr (capped at 10KB)
and appends the last 500 chars to error messages on crash or timeout.
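One way to sketch the buffering, assuming the cap keeps the most recent output and is counted in characters rather than bytes. Both function names are hypothetical:

```typescript
const STDERR_CAP = 10 * 1024; // assumption: 10KB measured in characters
const ERROR_TAIL = 500;

// Appends a chunk and drops the oldest output once the cap is exceeded.
function appendStderr(buffer: string, chunk: string): string {
  const combined = buffer + chunk;
  return combined.length > STDERR_CAP ? combined.slice(-STDERR_CAP) : combined;
}

// Adds the tail of stderr to an error message, if there is any stderr at all.
function withStderrTail(message: string, stderr: string): string {
  const tail = stderr.slice(-ERROR_TAIL).trim();
  return tail ? `${message}\nstderr (last ${ERROR_TAIL} chars): ${tail}` : message;
}
```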
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The summarizer now generates the final conclusion directly from the full
debate conversation history, eliminating the intermediate step where each
reviewer was asked to summarize their own points. This saves one round of
API calls per reviewer without losing information.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After the summarizer produces the final conclusion, a new verification step
re-examines it against the original PR diff/code to confirm correctness,
flag false positives, catch missed issues, and produce an authoritative
verified final conclusion.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Allow CLI providers (claude-code, codex-cli, gemini-cli, qwen-code) to
accept a model override using colon syntax (e.g., `claude-code:claude-opus-4-6`).
Display the model name alongside reviewer IDs in review and discuss output.
Update default Gemini API model to gemini-3.1-pro-preview.
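The colon syntax for CLI providers could be parsed along these lines. This is a sketch only; ProviderSpec and parseCliProvider are hypothetical names, and the provider list is taken from the message above:

```typescript
interface ProviderSpec {
  provider: string;
  model?: string; // present only when an override was given
}

const CLI_PROVIDERS = ["claude-code", "codex-cli", "gemini-cli", "qwen-code"];

// Splits "provider:model" on the first colon. Returns null for anything that
// is not one of the CLI providers, so API model names are left to other code.
function parseCliProvider(spec: string): ProviderSpec | null {
  const colon = spec.indexOf(":");
  const provider = colon === -1 ? spec : spec.slice(0, colon);
  if (CLI_PROVIDERS.indexOf(provider) === -1) return null;
  const model = colon === -1 ? "" : spec.slice(colon + 1);
  return model ? { provider, model } : { provider };
}
```

Splitting on the first colon only means a model name that itself contains a colon would still be passed through intact.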
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
gh pr diff returns HTTP 406 for PRs with >20,000 lines of diff.
Add a fallback: fetch the full diff via the commit diff API endpoint (no
line limit), split it into per-file sections, prioritize core files over
tests, and truncate to 15k lines to fit model context windows.
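The split, prioritize, and truncate steps might look roughly like this. The test-file heuristic and all function names are assumptions made for illustration; the real prioritization rules may differ:

```typescript
interface FileSection {
  path: string;
  lines: string[];
}

// Assumed heuristic: test directories or common test-file suffixes.
function isTestFile(path: string): boolean {
  return /(^|\/)(tests?|__tests__)\//.test(path) || /(_test|\.test|\.spec)\.[^./]+$/.test(path);
}

// Splits a unified diff into sections at each "diff --git" header.
function splitByFile(diff: string): FileSection[] {
  const sections: FileSection[] = [];
  for (const line of diff.split("\n")) {
    const header = line.match(/^diff --git a\/(.+) b\/(.+)$/);
    if (header) {
      sections.push({ path: header[2], lines: [line] });
    } else if (sections.length > 0) {
      sections[sections.length - 1].lines.push(line);
    }
  }
  return sections;
}

// Emits core files first, then tests, stopping once maxLines is reached.
function truncateDiff(diff: string, maxLines: number): string {
  const sections = splitByFile(diff);
  const ordered = sections
    .filter((s) => !isTestFile(s.path))
    .concat(sections.filter((s) => isTestFile(s.path)));
  const out: string[] = [];
  for (const section of ordered) {
    for (const line of section.lines) {
      if (out.length >= maxLines) return out.join("\n");
      out.push(line);
    }
  }
  return out.join("\n");
}
```

With maxLines set to 15000 this corresponds to the 15k-line budget mentioned above.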
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduce an optional General Discussion phase for PR reviews that sits
between review completion and the issue-by-issue comment loop. Users can
select participants, ask general questions, and resolve issues inline
(/post, /skip, /edit, /discuss, /issues) so only remaining issues flow
into the per-issue loop.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Move language instructions from the user message suffix to the system prompt
prefix, where LLMs give them higher weight. Add langPrefix/withLang() to orchestrator
so all reviewer/analyzer/summarizer calls get the language requirement.
Also wire discuss command to use config.defaults.language instead of only
following user input language.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add extractDiffLineRanges() to include valid line ranges in structurizer prompt
- Add content-based matching (extractCodeFromBody + findLineByContent) as fallback
- Add full diff fallback via gh api when per-file patches are null
- Widen nearest-line threshold from 20 to 50 for better coverage
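As an illustration of the first bullet, valid line ranges can be read from the hunk headers of a unified diff. The sketch below is a hypothetical stand-in for extractDiffLineRanges(), whose real signature is not shown here, and it reports ranges on the new-file side only:

```typescript
interface LineRange {
  start: number;
  end: number;
}

// Parses "@@ -a,b +c,d @@" headers and returns [c, c + d - 1] for each hunk.
// A missing count means 1; a count of 0 (pure deletion) yields no range.
function hunkLineRanges(patch: string): LineRange[] {
  const ranges: LineRange[] = [];
  const header = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/;
  for (const line of patch.split("\n")) {
    const m = header.exec(line);
    if (!m) continue;
    const start = parseInt(m[1], 10);
    const count = m[2] === undefined ? 1 : parseInt(m[2], 10);
    if (count > 0) ranges.push({ start, end: start + count - 1 });
  }
  return ranges;
}
```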
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace execSync with spawnSync in getFileHistory() and getPRDetails()
to prevent shell injection through file paths and PR numbers. Add input
validation for prNumber (must be a positive integer).
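A minimal sketch of the validation plus argument-array construction. The function names and the --json field list are illustrative assumptions; the resulting array is meant to be passed to spawnSync("gh", args), which never routes it through a shell:

```typescript
// Accepts only a positive decimal integer; anything else is rejected before
// it can reach a command line.
function parsePrNumber(input: string): number {
  if (!/^[1-9]\d*$/.test(input)) {
    throw new Error(`invalid PR number: ${JSON.stringify(input)}`);
  }
  return parseInt(input, 10);
}

// Builds argv as separate elements instead of one interpolated shell string.
function prViewArgs(prNumber: number): string[] {
  return ["pr", "view", String(prNumber), "--json", "title,body"];
}
```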
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace execSync with spawnSync in findReferences() to prevent shell
injection through malicious symbol names in PR diffs. Use -F (fixed-string)
and -e flags for safe argument passing to ripgrep.
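The ripgrep argument array can be sketched as below. referenceSearchArgs is a hypothetical name, and the -n flag and the trailing "--" are additions for the sketch rather than something stated in this change:

```typescript
// Builds argv for ripgrep. -F treats the symbol as a literal string, and -e
// marks the next element as the pattern even if it starts with a dash.
function referenceSearchArgs(symbol: string, dir: string): string[] {
  return ["-n", "-F", "-e", symbol, "--", dir];
}
```

Since the symbol stays a single array element, shell metacharacters in it are just bytes of the search pattern when the array is handed to spawnSync("rg", args).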
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the AI reviewer references a line not exactly in a diff hunk,
find the nearest valid diff line (within 20 lines) and post inline
there instead of falling back to file-level. Also constrain the
structurizer prompt to only reference files actually in the diff.
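The nearest-line lookup can be sketched as follows, assuming the valid diff lines for a file are already available as a list. nearestDiffLine is a hypothetical name, and the default of 20 matches the threshold in this message:

```typescript
// Returns the valid diff line closest to target, or null if none lies within
// maxDistance. On a tie, the line that appears first in validLines wins.
function nearestDiffLine(
  target: number,
  validLines: number[],
  maxDistance: number = 20,
): number | null {
  let best: number | null = null;
  for (const line of validLines) {
    const distance = Math.abs(line - target);
    if (distance > maxDistance) continue;
    if (best === null || distance < Math.abs(best - target)) best = line;
  }
  return best;
}
```

A null result is the signal to fall back to a file-level comment.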
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Allow convergence check on round 1: independent reviewers reaching
the same conclusions is valid convergence
- Adapt convergence prompt to distinguish independent reviews (round 1)
from cross-examined reviews (round 2+)
- Stop any running spinner in onRoundComplete before printing round
results to prevent terminal output artifacts
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add stdin EPIPE error handlers to prevent unhandled exception crashes
when child processes exit before reading all input
- Clear CLAUDECODE env var to avoid nested session detection when
running from within Claude Code
- Write large prompts (>100KB) to temp files instead of stdin to bypass
CLI prompt size limits; CLI tools read the file via their built-in
file access capabilities
- Capture stderr in claude-code streaming mode for better error reporting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix readline hanging after ora spinner pauses stdin: add process.stdin.resume()
before rl.question() at all spinner→input transition points across review, discuss,
and interactive modules
- Fix Ctrl+C not working during analysis/debate: replace silent flag-only SIGINT
handler with double-press-to-force-exit pattern; add InterruptedError + checkpoint
checks in orchestrator between analysis, debate rounds, and summarization
- Add diff filtering: new diff-filter utility with built-in patterns for generated
files (*.pb.go, vendor/**, lockfiles, etc.) and user-configurable diff_exclude
- Add language support to context gatherer: pass config.defaults.language to
ContextGatherer so System Context section respects language setting
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
- Add interactivePostReviewDiscussion() for chatting with any role after review conclusion
- Show all roles (reviewers + analyzer + summarizer) in comment discussion picker
- Add a comment-style prompt before the issue loop to guide the style of first-generation comments
- Add /skip and /drop to abandon issues mid-discussion
- Add defaults.language config for localized output (e.g. language: zh)
- Expose getAnalyzer() and getSummarizer() from orchestrator
- Share reviewer sessions across discussion and comment phases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
Allow users to configure `base_url` per provider to connect to
compatible third-party endpoints (Azure OpenAI, Ollama, vLLM, one-api,
etc.). All four API providers (Anthropic, OpenAI, Gemini, MiniMax) now
accept an optional `base_url` in config which is passed through to
their respective SDKs.
Co-authored-by: Cursor <cursoragent@cursor.com>
When a bare PR number is given while in a fork's local clone, the code
now uses `gh pr view` to resolve the actual upstream repo before falling
back to git remote detection. This prevents 404 errors when posting
review comments on PRs that live on the upstream repository.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reviewer prompts previously allowed LLMs to stop after finding a handful
of issues. Now every round demands systematic coverage of all changed
files/functions, and debate rounds additionally require reviewers to
identify what others missed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Classify comments against the PR diff before posting. When comments cannot
be placed inline (line not in diff), show the user the fallback plan and
offer choices: post all with fallback, inline only, retry as inline,
or skip.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use GitHub Reviews API to batch-post comments as a proper code review.
Parse PR diff to determine valid line placements, with three-level
fallback: inline (exact line) → file-level (attached to file) → global.
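The three-level fallback can be sketched as a small classifier, assuming the parsed diff is available as a map from file path to commentable line numbers. Placement and placeComment are hypothetical names:

```typescript
type Placement =
  | { level: "inline"; path: string; line: number }
  | { level: "file"; path: string }
  | { level: "global" };

// Picks the most specific placement the diff allows:
// exact line in the diff, else the file if it is in the diff, else global.
function placeComment(
  path: string | undefined,
  line: number | undefined,
  diffLines: Record<string, number[]>,
): Placement {
  if (path === undefined || !Object.prototype.hasOwnProperty.call(diffLines, path)) {
    return { level: "global" };
  }
  if (line !== undefined && diffLines[path].indexOf(line) !== -1) {
    return { level: "inline", path, line };
  }
  return { level: "file", path };
}
```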
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Show a dim "Enter to end discussion" hint at the input prompt that
automatically clears when the user starts typing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Maintain a single session per reviewer across all issue discussions,
so that context (PR diff, gathered context, debate history) is sent
once at session start and subsequent issues use --resume for the
claude-code provider. This enables cross-issue context sharing and
eliminates cold-start overhead per issue.
- Add ReviewerSessionState interface and session management helpers
- buildInitialSessionContext: rich first message with full PR context
- getOrCreateSession: lazy session creation with provider startSession
- Use DebateResult type instead of inline type for better type safety
- Wrap discussion flow in try/finally to guarantee session cleanup
- Synthetic assistant reply after initial context to maintain
user/assistant alternation for API-based providers
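The last bullet can be illustrated with a small sketch of the message history for API-based providers. ChatMessage, seedSession, askAboutIssue, and the wording of the synthetic reply are all assumptions made for the example:

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Seeds a session with the full PR context followed by a synthetic assistant
// acknowledgement, so the next user message keeps roles strictly alternating.
function seedSession(initialContext: string): ChatMessage[] {
  return [
    { role: "user", content: initialContext },
    { role: "assistant", content: "Context received. Ready to discuss individual issues." },
  ];
}

// Appends the next user question without mutating the existing history.
function askAboutIssue(history: ChatMessage[], question: string): ChatMessage[] {
  return history.concat([{ role: "user", content: question }]);
}
```

Without the synthetic reply, the first issue question would follow the context message directly, producing two consecutive user turns.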
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When reviewing a PR from a different repository using a full GitHub URL,
the commenter commands (gh pr view, gh pr comment) defaulted to the
current directory's git remote, causing "Could not resolve PullRequest"
errors. The commenter now extracts owner/repo from the PR URL and passes
the --repo flag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>