159 Commits

Author SHA1 Message Date
tgrosinger 2163ea45d2 OpenCode: Add OpenCode as a new provider
The OpenCode provider allows using a variety of models with an agent
harness that can gather more information from the codebase as required
(like with claude-code, codex, or gemini-cli).

This is an alternative to using OpenRouter directly, where the API
provider is more like a chatbot and cannot gather any additional context
beyond what was handed to it.
2026-05-29 16:19:13 -07:00
tgrosinger e4790ac77e Allow specifying tmp dir when preparing prompt 2026-05-29 16:19:13 -07:00
tgrosinger f642e58070 Claude: Use default xhigh effort
With opus-4.8, Claude defaults to "high". Bump up one level for review.
2026-05-28 10:24:02 -07:00
tgrosinger d666f7e08b Codex: Restrict permissions 2026-05-28 10:22:55 -07:00
tgrosinger 9e7989671a Add a flag to disable jokes 2026-05-28 10:22:53 -07:00
tgrosinger a8578beacd OpenRouter: Add OpenRouter as a new provider 2026-05-28 10:22:51 -07:00
tgrosinger 823333a4f5 Claude: Remove dangerously-skip-permissions
Instead, hard-code a list of allowed tools for claude that gives it
general read access.
2026-05-27 20:26:09 -07:00
Li Liu cafd72bcd5 fix: claude-code provider explicitly passes --effort max
settings.json effortLevel="max" gets silently demoted to "xhigh" by the
schema validator; the CLI flag form is honored correctly. Pass --effort max
explicitly so every claude-code invocation (reviewer / analyzer / summarizer /
audit) actually runs at the real max effort tier rather than the demoted
xhigh tier from settings.json fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 00:20:32 +00:00
Li Liu d2ec538dbf feat: audit reads house-rules from ~/.magpie/house-rules/<owner>_<repo>.md
Per-repo conventions now live at a stable user-level path instead of being
read from cwd. Audit extracts owner/repo from the PR URL in taskPrompt and
looks up ~/.magpie/house-rules/<owner>_<repo>.md. Works for both bot mode
and CLI mode without anyone needing to stage files into the worktree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 00:07:39 +00:00
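
The lookup described in this commit might be sketched roughly as follows. The function name `houseRulesPath`, the URL regex, and passing the home directory as a parameter are assumptions made for illustration; only the `~/.magpie/house-rules/<owner>_<repo>.md` layout comes from the commit message.

```typescript
// Hypothetical helper: derive the per-repo house-rules path from a task prompt
// that contains a GitHub PR URL. Returns null when no PR URL is present.
function houseRulesPath(taskPrompt: string, homeDir: string): string | null {
  // Assumed URL shape: https://github.com/<owner>/<repo>/pull/<number>
  const match = taskPrompt.match(/github\.com\/([^/\s]+)\/([^/\s]+)\/pull\/\d+/);
  if (!match) return null;
  const owner = match[1];
  const repo = match[2];
  // Layout taken from the commit message: <home>/.magpie/house-rules/<owner>_<repo>.md
  return `${homeDir}/.magpie/house-rules/${owner}_${repo}.md`;
}
```

Because the path depends only on the PR URL and the user's home directory, the same lookup can serve bot mode and CLI mode without touching the worktree.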
Li Liu 30be792070 fix: assign verifyIssues return value back to parsedIssues 2026-05-26 23:23:11 +00:00
Li Liu e3fd28c0f0 feat: omniscient audit + tightened reviewer/structurizer/analyzer prompts
Major changes:

1. Audit (verifyIssues) rewrite — now THE final judge instead of a severity-recalibrator:
   - Inputs structured issues + Task line with PR URL (audit fetches diff itself via
     `gh pr diff` + Read/Grep/Glob). Does NOT consume reviewer chat or pre-stuffed diff.
   - Output schema extended: verdict (keep/rewrite/drop/new), body, evidence, reason.
   - Can DELETE false positives (not just downgrade), REWRITE weak descriptions, ADD
     missed issues — especially cross-file pattern repetition.
   - Optional .magpie-house-rules.md picked up from cwd as authoritative repo conventions.
   - New config block `audit:` with claude-opus-4-7 + max effort by default.

2. Reviewer prompts (Round 1 + Round 2):
   - Add severity vocabulary at reviewer stage (was only at structurizer before).
   - Add reverse rubric: do NOT report build script polish, missing comments, forward-
     compat hypotheticals, style preferences, theoretical-but-impossible cases.
   - Require file:line + code quote + failure scenario for every issue.
   - Drop "Review EVERY file / don't stop early" — brevity over completeness.
   - Round 2: drop "find what others MISSED" anti-pattern; agreeing is fine.

3. Structurizer:
   - line field now REQUIRED (drop issues that can't be anchored to a hunk line).
   - Description must capture WHY + FAILURE scenario + FIX (so audit has basis to verify).
   - Drop "STRICT — choose LOWER" severity bias.

4. Analyzer: add 6th "建议的 review 重点" ("Suggested review focus") section; parseFocusAreas now matches
   English + Chinese headings, with-/no-space, bold variant; handles `-` `*` `•`
   `·` ①-⑳ `1.`/`1、`/`1)` bullets.

5. Convergence judge: fix parse bug (verdict swallowed by trailing punctuation);
   explicit one-word verdict format constraint.

Schema additions:
- MergedIssue: verdict, body, evidence, auditReason
- MagpieConfig: audit?: ReviewerConfig

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 22:34:33 +00:00
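
As an illustration of item 4 above, the bullet markers listed there could be recognized with a single pattern. This is a minimal sketch, not the actual `parseFocusAreas` code: the names `BULLET` and `extractBulletItems` are hypothetical, and heading detection is left out entirely.

```typescript
// Markers from the commit message: - * • · , circled numbers ①-⑳ (U+2460..U+2473),
// and numbered forms "1." / "1、" / "1)".
const BULLET = /^\s*(?:[-*•·]|[\u2460-\u2473]|\d+[.、)])\s*/;

// Hypothetical helper: return the text of every bulleted line in a section body.
function extractBulletItems(sectionBody: string): string[] {
  const items: string[] = [];
  for (const line of sectionBody.split("\n")) {
    if (!BULLET.test(line)) continue; // skip prose and blank lines
    const text = line.replace(BULLET, "").trim();
    if (text.length > 0) items.push(text);
  }
  return items;
}
```

One known limitation of this sketch: a line that begins with markdown bold (`**text**`) also starts with `*` and would be treated as a bullet, so a real parser would need an extra check for that case.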
Li Liu 6862947368 test: drop obsolete summaries assertion after summary-step removal
The per-reviewer summary step was removed in 0f03726, dropping the
summaries field from DebateResult, but this test still asserted on it
and failed. Remove the stale assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 14:30:06 -07:00
Li Liu 629ed8b00e feat: add --fail-fast option to abort review/discuss on any reviewer failure
By default the orchestrator is resilient: a single reviewer (or context
gatherer) failure is logged and the round continues with the survivors,
aborting only when all reviewers fail.

The new --fail-fast flag flips to strict mode — any reviewer or
context-gathering failure re-throws immediately and terminates the
whole flow. Wired through the review and discuss commands via
OrchestratorOptions.failFast, with a regression test and README docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 14:30:06 -07:00
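
The difference between the resilient default and `--fail-fast` can be sketched as a small decision over per-reviewer outcomes. The `Outcome` type and `resolveRound` function are hypothetical names; the real orchestrator works with async reviewers and `OrchestratorOptions.failFast`, which this synchronous sketch only approximates.

```typescript
// Hypothetical result of one reviewer (or the context gatherer) in a round.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: Error };

// Resilient mode: drop failures and continue with survivors, aborting only
// when nobody succeeded. Fail-fast mode: re-throw the first failure seen.
function resolveRound<T>(outcomes: Outcome<T>[], failFast: boolean): T[] {
  const survivors: T[] = [];
  for (const outcome of outcomes) {
    if (outcome.ok) {
      survivors.push(outcome.value);
    } else if (failFast) {
      throw outcome.error; // strict mode: terminate the whole flow
    }
    // resilient mode: the failure would be logged here and the round continues
  }
  if (survivors.length === 0) {
    throw new Error("all reviewers failed");
  }
  return survivors;
}
```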
Li Liu da7097c1b6 fix: use stream-json output to prevent false inactivity timeout
Claude CLI in -p mode only outputs text to stdout when generating the
final response. During tool-heavy reviews (reading files, running
commands), no stdout/stderr is produced, causing the 900s inactivity
timeout to kill actively-working Claude processes.

Switch runClaudeStream to --output-format stream-json --verbose, which
emits JSON events for every tool call, tool result, and assistant
message. This keeps lastActivity alive during tool execution. The final
result text is extracted from the "result" event.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-06 12:43:28 +00:00
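
A rough sketch of the consuming side, assuming the stream is newline-delimited JSON and that the final event has `type: "result"` with a string `result` field (the commit only says the text is extracted from the "result" event; the exact event shape is an assumption here). `extractResultText` and `onActivity` are hypothetical names.

```typescript
// Hypothetical helper: scan stream-json output, signal activity for every
// parsed event, and return the text carried by the last "result" event.
function extractResultText(ndjson: string, onActivity: () => void): string | null {
  let resultText: string | null = null;
  for (const line of ndjson.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed) continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // ignore partial or non-JSON lines
    }
    onActivity(); // any event (tool call, tool result, message) counts as activity
    if (typeof parsed === "object" && parsed !== null) {
      const evt = parsed as { type?: unknown; result?: unknown };
      if (evt.type === "result" && typeof evt.result === "string") {
        resultText = evt.result;
      }
    }
  }
  return resultText;
}
```

In a streaming setting `onActivity` would refresh a timestamp such as `lastActivity`, so the inactivity timer only fires when the process is truly silent.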
Li Liu 20d5434e13 fix: reset CLI session on error to prevent stale session reuse
When chat/chatStream throws (timeout, crash, etc.), the session ID was
left intact, causing subsequent rounds to --resume a dead session and
fail with "Session ID already in use". Now all 4 CLI providers reset
to a fresh session ID on error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 03:04:41 +00:00
Li Liu 577121675c docs: update README for code-aware review pipeline
Reflect the new architecture: code-aware reviewers, integrated
verify+audit, multi-language context gathering, --no-conclusion flag,
and updated review dimensions (compatibility, extensibility,
feature interaction).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 11:01:51 +00:00
Li Liu afaa4d8f90 feat: major review pipeline overhaul — code-aware reviewers, integrated verify+audit
- Reviewers now fetch diff and read code themselves (CLI providers)
  instead of receiving pre-embedded diff text. Enables verification
  of issues against actual code context during review.
- Merge audit into magpie as verifyIssues() with Read/Grep/Glob tools,
  replacing the separate downstream audit step in li-bot.
- Add --no-conclusion flag to skip summarize step (bot mode).
- Context gatherer: support Go/C++/Proto/Python/Java/Scala symbol
  extraction and multi-language grep (was JS/TS only).
- Structurizer: standardize categories to 12 enums, add strict severity
  definitions, simplify description template for direct GitHub posting.
- Add isCliModel() helper to detect CLI vs API providers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 10:59:19 +00:00
Li Liu a28009101b fix: capture gemini CLI stderr for diagnostics
Stream mode was discarding stderr (_data), making it impossible to
diagnose exit code 1 failures. Now buffers stderr (capped at 10KB)
and appends the last 500 chars to error messages on crash or timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 06:29:13 +00:00
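
The buffering scheme might look something like this. The helper names are hypothetical, and the cap is counted in characters rather than bytes for simplicity, which is only an approximation of the 10KB limit named in the commit.

```typescript
const STDERR_CAP = 10 * 1024; // approximate 10KB cap, counted in characters here
const TAIL_CHARS = 500;       // how much of stderr to attach to an error message

// Append a chunk while keeping only the most recent `cap` characters.
function appendCapped(buffer: string, chunk: string, cap: number = STDERR_CAP): string {
  const next = buffer + chunk;
  return next.length > cap ? next.slice(-cap) : next;
}

// Attach the tail of stderr to an error message, if there is any.
function withStderrTail(message: string, stderr: string, tailChars: number = TAIL_CHARS): string {
  const tail = stderr.slice(-tailChars).trim();
  return tail ? `${message}\nstderr (last ${tailChars} chars): ${tail}` : message;
}
```

Keeping the newest data rather than the oldest matters here, since the lines printed just before a crash or timeout are usually the diagnostic ones.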
Li Liu 0f0372656d refactor: remove redundant per-reviewer summary step before final conclusion
The summarizer now generates the final conclusion directly from the full
debate conversation history, eliminating the intermediate step where each
reviewer was asked to summarize their own points. This saves one round of
API calls per reviewer without losing information.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:56:47 +08:00
Li Liu 5ff7f2ea1a feat: add verified conclusion step that cross-checks summary against actual code
After the summarizer produces the final conclusion, a new verification step
re-examines it against the original PR diff/code to confirm correctness,
flag false positives, catch missed issues, and produce an authoritative
verified final conclusion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 19:32:08 +08:00
Li Liu 91d0b2388f feat: support specifying models for CLI providers via colon syntax
Allow CLI providers (claude-code, codex-cli, gemini-cli, qwen-code) to
accept a model override using colon syntax (e.g., `claude-code:claude-opus-4-6`).
Display the model name alongside reviewer IDs in review and discuss output.
Update default Gemini API model to gemini-3.1-pro-preview.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:28:24 +08:00
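
Parsing the colon syntax could be as simple as splitting on the first colon. This is a sketch under the assumption that provider names never contain a colon; `parseModelSpec` and `ModelSpec` are hypothetical names, not the project's API.

```typescript
interface ModelSpec {
  provider: string;
  model?: string; // absent when no override was given
}

// Hypothetical parser for specs such as "claude-code:claude-opus-4-6".
function parseModelSpec(spec: string): ModelSpec {
  const idx = spec.indexOf(":");
  if (idx === -1) return { provider: spec };
  const provider = spec.slice(0, idx);
  const model = spec.slice(idx + 1);
  // Treat a trailing colon ("claude-code:") as "no override".
  return model.length > 0 ? { provider, model } : { provider };
}
```

Splitting on the first colon only means a model name that itself contains a colon is passed through intact.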
Li Liu 62b61eb3e0 fix: handle large PR diffs that exceed GitHub's 20k line limit
gh pr diff returns HTTP 406 for PRs with >20,000 lines of diff.
Added fallback: fetch full diff via commit diff API endpoint (no line
limit), split into per-file sections, prioritize core files over tests,
and truncate to 15k lines to fit model context windows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:20:59 +08:00
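
The split, prioritize, and truncate steps might be sketched as below. The function name, the test-file heuristic, and the choice to cut at a plain line count are all assumptions for illustration; the commit does not say how core files are distinguished from tests.

```typescript
// Hypothetical sketch: reorder a unified diff so non-test files come first,
// then keep at most `maxLines` lines (the commit uses 15,000).
function prioritizeAndTruncate(diff: string, maxLines: number): string {
  // Split before each "diff --git " header so every section is one file.
  const sections = diff.split(/^(?=diff --git )/m).filter((s) => s.trim().length > 0);
  // Assumed heuristic: a path segment named test/tests/spec marks a test file.
  const isTest = (section: string): boolean => {
    const header = section.split("\n")[0] ?? "";
    return /(^|[\/._-])(tests?|spec)([\/._-]|$)/i.test(header);
  };
  const core = sections.filter((s) => !isTest(s));
  const tests = sections.filter((s) => isTest(s));
  const ordered = [...core, ...tests].join("");
  return ordered.split("\n").slice(0, maxLines).join("\n");
}
```

Ordering before truncating means that when the budget runs out, test changes are cut first and the core changes survive.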
Li Liu aec6f67509 feat: add general discussion phase before issue-by-issue review
Introduce an optional General Discussion phase for PR reviews that sits
between review completion and the issue-by-issue comment loop. Users can
select participants, ask general questions, and resolve issues inline
(/post, /skip, /edit, /discuss, /issues) so only remaining issues flow
into the per-issue loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 17:39:11 +08:00
Li Liu b78cf9cc24 fix: strengthen language enforcement by injecting instructions into system prompt prefix
Move language instructions from user message suffix to system prompt prefix
where LLMs give them higher weight. Add langPrefix/withLang() to orchestrator
so all reviewer/analyzer/summarizer calls get the language requirement.
Also wire discuss command to use config.defaults.language instead of only
following user input language.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:19:20 +08:00
Li Liu 8a8dab1ec4 fix: improve inline PR comment placement with content-based matching and full diff fallback
- Add extractDiffLineRanges() to include valid line ranges in structurizer prompt
- Add content-based matching (extractCodeFromBody + findLineByContent) as fallback
- Add full diff fallback via gh api when per-file patches are null
- Widen nearest-line threshold from 20 to 50 for better coverage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 18:12:23 +08:00
Li Liu ad7f0d1e0c fix: auto-select all reviewers in non-TTY mode to prevent hanging
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:58:52 +08:00
Li Liu 4fe80c808c chore: move typescript to devDependencies, remove unused readline package
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:58:11 +08:00
Li Liu 613c1b479c fix: add SIGKILL fallback after SIGTERM timeout in CLI providers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:57:24 +08:00
Li Liu 64dd42a045 fix: add retry with backoff to all CLI providers and Gemini streaming
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:57:04 +08:00
Li Liu 5df0e65e5e feat: pre-flight check for CLI binaries with friendly error messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:55:12 +08:00
Li Liu c2d0f583af fix: register exit handler to clean up temp prompt files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:55:09 +08:00
Li Liu 6bc75ca3ea fix: single reviewer failure no longer terminates entire review session
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:53:05 +08:00
Li Liu 38ff61471a fix: prevent command injection in history-collector via spawnSync
Replace execSync with spawnSync in getFileHistory() and getPRDetails()
to prevent shell injection through file paths and PR numbers. Add input
validation for prNumber (must be a positive integer).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:51:12 +08:00
Li Liu 9a6aaca563 fix: prevent command injection in reference-collector via spawnSync
Replace execSync with spawnSync in findReferences() to prevent shell
injection through malicious symbol names in PR diffs. Use -F (fixed-string)
and -e flags for safe argument passing to ripgrep.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 13:51:08 +08:00
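
The pattern behind this fix and the history-collector fix above it is to pass untrusted values as separate argv entries instead of interpolating them into a shell string. A minimal sketch follows; `buildRipgrepArgs` and `findReferencesSketch` are hypothetical names, and the exact flags used by the real `findReferences()` beyond `-F` and `-e` are not known from the commit.

```typescript
import { spawnSync } from "node:child_process";

// Build the argv for ripgrep. The symbol is a single argument, so shell
// metacharacters in it are never interpreted; -F disables regex parsing.
function buildRipgrepArgs(symbol: string, searchDir: string): string[] {
  return ["--line-number", "-F", "-e", symbol, "--", searchDir];
}

// Hypothetical wrapper: returns matching lines, or [] on any failure.
function findReferencesSketch(symbol: string, searchDir: string): string[] {
  // No shell is involved: spawnSync executes "rg" directly with the argv above.
  const result = spawnSync("rg", buildRipgrepArgs(symbol, searchDir), { encoding: "utf8" });
  if (result.error || result.status === null) return []; // rg missing or killed
  if (result.status > 1) return [];                      // rg exits 1 for "no matches"
  return result.stdout.split("\n").filter((line) => line.length > 0);
}
```

With `execSync`, a symbol such as `foo; rm -rf /` would be parsed by the shell; here it is only ever a literal search string.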
Li Liu b541b509d7 fix: improve inline PR comment placement with near-line matching
When the AI reviewer references a line not exactly in a diff hunk,
find the nearest valid diff line (within 20 lines) and post inline
there instead of falling back to file-level. Also constrain the
structurizer prompt to only reference files actually in the diff.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 23:29:14 +08:00
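
Near-line matching can be sketched as a nearest-neighbor search over the valid diff lines of a file. The function name and signature are hypothetical; the default threshold of 20 is the value named in this commit.

```typescript
// Hypothetical helper: pick the valid diff line closest to `target`,
// or null when nothing lies within `maxDistance` lines.
function nearestDiffLine(validLines: number[], target: number, maxDistance: number = 20): number | null {
  let best: number | null = null;
  for (const line of validLines) {
    const distance = Math.abs(line - target);
    if (distance > maxDistance) continue;
    if (best === null || distance < Math.abs(best - target)) {
      best = line; // on ties, the earlier entry in validLines wins
    }
  }
  return best;
}
```

A null result is the signal to fall back to a file-level comment.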
Buqian Zheng 14a6f42a8e fix: allow round-1 convergence and stop spinner before round output (#9)
- Allow convergence check on round 1: independent reviewers reaching
  the same conclusions is valid convergence
- Adapt convergence prompt to distinguish independent reviews (round 1)
  from cross-examined reviews (round 2+)
- Stop any running spinner in onRoundComplete before printing round
  results to prevent terminal output artifacts

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 00:30:31 +08:00
Li Liu 87abf873cd fix: handle large PR review failures in CLI providers
- Add stdin EPIPE error handlers to prevent unhandled exception crashes
  when child processes exit before reading all input
- Clear CLAUDECODE env var to avoid nested session detection when
  running from within Claude Code
- Write large prompts (>100KB) to temp files instead of stdin to bypass
  CLI prompt size limits; CLI tools read the file via their built-in
  file access capabilities
- Capture stderr in claude-code streaming mode for better error reporting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 21:40:11 +08:00
Buqian Zheng 7156c3a5ed fix: stdin resume after ora spinner, Ctrl+C handling, diff filtering, and context gatherer i18n (#8)
- Fix readline hanging after ora spinner pauses stdin: add process.stdin.resume()
  before rl.question() at all spinner→input transition points across review, discuss,
  and interactive modules
- Fix Ctrl+C not working during analysis/debate: replace silent flag-only SIGINT
  handler with double-press-to-force-exit pattern; add InterruptedError + checkpoint
  checks in orchestrator between analysis, debate rounds, and summarization
- Add diff filtering: new diff-filter utility with built-in patterns for generated
  files (*.pb.go, vendor/**, lockfiles, etc.) and user-configurable diff_exclude
- Add language support to context gatherer: pass config.defaults.language to
  ContextGatherer so System Context section respects language setting

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2026-03-02 21:58:58 +08:00
Li Liu a472b45323 feat: add post-review discussion phase, comment style prompt, /skip, and language config
- Add interactivePostReviewDiscussion() for chatting with any role after review conclusion
- Show all roles (reviewers + analyzer + summarizer) in comment discussion picker
- Add comment style prompt before issue loop to style-guide first-gen comments
- Add /skip and /drop to abandon issues mid-discussion
- Add defaults.language config for localized output (e.g. language: zh)
- Expose getAnalyzer() and getSummarizer() from orchestrator
- Share reviewer sessions across discussion and comment phases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
2026-03-02 12:09:10 +08:00
Li Liu c11554c781 feat: add --no-post flag to skip GitHub comment flow
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
2026-02-27 08:53:20 +00:00
Li Liu 22cfae3347 fix: pass prompt via stdin instead of args to avoid E2BIG on large diffs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Li Liu <li.liu@zilliz.com>
2026-02-27 08:29:19 +00:00
ChrisPan bf896593b3 feat: add custom API base URL support for all API providers (#6)
Allow users to configure `base_url` per provider to connect to
compatible third-party endpoints (Azure OpenAI, Ollama, vLLM, one-api,
etc.). All four API providers (Anthropic, OpenAI, Gemini, MiniMax) now
accept an optional `base_url` in config which is passed through to
their respective SDKs.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-26 22:55:17 +08:00
Li Liu 20dba4f0f3 fix: resolve correct upstream repo when reviewing PRs from a fork
When a bare PR number is given while in a fork's local clone, the code
now uses `gh pr view` to resolve the actual upstream repo before falling
back to git remote detection. This prevents 404 errors when posting
review comments on PRs that live on the upstream repository.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 22:52:37 +08:00
Li Liu aabf5924e7 fix: require exhaustive file-by-file review to eliminate partial findings
Reviewer prompts previously allowed LLMs to stop after finding a handful
of issues. Now every round demands systematic coverage of all changed
files/functions, and debate rounds additionally require reviewers to
identify what others missed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:16:15 +08:00
Li Liu 2b0e1ba711 refactor: comprehensive codebase improvements across 7 phases
Phase A - Quick fixes:
- Remove debug logging that leaked prompt content (qwen-code)
- Fix orchestrator session leak with try/finally cleanup
- CJK-aware token estimation for better accuracy
- Issue parser validation (line > 0, endLine >= line, non-empty fields)
- Improved similarity matching with stop words filtering and description weight

Phase B - Medium fixes:
- Add retry utility with exponential backoff for API providers
- Config validation at load time (required fields, empty API key warnings)
- GitHub PR comment deduplication (skip already-posted comments)
- Ctrl+C graceful exit for interactive comment review

Phase C - Structured logging:
- Logger class with debug/info/warn/error levels (MAGPIE_LOG_LEVEL env var)

Phase D - Type safety:
- Replace `any` types with proper types across discuss.ts, review.ts,
  issue-parser.ts, commenter.ts, repo-orchestrator.ts, history-collector.ts

Phase E - Session helper extraction:
- CliSessionHelper class shared by 4 CLI providers, reducing duplication

Phase F - Split review.ts (1991 → 6 files):
- review.ts (command + action), interactive.ts, repo-review.ts,
  session-cmds.ts, utils.ts, types.ts

Phase G - Tests:
- 6 new test files (retry, logger, session-helper, issue-parser-enhanced,
  loader-validation, orchestrator-session)
- Fix pre-existing test failures (commenter, anthropic)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 22:46:46 +08:00
Li Liu fe080cd8e1 feat: prompt user before fallback when inline comment placement fails
Classify comments against PR diff before posting. When comments cannot
be placed inline (line not in diff), show user the fallback plan and
offer choices: post all with fallback, inline only, retry as inline,
or skip.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:03:28 +08:00
Li Liu 5083785500 fix: post review comments as inline code annotations instead of global PR comments
Use GitHub Reviews API to batch-post comments as a proper code review.
Parse PR diff to determine valid line placements, with three-level
fallback: inline (exact line) → file-level (attached to file) → global.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 21:00:20 +08:00
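
The three-level fallback could be expressed as a classification step that runs before posting. The types and the `classifyPlacement` name are hypothetical, and the map of valid lines per file stands in for whatever the real diff parser produces.

```typescript
type Placement = "inline" | "file" | "global";

interface ReviewComment {
  path?: string; // file the comment refers to, if any
  line?: number; // line the comment refers to, if any
}

// Hypothetical classifier: inline when the exact line is in the diff,
// file-level when only the file is, global otherwise.
function classifyPlacement(comment: ReviewComment, diffLines: Map<string, Set<number>>): Placement {
  if (comment.path === undefined) return "global";
  const lines = diffLines.get(comment.path);
  if (lines === undefined) return "global"; // file is not part of the PR diff
  if (comment.line !== undefined && lines.has(comment.line)) return "inline";
  return "file";
}
```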
Li Liu 28c1d460c7 feat: add disappearing placeholder hint for discussion input
Show a dim "Enter to end discussion" hint at the input prompt that
automatically clears when the user starts typing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:36:03 +08:00
Li Liu 79e30b082a feat: persistent reviewer sessions in post-processing discussion
Maintain a single session per reviewer across all issue discussions,
so that context (PR diff, gathered context, debate history) is sent
once at session start and subsequent issues use --resume for
claude-code provider. This enables cross-issue context sharing and
eliminates cold-start overhead per issue.

- Add ReviewerSessionState interface and session management helpers
- buildInitialSessionContext: rich first message with full PR context
- getOrCreateSession: lazy session creation with provider startSession
- Use DebateResult type instead of inline type for better type safety
- Wrap discussion flow in try/finally to guarantee session cleanup
- Synthetic assistant reply after initial context to maintain
  user/assistant alternation for API-based providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:34:22 +08:00
Li Liu 99ee72433a fix: support cross-repo PR comment posting via --repo flag
When reviewing a PR from a different repository using a full GitHub URL,
the commenter commands (gh pr view, gh pr comment) defaulted to the
current directory's git remote, causing "Could not resolve PullRequest"
errors. Now extracts owner/repo from the PR URL and passes --repo flag.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 20:06:07 +08:00