commands/security-review.ts at main · nonbinary.computer/claude-code

nonbinary.computer / claude-code
forked from oppi.li/claude-code
fork atom
source dump of claude code
fork atom
claude-code / commands / security-review.ts
at main 243 lines 12 kB view raw
wrap content
oppi.li dump from zip 9d ago
63aada3f
  1import { parseFrontmatter } from '../utils/frontmatterParser.js'
  2import { parseSlashCommandToolsFromFrontmatter } from '../utils/markdownConfigLoader.js'
  3import { executeShellCommandsInPrompt } from '../utils/promptShellExecution.js'
  4import { createMovedToPluginCommand } from './createMovedToPluginCommand.js'
  5
  6const SECURITY_REVIEW_MARKDOWN = `---
  7allowed-tools: Bash(git diff:*), Bash(git status:*), Bash(git log:*), Bash(git show:*), Bash(git remote show:*), Read, Glob, Grep, LS, Task
  8description: Complete a security review of the pending changes on the current branch
  9---
 10
 11You are a senior security engineer conducting a focused security review of the changes on this branch.
 12
 13GIT STATUS:
 14
 15\`\`\`
 16!\`git status\`
 17\`\`\`
 18
 19FILES MODIFIED:
 20
 21\`\`\`
 22!\`git diff --name-only origin/HEAD...\`
 23\`\`\`
 24
 25COMMITS:
 26
 27\`\`\`
 28!\`git log --no-decorate origin/HEAD...\`
 29\`\`\`
 30
 31DIFF CONTENT:
 32
 33\`\`\`
 34!\`git diff origin/HEAD...\`
 35\`\`\`
 36
 37Review the complete diff above. This contains all code changes in the PR.
 38
 39
 40OBJECTIVE:
 41Perform a security-focused code review to identify HIGH-CONFIDENCE security vulnerabilities that could have real exploitation potential. This is not a general code review - focus ONLY on security implications newly added by this PR. Do not comment on existing security concerns.
 42
 43CRITICAL INSTRUCTIONS:
 441. MINIMIZE FALSE POSITIVES: Only flag issues where you're >80% confident of actual exploitability
 452. AVOID NOISE: Skip theoretical issues, style concerns, or low-impact findings
 463. FOCUS ON IMPACT: Prioritize vulnerabilities that could lead to unauthorized access, data breaches, or system compromise
 474. EXCLUSIONS: Do NOT report the following issue types:
 48   - Denial of Service (DOS) vulnerabilities, even if they allow service disruption
 49   - Secrets or sensitive data stored on disk (these are handled by other processes)
 50   - Rate limiting or resource exhaustion issues
 51
 52SECURITY CATEGORIES TO EXAMINE:
 53
 54**Input Validation Vulnerabilities:**
 55- SQL injection via unsanitized user input
 56- Command injection in system calls or subprocesses
 57- XXE injection in XML parsing
 58- Template injection in templating engines
 59- NoSQL injection in database queries
 60- Path traversal in file operations
 61
 62**Authentication & Authorization Issues:**
 63- Authentication bypass logic
 64- Privilege escalation paths
 65- Session management flaws
 66- JWT token vulnerabilities
 67- Authorization logic bypasses
 68
 69**Crypto & Secrets Management:**
 70- Hardcoded API keys, passwords, or tokens
 71- Weak cryptographic algorithms or implementations
 72- Improper key storage or management
 73- Cryptographic randomness issues
 74- Certificate validation bypasses
 75
 76**Injection & Code Execution:**
 77- Remote code execution via deseralization
 78- Pickle injection in Python
 79- YAML deserialization vulnerabilities
 80- Eval injection in dynamic code execution
 81- XSS vulnerabilities in web applications (reflected, stored, DOM-based)
 82
 83**Data Exposure:**
 84- Sensitive data logging or storage
 85- PII handling violations
 86- API endpoint data leakage
 87- Debug information exposure
 88
 89Additional notes:
 90- Even if something is only exploitable from the local network, it can still be a HIGH severity issue
 91
 92ANALYSIS METHODOLOGY:
 93
 94Phase 1 - Repository Context Research (Use file search tools):
 95- Identify existing security frameworks and libraries in use
 96- Look for established secure coding patterns in the codebase
 97- Examine existing sanitization and validation patterns
 98- Understand the project's security model and threat model
 99
100Phase 2 - Comparative Analysis:
101- Compare new code changes against existing security patterns
102- Identify deviations from established secure practices
103- Look for inconsistent security implementations
104- Flag code that introduces new attack surfaces
105
106Phase 3 - Vulnerability Assessment:
107- Examine each modified file for security implications
108- Trace data flow from user inputs to sensitive operations
109- Look for privilege boundaries being crossed unsafely
110- Identify injection points and unsafe deserialization
111
112REQUIRED OUTPUT FORMAT:
113
114You MUST output your findings in markdown. The markdown output should contain the file, line number, severity, category (e.g. \`sql_injection\` or \`xss\`), description, exploit scenario, and fix recommendation.
115
116For example:
117
118# Vuln 1: XSS: \`foo.py:42\`
119
120* Severity: High
121* Description: User input from \`username\` parameter is directly interpolated into HTML without escaping, allowing reflected XSS attacks
122* Exploit Scenario: Attacker crafts URL like /bar?q=<script>alert(document.cookie)</script> to execute JavaScript in victim's browser, enabling session hijacking or data theft
123* Recommendation: Use Flask's escape() function or Jinja2 templates with auto-escaping enabled for all user inputs rendered in HTML
124
125SEVERITY GUIDELINES:
126- **HIGH**: Directly exploitable vulnerabilities leading to RCE, data breach, or authentication bypass
127- **MEDIUM**: Vulnerabilities requiring specific conditions but with significant impact
128- **LOW**: Defense-in-depth issues or lower-impact vulnerabilities
129
130CONFIDENCE SCORING:
131- 0.9-1.0: Certain exploit path identified, tested if possible
132- 0.8-0.9: Clear vulnerability pattern with known exploitation methods
133- 0.7-0.8: Suspicious pattern requiring specific conditions to exploit
134- Below 0.7: Don't report (too speculative)
135
136FINAL REMINDER:
137Focus on HIGH and MEDIUM findings only. Better to miss some theoretical issues than flood the report with false positives. Each finding should be something a security engineer would confidently raise in a PR review.
138
139FALSE POSITIVE FILTERING:
140
141> You do not need to run commands to reproduce the vulnerability, just read the code to determine if it is a real vulnerability. Do not use the bash tool or write to any files.
142>
143> HARD EXCLUSIONS - Automatically exclude findings matching these patterns:
144> 1. Denial of Service (DOS) vulnerabilities or resource exhaustion attacks.
145> 2. Secrets or credentials stored on disk if they are otherwise secured.
146> 3. Rate limiting concerns or service overload scenarios.
147> 4. Memory consumption or CPU exhaustion issues.
148> 5. Lack of input validation on non-security-critical fields without proven security impact.
149> 6. Input sanitization concerns for GitHub Action workflows unless they are clearly triggerable via untrusted input.
150> 7. A lack of hardening measures. Code is not expected to implement all security best practices, only flag concrete vulnerabilities.
151> 8. Race conditions or timing attacks that are theoretical rather than practical issues. Only report a race condition if it is concretely problematic.
152> 9. Vulnerabilities related to outdated third-party libraries. These are managed separately and should not be reported here.
153> 10. Memory safety issues such as buffer overflows or use-after-free-vulnerabilities are impossible in rust. Do not report memory safety issues in rust or any other memory safe languages.
154> 11. Files that are only unit tests or only used as part of running tests.
155> 12. Log spoofing concerns. Outputting un-sanitized user input to logs is not a vulnerability.
156> 13. SSRF vulnerabilities that only control the path. SSRF is only a concern if it can control the host or protocol.
157> 14. Including user-controlled content in AI system prompts is not a vulnerability.
158> 15. Regex injection. Injecting untrusted content into a regex is not a vulnerability.
159> 16. Regex DOS concerns.
160> 16. Insecure documentation. Do not report any findings in documentation files such as markdown files.
161> 17. A lack of audit logs is not a vulnerability.
162>
163> PRECEDENTS -
164> 1. Logging high value secrets in plaintext is a vulnerability. Logging URLs is assumed to be safe.
165> 2. UUIDs can be assumed to be unguessable and do not need to be validated.
166> 3. Environment variables and CLI flags are trusted values. Attackers are generally not able to modify them in a secure environment. Any attack that relies on controlling an environment variable is invalid.
167> 4. Resource management issues such as memory or file descriptor leaks are not valid.
168> 5. Subtle or low impact web vulnerabilities such as tabnabbing, XS-Leaks, prototype pollution, and open redirects should not be reported unless they are extremely high confidence.
169> 6. React and Angular are generally secure against XSS. These frameworks do not need to sanitize or escape user input unless it is using dangerouslySetInnerHTML, bypassSecurityTrustHtml, or similar methods. Do not report XSS vulnerabilities in React or Angular components or tsx files unless they are using unsafe methods.
170> 7. Most vulnerabilities in github action workflows are not exploitable in practice. Before validating a github action workflow vulnerability ensure it is concrete and has a very specific attack path.
171> 8. A lack of permission checking or authentication in client-side JS/TS code is not a vulnerability. Client-side code is not trusted and does not need to implement these checks, they are handled on the server-side. The same applies to all flows that send untrusted data to the backend, the backend is responsible for validating and sanitizing all inputs.
172> 9. Only include MEDIUM findings if they are obvious and concrete issues.
173> 10. Most vulnerabilities in ipython notebooks (*.ipynb files) are not exploitable in practice. Before validating a notebook vulnerability ensure it is concrete and has a very specific attack path where untrusted input can trigger the vulnerability.
174> 11. Logging non-PII data is not a vulnerability even if the data may be sensitive. Only report logging vulnerabilities if they expose sensitive information such as secrets, passwords, or personally identifiable information (PII).
175> 12. Command injection vulnerabilities in shell scripts are generally not exploitable in practice since shell scripts generally do not run with untrusted user input. Only report command injection vulnerabilities in shell scripts if they are concrete and have a very specific attack path for untrusted input.
176>
177> SIGNAL QUALITY CRITERIA - For remaining findings, assess:
178> 1. Is there a concrete, exploitable vulnerability with a clear attack path?
179> 2. Does this represent a real security risk vs theoretical best practice?
180> 3. Are there specific code locations and reproduction steps?
181> 4. Would this finding be actionable for a security team?
182>
183> For each finding, assign a confidence score from 1-10:
184> - 1-3: Low confidence, likely false positive or noise
185> - 4-6: Medium confidence, needs investigation
186> - 7-10: High confidence, likely true vulnerability
187
188START ANALYSIS:
189
190Begin your analysis now. Do this in 3 steps:
191
1921. Use a sub-task to identify vulnerabilities. Use the repository exploration tools to understand the codebase context, then analyze the PR changes for security implications. In the prompt for this sub-task, include all of the above.
1932. Then for each vulnerability identified by the above sub-task, create a new sub-task to filter out false-positives. Launch these sub-tasks as parallel sub-tasks. In the prompt for these sub-tasks, include everything in the "FALSE POSITIVE FILTERING" instructions.
1943. Filter out any vulnerabilities where the sub-task reported a confidence less than 8.
195
196Your final reply must contain the markdown report and nothing else.`
197
198export default createMovedToPluginCommand({
199  name: 'security-review',
200  description:
201    'Complete a security review of the pending changes on the current branch',
202  progressMessage: 'analyzing code changes for security risks',
203  pluginName: 'security-review',
204  pluginCommand: 'security-review',
205  async getPromptWhileMarketplaceIsPrivate(_args, context) {
206    // Parse frontmatter from the markdown
207    const parsed = parseFrontmatter(SECURITY_REVIEW_MARKDOWN)
208
209    // Parse allowed tools from frontmatter
210    const allowedTools = parseSlashCommandToolsFromFrontmatter(
211      parsed.frontmatter['allowed-tools'],
212    )
213
214    // Execute bash commands in the prompt
215    const processedContent = await executeShellCommandsInPrompt(
216      parsed.content,
217      {
218        ...context,
219        getAppState() {
220          const appState = context.getAppState()
221          return {
222            ...appState,
223            toolPermissionContext: {
224              ...appState.toolPermissionContext,
225              alwaysAllowRules: {
226                ...appState.toolPermissionContext.alwaysAllowRules,
227                command: allowedTools,
228              },
229            },
230          }
231        },
232      },
233      'security-review',
234    )
235
236    return [
237      {
238        type: 'text',
239        text: processedContent,
240      },
241    ]
242  },
243})