OSWE Code Review Cheat Sheet

Purpose: A compact, exam-focused cheat sheet for source-code review during OSWE-style assessments. Includes: review mindset, fast grep commands, dangerous sinks per language, common bypasses, and a checklist. Updated: added SSRF, XXE, Prototype Pollution, Eval Filter Bypass, CORS/CSRF chaining guidance.


Quick mindset (always)

  1. Input → Flow → Sink — identify sources, follow transformations, find sinks that execute/interpret/use data.

  2. Assume attacker control — any user-controllable value reaching a sink without clear, correct validation/whitelist is risky.

  3. Prioritize: dynamic evaluation, file includes, SQL/NoSQL, template rendering, deserialization, OS/DB commands, path handling, SSRF/SSO flows, and auth logic.

  4. Look for non-standard sanitization (custom regexes, blacklists) and unsafe use of framework helpers.


Fast discovery (grep/snippets)

# General dangerous keywords
grep -RIn --line-number -E "eval|exec|system\(|popen|subprocess|shell_exec|passthru|include\(|require\(|execFile|Runtime\.getRuntime|ProcessBuilder|fetch\(|requests\.get|HttpClient" .

# SQL usage (simple)
grep -RIn --line-number -E "SELECT|INSERT|UPDATE|DELETE|EXECUTE|query\(" .

# Template & serialization
grep -RIn --line-number -E "render_template|render\(|template\.|unserialize|pickle\.loads|yaml\.load|JSON\.parse" .

# File/Path, Uploads
grep -RIn --line-number -E "open\(|fopen\(|move_uploaded_file|save\(|File\.write|file_put_contents|fwrite" .

# SSRF / URL fetches
grep -RIn --line-number -E "requests\.get|requests\.post|HttpClient|fetch\(|curl_exec|file_get_contents\(|urllib\.request" .

# CORS / CSRF
grep -RIn --line-number -E "Access-Control-Allow-Origin|CORS|Cross-Origin|csrf|XSRF|SameSite|session\.cookie" .

Top dangerous sinks & why

  • eval / dynamic code evaluation → arbitrary code execution (RCE / SSTI).

  • Template rendering with user input → server-side template injection (SSTI).

  • Direct SQL construction (string concatenation) → SQL injection.

  • Deserialization (pickle, unserialize) → attacker-controlled object instantiation, often leading to code execution.

  • OS command invocation with user input → command injection.

  • File include/require with user input → local/remote file inclusion (LFI/RFI).

  • XML parsing without secure config → XXE.

  • Returning raw user content without proper escaping → XSS.

  • Unsafely handling paths → path traversal.

  • Server-Side Request Forgery (SSRF) → attacker-induced outbound requests (internal host probing, metadata access).

  • Prototype Pollution → attacker-controlled object properties altering application logic (Node/JS).

  • CORS/CSRF chaining → misconfigurations or chained flaws enabling cross-origin or cross-site attacks.
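
To make the "string concatenation → SQL injection" sink concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the sample user, and the function names (make_db, login_vulnerable, login_parameterized) are hypothetical and exist only for illustration; real code under review will look different, but the concatenation pattern is the thing to spot.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with one hypothetical user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")
    return conn


def login_vulnerable(conn: sqlite3.Connection, name: str, password: str) -> bool:
    # Sink: user input is concatenated straight into the SQL text.
    query = (
        "SELECT COUNT(*) FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(conn: sqlite3.Connection, name: str, password: str) -> bool:
    # Safer alternative: placeholders keep the input as data, never as SQL.
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


if __name__ == "__main__":
    db = make_db()
    payload = "' OR '1'='1"
    print("vulnerable:", login_vulnerable(db, "alice", payload))
    print("parameterized:", login_parameterized(db, "alice", payload))
```

During review, the question is simply whether any user-controlled value reaches the query text itself rather than the parameter tuple.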


Additional Vulnerabilities (expanded)

Server-Side Request Forgery (SSRF)

What it is: SSRF occurs when an application fetches remote resources (URLs) based on user input and the attacker can control the destination. This can let attackers reach internal services and cloud metadata endpoints, port-scan internal networks, or trigger other unintended behavior.

Common sinks:

  • HTTP clients: requests.get(), fetch(), HttpClient, curl_exec(), file_get_contents(url) and similar.

  • Image/file retrieval endpoints that accept a URL to fetch.

  • URL previewers, webhook handlers, RSS readers, remote health-checkers.

Checks during code review:

  • Identify where the app performs outbound requests using user-provided URLs.

  • Look for lack of hostname/IP validation, lack of allowed-scheme checks (e.g., http, https only), and absence of network-level restrictions.

  • Does the code follow redirects? A followed redirect may land on an internal host even when the original URL passed validation.

  • Is the user-supplied host resolved and then used without checking the resulting IP (e.g., names that resolve to localhost/127.0.0.1)?

Mitigations / safe alternatives:

  • Implement a strict allowlist of target hosts or domains.

  • Normalize and resolve the host, then verify that every resolved IP is outside loopback, private, and link-local ranges (127/8, 10/8, 172.16/12, 192.168/16, 169.254/16, ::1, fc00::/7, fe80::/10).

  • Restrict allowed schemes (only https when appropriate), and reject file://, gopher://, ftp://, dict://, and similar.

  • Use an isolated proxy with outbound restrictions for user-supplied fetches.

  • Limit redirects and enforce a maximum redirect count; validate the final host after redirects.
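
The scheme and resolved-IP checks above can be sketched roughly as follows, using only the Python standard library. The names check_fetch_url, is_public_ip, and resolve_host are hypothetical, and the resolver is injectable so the logic can be exercised without network access. This is a sketch under those assumptions, not a complete SSRF defense.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def resolve_host(host: str) -> List[str]:
    """Default resolver: every address the name currently resolves to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def is_public_ip(ip_text: str) -> bool:
    """True only for addresses outside loopback, private, link-local, etc."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return not (
        ip.is_loopback or ip.is_private or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )


def check_fetch_url(url: str, resolver: Callable[[str], List[str]] = resolve_host) -> bool:
    """Return True if the URL uses an allowed scheme and resolves only to public IPs."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
    except OSError:
        return False
    # Every resolved address must be public; one internal address fails the URL.
    return bool(addresses) and all(is_public_ip(addr) for addr in addresses)
```

A check like this is only sound if the later request connects to the address that was validated. If the HTTP client resolves the name again, DNS rebinding can still redirect the fetch, and each redirect hop needs the same validation.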

Greps / signposts:

grep -RIn --line-number -E "requests\.get|fetch\(|curl_exec|file_get_contents|urllib\.request|HttpClient" .

Example (lab safe):

  • Look for code that takes ?url= and passes it straight to an HTTP client. Flag it and suggest allowlisting + IP resolution checks.


XML External Entity (XXE)

What it is: XXE lets an attacker cause the XML parser to load external resources or local files (via ENTITY declarations). It can expose local files, cause SSRF, or exhaust memory through recursive entity expansion.

Common sinks:

  • DocumentBuilderFactory / SAXParser in Java without secure features.

  • xml.etree.ElementTree.fromstring() in Python when using older libs or unsafe parsers.

  • libxml2-backed parsers in PHP (simplexml_load_string) without disabling external entities.

Checks during review:

  • Does the code parse XML from user-controlled sources? (uploads, endpoints, SOAP services)

  • Are parser features to disable DOCTYPE and external entities set? (e.g., disallow-doctype-decl, disable external general entities).

  • Is the application relying on third-party libraries that parse XML internally (uploads, feeds)?

Mitigations / safe alternatives:

  • Disable DOCTYPE and external entity resolution for XML parsers.

  • Use simple, non-XML formats when possible (JSON).

  • Use secure XML parser configurations (documented for each platform).

  • If external entities are required, fetch resources from a controlled/isolated environment.
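
One way to express the "disable DOCTYPE" mitigation in Python, sketched with the standard library only: pre-scan the document with expat and refuse anything that declares a DOCTYPE before handing it to ElementTree. The names DoctypeForbidden and parse_xml_without_doctype are hypothetical. Production code may prefer a hardened third-party parser; this is only an illustration of the idea.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when an XML document contains a DOCTYPE declaration."""


def parse_xml_without_doctype(data: bytes) -> ET.Element:
    """Parse XML, rejecting any document that declares a DOCTYPE."""

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Fires at the start of <!DOCTYPE ...>, before any entity is declared.
        raise DoctypeForbidden("DOCTYPE declarations are not allowed")

    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = reject_doctype
    scanner.Parse(data, True)  # raises DoctypeForbidden or expat.ExpatError

    # Only DOCTYPE-free, well-formed input reaches the real parser.
    return ET.fromstring(data)
```

Because entities can only be declared inside a DOCTYPE, rejecting it blocks both external-entity and entity-expansion payloads at the cost of parsing the input twice.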

Quick grep:

grep -RIn --line-number -E "DocumentBuilderFactory|SAXParser|xml\.etree|simplexml_load_string|libxml" .

Prototype Pollution (Node.js / JS)

What it is: Prototype pollution happens when attackers can modify Object.prototype (or other prototypes), introducing or changing properties that affect application logic — often via unsafe merges/assignments of user-supplied objects.

Common sinks / patterns:

  • Merging user objects into configuration or templates using _.merge, Object.assign, deep recursive merges without validation.

  • Accepting arbitrary JSON and passing it to code that treats properties as configuration flags (req.body merged into options).

  • Using libraries with known prototype-pollution vulnerabilities.

Checks during review:

  • Where user JSON is merged into internal objects/config (options = {...defaults, ...req.body}).

  • Look for usage of lodash.merge, merge, deepmerge, or similar functions on user-controlled objects.

  • Check any code that uses property names dynamically (e.g., obj[foo]) where foo can be __proto__ or constructor.prototype.

  • Validate whether the code sanitizes keys like __proto__, constructor, prototype, or symbol properties.

Mitigations / safe alternatives:

  • Reject or sanitize keys that are __proto__, constructor, prototype, or those containing __proto__ segments.

  • Use shallow copies instead of deep merges when merging user input into internal config.

  • Use libraries with built-in protections or pass a whitelist of allowed keys only.

  • Deep-clone using safe utilities that explicitly block prototype chain changes.
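
The key-rejection mitigation is language-neutral, so it is sketched here in Python for consistency with the other examples in this sheet; in a real Node application the same check would sit in front of the merge. The function name find_pollution_keys and the FORBIDDEN_KEYS set are hypothetical.

```python
import json
from typing import Any, List

FORBIDDEN_KEYS = {"__proto__", "constructor", "prototype"}


def find_pollution_keys(value: Any, path: str = "") -> List[str]:
    """Return the paths of keys that could reach a prototype in a JS merge."""
    hits: List[str] = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            # Also catch dotted keys such as "a.__proto__.b" used by path setters.
            if any(segment in FORBIDDEN_KEYS for segment in key.split(".")):
                hits.append(child_path)
            hits.extend(find_pollution_keys(child, child_path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            hits.extend(find_pollution_keys(child, f"{path}[{index}]"))
    return hits


if __name__ == "__main__":
    body = json.loads('{"__proto__": {"polluted": true}}')
    print(find_pollution_keys(body))
```

Rejecting "constructor" outright can break payloads that legitimately use that field name, so an explicit allowlist of expected keys is usually the more robust choice.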

Quick grep:

grep -RIn --line-number -E "Object\.assign|_\.merge|merge\(|deepmerge|JSON\.parse" .

Eval-filter bypass (filters meant to stop eval)

What it is: Developers sometimes attempt to block eval() by filtering keywords or characters. Determined attackers can bypass naive filters by using alternative code paths or language features that achieve the same effect (e.g., Function constructor in JS, assert() in PHP, or indirect eval via template engines).

Common risky patterns:

  • Naive string filters that remove the substring eval but not equivalent functionality (new Function(), setTimeout(str, 0), exec variants).

  • Relying on blocklists rather than allowlists; blocklists tend to be incomplete.

Checks during review:

  • Search for filter functions or regex-based sanitizers that operate on source strings (e.g., str.replace(/eval/gi, '')).

  • Look for other dynamic-eval-capable APIs: JS Function, setTimeout, setInterval (string args), vm module; Python exec, eval, compile; PHP assert(); Ruby eval/instance_eval.

  • Check if user input ever reaches those APIs indirectly (templating, code generation, REPL endpoints, admin consoles).

Mitigations / safer alternatives:

  • Avoid dynamic evaluation entirely; refactor logic to use explicit paths or interpreters with a safe subset.

  • Use whitelists for allowed operations or tokens and parse/interpret them with safe parsers.

  • For unavoidable dynamic behavior, sandbox the execution environment (restricted runtime, containerization, strict timeouts, resource caps).
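
A small Python sketch of both halves of this section: a naive keyword filter that a nested payload defeats, and an allowlist-based alternative that walks the AST and accepts only numeric literals and basic arithmetic. The names naive_filter and safe_arithmetic are hypothetical, and the allowlist is deliberately tiny.

```python
import ast
import operator
import re

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def naive_filter(source: str) -> str:
    """Flawed sanitizer: strips the substring 'eval' in a single pass."""
    return re.sub(r"eval", "", source, flags=re.IGNORECASE)


def safe_arithmetic(expression: str):
    """Evaluate +, -, *, / on numeric literals; reject every other construct."""

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](evaluate(node.operand))
        # Calls, names, attributes, subscripts, etc. all end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return evaluate(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Removing the inner "eval" reassembles the outer one.
    print(naive_filter("evevalal('1+1')"))
    print(safe_arithmetic("2 + 3 * 4"))
```

The filter fails because deleting a substring can create the very token it was meant to remove. The allowlist works the other way around: anything not explicitly permitted is rejected.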

Quick grep:

grep -RIn --line-number -E "eval\(|new Function|setTimeout\(|vm\.runInNewContext|assert\(" .

CORS / CSRF Chaining (misconfig + chained logic)

What it is: CORS misconfigurations (e.g., Access-Control-Allow-Origin: * or echoing back Origin without validation) can be combined with other issues (exposed credentials, lax auth logic) to allow cross-origin attacks. CSRF chaining is when a CSRF (state-changing) request is combined with other server behaviors to escalate impact.

CORS checks during review:

  • Search for code that sets Access-Control-Allow-Origin dynamically based on Origin header; check whether it validates against an allowlist.

  • Examine Access-Control-Allow-Credentials: true together with wildcard origins. Browsers reject that combination, so code often works around it by echoing the request Origin back, which effectively trusts every origin with credentials.

  • Check Access-Control-Expose-Headers and Vary headers handling (caching pitfalls).

CSRF chaining checks:

  • Look for state-changing endpoints (POST/PUT/DELETE) that lack CSRF protections (tokens, SameSite cookies).

  • Check whether the application relies solely on Origin/Referer for CSRF protection; this may be insufficient if chained with a CORS misconfiguration.

  • Identify flows where a low-privileged action can be chained to perform a sensitive action (e.g., an endpoint that sets an email address + another that triggers password reset to that email).

Mitigations / safe alternatives:

  • For CORS: use strict allowlists, avoid echoing Origin without validation, and avoid Access-Control-Allow-Credentials: true with wildcard origins.

  • For CSRF: enforce anti-CSRF tokens on state-changing requests, set SameSite=Strict or SameSite=Lax on session cookies as defense in depth, and validate Origin for critical operations in addition to tokens.

  • During review, flag any place where cross-origin requests can result in state changes without proper CSRF tokens or where credentials are exposed by CORS.
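
The difference between a loosely validated, reflected Origin and a strict allowlist can be sketched as below. The origins, the ALLOWED_ORIGINS set, and both function names are hypothetical; the flawed variant mirrors a suffix check that commonly shows up in reviews.

```python
from typing import Dict, FrozenSet, Optional

ALLOWED_ORIGINS: FrozenSet[str] = frozenset(
    {"https://app.example.com", "https://admin.example.com"}
)


def cors_headers_reflecting(origin: Optional[str]) -> Dict[str, str]:
    """Flawed: a suffix check lets https://evilexample.com through."""
    if origin and origin.endswith("example.com"):
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
        }
    return {}


def cors_headers_allowlisted(
    origin: Optional[str], allowed: FrozenSet[str] = ALLOWED_ORIGINS
) -> Dict[str, str]:
    """Safer: exact-match allowlist, plus Vary so caches key on Origin."""
    headers = {"Vary": "Origin"}
    if origin in allowed:
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

When reading real code, look for the same shape: startswith, endswith, substring, or unanchored regex checks on Origin are all candidates for bypass with an attacker-registered domain.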

Quick grep:

grep -RIn --line-number -E "Access-Control-Allow-Origin|Access-Control-Allow-Credentials|CORS|csrf|SameSite|Origin" .

Language & Framework Cheat Lists (abridged)

For each: Pattern → Risk → Safer alternative. (Sections for major languages retained from main sheet.)

... (Python, PHP, Node, Java, C#, Ruby sections omitted here for brevity; they are unchanged from the main sheet content and still included in the full markdown file.)


Common input sources to monitor

Query params, POST bodies, headers, cookies, multipart filenames, JSON payloads, WebSocket messages, path params, template names, dynamic import names, metadata fields (Content-Disposition), and any DB fields originally supplied by users.


Typical bypass / evasion techniques

  • Encoding: %27, %2f, \x27, 0x27.

  • CHR()/CHAR() in SQL to reconstruct quotes.

  • Concatenation: CONCAT(), + to build strings.

  • Comments: --, /* ... */ to truncate rest of a query.

  • Time-based / blind techniques: SLEEP(5) for blind SQLi.

  • WAF evasion: Unicode, homoglyphs, mixed encodings, comment insertion.

  • For eval filters: alternative APIs (Function vs eval), indirect execution via templating or callbacks.

  • For SSRF/XXE: nested redirects, DNS rebinding, use of intermediary services (e.g., URL shorteners) — flag any URL resolution chain.
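
The encoding bypass is easiest to see with a filter that inspects the raw value once while a later layer decodes it again. The sketch below uses urllib.parse.unquote; the function names and the sample payload are hypothetical.

```python
from urllib.parse import unquote


def blocks_traversal_naively(raw: str) -> bool:
    """Flawed filter: looks for '../' in the raw value, before any decoding."""
    return "../" in raw


def fully_decode(raw: str, max_rounds: int = 5) -> str:
    """Percent-decode repeatedly until the value stops changing."""
    current = raw
    for _ in range(max_rounds):
        decoded = unquote(current)
        if decoded == current:
            return current
        current = decoded
    raise ValueError("too many encoding layers")


if __name__ == "__main__":
    payload = "%252e%252e%252fetc%252fpasswd"  # double-encoded ../etc/passwd
    print(blocks_traversal_naively(payload))   # the raw check sees nothing
    print(fully_decode(payload))               # what a later layer may see
```

In review, the thing to confirm is that validation runs on the same, fully decoded form of the value that the sink eventually consumes.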


Quick manual testing payloads (educational)

Use only in labs or authorized tests.

  • SQL classic: ' OR '1'='1

  • Time-based blind (MySQL syntax; keep a space after the trailing --): ' OR IF(SUBSTR((SELECT password FROM users LIMIT 1),1,1)='a', SLEEP(5), 0) --

  • SSTI (Jinja2 basic): {{7*7}} (look for 49)

  • LFI attempts: ../../../../etc/passwd

  • XXE (example, labs only): <!DOCTYPE data [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><data>&xxe;</data>

  • SSRF marker: provide a controllable URL (in labs) and check app's outbound fetch behavior.

  • Prototype pollution (test): sending {"__proto__": {"polluted": true}} to endpoints merging body into objects (labs only).
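
For the LFI payload in the list above, the usual fix to look for is a containment check after path resolution. A minimal sketch, assuming a hypothetical resolve_inside helper and a caller-supplied base directory:

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path under base_dir, refusing anything that escapes it."""
    base = os.path.realpath(base_dir)
    # realpath collapses ../ segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the base directory")
    return candidate
```

Code that joins a base directory with user input and opens the result without a check like this is worth flagging, including the case where the user supplies an absolute path that replaces the base entirely.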


OSWE exam checklist (compact)


Suggested personal repo layout

oswe-cheatsheet/
  README.md
  CHECKLIST.md
  payloads/
    sql/
    ssti/
    lfi/
    ssrf/
    xxe/
    prototype_pollution/
  grep-snippets.md
  language-guides/
    python.md
    php.md
    node.md

References (must-read)

  • OWASP Web Security Testing Guide

  • PayloadsAllTheThings (GitHub)

  • PortSwigger Web Security Academy

  • HackTricks (Book)

  • Semgrep rules library


Notes & ethics

This sheet is for learning, authorized testing, and exam prep only. Never run offensive tests against systems you do not own or have explicit permission to assess.

Last updated