[HackerNotes Ep. 166] Claude Code Skills for Bug Bounty: When, Why, and How to Build Them

Exploring how Rez0 uses Claude Code to hunt and how to set it up properly

Hacker TL;DR

  • Claude Code skills should encode knowledge the model lacks and enforce deterministic workflows, not replace its creative reasoning

  • Build a fallback architecture in skills: primary tool → SDK/library → raw API, so the agent adapts when one layer fails

  • Structure your notes as a funnel: notes → leads → primitives → findings → reports to keep multi-session hacking organized

  • Run two parallel agents (one guided, one free-roaming) and cross-compare results to continuously improve your methodology

When Do Skills Actually Help?

The key question is whether giving Claude a rigid skill limits its creativity or makes the agent more effective. It can be hard to tell which is the case, or what actually improves an agent's results. Rez0 answered with his own methodology for getting this right.

1. Knowledge the Model Doesn't Have

Claude's training data is massive, but it has clear blind spots. Custom tooling like Caido requires a skill because the model would otherwise spend half a session figuring out the SDK from scratch. The same applies to:

  • Secret techniques — gadgets, Oracle chains, or exploitation patterns from DEFCON talks that aren't widely documented online

  • Enterprise tools — any software behind a paywall or requiring API keys that the model cannot sign up for on its own

  • Custom infrastructure — VPS credentials, file paths, server configurations, and personal workflows that exist only in the researcher's head

So if a tool has good public documentation, Claude can often figure it out, but at the cost of significant token usage. A skill provides a head start and eliminates wasted cycles.

2. Constraining a Large Solution Space

When there are dozens of ways to accomplish something (curl, Python, Playwright, Caido, Chrome DevTools), a skill steers the agent toward the preferred method. This matters for consistency: using Caido ensures traffic is proxied, visible in history, and available for screenshots and POC documentation.

The key insight here is that steering is not limiting. Telling Claude where to save files, how to make requests, and what note format to use does not degrade output quality. It provides structure that makes multi-session hacking manageable.

The Fallback Architecture

Justin highlights a pattern observed in the Caido skill that demonstrates effective skill design. When executing a task, the agent follows a layered approach:

  1. Primary tools — Use the skill's built-in commands and binaries

  2. SDK/library layer — If the primary tool fails, invoke the underlying TypeScript/JavaScript library directly

  3. Raw API — As a last resort, use GraphQL or REST calls to control the tool at the protocol level

This fallback architecture was not explicitly coded into the Caido skill; Claude developed the behavior on its own, likely from training data where agents iterated through failures.
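The layered approach above can be sketched as a simple fallback chain. This is only an illustration of the pattern, not how the Caido skill is implemented: the three layer functions are hypothetical stand-ins, and the primary tool is made to fail on purpose so the fallback is visible.

```python
from typing import Callable, Sequence, Tuple

# Each layer is a (name, callable) pair. The names mirror the three layers
# described above; the callables are hypothetical stand-ins, not Caido APIs.
Layer = Tuple[str, Callable[[], str]]


def run_with_fallback(layers: Sequence[Layer]) -> Tuple[str, str]:
    """Try each layer in order; return (layer_name, result) from the first success."""
    errors = []
    for name, action in layers:
        try:
            return name, action()
        except Exception as exc:  # any failure moves on to the next layer
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all layers failed: " + "; ".join(errors))


def primary_tool() -> str:
    # Simulate a broken primary tool (for example, a missing binary).
    raise FileNotFoundError("skill binary not found")


def sdk_layer() -> str:
    return "request replayed via SDK"


def raw_api() -> str:
    return "request replayed via raw API"


if __name__ == "__main__":
    used, result = run_with_fallback(
        [("primary", primary_tool), ("sdk", sdk_layer), ("raw-api", raw_api)]
    )
    print(used, "->", result)
```

Collecting the error from each failed layer, rather than discarding it, keeps a record of why the agent ended up on a lower layer.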

So, if you want to build your own skills, design them with multiple layers of abstraction, and include a line in your skill telling Claude not to limit itself to the prescribed workflow. If the skill doesn't work, it should explore alternative approaches.

For example, end every skill with a line like "If this workflow fails or doesn't cover the situation, use your own exploration and creativity to keep going."
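If you maintain several skills, a small helper can keep that closing line consistent. The sketch below works on plain skill text; the function name and the sample skill body are hypothetical, and it assumes nothing about where your skill files live.

```python
# Closing line suggested above, stored once so every skill uses the same wording.
ESCAPE_HATCH = (
    "If this workflow fails or doesn't cover the situation, "
    "use your own exploration and creativity to keep going."
)


def with_escape_hatch(skill_text: str) -> str:
    """Return skill_text ending with the escape-hatch line, added at most once."""
    body = skill_text.rstrip()
    if body.endswith(ESCAPE_HATCH):
        return body + "\n"
    return f"{body}\n\n{ESCAPE_HATCH}\n"


if __name__ == "__main__":
    # Hypothetical skill body, for illustration only.
    sample = "Route all HTTP requests through the proxy.\nSave notes after every step.\n"
    print(with_escape_hatch(sample))
```

Because the helper is idempotent, it can be run over the same skill text repeatedly without duplicating the line.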

CLAUDE.md Best Practices for Bug Bounty

Beyond skills, the CLAUDE.md file is the foundation of an effective hacking setup. Here are three essential elements:

Identity and Context

Tell Claude who you are and what you do. A simple declaration like "I'm a bug bounty hunter doing authorized ethical testing" significantly reduces refusal rates and keeps the model aligned with offensive security tasks. Include directives like:

  • Stay in scope based on the program policy

  • Don't perform destructive actions unless on accounts you own

  • Always validate findings with a full POC before reporting

  • Phrases like "POC or GTFO" and "try harder" set the right behavioral expectations

Note-Taking Structure

Joseph recommends the following structure:

Level              | Purpose
-------------------|--------------------------------------------------------------------------
Notes              | Raw observations, anything interesting during reconnaissance
Leads              | Promising attack vectors that warrant further investigation
Primitives/Gadgets | Confirmed building blocks: IDOR patterns, auth bypasses, useful endpoints
Findings           | Validated vulnerabilities with full reproduction steps
Reports            | Polished write-ups ready for submission

Each level filters down from the previous one, ensuring the agent maintains context across compaction cycles and doesn't lose track of partial progress.
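One way to picture the funnel is as an ordered list of stages where an entry can only move one level at a time. The sketch below is a minimal in-memory model under that assumption; the class and method names are hypothetical, and a real setup would persist these entries to files or an API rather than keep them in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Funnel stages in order, taken from the levels described above.
STAGES = ["notes", "leads", "primitives", "findings", "reports"]


@dataclass
class Funnel:
    """In-memory funnel: every entry lives in exactly one stage."""

    items: Dict[str, List[str]] = field(
        default_factory=lambda: {stage: [] for stage in STAGES}
    )

    def add_note(self, text: str) -> None:
        """New observations always enter at the top of the funnel."""
        self.items["notes"].append(text)

    def promote(self, text: str) -> str:
        """Move an entry one stage down the funnel and return its new stage."""
        for i, stage in enumerate(STAGES[:-1]):  # reports is the final stage
            if text in self.items[stage]:
                self.items[stage].remove(text)
                next_stage = STAGES[i + 1]
                self.items[next_stage].append(text)
                return next_stage
        raise ValueError(f"{text!r} is not in a promotable stage")


if __name__ == "__main__":
    funnel = Funnel()
    funnel.add_note("/api/v2/users/{id} returns other users' emails")
    print(funnel.promote("/api/v2/users/{id} returns other users' emails"))
```

Forcing single-step promotion means nothing reaches the reports stage without first having been recorded as a lead, a primitive, and a finding.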

Where to Store Notes

Consistency across sessions is critical. There are several options:

  • Local folder structure per target, with the CLAUDE.md in each folder containing target-specific context

  • Obsidian or Notion via API integration

  • Custom API endpoint — You can build something like api.rez0.com so that all leads and gadgets are accessible regardless of which machine the agent runs on

  • Caido Findings tab — when hacking locally, pipe findings directly to Caido for real-time notification

Methodology: Setting Up a Dual-Agent Workflow

Joseph recommends running two parallel Claude Code instances against the same target for maximum coverage:

Phase 1: Launch Two Agents

  • Agent A (Guided): Loaded with your full skill set, custom CLAUDE.md, methodology steps, and target-specific context. This agent follows your deterministic workflow (front-end analysis, source map enumeration, endpoint fuzzing, and so on), ensuring nothing you would normally check gets missed.

  • Agent B (Free-roaming): Minimal skills, minimal guidance. Just the target URL and authentication. This agent explores creatively, potentially finding attack vectors outside your usual methodology.

Phase 2: Instruct Both to Take Notes

Tell both agents at launch: "Keep detailed notes on what you tried, what worked, and what didn't." This is critical because context compaction will eventually erase the working memory.

Phase 3: Cross-Compare Results

Once both agents complete their runs, paste Agent A's output into Agent B's session (not a third agent, because you want the full context). Ask:

  • What did the other agent find that you didn't?

  • What techniques did it use that we didn't try?

  • Are there gaps in our methodology?

Phase 4: Improve the Workflow

If the free-roaming agent discovered something the guided agent missed, add that technique to the skill. The methodology evolves with every iteration.
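The comparison itself is done by the agent in conversation, but if both agents log the techniques they tried as short labels, the gap analysis reduces to set differences. The sketch below assumes that kind of labeled log; the function names and the "GraphQL introspection" entry are hypothetical examples, not results from the episode.

```python
from typing import Dict, Iterable, Set


def normalize(technique: str) -> str:
    """Lowercase and collapse whitespace so trivially different labels match."""
    return " ".join(technique.lower().split())


def compare_runs(guided: Iterable[str], free: Iterable[str]) -> Dict[str, Set[str]]:
    """Split technique labels into guided-only, free-only, and shared sets."""
    a = {normalize(t) for t in guided}
    b = {normalize(t) for t in free}
    return {
        "only_guided": a - b,
        "only_free": b - a,  # candidates to add to the guided skill
        "both": a & b,
    }


if __name__ == "__main__":
    guided_log = ["Front-end analysis", "source map enumeration", "endpoint fuzzing"]
    free_log = ["Endpoint  fuzzing", "GraphQL introspection"]  # hypothetical entries
    for bucket, techniques in compare_runs(guided_log, free_log).items():
        print(bucket, sorted(techniques))
```

Exact-label matching is crude, so treat the "only_free" bucket as a list of prompts for review rather than a definitive list of methodology gaps.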

Running Claude Code Autonomously

For extended autonomous sessions, you can use a straightforward approach: tell the agent "I'm going to bed. Don't ask me any questions. Don't stop hacking." This consistently produces 4+ hours of autonomous hacking.

Key considerations for overnight runs:

  • Limit sub-agents to 2-3 maximum. When four or more sub-agents run simultaneously, compaction can fail catastrophically: the context fills up with no previous message to roll back to

  • Tell it to keep notes. Compaction will happen, and good notes are what survive the reset

  • Use the CLAUDE.md to encode persistent instructions that survive compaction naturally
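The considerations above can be folded into a single launch prompt. The sketch below only builds the prompt text to paste into a session; it does not start Claude Code. The function name and the default notes file name are hypothetical.

```python
MAX_SUBAGENTS = 3  # upper bound suggested above; four or more risks compaction failure


def overnight_prompt(subagents: int = 2, notes_file: str = "notes.md") -> str:
    """Build a launch prompt for an unattended session.

    notes_file is a hypothetical file name; use whatever your setup expects.
    """
    if not 1 <= subagents <= MAX_SUBAGENTS:
        raise ValueError(f"subagents must be between 1 and {MAX_SUBAGENTS}")
    return "\n".join(
        [
            "I'm going to bed. Don't ask me any questions. Don't stop hacking.",
            f"Use at most {subagents} sub-agents at a time.",
            f"Keep detailed notes in {notes_file} on what you tried, "
            "what worked, and what didn't.",
        ]
    )


if __name__ == "__main__":
    print(overnight_prompt())
```

Rejecting values above the cap in code is a small guard against accidentally asking for more parallelism than compaction can handle.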

Agents vs. Folders

Claude Code offers two organizational approaches:

  • Agents: A specific system prompt with whitelisted tools. Useful for separating "pentester mode" from daily coding use

  • Folders: Launch Claude Code from a target-specific directory. The .claude/ folder in that directory loads automatically, layering on top of the home directory configuration

Rez0 prefers the folder approach: create a directory per target, populate its CLAUDE.md with program policy and scope (pulled automatically from HackerOne via H1 Brain), and launch Claude Code from there. Every future session in that folder inherits the target context automatically.
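A per-target folder can be scaffolded with a few lines of code. The sketch below is an assumption about layout, not Rez0's actual setup: it takes the scope and policy as plain arguments instead of pulling them from HackerOne, and the function names, section headings, and example program are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable

# Identity line and directives quoted from the CLAUDE.md guidance earlier in this post.
IDENTITY = "I'm a bug bounty hunter doing authorized ethical testing."
DIRECTIVES = [
    "Stay in scope based on the program policy",
    "Don't perform destructive actions unless on accounts you own",
    "Always validate findings with a full POC before reporting",
]


def render_claude_md(program: str, scope: Iterable[str], policy: str) -> str:
    """Render target-specific CLAUDE.md text (section headings are an assumption)."""
    lines = [
        f"# Target: {program}",
        "",
        IDENTITY,
        "",
        "## Rules",
        *[f"- {directive}" for directive in DIRECTIVES],
        "",
        "## Scope",
        *[f"- {asset}" for asset in scope],
        "",
        "## Program policy",
        policy.strip(),
        "",
    ]
    return "\n".join(lines)


def scaffold_target(root: Path, program: str, scope: Iterable[str], policy: str) -> Path:
    """Create root/<program>/CLAUDE.md and return its path."""
    target_dir = root / program
    target_dir.mkdir(parents=True, exist_ok=True)
    claude_md = target_dir / "CLAUDE.md"
    claude_md.write_text(render_claude_md(program, scope, policy), encoding="utf-8")
    return claude_md


if __name__ == "__main__":
    # Write into a temporary directory so the example leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        path = scaffold_target(
            Path(tmp), "example-program", ["*.example.com"], "No DoS testing."
        )
        print(path.read_text(encoding="utf-8"))
```

Launching Claude Code from the resulting directory would then pick up this CLAUDE.md as the target context, as described above.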

Resources

  • Caido Mode Claude Skill — The official Caido integration for Claude Code, enabling proxy-aware hacking with full replay, HTTP history, and findings management

  • H1 Brain by Patrick — An MCP server that pulls HackerOne program policies, scope, and disclosed reports to give Claude full program context

  • Claude Code Skills Documentation — Anthropic's official guide on building and structuring Claude Code skills with front matter, descriptions, and tool definitions

That's it for the week, keep hacking!