[HackerNotes Ep.85] Practical Applications of DEFCON 32 Web Research

The team are back with a whole bunch of tips and tricks off the back of the research dropped at DEFCON.

Hacker TLDR;

  • Web timing attacks: New timing attack techniques from James Kettle turn what were once theoretical vectors into practical, exploitable ones. Some of the new techniques dropped include:

    • Dual packet sync: This technique reduces latency by ensuring that HTTP headers and the body are sent within a single packet, preventing the server from processing headers prematurely. This synchronization is crucial for precision in timing attacks, minimizing the impact of network jitter and server load variations.

    • DNS label overflow: DNS labels are capped at 63 characters, so an overlong label causes any resolver to fail the lookup; the timing difference between a successful resolution and a failed one can be detected.

  • Splitting the email atom: Gareth Heyes explores email RFC quirks, uncovering how subtle formatting can trick email systems and applications into misinterpreting addresses, allowing attackers to reroute or manipulate email delivery through:

    • Encoded word techniques: These attacks exploit email encoding rules from older RFCs and manipulate how email addresses are parsed, potentially rerouting messages or causing mismatches between the web application and SMTP server.

    • UUCP source routing: This technique allows attackers to specify additional routes in email addresses using the ! and % symbols. By crafting addresses like user!domain or user%@domain.com, attackers can reroute emails to unintended destinations, bypassing security controls.

    • Unicode/Punycode overflows: By embedding complex characters in email addresses, attackers can cause parsing errors or buffer overflows when the system decodes Punycode back into Unicode, potentially allowing security bypasses and payloads to be embedded into the addresses.

  • Gotta cache ‘em all: Extensive research on discrepancies between web servers’ and CDNs’ URL parsers, enabling web cache deception across a range of web server → CDN configurations by exploiting:

    • Origin delimiters: Exploiting discrepancies in how web servers and CDNs interpret delimiters (e.g., / or ;), allowing attackers to poison or retrieve cached content not intended for caching.

    • Normalization: Leveraging mismatches in how URLs are normalized between web servers and CDNs to trick the cache into storing and serving sensitive data.

    • Static extensions: Manipulating file extensions (e.g., .js, .css) to trigger caching rules and expose sensitive data through cache deception attacks.

    • Static directories: Exploiting default caching rules for static paths (e.g., /static/) to poison the cache or retrieve unintended content across various configurations - plus a bunch more covered below!

  • Confusion attacks in Apache: Orange Tsai’s deep dive into the Apache HTTP Server exploits inconsistencies in how different modules interpret requests, resulting in so-called confusion attacks. This research uncovered nine vulnerabilities and over 30 techniques, with impacts ranging from access control bypasses to RCE and a whole bunch more.

Find out more at: https://www.criticalthinkingpodcast.io/tlbook

This episode is sponsored by ThreatLocker. Check out their eBook "The IT Professional's Blueprint for Compliance" here!

Web Timing Attacks

This research adds a nice addition to the toolbelt for exploiting some of the more fringe cases in black-box scenarios. Web timing attacks have always been more of a theoretical threat than a practical one, but in true James Kettle style, he does a great job of making these vectors feasible.

He describes the research as ‘…novel attack concepts to coax out server secrets including masked misconfigurations, blind data-structure injection, hidden routes to forbidden areas, and a vast expanse of invisible attack-surface.’

Now these are the kinds of things you’ll need to consider when you’ve hit a wall on a target and aren’t making much progress - you can almost guarantee that these types of attacks weren’t threat-modelled internally, which is only great news for us as hunters.

This research is a very long read, containing a few new concepts and techniques within. We’ll give you the TL;DR of some of the more actionable ones but we highly recommend reading the research after to fill any holes. Let’s jump in.

Dual packet sync

The single packet attack was a technique dropped last year that massively reduces the latency between multiple requests by placing them in a single packet, so they hit the server within milliseconds of each other.

This new technique ensures the HTTP server doesn’t start processing the headers before it receives the body. Some HTTP servers begin processing headers before the full request arrives, for efficiency when dealing with large amounts of traffic; keeping headers and body in one packet means the entire request is processed at once.

When detecting very small timing differences, techniques like this are essential; otherwise there are too many variables at play, such as network jitter and server load, which would make realistic exploitation incredibly difficult.
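As a rough illustration of the idea in Python (a plain-socket sketch with made-up helper names; the real technique involves much finer packet-level control than this): build the entire request, headers and body, up front and hand it to the kernel in a single write, so nothing reaches the server piecemeal.

```python
import socket
import time

def build_request(host: str, body: bytes = b"") -> bytes:
    """Assemble a complete POST request as one byte string, so headers
    and body can be written in a single send() and, for small requests,
    travel in a single TCP packet."""
    headers = (
        f"POST / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        f"Connection: close\r\n"
        f"\r\n"
    ).encode()
    return headers + body

def timed_request(host: str, body: bytes = b"", port: int = 80) -> float:
    """Send the whole request in one write and time until the first
    response bytes arrive."""
    payload = build_request(host, body)
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(payload)  # one write: server sees headers + body together
        sock.recv(4096)
    return time.perf_counter() - start
```

The point is simply that the server never sits on a parsed header block waiting for the body, which strips one source of variance out of the measurement.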

Scoped SSRF

Scoped SSRF occurs when you’re limited to making SSRF requests only to specific subdomains or IP ranges within a target's internal network. James described the scoped SSRF scenario as the following: ‘…This restriction can be implemented via an internal DNS server, simple hostname validation, a firewall blocking outbound DNS, or a tight listener config. The outcome is always the same - you've got a bug with an impact close to full SSRF, but it can't be detected using pingback/OAST techniques.’

Although these constraints are intended to prevent the attacker from reaching unauthorized internal systems, the PortSwigger team naturally dropped some techniques to help bypass this.

The research goes down the route of identifying if the server actually tried to connect to a specified hostname, and gives an example of how this could be done via timing:

| Host header     | Response      | Time |
|-----------------|---------------|------|
| foo.example.com | 404 Not Found | 25ms |
| foo.bar.com     | 403 Forbidden | 20ms |
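One way to operationalise this kind of comparison (a hypothetical sketch - the helper names and the jitter threshold are ours, not from the research): sample each Host header many times and compare medians, since any individual measurement is too noisy to trust.

```python
import statistics
import time
import http.client

def sample_median(target: str, host_header: str, n: int = 20) -> float:
    """Median response time over n requests with a given Host header.
    'target' is a placeholder for whatever host you're probing."""
    times = []
    for _ in range(n):
        conn = http.client.HTTPConnection(target, timeout=10)
        start = time.perf_counter()
        conn.request("GET", "/", headers={"Host": host_header})
        conn.getresponse().read()
        times.append(time.perf_counter() - start)
        conn.close()
    return statistics.median(times)

def looks_routed(probe_samples, control_samples, threshold: float = 3.0) -> bool:
    """True if the probe's median differs from the control's by more than
    the jitter threshold (threshold is in the same units as the samples)."""
    gap = abs(statistics.median(probe_samples) - statistics.median(control_samples))
    return gap >= threshold

# e.g. probe   = sample_median("target.tld", "foo.example.com")
#      control = sample_median("target.tld", "foo.bar.com")
```

A consistently slower response for the in-scope hostname suggests the server did extra work for it, such as internal routing or resolution.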

As for detecting DNS name resolution itself, PortSwigger had that covered too.

DNS Techniques

One of the techniques utilised in this research was identifying and enumerating DNS resolution. One complication when trying to detect whether resolution is happening remotely, however, is DNS caching. Naturally, this can throw a spanner in the works for timing attacks, as a cached response speeds up the whole process. How would you know, from a black-box perspective, whether you hit a cache, or whether the resolution failed or succeeded?

DNS labels are capped at 63 characters, so an overlong label causes the resolver to fail the lookup; the timing difference between a successful resolution and a failed one can be detected. If the label is too long, the lookup fails in one consistent time, whereas a resolvable label takes a measurably different time. The example below is from the research:

| Host header         | Response      | Time |
|---------------------|---------------|------|
| aaa{62}.example.com | 404 Not Found | 25ms |
| aaa{63}.example.com | 404 Not Found | 20ms |

If you need to detect if resolution is happening at all, this is one to add to the toolbelt.
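A minimal probe-pair generator built on the RFC 1035 limit (the helper name is hypothetical, not from the research):

```python
def dns_label_probes(domain: str) -> tuple:
    """Return two probe hostnames: the first has a maximum-length
    (63-character) label and can resolve; the second is one character
    over the RFC 1035 limit, so any resolver rejects it without
    performing a lookup."""
    resolvable = "a" * 63 + "." + domain
    overlong = "a" * 64 + "." + domain
    return resolvable, overlong

valid, invalid = dns_label_probes("example.com")
# Send both as Host headers: a timing gap between them implies the
# server actually attempted resolution for the valid one.
```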

The full paper on this one contains a lot of stellar timing attack contexts, use cases and case studies. Be sure to give it a read: https://portswigger.net/research/listen-to-the-whispers-web-timing-attacks-that-actually-work#scoped-ssrf

Splitting the Email Atom

The email formatting RFC. If you’ve dived into it or looked at some of the traditional attacks around addresses, you’ll probably appreciate that it’s odd and slightly nuanced. The good news: odd and slightly nuanced is precisely what we’re looking for when hunting!

There’s already been a bit of research around email-based vectors, but when Gareth Heyes dived into the email RFC for this research, well, it didn’t disappoint.

The premise here is that certain characters are allowed in an email address which can actually be interpreted completely differently due to some ancient RFCs that SMTP servers are still compliant with.

A few examples of this are below, but the techniques allow you to either ensure addresses are interpreted differently, causing a mismatch between the web app and SMTP server, or introduce additional addresses, letting an attacker route the email to more than one place.

Encoded Word

Encoded-word-based payloads leverage the obscure RFCs we touched on earlier to abuse email encoding rules. By abusing encoded words in emails, attackers can produce unexpected outputs and split email addresses to reroute messages. This example is taken from the blog:

Now the interesting thing here is the application and SMTP server are going to interpret these things completely differently. An email is going to be sent to [email protected], but the application will usually interpret the address literally.

Exactly how this works is broken down in the demonstration below, from the blog:

Be sure to have a few encoded word variants in your head when signing up for services!
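For reference, the RFC 2047 encoded-word syntax is =?charset?encoding?text?=. A tiny generator for Q-encoded variants might look like this (the function name is ours, and encoding every byte as =XX is just one valid Q-encoding choice):

```python
def encoded_word(text: str, charset: str = "iso-8859-1") -> str:
    """Wrap text in an RFC 2047 'encoded word' using Q-encoding,
    with every byte escaped as =XX hex."""
    hex_bytes = "".join(f"={b:02X}" for b in text.encode(charset))
    return f"=?{charset}?q?{hex_bytes}?="

encoded_word("abc")
# => '=?iso-8859-1?q?=61=62=63?='
```

Feeding variants like this into signup forms is a quick way to spot parsers that decode encoded words where they shouldn’t.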

UUCP and Source Routing

UUCP is really old - ‘ancient’, as Gareth describes it - a pre-internet protocol that allowed messages to be transferred between Unix systems. It uses the exclamation mark ! as a separator in email addresses, in the format user!domain - the reverse of modern addresses. Some examples:

Equally, the research uncovered source routes. Source routes let you specify additional hops an email should pass through within a single address. An email with a source route looks something like abc%@rhynorater.com@example.com .

When this gets passed to an SMTP server it will first send the email to [email protected] and THEN to [email protected]:
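As a rough model of the classic percent-hack rewrite (a simplified sketch using the canonical user%domain1@domain2 shape; real SMTP servers vary in when, or whether, they honour it): each relay delivers to the domain after the @, then promotes the rightmost % in the local part to a new @ and forwards.

```python
def percent_hack_hops(address: str):
    """Model '%' source routing: each relay delivers to the domain
    after '@', then rewrites the rightmost '%' in the local part into
    a new '@' and forwards. Returns each successive address."""
    hops = [address]
    local, _, _domain = address.rpartition("@")
    while "%" in local:
        local, _, next_domain = local.rpartition("%")
        hops.append(f"{local}@{next_domain}")
    return hops

percent_hack_hops("abc%rhynorater.com@example.com")
# => ['abc%rhynorater.com@example.com', 'abc@rhynorater.com']
```

So a validator that only looks at the final @domain sees example.com, while the mail actually ends up at rhynorater.com.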

Unicode and Punycode Overflows

The research also uncovered support for some further encoding types: Unicode and Punycode. A summary of each:

  • Unicode Overflow: Unicode allows for a vast array of characters beyond standard ASCII, such as emojis and characters from non-Latin scripts. Systems that aren't prepared to handle these characters properly can become vulnerable. They might be exploitable by embedding complex Unicode characters in email addresses to bypass parsing logic and trick systems into misinterpreting the email structure.

  • Punycode Overflow: Punycode is a way to represent Unicode characters in domain names using only ASCII characters. It is often used to handle internationalized domain names (IDN). Attackers can exploit this by encoding malicious Unicode content into Punycode, which can result in parsing issues when the system converts the Punycode back into Unicode.
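A quick way to get a feel for the Unicode↔Punycode asymmetry, using Python’s built-in punycode codec (illustrative only; this is not the research’s example):

```python
# A short ASCII label hides non-ASCII code points: "xn--ls8h" is the
# pile-of-poo emoji domain label, whose punycode body is "ls8h".
label = b"ls8h"
assert label.decode("punycode") == "\U0001F4A9"

# Decoding expands: the Unicode form needs far more bytes than the
# compact ASCII wire form, which is where parsers that size buffers
# on the ASCII length can go wrong.
uni = "\U0001F4A9" * 40
puny = uni.encode("punycode")
assert len(puny) < len(uni.encode("utf-8"))
```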

An example of how some of these parsings could go wrong:

Methodology

If you want to have a look for this yourself, Gareth dropped the rough method for looking for these kinds of bugs:

Think about the number of services that rely on the domain part of an email address as a method of authentication. A prime example is owning an @examplecorp.com email. Whenever you sign in to Slack with an @examplecorp.com email, you’re assumed to be an employee and therefore have access to all the internal chats.

This research could be pretty impactful against a lot of targets. The TL;DR: use the methodology above, add some encoded versions (from the examples above) of emails to your wordlist and see if you get any hits or oddities in parsing. Check out the full research here: https://portswigger.net/research/splitting-the-email-atom

Gotta Cache ‘em All

With web caches, you have two branches of attack - web cache poisoning and web cache deception. With poisoning, you’re ‘poisoning’ the cache so it contains something malicious which then gets delivered to all users, every single time that cache is called.

With deception, you’re tricking the cache (and user) into caching sensitive information which can then be retrieved by using the same cache, disclosing the sensitive data to the attacker.

Now, both of these branches of attack hinge on the fact that the origin server (webserver) and caching server (think Cloudflare, for example) may interpret certain characters and escape sequences differently. One server may see a path of /user/path while the other sees /user/path;.js instead.

This matters because caching servers depend on something known as caching rules. These rules are more often than not configured to cache all files ending in certain extensions, such as .css or .js . As most of these files are static in nature, it makes sense to cache them instead of re-requesting them constantly.

It’s worth adding there are more ways cache rules are built, including:

  • By extension

  • By static paths: everything under /static for example

  • By default files: robots.txt or sitemap.xml for example

The magic happens when you send a path which the origin interprets as /user/myaccount but the cache sees as /user/myaccount;.js. If the request fits a caching rule, the response from /user/myaccount gets cached. The attacker can then visit /user/myaccount;.js and harvest the sensitive information that was cached by the server!
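That delimiter discrepancy can be modelled in a few lines (hypothetical parser behaviours - real origins and CDNs each have their own rules):

```python
def origin_path(raw: str) -> str:
    """Some origin frameworks treat ';' as a path-parameter delimiter
    and ignore everything after it."""
    return raw.split(";", 1)[0]

def cdn_wants_to_cache(raw: str) -> bool:
    """A typical CDN rule: cache any path ending in a static extension."""
    path = raw.split("?", 1)[0]
    return path.endswith((".js", ".css"))

raw = "/user/myaccount;.js"
assert origin_path(raw) == "/user/myaccount"  # origin serves the account page
assert cdn_wants_to_cache(raw)                # ...and the CDN caches the response
```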

The research does a great job of highlighting the normalisation discrepancies across technologies. Take a look at the below graph for example:
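Normalisation mismatches work the same way; here is one sketch of the pattern (hypothetical behaviours for both sides, just to show the shape of the bug):

```python
from urllib.parse import unquote
import posixpath

def origin_view(raw: str) -> str:
    """Hypothetical origin: URL-decodes, then resolves dot-segments."""
    return posixpath.normpath(unquote(raw))

def cache_view(raw: str) -> str:
    """Hypothetical cache: keys and rule-matches on the raw path."""
    return raw

raw = "/static/..%2Fuser/myaccount"
assert origin_view(raw) == "/user/myaccount"   # origin serves the sensitive page
assert cache_view(raw).startswith("/static/")  # cache applies its /static/ rule
```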

The TL;DR here: if you can get a mismatch between the two (origin and cache server), you can probably get something cached that shouldn’t be. Definitely check out the research, as there are a bunch more discrepancies and nuances worth reading about: Gotta cache 'em all: bending the rules of web cache exploitation | PortSwigger Research

Apache Confusion Attacks

Orange Tsai dropped some very on-brand research which aims to break the internet yet again, this time targeting the Apache HTTP server.

This research focuses on confusion attacks: exploiting how different modules in the Apache ecosystem interpret input in request fields differently, resulting in a plethora of bugs, including:

  • Filename Confusion: Ambiguities in how modules handle file paths, potentially leading to unauthorized access.

  • DocumentRoot Confusion: Exploits differences in how modules determine document roots, allowing access to restricted files.

  • Handler Confusion: Inconsistencies in request handling, enabling attackers to bypass access control and execute unauthorized scripts.

This research alone clocked up a bug count of:

  • Nine vulnerabilities across multiple versions of Apache.

  • Over 30 different attack techniques targeting various modules and configurations.

Unfortunately for us, patches got pushed out pretty quickly due to the wide-ranging impact these would have on, well, most of the internet. It’s worth a read; check the full research out here: https://blog.orange.tw/posts/2024-08-confusion-attacks-en/

Lots of incredible research post-DEFCON in this drop.

As always, keep hacking!