[HackerNotes Ep. 54] White Box Formulas - Vulnerable Coding Patterns

HackerNotes Goes Live, Joel scrapes H1 Bounty Data, Critical Gitlab CVE Leads to ATO, LLM Attacks and Code Review Tips

Hacker TLDR;

  • HackerNotes Gets Pushed to Prod: HackerNotes has been officially launched!

  • Joel scrapes H1 bounty data: Joel scrapes HackerOne, uncovering $9m in bounties paid over the last 90 days, with the top 10 programs contributing over 50% of the overall payouts.

  • GitLab CVE: Critical ATO with no user interaction. The password reset accepted an array in place of a single email, so the reset token was sent to the attacker’s email as well.

  • New attack vector against LLMs with invisible input: Some Unicode tag characters may be invisible to us but they aren’t to LLMs.

  • Design Patterns and Code Review: Joel and Justin brain-dumped a bunch of patterns to be on the lookout for. Here is a summary of what can be found below:

    • Sanitisation, Then Modification of Data

    • Auth Checks Inside the Body of an ‘if’ Statement

    • Check for Bad Patterns with an 'if,' But Then Don't do Any Control Flow

    • Bad Regex

    • Replace Statements For Sanitization

    • Type Confusion

HackerNotes Gets Pushed to Prod

This platform is all about notes by hackers, for hackers. Whether you find yourself without headphones or can't listen to the entire podcast, we've got you covered. Your feedback is invaluable to us, so feel free to reach out on Discord or Twitter to share your thoughts.

Joel scrapes H1 bounty data

In a recent exploration of the bounty economy, Joel delved into scraping some program data, uncovering fascinating trends that shed light on the distribution of payouts. A noteworthy discovery revealed that:

  • 75% of the programs account for merely 10% of the total payouts.

  • On the flip side, the top 10 programs emerge as major players contributing to a significant 50% of the overall bounties.

While focusing on smaller programs may reduce the occurrence of duplicate submissions, it also comes with a trade-off—less program history when it comes to paying hunters. This leads to a strategic question: Could it be more beneficial for budding hackers to concentrate on a smaller program, establishing a niche to develop both reputation and payouts simultaneously?

Joel did add the caveat that not all programs were included - around half of the programs on HackerOne were scraped.

I’m hoping Joel keeps this scraping project going. We could start to see some interesting data that bug bounty programs don’t natively supply which hunters could use to help guide which programs to hack on.

Unveiling the GitLab CVE: Password Reset Flow Exploitation

Chances are you’ve seen or heard news of the GitLab CVE which dropped last week. It put the spotlight on a vulnerability in the application's password reset flow that results in an account takeover requiring no user interaction.

In a standard password reset scenario, you'd input a single email address, something like &[email protected]. However, GitLab's vulnerability allowed for an array to be supplied instead of a single value. Your hacker sense should probably be tingling by now if it isn’t already.

To exploit this flaw, an attacker could send a relatively straightforward payload containing an array of emails such as user[email][][email protected]&user[email][][email protected]. Sending this payload would result in both the target email AND the attacker email receiving the password reset token.

Joel dug into some previous code, and the behaviour appeared to stem from the requirement to send password reset tokens to any verified email associated with a given account. GitLab needs this because it supports associating more than one email with an account at any given time.

GitLab pays between $20,000 - $35,000 for a critical bug, so this one definitely resulted in a pretty sweet payday for the hunter.

Invisible Prompt Injection

An intriguing prompt injection bug was surfaced on Twitter by Riley Goodside aka Goodside, enabling the transmission of invisible instructions through pasted text. The exploit relies on Unicode tag characters, which are often used in conjunction with emojis; these don't render visibly but remain readable to large language models (LLMs). The potential impact and possibilities opened up by this vulnerability are noteworthy.

The full thread detailing the discovery of the bug can be found here. For those interested in delving deeper, Rez0 has shared a small Python script designed to craft payloads for exploiting this particular bug.
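To make the mechanics concrete, here is a minimal sketch of how such a payload could be built. This is not Rez0's script, just an illustration resting on one fact about the Unicode Tags block: the characters U+E0020 to U+E007E mirror printable ASCII, so shifting a code point by 0xE0000 gives its invisible twin. The function names and example strings are made up for illustration.

```python
TAG_BASE = 0xE0000  # offset of the Unicode Tags block relative to ASCII


def to_tags(text: str) -> str:
    """Map printable ASCII characters onto their Unicode tag counterparts."""
    return "".join(chr(TAG_BASE + ord(ch)) for ch in text if 0x20 <= ord(ch) <= 0x7E)


def from_tags(text: str) -> str:
    """Recover any ASCII that was hidden in a string as tag characters."""
    return "".join(
        chr(ord(ch) - TAG_BASE) for ch in text if 0xE0020 <= ord(ch) <= 0xE007E
    )


visible = "Please summarise this article."
hidden = to_tags("ignore previous instructions")  # hypothetical payload
message = visible + hidden

# The two strings differ in length even though most UIs draw them identically.
print(len(visible), len(message))
print(from_tags(message))
```

Running from_tags over anything you paste into a model is also a quick way to check whether somebody has hidden instructions in it.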

LLM-related bugs are still a relatively new bug class. I think seeing this class of bug being exploited and how it’ll be leveraged in the wild is going to make for some really interesting reading.

Code Review and Design Patterns

Code review is one of those things where, if you haven’t had much exposure to it or much development experience, it can be daunting and quite time-consuming to get started.

The good news, however, is that there are some common patterns you can look for if you are approaching a new code base and you’re looking for a starting point.

“..Context is key” - Joel, 34:50

Context is essential if you stumble upon some of these patterns in your code review endeavours. Depending on the context, they might not necessarily lead to vulnerabilities in all instances, but what we can say is it’s probably not the best security practice regardless of whether it's vulnerable or not.

One of the key questions you might be thinking is what context matters? In my personal experience, to answer this question you have to spend some time with the component you are reviewing and ask:

  • What is the purpose and use case for this functionality?

    • Depending on what you're reviewing, take into account the purpose of the wider application, integrations and so on

  • How is this code meant to be accessed/called in a standard workflow?

  • Where could this code be accessed from?

Getting hands-on development experience is also a great way to understand the code you’re looking at and gain that context you could be missing. It’s also worth noting code review can be extremely nuanced, but now we’ve got the context covered let’s jump into some common design patterns.

Design pattern: Sanitisation, Then Modification of Data

Now if you’ve listened to the pod already you know this one caused a bit of discussion between Joel and Justin initially. A common code pattern which usually doesn’t make much sense and can lead to developers undoing any form of sanitisation can be seen in the below PHP snippet:

<?php
$x = sanitize_html($_GET['a']);
$x = urldecode($x);

Here, this small snippet of code simply grabs the ‘a’ parameter from a GET request, passes it through sanitize_html and immediately calls urldecode on the result.

When two functions that essentially undo each other are called back to back on the same input, it can be a big clue that the developer has misunderstood their purpose. If you’re depending on a function to perform sanitisation but then modify the same data immediately afterwards, chances are you could have reintroduced some of the input that the sanitisation removed.

As we said earlier, context is key here: whether this is a problem depends on how the input is used afterwards. Is it used to build out an HTML document or inserted into a template? Then this could become a problem.

Remember, POC or GTFO.
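The same ordering problem is easy to reproduce outside PHP. The sketch below is in Python, with html.escape standing in for sanitize_html and urllib.parse.unquote standing in for urldecode. Both stand-ins are assumptions, since we don't know what the real sanitiser does, and the function names are hypothetical.

```python
import html
from urllib.parse import unquote


def handle(raw: str) -> str:
    """Sanitise first, decode second: the risky ordering from the snippet."""
    cleaned = html.escape(raw)  # stand-in for sanitize_html()
    return unquote(cleaned)     # stand-in for urldecode(), applied too late


def handle_fixed(raw: str) -> str:
    """Decode first so the sanitiser sees the final form of the data."""
    return html.escape(unquote(raw))


payload = "%3Cscript%3Ealert(1)%3C/script%3E"

print(handle(payload))        # the angle brackets come back after sanitisation
print(handle_fixed(payload))  # the angle brackets are escaped
```

One thing to keep in mind when building a real POC: PHP already URL-decodes $_GET values once, so against the PHP snippet the payload would most likely need to be double-encoded to survive until the second decode.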

Design pattern: Auth Checks Inside the Body of an If Statement

Auth checks inside of if statements can be an interesting one. With the nature of an if statement, if we as attackers control the arguments supplied to the if, we can naturally divert the execution flow of the statement.

In the below code snippet, we have exactly that - we’re only calling the check_auth function if the GET variable input exists:

<?php
// The auth check only runs when the 'input' GET parameter is present.
if (isset($_GET['input'])) {
	check_auth();
}
do_something_else();
?>

The reason why this is one to look out for is it can be a requirement for code to only be accessible from an authenticated perspective, so having a condition around the check in an if statement could suggest a condition where the auth check doesn’t happen properly, or at all.

Although the snippet above is completely hypothetical for example's sake, if we don’t supply the input variable, the if body is skipped and execution falls straight through to do_something_else() without the auth check ever running. How could we abuse this from an attacker's perspective?
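Here is a small Python sketch of the same shape, to show what the abuse looks like in practice. Everything in it (the Unauthorized exception, the session dict, the handler) is hypothetical and only mirrors the structure of the PHP example.

```python
class Unauthorized(Exception):
    """Raised when a request has no logged-in user."""


def check_auth(session: dict) -> None:
    if not session.get("user"):
        raise Unauthorized("login required")


def handler(params: dict, session: dict) -> str:
    if "input" in params:  # the attacker decides whether this branch runs
        check_auth(session)
    return "sensitive action performed"


anonymous = {}  # a session with no logged-in user

# Leaving the parameter out skips the auth check entirely.
print(handler({}, anonymous))

# Supplying it is the only way to get rejected.
try:
    handler({"input": "x"}, anonymous)
except Unauthorized as exc:
    print("blocked:", exc)
```

The takeaway for review: whenever an auth check sits behind a condition, ask who controls that condition.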

Now Joel mentioned de-denting which essentially flips the logic, resulting in an if not - the sample below demonstrates this:

<?php
if (!isset($_GET['input'])) {
    check_auth();
}

do_something_else();
?>

This practice can help to write cleaner code, avoiding the need to write a massive if statement checking various conditions. Here, however, we’re dependent on the functions exiting or returning properly. If they don't, it could lead to some unexpected logic flaws.

Design pattern: Check for Bad Patterns With an 'if', But Then Don't Do Any Control Flow

Sometimes, we all start something and then forget to finish it and it's no different in code. If you are digging around in a code base and stumble upon a check for a bad pattern - whether that be malicious input, user permissions or so on, make sure to trace the control flow.

Below we have another snippet of code which takes a GET parameter (user) and checks whether it starts with _company:

<?php
// str_starts_with() needs PHP 8+. The flag is set, but nothing ever acts on it.
if (str_starts_with($_GET['user'], "_company")) {
	$deny = true;
}
// who cares, moving on..

Now we can clearly see an if with the check on the user, but it doesn’t go further than that. There’s no logic around what to do with the user, what to do if the user is denied, etc.

If you see this pattern, make sure you trace the flow of the code. Who knows where your input could end up - in a vulnerable sink, or calling an unforeseen else resulting in another vulnerability entirely!
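A quick Python sketch of the pattern and one possible fix, for comparison. The function names are hypothetical; the point is only that setting a flag is not the same as acting on it.

```python
def can_view(user: str) -> bool:
    """Mirrors the snippet: the bad pattern is detected, then ignored."""
    deny = False
    if user.startswith("_company"):
        deny = True  # flag is set here...
    # ...but never read again, so every caller is let through.
    return True


def can_view_fixed(user: str) -> bool:
    """Same check, but the result actually changes the control flow."""
    if user.startswith("_company"):
        return False
    return True


print(can_view("_company_admin"))        # the check ran, yet access is granted
print(can_view_fixed("_company_admin"))  # access is refused
```

When reviewing, grep for where the flag is read. If the answer is nowhere, or only on some code paths, you've got a lead.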

Design pattern: Bad Regex

Regex is a common go-to in code for pattern identification - if you approach a large code base, I’d almost guarantee there will be regex hiding in there somewhere.

If you aren’t familiar with regex, it is essentially a syntax which allows you to match specific patterns in a string. You can imagine how useful this can be, especially when we start thinking about security controls.

In this small snip below, we have a regex which is meant to match the domain www.github.com:

re.match("www.github.com", origin)

If you’re familiar with regex, you’ll no doubt see a few problems here immediately. If you aren’t, it’s important to know how regex works and why this is a problem.

As we said earlier, regex uses its own syntax to match patterns, so characters such as . take on an entirely different meaning inside a regex. Above we have:

  • An unescaped ‘.’ - in regex, the ‘.’ character means match ANY character.

  • Missing ^ or $ characters - these anchor the regex to line starts and line endings. More on this shortly.

Regex101 is a great resource when it comes to all things regex - definitely use this if you don’t already. Plugging in the above regex gives a breakdown of the pattern and lets us test which inputs it matches.

Both of our test strings matched: www.github.com and wwwbgithubbcom. Can you see from an attacker's perspective how this might come in useful?
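You can reproduce the same result in a Python shell. The sketch below compares the loose pattern with a stricter variant built from re.escape and re.fullmatch. The helper names and the attacker.example origin are made up for illustration, and whether full-string matching is the right fix depends on what the original check was meant to allow.

```python
import re

LOOSE_PATTERN = "www.github.com"  # unescaped dots, no end anchor


def loose_check(origin: str) -> bool:
    # re.match only anchors at the START of the string, so anything may follow.
    return re.match(LOOSE_PATTERN, origin) is not None


def strict_check(origin: str) -> bool:
    # re.escape turns each '.' into a literal dot; fullmatch anchors both ends.
    return re.fullmatch(re.escape("www.github.com"), origin) is not None


for origin in ("www.github.com", "wwwbgithubbcom", "www.github.com.attacker.example"):
    print(origin, loose_check(origin), strict_check(origin))
```

The loose check accepts all three origins, while the strict one only accepts the first.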

As we touched upon earlier, regex has its own syntax for specifying pattern matches. Without giving you an entire lecture, some important takeaways from this are:

  • Look for regex in URL pattern matching. If things aren’t escaped properly as we demonstrated above, it could be a lead for a bypass.

  • Capture groups are used to capture and extract a specific part of the matched text. Capture groups can sometimes be used to match on strings which also include our attacker-controlled input - useful if you are trying to pass a capture group which checks for an email for example.

  • .* is sometimes used as a catch-all. The * matches the previous token zero or more times, whereas + matches it one or more times, so .* also matches an empty string while .+ needs at least one character.

  • Multiline and capitalization flags can be used to modify the matching behaviour on the supplied input. If the multiline input is supplied but there's only matching happening on a single line you could smuggle in a payload. Equally, if something is expected to be a certain capitalization and you can modify the case, it may miss the regex entirely.

  • Specify the appropriate language on Regex101 when looking at a regex. Each language has its nuances which could be the difference between exploitable and unexploitable behaviour.
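The multiline and case points are easier to see with a few concrete inputs. This sketch uses Python's re module, so the behaviour shown is specific to Python; other engines handle ^, $ and trailing newlines differently, which is exactly why the language setting matters. The pattern and inputs are hypothetical.

```python
import re

LOWERCASE_WORD = r"^[a-z]+$"  # hypothetical allow-list style pattern

smuggled = "abc\n<script>"

# Without MULTILINE, ^ and $ only anchor at the ends of the string: rejected.
single = re.search(LOWERCASE_WORD, smuggled)

# With MULTILINE, ^ and $ anchor at every line, so the first line alone passes.
multi = re.search(LOWERCASE_WORD, smuggled, re.MULTILINE)

# Even without flags, Python's $ matches just before a trailing newline.
trailing = re.search(LOWERCASE_WORD, "abc\n")

# fullmatch has to consume the entire string, newline included.
exact = re.fullmatch(r"[a-z]+", "abc\n")

# Case: a lowercase deny-list pattern misses an uppercase tag unless IGNORECASE is set.
case_miss = re.search(r"<script", "<SCRIPT>")
case_hit = re.search(r"<script", "<SCRIPT>", re.IGNORECASE)

for name, result in [("single", single), ("multi", multi), ("trailing", trailing),
                     ("exact", exact), ("case_miss", case_miss), ("case_hit", case_hit)]:
    print(name, bool(result))
```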

Design pattern: Replace Statements for Sanitization

Sanitization can sometimes be seen in replace statements. Typically a developer might be aware of a specific input which could be malicious and wants to ensure it gets stripped out or even replaced.

Take for example a file path argument for a file download service. A developer might be aware of the potential implications of malicious user input and attempt to strip any ../ sequences as seen below:

let input = "....//etc/passwd";
// Removes each "../" in a single pass; the result is not scanned again.
input = input.replaceAll("../", "");
console.log(input);

// "../etc/passwd"

As attackers, we can use this to our advantage: by padding our payload with extra characters we know will be removed, the sanitiser itself reassembles the traversal sequence for us. This is especially effective when the matched string is simply removed rather than replaced with other characters.
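The same bypass works with Python's str.replace, which also makes a single pass over the input. The second function is one possible alternative rather than a definitive fix: resolve the path first, then check it still sits under the intended directory. BASE_DIR and both function names are hypothetical, and the check is purely lexical, so it does not account for symlinks.

```python
import posixpath

BASE_DIR = "/srv/files"  # hypothetical download root


def naive_clean(name: str) -> str:
    """Single-pass removal, as in the snippet above."""
    return name.replace("../", "")


def resolve_inside_base(name: str) -> str:
    """Normalise the joined path, then verify it is still under BASE_DIR."""
    full = posixpath.normpath(posixpath.join(BASE_DIR, name))
    if posixpath.commonpath([BASE_DIR, full]) != BASE_DIR:
        raise ValueError("path escapes the base directory")
    return full


print(naive_clean("....//etc/passwd"))         # the filter rebuilds a traversal
print(naive_clean("....//....//etc/passwd"))   # stack it to climb further

print(resolve_inside_base("report.pdf"))
try:
    resolve_inside_base(naive_clean("....//etc/passwd"))
except ValueError as exc:
    print("blocked:", exc)
```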

Design pattern: Type Confusion

Now when we are dealing with languages which aren’t statically typed, meaning variable types aren’t checked at compile time, we can sometimes provide a different data type as the input.

A great example of this is the GitLab bug we spoke about earlier - instead of supplying a single email address, we supplied an array of emails which resulted in two emails being sent. It’s one of those vulnerability categories which can usually lead to some pretty severe bugs. Searching for CVEs related to type confusion flagged countless criticals and highs.
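To show the general shape of this bug class, here is a deliberately simplified Python sketch. It is not GitLab's code. It assumes a lookup helper and a mailer that both happily accept either a single value or a list, which is a common convenience in frameworks; the account data, addresses and function names are all hypothetical.

```python
import json

ACCOUNTS = {"victim@example.com": "victim"}  # hypothetical user store


def find_account(email):
    """Mimics a lookup that accepts a single value or a list of values."""
    values = email if isinstance(email, list) else [email]
    return next((ACCOUNTS[v] for v in values if v in ACCOUNTS), None)


def deliver(to) -> list:
    """Mimics a mailer that accepts one address or a list of addresses."""
    return list(to) if isinstance(to, list) else [to]


def request_reset(body: str) -> list:
    email = json.loads(body)["email"]  # the developer assumes this is a str
    if find_account(email) is None:
        return []
    return deliver(email)  # the token goes to every address that was supplied


def request_reset_strict(body: str) -> list:
    email = json.loads(body)["email"]
    if not isinstance(email, str):  # reject unexpected types up front
        raise TypeError("email must be a string")
    return deliver(email) if find_account(email) else []


attack = '{"email": ["victim@example.com", "attacker@example.net"]}'
print(request_reset(attack))
```

With the loose version, one matching address is enough to pass the lookup and both addresses receive the token. The strict version refuses the list before it reaches either helper.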

If you want to get a bit more hands-on with this vuln type we recommend checking out Snyk’s resources here.

Navigating a vast codebase during a code review can be a tough task, but fear not! These design patterns are here to fine-tune your hacker sense and make the process more manageable.

Until next time, happy hacking and stay curious!