[HackerNotes Ep. 122] We Won Google's AI Hacking Event in Tokyo - Main Takeaways

In this episode of Critical Thinking - Bug Bounty Podcast, your boys are MVH winners! First we’re joined by Zak to discuss the Google LHE, and he surprises us with a bug of his own! Then we sit down with Lupin and Monke for a winners’ roundtable and retrospective of the event.

Hacker TL;DR

  • CTBB Job Board: We've launched a job board, and we already have some exciting positions from Zak. After this episode, you might want to check it out - especially if you're interested in working with him. We created this board to connect skilled hackers like you with curated opportunities at great companies. Check it out: https://jobs.ctbb.show

  • Zak (Bring a Bug): Really cool to have Zak, the Sec Engineering Manager at Google, sharing a bug. Zak's bug was a classic RCE via memory corruption: heap overflow → info leak → code execution. Simple chain, but with clever exploitation.

    The target's huge infra meant they couldn't predict where uploaded files would land. Instead of spray-and-pray, they used conditional corruption with a heap-groomed object as a reference point: if the file didn't land on the right machine with the right memory layout, the exploit would safely error out.

  • Google works WITH the hackers: Google's LHE approach is unique - they had a ton of engineers there to help with questions about features, infrastructure, and application insights. This collaborative approach extends to both their LHE and VRP programs, and it's really cool to hack Google with their help! Google takes all vulnerabilities seriously, even those that seem low impact or hard to exploit. With billions of users, what's "practically impossible" elsewhere becomes feasible at Google's scale: when 1% of your user base is bigger than most companies' total users, those numbers add up.

  • Making LLM Output Predictable: Companies are optimising both product quality and backend efficiency. Some LLMs now interpret the logic of our input, then act on that interpretation instead of the raw input itself. Understanding this process is key for hacking, since LLMs handle our inputs differently in the background. Reading research papers really helps make exploits more stable.

  • Image generation systems typically have multiple filters: one checks the input prompt, another validates the generated output, and some filtering happens during training. Understanding these helps with bypassing them. For example, with input filtering, you can mix tokens from different languages - check out the tokenizer here. More on this below.

Two things before we start:

  • A heads-up for this week's HN: The team couldn't share details about the bugs they found due to restrictions. If you're here purely for the technical content, we're sorry: we did our best, but our hands were tied. Still, we hope you enjoy the episode! Hahah

  • We're launching a job board, and Zak posted some jobs there, so go take a look. I'm pretty sure that after watching/reading this episode, you'll be inspired to go work with this guy.

    The main reason for making this job board is to curate some good opportunities for both our community and the companies, because we know that if you're reading this, you're a very skilled hacker.

    Check it out: https://jobs.ctbb.show

Bring a bug!

Really cool to have Zak, the Sec Engineering Manager at Google, talking about a bug he found. Unfortunately, we're not going to have too much detail on any bugs this episode.

Zak's bug was a classic RCE via memory corruption: heap overflow → info leak → code execution. Not the most complex chain, but what made it cool was the exploitation logic.

The target had a massive infra, so they couldn’t predict where their uploaded file would land. One option would’ve been to spray and pray until it landed in the right place, but that’s not very good for obvious reasons. Instead, they came up with a cleaner trick: conditional corruption.

They already had a leak, so they could fingerprint the target machine. Then they crafted the exploit so that unless it hit a specific memory layout, only found on the intended box, it would just error out safely. But if the file landed where it should, the conditions matched and the overwrite kicked in. Really cool that instead of spraying and potentially messing up a ton of stuff, they still sprayed, but the exploit would only work on the right machine.

They used a heap-groomed object (manipulating memory allocation so data ends up in expected locations) as a reference point, one of the known offsets. The payload would use that object to derive the location of the next target. On the wrong machine, that object wouldn’t be at the expected spot, so the calculation would fall apart.
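To make that a bit more concrete, here's a minimal Python sketch of the conditional-corruption idea as described above. Everything in it (the marker value, the offset, the read/write primitives, the addresses) is made up for illustration; the point is just the logic: check the layout via the heap-groomed object first, and only derive and hit the overwrite target if it matches, otherwise error out without corrupting anything.

```python
# Conceptual sketch only (not the real exploit): gate the overwrite on the memory
# layout the payload actually finds, instead of on which machine the file lands on.
# All names, offsets and primitives here are hypothetical.

GROOM_MAGIC = 0x4354_4242_0000_0122   # marker planted inside the heap-groomed object
TARGET_PTR_OFFSET = 0x148             # offset from that object to the pointer we overwrite


def attempt_corruption(read_qword, write_qword, groom_addr: int, new_ptr: int) -> bool:
    """read_qword/write_qword stand in for the primitives the overflow + leak provide."""
    # On the wrong machine the groomed object isn't at the expected spot, the marker
    # doesn't match, and we bail out before touching anything sensitive.
    if read_qword(groom_addr) != GROOM_MAGIC:
        return False  # safe error-out path

    # Layout matches: derive the target *relative* to the groomed object (no absolute
    # address guessing) and only then perform the overwrite.
    write_qword(groom_addr + TARGET_PTR_OFFSET, new_ptr)
    return True


# Tiny simulation with a dict standing in for memory, just to show both outcomes.
memory = {0x7F00_0000: GROOM_MAGIC}
print(attempt_corruption(memory.get, memory.__setitem__, 0x7F00_0000, 0xDEADBEEF))  # True: overwrite happens
print(attempt_corruption(memory.get, memory.__setitem__, 0x7F99_0000, 0xDEADBEEF))  # False: nothing corrupted
```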

Zak’s Career

Zak worked for the government for some time and then at Blizzard, and three years ago he started working for Google. About 1.5 years ago he got into AI security, and he’s currently the Security Engineering Manager for AI-related products.

At the Google LHE, the CT Squad won 3 of the 4 awards:

  • MVH for the CT team

  • Most Creative Bug for Ciarán

  • Best Meme Award to Lupin

Google does LHEs differently from other companies. One of the coolest things was that Google sent a huge team to help the hackers: there was roughly one engineer per hacker, and they could ask a lot of questions during the event. It’s crazy how helpful they were, because Google actually wanted to be hacked so they could fix as many bugs as possible. Every time the hackers got stuck, there was always someone to help them understand the feature or the infra, or give them insights on the application. When you think about it, this approach makes perfect sense: these are top-tier hackers who would figure things out eventually, so why not help them maximize their time and make the most of the experience, right? Really cool to see Google working WITH the hackers on both LHE and VRP.

By the end of the event, hackers dropped around 70 reports targeting AI products and features. One interesting characteristic of Google: even if a vuln seems really tricky to exploit or low impact, Google still takes it super seriously. Makes sense when you think about it: with billions of users and such a massive attack surface, what seems "practically impossible" elsewhere might actually be feasible at Google scale. AI attacks are inherently unstable; it's pretty common to see attacks that only work like 10% of the time. But to Google, a low success rate doesn't really matter. Even 1% of Google's user base is larger than 99% of all other companies' total users. That's just how massive Google's scale is.

Key takeaway from this event: even with a limited scope, they discovered 70 impactful vulnerabilities, which means there's still a lot more to be found even at a company as hardened as Google. AI has evolved a lot too: a few months ago the threat model was pretty difficult to understand, but now AI is in so many places. People have been integrating AI into so much stuff that it has become a major attack vector for a lot of companies.

CT Hacker House

The second part of the podcast was recorded at the Hacker House where the boys stayed. Since this part was very long and they couldn't talk too much about the techniques, here's my idea for this part of this week's HackerNotes: a bullet-point rundown of the key takeaways. I'll keep them very concise with as many resources as possible.

  • It's really interesting to see how AI technologies are evolving and what companies are doing not only to make the product better but also to make it more efficient on the back-end. For example, some LLMs first interpret the logic of our input and then act on that interpretation rather than on the raw input itself. It's important to understand how this works, because the LLM is processing your input differently in the background, and once you understand how it processes your data it becomes a lot easier to predict how it's going to behave. So to stabilise your exploits, reading research papers helps a lot (see the pipeline sketch after this list).

    To better understand this part, check his blog post.

  • The boys got into a debate about whether it's worth nerding out on prompt engineering or just sticking to good old natural language. Lupin's all in on prompt engineering and thinks the right way is to understand function calls and tool calls, and to speak the LLM’s behind-the-scenes language, telling it exactly what to do (or not do).

    Monke and Rez0, though, are more like “nah, just talk to it like a human.” They argue that LLMs are starting to pick up on how people try to make them misbehave based on prompt phrasing. So instead of overengineering stuff, they say you’re better off just telling it something like you lost both your hands and you’re going blind, and hoping it feels bad enough to help. I mean… what if you actually did lose both hands and are going blind, right?

  • When you want to generate images, for example, there can be several filters in place. One is on the input: it checks your prompt. Another is on the output: the model generates the image, looks at what it produced, and decides whether it's allowed or not. Some might say there's also a filter in the training, i.e. the AI simply wasn't trained to generate bad stuff. Understanding each filter can help you bypass them.
    For filter 1, for example, it's sometimes possible to mix tokens from different languages: take a word that splits into, say, 3 tokens and make each token come from a different language. You can learn about tokens here (and see the tokenizer snippet after this list).
    Quick example: guitar amp → amplificador + усилитель + 吉他放大器 = amил器

    For filter 2, the boys discussed how it's sometimes possible to generate something if you ask the AI to generate it in inverted colours, or make the skin tone (if you're generating a person) something really crazy like green skin, or add a glitch effect to the image, apply a vintage filter, etc.

  • How they won the LHE: Justin delegated tasks based on what they needed for their bugs to work. And when they deviated from the “path”, Justin was there to remind them that there was a 2x bonus for bugs that were in scope, so they should focus on what was going to get them there. Every time someone had a good lead, a member would go research it and report back when they found something interesting. So they were always working on different things with a common objective.
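As promised above, here's a rough sketch of the "interpret first, then act" pipeline idea. `call_llm` and both prompts are hypothetical placeholders, not any specific product's API; the takeaway is just that your exact wording may never reach the model that does the actual work, which is why payloads that depend on precise phrasing get unstable.

```python
# Hypothetical two-stage LLM pipeline: the backend first interprets/rewrites the
# user's input, then the main model acts on that interpretation, not the raw input.

def call_llm(system_prompt: str, user_text: str) -> str:
    """Placeholder for whatever model API the product actually uses."""
    raise NotImplementedError


def answer(user_input: str) -> str:
    # Stage 1: an (often cheaper) model summarises what the user is asking for.
    # Anything in your payload that only works with the exact original wording can die here.
    interpretation = call_llm(
        "Summarise what the user is asking for as a short, neutral task description.",
        user_input,
    )

    # Stage 2: the main model completes the task based on the interpretation.
    return call_llm(
        "Complete the following task for the user.",
        interpretation,
    )
```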
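And for the token-mixing trick against filter 1, the snippet below is a quick way to see where a tokenizer splits words, including the mixed-language example from above. It uses OpenAI's tiktoken with the cl100k_base encoding purely as an example tokenizer; the model you're actually targeting will have its own, so treat this as intuition-building rather than an exact map.

```python
# Inspect how a tokenizer splits single-language words vs. a mixed-language mashup.
# tiktoken/cl100k_base is just an example; swap in the target model's tokenizer if you can.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

for text in ["guitar amp", "amplificador", "усилитель", "吉他放大器", "amил器"]:
    tokens = enc.encode(text)
    # Decode each token id on its own to see exactly where the splits fall
    # (CJK tokens may decode to partial bytes and show up as replacement characters).
    pieces = [enc.decode([t]) for t in tokens]
    print(f"{text!r}: {len(tokens)} tokens -> {pieces}")
```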

That’s a wrap for the week!
As always, keep hacking!