Grep your 0days

30 April 2026

Grep your 0days - Precicom's Blog

AI, LLMs and Vulnerability Research

Once upon a time in vulnerability research, there were only two methods of discovering vulnerabilities: code auditing and fuzzing.

Code auditing is the tedious task of manually reading code, looking for vulnerable patterns, and determining whether they are reachable and exploitable. Historically, this process was only feasible if the auditor had strong knowledge of the targeted type of software, the programming language used and, of course, basic software engineering. Even so, the task can be extremely hard when dealing with massive software with millions of lines of code (LOC), such as operating system kernels or web browsers. Even the maintainers of such codebases often don’t understand the entirety of the software. As a consequence, researchers often target one specific attack surface to increase their chances of success.

Fuzzing, on the other hand, was the pinnacle of automated vulnerability research: a fuzzer feeds the target random inputs and mutates them repeatedly until one reaches a vulnerable code path and crashes the program. Coverage-guided fuzzers such as AFL++ and Fuzzilli are the industry standard. This method requires some basic knowledge of configuring fuzzers and choosing an attack surface to target, then leaves the heavy lifting to the machine.
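To make the mechanism concrete, here is a minimal sketch in C of a coverage-guided mutation loop. It is not how AFL++ or Fuzzilli work internally: the target_coverage() function, the "BUG" magic bytes, and the fuzz() helper are all made up for illustration, and "coverage" is simply the number of leading bytes that match.

```c
#include <stdint.h>
#include <string.h>

#define INPUT_LEN 4
#define MAGIC_LEN 3

/* Hypothetical target: returns how many leading bytes match "BUG".
 * The count stands in for coverage feedback; MAGIC_LEN models the crash. */
int target_coverage(const uint8_t *input) {
    static const uint8_t magic[MAGIC_LEN] = { 'B', 'U', 'G' };
    int depth = 0;
    while (depth < MAGIC_LEN && input[depth] == magic[depth]) {
        depth++;
    }
    return depth;
}

/* Small xorshift32 generator so a run is reproducible for a given seed. */
uint32_t next_random(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns the iteration at which the "crash" was reached,
 * or -1 if the iteration budget ran out first. */
long fuzz(uint32_t seed, long max_iterations) {
    uint8_t best[INPUT_LEN] = { 0 };
    int best_coverage = target_coverage(best);
    uint32_t state = seed ? seed : 1u; /* xorshift state must be nonzero */

    for (long i = 0; i < max_iterations; i++) {
        uint8_t candidate[INPUT_LEN];
        memcpy(candidate, best, INPUT_LEN);

        /* Mutate one random byte of the best input seen so far. */
        uint32_t r = next_random(&state);
        candidate[(r >> 8) % INPUT_LEN] = (uint8_t)(r >> 16);

        int coverage = target_coverage(candidate);
        if (coverage == MAGIC_LEN) {
            return i; /* reached the vulnerable path */
        }
        if (coverage > best_coverage) {
            /* Keep inputs that get deeper into the target. */
            memcpy(best, candidate, INPUT_LEN);
            best_coverage = coverage;
        }
    }
    return -1;
}
```

Keeping only the mutations that increase coverage is what lets the loop find the three magic bytes one at a time instead of having to guess all of them at once.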

These two methods were the bread and butter of zero-day research, both demanding a high level of skill, deep technical knowledge, custom tooling and a sprinkle of luck. Fortunately, a new player has just come to town: AI-automated vulnerability research. Large Language Models (LLMs) are increasingly demonstrating value in bug hunting and exploitation, from decades-old kernel bugs to modern Windows zero-days and remote code execution (RCE) flaws in browsers. With no shortage of zero-day training material publicly available, it is reasonable to assume that any modern LLM has seen more vulnerable code than any single researcher could encounter in a lifetime.

Figure: Firefox security vulnerabilities by month

The craft hackers spent years perfecting is being reduced to a single prompt and a respectable token balance. This shift has the potential to make advanced vulnerability discovery more accessible to anyone with access to this technology, enabling the identification of critical flaws in major software systems. At the same time, threat actors are likely to leverage these capabilities to scale their bug-hunting and exploitation efforts, particularly state-sponsored groups with significant budgets and highly skilled personnel.

Many already wonder whether AI has become a more effective hacker than humans.

To find out, I conducted a small experiment. A few weeks ago I read an article on how Claude found a zero-day in radare2. The researcher pointed it at the code base, and after a few prompts Claude generated a working command-injection proof of concept (PoC). I found this very impressive, especially since the prompts were very straightforward. Interestingly, however, Claude failed to find memory corruption issues.

"Claude struggled to prove, let alone exploit, the memory corruption vulnerabilities. It did appear to be making progress on a "heap leak" issue"

"Claude tried hard to exploit various stack and heap buffer overflows, but failed to trigger a single ASAN crash."

I was curious why Claude didn’t find memory corruption issues. A reverse engineering framework like radare2, written primarily in C, would be expected to contain some bugs: perhaps not easily exploitable ones, but at least something leading to an AddressSanitizer (ASAN) crash.

As a memory corruption enthusiast, I decided to take a look for myself.

I cloned radare2 and ran some Semgrep rules to find low-hanging fruit before diving deeper and reading line by line. It is worth noting that I had never looked at this code base before and had absolutely no idea where to look.

As I stepped through the findings, I identified two vulnerabilities within a single code path: a pointer was freed, then read again, and then freed again.

The bug in gdbr_threads_list() was caused by a cleanup label, end, that frees the allocated npid list in case of an error.
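As an illustration only, and not the actual radare2 source, the sketch below shows one common way a cleanup label can produce a double free: an error branch releases the list and then jumps to a label that releases it again. All names (tracked_list, buggy_fill, fixed_fill, attempts_after) are hypothetical, and release attempts are counted rather than performed twice so the example stays free of undefined behaviour.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap-allocated list. Release attempts are
 * counted, and only the first one really calls free(). */
typedef struct {
    int *items;
    int release_attempts;
} tracked_list;

void tracked_release(tracked_list *list) {
    if (list->release_attempts++ == 0) {
        free(list->items);
    }
}

/* Buggy shape: the error branch frees the list, then jumps to a cleanup
 * label that frees it again because ret still signals an error. */
int buggy_fill(tracked_list *list, bool parse_fails) {
    int ret = -1;
    if (parse_fails) {
        tracked_release(list); /* first release */
        goto end;
    }
    ret = 0;
end:
    if (ret != 0) {
        tracked_release(list); /* second release on the same pointer */
    }
    return ret;
}

/* Fixed shape: the error branch leaves all cleanup to the label. */
int fixed_fill(tracked_list *list, bool parse_fails) {
    int ret = -1;
    if (parse_fails) {
        goto end;
    }
    ret = 0;
end:
    if (ret != 0) {
        tracked_release(list); /* the only release on the error path */
    }
    return ret;
}

/* Runs one variant and reports how often the list was released.
 * On success the caller owns the list and releases it exactly once. */
int attempts_after(int (*fill)(tracked_list *, bool), bool parse_fails) {
    tracked_list list = { malloc(4 * sizeof(int)), 0 };
    if (fill(&list, parse_fails) == 0) {
        tracked_release(&list);
    }
    return list.release_attempts;
}
```

Under these assumptions, attempts_after(buggy_fill, true) reports two release attempts, while attempts_after(fixed_fill, true) reports one. With a real free(), the second attempt is the kind of event ASAN flags as a double free.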

"Gold is where you find it"

This is the first and most important rule of looking for gold, and it applies directly to looking for vulnerabilities.

Reviewing a few lines above the vulnerable function, I found the exact same error-handling logic implemented in another function, gdbr_pids_list(). That made a second distinct use-after-free (UAF)/double-free vulnerability in 20 minutes.

These bugs are in the GDB remote debugging feature, which lets you use radare2 against a GDB server running a binary, so they could potentially be exploited remotely for code execution.

My question now was: how can the best LLM in the field, capable of finding all types of vulnerabilities in major software, miss something this trivial? Even a simple grep for free() would have led it to that code path.
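For readers who want to try that first pass themselves, the following is a rough C equivalent of running grep -n 'free(' over a source buffer. The function name find_free_lines() and its signature are invented for this sketch, and plain substring matching is a deliberately crude assumption: it also hits wrappers whose names end in free, as well as comments and strings.

```c
#include <stddef.h>
#include <string.h>

/* Stores up to max_hits 1-based line numbers of lines containing "free("
 * in hits, and returns the total number of matching lines.
 * hits may be a null pointer when max_hits is 0. */
int find_free_lines(const char *source, int *hits, int max_hits) {
    const char *line_start = source;
    int line = 1;
    int found = 0;

    while (*line_start != '\0') {
        const char *line_end = strchr(line_start, '\n');
        size_t len = line_end ? (size_t)(line_end - line_start)
                              : strlen(line_start);

        /* Search only inside the current line. */
        for (size_t i = 0; i + 5 <= len; i++) {
            if (memcmp(line_start + i, "free(", 5) == 0) {
                if (found < max_hits) {
                    hits[found] = line;
                }
                found++;
                break; /* one hit per line is enough */
            }
        }

        if (line_end == NULL) {
            break; /* last line had no trailing newline */
        }
        line_start = line_end + 1;
        line++;
    }
    return found;
}
```

Each reported line is only a starting point for manual review: the interesting cases are the ones where the same pointer can reach two of these calls on a single path.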

AI as an enabler, not a replacement

This brings us to the conclusion: this technology is a great scaler, giving access to a practically unlimited number of AI hacker agents that can find critical vulnerabilities across all types of software. However, given that it still misses obvious findings, it will not replace human researchers, especially elite hackers capable of writing magic like multi-stage exploit chains. Instead, AI should be viewed as another tool in the researcher’s arsenal, one that enhances productivity and expands coverage alongside deep expertise.

Saad Elharaj

Security engineer with hands-on SOC operations experience in threat detection, incident response, and enterprise infrastructure defense. Actively conducting security research across multiple domains.
