How AI hallucinations are making bug hunting harder

Bug bounty programs, which pay people for finding bugs, are a very useful tool for improving the security of software. But with the availability of artificial intelligence (AI), as seen in popular large language models (LLMs) like ChatGPT, Bard, and others, it looks like there is a new problem on the horizon.

Bounty hunters are using LLMs not only to translate or proofread their reports, but also to find bugs.

Daniel “Haxx” Stenberg of cURL explains in a blog post why he sees this as a possible problem. cURL is a software project that provides a library and command-line tool for transferring data using various network protocols. The name stands for Client for URL. Daniel is the original author and currently the lead developer.

He argues that, for some reason, bug bounty programs also attract fortune seekers who are looking for a quick buck without putting in the necessary work. According to Stenberg, developers could easily filter out these fortune seekers before they had access to LLMs.

The source of the problem lies in the bad habit some LLMs have of “hallucinating.” LLM hallucinations are events in which an LLM produces output that is coherent and grammatically correct but factually incorrect or nonsensical.

This is a problem for developers because they can often discard nonsensical reports from humans after only a short examination. But reports generated by AI look coherent, so they waste a lot more time before they can be dismissed.

In “The mystery of the CVEs that are not vulnerabilities” we saw how the automated submission of bugs had already raised issues that wasted a lot of developer time, time the developer would rather use to fix real bugs or work on new features. As Daniel put it:

“A security report can take away a developer from fixing a really annoying bug. because a security issue is always more important than other bugs. If the report turned out to be crap, we did not improve security and we missed out time on fixing bugs or developing a new feature. Not to mention how it drains you on energy having to deal with rubbish.”

This is especially annoying when the person who submitted the bug is unable to respond to additional queries in a way that clarifies the issue, which is often the case because that person has no idea why the LLM flagged the issue as a bug in the first place.

In several areas people are working on tools that can recognize content created by AI, but these are not a full solution to this particular problem. Bug bounty hunters also use LLMs to translate their submissions from their native language to English, which is often very helpful. If a recognition tool were to discard all those submissions, developers might end up ignoring a serious security vulnerability just because the bounty hunter wanted to submit a report in perfect English.

In the future, AI will undoubtedly prove to be useful in finding software bugs, but we expect these tools will be deployed by the developers themselves before the software goes live.

