Vulnhuntr, a static code analyzer that uses large language models (LLMs), discovered more than a dozen zero-day vulnerabilities within hours in popular open-source AI projects on GitHub with over 10,000 stars.
These vulnerabilities include Local File Inclusion (LFI), Cross-Site Scripting (XSS), Server-Side Request Forgery (SSRF), Remote Code Execution (RCE), Insecure Direct Object Reference (IDOR), and Arbitrary File Overwrite (AFO).
Vulnhuntr is a security tool that uses LLMs to discover remotely exploitable vulnerabilities in Python codebases. It works around the limited context window of LLMs by analyzing code in small chunks and requesting only the relevant parts as it needs them.
It then reconstructs the call chain from user input to server output to confirm vulnerabilities, using various prompt engineering techniques to guide the LLM toward a comprehensive analysis.
Although it is currently limited to Python and to a specific set of vulnerability classes, it offers a significant improvement over traditional static code analyzers in identifying complex multi-step vulnerabilities while keeping false positives and false negatives low.
Researchers explored Retrieval-Augmented Generation (RAG) and fine-tuning of large language models (LLMs) to identify vulnerability call chains in code.
RAG proved inaccurate because of ambiguity in function names, while fine-tuned models produced many false positives and struggled with multi-file vulnerabilities. Static parsing also presented challenges, particularly for dynamically typed languages like Python, because of runtime modifications and the limitations of static analysis tools.
The solution involved providing the LLM with the exact line of code where a function is called, along with the function name, which allows for targeted file and function location within the project, improving call chain accuracy.
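The lookup step described above can be illustrated with a minimal sketch. It assumes only the Python standard library and is not Vulnhuntr's actual implementation; the helper names `called_names` and `find_definitions` are hypothetical. Given the line where a call occurs, it extracts candidate function names and then searches the project for matching definitions.

```python
import ast
import re
from pathlib import Path


def called_names(call_line: str) -> list:
    """Return identifiers that are immediately followed by "(" on one source line.

    This is a rough regex heuristic (an assumption for illustration); it will
    also match keywords or class names written in call position.
    """
    return re.findall(r"([A-Za-z_]\w*)\s*\(", call_line)


def find_definitions(project_root: str, func_name: str) -> list:
    """Return (file path, line number) pairs for every `def func_name` in the project."""
    hits = []
    for path in sorted(Path(project_root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        for node in ast.walk(tree):
            is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            if is_func and node.name == func_name:
                hits.append((str(path), node.lineno))
    return hits
```

Several files may define a function with the same name, so a lookup by name alone can return more than one hit. The calling line supplies the extra context needed to choose among them, which a name-only search lacks.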
The tool detects vulnerabilities in Python code by analyzing the files that handle remote user input and identifying potential weaknesses. Users can run it on an entire repository or, for better efficiency, on specific files.
Vulnhuntr assigns confidence scores (1–10) to findings, with higher scores indicating a greater likelihood of an actual vulnerability.
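As a rough illustration of how such scores might be used during triage, the sketch below filters and sorts findings by confidence. The `Finding` record, the `triage` helper, and the default threshold of 7 are all assumptions made for this example; they do not reflect Vulnhuntr's real output format or any recommended cutoff.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical record for one reported issue."""
    file: str
    vuln_type: str
    confidence: int  # 1 (least likely) to 10 (most likely)


def triage(findings, threshold=7):
    """Keep findings at or above the threshold, highest confidence first."""
    kept = [f for f in findings if f.confidence >= threshold]
    return sorted(kept, key=lambda f: f.confidence, reverse=True)


findings = [
    Finding("api/upload.py", "AFO", 8),
    Finding("api/proxy.py", "SSRF", 9),
    Finding("web/views.py", "XSS", 4),
]
for f in triage(findings):
    print(f.confidence, f.vuln_type, f.file)
```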
In the future, as LLMs become more powerful, static code parsing might become less important, but focusing on the call chain between user input and server output should still improve accuracy in vulnerability detection.
Protect AI identified several critical vulnerabilities. One is an RCE vulnerability that allows attackers to execute arbitrary code because of a lack of input validation in custom component functionality.
SSRF vulnerabilities exist in multiple functions where user-controlled URLs are used without proper sanitization, potentially enabling internal resource access.
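To make the missing sanitization concrete, here is a minimal sketch of one common mitigation: resolving the host of a user-supplied URL and rejecting addresses that point at internal ranges. The function name `is_safe_url` is hypothetical, and this is not the fix applied in any of the affected projects. It also does not defend against DNS rebinding, because the hostname may resolve to a different address when the request is actually made.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_safe_url(url: str) -> bool:
    """Return True only for http(s) URLs whose host resolves to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable hosts are rejected
    for info in infos:
        # info[4][0] is the resolved address; drop any IPv6 scope suffix.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if addr.is_private or addr.is_loopback or addr.is_link_local or addr.is_reserved:
            return False
    return True
```

An allowlist of known-good hosts is generally stricter than this kind of denylist and is preferable where the set of legitimate targets is known in advance.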
An IDOR vulnerability in a PUT endpoint lets attackers modify messages they shouldn’t have access to, while LFI and AFO vulnerabilities arise from insufficient filename sanitization during file uploads.
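The filename issue can be illustrated with a short sketch of a defensive upload-path helper. It assumes uploads are written into a single directory; the name `safe_upload_path` is hypothetical and the code is not taken from any of the affected projects. It strips client-supplied directory components and then verifies that the resolved path still lies inside the upload directory.

```python
import os
from pathlib import Path


def safe_upload_path(upload_dir: str, filename: str) -> Path:
    """Map a client-supplied filename to a path inside upload_dir, or raise ValueError."""
    base = Path(upload_dir).resolve()
    # Drop any directory components, treating backslashes as separators too.
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError("invalid filename")
    target = (base / name).resolve()
    # After resolving symlinks, the target must still sit under the upload directory.
    if base not in target.parents:
        raise ValueError("path escapes upload directory")
    return target
```

With this approach, a name such as `../../etc/passwd` is reduced to `passwd` inside the upload directory instead of overwriting a file elsewhere on disk.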
XSS vulnerabilities stem from a lack of output encoding and unsafe handling of user-controlled content. These flaws pose a severe security risk and require immediate attention.