At Keysight, we continually explore new technologies that can help us better serve our customers. This includes ChatGPT, a natural language artificial intelligence (AI) chatbot that has been trained on a large portion of the knowledge available on the internet.
Critically minded individuals quickly exposed numerous flaws: it struggles with basic math, its safety mechanisms can be circumvented through prompt injection, and it frequently fabricates information while maintaining a convincingly authoritative tone. Our experts focused on exploring how ChatGPT can be used in source code analysis.
During the investigation, ChatGPT was asked to review a C file and identify vulnerabilities and other issues. The purpose was to determine how the ChatGPT language model could help our security analysts during code reviews. (No codebases under NDA were harmed during this research. ChatGPT is run as a service by OpenAI, which undoubtedly ingests, stores, and analyzes all conversations.)
We fed ChatGPT a C file with many known vulnerabilities. Overall, ChatGPT was able to identify some of them, including a hardcoded password, a potential command injection vulnerability, and a bug where only the first three characters of a password were checked. It also produced some false positives, at times fabricating source code that was not present in the original file. ChatGPT struggled to distinguish between command injection and buffer overflow vulnerabilities, but demonstrated solid general knowledge of memory exploitation and mitigation techniques. An interesting feature of ChatGPT is that it can summarize and explain code blocks in plain language, which can help a security analyst quickly gain a high-level understanding of a code block.
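For illustration only, a minimal C sketch containing the same classes of issues (this is not the file we reviewed, and the names are hypothetical) might look like this:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Deliberately vulnerable illustration, not the reviewed code. */
#define ADMIN_PASSWORD "s3cr3t"   /* hardcoded credential */

static int check_password(const char *input)
{
    /* Bug: only the first three characters are compared. */
    return strncmp(input, ADMIN_PASSWORD, 3) == 0;
}

static void ping_host(const char *host)
{
    char cmd[128];
    /* Command injection: 'host' is passed to a shell unsanitized,
     * so input such as "127.0.0.1; rm -rf /" runs extra commands. */
    snprintf(cmd, sizeof(cmd), "ping -c 1 %s", host);
    system(cmd);
}

int main(int argc, char **argv)
{
    if (argc > 2 && check_password(argv[1]))
        ping_host(argv[2]);
    return 0;
}
```

A reviewer (human or model) looking at code like this should flag the hardcoded password, the truncated comparison, and the unsanitized call to system() as separate findings.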
In addition to reviewing code, ChatGPT was able to take a crude description of a vulnerability and expand it into a mostly correct finding description. It provided risk ratings that matched those given by a human analyst in 4 out of 4 cases and offered some decent suggestions for countermeasures. ChatGPT also demonstrated some background knowledge of fault injection and, after being given more specific information, generated example code implementing a flow-integrity countermeasure.
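To give a sense of what such a countermeasure typically looks like, here is a minimal sketch of a common flow-integrity pattern against instruction-skip faults; this is our own illustrative example, not the code ChatGPT produced:

```c
#include <stdio.h>
#include <stdlib.h>

/* Each step adds a distinct value to a counter; the final value is
 * verified before the sensitive action. If a fault causes a step to
 * be skipped, the counter will not match the expected total. */
static volatile unsigned int flow_counter;

static void fault_detected(void)
{
    /* In a real product this would enter a secure error state. */
    abort();
}

static int secure_operation(void)
{
    flow_counter = 0;

    flow_counter += 1;   /* step 1: e.g. verify inputs   */
    flow_counter += 3;   /* step 2: e.g. authenticate    */
    flow_counter += 5;   /* step 3: e.g. unlock feature  */

    if (flow_counter != 9)   /* expected total of all steps */
        fault_detected();

    return 0;
}

int main(void)
{
    printf("secure_operation returned %d\n", secure_operation());
    return 0;
}
```

Using unequal increments and a volatile counter makes it harder for a single skipped or corrupted instruction to leave the counter at the expected value by accident.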
Overall, while ChatGPT has some potential as a tool for security analysts, it is not currently accurate enough to fully rely on for code review. A security analyst can use it as a tool, as long as they understand its limitations. The main limitation at present is that ChatGPT runs as a hosted service that logs data and therefore cannot be used on any confidential code.
However, it may be useful for efficiently writing reports and explaining functions and modules in human language, which could potentially speed up reverse engineering efforts. It may also be interesting for developers to explore its code-generation capabilities. Code review using language models like ChatGPT is an active area of research, and such tools could be improved over time to be more accurate and reliable.
Do you have any questions about using artificial intelligence for security and source code analysis? Contact us at [email protected].