6.5 C
New York
Tuesday, February 25, 2025

Configuration options in BreakingPoint for running LLM prompt injection strikes


As part of the recent ATI release ATI-2024-19, ATI introduced a new set of capabilities in the BreakingPoint product, as part of the security engine mainly focusing on LLM prompt injection. There are new strikes focused on LLM prompt injection together with new options (“evasions”) to configure those Strikes.

The purpose of this blog is to present various strikes and options available as part of this release.

The Strikes intend to force an LLM model to behave in some different way than it was previously instructed to do. The target of those attacks would be to gain information about nefarious topics and actions (“how to make a weapon”), force the model to output the initial instructions or so on. To run these Strikes a user has several options which will explore briefly in the following paragraphs.

Preconfigured test

The easiest way to get started running these Strikes would be to use a preconfigured test. There are 3 preconfigured tests ready to run as shown below.


Fig 1. AI LLM Prompt precanned tests in BPS

Running one of these tests would result in traffic being sent which is similar to the traffic shown in the image below.


Fig. 2 Network PCAP of the HTTP AI application data

Going further we will dive deeper into these preconfigured tests in order to get a better understanding of what is going on. Depending on the test selected from the selection presented above, BPS will send traffic crafting packets specifically to target either OpenAI HTTP API endpoint, Gemini HTTP API endpoint or all the transports – for now the two mentioned before. The strike list used by these tests is the same, covering all the strikes which have been developed so far for this use case. We will look more into the list of strikes later. For now, we will focus on the evasion profile used by these strikes as this is the main difference between them.

Preconfigured evasion profiles

The list of all the LLM specific evasion profiles is shown below.


Fig. 3 AI LLM Evasion Profiles in the BPS

Depending on which preconfigured test selected, the corresponding evasion profile will be used. The main point of difference between these evasion profiles is the LLM option group. Below you can find the options for the three preconfigured evasion profiles.


Fig. 4 LLM evasions for “All” TransportLayer (field)


Fig. 5 LLM evasions for “Gemini API TransportLayer (field)


Fig. 6 LLM evasions for “OpenAI API” TransportLayer (field)

Preconfigured StrikeList

There is only one preconfigured StrikeList for Prompt Injection Strikes and that is AI LLM Prompt Injection Attacks.

This StrikeList includes various prompt injections from the DAN family (Do Anything Now) and one special Strike which sends random prompts from this category. You can see the details for this Strike list below.


Fig. 7 List of ”prompt_injection” Strikes—all of the Strikes in the ”AI LLM Prompt Injection Attacks” StrikeList

Prompt Injection Strikes

Depending on the Strike selected BPS will send a specific prompt and a question for those prompts which support including a question. Take as example DAN-Evil-Bot-Prompt and DAN-STAN-Prompt.


Fig. 8 PCAP of the Strike for a DAN prompt with a static prompt in the sense that there are no additional questions included in this prompt.


Fig. 9 PCAP of the Strike for a DAN prompt with an additional forbidden question added (highlighted).

Depending on the evasion options used this can customize the traffic sent including the category for the forbidden question used.

Evasion Options

The options for the LLM Prompt Injection Strikes are under a dedicated option group called LLM. This section has the following options: TransportLayer, BearerToken, GoogleApiKey, GoogleApiKeyInHeader, Model, ApiVersion, ForbiddenQuestionCategory, NumOfPrompts, Temperature, MaxOutputTokens and topP.


Fig. 10. LLM evasion options in the BPS

TransportLayer

The Transport Layer option decides how the prompt will be delivered over the network; this is referring to the application service option, not the L4 transport. Currently three options available: OpenAI API over HTTP, GemeniAPI over HTTP and All. The first two are transport standalone where the option All instructs the engine to iterate over all the transport and to send the prompt using all of them. There will be more to come in the future.


Fig. 11 TransportLayer Evasion field option values


Fig. 12 PCAP of the OpenAI API over HTTP application traffic


Fig. 13 PCAP of GemeniAPI over HTTP application traffic

BearerToken, GoogleApiKey and GoogleApiKeyInHeader

As both OpenAI API and Gemini API require some sort of authentication, the BearerToken and GoogleApiKey evasion options allow users to specify some specific value for those keys instead of randomly generating them.

The option GoogleApiKeyInHeader allows to select whether the API key will be sent in the first header or as dedicated header. The difference between the two can be seen in the screenshots below.


Fig. 14 PCAP showing GoogleApiKeyInHeader


Fig. 15 PCAP showing GoogleApiKeyInHeader in separate header

Model and ApiVersion

The Model and ApiVersion evasion options are strings used to represent which model to be used and which API version of the provider. The change the path or the body of the requests depending on the Transport used.

ForbiddenQuestionCategory

Some of the prompts allow additional questions to be sent. These prompts are selected from a specific category. The option which allows selecting these categories is ForbiddenQuestionCategory. Note that All is a special category which informs the engine that any of the other categories can be used for selection.


Fig. 16 The optional categories for additional forbidden questions. NumOfPrompts

We mentioned before that there is one special strike which allows sending random prompts. This strike has one additional option called NumOfPrompts which allows selecting how many prompts it should send.

Temperature, MaxOutputTokens and topP

The last options are Temperature, MaxOutputTokens and topP. These are hyperparameters of the Large Language Model used to modify the behavior of the text generation.

Leverage Subscription Service to Stay Ahead of Attacks

Keysight’s Application and Threat Intelligence subscription provides daily malware and bi-weekly updates of the latest application protocols and vulnerabilities for use with Keysight test platforms. The ATI Research Centre continuously monitors threats as they appear in the wild. Customers of BreakingPoint now have access to attack campaigns for different advanced persistent threats, allowing BreakingPoint Customers to test their currently deployed security control’s ability to detect or block such attacks.



Source link

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles