Detects attempts to manipulate or bypass the intended behaviour of language models through carefully crafted inputs. This evaluation is crucial for ensuring the security and reliability of AI systems by identifying potential security vulnerabilities in prompt handling.
input
: The user-provided prompt column to be analysed for injection attempts.input
.input
, requiring mitigation.Click here to learn how to setup evaluation using SDK.
Input Type | Parameter | Type | Description |
---|---|---|---|
Required Inputs | input | string | The user-provided prompt to be analysed for injection attempts. |
Output | Type | Description |
---|---|---|
Result | bool | Returns 1.0 if no prompt injection is detected (Passed), 0.0 if prompt injection is detected (Failed). |