Cloudflare announces Firewall for AI

Firewall for AI will analyze user prompts to large language models to identify attempts to extract data or otherwise exploit a model, Cloudflare said.

Editor at Large, InfoWorld |

Cloudflare has announced the development of Firewall for AI, a protection layer that can be deployed in front of large language models (LLMs) that promises to identify abuses before they reach the models.

Unveiled March 4, Firewall for AI is intended to be an advanced web application firewall (WAF) for applications that use LLMs, comprising a set of tools that can be deployed in front of applications to detect vulnerabilities and provide visibility into the threats to models.

Cloudflare said Firewall for AI will combine traditional WAF tools such as rate limiting and sensitive data detection with a new protection layer that analyzes the model prompts submitted users to identify attempts to exploit the model. Firewall for AI will run on the Cloudflare network, enabling Cloudflare to identify attacks early and protect users and models from attacks and abuses, the company said. The product is currently under development.

Some vulnerabilities that affect traditional web and API applications, such as injections and data exfiltration, also apply to the LLM world. But a new set of threats is now relevant because of how LLMs work. For example, researchers recently discovered a vulnerability in an AI collaboration platform that allowed them to hijack models and conduct unauthorized actions, Cloudflare said.

Cloudflare’s Firewall for AI will be deployed like a traditional WAF, in which every API request with an LLM prompt is scanned for patterns and signatures of possible attacks. It can be deployed in front of models hosted on the Cloudflare Workers AI platform or models hosted on any third-party infrastructure. Also, it can be used alongside Cloudflare AI Gateway.

Firewall for AI will run a series of detections designed to identify prompt injection attempts and other abuses, such as making sure the topic of the prompt stays within boundaries defined by the model owner. Firewall for AI also will look for prompts embedded in HTTP requests or allow customers to make rules based on where in the JSON body of the request that the prompt can be found.

Once enabled, Firewall for AI will analyze every prompt and provide a score based on the likelihood that it is malicious, Cloudflare said.

Next read this:

Paul Krill is an editor at large at InfoWorld, whose coverage focuses on application development.