Cloudflare, the renowned content delivery network and cloud security platform, aims to make artificial intelligence (AI) accessible to developers worldwide. Cloudflare has introduced GPU-powered infrastructure and model-serving capabilities that leverage its edge network to bring state-of-the-art foundation models to the masses. With a simple REST API call, any developer can tap into Cloudflare’s AI platform, marking a significant step towards democratizing AI.
The evolution of Cloudflare’s edge network
In 2017, Cloudflare introduced Workers, a serverless computing platform at the edge. This innovative platform enables developers to create JavaScript Service Workers that run directly in Cloudflare’s edge locations across the globe. With Workers, developers can modify a site’s HTTP requests and responses, make parallel requests, and respond directly from the edge. This approach simplifies web development and enhances performance, aligning with the W3C Service Workers standard.
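For orientation, here is a minimal Worker in the modern module syntax that responds directly from the edge on one path and adds a header to the origin’s response otherwise; the route and header name are illustrative placeholders, not taken from Cloudflare’s documentation.

```javascript
// A minimal Worker sketch: respond from the edge on one path, otherwise
// forward the request and modify the origin's response.
export default {
  async fetch(request) {
    const url = new URL(request.url);

    // Respond directly from the edge (hypothetical path).
    if (url.pathname === "/edge-hello") {
      return new Response("Hello from the edge!", {
        headers: { "content-type": "text/plain" },
      });
    }

    // Otherwise pass the request through and tweak the response headers.
    const originResponse = await fetch(request);
    const response = new Response(originResponse.body, originResponse);
    response.headers.set("x-served-by", "cloudflare-worker"); // illustrative header
    return response;
  },
};
```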
AI integration with Cloudflare Workers
The rise of generative AI has prompted Cloudflare to augment its Workers platform with AI capabilities. Cloudflare’s AI integration consists of three key elements:
1. Workers AI: This component runs serverless AI models on NVIDIA GPUs within Cloudflare’s global network (a minimal sketch follows this list). With a pay-as-you-go model, users can focus on their applications instead of infrastructure management, making AI more accessible and cost-effective.
2. Vectorize: Cloudflare’s vector database, Vectorize, facilitates rapid and cost-effective vector indexing and storage. It supports use cases that require access to operational models and customized data, adding versatility to AI applications.
3. AI Gateway: The AI Gateway empowers organizations to cache responses, rate limit requests, and monitor their AI deployments across hosting environments, improving observability while reducing costs and keeping application performance high.
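As a rough sketch of the first element, the Worker below calls a Workers AI text model through an AI binding; the binding name (env.AI) and the model identifier are assumptions that depend on your wrangler configuration and the catalog available to your account.

```javascript
// A minimal Workers AI sketch, assuming an AI binding named `AI` in wrangler.toml
// and a Llama 2 chat model in the catalog; both names are assumptions.
export default {
  async fetch(request, env) {
    const result = await env.AI.run("@cf/meta/llama-2-7b-chat-int8", {
      prompt: "Explain edge computing in one sentence.",
    });
    // Text-generation models typically return their output in a `response` field.
    return Response.json({ answer: result.response });
  },
};
```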
Strategic partnerships and model catalog
Cloudflare has forged strategic partnerships with industry leaders, including NVIDIA, Microsoft, Hugging Face, Databricks, and Meta, to bring GPU infrastructure and foundation models to its edge network. The platform also hosts embedding models to convert text to vectors. These vectors are stored, indexed, and queried using Vectorize, adding context to large language models (LLMs) and reducing response hallucinations. The AI Gateway enhances performance by providing observability, rate limiting, and caching for frequently queried AI models.
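A sketch of that embed-and-query flow follows, assuming an AI binding (env.AI), a Vectorize index binding (env.VECTOR_INDEX), and a BGE embedding model; all of these names are illustrative rather than prescribed by the article.

```javascript
// Sketch: turn a question into a vector, then find the closest stored documents
// in Vectorize to use as context for an LLM. Binding and model names are assumed.
export default {
  async fetch(request, env) {
    const question = "How do I cache API responses at the edge?";

    // 1. Convert text to a vector with an embedding model.
    const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: [question],
    });
    const vector = embedding.data[0];

    // 2. Query the Vectorize index for the nearest neighbors.
    const { matches } = await env.VECTOR_INDEX.query(vector, { topK: 3 });

    // 3. The matches would then be folded into the LLM prompt as added context
    //    (prompt assembly is omitted in this sketch).
    return Response.json(matches);
  },
};
```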
Cloudflare’s model catalog for Workers AI boasts the latest and most advanced foundation models. From Meta’s Llama 2 to Stable Diffusion XL to Mistral 7B, developers can access a comprehensive suite of tools to build modern applications powered by generative AI.
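To illustrate the breadth of the catalog, here is a sketch that generates an image with a Stable Diffusion XL model through the same AI binding; the model identifier and the assumption that the model returns PNG image bytes should be checked against the current Workers AI catalog.

```javascript
// Sketch: text-to-image with a Stable Diffusion XL model from the catalog.
// The model identifier and image content type are assumptions to verify.
export default {
  async fetch(request, env) {
    const image = await env.AI.run("@cf/stabilityai/stable-diffusion-xl-base-1.0", {
      prompt: "an isometric illustration of a global edge network",
    });
    return new Response(image, {
      headers: { "content-type": "image/png" },
    });
  },
};
```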
Optimizing AI models with ONNX Runtime
Behind the scenes, Cloudflare utilizes ONNX Runtime, an open-source inference engine for Open Neural Network Exchange (ONNX) models led by Microsoft, to optimize model execution in resource-constrained environments. This technology, also employed by Microsoft for running foundation models in Windows, enables efficient AI deployment across diverse environments.
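Outside of Cloudflare’s stack, the same runtime can be exercised directly; the Node.js sketch below loads and runs an ONNX model with the onnxruntime-node package, where the model file, input name, and tensor shape are all placeholders.

```javascript
// Generic ONNX Runtime sketch (not Cloudflare-specific), using onnxruntime-node.
// "model.onnx", the input name, and the tensor shape are placeholders.
const ort = require("onnxruntime-node");

async function main() {
  const session = await ort.InferenceSession.create("model.onnx");
  const input = new ort.Tensor("float32", Float32Array.from([1, 2, 3, 4]), [1, 4]);
  const results = await session.run({ input });
  console.log(results);
}

main().catch(console.error);
```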
Developers have multiple options for integrating AI into their applications through Cloudflare. While JavaScript can be used to write AI inference code and deploy it to Cloudflare’s edge network, developers can also invoke AI models via a simple REST API using their language of choice. This flexibility makes it seamless to infuse generative AI into web, desktop, and mobile applications, regardless of the environment.
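As a sketch of the REST path, the snippet below calls a Workers AI model from any JavaScript environment using fetch; the /ai/run/{model} endpoint follows Cloudflare’s published API shape, while the account ID and API token variables and the model identifier are placeholders.

```javascript
// Sketch: invoke a Workers AI model over REST from outside a Worker.
// CF_ACCOUNT_ID and CF_API_TOKEN are placeholder environment variables.
const accountId = process.env.CF_ACCOUNT_ID;
const apiToken = process.env.CF_API_TOKEN;

async function ask(prompt) {
  const res = await fetch(
    `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/meta/llama-2-7b-chat-int8`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ prompt }),
    }
  );
  const data = await res.json();
  // On success, the API wraps the model output in a `result` object.
  return data.result?.response;
}

ask("What is an edge network?").then(console.log);
```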
Expanding Workers AI globally
In September 2023, Cloudflare launched Workers AI with inference capabilities in seven cities. However, the company has set an ambitious goal to support Workers AI inference in 100 cities by the end of the year, with near-ubiquitous coverage expected by the end of 2024. This rapid expansion ensures that developers worldwide can harness the power of AI at the edge.
Cloudflare is among the first content delivery network (CDN) and edge network providers to enhance its edge network with AI capabilities, powered by GPU-enabled Workers AI, Vectorize, and the AI Gateway. Collaborating with tech giants like Meta and Microsoft, Cloudflare offers a diverse model catalog and leverages ONNX Runtime for optimization. This strategic move not only positions Cloudflare as an industry leader but also paves the way for the democratization of AI at the edge.
Cloudflare’s commitment to making AI accessible to developers represents a significant milestone in the AI landscape. By seamlessly integrating AI capabilities into its edge network, Cloudflare empowers developers to leverage generative AI in their applications, driving innovation and expanding AI’s reach. With a focus on global expansion, Cloudflare is poised to lead the charge in democratizing AI for the benefit of all developers and users.