Date: 2024-12-10

AI Accelerator

AI Accelerator is a semantic caching solution for large language model (LLM) APIs used in generative artificial intelligence (AI) applications.

AI Accelerator caches LLM queries based on semantic similarity, which allows flexible caching of query results and faster response times for end users. It works as a pass-through cache and is compatible with multiple LLM products. You use your Fastly credentials to call supported LLM APIs, and queries are cached automatically. AI Accelerator passes through any responses, errors, or rate limits from your LLM provider.
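To illustrate the general idea of semantic caching described above, here is a minimal Python sketch. It is not Fastly's implementation: the `SemanticCache` class, the `embed` function, and the 0.8 similarity threshold are hypothetical names and values chosen for illustration. The toy bag-of-words `embed` stands in for a real embedding model, which would also match prompts that are worded differently but mean the same thing.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().replace("?", "").split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Hypothetical pass-through cache keyed on prompt similarity."""

    def __init__(self, call_llm, threshold=0.8):
        self.call_llm = call_llm    # function that queries the LLM provider
        self.threshold = threshold  # assumed cutoff for "similar enough"
        self.entries = []           # list of (embedding, response) pairs

    def query(self, prompt):
        """Return (response, cache_hit) for the given prompt."""
        vector = embed(prompt)
        best_score, best_response = 0.0, None
        for cached_vector, cached_response in self.entries:
            score = cosine_similarity(vector, cached_vector)
            if score > best_score:
                best_score, best_response = score, cached_response
        if best_response is not None and best_score >= self.threshold:
            return best_response, True
        # Cache miss: pass the query through to the provider. If the
        # provider raises an error, it propagates and nothing is cached.
        response = self.call_llm(prompt)
        self.entries.append((vector, response))
        return response, False
```

In this sketch, two prompts that differ only in capitalization or punctuation produce the same vector, so the second is served from the cache without calling the provider again. An unrelated prompt falls below the threshold and is passed through.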

Fastly AI Accelerator can help:

Reduce user wait times between query and response

Reduce fees by making fewer direct queries to LLM services

Prerequisites

To use AI Accelerator, you must have a contract for Fastly services or a valid credit card on file.

On most accounts, anyone assigned the role of superuser can purchase this product from the Fastly control panel. If you have not been assigned that role, you can use the control panel to request that a superuser purchase it for you.

Limitations and considerations

AI Accelerator has been designed to be vendor neutral, so it works with many different LLM providers, but you must have valid credentials for at least one supported LLM service. Fastly maintains a list of supported LLM providers. LLM providers are third-party technologies subject to their own terms and conditions, for which you are responsible.

Billing

AI Accelerator is an add-on and is priced in addition to Fastly services. Billing for Fastly AI Accelerator is based on the total number of incoming requests to AI Accelerator each month.

You are responsible for any fees associated with the LLMs you choose. These fees will not be a part of your Fastly bill.