Axerity Platform

Cache API

Cache API responses to reduce costs and latency

The Cache API allows you to store and retrieve AI responses, reducing costs on repeated requests and improving latency.

Cost Savings

A cache hit costs 0.01 credits instead of the standard model price for the request. Writing to the cache is free.

Encrypted Storage

All cached data is encrypted at rest using AES-256 encryption.

How It Works

  1. Write: Store a response with a unique key
  2. Read: Retrieve cached responses by key
  3. Auto-expire: Responses expire when their TTL elapses (default: 7 days)
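
The three steps above combine naturally into a read-through pattern: try the cache, and only on a miss compute the response and write it back. The following is a minimal sketch, not an official client. It assumes Node 18+ (global fetch) and that a miss is reported as "hit": false in a JSON body, as in the Retrieve examples in this document. The names getOrCompute, compute, and fetchImpl are hypothetical and exist only for illustration.

```javascript
const BASE_URL = "https://axerity.com/api/v1/cache";

// Read-through helper (hypothetical): return the cached value on a hit;
// otherwise compute it, store it with the given TTL in seconds, and return it.
// `fetchImpl` is injectable so the flow can be exercised without network access.
async function getOrCompute(key, compute, { apiKey, ttl = 604800, fetchImpl = fetch } = {}) {
  const url = `${BASE_URL}/${encodeURIComponent(key)}`;
  const auth = { Authorization: `Bearer ${apiKey}` };

  // Step 2 of the list (Read): look the key up first.
  const readRes = await fetchImpl(url, { headers: auth });
  const cached = await readRes.json();
  if (cached.hit) return cached.value;

  // Miss: produce the response, e.g. by calling the model.
  const value = await compute();

  // Step 1 of the list (Write): store it; the default ttl is 7 days in seconds.
  await fetchImpl(url, {
    method: "PUT",
    headers: { ...auth, "Content-Type": "application/json" },
    body: JSON.stringify({ value, ttl }),
  });
  return value;
}
```

A call might look like getOrCompute("greeting-response", () => callModel(), { apiKey: "YOUR_API_KEY" }), where callModel is whatever function produces the uncached response. Error handling for non-JSON or failed responses is deliberately left out of this sketch.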

Endpoints

Method  Endpoint            Description
GET     /api/v1/cache/:key  Retrieve cached item
PUT     /api/v1/cache/:key  Store item in cache
DELETE  /api/v1/cache/:key  Delete cached item
GET     /api/v1/cache       List all cached keys

Store (PUT)

PUT https://axerity.com/api/v1/cache/:key

Request Body

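The body is a JSON object. The example below sends the following fields:

Field  Type (in the example)  Description
value  object                 The response to cache
ttl    number                 Time to live in seconds (86400 = 1 day)
tags   array of strings       Labels for the item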

Example

curl -X PUT https://axerity.com/api/v1/cache/greeting-response \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "value": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "ttl": 86400,
    "tags": ["greetings", "common"]
  }'

Response

{
  "key": "greeting-response",
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-01-16T10:30:00Z",
  "size_bytes": 52
}
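
The same write can be issued from JavaScript. The sketch below only builds the arguments for fetch(url, init) and shows how expires_at in the sample response relates to created_at and ttl. The helper names buildPutRequest and expectedExpiry are hypothetical, and omitting ttl is assumed to fall back to the documented 7-day default.

```javascript
// Build the arguments for fetch(url, init) for a cache write (hypothetical helper).
function buildPutRequest(key, value, { apiKey, ttl, tags } = {}) {
  const body = { value };
  if (ttl !== undefined) body.ttl = ttl;   // seconds
  if (tags !== undefined) body.tags = tags;
  return {
    url: `https://axerity.com/api/v1/cache/${encodeURIComponent(key)}`,
    init: {
      method: "PUT",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}

// Expected expiry: created_at plus ttl seconds, as in the sample response.
function expectedExpiry(createdAt, ttlSeconds) {
  const iso = new Date(Date.parse(createdAt) + ttlSeconds * 1000).toISOString();
  // toISOString() always includes milliseconds; drop them to match the format shown.
  return iso.replace(".000Z", "Z");
}
```

With the values from the example, expectedExpiry("2024-01-15T10:30:00Z", 86400) gives "2024-01-16T10:30:00Z", which matches the expires_at field in the response.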

Retrieve (GET)

GET https://axerity.com/api/v1/cache/:key

Example

curl https://axerity.com/api/v1/cache/greeting-response \
  -H "Authorization: Bearer YOUR_API_KEY"

Response (Cache Hit)

{
  "key": "greeting-response",
  "value": {
    "role": "assistant",
    "content": "Hello! How can I help you today?"
  },
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-01-16T10:30:00Z",
  "hit": true
}

Response (Cache Miss)

{
  "key": "greeting-response",
  "hit": false
}
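
Client code has to branch on the hit field, since a miss carries no value. A minimal sketch of that branching follows. The helper name readCacheResult is hypothetical, and it assumes expires_at is always present on a hit, as in the sample above.

```javascript
// Interpret the JSON body returned by GET /api/v1/cache/:key (hypothetical helper).
// Returns the cached value and the seconds left until expiry on a hit, or null on a miss.
function readCacheResult(body, now = new Date()) {
  if (!body.hit) return null;
  const msLeft = Date.parse(body.expires_at) - now.getTime();
  const secondsLeft = Math.max(0, Math.floor(msLeft / 1000));
  return { value: body.value, secondsLeft };
}
```

Returning null on a miss lets the caller fall through to the model call with a simple truthiness check; secondsLeft can help decide whether an entry is worth refreshing early.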

Delete (DELETE)

DELETE https://axerity.com/api/v1/cache/:key

Example

curl -X DELETE https://axerity.com/api/v1/cache/greeting-response \
  -H "Authorization: Bearer YOUR_API_KEY"

List Keys (GET)

GET https://axerity.com/api/v1/cache

Query Parameters

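The example below uses the following parameters:

Parameter  Type (in the example)  Description
tag        string                 Return only keys with this tag
limit      number                 Maximum number of keys to return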

Example

curl "https://axerity.com/api/v1/cache?tag=greetings&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "keys": [
    {
      "key": "greeting-response",
      "created_at": "2024-01-15T10:30:00Z",
      "expires_at": "2024-01-16T10:30:00Z",
      "size_bytes": 52,
      "tags": ["greetings", "common"]
    }
  ],
  "total": 1,
  "storage_used_bytes": 52
}
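
A short sketch of working with this endpoint from JavaScript: one helper builds the query URL, the other totals size_bytes per tag from a response shaped like the one above. Both helper names (buildListUrl, bytesByTag) are hypothetical, and only the tag and limit parameters from the curl example are assumed to exist.

```javascript
// Build the list URL with optional tag and limit query parameters (hypothetical helper).
function buildListUrl({ tag, limit } = {}) {
  const url = new URL("https://axerity.com/api/v1/cache");
  if (tag !== undefined) url.searchParams.set("tag", tag);
  if (limit !== undefined) url.searchParams.set("limit", String(limit));
  return url.toString();
}

// Sum size_bytes per tag from a list response; an item with several tags
// counts toward each of them.
function bytesByTag(listResponse) {
  const totals = {};
  for (const item of listResponse.keys) {
    for (const tag of item.tags || []) {
      totals[tag] = (totals[tag] || 0) + item.size_bytes;
    }
  }
  return totals;
}
```

Using the URL API rather than string concatenation keeps tag values with spaces or special characters correctly encoded.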

Best Practices

Use Consistent Keys

Create deterministic keys based on your request parameters:

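// Node.js built-in module; provides createHash
const crypto = require("crypto");

// Derive a deterministic cache key from the request parameters.
// Identical model + messages always produce the same key.
function getCacheKey(model, messages) {
  const hash = crypto
    .createHash("sha256")
    .update(JSON.stringify({ model, messages }))
    .digest("hex")
    .slice(0, 16); // first 16 hex chars (64 bits) keep the key short
  return `chat-${model}-${hash}`;
}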
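
One caveat with hashing JSON.stringify output is that it depends on property order, so { role, content } and { content, role } would produce different keys for the same request. If message objects may be built in different orders, a variant that sorts object keys before hashing avoids that. The sketch below is one possible approach, not part of the Axerity API; stableStringify and getStableCacheKey are hypothetical names, and inputs are assumed to be plain JSON-serializable data.

```javascript
const crypto = require("crypto");

// Serialize with object keys sorted recursively, so property order
// does not affect the output. Assumes plain JSON-serializable input.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

// Same key scheme as getCacheKey, but insensitive to property order.
function getStableCacheKey(model, messages) {
  const hash = crypto
    .createHash("sha256")
    .update(stableStringify({ model, messages }))
    .digest("hex")
    .slice(0, 16);
  return `chat-${model}-${hash}`;
}
```

Array order is preserved on purpose, because the order of messages in a conversation is meaningful. If model names can contain characters such as "/", the key should be passed through encodeURIComponent before it is placed in the URL path.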

Cache Common Queries

Ideal candidates for caching:

  • FAQ responses
  • Greeting messages
  • Static information lookups
  • Repeated summarization tasks

Set Appropriate TTL

Use Case           Recommended TTL
Static content     30 days
FAQ responses      7 days
Dynamic summaries  1 day
Real-time data     Don't cache
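
Because the ttl field in the Store example is given in seconds (86400 for one day), the recommendations above can be kept as constants. This is a sketch only; the use-case labels and the ttlFor helper are hypothetical names chosen for illustration.

```javascript
const DAY_SECONDS = 86400;

// Recommended TTLs from the table, converted to seconds.
// null means "do not cache".
const RECOMMENDED_TTL = {
  static: 30 * DAY_SECONDS,
  faq: 7 * DAY_SECONDS,
  summary: 1 * DAY_SECONDS,
  realtime: null,
};

// Look up the TTL for a use case; throws on an unknown label
// so typos do not silently fall back to a default.
function ttlFor(useCase) {
  if (!(useCase in RECOMMENDED_TTL)) {
    throw new Error(`Unknown use case: ${useCase}`);
  }
  return RECOMMENDED_TTL[useCase];
}
```

A caller would skip the write entirely when ttlFor returns null.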

Pricing

Operation   Cost
Cache hit   0.01 credits
Cache miss  0 credits
Write       Free
Delete      Free
Storage     $0.10/GB/month

Free Tier

All users get 100 MB of free cache storage.
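
As a rough worked example, the prices above can be turned into a monthly estimate. Two points are assumptions here, not statements from this page: that the 100 MB free tier is subtracted from stored data before billing, and that 1 GB is counted as 1000 MB. The function name estimateMonthly is hypothetical.

```javascript
const HIT_COST_CREDITS = 0.01;   // per cache hit
const STORAGE_USD_PER_GB = 0.10; // per GB per month
const FREE_TIER_MB = 100;
const MB_PER_GB = 1000;          // assumption: decimal gigabytes

// Estimate monthly cost from hit count and stored megabytes.
// Misses, writes, and deletes cost nothing, so they are not inputs.
function estimateMonthly({ hits, storedMB }) {
  const billableMB = Math.max(0, storedMB - FREE_TIER_MB); // assumption: free tier is deducted
  return {
    hitCredits: hits * HIT_COST_CREDITS,
    storageUSD: (billableMB / MB_PER_GB) * STORAGE_USD_PER_GB,
  };
}
```

Under these assumptions, 10,000 hits and 600 MB stored come to about 100 credits for hits and about $0.05 for the 500 MB of billable storage.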
