Cache API
Cache API responses to reduce costs and latency
The Cache API allows you to store and retrieve AI responses, reducing costs on repeated requests and improving latency.
Cost Savings
A cache hit costs 0.01 credits instead of the standard model price. Writing to the cache is free.
Encrypted Storage
All cached data is encrypted at rest using AES-256 encryption.
How It Works
- Write: Store a response with a unique key
- Read: Retrieve cached responses by key
- Auto-expire: Responses expire once their TTL elapses (default: 7 days)
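The three steps above can be pictured with a small in-memory model. This is only a local sketch of the behavior, not the service itself: the `CacheModel` class, its injectable clock, and the lazy expiry on read are assumptions made for illustration.

```javascript
// Local, in-memory model of write / read / auto-expire.
// Illustration only: it never calls the real Cache API.
const DEFAULT_TTL_SECONDS = 7 * 24 * 60 * 60; // default TTL of 7 days

class CacheModel {
  constructor(now = () => Date.now()) {
    this.now = now;         // injectable clock, in milliseconds
    this.items = new Map(); // key -> { value, expiresAt }
  }

  // Write: store a value under a unique key with a TTL in seconds.
  put(key, value, ttlSeconds = DEFAULT_TTL_SECONDS) {
    const expiresAt = this.now() + ttlSeconds * 1000;
    this.items.set(key, { value, expiresAt });
    return { key, expiresAt };
  }

  // Read: return a hit with the value, or a miss if absent or expired.
  get(key) {
    const item = this.items.get(key);
    if (!item || item.expiresAt <= this.now()) {
      this.items.delete(key); // expire lazily on read (assumption)
      return { key, hit: false };
    }
    return { key, value: item.value, hit: true };
  }
}
```

Passing a fake clock to the constructor makes expiry easy to exercise without waiting for real time to pass.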
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/cache/:key | Retrieve cached item |
| PUT | /api/v1/cache/:key | Store item in cache |
| DELETE | /api/v1/cache/:key | Delete cached item |
| GET | /api/v1/cache | List all cached keys |
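As a rough sketch, the endpoint table can be mirrored by a helper that describes each request without sending it. The function name `buildCacheRequest`, the operation names, and the percent-encoding of keys are assumptions for illustration; the base URL and the Authorization header follow the examples in this document.

```javascript
// Base URL as used in the examples in this document.
const BASE_URL = "https://axerity.com/api/v1/cache";

// Operation name -> HTTP method, following the endpoint table.
const METHODS = new Map([
  ["get", "GET"],
  ["put", "PUT"],
  ["delete", "DELETE"],
  ["list", "GET"],
]);

// Hypothetical helper: builds a request description and sends nothing.
function buildCacheRequest(operation, key, apiKey) {
  const method = METHODS.get(operation);
  if (!method) {
    throw new Error(`Unknown operation: ${operation}`);
  }
  // Assumption: keys are percent-encoded when placed in the path.
  const url =
    operation === "list" ? BASE_URL : `${BASE_URL}/${encodeURIComponent(key)}`;
  return { method, url, headers: { Authorization: `Bearer ${apiKey}` } };
}
```

The returned object can be handed to an HTTP client, for example `fetch(request.url, request)` in runtimes that provide `fetch`.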
Store (PUT)
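PUT https://axerity.com/api/v1/cache/:key
Request Body
The body is a JSON object. The example below sets value (the response to cache), ttl (lifetime in seconds), and tags (labels attached to the key).
Example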
curl -X PUT https://axerity.com/api/v1/cache/greeting-response \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "value": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "ttl": 86400,
    "tags": ["greetings", "common"]
  }'
Response
{
  "key": "greeting-response",
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-01-16T10:30:00Z",
  "size_bytes": 52
}
Retrieve (GET)
GET https://axerity.com/api/v1/cache/:key
Example
curl https://axerity.com/api/v1/cache/greeting-response \
  -H "Authorization: Bearer YOUR_API_KEY"
Response (Cache Hit)
{
  "key": "greeting-response",
  "value": {
    "role": "assistant",
    "content": "Hello! How can I help you today?"
  },
  "created_at": "2024-01-15T10:30:00Z",
  "expires_at": "2024-01-16T10:30:00Z",
  "hit": true
}
Response (Cache Miss)
{
  "key": "greeting-response",
  "hit": false
}
Delete (DELETE)
DELETE https://axerity.com/api/v1/cache/:key
Example
curl -X DELETE https://axerity.com/api/v1/cache/greeting-response \
  -H "Authorization: Bearer YOUR_API_KEY"
List Keys (GET)
GET https://axerity.com/api/v1/cache
Query Parameters
The example below filters the listing by tag and caps the number of results with limit.
Example
curl "https://axerity.com/api/v1/cache?tag=greetings&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY"
Response
{
  "keys": [
    {
      "key": "greeting-response",
      "created_at": "2024-01-15T10:30:00Z",
      "expires_at": "2024-01-16T10:30:00Z",
      "size_bytes": 52,
      "tags": ["greetings", "common"]
    }
  ],
  "total": 1,
  "storage_used_bytes": 52
}
Best Practices
Use Consistent Keys
Create deterministic keys based on your request parameters:
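const crypto = require("crypto"); // Node.js built-in module

// Build a deterministic key: the same model and messages give the same key.
function getCacheKey(model, messages) {
  const hash = crypto
    .createHash("sha256")
    .update(JSON.stringify({ model, messages }))
    .digest("hex")
    .slice(0, 16); // keep the first 16 hex characters
  return `chat-${model}-${hash}`;
}
Cache Common Queries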
Ideal candidates for caching:
- FAQ responses
- Greeting messages
- Static information lookups
- Repeated summarization tasks
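For candidates like these, one common way to use a cache is a read-through helper: return the stored value on a hit, otherwise compute it once and store it. The sketch below assumes a hypothetical `cache` client exposing async `get(key)` and `put(key, value, ttlSeconds)` with the hit/miss shape shown earlier, and a caller-supplied `compute` function. None of these names come from an official SDK.

```javascript
// Read-through helper: serve from cache on a hit, otherwise compute,
// store, and return the fresh value.
// `cache` is a hypothetical client; `compute` produces a fresh response
// (for example, a model call).
async function getOrCompute(cache, key, compute, ttlSeconds) {
  const cached = await cache.get(key);
  if (cached && cached.hit) {
    return { value: cached.value, fromCache: true };
  }
  const value = await compute();
  await cache.put(key, value, ttlSeconds);
  return { value, fromCache: false };
}
```

With this shape, an FAQ answer is generated on the first request and served from the cache on later requests until its TTL runs out.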
Set Appropriate TTL
| Use Case | Recommended TTL |
|---|---|
| Static content | 30 days |
| FAQ responses | 7 days |
| Dynamic summaries | 1 day |
| Real-time data | Don't cache |
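Since the PUT example expresses ttl in seconds (86400 for one day), the recommendations above can be turned into constants. A minimal sketch follows; the use-case labels and the `ttlFor` helper are illustrative names, not API values.

```javascript
const DAY_SECONDS = 24 * 60 * 60;

// Recommended TTLs from the table, in seconds. null means "do not cache".
// The property names are illustrative labels, not values the API expects.
const RECOMMENDED_TTL = {
  staticContent: 30 * DAY_SECONDS,
  faq: 7 * DAY_SECONDS,
  dynamicSummary: 1 * DAY_SECONDS,
  realTime: null,
};

// Look up the TTL for a use case, rejecting labels that are not in the table.
function ttlFor(useCase) {
  if (!Object.prototype.hasOwnProperty.call(RECOMMENDED_TTL, useCase)) {
    throw new Error(`Unknown use case: ${useCase}`);
  }
  return RECOMMENDED_TTL[useCase];
}
```

A caller would skip the cache write entirely when `ttlFor` returns null.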
Pricing
| Operation | Cost |
|---|---|
| Cache hit | 0.01 credits |
| Cache miss | 0 credits |
| Write | Free |
| Delete | Free |
| Storage | $0.10/GB/month |
Free Tier
All users get 100 MB of free cache storage.
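To get a feel for the numbers, the pricing table and free tier can be combined into a rough monthly estimate. The sketch below rests on assumptions that this page does not confirm: decimal units (1 GB = 1,000,000,000 bytes, 100 MB = 100,000,000 bytes) and billing only for storage beyond the free tier. The function name `estimateMonthlyCost` is hypothetical.

```javascript
const HIT_COST_CREDITS = 0.01;          // per cache hit (pricing table)
const STORAGE_USD_PER_GB_MONTH = 0.10;  // pricing table
const BYTES_PER_GB = 1e9;               // assumption: decimal gigabytes
const FREE_STORAGE_BYTES = 100 * 1e6;   // assumption: 100 MB in decimal bytes

// Rough estimate: hit charges in credits, storage charges in US dollars.
function estimateMonthlyCost({ hits, storedBytes }) {
  // Assumption: only storage above the free tier is billed.
  const billableBytes = Math.max(0, storedBytes - FREE_STORAGE_BYTES);
  return {
    hitCredits: hits * HIT_COST_CREDITS,
    storageUsd: (billableBytes / BYTES_PER_GB) * STORAGE_USD_PER_GB_MONTH,
  };
}
```

Under these assumptions, 10,000 hits cost about 100 credits, and 600 MB of stored data leaves 500 MB billable, or about $0.05 per month.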