Chat API
Chat completions API reference
The Chat API allows you to send messages and receive AI-generated responses from various models.
Credit Usage
Credits are charged per message in your conversation. See Models for pricing.
Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/chat | Create chat completion |
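As a rough illustration, the endpoint can be called from Python using only the standard library. The sketch below assumes the request headers, body fields, and response shape shown in the curl example and sample response in this reference. The helper names `build_chat_request`, `extract_reply`, and `create_completion` are hypothetical and not part of any official SDK.

```python
import json
import urllib.request

API_URL = "https://axerity.com/api/v1/chat"  # endpoint from the table above


def build_chat_request(api_key, model, messages, **options):
    """Build (but do not send) a POST request for a chat completion.

    Hypothetical helper: extra keyword arguments such as temperature
    are copied into the JSON body unchanged.
    """
    body = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response):
    """Return the assistant text from a parsed, non-streaming response dict."""
    return response["choices"][0]["message"]["content"]


def create_completion(api_key, model, messages, **options):
    """Send the request and return the parsed JSON response."""
    request = build_chat_request(api_key, model, messages, **options)
    with urllib.request.urlopen(request) as http_response:
        return json.load(http_response)
```

Calling `create_completion("YOUR_API_KEY", "openai/gpt-oss-120b", [{"role": "user", "content": "Hello!"}], temperature=0.7)` would mirror the curl example. Error handling and retries are left out to keep the sketch short.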
Create Completion (POST)
POST https://axerity.com/api/v1/chat
Request Body
| Prop | Type |
|---|---|
Message Object
| Prop | Type |
|---|---|
Example
curl -X POST https://axerity.com/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    "temperature": 0.7
  }'
Response
{
  "id": "chat_abc123",
  "model": "openai/gpt-oss-120b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
Streaming
Set stream: true to receive responses as server-sent events.
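A streamed response has to be reassembled on the client. The following is a minimal sketch, assuming each event arrives as a single `data: <json>` line and the stream ends with `data: [DONE]`, as in the sample under Response Format in this section. The function name `collect_stream_text` is hypothetical.

```python
import json


def collect_stream_text(lines):
    """Concatenate the delta content from an iterable of SSE lines.

    Assumes one `data: <json>` line per event and a final `data: [DONE]`
    sentinel; accepts either str or bytes lines.
    """
    parts = []
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample = [
    'data: {"id":"chat_abc123","choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"id":"chat_abc123","choices":[{"delta":{"content":"!"}}]}',
    'data: {"id":"chat_abc123","choices":[{"delta":{"content":" How"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # prints "Hello! How"
```

In a real client the lines would come from iterating over the HTTP response body instead of a list. Collecting everything before returning is a simplification; an interactive UI would more likely yield each piece as it arrives.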
Example
curl -X POST https://axerity.com/api/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "messages": [{ "role": "user", "content": "Hello!" }],
    "stream": true
  }'
Response Format
data: {"id":"chat_abc123","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chat_abc123","choices":[{"delta":{"content":"!"}}]}
data: {"id":"chat_abc123","choices":[{"delta":{"content":" How"}}]}
data: [DONE]
Best Practices
Optimize Token Usage
- Set max_tokens to limit response length when you only need short answers
- Keep system prompts concise
Manage Conversation History
- Only include previous messages when context is needed
- For long conversations (10+ messages), summarize history and start fresh
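One way to apply the history advice above is to collapse older turns into a single summary message once a conversation reaches 10 messages. This is only a sketch: `compact_history` is a hypothetical helper, the `summarize` callback is supplied by the caller (for example, another call to a cheap model), and storing the summary in a `system` message is an assumption rather than documented behavior.

```python
def compact_history(messages, summarize, max_messages=10):
    """Collapse a long conversation into system prompt + summary + latest message.

    `summarize` is a caller-supplied function that turns a list of
    messages into a short text summary.
    """
    if len(messages) < max_messages:
        return list(messages)  # short conversation: send as-is

    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    older, latest = dialogue[:-1], dialogue[-1:]
    summary = {
        "role": "system",
        "content": "Summary of the conversation so far: " + summarize(older),
    }
    return system + [summary] + latest


history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(6):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})
history.append({"role": "user", "content": "final question"})

compacted = compact_history(history, lambda msgs: f"{len(msgs)} earlier messages")
print(len(history), "->", len(compacted))  # prints "14 -> 3"
```

Because credits are charged per message, sending 3 messages instead of 14 also lowers the cost of each request.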
Choose the Right Model
| Use Case | Recommended Model |
|---|---|
| Simple tasks | openai/gpt-oss-20b |
| General purpose | openai/gpt-oss-120b |
| Complex reasoning | google/gemini-2.5-pro |
| Cost-sensitive | google/gemini-2.5-flash-lite |
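The table above can be encoded as a small lookup so that application code does not hard-code model IDs in many places. The use-case keys (`"simple"`, `"general"`, and so on) and the `pick_model` helper are hypothetical names chosen for this sketch; only the model IDs come from the table.

```python
# Model IDs copied from the recommendation table; keys are illustrative.
RECOMMENDED_MODELS = {
    "simple": "openai/gpt-oss-20b",
    "general": "openai/gpt-oss-120b",
    "reasoning": "google/gemini-2.5-pro",
    "cost-sensitive": "google/gemini-2.5-flash-lite",
}


def pick_model(use_case):
    """Return the recommended model ID, falling back to the general-purpose one."""
    return RECOMMENDED_MODELS.get(use_case, RECOMMENDED_MODELS["general"])


print(pick_model("reasoning"))  # prints "google/gemini-2.5-pro"
```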
Pricing
Credits are charged per message in your conversation array.
| Model | Credits/Message |
|---|---|
| GPT OSS 20B | 0.1 |
| GPT OSS 120B | 0.25 |
| Gemini 2.5 Flash Lite | 0.15 |
| Gemini 2.5 Flash | 1 |
| Gemini 2.5 Pro | 3 |
See Models for full pricing.
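To make the per-message pricing concrete, the sketch below estimates the cost of one request as the per-message rate multiplied by the number of entries in the messages array. It assumes every entry counts equally, system messages included, and that the generated reply is not billed separately; neither point is confirmed by this reference. `estimate_credits` is a hypothetical helper and the rates are copied from the table above.

```python
from decimal import Decimal

# Credits per message, keyed by the display names used in the pricing table.
CREDITS_PER_MESSAGE = {
    "GPT OSS 20B": Decimal("0.1"),
    "GPT OSS 120B": Decimal("0.25"),
    "Gemini 2.5 Flash Lite": Decimal("0.15"),
    "Gemini 2.5 Flash": Decimal("1"),
    "Gemini 2.5 Pro": Decimal("3"),
}


def estimate_credits(model_name, messages):
    """Estimate credits for one request: rate x number of messages sent."""
    return CREDITS_PER_MESSAGE[model_name] * len(messages)


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
# Two messages on GPT OSS 120B: 0.25 x 2 credits under the stated assumption.
estimate = estimate_credits("GPT OSS 120B", messages)
```

`Decimal` is used instead of `float` so that rates such as 0.1 add up exactly when many requests are totalled.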