API reference
An OpenAI-compatible HTTP surface, built for agents. Chat completions, streaming, tool calls, model discovery: point an existing integration at it and change two values.
Base URL
base_url: https://usetalos.xyz/api/v1
Unfiltered models, decentralized compute, large context, and nothing retained: prompts are processed in memory and discarded (only token counts are kept for billing) and stay anonymous to the node running them. See Building agents for framework setups.
Authentication
Generate a key at usetalos.xyz/settings → the API tab. Keys look like sk-talos-… and are shown once, on creation. Pass it as a bearer token:
Authorization: Bearer sk-talos-...
Requests bill against the credit balance of the account that owns the key. Top up with USDC from the dashboard.
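Because a key is shown only once, it is worth keeping it out of source code. The following is a minimal sketch, assuming the key is exported as the TALOS_API_KEY environment variable (the name the curl examples in this reference use). The helper name auth_headers is hypothetical and not part of any SDK.

```python
import os


def auth_headers(env_var: str = "TALOS_API_KEY") -> dict:
    """Build the bearer header from an environment variable.

    `auth_headers` is an illustrative helper, not a TALOS or OpenAI API.
    """
    key = os.environ.get(env_var, "")
    # Keys are documented as starting with "sk-talos-"; fail early otherwise.
    if not key.startswith("sk-talos-"):
        raise RuntimeError(f"{env_var} is unset or does not look like a TALOS key")
    return {"Authorization": f"Bearer {key}"}
```

The same environment variable can be passed as api_key when constructing an OpenAI client.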
Models
| Model | Description |
|---|---|
| talos-light | Unfiltered 8B (Nimbus). Fast, runs on the broad node pool. |
| talos-heavy | Unfiltered 30B (Atlas) with tools and large context. |
| talos-heavy-think | talos-heavy with extended chain-of-thought reasoning. |
| atlas-vision-27b | Unfiltered Atlas Vision 27B, with tools and image input. (alias: talos-heavy-vision) |
| talos-code | Atlas Code 24B, the agentic model behind TALOS Code. (alias: atlas-code-24b) |
GET /v1/models lists them with a live available flag (Heavy requires a rig node online) and a pricing object ({ type: "per_message", credits: 8, usd: 0.08 }). Check availability if your integration depends on Heavy.
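An availability check could be sketched as below. The available flag and pricing object are documented above, but the envelope is an assumption: this sketch expects OpenAI's list shape, with entries under a "data" array and the model name under "id". The sample payload is illustrative, not a captured response.

```python
def unavailable_models(listing: dict, needed: list[str]) -> list[str]:
    """Return the ids from `needed` that are missing or not available."""
    # Assumption: entries live under "data", as in OpenAI's model list.
    by_id = {m["id"]: m for m in listing.get("data", [])}
    return [mid for mid in needed if not by_id.get(mid, {}).get("available", False)]


# Illustrative payload only; field values are made up for the example.
sample = {
    "object": "list",
    "data": [
        {"id": "talos-light", "available": True,
         "pricing": {"type": "per_message", "credits": 8, "usd": 0.08}},
        {"id": "talos-heavy", "available": False,
         "pricing": {"type": "per_message", "credits": 14, "usd": 0.14}},
    ],
}
print(unavailable_models(sample, ["talos-light", "talos-heavy"]))  # ['talos-heavy']
```

An integration that depends on Heavy can run this check at startup and fall back or wait when the list is not empty.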
Pricing
Pricing is flat per request: you know the exact cost before you send it, with no token math.
| Model | Credits / request | USD / request |
|---|---|---|
| talos-light | 8 | $0.08 |
| talos-heavy | 14 | $0.14 |
| talos-heavy-think | 18 | $0.18 |
| atlas-vision-27b | 14 | $0.14 |
| talos-code | 14 | $0.14 |
1 credit = $0.01. A request that returns a tool call (one step of an agent loop) counts as one request. Rate limit: 90 requests/minute per key.
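Since every request costs a fixed number of credits, the cost of an agent run follows directly from the number of steps. A small sketch, using the figures from the table above; the function name estimate is illustrative.

```python
# Credits per request, copied from the pricing table above.
CREDITS_PER_REQUEST = {
    "talos-light": 8,
    "talos-heavy": 14,
    "talos-heavy-think": 18,
    "atlas-vision-27b": 14,
    "talos-code": 14,
}
RATE_LIMIT_PER_MINUTE = 90  # default per-key limit


def estimate(model: str, requests: int) -> dict:
    """Estimate credits, USD, and the rate-limit floor for a batch of requests."""
    credits = CREDITS_PER_REQUEST[model] * requests
    return {
        "credits": credits,
        "usd": credits / 100,  # 1 credit = $0.01
        # Lower bound on wall-clock minutes imposed by the rate limit alone.
        "min_minutes": requests / RATE_LIMIT_PER_MINUTE,
    }


# An agent run with five tool-call steps plus one final answer is six requests.
run = estimate("talos-heavy", 6)
print(run["credits"], run["usd"])  # 84 0.84
```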
Balance
GET /v1/balance returns the credit balance left on the account that owns the key:
curl https://usetalos.xyz/api/v1/balance \
  -H "Authorization: Bearer $TALOS_API_KEY"

{
  "object": "balance",
  "credits": 1180,
  "usd": 11.80,
  "total_deposited": 2000,
  "total_spent": 820
}
Chat completions
POST /v1/chat/completions
curl
curl https://usetalos.xyz/api/v1/chat/completions \
  -H "Authorization: Bearer $TALOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "talos-light",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Python
from openai import OpenAI

client = OpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
resp = client.chat.completions.create(
    model="talos-light",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
Node
import OpenAI from "openai";

const client = new OpenAI({ baseURL: "https://usetalos.xyz/api/v1", apiKey: "sk-talos-..." });
const resp = await client.chat.completions.create({
  model: "talos-light",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);
Streaming
Set stream: true to get Server-Sent Events as chat.completion.chunk objects, terminated by data: [DONE].
stream = client.chat.completions.create(
    model="talos-light",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
Function calling (tools)
Pass your own tools. When the model calls one, the response comes back with finish_reason: "tool_calls" and the call under message.tool_calls. Run it and send the result back as a tool message.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Lisbon?"}]
# First request: the model answers with a tool call instead of text.
r1 = client.chat.completions.create(model="talos-heavy", messages=messages, tools=tools)
call = r1.choices[0].message.tool_calls[0]  # get_weather({"city": "Lisbon"})
# Echo the assistant message back, then attach the tool result by call id.
messages.append(r1.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id, "content": "21C and clear"})
# Second request: the model turns the tool result into a final answer.
r2 = client.chat.completions.create(model="talos-heavy", messages=messages, tools=tools)
print(r2.choices[0].message.content)  # "It's 21°C and clear in Lisbon."
Tool calling and vision are most reliable on talos-heavy. The 8B talos-light can attempt tools but is less consistent.
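When a response carries several tool calls, the "run it and send the result back" step can be generalized into a small dispatcher. This is a sketch under assumptions: REGISTRY and run_tool_calls are hypothetical names, the get_weather body is a stand-in, and the fake call object only mimics the shape the OpenAI Python client returns (id, function.name, function.arguments as a JSON string).

```python
import json
from types import SimpleNamespace


def get_weather(city: str) -> str:
    # Stand-in implementation; a real tool would query a weather service.
    return f"21C and clear in {city}"


# Hypothetical registry mapping tool names to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls, registry=REGISTRY) -> list[dict]:
    """Execute each tool call and return the matching tool messages."""
    out = []
    for call in tool_calls:
        fn = registry.get(call.function.name)
        if fn is None:
            content = f"error: unknown tool {call.function.name}"
        else:
            # Arguments arrive as a JSON-encoded string.
            content = str(fn(**json.loads(call.function.arguments or "{}")))
        out.append({"role": "tool", "tool_call_id": call.id, "content": content})
    return out


# Fake call object shaped like the SDK's, for a dry run without the network.
fake = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Lisbon"}'),
)
print(run_tool_calls([fake]))
```

In a live loop, append the assistant message, extend messages with the returned tool messages, and call the API again while finish_reason is "tool_calls". Each iteration is billed as one request.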
Vision
atlas-vision-27b accepts images. Use the standard multimodal content format with an inline base64 data: URL:
import base64

# Read the image and base64-encode it for an inline data: URL.
with open("photo.png", "rb") as f:
    img = base64.b64encode(f.read()).decode()
resp = client.chat.completions.create(
    model="atlas-vision-27b",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img}"}},
    ]}],
)
print(resp.choices[0].message.content)
Pass images inline as base64; remote HTTPS URLs aren't fetched in this version.
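Since images have to be inlined, a helper that builds the content part keeps the MIME type and the encoding in one place. A minimal sketch: image_part is a hypothetical name, and the type is inferred from the file extension with the standard mimetypes module. Which image formats the model accepts beyond PNG is not stated in this reference.

```python
import base64
import mimetypes


def image_part(raw: bytes, filename: str) -> dict:
    """Build an image_url content part with an inline base64 data: URL."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image type from {filename!r}")
    b64 = base64.b64encode(raw).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{b64}"}}
```

The returned dictionary goes into the content list next to the text part, in place of the hand-written image_url entry.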
Building agents
TALOS is designed to be the model behind agent frameworks. Your framework keeps doing what it does (memory, system prompt, persona, the tool loop) and TALOS is just the model it calls. Memory and persona need no special handling; they're the messages array and a system message you already send. For agents, use talos-heavy (or talos-heavy-think for harder reasoning): the 30B is far more reliable at multi-step tool use than the 8B.
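To illustrate the point that memory and persona are just messages, here is a sketch of how a framework might assemble the array it sends. The function name build_messages and the max_turns cap are illustrative assumptions, not TALOS limits.

```python
def build_messages(persona: str, memory: list[dict], user_input: str,
                   max_turns: int = 20) -> list[dict]:
    """Assemble the messages array: persona first, recent memory, new input."""
    # Keep only the most recent stored messages; the cap is an arbitrary example.
    recent = memory[-max_turns:] if max_turns > 0 else []
    return ([{"role": "system", "content": persona}]
            + recent
            + [{"role": "user", "content": user_input}])
```

Trimming by message count can separate a tool call from its tool result, so a real agent should cut memory at turn boundaries.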
base_url / baseURL : https://usetalos.xyz/api/v1
api_key : sk-talos-...
model : talos-heavy
Any framework that accepts a custom OpenAI-compatible provider works the same way, for example:
# OpenAI Agents SDK
from agents import Agent, Runner, OpenAIChatCompletionsModel
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
agent = Agent(
    name="Assistant",
    instructions="You are helpful.",
    model=OpenAIChatCompletionsModel(model="talos-heavy", openai_client=client),
)
# Runner.run is a coroutine; run_sync drives it from a plain (non-async) script.
print(Runner.run_sync(agent, "Plan my week.").final_output)
# LangChain / LangGraph
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="talos-heavy", base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
// Vercel AI SDK
import { createOpenAI } from "@ai-sdk/openai";

const talos = createOpenAI({ baseURL: "https://usetalos.xyz/api/v1", apiKey: "sk-talos-..." });
// use talos("talos-heavy") as the model in generateText / streamText / tool loops
Errors
Errors follow the standard shape ({ error: { message, type, code } }):
| Status | Meaning |
|---|---|
| 401 | Missing or invalid API key |
| 402 | Insufficient credits: top up with USDC |
| 404 | Unknown model |
| 429 | Rate limit exceeded |
| 503 | No node available for the requested tier (Heavy needs a rig node online) |
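Of the statuses above, 429 and 503 describe conditions that can clear on their own, while 401, 402, and 404 need a fix on the caller's side. One way to act on that split is a retry wrapper with exponential backoff. This is a sketch, not a TALOS recommendation: call_with_retry is a hypothetical helper, the delays are arbitrary, and it relies on the raised exception exposing a status_code attribute, as the OpenAI Python client's APIStatusError does.

```python
import time

RETRYABLE = {429, 503}  # rate limit exceeded, no node available


def call_with_retry(fn, attempts: int = 4, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(); retry on 429/503 with exponential backoff, re-raise anything else."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            # Give up on non-retryable statuses or once attempts are exhausted.
            if status not in RETRYABLE or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Usage: call_with_retry(lambda: client.chat.completions.create(model="talos-heavy", messages=messages)). The OpenAI Python client also has its own max_retries setting, so the two layers should be tuned together to avoid multiplying attempts.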
Rate limits
Default 90 requests/minute per key. Need more? Reach out via contact.
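A client can stay under the limit by spacing its own calls. The sketch below spaces requests evenly at 60 / 90 seconds apart, which is stricter than necessary if the server counts per minute window; how the server measures the window is not stated here, so treat the spacing as a conservative assumption. Throttle is a hypothetical class name.

```python
import time


class Throttle:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int = 90, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self.clock, self.sleep = clock, sleep
        self.next_ok = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self.clock()
        delay = max(0.0, self.next_ok - now)
        if delay:
            self.sleep(delay)
        self.next_ok = max(now, self.next_ok) + self.interval
        return delay
```

Call throttle.wait() immediately before each request. One Throttle instance per API key matches the per-key limit.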
Image generation
POST /v1/images/generations: OpenAI-compatible, unfiltered image generation (Aperture HD on contributor GPUs). 18 credits per image, returned inline as base64, never stored server-side.
from openai import OpenAI

client = OpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
img = client.images.generate(prompt="a neon-lit alley in the rain, cinematic", response_format="b64_json")
png_b64 = img.data[0].b64_json
Parameters: prompt (required), size ("WIDTHxHEIGHT", default 1024x1024), negative_prompt, seed, nsfw (boolean, default false; SFW mode runs an output classifier; the hard safety line applies in both modes). n must be 1 and response_format must be b64_json: no URLs, nothing is stored. Renders take ~30s.
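Because nothing is stored server-side, the caller has to persist the image. A minimal sketch of decoding the b64_json payload to a file and of validating a size string before sending it; parse_size and save_b64_image are hypothetical helper names.

```python
import base64
import re
from pathlib import Path


def parse_size(size: str) -> tuple[int, int]:
    """Split a "WIDTHxHEIGHT" string such as "1024x1024" into integers."""
    m = re.fullmatch(r"(\d+)x(\d+)", size)
    if not m:
        raise ValueError(f"size must look like WIDTHxHEIGHT, got {size!r}")
    return int(m.group(1)), int(m.group(2))


def save_b64_image(b64: str, path: str) -> int:
    """Decode a b64_json payload and write it to disk; return bytes written."""
    raw = base64.b64decode(b64)
    Path(path).write_bytes(raw)
    return len(raw)
```

With the example above, save_b64_image(png_b64, "alley.png") writes the render to disk.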
Native endpoint
POST /api/images/create: a TALOS-specific endpoint with finer-grained controls, separate from the OpenAI-compatible /v1 surface. Same sk-talos-… bearer auth.
| Field | Type | Default | Notes |
|---|---|---|---|
| prompt | string | None | Required. |
| negative_prompt | string | None | A baseline anti-artifact negative is always applied on top. |
| width / height | int | 1024 | 512–1536, snapped to multiples of 64. |
| steps | int | 32 | 10–60. |
| cfg | number | 4.0 | Guidance scale; Aperture likes ~3.5–4.5. |
| seed | int | random | Optional, for reproducible output. |
| nsfw | bool | false | Allow adult content (18+). Off blocks adult output. |
{
  "image": "data:image/png;base64,...",
  "model": "talos-image",
  "seed": 20260701,
  "width": 1024,
  "height": 1024,
  "credits_charged": 18
}
curl https://usetalos.xyz/api/images/create \
  -H "Authorization: Bearer $TALOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a candid photo of a fox in snow, 35mm film","width":1216,"height":832}'
Errors: 400 prompt blocked by content policy, or an SFW request produced adult output · 402 insufficient credits · 503 no render nodes online right now.
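Two client-side helpers for the native endpoint are sketched below: one predicts the dimensions a request will end up with, and one unpacks the image field of the response. The rounding direction is an assumption; the table only says values are snapped to multiples of 64 within 512–1536, so the server may round differently. Both function names are hypothetical.

```python
import base64


def snap(value: int, lo: int = 512, hi: int = 1536, step: int = 64) -> int:
    """Clamp to [lo, hi] and round to the nearest multiple of step.

    Assumption: nearest-multiple rounding; the server's rule is not documented.
    """
    clamped = min(max(value, lo), hi)
    return (clamped + step // 2) // step * step


def decode_data_url(data_url: str) -> bytes:
    """Extract the raw bytes from a "data:image/png;base64,..." string."""
    header, _, payload = data_url.partition(",")
    if not header.endswith(";base64"):
        raise ValueError("expected a base64 data: URL")
    return base64.b64decode(payload)


print(snap(1000), snap(1216), snap(400), snap(2000))  # 1024 1216 512 1536
```

The width and height in the curl example (1216 and 832) are already multiples of 64, so they pass through unchanged.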