API reference

An OpenAI-compatible HTTP surface, built for agents. Chat completions, streaming, tool calls, model discovery: point an existing integration at it and change two values.

Base URL

text
base_url: https://usetalos.xyz/api/v1

TALOS offers unfiltered models, decentralized compute, large context windows, and no retention: prompts are processed in memory and then discarded (only token counts are kept for billing), and they stay anonymous to the node running them. See Building agents for framework setups.

Authentication

Generate a key at usetalos.xyz/settings → the API tab. Keys look like sk-talos-… and are shown once, on creation. Pass the key as a bearer token:

text
Authorization: Bearer sk-talos-...

Requests bill against the credit balance of the account that owns the key. Top up with USDC from the dashboard.
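
Because a key is shown only once, it is usually kept in an environment variable rather than in source. A minimal sketch, assuming the key lives in TALOS_API_KEY (the variable the curl examples on this page use); the helper name auth_headers is hypothetical:

```python
import os

def auth_headers(api_key=None):
    """Build the Authorization header for a TALOS request."""
    # TALOS_API_KEY mirrors the variable used in the curl examples
    key = api_key or os.environ.get("TALOS_API_KEY", "")
    # Keys are documented as looking like sk-talos-...; fail early otherwise
    if not key.startswith("sk-talos-"):
        raise ValueError("expected a key of the form sk-talos-...")
    return {"Authorization": f"Bearer {key}"}
```

The returned dict can be passed as the headers of any HTTP client; the OpenAI SDKs take the key directly through api_key instead.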

Models

| Model | Description |
| --- | --- |
| talos-light | Unfiltered 8B (Nimbus). Fast, runs on the broad node pool. |
| talos-heavy | Unfiltered 30B (Atlas) with tools and large context. |
| talos-heavy-think | talos-heavy with extended chain-of-thought reasoning. |
| atlas-vision-27b | Unfiltered Atlas Vision 27B, with tools and image input. (alias: talos-heavy-vision) |
| talos-code | Atlas Code 24B, the agentic model behind TALOS Code. (alias: atlas-code-24b) |

GET /v1/models lists them, each with a live available flag (Heavy requires a rig node online) and a pricing object, for example { type: "per_message", credits: 8, usd: 0.08 }. Check availability before sending if your integration depends on Heavy.
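
An availability check might look like the sketch below. It assumes the response uses the OpenAI-style list envelope ({"object": "list", "data": [...]}) with available and pricing on each entry; that envelope is an assumption, and SAMPLE is illustrative data, not a captured response.

```python
import json
import urllib.request

BASE_URL = "https://usetalos.xyz/api/v1"

def fetch_models(api_key):
    """GET /v1/models and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

def available_models(payload):
    """Return {model_id: pricing} for entries whose `available` flag is true."""
    # Assumption: entries live under "data", as in OpenAI's model list
    return {
        m["id"]: m.get("pricing")
        for m in payload.get("data", [])
        if m.get("available")
    }

# Illustrative payload only; the field layout is assumed
SAMPLE = {"object": "list", "data": [
    {"id": "talos-light", "available": True,
     "pricing": {"type": "per_message", "credits": 8, "usd": 0.08}},
    {"id": "talos-heavy", "available": False,
     "pricing": {"type": "per_message", "credits": 14, "usd": 0.14}},
]}
```

With a real key, available_models(fetch_models(key)) gives the models that can serve a request right now, so an integration can fall back to talos-light when Heavy is offline.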

Pricing

Pricing is flat per request: you know the exact cost before you send it, with no token math.

| Model | Credits / request | USD / request |
| --- | --- | --- |
| talos-light | 8 | $0.08 |
| talos-heavy | 14 | $0.14 |
| talos-heavy-think | 18 | $0.18 |
| atlas-vision-27b | 14 | $0.14 |
| talos-code | 14 | $0.14 |

1 credit = $0.01. A request that returns a tool call (one step of an agent loop) counts as one request. Rate limit: 90 requests/minute per key.
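
Since every request has a fixed price, the cost of an agent run can be worked out ahead of time. The sketch below assumes a run consists of some number of tool-call turns plus one final answer, each billed as one request; the function name agent_run_cost is hypothetical.

```python
# Credits per request, copied from the pricing table
CREDITS_PER_REQUEST = {
    "talos-light": 8,
    "talos-heavy": 14,
    "talos-heavy-think": 18,
    "atlas-vision-27b": 14,
    "talos-code": 14,
}

def agent_run_cost(model, tool_steps):
    """Return (credits, usd) for `tool_steps` tool-call turns plus a final answer."""
    requests = tool_steps + 1          # each tool-call response is one request
    credits = CREDITS_PER_REQUEST[model] * requests
    return credits, credits / 100      # 1 credit = $0.01
```

For example, a talos-heavy run with four tool calls is five requests: 5 × 14 = 70 credits, or $0.70.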

Balance

GET /v1/balance returns the credit balance left on the account that owns the key:

bash
curl https://usetalos.xyz/api/v1/balance \
  -H "Authorization: Bearer $TALOS_API_KEY"

json
{
  "object": "balance",
  "credits": 1180,
  "usd": 11.80,
  "total_deposited": 2000,
  "total_spent": 820
}
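
One way to use this response is a pre-flight check before a long agent run. A minimal sketch that works on the decoded JSON shown above; requests_remaining is a hypothetical helper, and the per-request price comes from the pricing table:

```python
def requests_remaining(balance, credits_per_request):
    """How many more requests the account can pay for at a given price."""
    return balance["credits"] // credits_per_request

# The example response from this section
SAMPLE_BALANCE = {
    "object": "balance",
    "credits": 1180,
    "usd": 11.80,
    "total_deposited": 2000,
    "total_spent": 820,
}
```

With the sample balance, requests_remaining(SAMPLE_BALANCE, 14) is 84 talos-heavy requests.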

Chat completions

POST /v1/chat/completions

curl

bash
curl https://usetalos.xyz/api/v1/chat/completions \
  -H "Authorization: Bearer $TALOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "talos-light",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Python

python
from openai import OpenAI
client = OpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
resp = client.chat.completions.create(
    model="talos-light",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

Node

javascript
import OpenAI from "openai";
const client = new OpenAI({ baseURL: "https://usetalos.xyz/api/v1", apiKey: "sk-talos-..." });
const resp = await client.chat.completions.create({
  model: "talos-light",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(resp.choices[0].message.content);

Streaming

Set stream: true to get Server-Sent Events as chat.completion.chunk objects, terminated by data: [DONE].

python
stream = client.chat.completions.create(
    model="talos-light",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
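
Without an SDK, the same stream can be read line by line. The sketch below parses the wire format described above: each event is a data: line carrying a chat.completion.chunk JSON object, and data: [DONE] ends the stream. It assumes the lines are already decoded to text and that chunks carry text under choices[].delta.content, as in the SDK example; iter_sse_text is a hypothetical name.

```python
import json

def iter_sse_text(lines):
    """Yield content deltas from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text

# Illustrative stream, not captured output
SAMPLE_LINES = [
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"object": "chat.completion.chunk", "choices": [{"delta": {"content": " world"}}]}',
    "",
    "data: [DONE]",
]
```

Joining the yielded pieces, "".join(iter_sse_text(SAMPLE_LINES)), reconstructs the full message.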

Function calling (tools)

Pass your own tools. When the model calls one, the response comes back with finish_reason: "tool_calls" and the call under message.tool_calls. Run it and send the result back as a tool message.

python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Lisbon?"}]
r1 = client.chat.completions.create(model="talos-heavy", messages=messages, tools=tools)
call = r1.choices[0].message.tool_calls[0]             # get_weather({"city": "Lisbon"})
messages.append(r1.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id, "content": "21C and clear"})
r2 = client.chat.completions.create(model="talos-heavy", messages=messages, tools=tools)
print(r2.choices[0].message.content)                    # "It's 21°C and clear in Lisbon."

Tool calling and vision are most reliable on the Heavy tier: talos-heavy for tools, atlas-vision-27b (alias talos-heavy-vision) for image input. talos-light (8B) can attempt tools but is less consistent.
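
In a real loop the tool result is computed rather than hard-coded. The sketch below shows one way to dispatch a tool call to a local function and build the tool message to send back. TOOL_REGISTRY, run_tool_call, and the stand-in get_weather are hypothetical; the call object is assumed to have the shape the OpenAI SDK returns (id, function.name, function.arguments as a JSON string).

```python
import json
from types import SimpleNamespace

def get_weather(city):
    # Stand-in implementation; a real tool would query a weather service
    return f"21C and clear in {city}"

# Maps tool names declared in `tools` to local callables
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool_call(call, registry=TOOL_REGISTRY):
    """Execute one tool call and return the `tool` message to append."""
    name = call.function.name
    if name not in registry:
        result = f"error: unknown tool {name!r}"
    else:
        try:
            args = json.loads(call.function.arguments or "{}")
            result = registry[name](**args)
        except (json.JSONDecodeError, TypeError) as exc:
            # Report malformed arguments to the model instead of crashing
            result = f"error: bad arguments for {name}: {exc}"
    return {"role": "tool", "tool_call_id": call.id, "content": str(result)}

# Stand-in object shaped like an SDK tool call, for illustration
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather", arguments='{"city": "Lisbon"}'),
)
tool_message = run_tool_call(fake_call)
```

An agent loop would call run_tool_call for every entry in message.tool_calls, append the results to messages, and request another completion until finish_reason is no longer "tool_calls".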

Vision

atlas-vision-27b accepts images. Use the standard multimodal content format with an inline base64 data: URL:

python
import base64
# Read the image and base64-encode it for an inline data: URL
with open("photo.png", "rb") as f:
    img = base64.b64encode(f.read()).decode()
resp = client.chat.completions.create(
    model="atlas-vision-27b",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img}"}},
    ]}],
)
print(resp.choices[0].message.content)

Pass images inline as base64; remote HTTPS URLs aren't fetched in this version.
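
For images that are not PNG, the media type in the data: URL has to match the file. A small helper, sketched here under the assumption that the endpoint accepts the common image types the standard library can identify from a filename; image_part is a hypothetical name.

```python
import base64
import mimetypes

def image_part(data, filename):
    """Build an image_url content part from raw bytes and a filename."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url",
            "image_url": {"url": f"data:{mime};base64,{encoded}"}}
```

The returned dict drops into the content list next to the text part, in place of the hand-built image_url entry.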

Building agents

TALOS is designed to be the model behind agent frameworks. Your framework keeps doing what it does (memory, system prompt, persona, the tool loop) and TALOS is just the model it calls. Memory and persona need no special handling; they're the messages array and a system message you already send. For agents, use talos-heavy (or talos-heavy-think for harder reasoning): the 30B is far more reliable at multi-step tool use than the 8B.

text
base_url / baseURL : https://usetalos.xyz/api/v1
api_key            : sk-talos-...
model              : talos-heavy

Any framework that accepts a custom OpenAI-compatible provider works the same way, for example:

python
# OpenAI Agents SDK
import asyncio
from agents import Agent, Runner, OpenAIChatCompletionsModel
from openai import AsyncOpenAI
client = AsyncOpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
agent = Agent(name="Assistant", instructions="You are helpful.",
              model=OpenAIChatCompletionsModel(model="talos-heavy", openai_client=client))
async def main():
    # Runner.run is a coroutine, so it has to be awaited inside an event loop
    result = await Runner.run(agent, "Plan my week.")
    print(result.final_output)
asyncio.run(main())

python
# LangChain / LangGraph
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="talos-heavy", base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")

typescript
// Vercel AI SDK
import { createOpenAI } from "@ai-sdk/openai";
const talos = createOpenAI({ baseURL: "https://usetalos.xyz/api/v1", apiKey: "sk-talos-..." });
// use talos("talos-heavy") as the model in generateText / streamText / tool loops

Other frameworks

Any tool that lets you register a custom OpenAI-compatible provider works the same way: set its base URL, key and model id to the values above.

Errors

Errors follow the standard shape ({ error: { message, type, code } }):

| Status | Meaning |
| --- | --- |
| 401 | Missing or invalid API key |
| 402 | Insufficient credits: top up with USDC |
| 404 | Unknown model |
| 429 | Rate limit exceeded |
| 503 | No node available for the requested tier (Heavy needs a rig node online) |
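
A client usually needs to decide which of these statuses are worth retrying. The sketch below extracts the message from the error body and flags 429 and 503 as retryable; treating exactly those two as transient is an assumption, not documented behaviour, and describe_error is a hypothetical helper.

```python
import json

# Meanings copied from the status table above
MEANINGS = {
    401: "Missing or invalid API key",
    402: "Insufficient credits",
    404: "Unknown model",
    429: "Rate limit exceeded",
    503: "No node available for the requested tier",
}
# Assumption: only these are transient and worth retrying after a pause
RETRYABLE = {429, 503}

def describe_error(status, body):
    """Return (message, retryable) for an error response."""
    message = None
    try:
        message = json.loads(body)["error"]["message"]
    except (ValueError, KeyError, TypeError):
        pass  # body missing or not in the documented shape
    return message or MEANINGS.get(status, "Unexpected error"), status in RETRYABLE
```

A 402 or 401 needs human action (top up, fix the key), so retrying those would only burn time.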

Rate limits

Default 90 requests/minute per key. Need more? Reach out via contact.
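
To stay under the limit without waiting for a 429, a client can throttle itself. The sketch below is a sliding-window limiter sized for 90 requests per 60 seconds; how the server actually measures its window is not documented here, so the window model is an assumption. The clock and sleep functions are injectable to make the class easy to test.

```python
import time
from collections import deque

class RateLimiter:
    """Client-side sliding-window limiter (assumed model of the server limit)."""

    def __init__(self, max_requests=90, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of requests inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed."""
        now = self._clock()
        # Drop timestamps that have aged out of the window
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._window - (now - self._sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        while (delay := self.wait_time()) > 0:
            self._sleep(delay)
        self._sent.append(self._clock())
```

Calling limiter.acquire() immediately before each API request keeps a single process within the per-key limit; several processes sharing one key would need a shared limiter.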

Image generation

POST /v1/images/generations: OpenAI-compatible, unfiltered image generation (Aperture HD on contributor GPUs). 18 credits per image, returned inline as base64, never stored server-side.

python
from openai import OpenAI
client = OpenAI(base_url="https://usetalos.xyz/api/v1", api_key="sk-talos-...")
img = client.images.generate(prompt="a neon-lit alley in the rain, cinematic", response_format="b64_json")
png_b64 = img.data[0].b64_json

Parameters: prompt (required), size ("WIDTHxHEIGHT", default 1024x1024), negative_prompt, seed, and nsfw (boolean, default false). SFW mode runs an output classifier, and the hard safety line applies in both modes. n must be 1 and response_format must be b64_json: no URLs are returned and nothing is stored. Renders take ~30s.
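
negative_prompt, seed and nsfw are not part of the OpenAI SDK's images.generate signature, so with the Python SDK they would have to travel in extra_body, which the SDK merges into the JSON request body. The sketch below builds the keyword arguments and decodes the result; it assumes the endpoint reads those fields from the top level of the body, and both helper names are hypothetical.

```python
import base64

# Extra parameters listed above that the OpenAI SDK does not know about
EXTRA_PARAMS = {"negative_prompt", "seed", "nsfw"}

def image_request_kwargs(prompt, size="1024x1024", **extras):
    """Build keyword arguments for client.images.generate(...)."""
    unknown = set(extras) - EXTRA_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    kwargs = {"prompt": prompt, "size": size, "n": 1,
              "response_format": "b64_json"}
    if extras:
        kwargs["extra_body"] = extras  # merged into the request JSON by the SDK
    return kwargs

def decode_image(b64_json):
    """Turn the b64_json field of the response into raw image bytes."""
    return base64.b64decode(b64_json)
```

Usage would be img = client.images.generate(**image_request_kwargs("a neon-lit alley", seed=7)), followed by writing decode_image(img.data[0].b64_json) to a file.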

Native endpoint

POST /api/images/create: a TALOS-specific endpoint with finer-grained controls, separate from the OpenAI-compatible /v1 surface. Same sk-talos-… bearer auth.

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| prompt | string | None | Required. |
| negative_prompt | string | None | A baseline anti-artifact negative is always applied on top. |
| width / height | int | 1024 | 512–1536, snapped to multiples of 64. |
| steps | int | 32 | 10–60. |
| cfg | number | 4.0 | Guidance scale; Aperture likes ~3.5–4.5. |
| seed | int | random | Optional, for reproducible output. |
| nsfw | bool | false | Allow adult content (18+). Off blocks adult output. |

json
{
  "image": "data:image/png;base64,...",
  "model": "talos-image",
  "seed": 20260701,
  "width": 1024,
  "height": 1024,
  "credits_charged": 18
}

bash
curl https://usetalos.xyz/api/images/create \
  -H "Authorization: Bearer $TALOS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"a candid photo of a fox in snow, 35mm film","width":1216,"height":832}'

Errors: 400 prompt blocked by content policy or an SFW request produced adult output · 402 insufficient credits · 503 no render nodes online right now.
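
A client can normalise its inputs before sending, so the values it logs match what the server renders. The sketch below clamps and snaps dimensions, clamps steps, and decodes the image field of the response. Rounding to the nearest multiple of 64 is an assumption (the server's snapping direction is not stated), and all three function names are hypothetical.

```python
import base64

def snap_dimension(value, lo=512, hi=1536, step=64):
    """Clamp to [lo, hi], then round to the nearest multiple of `step`."""
    clamped = max(lo, min(hi, value))
    return (clamped + step // 2) // step * step

def native_image_body(prompt, width=1024, height=1024, steps=32, **extra):
    """Build the JSON body for POST /api/images/create."""
    if not prompt:
        raise ValueError("prompt is required")
    body = {
        "prompt": prompt,
        "width": snap_dimension(width),
        "height": snap_dimension(height),
        "steps": max(10, min(60, steps)),  # documented range is 10-60
    }
    body.update(extra)  # e.g. negative_prompt, cfg, seed, nsfw
    return body

def decode_data_url(data_url):
    """Split a base64 data: URL into (mime_type, raw_bytes)."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data: URL")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(payload)
```

The dimensions in the curl example (1216 and 832) are already multiples of 64, so they pass through unchanged.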