
What TALOS is

TALOS is a peer-to-peer inference network. Prompts never touch a company data center; each one is picked up by someone's spare graphics card and run there, and the response is streamed straight back.

The short version

Every request that hits TALOS is handed off to a machine owned by an ordinary person somewhere on the network who has opted in to run jobs for pay. No central server farm does the actual thinking. TALOS itself is a thin coordination layer that pairs your prompt with a free card and gets out of the way.

Two ways to run a model

TALOS ships two tiers, each backed by a different class of hardware on the network:

Tier	Model	Cost per message	Runs on	Notes
Light	Nimbus 8B	8 credits	Browser node (WebGPU)	~4.2 GB download, ~6 GB VRAM, unfiltered
Heavy	Atlas 30B or Atlas Vision 27B	14 credits (18 with deep-think)	Rig node	Your choice of model, unfiltered, plus vision and live web lookups
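
The pricing in the table can be written as a small lookup. This is a minimal sketch, not TALOS's own code: the names `TIER_COST`, `HEAVY_DEEP_THINK_COST`, and `message_cost` are hypothetical, and it assumes deep-think is offered only on the Heavy tier, since that is the only tier the table prices it for.

```python
# Hypothetical pricing lookup built from the tier table above.
TIER_COST = {"light": 8, "heavy": 14}  # credits per message
HEAVY_DEEP_THINK_COST = 18             # credits per Heavy message with deep-think


def message_cost(tier: str, deep_think: bool = False) -> int:
    """Return the credit cost of one message on the given tier."""
    tier = tier.lower()
    if tier not in TIER_COST:
        raise ValueError(f"unknown tier: {tier!r}")
    if deep_think:
        # Assumption: the table lists a deep-think price only for Heavy.
        if tier != "heavy":
            raise ValueError("deep-think is only priced for the Heavy tier")
        return HEAVY_DEEP_THINK_COST
    return TIER_COST[tier]


print(message_cost("light"))                   # 8
print(message_cost("heavy"))                   # 14
print(message_cost("heavy", deep_think=True))  # 18
```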

Credits

Every message costs credits, and 1 credit = $0.01.

Top up with USDC; credits are spent per message based on the tier you pick.
Node operators keep 72% of the USD value of every job they complete (82% once they've staked), paid in USDC. See Staking for the full breakdown.
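
To make the arithmetic concrete, here is a rough sketch of how a message's credit cost converts to USD and how much of it reaches the node operator. The function and constant names are hypothetical, and the sketch does no rounding, because the source does not say how payouts are rounded.

```python
from decimal import Decimal

USD_PER_CREDIT = Decimal("0.01")         # 1 credit = $0.01
OPERATOR_SHARE = Decimal("0.72")         # share kept by an unstaked operator
STAKED_OPERATOR_SHARE = Decimal("0.82")  # share kept once the operator has staked


def job_value_usd(credits: int) -> Decimal:
    """USD value of a job that cost the given number of credits."""
    return credits * USD_PER_CREDIT


def operator_payout_usd(credits: int, staked: bool = False) -> Decimal:
    """USD value the node operator keeps for one completed job (unrounded)."""
    share = STAKED_OPERATOR_SHARE if staked else OPERATOR_SHARE
    return job_value_usd(credits) * share


# A Heavy message without deep-think costs 14 credits.
print(job_value_usd(14))                     # 0.14
print(operator_payout_usd(14))               # 0.1008
print(operator_payout_usd(14, staked=True))  # 0.1148
```

`Decimal` is used instead of `float` so that amounts in cents and fractions of a cent stay exact.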

What's actually running this

Browser nodes run Nimbus 8B straight in a tab via WebGPU. No install required.
Rig nodes run the Heavy models on dedicated hardware, accelerated by CUDA, Metal, or Vulkan.
A lightweight relay server matches jobs to idle nodes and streams tokens back in real time.
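
The matching step the relay performs might look roughly like the sketch below. Everything here is an assumption made for illustration: the `Node` and `Job` types, the `match_job` function, and the first-idle-node policy are hypothetical, and the sketch leaves out token streaming entirely. The only detail taken from the text is that Light jobs go to browser nodes and Heavy jobs go to rig nodes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: str
    kind: str          # "browser" or "rig"
    idle: bool = True


@dataclass
class Job:
    job_id: str
    tier: str          # "light" or "heavy"


# Which class of node serves which tier, per the tier table.
TIER_TO_NODE_KIND = {"light": "browser", "heavy": "rig"}


def match_job(job: Job, nodes: list[Node]) -> Optional[Node]:
    """Pick the first idle node of the right kind and mark it busy."""
    wanted = TIER_TO_NODE_KIND[job.tier]
    for node in nodes:
        if node.idle and node.kind == wanted:
            node.idle = False
            return node
    return None  # no idle node of that kind; a caller would queue or retry


nodes = [Node("tab-1", "browser"), Node("rig-1", "rig")]
print(match_job(Job("j1", "heavy"), nodes).node_id)  # rig-1
print(match_job(Job("j2", "heavy"), nodes))          # None, because rig-1 is now busy
```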

Why bother

Centralized providers decide what you're allowed to ask, keep a copy of what you asked and can cut you off whenever they feel like it. TALOS routes around all three: it's unfiltered by default, doesn't retain your prompts and isn't owned by anyone who could pull the plug. Bring a GPU and get paid by the network, or don't and simply use it.
