What TALOS is

TALOS is a peer-to-peer inference network. Prompts never touch a company data center; they're picked up by someone's spare graphics card, run there and streamed straight back.

The short version

Every request that hits TALOS is handed off to a machine owned by an ordinary person, somewhere on the network, who's opted in to run jobs for pay. There's no central server farm doing the actual thinking. TALOS itself is a thin coordination layer that pairs your prompt with a free card and gets out of the way.

Two ways to run a model

TALOS ships two tiers, each backed by a different class of hardware on the network:

Tier  | Model                         | Cost                            | Runs on               | Notes
Light | Nimbus 8B                     | 8 credits                       | Browser node (WebGPU) | ~4.2 GB download, ~6 GB VRAM, unfiltered
Heavy | Atlas 30B or Atlas Vision 27B | 14 credits (18 with deep-think) | Rig node              | Your choice of model, unfiltered, plus vision and live web lookups

Credits

Every message costs credits, and 1 credit = $0.01.

  • Top up with USDC; each message is charged at the rate of the tier you pick.
  • Node operators keep 72% of the USD value of every job they complete (82% once they've staked), paid in USDC. See Staking for the full breakdown.
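
To make the arithmetic concrete, here is a minimal sketch that converts the per-message credit prices from the tier table into dollars and applies the operator shares quoted above. The function names and tier labels (light, heavy, heavy_deep_think) are made up for illustration, and the sketch assumes one message counts as one job and that no rounding is applied; the Staking page remains the authority on actual payouts.

```python
from decimal import Decimal

USD_PER_CREDIT = Decimal("0.01")  # from the docs: 1 credit = $0.01

# Credits per message, taken from the tier table.
# The dictionary keys are illustrative labels, not official identifiers.
CREDITS_PER_MESSAGE = {
    "light": 8,
    "heavy": 14,
    "heavy_deep_think": 18,
}

# Operator share of a job's USD value: 72% unstaked, 82% staked.
OPERATOR_SHARE = {False: Decimal("0.72"), True: Decimal("0.82")}


def message_cost_usd(tier: str) -> Decimal:
    """USD value of one message on the given tier."""
    return CREDITS_PER_MESSAGE[tier] * USD_PER_CREDIT


def operator_payout_usd(tier: str, staked: bool) -> Decimal:
    """USD value the operator keeps for one message (assumes 1 message = 1 job)."""
    return message_cost_usd(tier) * OPERATOR_SHARE[staked]


if __name__ == "__main__":
    for tier in CREDITS_PER_MESSAGE:
        print(
            tier,
            message_cost_usd(tier),
            operator_payout_usd(tier, staked=False),
            operator_payout_usd(tier, staked=True),
        )
```

Under these assumptions a Light message is worth $0.08, of which an unstaked operator keeps $0.0576 and a staked operator $0.0656. Decimal is used rather than float so that cent fractions stay exact.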

What's actually running this

  • Browser nodes run Nimbus 8B straight in a tab via WebGPU. No install required.
  • Rig nodes run the Heavy models on dedicated hardware, accelerated by CUDA, Metal, or Vulkan.
  • A lightweight relay server matches jobs to idle nodes and streams tokens back in real time.
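
The matching step can be pictured with a small in-memory sketch. This is not TALOS's actual relay: the Relay, Node and Job classes and their methods are hypothetical, and the first-come, first-served choice of idle node is an assumption. The only facts taken from the docs are that Light jobs go to browser nodes and Heavy jobs go to rig nodes. Token streaming is left out entirely.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: str
    kind: str  # "browser" or "rig"


@dataclass
class Job:
    job_id: str
    tier: str  # "light" or "heavy"


# Which class of node serves which tier, per the docs.
NODE_KIND_FOR_TIER = {"light": "browser", "heavy": "rig"}


class Relay:
    """Hypothetical matcher: keeps one queue of idle nodes per node kind."""

    def __init__(self) -> None:
        self._idle = {"browser": deque(), "rig": deque()}

    def mark_idle(self, node: Node) -> None:
        """Register a node as free to take a job."""
        self._idle[node.kind].append(node)

    def assign(self, job: Job) -> Optional[Node]:
        """Hand the job to the longest-idle matching node, or None if none is free."""
        queue = self._idle[NODE_KIND_FOR_TIER[job.tier]]
        return queue.popleft() if queue else None


if __name__ == "__main__":
    relay = Relay()
    relay.mark_idle(Node("tab-1", "browser"))
    relay.mark_idle(Node("rig-1", "rig"))
    print(relay.assign(Job("job-1", "heavy")))  # the rig node
    print(relay.assign(Job("job-2", "heavy")))  # None: no rig is idle any more
```

A real relay would also need to handle nodes dropping mid-job, queueing when nothing is idle, and streaming the tokens back to the requester; the sketch only shows the pairing of a job with a node of the right class.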

Why bother

Centralized providers decide what you're allowed to ask, keep a copy of what you asked, and can cut you off whenever they feel like it. TALOS routes around all three: it's unfiltered by default, doesn't retain your prompts, and isn't owned by anyone who could pull the plug. Bring a GPU and get paid by the network, or don't and just use it.