What TALOS is
TALOS is a peer-to-peer inference network. Prompts never touch a company data center; they're picked up by someone's spare graphics card, run there and streamed straight back.
The short version
Every request that hits TALOS is handed off to a machine somewhere on the network, owned by an ordinary person who has opted in to run jobs for pay. There's no central server farm doing the actual thinking. TALOS itself is a thin coordination layer that pairs your prompt with a free card and gets out of the way.
Two ways to run a model
TALOS ships two tiers, each backed by a different class of hardware on the network:
| Tier | Model | Cost | Runs on | Notes |
|---|---|---|---|---|
| Light | Nimbus 8B | 8 credits | Browser node (WebGPU) | ~4.2 GB download, ~6 GB VRAM, unfiltered |
| Heavy | Atlas 30B or Atlas Vision 27B | 14 credits (18 with deep-think) | Rig node | your choice of model, unfiltered, plus vision and live web lookups |
Credits
Every message costs credits: 1 credit = $0.01, topped up with USDC.
- Credits are spent per message, at the rate for the tier you pick.
- Node operators keep 72% of the USD value of every job they complete, paid in USDC (82% once they've staked). See Staking for the full breakdown.
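To make the numbers above concrete, here's a rough sketch of the arithmetic for one message. The prices and percentages come from this page; the function names are made up for illustration, and it assumes the operator's share is taken from the full per-message price with no rounding or extra fees, which this page doesn't spell out.

```python
from decimal import Decimal

CREDIT_USD = Decimal("0.01")  # 1 credit = $0.01

# Per-message prices in credits, copied from the tier table.
TIER_CREDITS = {
    "light": 8,
    "heavy": 14,
    "heavy_deep_think": 18,
}


def message_cost_usd(tier: str) -> Decimal:
    """USD price of a single message on the given tier."""
    return TIER_CREDITS[tier] * CREDIT_USD


def operator_payout_usd(tier: str, staked: bool = False) -> Decimal:
    """Operator's cut of one message: 72%, or 82% once staked.

    Assumption: the cut applies to the whole message price, unrounded.
    """
    share = Decimal("0.82") if staked else Decimal("0.72")
    return message_cost_usd(tier) * share


print(message_cost_usd("light"))                  # 0.08
print(operator_payout_usd("light"))               # 0.0576
print(operator_payout_usd("heavy", staked=True))  # 0.1148
```

So under these assumptions a Light message costs 8 cents, and an unstaked operator keeps a little under 6 cents of it.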
What's actually running this
- Browser nodes run Nimbus 8B straight in a tab via WebGPU. No install required.
- Rig nodes run the Heavy models on dedicated hardware, accelerated by CUDA, Metal, or Vulkan.
- A lightweight relay server matches jobs to idle nodes and streams tokens back in real time.
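The matching step in the list above could look something like this. It's a toy sketch, not the actual relay: the `Relay`, `Node`, and `Job` names are hypothetical, and the first-idle-first-served queue is an assumption, since this page doesn't say how the relay picks a node. The only part taken from this page is the pairing itself: Light jobs go to browser nodes and Heavy jobs go to rig nodes.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: str
    kind: str  # "browser" or "rig"


@dataclass
class Job:
    job_id: str
    tier: str  # "light" or "heavy"


# Which class of node serves which tier, per the tier table.
TIER_TO_KIND = {"light": "browser", "heavy": "rig"}


class Relay:
    """Hypothetical matcher: one queue of idle nodes per hardware class."""

    def __init__(self) -> None:
        self.idle = {"browser": deque(), "rig": deque()}

    def node_idle(self, node: Node) -> None:
        """A node reports that it is free to take a job."""
        self.idle[node.kind].append(node)

    def assign(self, job: Job) -> Optional[Node]:
        """Hand the job to the longest-idle matching node, or None if none is free."""
        queue = self.idle[TIER_TO_KIND[job.tier]]
        return queue.popleft() if queue else None


relay = Relay()
relay.node_idle(Node("tab-1", "browser"))
relay.node_idle(Node("rig-1", "rig"))
picked = relay.assign(Job("job-1", "heavy"))
print(picked.node_id)  # rig-1
```

A real relay would also need to handle nodes dropping mid-job and stream tokens back to the requester; both are left out here.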
Why bother
Centralized providers decide what you're allowed to ask, keep a copy of what you asked and can cut you off whenever they feel like it. TALOS routes around all three: it's unfiltered by default, doesn't retain your prompts and isn't owned by anyone who could pull the plug. Bring a GPU and get paid by the network, or don't and just use it.