Request flow

A request moves through four stages: client, orchestrator, worker, and settlement. The orchestrator matches each job to available GPU supply and keeps the user's token stream open while the worker generates output.

App · API → Router → GPU worker → Token stream
01 · Client
User sends request

The app or API submits prompt, model, and session metadata.

02 · Orchestrator
Route job

The router selects eligible GPU supply by model, availability, and measured speed.
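
The routing step can be sketched as a filter followed by a ranking. This is a minimal illustration, not the actual router: the `Worker` fields, the `select_worker` name, and the choice to pick the single fastest eligible worker are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Worker:
    # All fields are hypothetical; the source names only the three criteria.
    worker_id: str
    models: frozenset          # models this worker can serve
    idle: bool                 # availability
    tokens_per_second: float   # measured speed


def select_worker(workers: Iterable[Worker], model: str) -> Optional[Worker]:
    """Return the fastest idle worker that serves `model`, or None."""
    # Eligibility: the worker must serve the model and be available.
    eligible = [w for w in workers if w.idle and model in w.models]
    if not eligible:
        return None
    # Ranking: prefer the highest measured speed (an assumed policy).
    return max(eligible, key=lambda w: w.tokens_per_second)
```

A real router would probably also weigh tier, queue depth, or locality. Returning `None` leaves the decision to wait or reject with the caller.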

03 · Worker
Run inference

The worker executes the model and streams tokens back to the orchestrator.

04 · Settlement
Credit work

Usage is counted, earnings are credited, and prompt context is discarded.
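
A rough sketch of the settlement step, assuming usage is measured in output tokens and earnings accrue to the worker at a flat per-token rate. The rate, the `Job` fields, and the `settle` function are hypothetical; the source says only that usage is counted, earnings are credited, and prompt context is discarded.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional

# Hypothetical flat rate, for illustration only.
RATE_PER_TOKEN = Decimal("0.000001")


@dataclass
class Job:
    worker_id: str
    prompt: Optional[str]   # prompt context held while the job runs
    tokens_out: int         # usage counted during streaming


def settle(job: Job, ledger: Dict[str, Decimal]) -> Decimal:
    """Credit the worker for the job's usage, then drop the prompt."""
    earned = RATE_PER_TOKEN * job.tokens_out
    ledger[job.worker_id] = ledger.get(job.worker_id, Decimal(0)) + earned
    # Discard prompt context once usage has been recorded.
    job.prompt = None
    return earned
```

`Decimal` is used instead of `float` so that repeated small credits do not accumulate rounding error in the ledger.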

Trace
User request
  -> queue by model and tier
  -> select idle worker
  -> stream tokens
  -> record usage
  -> discard prompt context
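
The trace above can be read as one pass through a small generator. The sketch below is illustrative only: `enqueue`, `serve_next`, the dictionary-based queues, and the `generate` callable standing in for the model are all assumed names, and it does not handle an empty queue or an empty worker pool.

```python
from collections import deque


def enqueue(queues, model, tier, prompt):
    # Queue by model and tier: one FIFO per (model, tier) pair.
    queues.setdefault((model, tier), deque()).append({"prompt": prompt})


def serve_next(queues, idle_workers, usage, model, tier, generate):
    """Yield tokens for the next queued request, then settle it."""
    request = queues[(model, tier)].popleft()
    worker = idle_workers.pop()            # select an idle worker
    count = 0
    try:
        for token in generate(request["prompt"]):
            count += 1
            yield token                    # stream tokens to the caller
    finally:
        # Runs even if the caller stops reading the stream early.
        usage[worker] = usage.get(worker, 0) + count   # record usage
        request["prompt"] = None           # discard prompt context
        idle_workers.append(worker)        # worker becomes idle again
```

For example, with `generate=lambda p: iter(p.split())` as a stand-in model, iterating `serve_next(...)` yields one token per word of the prompt and leaves the worker's token count in `usage`.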