Kunagi Systems: Decentralized GPU inference network

A request moves through four services: client, orchestrator, worker, and settlement. The orchestrator matches work to supply and keeps the user stream open while the worker generates output.

App · API → Router → GPU worker → Token stream

01 · Client

User sends request

The app or API submits the prompt, the requested model, and session metadata.

02 · Orchestrator

Route job

The orchestrator's router selects eligible GPU workers by model, availability, and measured speed.
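
As a rough illustration of the routing step, the sketch below filters workers by model and availability and then prefers the highest measured speed. It is a minimal sketch, not Kunagi's actual router: the Worker fields, the tokens_per_second metric, and the select_worker function are all hypothetical names, and "fastest idle worker wins" is an assumed policy.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Worker:
    """Hypothetical view of one GPU worker as the router might see it."""
    worker_id: str
    models: FrozenSet[str]      # models this worker can serve
    idle: bool                  # availability flag
    tokens_per_second: float    # measured speed (assumed metric)


def select_worker(workers: List[Worker], model: str) -> Optional[Worker]:
    """Return the fastest idle worker that serves `model`, or None."""
    # Eligibility: the worker must be idle and must host the requested model.
    eligible = [w for w in workers if w.idle and model in w.models]
    if not eligible:
        return None
    # Assumed policy: among eligible workers, prefer the highest measured speed.
    return max(eligible, key=lambda w: w.tokens_per_second)
```

When no worker is eligible the function returns None, which a caller could treat as "leave the job in the queue".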

03 · Worker

Run inference

The worker executes the model and streams tokens back to the orchestrator.
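
The streaming hop can be pictured as a pair of generators: the worker yields tokens as they are produced, and the orchestrator relays each one while counting it. This is only a sketch under stated assumptions. fake_worker stands in for real model execution by echoing the prompt's words, and relay and the usage dictionary are hypothetical names.

```python
from typing import Dict, Iterable, Iterator


def fake_worker(prompt: str) -> Iterator[str]:
    """Stand-in for model execution: yields the prompt's words one at a time."""
    for word in prompt.split():
        yield word


def relay(tokens: Iterable[str], usage: Dict[str, int]) -> Iterator[str]:
    """Forward tokens as they arrive and count them for later settlement."""
    for token in tokens:
        usage["tokens"] = usage.get("tokens", 0) + 1
        yield token  # passed on immediately, without buffering the full output
```

Because both functions are generators, nothing runs until the caller iterates, so tokens reach the user stream one at a time instead of after the whole response is complete.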

04 · Settlement

Credit work

Usage is counted, earnings are credited, and prompt context is discarded.
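
The three settlement actions (count usage, credit earnings, discard prompt context) might look like the sketch below. The per-token price, the Job fields, and the settle function are illustrative assumptions, not Kunagi's actual rates or data model.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Optional

# Hypothetical flat rate, used only to make the arithmetic concrete.
PRICE_PER_TOKEN = Decimal("0.000002")


@dataclass
class Job:
    """Hypothetical record of one finished inference job."""
    worker_id: str
    tokens_out: int
    prompt_context: Optional[str]


def settle(job: Job, balances: Dict[str, Decimal]) -> Decimal:
    """Credit the worker for counted usage, then drop the prompt context."""
    earned = PRICE_PER_TOKEN * job.tokens_out
    balances[job.worker_id] = balances.get(job.worker_id, Decimal("0")) + earned
    # Only the usage count and the credit survive; the prompt is not retained.
    job.prompt_context = None
    return earned
```

Decimal is used instead of float so that repeated small credits do not accumulate rounding error.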

Trace

User request
  -> queue by model and tier
  -> select idle worker
  -> stream tokens
  -> record usage
  -> discard prompt context

Request flow