Built for the Gonka decentralized GPU network

Simple access to low-cost, decentralized LLM inference.

Access community-selected models running on the Gonka decentralized GPU network through a familiar API. Gonka Broker adds accounts, API keys, and simple billing, so you can pay with everyday payment methods and skip the crypto setup.

Accepted payment methods: Visa, Mastercard, Apple Pay, Google Pay, UnionPay

Why inference costs are lower

Decentralized inference pricing is driven by an open market of GPU supply and demand. In practice, this often translates into significantly lower cost for comparable model classes.

Price comparison table

Provider    Model                Input (per 1M tokens)    Output (per 1M tokens)
Anthropic   Claude Sonnet 4.5    $3.00                    $15.00
OpenAI      GPT-5                $1.25                    $10.00
Gonka       Qwen/Qwen3-235B      ~$0.35                   ~$0.35
OpenAI      GPT-5 mini           $0.25                    $2.00
Gonka       RedHatAI/Qwen2.5     ~$0.02                   ~$0.02
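
To make the table concrete, the arithmetic for a given workload can be sketched as follows. The workload size (10M input tokens, 2M output tokens) is a hypothetical figure chosen for illustration, the function and variable names are ours, and the Gonka prices are the approximate values listed above, so the results are estimates rather than quotes.

```javascript
// Prices in USD per 1M tokens, copied from the comparison table.
// Gonka figures are approximate ("~") in the table, so treat results as estimates.
const PRICES = {
  "Anthropic Claude Sonnet 4.5": { input: 3.0, output: 15.0 },
  "OpenAI GPT-5": { input: 1.25, output: 10.0 },
  "Gonka Qwen/Qwen3-235B": { input: 0.35, output: 0.35 },
  "OpenAI GPT-5 mini": { input: 0.25, output: 2.0 },
  "Gonka RedHatAI/Qwen2.5": { input: 0.02, output: 0.02 },
};

// Cost = (input tokens / 1M) * input price + (output tokens / 1M) * output price.
function estimateCostUsd(price, inputTokens, outputTokens) {
  const perMillion = 1_000_000;
  return (
    (inputTokens / perMillion) * price.input +
    (outputTokens / perMillion) * price.output
  );
}

// Hypothetical monthly workload, chosen only for illustration.
const INPUT_TOKENS = 10_000_000;
const OUTPUT_TOKENS = 2_000_000;

for (const [name, price] of Object.entries(PRICES)) {
  const cost = estimateCostUsd(price, INPUT_TOKENS, OUTPUT_TOKENS);
  console.log(`${name}: $${cost.toFixed(2)}`);
}
```

For this hypothetical workload, the estimate is about $60.00 on Claude Sonnet 4.5 (10 × $3.00 + 2 × $15.00) and about $4.20 on Qwen/Qwen3-235B (12 × $0.35). The models differ in capability, so this compares price only.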

What Gonka Broker does

The Gonka network provides decentralized GPU compute for open-source language models. Direct access often comes with network-level tooling and on-chain payment flows. Gonka Broker simplifies access to the network for application developers.

The goal is to reduce operational friction while keeping the benefits of decentralized pricing. You authenticate using an API key and pay using familiar payment methods.

  • 🔑 API keys: Access control for apps and environments.
  • 💳 Familiar billing: Pay without dealing with wallets, tokens, or on-chain steps.
  • 🧩 Client compatibility: Use OpenAI client SDKs by pointing to a different base_url.
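
As a rough illustration of the API key and base URL points above, the sketch below builds the two options an OpenAI client needs from environment variables. The variable names GONKA_BROKER_API_KEY and GONKA_BROKER_BASE_URL and the helper brokerClientOptions are hypothetical; this page does not specify the broker's endpoint or configuration names, so substitute the values from your account.

```javascript
// Minimal sketch: collect the API key and base URL for an OpenAI-compatible client.
// GONKA_BROKER_API_KEY and GONKA_BROKER_BASE_URL are hypothetical names, not
// documented settings; use whatever your broker account provides.
function brokerClientOptions(env) {
  const apiKey = env.GONKA_BROKER_API_KEY;
  const baseURL = env.GONKA_BROKER_BASE_URL;

  if (!apiKey) {
    throw new Error("GONKA_BROKER_API_KEY is not set");
  }
  if (!baseURL) {
    throw new Error("GONKA_BROKER_BASE_URL is not set");
  }

  // new URL() throws on malformed input, which surfaces config typos early.
  const url = new URL(baseURL);

  // Strip trailing slashes so the SDK can append paths such as /chat/completions.
  return { apiKey, baseURL: url.toString().replace(/\/+$/, "") };
}

// Usage with the official OpenAI Node SDK (requires `npm install openai`):
//   import OpenAI from "openai";
//   const client = new OpenAI(brokerClientOptions(process.env));
```

Keeping the key and URL in the environment means the same application code can switch between providers or environments without edits.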

Integration

Thanks to native support for OpenAI-compatible client SDKs, integration takes three steps: select an endpoint, create a fetch wrapper from your key, and pass both to a standard OpenAI client.


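// Run as an ES module (for example a .mjs file), since top-level await is used.
import OpenAI from "openai";
import { resolveAndSelectEndpoint, gonkaFetch } from "gonka-openai";

// Step 1: resolve the available endpoints from a source node and select one.
const { selected } = await resolveAndSelectEndpoint({
  sourceUrl: "http://node2.gonka.ai:8000",
});

// Step 2: build a fetch wrapper bound to your key and the selected endpoint.
// Read the key from the environment; never hard-code it.
const customFetch = gonkaFetch({
  gonkaPrivateKey: process.env.YOUR_PRIVATE_API_KEY,
  selectedEndpoint: selected,
});

// Step 3: create a standard OpenAI client that sends requests through the wrapper.
// apiKey is only a placeholder here; authentication relies on the key given to gonkaFetch.
const client = new OpenAI({
  apiKey: "any-string",
  baseURL: selected.url,
  fetch: customFetch,
});

// Send a chat completion request the same way as with the OpenAI API.
const response = await client.chat.completions.create({
  model: "Qwen/Qwen3-32B-FP8",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);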

Operational boundaries

Clear separation of responsibilities between the broker layer and the network.

Broker layer

Accounts, API keys, access control, and USD billing.

Network layer

Inference execution on the Gonka decentralized GPU network.

Data handling

Requests are forwarded for processing; the broker does not store prompts or completions.