Simple access to low-cost, decentralized LLM inference.
Access community-selected models running on the Gonka decentralized GPU network through a familiar API. Gonka Broker adds accounts, API keys, and simple billing, so you can pay with everyday payment methods and skip crypto setup.
Why inference costs are lower
Decentralized inference pricing is driven by an open market of GPU supply and demand. In practice, this often translates into a significantly lower cost per token for comparable model classes, as the list prices below illustrate.
| Provider | Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|---|
| Anthropic | Claude 4.5 Sonnet | $3.00 | $15.00 |
| OpenAI | GPT-5 | $1.25 | $10.00 |
| Gonka | Qwen/Qwen3-235B | ~$0.35 | ~$0.35 |
| OpenAI | GPT-5 mini | $0.25 | $2.00 |
| Gonka | RedHatAI/Qwen2.5 | ~$0.02 | ~$0.02 |
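
To make the price gap concrete, the sketch below turns the per-1M-token prices from the table above into a cost estimate for a single workload. The helper name `estimateCostUsd` and the workload size (10M input tokens, 2M output tokens) are assumptions chosen for illustration, and the Gonka figures are approximate in the table, so treat the results as rough estimates rather than quotes.

```typescript
// Prices in USD per 1M tokens, copied from the table above.
// The Gonka rows are marked "~" (approximate) in the source.
interface Price {
  inputPerM: number;
  outputPerM: number;
}

const PRICES: Record<string, Price> = {
  "Claude 4.5 Sonnet": { inputPerM: 3.0, outputPerM: 15.0 },
  "GPT-5": { inputPerM: 1.25, outputPerM: 10.0 },
  "Qwen/Qwen3-235B": { inputPerM: 0.35, outputPerM: 0.35 },
  "GPT-5 mini": { inputPerM: 0.25, outputPerM: 2.0 },
  "RedHatAI/Qwen2.5": { inputPerM: 0.02, outputPerM: 0.02 },
};

// Hypothetical helper: estimated cost in USD for a given token volume.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICES[model];
  if (price === undefined) {
    throw new Error(`No price listed for model: ${model}`);
  }
  return (
    (inputTokens / 1_000_000) * price.inputPerM +
    (outputTokens / 1_000_000) * price.outputPerM
  );
}

// Example workload (an assumption): 10M input tokens, 2M output tokens.
for (const model of Object.keys(PRICES)) {
  const cost = estimateCostUsd(model, 10_000_000, 2_000_000);
  console.log(`${model}: $${cost.toFixed(2)}`);
}
```

For this workload the estimate is about $32.50 on GPT-5 and about $4.20 on Qwen/Qwen3-235B. The ratio shifts with the input/output mix, because the Gonka rows list the same price for input and output tokens while the other rows charge more for output.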
What Gonka Broker does
The Gonka network provides decentralized GPU compute for open-source language models. Direct access often comes with network-level tooling and on-chain payment flows. Gonka Broker simplifies access to the network for application developers.
The goal is to reduce operational friction while keeping the benefits of decentralized pricing. You authenticate using an API key and pay using familiar payment methods.
- 🔑 API keys: Access control for apps and environments.
- 💳 Familiar billing: Pay without dealing with wallets, tokens, or on-chain steps.
- 🧩 Client compatibility: Use OpenAI client SDKs by pointing to a different `base_url`.
Integration
Thanks to native support for OpenAI-compatible client SDKs, integration takes just three straightforward steps.
- ① Sign up and get an API key: Create an account and generate an API key for your application or environment.
- ② Use your API key to send inference requests: Use the native OpenAI client SDK with a Gonka-signed `fetch` to call LLMs on the Gonka network.
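// Requires the "openai" and "gonka-openai" packages.
import OpenAI from "openai";
import { resolveAndSelectEndpoint, gonkaFetch } from "gonka-openai";

// Resolve an endpoint on the network, starting from a source node.
const { selected } = await resolveAndSelectEndpoint({
  sourceUrl: "http://node2.gonka.ai:8000",
});

// Build a fetch implementation that signs requests with your key.
const fetch = gonkaFetch({
  gonkaPrivateKey: process.env.YOUR_PRIVATE_API_KEY,
  selectedEndpoint: selected,
});

// Standard OpenAI client, pointed at the selected endpoint.
// The signed fetch handles authentication, so apiKey is only a placeholder.
const client = new OpenAI({
  apiKey: "any-string",
  baseURL: selected.url,
  fetch,
});

// Send a chat completion request and print the reply.
const response = await client.chat.completions.create({
  model: "Qwen/Qwen3-32B-FP8",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);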
- ③ Top up when you need more tokens: Add balance using familiar payment methods. No wallets or on-chain steps required.
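
The integration example reads the key from `process.env.YOUR_PRIVATE_API_KEY`; if that variable is unset, the failure would likely surface only when the first request is sent. The sketch below is one possible way to validate configuration at startup instead. `loadGonkaConfig`, the `GonkaConfig` shape, and the `GONKA_SOURCE_URL` variable are hypothetical names introduced here for illustration and are not part of `gonka-openai`.

```typescript
// Hypothetical config shape; these field names are illustrative only.
interface GonkaConfig {
  privateKey: string;
  sourceUrl: string;
}

type Env = Record<string, string | undefined>;

// Source node URL taken from the integration example above.
const DEFAULT_SOURCE_URL = "http://node2.gonka.ai:8000";

// Hypothetical helper: read and validate settings before building a client.
function loadGonkaConfig(env: Env): GonkaConfig {
  const privateKey = (env["YOUR_PRIVATE_API_KEY"] ?? "").trim();
  if (privateKey === "") {
    throw new Error("YOUR_PRIVATE_API_KEY is not set");
  }

  // GONKA_SOURCE_URL is an assumed, optional override of the source node.
  const override = (env["GONKA_SOURCE_URL"] ?? "").trim();
  const sourceUrl = override === "" ? DEFAULT_SOURCE_URL : override;
  if (!/^https?:\/\//.test(sourceUrl)) {
    throw new Error(`Source URL must start with http:// or https://: ${sourceUrl}`);
  }

  return { privateKey, sourceUrl };
}

// Demo with a stand-in environment object instead of the real process.env.
const demo = loadGonkaConfig({ YOUR_PRIVATE_API_KEY: "example-key" });
console.log(demo.sourceUrl);
```

In an application you would pass `process.env` to the helper, then hand `privateKey` to `gonkaFetch` as `gonkaPrivateKey` and `sourceUrl` to `resolveAndSelectEndpoint`. Keeping a separate environment per deployment (development, staging, production) fits the per-environment API keys described in step ①.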
Operational boundaries
Clear separation of responsibilities between the broker layer and the network.
Broker layer
Accounts, API keys, access control, and USD billing.
Network layer
Inference execution on the Gonka decentralized GPU network.
Data handling
Requests are forwarded for processing; the broker does not store prompts or completions.