Documentation

Tokenless exposes an OpenAI-compatible API. If you've called the OpenAI Chat Completions endpoint, you already know how to use Tokenless: point your SDK at our base URL and swap in a Tokenless key.

Quickstart

  1. Create an account and grab a key from the dashboard.
  2. Add a little balance under Billing (your first $1 is free).
  3. Make your first call:
quickstart.sh
curl https://api.tokenless.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $TOKENLESS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "opus-4.8",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
Base URL. Production is https://api.tokenless.ai/api/v1. When you run the project locally with pnpm run dev, the gateway is at http://localhost:8080/api/v1.

Authentication

Authenticate with a bearer token in the Authorization header. Keys start with sk-tk- and are shown only once, at creation. Store them securely; you can rotate them from the dashboard at any time.

bash
Authorization: Bearer sk-tk-xxxxxxxxxxxxxxxxxxxxxxxx
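
To keep the key out of source code, the header can be built from an environment variable. This is a minimal sketch: the TOKENLESS_API_KEY variable name is borrowed from the quickstart, while the auth_headers helper and its prefix check are illustrative assumptions, not part of any Tokenless SDK.

```python
import os

KEY_PREFIX = "sk-tk-"  # key prefix as stated in the Authentication section


def auth_headers(api_key=None):
    """Return request headers carrying the bearer token.

    Hypothetical helper: falls back to the TOKENLESS_API_KEY environment
    variable when no key is passed explicitly.
    """
    key = api_key if api_key is not None else os.environ.get("TOKENLESS_API_KEY", "")
    if not key.startswith(KEY_PREFIX):
        # Catch an empty or mistyped key before it turns into a 401.
        raise ValueError("expected a key starting with " + KEY_PREFIX)
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Failing early on a malformed key is a design choice for the sketch; the server remains the authority on whether a key is valid.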

Chat completions

POST /api/v1/chat/completions — the request and response follow the OpenAI schema.

chat.py
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenless.ai/api/v1",
    api_key="sk-tk-...",
)

resp = client.chat.completions.create(
    model="opus-4.8",
    messages=[{"role": "user", "content": "Write a haiku about glass."}],
)
print(resp.choices[0].message.content)
print(resp.usage)  # includes prompt/completion tokens and cost

Streaming

Set stream: true to receive Server-Sent Events. Tokens stream over a dedicated gateway built to hold long-lived connections, and the final event includes usage and cost.

stream.ts
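import OpenAI from "openai";

// Same base URL as chat.py; the key is read from the environment.
const client = new OpenAI({
  baseURL: "https://api.tokenless.ai/api/v1",
  apiKey: process.env.TOKENLESS_API_KEY,
});

const stream = await client.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Stream a story." }],
  stream: true,
  stream_options: { include_usage: true },
});

for await (const chunk of stream) {
  // In the OpenAI schema the usage chunk has no choices, hence the optional chaining.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}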
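
If you read the event stream without an SDK, the lines have to be decoded by hand. The sketch below assumes OpenAI-style framing, which this page implies but does not spell out: one `data: <json>` line per event and a closing `data: [DONE]` sentinel. The collect_stream name and the sample events are made up for illustration.

```python
import json


def collect_stream(lines):
    """Accumulate text and the last usage object from SSE lines.

    Assumes OpenAI-style framing: `data: <json>` per event and a closing
    `data: [DONE]` sentinel. Not a full SSE parser (no multi-line data).
    """
    text_parts = []
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if choices:
            delta = choices[0].get("delta") or {}
            text_parts.append(delta.get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(text_parts), usage


# Illustrative events only; real chunks carry more fields.
sample = [
    'data: {"choices": [{"delta": {"content": "Once "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "upon a time"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 242, "cost": 0.0028}}',
    "",
    "data: [DONE]",
]
text, usage = collect_stream(sample)
print(text)
print(usage)
```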

Models

GET /api/v1/models returns the catalog with live per-token pricing. Pass a model's id as the model field.

Model id      Provider    Input / 1M tokens   Output / 1M tokens
opus-4.8      Anthropic   $2.50               $12.50
opus-4.7      Anthropic   $2.50               $12.50
opus-4.6      Anthropic   $2.50               $12.50
sonnet-4.6    Anthropic   $1.50               $7.50
haiku-4.5     Anthropic   $0.50               $2.50
gpt-5.5       OpenAI      $2.50               $15.00
gpt-5.4       OpenAI      $1.25               $7.50
gpt-5.3       OpenAI      $0.88               $7.00
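
A rough sketch of fetching the catalog with the Python standard library. Two things here are assumptions rather than documented behavior: that the endpoint takes the same bearer token as chat completions, and that the body follows the OpenAI list shape (`{"data": [{"id": ...}]}`). The pricing field names are not documented on this page, so the sketch reads only the ids.

```python
import json
import urllib.request

BASE_URL = "https://api.tokenless.ai/api/v1"


def fetch_catalog(api_key):
    """GET /models and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/models",
        # Assumption: the catalog endpoint accepts the usual bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def model_ids(catalog):
    """Extract ids, assuming an OpenAI-style {"data": [{"id": ...}]} body."""
    return [entry["id"] for entry in catalog.get("data", [])]
```

Calling `model_ids(fetch_catalog(key))` would then give the values accepted in the model field, provided the response has the assumed shape.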

Usage & cost

Every response carries a usage object. Costs are reported in USD at the Tokenless rate (50% of list) and drawn from your prepaid balance.

usage.json
{
  "prompt_tokens": 18,
  "completion_tokens": 224,
  "total_tokens": 242,
  "prompt_tokens_details": { "cached_tokens": 0 },
  "cost": 0.0028
}
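
As a sanity check, the cost field can be approximated from the token counts and the per-million rates in the Models table. This sketch assumes the table already lists the rate actually charged, and it ignores cached-token discounts and any server-side rounding, none of which are specified here. The estimate_cost helper is hypothetical.

```python
# Per-million-token rates copied from the Models table (USD, input/output).
RATES = {
    "opus-4.8": (2.50, 12.50),
    "gpt-5.5": (2.50, 15.00),
}


def estimate_cost(model, usage):
    """Estimate the USD cost of one response from its usage object."""
    rate_in, rate_out = RATES[model]
    return (
        usage["prompt_tokens"] * rate_in
        + usage["completion_tokens"] * rate_out
    ) / 1_000_000


usage = {"prompt_tokens": 18, "completion_tokens": 224}
print(estimate_cost("opus-4.8", usage))
```

For the sample usage object above at opus-4.8 rates, this works out to 18 × $2.50 / 1M + 224 × $12.50 / 1M, or roughly $0.0028, in line with the reported cost.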

Balance & billing

Tokenless is prepaid. Load a balance, and each request draws it down at the metered rate. Turn on auto-reload to recharge automatically when your balance gets low. If auto-reload is off and your balance reaches $0, API access pauses immediately — no overage, no debt.

Errors

Status   Meaning
401      Missing or invalid API key.
402      Insufficient balance; top up to continue.
404      Unknown model id.
429      Rate limited; slow down and retry.
500      Something went wrong on our side.
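
One way to act on these codes in a client is to retry only the transient ones (429 and 500) with exponential backoff and fail fast on the rest. This is a sketch under assumptions: the retry limits and delays are arbitrary, the page does not say whether a Retry-After header is sent, and `send` stands in for whatever function performs the HTTP request.

```python
import random
import time

RETRYABLE = {429, 500}  # transient according to the table above


def should_retry(status):
    return status in RETRYABLE


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff with full jitter; attempt starts at 0."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it succeeds or retries run out.

    `send` is any zero-argument callable returning (status, body).
    401, 402, and 404 are raised immediately: retrying cannot fix a bad
    key, an empty balance, or an unknown model id.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if not should_retry(status) or attempt == max_attempts - 1:
            raise RuntimeError(f"request failed with HTTP {status}")
        sleep(backoff_delay(attempt))
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.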

Ready to build?

Grab a key and make your first call in under a minute.