Documentation

Tokenless exposes an OpenAI-compatible API. If you've called the OpenAI Chat Completions endpoint, you already know how to use us — point your SDK at our base URL and drop in a key.

Quickstart

1. Create an account and grab a key from the dashboard.
2. Add a little balance under Billing (your first $1 is free).
3. Make your first call:

quickstart.sh

curl https://api.tokenless.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $TOKENLESS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "opus-4.8",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Base URL. Production is https://api.tokenless.ai/api/v1. When running locally with pnpm run dev, the gateway is at http://localhost:8080/api/v1.

Authentication

Authenticate with a bearer token in the Authorization header. Keys start with sk-tk- and are shown once at creation — store them securely and rotate from the dashboard anytime.

http

Authorization: Bearer sk-tk-xxxxxxxxxxxxxxxxxxxxxxxx
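
If you build requests by hand rather than through an SDK, a small helper can catch a mistyped key before it reaches the API. This is a minimal sketch: auth_headers is a hypothetical helper name, not part of any Tokenless SDK, and the only check it makes is the sk-tk- prefix described above.

```python
KEY_PREFIX = "sk-tk-"  # key prefix as documented above


def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers for a Tokenless call (hypothetical helper)."""
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError(f"expected a key starting with {KEY_PREFIX!r}")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

In practice you would read the key from an environment variable such as TOKENLESS_API_KEY (the name used in quickstart.sh) instead of hard-coding it.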

Chat completions

POST /api/v1/chat/completions — the request and response follow the OpenAI schema.

chat.py

from openai import OpenAI

client = OpenAI(
    base_url="https://api.tokenless.ai/api/v1",
    api_key="sk-tk-...",
)

resp = client.chat.completions.create(
    model="opus-4.8",
    messages=[{"role": "user", "content": "Write a haiku about glass."}],
)
print(resp.choices[0].message.content)
print(resp.usage)  # includes prompt/completion tokens and cost

Streaming

Set stream: true to receive Server-Sent Events. Tokens stream over a dedicated gateway built to hold long-lived connections — the final event includes usage and cost.

stream.ts

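import OpenAI from "openai";

// Same client setup as chat.py, using the OpenAI Node SDK.
const client = new OpenAI({
  baseURL: "https://api.tokenless.ai/api/v1",
  apiKey: process.env.TOKENLESS_API_KEY,
});

const stream = await client.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Stream a story." }],
  stream: true,
  stream_options: { include_usage: true },
});

for await (const chunk of stream) {
  // A final usage-only chunk may carry no choices, hence the optional chaining.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}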
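
If you consume the stream without an SDK, you will be reading raw Server-Sent Events. The sketch below assumes the stream follows the OpenAI convention of data: lines carrying JSON chunks and a closing data: [DONE] sentinel; that is an assumption based on the OpenAI compatibility stated above, not confirmed wire output. The function names and the sample events are illustrative, not captured from the gateway.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON payloads from 'data:' lines until '[DONE]'."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed OpenAI-style end-of-stream sentinel
            break
        yield json.loads(payload)


def collect(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join the streamed text and return it with the last usage object seen."""
    parts, usage = [], None
    for chunk in iter_sse_chunks(lines):
        choices = chunk.get("choices") or []
        if choices:
            parts.append(choices[0].get("delta", {}).get("content") or "")
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# Illustrative events shaped like OpenAI streaming chunks.
SAMPLE = [
    'data: {"choices": [{"delta": {"content": "Once"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " upon"}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 242, "cost": 0.0028}}',
    "",
    "data: [DONE]",
]

text, usage = collect(SAMPLE)
print(text)   # Once upon
print(usage)  # {'total_tokens': 242, 'cost': 0.0028}
```

Keeping the line parser separate from the accumulator means the same parser can feed a UI that renders tokens as they arrive.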

Models

GET /api/v1/models returns the catalog with live per-token pricing. Pass a model's id as the model field.

Model id	Provider	Input ($ / 1M tokens)	Output ($ / 1M tokens)
opus-4.8	Anthropic	$2.50	$12.50
opus-4.7	Anthropic	$2.50	$12.50
opus-4.6	Anthropic	$2.50	$12.50
sonnet-4.6	Anthropic	$1.50	$7.50
haiku-4.5	Anthropic	$0.50	$2.50
gpt-5.5	OpenAI	$2.50	$15.00
gpt-5.4	OpenAI	$1.25	$7.50
gpt-5.3	OpenAI	$0.88	$7.00

Usage & cost

Every response carries a usage object. Costs are reported in USD at the Tokenless rate (50% of list) and drawn from your prepaid balance.

usage.json

{
  "prompt_tokens": 18,
  "completion_tokens": 224,
  "total_tokens": 242,
  "prompt_tokens_details": { "cached_tokens": 0 },
  "cost": 0.0028
}
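
To sanity-check a cost figure, you can recompute it from the token counts. This sketch assumes the prices in the Models table are USD per million tokens and already reflect the Tokenless rate. The sample above does not name its model, but the opus-4.8 rates reproduce its 0.0028 under that assumption. estimate_cost is a hypothetical helper, the hard-coded prices are a snapshot of the table rather than the live catalog, and any separate pricing for cached tokens is ignored.

```python
# (input, output) USD per 1M tokens, copied from the Models table above.
PRICES_PER_MILLION = {
    "opus-4.8": (2.50, 12.50),
    "gpt-5.5": (2.50, 15.00),
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    in_rate, out_rate = PRICES_PER_MILLION[model]
    return (prompt_tokens * in_rate + completion_tokens * out_rate) / 1_000_000


cost = estimate_cost("opus-4.8", prompt_tokens=18, completion_tokens=224)
print(f"{cost:.6f}")   # 0.002845
print(round(cost, 4))  # 0.0028
```

For real budgeting, prefer the cost field returned by the API and the prices from GET /api/v1/models, since the table can change.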

Balance & billing

Tokenless is prepaid. Load a balance, and each request draws it down at the metered rate. Turn on auto-reload to recharge automatically when your balance gets low. If auto-reload is off and your balance reaches $0, API access pauses immediately — no overage, no debt.

Errors

Status	Meaning
401	Missing or invalid API key.
402	Insufficient balance — top up to continue.
404	Unknown model id.
429	Rate limited — slow down and retry.
500	Something went wrong on our side.
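
One way to act on these codes is to retry only the transient ones. The sketch below is a suggested client-side policy, not documented Tokenless behavior: it treats 429 and 500 as retryable with exponential backoff and jitter, and fails fast on 401, 402 and 404, where retrying cannot help. The send callable, the helper names, and the backoff parameters are all assumptions for illustration.

```python
import random
import time
from typing import Callable, Tuple

RETRYABLE = {429, 500}  # assumed transient: rate limit and server error
HINTS = {
    401: "check the API key",
    402: "top up the balance",
    404: "check the model id",
}


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter; attempt counts from 0."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retries(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it succeeds, retrying only transient statuses."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        last_try = attempt == max_attempts - 1
        if status not in RETRYABLE or last_try:
            hint = HINTS.get(status, "request failed")
            raise RuntimeError(f"HTTP {status}: {hint}")
        sleep(backoff_delay(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

Here send is any zero-argument function that performs the request and returns a (status, parsed_body) pair, which keeps the policy independent of the HTTP library you use.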
