Budgets & limits

How PrivateMind enforces spend and usage limits across API keys, users, and organizations.

PrivateMind enforces limits at three levels: per API key, per user, and per organization.

Organization limits

Org admins control budgets from /settings/org/overview and /settings/org/chat-usage. What you see depends on whether your organization meters usage by tokens or by cost.

Token-based organizations

The Overview page at /settings/org/overview shows:

  • Token limit / user / month: a badge on the Token Usage card
  • Total Tokens (this month): sum across the org for the current calendar month
  • Avg Tokens / User: total divided by user count
  • Org Allocation Used: percentage with a progress bar
  • Top Users by Token Usage: table with each user's tokens and their share of the per-user cap

When the allocation hits 90%, a warning appears. Usage above the cap is throttled via standard HTTP rate-limit responses (see Rate limits & budgets).
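The Overview figures can be reproduced roughly as follows. This is a minimal sketch, not PrivateMind's implementation: the function and field names are hypothetical, and it assumes the org allocation equals the per-user cap multiplied by the number of users, which this page does not state.

```python
from dataclasses import dataclass
from typing import Dict

WARNING_THRESHOLD = 0.90  # the Overview page warns when allocation hits 90%


@dataclass
class OrgTokenStats:
    total_tokens: int           # "Total Tokens (this month)"
    avg_tokens_per_user: float  # "Avg Tokens / User"
    allocation_used: float      # "Org Allocation Used", as a fraction
    warning: bool               # True once the 90% threshold is reached


def org_token_stats(tokens_by_user: Dict[str, int], per_user_cap: int) -> OrgTokenStats:
    """Summarise one calendar month of token usage for an org.

    ASSUMPTION: org allocation = per_user_cap * user count.
    """
    user_count = len(tokens_by_user)
    total = sum(tokens_by_user.values())
    if user_count == 0 or per_user_cap <= 0:
        # Nothing to divide by; report zero usage ratios and no warning.
        return OrgTokenStats(total, 0.0, 0.0, False)
    allocation = per_user_cap * user_count
    used = total / allocation
    return OrgTokenStats(total, total / user_count, used, used >= WARNING_THRESHOLD)
```

For example, two users at 900,000 and 950,000 tokens against a 1,000,000-token cap would show 92.5% allocation used under this assumption, which is past the warning threshold.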

You cannot raise the per-user token cap from the org admin pages. Contact your PrivateMind account contact to request an increase.

Cost-based organizations

If your org meters by cost, the Overview page shows a Cost stat instead of Tokens. Every API call and chat message is priced using each model's input and output rates; the result is stored against the calling user.

The Chat Usage page at /settings/org/chat-usage is the cost dashboard. It has three tabs:

  • Overview: total spend, total budget, count of users over budget, top spenders chart
  • Users & Budgets: every user's monthly budget, current spend, percent used, and an Edit action
  • Models: usage broken down by model with token and cost totals

Editing a user's budget

From the Users & Budgets tab, click any row to open the user detail panel, then click Edit Budget. The dialog accepts a new monthly cap in dollars. Once you save, the change applies immediately; the user's next request is metered against the new ceiling.

What "monthly" means

Budgets reset on the first of each calendar month, UTC. There is no rolling 30-day window. A budget raised mid-month carries into the next month at the new value unless you lower it again.
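If a client needs to know when a budget next resets, the calendar-month rule can be computed locally. The sketch below is illustrative only; next_budget_reset is a hypothetical helper, not part of any PrivateMind SDK.

```python
from datetime import datetime, timedelta, timezone


def next_budget_reset(now: datetime) -> datetime:
    """Return midnight UTC on the first day of the month after `now`.

    `now` must be timezone-aware so it can be converted to UTC safely.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now_utc = now.astimezone(timezone.utc)
    year, month = now_utc.year, now_utc.month
    if month == 12:
        year, month = year + 1, 1  # December rolls over into January
    else:
        month += 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


# 20:00 on 31 March at UTC-5 is already 1 April in UTC,
# so the next reset falls on 1 May, not 1 April.
late_march = datetime(2025, 3, 31, 20, 0, tzinfo=timezone(timedelta(hours=-5)))
print(next_budget_reset(late_march).isoformat())
```

Converting to UTC first matters near month boundaries: a user's local date can lag or lead the UTC date that decides the reset.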

How spend is calculated

Each request bills prompt_tokens × input_rate + completion_tokens × output_rate for chat. Embeddings bill input tokens only. Streamed and non-streamed responses are priced identically. The final usage block on a stream is the source of truth.
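The pricing formula above can be written out as a short sketch. The function names are hypothetical, and the code assumes rates are expressed in dollars per token; check your model's rate card for the actual unit. Decimal is used to avoid floating-point drift when summing many small charges.

```python
from decimal import Decimal


def chat_cost(prompt_tokens: int, completion_tokens: int,
              input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """prompt_tokens * input_rate + completion_tokens * output_rate.

    ASSUMPTION: both rates are dollars per single token.
    """
    return prompt_tokens * input_rate + completion_tokens * output_rate


def embedding_cost(input_tokens: int, input_rate: Decimal) -> Decimal:
    """Embeddings bill input tokens only."""
    return input_tokens * input_rate


# Illustrative rates only; these are not PrivateMind prices.
cost = chat_cost(1000, 500, Decimal("0.000003"), Decimal("0.000015"))
print(cost)
```

For a streamed response, take the token counts from the final usage block, since that is the source of truth for billing.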

Spend is updated after each request completes. There is no mid-flight observation; instrument clients if you need finer-grained tracking. See Usage for the API-side view.

Per-API-key budgets and rate limits

The org-level budgets above are about chat-app users. PrivateMind also issues per-API-key budgets and per-key rate limits, which are documented from the developer's angle in Rate limits & budgets.

The key things for an org admin to know:

  • API keys belong to a user. Their spend rolls up to that user's monthly total.
  • Rate limits are per-key. A user with two keys gets two RPM windows.
  • Budgets are per-key, but key spend also counts toward the owner's total. A user's overall spend is the sum of their keys' spend plus their chat-app spend, all measured against their monthly budget.
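The roll-up described in the list above amounts to a simple sum. The sketch below uses hypothetical function names, and it assumes a user counts as blocked once spend reaches the budget exactly; this page does not say whether the comparison is strict.

```python
from decimal import Decimal
from typing import Iterable


def user_month_spend(key_spend: Iterable[Decimal], chat_spend: Decimal) -> Decimal:
    """Sum spend across all of a user's API keys plus their chat-app spend."""
    return sum(key_spend, Decimal("0")) + chat_spend


def remaining_budget(total_spend: Decimal, monthly_budget: Decimal) -> Decimal:
    """Dollars left this calendar month; negative if the user is over budget."""
    return monthly_budget - total_spend


def is_exhausted(total_spend: Decimal, monthly_budget: Decimal) -> bool:
    """ASSUMPTION: reaching the budget exactly counts as hitting it."""
    return total_spend >= monthly_budget
```

With two keys at $12.50 and $3.25 and $4.00 of chat-app spend, the user's total is $19.75, leaving $0.25 of a $20.00 monthly budget.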

Users mint and manage their own keys from /settings/api-keys. Org admins do not mint keys for other users from the admin pages today.

Limits in action

When a user hits their budget or token cap:

  • API requests return 402 Payment Required (cost-based budgets) or 429 Too Many Requests (token budgets).
  • Chat messages in the app return a friendly error directing the user to ask their org admin.

There is no automatic top-up. The user stays blocked until the budget is raised or the calendar month rolls over.
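On the client side, it can help to separate the two status codes before deciding what to do next. The sketch below is one possible approach, with hypothetical names; the retry guidance is an assumption, since a 429 may signal either a short-lived rate limit or an exhausted token cap.

```python
from enum import Enum


class LimitKind(Enum):
    NONE = "none"
    COST_BUDGET = "cost_budget"      # 402 Payment Required
    TOKEN_OR_RATE = "token_or_rate"  # 429 Too Many Requests


def classify_limit(status_code: int) -> LimitKind:
    """Map an HTTP status code to the kind of limit it signals."""
    if status_code == 402:
        return LimitKind.COST_BUDGET
    if status_code == 429:
        return LimitKind.TOKEN_OR_RATE
    return LimitKind.NONE


def worth_retrying(kind: LimitKind) -> bool:
    """ASSUMPTION: only a 429 might clear on its own after a backoff.

    A 402 needs a budget increase or the monthly reset, so retrying
    immediately will not help.
    """
    return kind is LimitKind.TOKEN_OR_RATE
```

A caller might back off and retry a few times on TOKEN_OR_RATE, then surface a message asking the user to contact their org admin if the error persists or if the response was a 402.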

Where next

  • Audit logs: every budget edit is logged with who, when, and the new value
  • Rate limits & budgets: the developer-side reference for 402, 429, and per-key behaviour
  • Usage: the API endpoint for programmatic spend lookup