Budgets & limits · PrivateMind Docs

PrivateMind enforces limits at three levels: per-API-key, per-user, and per-organization.

Organization limits

Org admins control budgets from /settings/org/overview and /settings/org/chat-usage. What you see depends on whether your organization meters usage by tokens or by cost.

Token-based organizations

The Overview page at /settings/org/overview shows:

Token limit / user / month: a badge on the Token Usage card
Total Tokens (this month): sum across the org for the current calendar month
Avg Tokens / User: total divided by user count
Org Allocation Used: percentage with a progress bar
Top Users by Token Usage: table with each user's tokens and their share of the per-user cap

When the allocation hits 90%, a warning appears. Usage above the cap is throttled via standard HTTP rate-limit responses (see Rate limits & budgets).
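The Overview figures above can be reproduced offline from per-user token counts. The following is a minimal sketch, not PrivateMind's implementation: the function name, the `OverviewStats` type, and the `org_allocation` parameter are hypothetical, and the org allocation is passed in as an input because this page does not say how it is derived.

```python
from dataclasses import dataclass


@dataclass
class OverviewStats:
    total_tokens: int
    avg_tokens_per_user: float
    allocation_used_pct: float
    warning: bool                    # True once usage reaches 90% of the allocation
    cap_share_pct: dict[str, float]  # each user's share of the per-user cap


def overview_stats(tokens_by_user: dict[str, int],
                   per_user_cap: int,
                   org_allocation: int) -> OverviewStats:
    """Recompute the token Overview stats (hypothetical helper)."""
    total = sum(tokens_by_user.values())
    user_count = len(tokens_by_user)
    # "Avg Tokens / User" is the total divided by the user count.
    avg = total / user_count if user_count else 0.0
    used_pct = 100.0 * total / org_allocation if org_allocation else 0.0
    # Integer comparison avoids float rounding right at the 90% boundary.
    warning = org_allocation > 0 and total * 100 >= 90 * org_allocation
    shares = {user: 100.0 * tokens / per_user_cap
              for user, tokens in tokens_by_user.items()}
    return OverviewStats(total, avg, used_pct, warning, shares)
```

The user count here is the number of entries in `tokens_by_user`; if the dashboard counts members with zero usage, include them with a value of 0.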

You cannot raise the per-user token cap from the org admin pages. To request an increase, ask your PrivateMind account contact.

Cost-based organizations

If your org meters by cost, the Overview page shows a Cost stat instead of Tokens. Every API call and chat message is priced using each model's input and output rates; the result is stored against the calling user.

The Chat Usage page at /settings/org/chat-usage is the cost dashboard. It has three tabs:

Overview: total spend, total budget, count of users over budget, top spenders chart
Users & Budgets: every user's monthly budget, current spend, percent used, and an Edit action
Models: usage broken down by model with token and cost totals

Exporting usage to CSV

The Export button on the Chat Usage page downloads a CSV for the period you have selected (the same window selector that drives the dashboard). The file is named chat-usage_<org>_<period>_<date>.csv and opens cleanly in any spreadsheet. Use it when you need to reconcile spend offline, attach figures to an internal report, or, most commonly, charge AI usage back to the user, team, or cost centre that incurred it.

The CSV carries three sections, one after another:

Usage over time: one row per day (or per week for longer windows) with requests, input tokens, output tokens, total tokens, and cost for that bucket.
Usage by model: one row per model id, with the same columns, for the whole period.
Usage by user: one row per user for the whole period.

The Usage by user section is the one to reach for when allocating cost. Its columns are:

Column	Meaning
User	The member's display name (falls back to their email, then their id, if a name isn't set).
Email	The member's email address.
Requests	Number of chat and API calls the user made in the period.
Input Tokens	Tokens the user sent (prompts).
Output Tokens	Tokens the models generated for the user.
Total Tokens	`Input Tokens + Output Tokens`. Rows are sorted by this column, highest first.
Cost (USD)	The user's spend for the period.

Unlike the dashboard's on-screen "top users" panel, this section lists every user who was active in the period, not a top-20 cut, so the totals add up to the org's full spend and nothing falls off the bottom.

Headless traffic from the org's application keys (keys that run automations or service integrations rather than belonging to a person) is collapsed into a single row labelled Application keys, with a blank Email cell. That keeps shared service-account spend from being attributed to whichever user happens to own the key, so a single person's line reflects only their own interactive usage.
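For chargeback, the Usage by user section can be pulled out of the export and summed per cost centre. The sketch below rests on several assumptions that this page does not confirm: that the section's header row carries exactly the column names listed above, that the section ends at a blank line or at the end of the file, and that Cost (USD) cells are plain decimal numbers. The function names, the `cost_centre_by_email` mapping, and the "shared-services" and "unassigned" labels are hypothetical.

```python
import csv
import io
from decimal import Decimal

USER_COLUMNS = ["User", "Email", "Requests", "Input Tokens",
                "Output Tokens", "Total Tokens", "Cost (USD)"]


def read_usage_by_user(csv_text: str) -> list[dict[str, str]]:
    """Return the Usage by user rows as dicts keyed by column name."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Locate the section by its header row rather than by position.
    for index, row in enumerate(rows):
        if [cell.strip() for cell in row] == USER_COLUMNS:
            break
    else:
        raise ValueError("Usage by user header row not found")
    records = []
    for row in rows[index + 1:]:
        if not any(cell.strip() for cell in row):
            break  # assumed: a blank line closes the section
        records.append(dict(zip(USER_COLUMNS, row)))
    return records


def chargeback(records: list[dict[str, str]],
               cost_centre_by_email: dict[str, str],
               shared_centre: str = "shared-services") -> dict[str, Decimal]:
    """Sum cost per cost centre; rows with a blank Email go to shared_centre."""
    totals: dict[str, Decimal] = {}
    for record in records:
        email = record["Email"].strip()
        if email:
            centre = cost_centre_by_email.get(email, "unassigned")
        else:
            # The Application keys row has a blank Email cell.
            centre = shared_centre
        cost = Decimal(record["Cost (USD)"].strip())
        totals[centre] = totals.get(centre, Decimal(0)) + cost
    return totals
```

`Decimal` is used instead of `float` so that per-centre totals reconcile to the cent against the org's total spend.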

Editing a user's budget

From the Users & Budgets tab, click any row to open the user detail panel, then Edit Budget. The dialog accepts a new monthly cap in dollars. Save and the change applies immediately; the user's next request is metered against the new ceiling.

Default budget for new members

A new member receives their organization's default monthly budget automatically on first sign-in. The default only sets the starting value; you can edit any individual's cap from Users & Budgets at any time. As with any budget, it resets at the start of each calendar month (UTC).

What "monthly" means

Budgets reset on the first of each calendar month, UTC. There is no rolling 30-day window. A budget raised mid-month carries into the next month at the new value unless you lower it again.
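To line up client-side reporting with this window, the current period's start and the next reset can be computed from any timezone-aware timestamp. This is a minimal sketch with hypothetical function names; it assumes the reset instant is 00:00 UTC on the first of the month, which is the natural reading of the rule above but is not stated explicitly.

```python
from datetime import datetime, timezone


def period_start(now: datetime) -> datetime:
    """Start of the budget month containing `now`, in UTC."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    utc_now = now.astimezone(timezone.utc)
    return utc_now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def next_reset(now: datetime) -> datetime:
    """First instant of the following calendar month, in UTC."""
    start = period_start(now)
    if start.month == 12:
        return start.replace(year=start.year + 1, month=1)
    return start.replace(month=start.month + 1)
```

Converting to UTC first matters near month boundaries: 23:30 on 31 January in a UTC-5 timezone is already 1 February in UTC, so it falls in the February budget month.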

How spend is calculated

Each chat request bills prompt_tokens × input_rate + completion_tokens × output_rate. Embeddings bill input tokens only. Streamed and non-streamed responses are priced identically; for a stream, the final usage block is the source of truth.
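The pricing rule can be written out as a small calculation. This is a sketch only: the function names are hypothetical, and the rates are assumed to be expressed in USD per token to match the formula as written (if your rate card quotes prices per thousand or per million tokens, convert first).

```python
from decimal import Decimal


def chat_cost(prompt_tokens: int, completion_tokens: int,
              input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """prompt_tokens x input_rate + completion_tokens x output_rate."""
    return prompt_tokens * input_rate + completion_tokens * output_rate


def embedding_cost(input_tokens: int, input_rate: Decimal) -> Decimal:
    """Embeddings bill input tokens only."""
    return input_tokens * input_rate


# Illustrative rates, not real PrivateMind prices.
example = chat_cost(1000, 500, Decimal("0.000003"), Decimal("0.000015"))
# 1000 x 0.000003 + 500 x 0.000015 = 0.003 + 0.0075 = 0.0105 USD
```

For a streamed response, take `prompt_tokens` and `completion_tokens` from the final usage block rather than counting chunks client-side.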

Spend is updated only after each request completes, so it cannot be observed mid-request; instrument your clients if you need finer-grained tracking. See Usage for the API-side view.

Per-API-key budgets and rate limits

The org-level budgets above apply to chat-app users. PrivateMind also enforces per-API-key budgets and per-key rate limits, which are documented from the developer's angle in Rate limits & budgets.

The key things for an org admin to know:

API keys belong to a user. Their spend rolls up to that user's monthly total.
Rate limits are per-key. A user with two keys gets two RPM windows.
Budgets are per-key. A user's overall spend is the sum of their keys' spend plus their chat-app spend, all measured against their monthly budget.
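The roll-up described in the last point amounts to a simple sum. A minimal sketch with hypothetical names, assuming you already hold per-key and chat-app spend figures for the current month:

```python
from decimal import Decimal


def user_spend(spend_by_key: dict[str, Decimal], chat_spend: Decimal) -> Decimal:
    """A user's overall spend: all of their keys plus their chat-app usage."""
    return sum(spend_by_key.values(), Decimal(0)) + chat_spend


def remaining_budget(monthly_budget: Decimal,
                     spend_by_key: dict[str, Decimal],
                     chat_spend: Decimal) -> Decimal:
    """Headroom left this month; zero or negative means the user is at the cap."""
    return monthly_budget - user_spend(spend_by_key, chat_spend)
```

A user with two keys therefore has two separate RPM windows but a single shared monthly total.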

Users mint and manage their own keys from /settings/api-keys. Org admins do not mint keys for other users from the admin pages today.

Limits in action

When a user hits their budget or token cap:

API requests return 402 Payment Required (cost-based budgets) or 429 Too Many Requests (token budgets).
Chat messages in the app return a friendly error directing the user to ask their org admin.

There is no automatic top-up. The user stays blocked until the budget is raised or the calendar month rolls over.
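An API client can branch on these two status codes instead of retrying blindly. The sketch below uses hypothetical names and action labels. It also assumes that a 429 may carry the standard HTTP `Retry-After` header in its seconds form; this page does not confirm that PrivateMind sends it, so the code treats a 429 without the header as a hard stop.

```python
from typing import Optional, Tuple


def plan_for_response(status: int, headers: dict) -> Tuple[str, Optional[float]]:
    """Map a response to (action, delay_seconds).

    Actions: "ok", "retry" (wait delay_seconds first), or "blocked".
    """
    if status == 402:
        # Cost budget exhausted: retrying cannot help until the budget
        # is raised or the calendar month rolls over.
        return ("blocked", None)
    if status == 429:
        # Header names are case-insensitive, so normalise before lookup.
        lowered = {str(name).lower(): str(value) for name, value in headers.items()}
        retry_after = lowered.get("retry-after", "").strip()
        if retry_after.isdigit():
            return ("retry", float(retry_after))
        # No usable hint: could be a token cap, so do not loop.
        return ("blocked", None)
    return ("ok", None)
```

Surfacing "blocked" to the end user with a pointer to their org admin mirrors what the chat app does.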

Where next

Audit logs: every budget edit is logged with who, when, and the new value
Rate limits & budgets: the developer-side reference for 402, 429, and per-key behaviour
Usage: the API endpoint for programmatic spend lookup
My Usage: the single-user view a member sees of their own spend