Errors · PrivateMind Docs

The API uses standard HTTP status codes. Error bodies follow the OpenAI shape:

JSON

{
  "error": {
    "message": "Human-readable description",
    "type": "<category>",
    "code": "<machine-readable>"
  }
}
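
A minimal sketch of reading this shape on the client side, assuming you already have the HTTP status and the raw response body as a string. The `ApiError` class and `parse_error` function are hypothetical names, not part of any PrivateMind SDK. The fallback branch covers the case where an intermediary returns a non-JSON body.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Hypothetical container for the fields of the documented error shape."""
    status: int
    message: str
    type: Optional[str]
    code: Optional[str]


def parse_error(status: int, body: str) -> ApiError:
    """Extract message/type/code from an error body, tolerating non-JSON bodies."""
    fallback = ApiError(status, body.strip() or "unknown error", None, None)
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return fallback
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return fallback
    return ApiError(status, str(err.get("message", "")), err.get("type"), err.get("code"))
```

Branching on `code` rather than on `message` is usually the safer choice, since the message is meant for humans and may change.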

Common status codes

Code	Meaning	What to do
`200`	Success	-
`400`	Bad request: malformed JSON, unknown field, value out of range, or input that exceeds the context window	Fix the request; check `supported_parameters` for the model
`401`	Authentication failed: key missing, malformed, revoked, or expired	Verify the `Authorization` header; mint a new key
`402`	Budget exhausted: your key or org spend has reached its cap	Raise the key cap in Settings, or ask your org admin
`403`	Forbidden: model not in your org's allowed list	Pick a permitted model (see `GET /v1/models`)
`404`	Unknown endpoint or model id	Check the path; check `/v1/models`
`413`	Payload too large: mainly audio uploads (75 MB cap)	Shorten or split the upload
`429`	Rate limited: too many requests per minute on this key	Back off with exponential delay
`5xx`	Server error: timeout, memory limit, transient failure	Retry with exponential backoff; switch model if persistent

Retry `429` and `5xx`. Don't retry other `4xx` codes; they'll fail the same way until the underlying cause is fixed.
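
The retry rule can be sketched as follows. This is one possible approach, not a prescribed client: `send` is a hypothetical callable that performs one request and returns `(status, body)`, and the base delay, cap, attempt limit, and use of full jitter are assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple


def should_retry(status: int) -> bool:
    """Retry only 429 and 5xx; other 4xx codes are not retried."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Return on success, on a non-retryable error, or on the final attempt.
        if not should_retry(status) or attempt == max_attempts - 1:
            return status, body
        sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

The jitter spreads out retries from clients that were rate limited at the same moment; the `sleep` parameter exists only so the delay can be stubbed out in tests.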

Errors before the stream starts return a regular JSON error body with the appropriate status code.

Errors mid-stream are delivered as an SSE chunk with an `error` field, after which the stream is closed:

Text

data: {"error": {"message": "Upstream timeout", "type": "engine_error", "code": "timeout"}}

Always handle the case where the stream ends with an error chunk instead of `[DONE]`.
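
A minimal sketch of that handling, assuming the stream has already been split into text lines by your HTTP client. `StreamError` and `iter_chunks` are hypothetical names; the only wire details relied on are the `data:` prefix, the `[DONE]` sentinel, and the `error` field shown above.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an SSE chunk carries an error field."""

    def __init__(self, error: dict):
        super().__init__(error.get("message", "stream error"))
        self.type = error.get("type")
        self.code = error.get("code")


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed data chunks; stop at [DONE]; raise StreamError on an error chunk."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        err = chunk.get("error") if isinstance(chunk, dict) else None
        if err:
            raise StreamError(err if isinstance(err, dict) else {"message": str(err)})
        yield chunk
```

Chunks yielded before the error remain valid, so a caller can keep the partial output and decide from `code` whether to retry the request.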

`code` value	Meaning
`budget_exceeded`	Key or org budget is exhausted
`rpm_exceeded`	Too many requests per minute
`timeout`	Model didn't respond in time
`context_exceeded`	Total tokens exceed the model's context window
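
Because a mid-stream error carries no HTTP status, a client may want to decide on the `code` value alone. The sketch below is one way to do that. Which codes count as retryable is inferred from the status table (rate limits and timeouts are retried, budget and context errors are not) and is an assumption rather than documented behavior; `classify` is a hypothetical helper.

```python
from typing import Optional, Tuple

# Assumption: these codes correspond to the retryable 429 and 5xx cases.
RETRYABLE_CODES = frozenset({"rpm_exceeded", "timeout"})

# Assumption: these codes correspond to 402 and 400, which need a fix, not a retry.
FIX_HINTS = {
    "budget_exceeded": "raise the key cap or ask your org admin",
    "context_exceeded": "shorten the input so it fits the model's context window",
}


def classify(code: Optional[str]) -> Tuple[str, str]:
    """Map an error code to ('retry' | 'fix' | 'unknown', hint)."""
    if code in RETRYABLE_CODES:
        return ("retry", "back off with exponential delay")
    if code in FIX_HINTS:
        return ("fix", FIX_HINTS[code])
    return ("unknown", "fall back to the HTTP status code")
```

Codes not listed in the table fall through to `unknown`, so the client degrades to status-based handling instead of guessing.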

Most often: streaming with an intermediate proxy buffering the response. See Streaming.
Otherwise: check the model's status. A model in maintenance is dropped from `GET /v1/models`.