Sources

Upload files for retrieval, then list what a user or conversation can reach.

A source is a piece of content the chat plane can search against. Most often it is a file (PDF, DOCX, plain text, etc.) that has been extracted, chunked, and embedded into a searchable index at ingest time. Two read endpoints list the sources attached to a user or to a specific conversation, and one write endpoint ingests a file directly.

List sources

GET /v1/sources returns every source the calling user can see: files they own, group-shared sources, and org-level sources surfaced by an admin.

cURL
curl -s "https://api.privatemind.com/v1/sources" \
  -H "Authorization: Bearer $PMIND_USER_KEY"

Response:

JSON
{
  "success": true,
  "total": 3,
  "body": [
    {
      "id": 482,
      "source_type": "vectorized",
      "source_name": "file_quarterly-report.pdf",
      "source_description": "Vectorized file \"quarterly-report.pdf\" containing searchable content",
      "handler": "vectorFileHandler",
      "source_config": {
        "conversation_id": 1207,
        "collection_name": "vec_482",
        "file_id": 482,
        "file_name": "quarterly-report.pdf",
        "file_type": "application/pdf",
        "file_size": 184302,
        "chunks_count": 47,
        "stored_at": "2026-05-20T14:02:11.482Z"
      },
      "conversation_id": 1207,
      "is_active": true,
      "use_in_tasks": true,
      "created_at": "2026-05-20T14:02:11.482Z",
      "updated_at": "2026-05-20T14:02:11.482Z"
    }
  ]
}

Fields worth knowing:

Field            Meaning
source_type      vectorized for RAG files, tabular for spreadsheets, custom / mcp_web for org-level connectors.
handler          Backend handler that runs against the source at retrieval time. vectorFileHandler for vectorized files.
source_config    Handler-specific config. For indexed files: the collection name, file metadata, chunk count. Secrets are redacted unless the caller owns the source.
conversation_id  The conversation this source row is attached to, or null for an org-level source not bound to any one chat.
is_active        Whether the source is currently considered "on" for the conversation it's mapped to.
use_in_tasks     Whether agentic task runs should include this source in tool dispatch.

A single source can be attached to multiple conversations, so it may appear more than once in the list.
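
Because the same source can come back once per conversation, a client may want to group rows before displaying them. The following is a minimal Python sketch, not an official client: it assumes repeated rows share the same id and differ only in conversation_id, and the helper names fetch_sources and group_by_source are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://api.privatemind.com"  # base URL taken from the cURL examples


def fetch_sources(user_key: str) -> list[dict]:
    """Call GET /v1/sources and return the rows from the "body" array."""
    req = urllib.request.Request(
        f"{API_BASE}/v1/sources",
        headers={"Authorization": f"Bearer {user_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["body"]


def group_by_source(rows: list[dict]) -> dict[int, list]:
    """Map each source id to the conversation_ids it is attached to.

    Assumption: a source attached to several conversations appears as
    several rows with the same "id". Org-level rows contribute None.
    """
    grouped: dict[int, list] = {}
    for row in rows:
        grouped.setdefault(row["id"], []).append(row.get("conversation_id"))
    return grouped
```

Usage: group_by_source(fetch_sources(key)) yields one entry per distinct source, with the list of conversations it is mapped to.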

List sources on a conversation

GET /v1/conversations/{id}/sources narrows the result to one conversation. Only sources actively mapped to that conversation come back.

cURL
curl -s "https://api.privatemind.com/v1/conversations/1207/sources" \
  -H "Authorization: Bearer $PMIND_USER_KEY"

Rows have the same shape as in GET /v1/sources. The endpoint returns 404 if the conversation doesn't belong to the calling user.
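
A caller will usually want to tell "not my conversation" apart from other failures. The sketch below, in Python with only the standard library, returns None on 404 and re-raises everything else. It assumes the response uses the same success/total/body envelope as GET /v1/sources, which this page does not state explicitly; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.privatemind.com"  # base URL taken from the cURL examples


def conversation_sources_url(conversation_id: int) -> str:
    """Build the URL for GET /v1/conversations/{id}/sources."""
    return f"{API_BASE}/v1/conversations/{conversation_id}/sources"


def fetch_conversation_sources(user_key: str, conversation_id: int):
    """Return the source rows for one conversation, or None on 404."""
    req = urllib.request.Request(
        conversation_sources_url(conversation_id),
        headers={"Authorization": f"Bearer {user_key}"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            # Assumption: rows sit under "body", as in GET /v1/sources.
            return json.load(resp)["body"]
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # conversation does not belong to the calling user
        raise
```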

Ingesting a file

POST /v1/conversations/{id}/files/vectorize takes a single file as multipart form data, extracts its text, chunks it, embeds the chunks, and writes both vectors and a source row in one shot. Pass id=0 (or any non-existent id) and the server will create a new conversation and return its id.

cURL
curl -s "https://api.privatemind.com/v1/conversations/0/files/vectorize" \
  -H "Authorization: Bearer $PMIND_USER_KEY" \
  -F "file=@quarterly-report.pdf"

Successful response (201):

JSON
{
  "success": true,
  "message": "File vectorized with 47 chunks successfully",
  "total": {
    "conversation_id": 1207,
    "vectorized_file": {
      "id": 482,
      "file_name": "quarterly-report.pdf",
      "collection_name": "vec_482",
      "file_type": "application/pdf",
      "file_size": 184302,
      "created_at": "2026-05-20T14:02:11.482Z"
    }
  }
}

If the same file (same SHA-256) is uploaded again by the same user, the server skips re-embedding and links the existing source to the new conversation. The response carries total.duplicate: true and the existing file_id.
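
The same upload can be done without cURL. This is a minimal Python sketch using only the standard library: it builds the multipart body by hand with the part named file, posts it, and reads conversation_id and the duplicate flag out of total. Guessing the part's Content-Type from the file extension is an assumption (the page does not say how the server determines file type), and build_multipart, vectorize_file, and summarize_upload are hypothetical helper names.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

API_BASE = "https://api.privatemind.com"  # base URL taken from the cURL examples


def build_multipart(file_name: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file under the form key "file".

    Returns (request body, value for the Content-Type header).
    """
    boundary = uuid.uuid4().hex
    # Assumption: a guessed MIME type is acceptable to the server.
    mime = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def vectorize_file(user_key: str, path: str, conversation_id: int = 0) -> dict:
    """POST the file; conversation_id=0 asks the server for a new conversation."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(os.path.basename(path), fh.read())
    req = urllib.request.Request(
        f"{API_BASE}/v1/conversations/{conversation_id}/files/vectorize",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {user_key}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize_upload(payload: dict) -> dict:
    """Pull out the conversation id and whether the upload was a duplicate."""
    total = payload["total"]
    return {
        "conversation_id": total.get("conversation_id"),
        "duplicate": bool(total.get("duplicate", False)),
    }
```

Usage: summarize_upload(vectorize_file(key, "quarterly-report.pdf")) gives the conversation to use for follow-up calls and tells the caller whether embedding was skipped.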

Form fields

Field                              Required  Notes
file                               yes       Single file part. The multipart key must be exactly file.
?ephemeral=true (query parameter)  no        Marks the auto-created conversation as ephemeral; purged by the chat-cleanup job.

Supported file types

Text extracted in-process by the backend before embedding:

  • Plain text: .txt, .md, .markdown, .log, .json, .jsonl, .ndjson, .geojson, .xml, .xhtml, .yaml, .yml, .sql, .py, .js, .ts, .r, .dat
  • Documents: .pdf, .docx, .pptx, .odt, .odp, .rtf, .html, .htm

Legacy .doc (OLE binary) is rejected: re-save as .docx. Spreadsheets (.csv, .xlsx, .ods, etc.) and images go through separate /files/tabular and /files/ocr ingestion routes, not documented here.
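
A client can avoid a round trip that ends in 400 by checking the extension before uploading. The Python sketch below encodes the lists above; the route labels it returns are illustrative names chosen here, not API values. The tabular set covers only the three spreadsheet extensions the page names (the page says "etc."), and image extensions are not listed on this page, so they fall through to "unknown".

```python
from pathlib import Path

# Extensions the vectorize route extracts in-process, per the lists above.
VECTORIZE_EXTENSIONS = {
    ".txt", ".md", ".markdown", ".log", ".json", ".jsonl", ".ndjson",
    ".geojson", ".xml", ".xhtml", ".yaml", ".yml", ".sql", ".py", ".js",
    ".ts", ".r", ".dat",
    ".pdf", ".docx", ".pptx", ".odt", ".odp", ".rtf", ".html", ".htm",
}

# Only the spreadsheet extensions named on this page; not exhaustive.
TABULAR_EXTENSIONS = {".csv", ".xlsx", ".ods"}


def ingestion_route(file_name: str) -> str:
    """Return "vectorize", "tabular", "rejected", or "unknown" for a file name."""
    ext = Path(file_name).suffix.lower()
    if ext == ".doc":
        return "rejected"  # legacy OLE binary: re-save as .docx
    if ext in VECTORIZE_EXTENSIONS:
        return "vectorize"
    if ext in TABULAR_EXTENSIONS:
        return "tabular"
    return "unknown"
```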

Size limits

  • 75 MB per request (total body size cap).
  • 50 MB per file. Files between 50 MB and 75 MB pass the request-size check but are then rejected with 413 File too large for extraction.
  • Extracted text above 1,000,000 characters is truncated before embedding; the response message says so.
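
The three limits can be checked client-side before sending anything. This Python sketch is illustrative only: the page does not say whether MB means 10^6 or 2^20 bytes, so the 2^20 definition here is an assumption, as is treating a file of exactly 50 MB as acceptable. It also compares the file size against the 75 MB request cap directly, ignoring the small multipart overhead.

```python
MB = 1024 * 1024  # assumption: binary megabytes; the page does not specify

MAX_REQUEST_BYTES = 75 * MB     # total body size cap
MAX_FILE_BYTES = 50 * MB        # extractor's per-file cap
MAX_TEXT_CHARS = 1_000_000      # extracted text beyond this is truncated


def check_upload_size(file_size: int) -> str:
    """Classify a file by size: "ok", "file_too_large", or "request_too_large"."""
    if file_size > MAX_REQUEST_BYTES:
        return "request_too_large"  # fails the request-size check
    if file_size > MAX_FILE_BYTES:
        return "file_too_large"     # passes the request check, then 413
    return "ok"


def will_truncate(extracted_chars: int) -> bool:
    """True if extracted text of this length would be cut before embedding."""
    return extracted_chars > MAX_TEXT_CHARS
```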

Failure modes

Status  When
400     No file part, unsupported file type, or empty/invalid conversation_id segment.
403     Not a user-scoped key, or org has file_attachments_enabled = false.
413     File body exceeds the extractor's 50 MB cap.
429     Per-key rate limit exhausted.
503     The embedding model service is unreachable. Try again.
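
Of these statuses, 429 and 503 are the ones where repeating the same request can succeed; 400, 403, and 413 will fail again until the request changes. A minimal Python sketch of a retry wrapper follows. The exponential backoff schedule is an assumption, since the page does not document a retry interval or a Retry-After header, and the function names are hypothetical.

```python
import time
import urllib.error
import urllib.request

RETRYABLE = {429, 503}  # rate limit exhausted, embedding service unreachable


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential delays in seconds: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * 2**i) for i in range(attempts)]


def send_with_retry(req: urllib.request.Request, attempts: int = 4) -> bytes:
    """Send req, retrying on 429/503; re-raise any other HTTP error at once."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for i, delay in enumerate(backoff_delays(attempts)):
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Give up on non-retryable statuses and on the final attempt.
            if err.code not in RETRYABLE or i == attempts - 1:
                raise
            time.sleep(delay)
    raise RuntimeError("unreachable: loop always returns or raises")
```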

How sources reach the model

Sources do not auto-attach to POST /v1/chat/completions. That endpoint only handles completion requests and never reads from the source index. Source-aware retrieval lives in the PrivateMind chat plane itself (the web app and the /v1/conversations/{id}/generate SSE endpoint used by the embeddable widget), where the system retrieves relevant chunks from indexed sources and adds them to the conversation context before calling the model.

Where next

  • Authentication: minting a user key from an application key via /v1/auth/exchange.
  • Embeddings: the same model that powers ingest, if you want to build your own vector index instead.
  • Chat completions: the pass-through completion path that does not read sources.