Reference

HTTP & API

Ren exposes a small, local-first FastAPI service on http://127.0.0.1:8000. Every route below is drawn straight from the source — real paths, real fields, real defaults. Chat over HTTP, stream over SSE, or speak through the voice WebSocket.

Base URL & auth. Ren binds to 127.0.0.1:8000 by default; override with REN_HOST and REN_PORT. There is no token on the HTTP surface, so anyone who can reach the port can use it. The default loopback bind keeps Ren on the local machine; if you change REN_HOST so LAN clients can connect, keep it on a trusted LAN and never expose it to the public internet. Configuration is read from REN_* environment variables (or a .env file); the one exception is ANTHROPIC_API_KEY, which keeps its standard unprefixed name. Legacy WREN_* variable names still work as aliases.
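
As a rough illustration of the configuration rules above, a client could resolve Ren's base URL like this. The function name resolve_base_url is hypothetical, and the precedence of REN_* over the legacy WREN_* aliases is an assumption; the text only says both names work.

```python
import os


def resolve_base_url(env=None):
    """Build Ren's base URL from REN_* variables, falling back to WREN_* aliases."""
    env = os.environ if env is None else env

    def lookup(name, default):
        # Assumption: the REN_ name wins over the legacy WREN_ alias when both are set.
        return env.get(f"REN_{name}") or env.get(f"WREN_{name}") or default

    host = lookup("HOST", "127.0.0.1")   # documented default host
    port = int(lookup("PORT", "8000"))   # documented default port
    return f"http://{host}:{port}"
```

Passing a plain dict instead of os.environ keeps the helper easy to exercise without touching the real environment.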

Endpoint index

Method	Path	Purpose
GET	/health	Liveness + whether an API key is configured
POST	/chat	Talk to Ren — one blocking request/response turn
POST	/chat/stream	Same turn streamed token-by-token over SSE
GET	/memory	Recent conversation turns (read-only)
GET	/threads	List conversation threads
POST	/threads	Create a new thread
GET	/threads/{id}	Get a single thread
PATCH	/threads/{id}	Toggle a thread’s allow_dangerous flag
GET	/identity	Ren’s identity card (public-key fingerprint)
POST	/identity/attest	Self-signed identity card (proves key ownership)
GET	/hub.json	Discovery manifest for LAN clients
GET	/hub/health	Richer health endpoint for LAN clients
GET	/notifications	SSE stream of due reminders
GET	/home/events	SSE stream of device state changes
WS	/ws/audio	Full-duplex voice (PCM16 in / out)

Model tiers

Ren speaks in intent, not model IDs. When a chat request omits tier, Ren auto-routes: short asks go to fast, planning and debugging escalate to hard, and everything else lands on default. Pass "tier" in the request body to pin a turn to a specific tier. The global fallback is set with REN_DEFAULT_TIER.

Tier	Claude model	Auto-routed for
`fast`	claude-haiku-4-5-20251001	Short asks (under 80 chars)
`default`	claude-sonnet-4-6	Everything else
`hard`	claude-opus-4-8	plan · architect · prove · debug · analyze

A fourth sentinel tier, local, maps to no cloud model — the seam where an offline fallback would live.
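
The routing table can be read as a small decision function. The sketch below is one possible reading, not Ren's actual router: pick_tier is a hypothetical name, matching on whole words is an assumption, and so is checking the escalation keywords before the length rule.

```python
import re

HARD_KEYWORDS = {"plan", "architect", "prove", "debug", "analyze"}
FAST_MAX_CHARS = 80  # "short asks" are under 80 characters


def pick_tier(message, pinned=None):
    """Return the tier for a message, following the routing table."""
    if pinned is not None:
        return pinned  # an explicit "tier" in the request body pins the turn
    words = set(re.findall(r"[a-z]+", message.lower()))
    # Assumption: keywords are checked first, so a short "debug this" goes to hard.
    if words & HARD_KEYWORDS:
        return "hard"
    if len(message) < FAST_MAX_CHARS:
        return "fast"
    return "default"
```

For example, pick_tier("what time is it?") lands on fast, while an 80-character message with no keyword lands on default.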

Core endpoints

GET /health

Liveness probe. Always responds, even with no API key configured — handy for readiness checks.

Response 200

{
  "status": "ok",
  "version": "1.0.0",
  "has_api_key": true
}

curl

curl localhost:8000/health

POST /chat

One blocking turn with Ren. message is required; tier and thread_id are optional (omit thread_id to use the most-recently-active thread). Returns 503 if ANTHROPIC_API_KEY is not configured and 504 if the turn exceeds the timeout (REN_CHAT_TIMEOUT_S, default 60s).

Request body

{
  "message": "What's on my calendar today?",
  "tier": "default",        // optional: fast | default | hard
  "thread_id": 1            // optional: defaults to most-recently-active
}

Response 200

{
  "reply": "You have a 2pm dentist appointment and a 5pm call.",
  "model": "claude-sonnet-4-6",
  "tier": "default",
  "thread_id": 1
}

curl

curl -s localhost:8000/chat \
  -H 'content-type: application/json' \
  -d '{"message":"remind me to call mom at 6pm","tier":"fast"}'
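
The same call can be made from Python with only the standard library. This is a minimal sketch: build_chat_request and chat are hypothetical helper names, and the 70-second client timeout is an assumption chosen to sit just above the server's documented 60-second default so that the 504 can arrive first.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # documented default bind address


def build_chat_request(message, tier=None, thread_id=None, base_url=BASE_URL):
    """Build the POST /chat request, including only the fields that were given."""
    body = {"message": message}
    if tier is not None:
        body["tier"] = tier
    if thread_id is not None:
        body["thread_id"] = thread_id
    return urllib.request.Request(
        f"{base_url}/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(message, timeout_s=70, **kwargs):
    """Send one blocking turn and return the parsed response dict."""
    request = build_chat_request(message, **kwargs)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 503:
            raise RuntimeError("Ren has no ANTHROPIC_API_KEY configured") from err
        if err.code == 504:
            raise RuntimeError("Ren's chat turn timed out") from err
        raise
```

With Ren running, chat("remind me to call mom at 6pm", tier="fast")["reply"] would hold the assistant's text.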

POST /chat/stream

The same turn as /chat, streamed token-by-token as Server-Sent Events (text/event-stream). Frame types: token, tool_run, done, and error. The request body is identical to /chat.

SSE frames

event: token
data: {"text": "Here"}

event: token
data: {"text": "'s your week"}

event: tool_run
data: {"tool": "list_reminders", "status": "running"}

event: tool_run
data: {"tool": "list_reminders", "status": "done"}

event: done
data: {"reply": "Here's your week ...", "model": "claude-opus-4-8", "tier": "hard", "thread_id": 1}

event: error
data: {"detail": "..."}

The done frame mirrors the /chat response. A mid-stream failure emits a single error frame instead of dropping the connection.

curl

curl -sN localhost:8000/chat/stream \
  -H 'content-type: application/json' \
  -d '{"message":"plan my week","tier":"hard"}'
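
A client has to split the stream into frames before it can act on them. The sketch below is a small standard-library parser for the event/data frames shown above; parse_sse is a hypothetical name, and it assumes every data payload is JSON, as in the samples.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current frame.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # drop the single optional space after the colon
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
        # Comment lines (empty field name) and other fields are ignored.
```

To consume a live stream, feed it the decoded lines of the HTTP response, for example parse_sse(line.decode("utf-8") for line in resp), and concatenate the text of each token frame until done or error arrives.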

GET /memory

Read-only window onto recent conversation turns. Query params: limit (default 20) and thread_id (default 1).

Response 200

{
  "turns": [
    { "id": 41, "role": "user",      "content": "remind me to call mom", "thread_id": 1, "created_at": "2026-06-25T14:02:11" },
    { "id": 42, "role": "assistant", "content": "Done — I'll remind you.", "thread_id": 1, "created_at": "2026-06-25T14:02:13" }
  ]
}

curl

curl -s 'localhost:8000/memory?limit=20&thread_id=1'

Threads

Threads partition conversation memory. Each carries an allow_dangerous flag — set it to let side-effecting tools run for that thread only, without unlocking them globally.

GET /threads

List threads, most-recently-active first. Query param: limit (default 20).

Response 200

{
  "threads": [
    {
      "id": 1,
      "name": "General",
      "allow_dangerous": false,
      "created_at": "2026-06-01T09:00:00",
      "last_active_at": "2026-06-25T14:02:13"
    }
  ]
}

POST /threads

Create a thread. name defaults to "New thread". Returns the new thread.

Request body

{ "name": "Kitchen remodel" }

Response 200

{
  "id": 7,
  "name": "Kitchen remodel",
  "allow_dangerous": false,
  "created_at": "2026-06-25T14:10:00",
  "last_active_at": "2026-06-25T14:10:00"
}

curl

curl -s localhost:8000/threads \
  -H 'content-type: application/json' \
  -d '{"name":"Kitchen remodel"}'

GET /threads/{id}

Fetch a single thread by id. Returns 404 if it does not exist.

Response 200

{
  "id": 1,
  "name": "General",
  "allow_dangerous": false,
  "created_at": "2026-06-01T09:00:00",
  "last_active_at": "2026-06-25T14:02:13"
}

PATCH /threads/{id}

Update a thread’s allow_dangerous flag. Returns the updated thread, or 404 if it does not exist.

Request body

{ "allow_dangerous": true }

curl

curl -s -X PATCH localhost:8000/threads/1 \
  -H 'content-type: application/json' \
  -d '{"allow_dangerous": true}'
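
From Python, toggling the flag might look like the sketch below. The helper names build_thread_patch and set_allow_dangerous are hypothetical, and the 10-second timeout is an arbitrary choice; only the path, method, body field, and 404 behavior come from the reference above.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # documented default bind address


def build_thread_patch(thread_id, allow_dangerous, base_url=BASE_URL):
    """Build the PATCH /threads/{id} request that sets allow_dangerous."""
    return urllib.request.Request(
        f"{base_url}/threads/{thread_id}",
        data=json.dumps({"allow_dangerous": bool(allow_dangerous)}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


def set_allow_dangerous(thread_id, allow_dangerous):
    """Return the updated thread dict, or None if the thread does not exist."""
    request = build_thread_patch(thread_id, allow_dangerous)
    try:
        with urllib.request.urlopen(request, timeout=10) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None  # documented: 404 when the thread is missing
        raise
```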

Identity

Ren holds a persistent Ed25519 keypair on-device. The identity endpoints expose its public fingerprint and a self-signed card, the seam for a future trust network.

GET /identity

Ren’s unsigned identity card: who this Ren is, plus its public-key fingerprint and key provenance.

Response 200

{
  "kind": "ren-identity",
  "version": 1,
  "public_key": "9f86d081884c7d659a2feaa0c55ad015...e7f3",
  "root": "local-file"
}

POST /identity/attest

A self-signed identity card — Ren signs its own card to prove it owns the key. No external verifier is wired yet, so verified_by is null.

Response 200

{
  "card": {
    "kind": "ren-identity",
    "version": 1,
    "public_key": "9f86d081884c7d659a2feaa0c55ad015...e7f3",
    "root": "local-file"
  },
  "signature": "MEUCIQ...base64-ed25519-signature...",
  "verified_by": null,
  "note": "self-signed; external attestation not yet wired (see ATTEST_ENDPOINT)"
}

curl

curl -s -X POST localhost:8000/identity/attest

Discovery & health

GET /hub.json

Machine-readable discovery manifest for iOS / Ren Micro auto-configuration — advertises where to chat and stream voice.

Response 200

{
  "name": "Ren",
  "version": "1.0.0",
  "ws_audio": "/ws/audio",
  "chat": "/chat/stream",
  "voice_enabled": false
}

curl

curl -s localhost:8000/hub.json
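
A client that discovers Ren on the LAN still needs absolute URLs. The sketch below shows one way to derive them from the manifest; resolve_manifest is a hypothetical name, and it assumes the manifest paths are relative to the host that served /hub.json and that voice should be skipped when voice_enabled is false.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve_manifest(base_url, manifest):
    """Turn the manifest's relative paths into absolute chat and voice URLs."""
    chat_url = urljoin(base_url, manifest["chat"])
    parts = urlsplit(urljoin(base_url, manifest["ws_audio"]))
    # WebSocket URLs use ws:// (or wss:// when the hub is served over https).
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    ws_url = urlunsplit((ws_scheme, parts.netloc, parts.path, parts.query, parts.fragment))
    return {
        "chat": chat_url,
        "ws_audio": ws_url if manifest.get("voice_enabled") else None,
    }
```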

GET /hub/health

Richer health endpoint for LAN clients — adds voice availability and process uptime in seconds.

Response 200

{
  "status": "ok",
  "version": "1.0.0",
  "voice_enabled": false,
  "uptime_s": 3812.4
}

Event streams (SSE)

GET /notifications

A Server-Sent Events stream of due reminders. Each newly-due reminder arrives as a reminder frame and is marked notified so it fires once. The optional interval query param overrides the poll period (REN_NOTIFICATIONS_INTERVAL_S, default 15s).

SSE frame

event: reminder
data: {"id": 12, "text": "call mom", "kind": "reminder", "due_at": "2026-06-25T18:00:00"}

curl

curl -sN localhost:8000/notifications

GET /home/events

A Server-Sent Events stream of smart-home device state changes. The first poll establishes a baseline and emits no frame; each later change emits a device_change frame. The optional interval query param overrides the 5-second poll period.

SSE frame

event: device_change
data: {"accessory_id": "hue-3", "name": "Living Room Lamp", "service_type": "lightbulb", "char_key": "on", "value": true, "previous": false}
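
The baseline-then-diff behavior can be illustrated with a small comparison function. This is a sketch of the idea, not Ren's implementation: diff_states is a hypothetical name, state is assumed to be keyed by (accessory_id, char_key), newly appearing devices are assumed to emit nothing, and the name and service_type fields are left out for brevity.

```python
def diff_states(previous, current):
    """Return device_change payloads for every characteristic whose value changed."""
    if previous is None:
        return []  # first poll: establish the baseline, emit nothing
    changes = []
    for key, value in current.items():
        # Assumption: a key missing from the previous poll is not reported as a change.
        if key in previous and previous[key] != value:
            accessory_id, char_key = key
            changes.append({
                "accessory_id": accessory_id,
                "char_key": char_key,
                "value": value,
                "previous": previous[key],
            })
    return changes
```

A poller would call this once per interval, emit one frame per returned payload, and then keep the current snapshot as the next baseline.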

Voice — WebSocket

WS /ws/audio

Full-duplex voice. Send JSON-enveloped PCM16 audio chunks and receive transcript, streamed tokens, tool activity, and synthesized speech back. Requires REN_VOICE_ENABLED=true with the optional ML deps installed; otherwise the socket replies with an error frame and closes. Only one voice session is allowed at a time, and voice turns auto-approve dangerous tools (the speaker is physically present).

Client → server

{ "type": "audio", "pcm16": "<base64 PCM16 chunk>" }   // stream chunks...
{ "type": "end" }                                      // ...then signal end of utterance
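
Building those envelopes from raw audio is mostly base64 and chunking. The sketch below assumes the PCM16 bytes are already captured; audio_messages is a hypothetical name, and the 3200-byte chunk size is an arbitrary illustrative choice, since the reference above does not state a sample rate or preferred chunk length.

```python
import base64
import json


def audio_messages(pcm16, chunk_bytes=3200):
    """Yield JSON text frames for one utterance: audio chunks, then the end marker."""
    if chunk_bytes <= 0 or chunk_bytes % 2:
        raise ValueError("chunk_bytes must be a positive even number (16-bit samples)")
    for start in range(0, len(pcm16), chunk_bytes):
        chunk = pcm16[start:start + chunk_bytes]
        yield json.dumps({
            "type": "audio",
            "pcm16": base64.b64encode(chunk).decode("ascii"),
        })
    yield json.dumps({"type": "end"})  # signals end of utterance
```

Each yielded string is ready to send as one WebSocket text message, in order.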

Server → client

{ "type": "wake",       "data": null }
{ "type": "transcript", "data": "turn off the kitchen lights" }
{ "type": "token",      "data": "Turning" }
{ "type": "tool_run",   "data": {"tool": "set_power", "status": "running"} }
{ "type": "barge_in",   "data": null }
{ "type": "done",       "data": {"reply": "Done.", "tier": "fast", "thread_id": 1} }
{ "type": "audio",      "pcm16": "<base64 synthesized PCM16>" }

Browser example

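const ws = new WebSocket('ws://localhost:8000/ws/audio')

ws.onmessage = (e) => {
  const msg = JSON.parse(e.data)
  if (msg.type === 'audio') {
    // atob returns a binary string; convert it to bytes, then to 16-bit samples
    const bytes = Uint8Array.from(atob(msg.pcm16), (c) => c.charCodeAt(0))
    playPcm16(new Int16Array(bytes.buffer)) // playPcm16: your own playback helper
  } else {
    console.log(msg.type, msg.data)
  }
}

// wait for the socket to open; calling send() while it is still connecting throws
ws.onopen = () => {
  // stream microphone audio as base64 PCM16 frames, then end the turn:
  ws.send(JSON.stringify({ type: 'audio', pcm16: base64Chunk })) // base64Chunk: your encoded mic audio
  ws.send(JSON.stringify({ type: 'end' }))
}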

Idle timeout. A voice socket with no pipeline activity for 30 seconds is sent an error frame and closed. Reconnect to start a new session.

Status codes

Code	Meaning
200	OK
404	Thread not found (`/threads/{id}`)
422	Invalid request body — fails schema validation (e.g. empty message)
503	`ANTHROPIC_API_KEY` is not set — returned by the chat endpoints
504	Chat exceeded its timeout (`REN_CHAT_TIMEOUT_S`, default 60s)
