LocalEngine Local API Reference
LocalEngine ships an opt-in HTTP API served entirely from your Mac. It's designed as a low-latency bridge for browser extensions and Privy apps — a Chrome extension can ask the on-device model to translate a selection without anything leaving the device.
Overview
The local API is documented and opt-in: it is not enabled by default. Turn it on from the app's Dashboard, and the engine begins listening on loopback. All endpoints are served by the active runtime, so the same Metal-accelerated GGUF model that powers the Chat tab answers API requests.
Base URL
http://127.0.0.1:8765
The API binds to loopback only. To keep traffic on-device, callers
(extensions, Privy apps) connect over 127.0.0.1 — there
is no cloud endpoint and no telemetry.
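A caller that wants to be sure it never talks to a remote host could check the base URL before sending anything. The following is a minimal sketch, not part of LocalEngine: the helper names `is_loopback_url` and `endpoint` are hypothetical, and the check only accepts IP literals such as 127.0.0.1 or ::1.

```python
import ipaddress
from urllib.parse import urlsplit

BASE_URL = "http://127.0.0.1:8765"  # base URL given in this reference


def is_loopback_url(url: str) -> bool:
    """Return True when the URL's host is a loopback IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat it as non-local.
        return False


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join a path such as '/v1/status' onto a loopback base URL."""
    if not is_loopback_url(base_url):
        raise ValueError(f"refusing non-loopback base URL: {base_url}")
    return base_url.rstrip("/") + "/" + path.lstrip("/")
```

With this guard in place, a misconfigured base URL fails fast in the client instead of silently sending text off-device.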
Authentication
When a token is configured, send it as a bearer token on every
/v1/* request:
Authorization: Bearer <token>
/health stays unauthenticated for local readiness
checks. Tokens are stored in the macOS Keychain and
never written to disk in plain text.
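The rule above (bearer token on /v1/* only, and only when one is configured) can be sketched on the client side as follows. The function name `build_headers` is an assumption for illustration. How the caller obtains the token is out of scope here, and the `Accept` header is an addition of this sketch that the reference does not require.

```python
from typing import Dict, Optional


def build_headers(path: str, token: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for a LocalEngine call.

    The Authorization header is attached only to /v1/* paths and only
    when a token is configured; /health is left unauthenticated.
    """
    headers = {"Accept": "application/json"}  # assumption, not required by the reference
    if token and path.startswith("/v1/"):
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Keeping this decision in one helper avoids sending the token to endpoints that do not need it.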
/health
Unauthenticated readiness probe.
{
"status": "ok",
"engine": "LocalEngine",
"version": "0.2.0"
}
/v1/status
Returns the active runtime and model readiness.
{
"runtime": "llama",
"active_model": "local-gguf",
"backend": "metal",
"ready": true
}
/v1/translate
Translate a single string. Supported fields: text, source, target, mode.
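For callers not using curl, a request with these four fields might be assembled as below. This is a sketch under stated assumptions: `build_translate_request` is a hypothetical helper, and the defaults `"auto"` and `"selection"` are copied from the sample request in this section. The reference does not say which fields are optional. The method is POST because that is what curl sends when `-d` is used.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8765"  # base URL given in this reference


def build_translate_request(
    text: str,
    target: str,
    source: str = "auto",        # assumed default, taken from the sample request
    mode: str = "selection",     # assumed default, taken from the sample request
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/translate."""
    payload = {"text": text, "source": source, "target": target, "mode": mode}
    headers = {"Content-Type": "application/json"}
    if token:
        # Sent only when a token is configured.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        BASE_URL + "/v1/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it, which only succeeds while the app is running with the API enabled.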
Request
curl http://127.0.0.1:8765/v1/translate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token-if-configured>" \
-d '{"text":"hello world","source":"auto","target":"zh-CN","mode":"selection"}'
Response
{
"translation": "你好,世界",
"source": "en",
"target": "zh-CN",
"latency_ms": 120,
"engine": "LocalEngine"
}
/v1/translate/batch
Translate many strings in one round-trip. This is ideal for whole-page translation in a browser extension, where each text node maps to one array entry.
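One way to turn a page's text nodes into batch payloads is sketched here. The function `batch_payloads` and the `max_items` limit of 50 are hypothetical: this reference does not state a maximum batch size, and it does not document the batch response shape, so the sketch covers only the request side. Order is preserved, so entry i of each `items` array corresponds to one text node.

```python
from typing import Dict, Iterator, List


def batch_payloads(
    texts: List[str],
    target: str,
    source: str = "auto",
    max_items: int = 50,  # hypothetical chunk size, not a documented limit
) -> Iterator[Dict[str, object]]:
    """Yield /v1/translate/batch payloads, each with at most max_items strings."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    for start in range(0, len(texts), max_items):
        yield {
            "items": texts[start:start + max_items],
            "source": source,
            "target": target,
        }
```

Each yielded dictionary can be serialized with `json.dumps` and sent as the request body, matching the `items`, `source`, and `target` fields used in this section's example.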
curl http://127.0.0.1:8765/v1/translate/batch \
-H "Content-Type: application/json" \
-d '{"items":["hello","world"],"source":"auto","target":"zh-CN"}'
Extension priority
The Privy Chrome extension selects a translation engine in this order — LocalEngine first, so on-device translation wins whenever the app is running:
1 · LocalEngine App: On-device, private, lowest latency
2 · Ollama: Local fallback runtime
3 · LM Studio: Local fallback runtime
4 · Custom OpenAI-compatible: Remote provider of last resort
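The ordering above amounts to a first-available selection. The sketch below illustrates that idea only; it is not the extension's actual code. `select_engine` and the `is_available` callback are hypothetical, and how availability is detected (for LocalEngine, possibly a /health probe) is left to the caller.

```python
from typing import Callable, Optional, Sequence

# Engine names in the priority order listed in this section.
ENGINE_PRIORITY = (
    "LocalEngine App",
    "Ollama",
    "LM Studio",
    "Custom OpenAI-compatible",
)


def select_engine(
    is_available: Callable[[str], bool],
    priority: Sequence[str] = ENGINE_PRIORITY,
) -> Optional[str]:
    """Return the first engine in priority order that reports as available."""
    for name in priority:
        if is_available(name):
            return name
    return None  # no engine reachable
```

For example, `select_engine(lambda name: name in {"Ollama", "LM Studio"})` returns `"Ollama"`, because it ranks above LM Studio when LocalEngine is not running.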