REST API

Alongside the WebRTC Realtime API, ItanniX exposes HTTP endpoints for offline TTS, authenticated story access, and voice management used by apps and assistants. Base URL for all paths below:

https://api.itannix.com

Authentication

App API routes require the same device headers as the Realtime relay: X-Workspace-Key, X-Client-Id, and X-Client-Secret (the client secret is enrolled on a trust-on-first-use, or TOFU, basis). X-App-Source is optional and used for analytics. Send the required headers on every request below.

Required headers (example)

X-Workspace-Key: <workspace_key>
X-Client-Id: <client_uuid>
X-Client-Secret: <client_secret>
Content-Type: application/json   # omit for multipart clone upload
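
A minimal Python sketch of assembling these headers. The helper name build_headers and its parameters are illustrative and not part of any ItanniX SDK; only the header names come from the list above.

```python
from typing import Dict, Optional


def build_headers(
    workspace_key: str,
    client_id: str,
    client_secret: str,
    app_source: Optional[str] = None,
    json_body: bool = True,
) -> Dict[str, str]:
    """Build the device headers for an App API request (hypothetical helper)."""
    headers = {
        "X-Workspace-Key": workspace_key,
        "X-Client-Id": client_id,
        "X-Client-Secret": client_secret,
    }
    if app_source:
        # Optional analytics header.
        headers["X-App-Source"] = app_source
    if json_body:
        # Leave this out for the multipart clone upload, where the HTTP
        # client must set its own multipart Content-Type with a boundary.
        headers["Content-Type"] = "application/json"
    return headers
```

Passing json_body=False covers the clone upload case, where the Content-Type line is omitted.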

On failure, responses use an HTTP 4xx or 5xx status with a JSON body whose detail field is an object containing error (a human-readable message) and code (a machine-readable identifier).

Error response shape

{
  "detail": {
    "error": "Human-readable message",
    "code": "ERROR_CODE"
  }
}
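
As a rough illustration, a client might extract code and error from this shape as follows. The function name parse_error is hypothetical, and the fallback for bodies that do not match the documented shape (for example an HTML page from a proxy) is an assumption, not documented behavior.

```python
import json
from typing import Optional, Tuple


def parse_error(body: bytes) -> Tuple[Optional[str], str]:
    """Return (code, message) from an error body.

    code is None when the body does not match the documented shape.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all: surface the raw text as the message.
        return None, body.decode("utf-8", errors="replace")
    detail = payload.get("detail") if isinstance(payload, dict) else None
    if isinstance(detail, dict):
        return detail.get("code"), str(detail.get("error", ""))
    # JSON, but not the documented {"detail": {...}} object.
    return None, str(detail if detail is not None else payload)
```

Branching on code rather than on the message text keeps client logic stable if the wording of error changes.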

TTS & voices

All paths are prefixed with /v1/app. Qwen3-TTS powers synthesis; voices must belong to the authenticated workspace and be visible to the device (see each endpoint).

GET /v1/app/voices

List voice profiles available to this device: workspace voices created in the dashboard plus clones created by this client.

Response 200 — JSON object with a voices array.
Each item: voice_id, name, category (cloned = created by this device, workspace = created via the dashboard), optional preview_url (time-limited signed URL when storage is configured).

Response (example)

{
  "voices": [
    {
      "voice_id": "550e8400-e29b-41d4-a716-446655440000",
      "name": "Bedtime narrator",
      "category": "workspace",
      "preview_url": "https://..."
    }
  ]
}
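
A sketch of calling this endpoint with Python's standard library and grouping the result by category. The function names are hypothetical, the headers argument is assumed to carry the device headers described under Authentication, and the 30-second timeout is an arbitrary choice.

```python
import json
import urllib.request
from typing import Dict, List

BASE_URL = "https://api.itannix.com"


def fetch_voices(headers: Dict[str, str]) -> List[dict]:
    """GET /v1/app/voices and return the voices array."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/app/voices", headers=headers, method="GET"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["voices"]


def split_by_category(voices: List[dict]) -> Dict[str, List[dict]]:
    """Group voice items by their category field."""
    groups: Dict[str, List[dict]] = {"cloned": [], "workspace": []}
    for voice in voices:
        # setdefault tolerates a category this sketch does not know about.
        groups.setdefault(voice.get("category"), []).append(voice)
    return groups
```

Since preview_url is optional and time-limited, a client would presumably fetch the list again rather than cache those URLs.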

POST /v1/app/voices/clone

Create a cloned voice from a reference recording. Audio is transcribed (for the adapter), stored, and synced to the TTS adapter.

Content-Type: multipart/form-data
Form fields: name (required), description (optional, default empty), audio_file (required file upload).
Audio rules: converted to WAV server-side; duration must be at least 10 seconds and at most 30 seconds.
Response 201 — voice_id, name, status (e.g. ready), optional preview_url.
Common error codes: EMPTY_AUDIO, AUDIO_TOO_SHORT, AUDIO_TOO_LONG, TRANSCRIPTION_FAILED, STORAGE_ERROR.
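
Because recordings outside the 10 to 30 second window are rejected, a client may want to check duration before uploading. The sketch below does this with Python's standard wave module, so it only applies when the local recording is already a WAV file. The helper names are hypothetical, reusing the server's error code names for the local result is a convenience, and the server's own check remains authoritative.

```python
import wave
from typing import Optional

MIN_SECONDS = 10.0  # documented lower bound
MAX_SECONDS = 30.0  # documented upper bound


def wav_duration_seconds(source) -> float:
    """Duration of a WAV file given a path or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_clone_duration(seconds: float) -> Optional[str]:
    """Return a code-like string if the duration is out of range, else None."""
    if seconds < MIN_SECONDS:
        return "AUDIO_TOO_SHORT"
    if seconds > MAX_SECONDS:
        return "AUDIO_TOO_LONG"
    return None
```

Both bounds are treated as inclusive here, following "at least 10 seconds and at most 30 seconds".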

POST /v1/app/voices/design

Create a voice from a text description (no audio upload). The service materializes reference audio and registers the profile.

Content-Type: application/json
Response 201 — same shape as the clone response (AppCloneResponse): voice_id, name, status, optional preview_url.
Common error codes: MISSING_DESIGN_PROMPT, DESIGN_FAILED.

Request body

{
  "name": "Friendly guide",
  "design_prompt": "Warm, calm adult voice suitable for bedtime stories.",
  "language": "en",
  "description": "Optional notes"
}
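
A sketch of building this request in Python without sending it. build_design_request is a hypothetical helper; the defaults for language and description are assumptions taken from the example body, and the local empty-prompt check only anticipates the documented MISSING_DESIGN_PROMPT code.

```python
import json
import urllib.request
from typing import Dict

BASE_URL = "https://api.itannix.com"


def build_design_request(
    headers: Dict[str, str],
    name: str,
    design_prompt: str,
    language: str = "en",
    description: str = "",
) -> urllib.request.Request:
    """Prepare POST /v1/app/voices/design with a JSON body."""
    if not design_prompt.strip():
        # Assumption: the server would answer MISSING_DESIGN_PROMPT here.
        raise ValueError("MISSING_DESIGN_PROMPT")
    body = {
        "name": name,
        "design_prompt": design_prompt,
        "language": language,
        "description": description,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/app/voices/design",
        data=json.dumps(body).encode("utf-8"),
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen; a 201 response would then carry the new voice_id.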

POST /v1/app/text-to-speech/{voice_id}/with-timestamps

Synthesize speech with word-level timestamps. The voice must be in the workspace and allowed for this device: either created by this client (cloned) or created by a user in the workspace (workspace).

Path parameter: voice_id — UUID string of the voice profile.
Text limit: up to 15,000 characters.
Response 200 — JSON with signed audio_url (temporary), expires_in (seconds, typically 300), duration_seconds, timestamps (array of word, start_ms, end_ms), full_text.
Common error codes: TEXT_TOO_LONG, EMPTY_TEXT, INVALID_VOICE_ID, VOICE_NOT_FOUND, VOICE_NOT_AVAILABLE (403), ADAPTER_SYNC_FAILED, ADAPTER_NOT_CONFIGURED, TTS_FAILED, STORAGE_ERROR.
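
The timestamps array can drive word highlighting during playback. The following sketch looks up the word being spoken at a given playback position. It assumes entries are sorted by start_ms and do not overlap, which the description above does not state; word_index_at is a hypothetical name.

```python
from bisect import bisect_right
from typing import List, Optional


def word_index_at(timestamps: List[dict], position_ms: int) -> Optional[int]:
    """Index of the word being spoken at position_ms, or None in a gap."""
    starts = [entry["start_ms"] for entry in timestamps]
    # Last entry whose start_ms is <= position_ms.
    i = bisect_right(starts, position_ms) - 1
    if i >= 0 and position_ms < timestamps[i]["end_ms"]:
        return i
    return None


sample = [
    {"word": "Hello", "start_ms": 0, "end_ms": 400},
    {"word": "from", "start_ms": 450, "end_ms": 700},
    {"word": "ItanniX.", "start_ms": 720, "end_ms": 1300},
]
print(word_index_at(sample, 500))  # 1, the entry for "from"
print(word_index_at(sample, 420))  # None, a gap between words
```

The sample values are invented for illustration. For long texts, computing the starts list once instead of on every lookup would avoid repeated work.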

Request body

{
  "text": "Hello from ItanniX.",
  "output_format": "mp3_44100_128",
  "speed": 1.0,
  "stability": 0.75
}

If output_format contains mp3, the audio is encoded as MP3; any other value produces WAV. On the adapter side, designed voices may be sent with a voice_clone_prompt path, while cloned voices use the adapter's profile_id.
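
A sketch of preparing the synthesis request and predicting the audio format from output_format. The helper names are hypothetical, the substring match is assumed to be case-sensitive, and the local checks only anticipate the documented EMPTY_TEXT and TEXT_TOO_LONG codes. The speed and stability fields from the example body are left out because their ranges are not documented here.

```python
import json
import urllib.request
from typing import Dict
from urllib.parse import quote

BASE_URL = "https://api.itannix.com"
MAX_TEXT_CHARS = 15_000  # documented text limit


def expected_audio_format(output_format: str) -> str:
    """MP3 when output_format contains 'mp3', WAV otherwise."""
    return "mp3" if "mp3" in output_format else "wav"


def build_tts_request(
    headers: Dict[str, str],
    voice_id: str,
    text: str,
    output_format: str = "mp3_44100_128",
) -> urllib.request.Request:
    """Prepare POST /v1/app/text-to-speech/{voice_id}/with-timestamps."""
    if not text.strip():
        raise ValueError("EMPTY_TEXT")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError("TEXT_TOO_LONG")
    url = (
        f"{BASE_URL}/v1/app/text-to-speech/"
        f"{quote(voice_id, safe='')}/with-timestamps"
    )
    body = {"text": text, "output_format": output_format}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
```

Because audio_url is signed and temporary, the audio should be downloaded within expires_in seconds of the response.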

Stories

Paths are under /v1/app. Only stories that have a non-empty slug appear in the list endpoint.

GET /v1/app/stories

List story cards for the authenticated workspace (slugged stories only).

Response 200 — JSON array of objects: story_id, slug, name, optional excerpt, age_group, illustration_url, illustration_alt, sort_order.
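
A small sketch of ordering the returned cards for display. It assumes that a lower sort_order should appear first and that cards without a sort_order go last, neither of which is stated above; order_story_cards is a hypothetical name.

```python
from typing import List


def order_story_cards(cards: List[dict]) -> List[dict]:
    """Sort cards by sort_order, placing cards without one at the end."""
    return sorted(
        cards,
        key=lambda card: (
            card.get("sort_order") is None,  # False sorts before True
            card.get("sort_order") or 0,
            card.get("name", ""),            # tie-breaker for equal order
        ),
    )
```

If the API already returns cards in display order, this step is unnecessary.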

GET /v1/app/stories/{slug}

Full story payload for a slug in this workspace (includes full transcript).

Response 200 — story_id, slug, name, transcript, optional excerpt, illustration_url, illustration_alt, estimated_reading_time_seconds.
Error: 404 with STORY_NOT_FOUND if the slug does not exist in the workspace.

POST /v1/app/stories/{slug}/read

Generate narration for the full story transcript as a WAV file. The voice must belong to the same workspace.

Content-Type: application/json
Body: optional voice_id (UUID). If voice_id is omitted, the story's default_voice_profile_id is used when set; otherwise the first voice in the workspace is used. If the workspace has no voices, the endpoint returns 400 with VOICE_NOT_AVAILABLE.
Response 200 — raw WAV bytes, Content-Type: audio/wav, Content-Length set.
Common error codes: STORY_NOT_FOUND, VOICE_NOT_AVAILABLE, VOICE_NOT_FOUND, ADAPTER_SYNC_FAILED.

Request body (optional)

{
  "voice_id": "550e8400-e29b-41d4-a716-446655440000"
}
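
The voice fallback order described for this endpoint can be sketched as a plain function. This illustrates the documented rule and is not the server's implementation; resolve_voice_id is a hypothetical name, and what counts as the "first" voice in a workspace is decided by the server, so the list order here is an assumption.

```python
from typing import List, Optional


def resolve_voice_id(
    requested: Optional[str],
    story_default: Optional[str],
    workspace_voice_ids: List[str],
) -> str:
    """Pick the narration voice using the documented fallback order."""
    if requested:
        return requested               # 1. voice_id from the request body
    if story_default:
        return story_default           # 2. story's default_voice_profile_id
    if workspace_voice_ids:
        return workspace_voice_ids[0]  # 3. first voice in the workspace
    # 4. no voices at all: the API answers 400 VOICE_NOT_AVAILABLE
    raise LookupError("VOICE_NOT_AVAILABLE")
```

A client that wants predictable narration would presumably pass voice_id explicitly rather than rely on steps 2 and 3.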

See also: Realtime API (WebRTC), Quickstart, and the SDK.