Realtime API

WebRTC session and SDP exchange for live voice. All paths below follow OpenAI's Realtime API protocol. For HTTP TTS and story/audio lookups, see the REST API page.

Base URL

https://api.itannix.com

Endpoints

Purpose              Path
Create Session       /v1/realtime/sessions
WebRTC SDP Exchange  /v1/realtime

Authentication

All requests require an X-Workspace-Key header that authenticates your workspace, an X-Client-Id header that identifies your device or client, and an X-Client-Secret header for secure device authentication.

Required Headers

X-Workspace-Key required
Your workspace API key (generated in workspace settings)
X-Client-Id required
Your device/client identifier (UUID format)
X-Client-Secret recommended
Device-generated secret for authentication (32+ chars)

Workspace API Key: Generate your workspace API key in the dashboard under Settings → Workspace. This key authenticates all API requests for your workspace.

Trust On First Use (TOFU): The first time a device connects with a secret, it is enrolled. Subsequent connections must use the same secret. Generate a random secret on your device and store it securely.
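
A minimal sketch of generating such a secret, assuming a JavaScript runtime that exposes Web Crypto (a browser, or Node.js through node:crypto). The name generateClientSecret is illustrative and not part of the API. How the value is persisted (keychain, secure flash, encrypted storage) depends on your device and is not shown.

```javascript
// Use the global Web Crypto object when present, otherwise Node's implementation.
const webCrypto = globalThis.crypto ?? require('node:crypto').webcrypto;

// Returns a hex string; 32 random bytes give 64 characters,
// which satisfies the 32+ character guidance above.
function generateClientSecret(byteLength = 32) {
  const bytes = new Uint8Array(byteLength);
  webCrypto.getRandomValues(bytes);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}
```

Generate the secret once, store it, and reuse the stored value on every later connection, since the enrolled secret must match.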

POST /v1/realtime/sessions

Creates a new WebRTC session. Returns session info and ICE servers for establishing the WebRTC connection.

Headers

X-Workspace-Key required
Your workspace API key
X-Client-Id required
Your device/client identifier
X-Client-Secret required
Your device-generated secret
Content-Type required
application/json

Request Body

{
  "modalities": ["text", "audio"]
}

Response

{
  "id": "session_abc123",
  "iceServers": [
    {
      "urls": ["turn:relay.itannix.com:3478"],
      "username": "turn_user",
      "credential": "turn_password"
    }
  ]
}

Response Fields

  • id - Session identifier
  • iceServers - Array of STUN/TURN servers for WebRTC NAT traversal
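
The request above could be issued as in the following sketch. The helper names (buildSessionRequest, createSession), the creds object, and the injectable fetchImpl parameter are assumptions made for illustration; only the URL, headers, and body come from this page.

```javascript
const BASE_URL = 'https://api.itannix.com';

// Build the URL and fetch options for POST /v1/realtime/sessions.
// `creds` is a hypothetical object holding the three header values.
function buildSessionRequest(creds) {
  return {
    url: `${BASE_URL}/v1/realtime/sessions`,
    init: {
      method: 'POST',
      headers: {
        'X-Workspace-Key': creds.workspaceKey,
        'X-Client-Id': creds.clientId,
        'X-Client-Secret': creds.clientSecret,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ modalities: ['text', 'audio'] }),
    },
  };
}

// Send the request and return the parsed { id, iceServers } body.
// `fetchImpl` defaults to the global fetch and can be replaced in tests.
async function createSession(creds, fetchImpl = globalThis.fetch) {
  const { url, init } = buildSessionRequest(creds);
  const response = await fetchImpl(url, init);
  if (!response.ok) {
    throw new Error(`Session request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

The iceServers entries use the same urls/username/credential shape as the standard RTCIceServer dictionary, so the array can be passed directly as new RTCPeerConnection({ iceServers: session.iceServers }).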

POST /v1/realtime

Exchanges WebRTC SDP (Session Description Protocol) offer/answer. This establishes the WebRTC peer connection for bidirectional audio streaming.

Headers

X-Workspace-Key required
Your workspace API key
X-Client-Id required
Your device/client identifier
X-Client-Secret required
Your device-generated secret
Content-Type required
application/sdp

Request Body

v=0
o=- 1234567890 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
...

Response

v=0
o=- 9876543210 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=rtpmap:111 opus/48000/2
...

WebRTC Protocol

This endpoint follows the standard WebRTC SDP exchange:

  1. Client creates offer with createOffer()
  2. Client sends offer SDP to this endpoint
  3. Server responds with answer SDP
  4. Client sets remote description with answer
  5. WebRTC connection established, audio streams begin
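
The steps above can be sketched as follows. This assumes pc is an RTCPeerConnection that already has an audio track or transceiver attached, and that the answer SDP is returned as the plain-text response body. The function name exchangeSdp, the creds object, and the fetchImpl parameter are hypothetical.

```javascript
// Run the offer/answer exchange against POST /v1/realtime.
async function exchangeSdp(pc, creds, fetchImpl = globalThis.fetch) {
  // Steps 1-2: create the offer and send its SDP as the request body
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  const response = await fetchImpl('https://api.itannix.com/v1/realtime', {
    method: 'POST',
    headers: {
      'X-Workspace-Key': creds.workspaceKey,
      'X-Client-Id': creds.clientId,
      'X-Client-Secret': creds.clientSecret,
      'Content-Type': 'application/sdp',
    },
    body: offer.sdp,
  });
  if (!response.ok) {
    throw new Error(`SDP exchange failed with HTTP ${response.status}`);
  }

  // Steps 3-4: read the answer SDP and apply it as the remote description
  const answerSdp = await response.text();
  await pc.setRemoteDescription({ type: 'answer', sdp: answerSdp });

  // Step 5 happens asynchronously once ICE and DTLS complete
  return answerSdp;
}
```

Connection progress can then be observed through the standard pc.onconnectionstatechange event.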

Message Format

Once the WebRTC connection is established, communication happens via the data channel using JSON messages. The message format is identical to OpenAI's Realtime API.

Transcript Messages

User Transcript
{
  "type": "conversation.item.input_audio_transcription.completed",
  "transcript": "Hello, how are you?"
}
Assistant Transcript (Streaming)
{
  "type": "response.audio_transcript.delta",
  "delta": "I'm doing well, thank you"
}
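
Because the assistant transcript arrives in pieces, a client typically concatenates the delta values. The sketch below handles only the two message types shown above; createTranscriptTracker and its state shape are illustrative names, and deciding when to reset the assistant text for a new response is left to the caller.

```javascript
// Collect user transcripts and accumulate streamed assistant text.
function createTranscriptTracker() {
  const state = { user: [], assistant: '' };

  return {
    state,
    handle(message) {
      switch (message.type) {
        case 'conversation.item.input_audio_transcription.completed':
          // Arrives once per user utterance with the full text
          state.user.push(message.transcript);
          break;
        case 'response.audio_transcript.delta':
          // Arrives many times; each delta is the next fragment
          state.assistant += message.delta;
          break;
        default:
          break; // Other message types are ignored by this sketch
      }
    },
  };
}
```

A data channel handler would call tracker.handle(JSON.parse(event.data)) for each incoming message.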

Function Calls

Function Call Request
{
  "type": "response.output_item.done",
  "item": {
    "type": "function_call",
    "id": "call_abc123",
    "call_id": "call_abc123",
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco\"}"
  }
}
Function Call Response
{
  "type": "conversation.item.create",
  "item": {
    "type": "function_call_output",
    "call_id": "call_abc123",
    "output": "{\"temperature\": 72, \"condition\": \"sunny\"}"
  }
}

// Then trigger response generation
{
  "type": "response.create"
}

Client-Side Function Calls

Some function calls must be handled by your device. When you receive these via the data channel, execute the action locally and send the result back.

Important: These functions require hardware access and must be implemented on your device. The server cannot execute them remotely.

adjust_device_volume

Increase or decrease device volume by 10%

Parameters
{
  "action": "increase" | "decrease"
}

set_device_volume

Set device volume to a specific level

Parameters
{
  "volume_level": 0-100  // percentage
}

quiet_device

Mute device audio (set volume to 0%)

Parameters
{} // No parameters

Note: stop_audio is executed by the voice server (not client-side). No client implementation needed.
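
As a rough illustration of the three functions, the sketch below keeps the volume in memory and clamps it to 0-100. It assumes "10%" means 10 percentage points, and createVolumeController is a hypothetical helper; a real device would forward the resulting level to its audio hardware or OS mixer.

```javascript
// In-memory stand-in for device volume control.
function createVolumeController(initialVolume = 50) {
  let volume = initialVolume;
  const clamp = (value) => Math.min(100, Math.max(0, Math.round(value)));

  return {
    // adjust_device_volume: step up or down by 10 points
    adjustVolume(action) {
      if (action !== 'increase' && action !== 'decrease') {
        throw new Error(`Unknown action: ${action}`);
      }
      volume = clamp(volume + (action === 'increase' ? 10 : -10));
      return { volume };
    },
    // set_device_volume: jump to a specific level
    setVolume(level) {
      if (!Number.isFinite(level)) {
        throw new Error(`Invalid volume_level: ${level}`);
      }
      volume = clamp(level);
      return { volume };
    },
    // quiet_device: mute by setting the level to 0
    muteDevice() {
      volume = 0;
      return { volume };
    },
  };
}
```

Destructuring the returned object, for example const { adjustVolume, setVolume, muteDevice } = createVolumeController(), yields functions that a data channel handler can call directly.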

Handling Client-Side Functions
dataChannel.onmessage = (event) => {
  const message = JSON.parse(event.data);

  // Only completed function-call items need client-side handling
  if (message.type !== 'response.output_item.done' ||
      message.item?.type !== 'function_call') {
    return;
  }

  const { name, arguments: args, call_id } = message.item;
  let output;

  try {
    // arguments is a JSON-encoded string; treat an empty value as {}
    const params = JSON.parse(args || '{}');
    let result;

    switch (name) {
      case 'adjust_device_volume':
        result = adjustVolume(params.action); // Your implementation
        break;
      case 'set_device_volume':
        result = setVolume(params.volume_level); // Your implementation
        break;
      case 'quiet_device':
        result = muteDevice(); // Your implementation
        break;
      default:
        return; // Voice server or backend function, no client action needed
    }

    output = { success: true, ...result };
  } catch (err) {
    // Report the failure so the assistant does not assume the action worked
    output = { success: false, error: String(err?.message ?? err) };
  }

  // Send the result back, then trigger response generation
  dataChannel.send(JSON.stringify({
    type: 'conversation.item.create',
    item: {
      type: 'function_call_output',
      call_id: call_id,
      output: JSON.stringify(output)
    }
  }));
  dataChannel.send(JSON.stringify({ type: 'response.create' }));
};

Error Responses

All errors follow standard HTTP status codes with JSON error details:

Error Response Format
{
  "error": {
    "message": "Missing X-Client-Id header",
    "type": "invalid_request_error",
    "code": "missing_header"
  }
}

Status Codes

  • 400 Bad Request - Invalid request format or missing required headers
  • 401 Unauthorized - Invalid or missing client ID
  • 404 Not Found - Client ID not found or not activated
  • 500 Internal Server Error - Server-side error
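
One way to surface these errors in client code is to convert the response body into a typed exception. The ApiError class and parseApiError helper below are illustrative names, not part of the API; the sketch falls back to the HTTP status alone when the body is not JSON in the format shown above.

```javascript
// Error carrying the HTTP status plus the fields from the "error" object.
class ApiError extends Error {
  constructor(status, { message, type, code } = {}) {
    super(message || `HTTP ${status}`);
    this.name = 'ApiError';
    this.status = status;
    this.type = type;
    this.code = code;
  }
}

// Build an ApiError from a status code and the raw response text.
function parseApiError(status, bodyText) {
  let details = {};
  try {
    details = JSON.parse(bodyText).error || {};
  } catch {
    // Body was not valid JSON; keep the empty details object
  }
  return new ApiError(status, details);
}
```

A caller might use it as: if (!response.ok) throw parseApiError(response.status, await response.text());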