WebSocket API

Endpoint

WebSocket Connection

URL: wss://api.thesavants.ai/voice-stream

Connection Lifecycle

Authentication Handshake

Establish Connection

Open a WebSocket connection to the endpoint.

Send Token

Immediately send the JWT token as the first message, as a raw string.

Await Confirmation

Wait for the savant_voice_connected message from the server.

Begin Streaming

Start sending audio data as binary messages.

Connection States

State	Description	Actions
`CONNECTING`	WebSocket handshake in progress	Wait for connection
`AUTHENTICATING`	JWT token sent, awaiting confirmation	Monitor for `savant_voice_connected`
`CONNECTED`	Ready for audio streaming	Send/receive audio data
`ERROR`	Connection failed or error occurred	Handle error, retry if appropriate
`CLOSED`	Connection terminated	Clean up resources
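
The states in the table can be tracked with a small state machine. The sketch below is illustrative only: the state names come from the table, but the transition map, the `ConnectionState` class, and its method names are assumptions inferred from the lifecycle, not part of the API.

```javascript
// Allowed transitions between the client-side states listed in the table.
// This map is an assumption inferred from the connection lifecycle.
const TRANSITIONS = {
  CONNECTING: ['AUTHENTICATING', 'ERROR', 'CLOSED'],
  AUTHENTICATING: ['CONNECTED', 'ERROR', 'CLOSED'],
  CONNECTED: ['ERROR', 'CLOSED'],
  ERROR: ['CONNECTING', 'CLOSED'],
  CLOSED: ['CONNECTING'],
};

class ConnectionState {
  constructor() {
    // Nothing is open until connect() is called.
    this.current = 'CLOSED';
  }

  // Move to `next`, or throw if the lifecycle does not allow it.
  transition(next) {
    const allowed = TRANSITIONS[this.current] || [];
    if (!allowed.includes(next)) {
      throw new Error(`Illegal transition ${this.current} -> ${next}`);
    }
    this.current = next;
    return this.current;
  }

  // Audio may only be sent once the server has confirmed the session.
  canSendAudio() {
    return this.current === 'CONNECTED';
  }
}
```

A client would call `transition('AUTHENTICATING')` right after sending the token and `transition('CONNECTED')` on receiving `savant_voice_connected`, gating every audio send on `canSendAudio()`.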

Message Protocol

Client → Server Messages

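Authentication Token

Type: String
Timing: First message after connection
Format: Raw JWT token

websocket.send("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...");

Audio Data

Type: Binary
Timing: After the server sends savant_voice_connected
Format: 16-bit PCM, 16 kHz, mono (the format named in the INVALID_AUDIO_FORMAT error)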

Server → Client Messages

Connection Confirmed

Type: JSON String
Timing: After successful authentication

{
  "type": "savant_voice_connected",
  "message": "Voice API connection established",
  "timestamp": "2024-01-15T10:30:00.000Z"
}

Error Messages

Type: JSON String with "type": "error" (see Error Handling)

Audio Data

Type: Binary
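
Because the server sends both JSON text and binary audio over the same connection, a client needs to tell them apart before acting. The following is a minimal sketch of that dispatch step; `classifyMessage` and the `kind` labels it returns are hypothetical names, and it assumes the confirmation type is `savant_voice_connected`, as in the handshake description.

```javascript
// Classify one incoming WebSocket message.
// Text frames are parsed as JSON; anything else is treated as audio.
function classifyMessage(data) {
  if (typeof data !== 'string') {
    return { kind: 'audio', payload: data };
  }
  let message;
  try {
    message = JSON.parse(data);
  } catch (err) {
    return { kind: 'invalid', reason: 'Text frame is not valid JSON' };
  }
  if (message === null || typeof message !== 'object') {
    return { kind: 'invalid', reason: 'Text frame is not a JSON object' };
  }
  if (message.type === 'savant_voice_connected') {
    return { kind: 'connected', timestamp: message.timestamp };
  }
  if (message.type === 'error') {
    // Error details live under the nested "error" object.
    const error = message.error || {};
    return { kind: 'error', code: error.code, message: error.message };
  }
  // Unrecognized types are surfaced rather than silently dropped.
  return { kind: 'unknown', message };
}
```

Returning a plain object instead of invoking callbacks keeps the function easy to test and lets the caller decide how to react to each kind.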

Connection Parameters

URL Parameters

The WebSocket endpoint does not accept URL parameters.

Headers

Standard WebSocket headers are used:

GET /voice-stream HTTP/1.1
Host: api.thesavants.ai
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Version: 13

Subprotocols

No custom subprotocols are required.

Error Handling

Connection Errors

Connection Failed

HTTP Status Codes:

400 Bad Request - Invalid WebSocket request
401 Unauthorized - Authentication required (not expected at this stage, because the token is sent only after the connection opens)
403 Forbidden - Access denied
404 Not Found - Invalid endpoint
500 Internal Server Error - Server error
503 Service Unavailable - Server overloaded

Client Actions:

Verify endpoint URL
Check network connectivity
Implement exponential backoff for retries
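
As an illustration of the backoff recommendation, the sketch below computes a retry delay that doubles with each attempt and is capped at a maximum. The function name `backoffDelay`, the default values, and the use of random jitter are assumptions chosen for the example, not values prescribed by the API.

```javascript
// Delay in milliseconds before retry number `attempt` (0-based).
// baseMs, maxMs, and the jitter strategy are illustrative assumptions.
function backoffDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  // Double the ceiling on every attempt, but never exceed maxMs.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Pick a random delay below the ceiling so that many clients
  // do not all reconnect at the same instant.
  return Math.floor(random() * ceiling);
}
```

A reconnect routine could then schedule the next attempt with `setTimeout(connect, backoffDelay(attempt))`, resetting `attempt` to 0 after a successful connection.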

Authentication Errors

After Connection Established:

{
  "type": "error",
  "error": {
    "code": "TOKEN_INVALID",
    "message": "Invalid JWT token format"
  }
}

Client Actions:

Request new JWT token
Verify token format
Check token expiration
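
The format and expiration checks above can be done locally before opening a connection. This is a rough sketch under stated assumptions: `inspectJwt` and `decodeBase64Url` are hypothetical helpers, the token is assumed to carry a standard numeric `exp` claim, and nothing here verifies the signature, which only the server can do.

```javascript
// Decode one base64url segment to a UTF-8 string.
function decodeBase64Url(segment) {
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  // Restore the padding that base64url encoding omits.
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const binary = atob(padded);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Check the token's shape and `exp` claim. Does NOT verify the signature.
function inspectJwt(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = typeof token === 'string' ? token.split('.') : [];
  if (parts.length !== 3 || parts.some((part) => part.length === 0)) {
    return { wellFormed: false, expired: null };
  }
  let payload;
  try {
    payload = JSON.parse(decodeBase64Url(parts[1]));
  } catch (err) {
    return { wellFormed: false, expired: null };
  }
  if (payload === null || typeof payload !== 'object') {
    return { wellFormed: false, expired: null };
  }
  // `exp` is expressed in seconds since the epoch. A token without
  // `exp` is treated as non-expiring here, which is an assumption.
  const expired = typeof payload.exp === 'number' ? payload.exp <= nowSeconds : false;
  return { wellFormed: true, expired };
}
```

If `wellFormed` is false or `expired` is true, requesting a fresh token before connecting avoids a round trip that would end in `TOKEN_INVALID`.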

Audio Format Errors

{
  "type": "error", 
  "error": {
    "code": "INVALID_AUDIO_FORMAT",
    "message": "Audio must be 16-bit PCM at 16kHz mono"
  }
}

Client Actions:

Verify audio configuration
Check sample rate (16kHz)
Ensure mono channel
Confirm 16-bit PCM format
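
Audio capture APIs often produce 32-bit float samples, which must be converted before sending. The sketch below shows one way to produce 16-bit PCM from mono float samples in the range [-1, 1]. The function name `floatTo16BitPcm` is hypothetical, and little-endian byte order is an assumption, since the byte order is not stated here.

```javascript
// Convert mono Float32 samples in [-1, 1] to 16-bit signed PCM.
// Little-endian byte order is an assumption.
function floatTo16BitPcm(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples to avoid integer wrap-around.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const scaled = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    view.setInt16(i * 2, Math.round(scaled), true);
  }
  return buffer;
}
```

This handles only the sample format. The input must already be mono at 16 kHz; in a browser, constructing the capture context with `new AudioContext({ sampleRate: 16000 })` is one way to request that rate where it is supported.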

Implementation Examples

JavaScript

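class VoiceWebSocket {
  constructor(token, url = 'wss://api.thesavants.ai/voice-stream') {
    this.token = token;
    this.url = url;
    this.ws = null;
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket(this.url);
      // Receive audio as ArrayBuffer instead of the default Blob
      this.ws.binaryType = 'arraybuffer';

      this.ws.onopen = () => {
        // Send authentication token immediately as the first message
        this.ws.send(this.token);
      };

      this.ws.onmessage = (event) => {
        if (typeof event.data !== 'string') {
          // Binary message: incoming audio data
          this.handleAudioData(event.data);
          return;
        }
        let message;
        try {
          message = JSON.parse(event.data);
        } catch (err) {
          reject(new Error('Received a text message that is not valid JSON'));
          return;
        }
        if (message.type === 'savant_voice_connected') {
          resolve();
        } else if (message.type === 'error') {
          const detail = message.error && message.error.message;
          reject(new Error(detail || 'Server reported an error'));
        }
      };

      // The error event carries no details, so reject with a descriptive Error
      this.ws.onerror = () => reject(new Error('WebSocket connection error'));
      this.ws.onclose = () => {
        // Has no effect if the promise has already settled
        reject(new Error('Connection closed before confirmation'));
        this.handleDisconnection();
      };
    });
  }

  sendAudio(audioData) {
    // Guard against calls made before connect() or after the socket closes
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(audioData);
    }
  }

  // Override to play or process incoming audio
  handleAudioData(data) {}

  // Override to update state or schedule a reconnect
  handleDisconnection() {}

  close() {
    if (this.ws) {
      this.ws.close();
    }
  }
}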

Best Practices

Connection Management

Track the connection state explicitly (see Connection States) and reconnect when the connection drops unexpectedly

Error Recovery

Implement exponential backoff for reconnection attempts

Audio Buffering

Buffer audio data appropriately to handle network fluctuations
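
One possible shape for such a buffer is a bounded queue that holds chunks while the socket cannot accept them and drops the oldest audio once full. Everything in this sketch is an assumption for illustration: the `AudioSendQueue` name, the default capacity, and the drop-oldest policy, which favors fresh audio over completeness.

```javascript
// Bounded FIFO of audio chunks. When full, the oldest chunk is dropped
// so that the queue never grows without limit during an outage.
class AudioSendQueue {
  constructor(maxChunks = 50) {
    this.maxChunks = maxChunks;
    this.chunks = [];
    this.dropped = 0; // how many chunks were discarded
  }

  push(chunk) {
    if (this.chunks.length >= this.maxChunks) {
      this.chunks.shift();
      this.dropped += 1;
    }
    this.chunks.push(chunk);
  }

  // Send queued chunks in order while `canSend()` stays true.
  // Returns the number of chunks sent.
  flush(send, canSend = () => true) {
    let sent = 0;
    while (this.chunks.length > 0 && canSend()) {
      send(this.chunks.shift());
      sent += 1;
    }
    return sent;
  }
}
```

With a browser WebSocket `ws`, a caller might flush using `queue.flush((c) => ws.send(c), () => ws.readyState === WebSocket.OPEN && ws.bufferedAmount < 64 * 1024)`, where the 64 KB threshold is an arbitrary example value.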

Resource Cleanup

Always clean up WebSocket connections and audio resources
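
For a browser client, cleanup typically covers the socket, the microphone stream, and the audio context. The helper below is a sketch under that assumption; `cleanupSession` and its parameter names are hypothetical, and each argument is optional so the function can be called from any state.

```javascript
// Release everything a voice session may be holding.
function cleanupSession({ ws, mediaStream, audioContext } = {}) {
  // readyState 0 = CONNECTING, 1 = OPEN; later states need no close call.
  if (ws && ws.readyState <= 1) {
    ws.close(1000, 'Session ended'); // 1000 = normal closure
  }
  if (mediaStream) {
    // Stopping the tracks releases the microphone.
    mediaStream.getTracks().forEach((track) => track.stop());
  }
  if (audioContext && audioContext.state !== 'closed') {
    audioContext.close(); // returns a Promise; not awaited here
  }
}
```

Calling this from both the normal shutdown path and the error path keeps resources from leaking when a session ends abnormally.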

Performance Considerations

Latency: A persistent WebSocket connection avoids per-request HTTP overhead, which keeps latency low
Bandwidth: ~256 kbps (32 KB/s) for 16 kHz, 16-bit mono PCM
Buffer Size: Recommended 20-100 ms audio chunks (640-3,200 bytes at this format)
Connection Pooling: Not applicable; use a single connection per session
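
The bandwidth and chunk-size figures follow directly from the audio format, as the arithmetic below shows. The helper names (`bitrateBps`, `chunkBytes`, `splitIntoChunks`) are hypothetical and exist only to make the calculation concrete.

```javascript
const SAMPLE_RATE_HZ = 16000;
const BYTES_PER_SAMPLE = 2; // 16-bit PCM
const CHANNELS = 1; // mono

// Raw bitrate of the stream in bits per second.
function bitrateBps() {
  return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * 8 * CHANNELS;
}

// Size in bytes of a chunk that holds `durationMs` of audio.
function chunkBytes(durationMs) {
  const samples = Math.round((SAMPLE_RATE_HZ * durationMs) / 1000);
  return samples * BYTES_PER_SAMPLE * CHANNELS;
}

// Split a PCM ArrayBuffer into chunks of `durationMs` each.
// The final chunk may be shorter than the rest.
function* splitIntoChunks(pcm, durationMs) {
  const size = chunkBytes(durationMs);
  if (size <= 0) {
    throw new RangeError('durationMs must produce a positive chunk size');
  }
  for (let offset = 0; offset < pcm.byteLength; offset += size) {
    yield pcm.slice(offset, offset + size);
  }
}
```

Here `bitrateBps()` gives 256,000 bits per second, `chunkBytes(20)` gives 640 bytes, and `chunkBytes(100)` gives 3,200 bytes, matching the figures above.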
