Endpoint

WebSocket Connection

URL: wss://api.thesavants.ai/voice-stream

Connection Lifecycle

Authentication Handshake

1. Establish Connection
   Open WebSocket connection to the endpoint

2. Send Token
   Immediately send JWT token as first message (String)

3. Await Confirmation
   Wait for savant_voice_connected message from server

4. Begin Streaming
   Start sending audio data as binary messages

Connection States

State           Description                            Actions
CONNECTING      WebSocket handshake in progress        Wait for connection
AUTHENTICATING  JWT token sent, awaiting confirmation  Monitor for savant_voice_connected
CONNECTED       Ready for audio streaming              Send/receive audio data
ERROR           Connection failed or error occurred    Handle error, retry if appropriate
CLOSED          Connection terminated                  Clean up resources
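
The states above can be tracked with a small state machine. The sketch below is illustrative only: the state names come from the table, but the transition table and the ConnectionState class are assumptions, not part of the documented API.

```javascript
// Allowed transitions between the documented connection states.
// Assumption: the documentation lists the states but not the legal
// transitions, so this table is one reasonable interpretation.
const TRANSITIONS = {
  CONNECTING: ['AUTHENTICATING', 'ERROR', 'CLOSED'],
  AUTHENTICATING: ['CONNECTED', 'ERROR', 'CLOSED'],
  CONNECTED: ['ERROR', 'CLOSED'],
  ERROR: ['CONNECTING', 'CLOSED'],
  CLOSED: ['CONNECTING'],
};

// Hypothetical helper that rejects transitions not listed above.
class ConnectionState {
  constructor(initial = 'CONNECTING') {
    this.state = initial;
  }

  canTransition(next) {
    return TRANSITIONS[this.state].includes(next);
  }

  transition(next) {
    if (!this.canTransition(next)) {
      throw new Error(`Illegal transition ${this.state} -> ${next}`);
    }
    this.state = next;
    return this.state;
  }
}
```

A client would typically call transition('AUTHENTICATING') right after sending the token and transition('CONNECTED') on receiving savant_voice_connected.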

Message Protocol

Client → Server Messages

Authentication Token
Type: String
Timing: First message after connection
Format: Raw JWT token
websocket.send("eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...");

Audio Data
Type: Binary
Timing: After savant_voice_connected is received
Format: 16-bit PCM at 16kHz mono

Server → Client Messages

Connection Confirmed
Type: JSON String
Timing: After successful authentication
{
  "type": "savant_voice_connected",
  "message": "Voice API connection established",
  "timestamp": "2024-01-15T10:30:00.000Z"
}

Error Messages
Type: JSON String with "type": "error" (see Error Handling)

Audio Data
Type: Binary
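
One way to route incoming messages is to classify each frame before acting on it. This is a minimal sketch, assuming the confirmation type is savant_voice_connected (as named in the handshake steps) and that any non-string frame is audio. The function name classifyServerMessage and the returned object shape are hypothetical.

```javascript
// Classify a WebSocket message payload into one of four kinds.
// Assumption: text frames carry JSON, binary frames carry audio.
function classifyServerMessage(data) {
  if (typeof data !== 'string') {
    return { kind: 'audio', payload: data };
  }

  let msg;
  try {
    msg = JSON.parse(data);
  } catch (err) {
    // Text frame that is not valid JSON
    return { kind: 'unknown', raw: data };
  }

  if (msg && msg.type === 'savant_voice_connected') {
    return { kind: 'connected', message: msg };
  }
  if (msg && msg.type === 'error') {
    return {
      kind: 'error',
      code: msg.error?.code,
      message: msg.error?.message,
    };
  }
  return { kind: 'unknown', raw: data };
}
```

Calling this from onmessage keeps the handler to a single switch on kind, and unknown message types are surfaced rather than silently ignored.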

Connection Parameters

URL Parameters

The WebSocket endpoint does not accept URL parameters.

Headers

Standard WebSocket headers are used:
GET /voice-stream HTTP/1.1
Host: api.thesavants.ai
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Version: 13

Subprotocols

No custom subprotocols are required.

Error Handling

Connection Errors

HTTP Status Codes:
  • 400 Bad Request - Invalid WebSocket request
  • 401 Unauthorized - Authentication required (not expected at this stage, since the token is sent after the connection opens)
  • 403 Forbidden - Access denied
  • 404 Not Found - Invalid endpoint
  • 500 Internal Server Error - Server error
  • 503 Service Unavailable - Server overloaded
Client Actions:
  • Verify endpoint URL
  • Check network connectivity
  • Implement exponential backoff for retries
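
Exponential backoff can be sketched as follows. The base delay, cap, jitter fraction, and attempt limit are illustrative assumptions, not values specified by the service, and backoffDelay and connectWithRetry are hypothetical helper names.

```javascript
// Delay before retry number `attempt` (0-based): doubles each time,
// capped at maxMs, with +/- jitter to avoid synchronized reconnects.
function backoffDelay(
  attempt,
  { baseMs = 500, maxMs = 30000, jitter = 0.2, random = Math.random } = {}
) {
  const exp = Math.min(maxMs, baseMs * 2 ** attempt);
  const spread = exp * jitter;
  return Math.round(exp - spread + 2 * spread * random());
}

// Call `connect` (any function returning a promise) until it succeeds
// or maxAttempts is reached, sleeping between failed attempts.
async function connectWithRetry(
  connect,
  {
    maxAttempts = 5,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    ...backoffOptions
  } = {}
) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, backoffOptions));
      }
    }
  }
  throw lastError;
}
```

Retrying suits transient failures such as 500 or 503. For 400, 403, or 404 responses, a retry is unlikely to help until the request itself is corrected.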
After Connection Established:
{
  "type": "error",
  "error": {
    "code": "TOKEN_INVALID",
    "message": "Invalid JWT token format"
  }
}
Client Actions:
  • Request new JWT token
  • Verify token format
  • Check token expiration
{
  "type": "error", 
  "error": {
    "code": "INVALID_AUDIO_FORMAT",
    "message": "Audio must be 16-bit PCM at 16kHz mono"
  }
}
Client Actions:
  • Verify audio configuration
  • Check sample rate (16kHz)
  • Ensure mono channel
  • Confirm 16-bit PCM format
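
A client can check its audio settings and convert samples before sending. The sketch below assumes capture produces 32-bit float samples in the range [-1, 1] (as the browser Web Audio API does) and that the server expects the platform's native byte order, which is little-endian on common hardware. The documentation does not state the byte order, and both function names are hypothetical.

```javascript
// Return a list of problems with an audio configuration, empty if valid.
function validateAudioConfig({ sampleRate, channels, bitDepth }) {
  const problems = [];
  if (sampleRate !== 16000) problems.push('sample rate must be 16000 Hz');
  if (channels !== 1) problems.push('audio must be mono (1 channel)');
  if (bitDepth !== 16) problems.push('samples must be 16-bit PCM');
  return problems;
}

// Convert float samples in [-1, 1] to signed 16-bit PCM.
// Values outside the range are clamped rather than wrapped.
function floatTo16BitPCM(float32Samples) {
  const out = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i]));
    out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return out;
}
```

The resulting Int16Array can be passed to send() as a binary message. Resampling to 16kHz and downmixing to mono are separate steps not shown here.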

Implementation Examples

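JavaScript
class VoiceWebSocket {
  constructor(token) {
    this.token = token;
    this.ws = null;
  }

  connect() {
    return new Promise((resolve, reject) => {
      this.ws = new WebSocket('wss://api.thesavants.ai/voice-stream');
      // Receive binary frames as ArrayBuffer instead of the default Blob
      this.ws.binaryType = 'arraybuffer';

      this.ws.onopen = () => {
        // Send authentication token immediately as the first message
        this.ws.send(this.token);
      };

      this.ws.onmessage = (event) => {
        if (typeof event.data === 'string') {
          let message;
          try {
            message = JSON.parse(event.data);
          } catch (err) {
            reject(new Error('Received a non-JSON text message'));
            return;
          }
          if (message.type === 'savant_voice_connected') {
            resolve();
          } else if (message.type === 'error') {
            // Guard against an error message without an error object
            reject(new Error(message.error?.message ?? 'Unknown server error'));
          }
        } else {
          // Handle incoming audio data
          this.handleAudioData(event.data);
        }
      };

      // The error event carries no details, so reject with a real Error
      this.ws.onerror = () => reject(new Error('WebSocket connection error'));
      this.ws.onclose = (event) => {
        // Rejecting is a no-op if the promise has already settled
        reject(new Error(`Connection closed (code ${event.code})`));
        this.handleDisconnection(event);
      };
    });
  }

  sendAudio(audioData) {
    // Optional chaining avoids a TypeError when called before connect()
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(audioData);
    }
  }

  // Placeholder: override to play or process received audio
  handleAudioData(data) {}

  // Placeholder: override to clean up or schedule a reconnect
  handleDisconnection(event) {}

  close() {
    if (this.ws) {
      this.ws.close();
    }
  }
}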

Best Practices

Connection Management

Always handle connection states properly and implement reconnection logic

Error Recovery

Implement exponential backoff for reconnection attempts

Audio Buffering

Buffer audio data appropriately to handle network fluctuations
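
One possible buffering strategy is a bounded queue that holds outgoing chunks while the connection is unavailable and drops the oldest audio once a limit is reached, so stale speech is not replayed after a long stall. This is a sketch under that assumption: the class name, the default limit, and the drop-oldest policy are illustrative choices, not documented behavior.

```javascript
// Bounded FIFO queue for outgoing audio chunks.
class BoundedAudioQueue {
  constructor(maxChunks = 50) {
    this.maxChunks = maxChunks;
    this.items = [];
    this.dropped = 0; // count of chunks discarded due to overflow
  }

  // Add a chunk, discarding the oldest one if the queue is full.
  enqueue(chunk) {
    if (this.items.length >= this.maxChunks) {
      this.items.shift();
      this.dropped++;
    }
    this.items.push(chunk);
  }

  // Pass every queued chunk to `send` in order; returns how many were sent.
  drain(send) {
    let sent = 0;
    while (this.items.length > 0) {
      send(this.items.shift());
      sent++;
    }
    return sent;
  }
}
```

With 20ms chunks, a limit of 50 corresponds to roughly one second of audio. Once the socket is open again, drain can be called with a function that forwards each chunk to send().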

Resource Cleanup

Always clean up WebSocket connections and audio resources

Performance Considerations

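  • Latency: A WebSocket keeps one persistent, full-duplex connection open, avoiding per-request HTTP overhead
  • Bandwidth: ~256 kbps for 16kHz 16-bit mono PCM (16,000 samples/s × 16 bits), excluding WebSocket framing overhead
  • Buffer Size: Recommended 20-100ms audio chunks (640-3,200 bytes at this format)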
  • Connection Pooling: Not applicable - use single connection per session
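
The chunk sizes implied by these figures can be computed directly, and captured audio can be split into fixed-duration chunks before sending. The sketch below assumes 16kHz 16-bit mono PCM held in an Int16Array. AudioChunker and chunkBytes are hypothetical names.

```javascript
const SAMPLE_RATE = 16000; // samples per second
const BYTES_PER_SAMPLE = 2; // 16-bit PCM, mono

// Size in bytes of a chunk lasting `ms` milliseconds.
function chunkBytes(ms) {
  return (SAMPLE_RATE * ms / 1000) * BYTES_PER_SAMPLE;
}

// Accumulates samples and emits chunks of a fixed duration.
class AudioChunker {
  constructor(chunkMs = 20) {
    this.chunkSamples = Math.round(SAMPLE_RATE * chunkMs / 1000);
    this.pending = new Int16Array(0); // leftover samples from the last push
  }

  // Append samples and return every complete chunk now available.
  push(samples) {
    const merged = new Int16Array(this.pending.length + samples.length);
    merged.set(this.pending, 0);
    merged.set(samples, this.pending.length);

    const chunks = [];
    let offset = 0;
    while (merged.length - offset >= this.chunkSamples) {
      chunks.push(merged.slice(offset, offset + this.chunkSamples));
      offset += this.chunkSamples;
    }
    this.pending = merged.slice(offset);
    return chunks;
  }
}
```

Smaller chunks reduce latency at the cost of more messages per second: 20ms chunks mean 50 sends per second, while 100ms chunks mean 10.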