API Overview

Introduction

The Savants Voice API enables real-time, bidirectional voice conversations between your application and AI. The API uses a hybrid approach combining HTTP for authentication and WebSockets for real-time audio streaming.

Architecture

The API follows a simple two-step process:

Step 1: Authentication

Flutter App  →  [POST /api/websocket-voice/token]  →  Auth Server
Flutter App  ←  [JWT Token Response]  ←  Auth Server

Step 2: Real-time Communication

Flutter App  →  [WebSocket Connect + JWT]  →  WebSocket Server
Flutter App  ←  [savant_voice_connected confirmation]  ←  WebSocket Server
Flutter App  ↔  [Bidirectional Audio Stream]  ↔  WebSocket Server

Flow Description

Authentication Request: The Flutter app sends its API key and device ID to request a JWT
Token Response: The server returns a short-lived JWT (1-minute expiry)
WebSocket Connection: The app connects to the WebSocket endpoint and sends the JWT as its first message
Connection Confirmation: The server confirms with a savant_voice_connected message
Audio Streaming: Bidirectional PCM audio streaming begins
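
As an illustration of the first step, the following sketch builds the token request without sending it. The endpoint path and base URL come from this page; the JSON field names `apiKey` and `deviceId` are hypothetical placeholders, since the request body schema is not specified here.

```python
import json
import urllib.request

BASE_URL = "https://api.thesavants.ai"
TOKEN_PATH = "/api/websocket-voice/token"


def build_token_request(api_key: str, device_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a short-lived JWT."""
    # ASSUMPTION: "apiKey" and "deviceId" are illustrative field names only;
    # check the token endpoint reference for the real request schema.
    body = json.dumps({"apiKey": api_key, "deviceId": device_id}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + TOKEN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the request easy to inspect and test.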

Base Configuration

Environment	Base URL
Production	`https://api.thesavants.ai`
WebSocket	`wss://api.thesavants.ai/voice-stream`

API Endpoints

Authentication

POST /api/websocket-voice/token - Request JWT for WebSocket connection

WebSocket

WSS /voice-stream - Real-time audio streaming endpoint

Authentication Flow

Request Token

POST to /api/websocket-voice/token with API key and device ID

Receive JWT

Server returns short-lived (1 minute) JWT token

WebSocket Auth

Send JWT as first message after WebSocket connection

Begin Streaming

Start bidirectional audio streaming
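
Because the token lives for only one minute, a client may want to check its age before using it as the first WebSocket message. The sketch below assumes the client records a local clock reading when the token arrives; the `IssuedToken` type and the 5-second safety margin are illustrative choices, not part of the API.

```python
from dataclasses import dataclass

TOKEN_LIFETIME_SECONDS = 60.0  # from the documented 1-minute expiry


@dataclass
class IssuedToken:
    value: str
    issued_at: float  # local clock reading (seconds) when the token arrived

    def is_usable(self, now: float, safety_margin: float = 5.0) -> bool:
        """True while the token is younger than its lifetime minus the margin."""
        # ASSUMPTION: the margin is a client-side choice to absorb network delay.
        return (now - self.issued_at) < (TOKEN_LIFETIME_SECONDS - safety_margin)


def first_message(token: IssuedToken, now: float) -> str:
    """Return the text to send as the first WebSocket message, or raise if stale."""
    if not token.is_usable(now):
        raise ValueError("token too old; request a new one before connecting")
    return token.value
```

In practice `now` could come from `time.monotonic()`, which is unaffected by wall-clock adjustments on the device.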

Data Flow

Client → Server

Authentication Token (String): JWT for connection authorization
Audio Data (Binary): PCM audio from microphone

Server → Client

Connection Messages (JSON): Status and error messages
Audio Data (Binary): PCM audio from AI
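
Since the server sends both JSON text and binary audio over the same socket, a client needs to tell the two apart. A minimal sketch, assuming the WebSocket library delivers text frames as `str` and binary frames as `bytes`; the function name and the returned labels are hypothetical, and the shape of non-error status messages is left uninterpreted because it is not specified on this page.

```python
import json
from typing import Any, Tuple, Union


def classify_frame(frame: Union[str, bytes, bytearray]) -> Tuple[str, Any]:
    """Sort an incoming WebSocket frame into audio, error, or status."""
    if isinstance(frame, (bytes, bytearray)):
        # Binary frames carry raw PCM audio from the AI.
        return ("audio", bytes(frame))
    message = json.loads(frame)
    # The "error" envelope is the documented error format; anything else is
    # passed through as a status message without assuming its fields.
    if isinstance(message, dict) and "error" in message:
        return ("error", message["error"])
    return ("status", message)
```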

Audio Specifications

Property	Value
Format	Raw PCM
Sample Rate	16,000 Hz
Bit Depth	16-bit
Channels	1 (Mono)
Endianness	Little-Endian
Encoding	Signed Integer
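
These parameters fix the data rate at 16,000 samples/s × 2 bytes × 1 channel = 32,000 bytes per second. The sketch below computes chunk sizes and produces a test tone in this exact format; the helper names and the 20 ms chunk duration used in the example are illustrative, as the API does not state a required chunk size here.

```python
import math
import struct

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit
CHANNELS = 1          # mono


def chunk_size_bytes(duration_ms: int) -> int:
    """Number of bytes needed for duration_ms of audio in this format."""
    samples = (SAMPLE_RATE_HZ * duration_ms) // 1000
    return samples * BYTES_PER_SAMPLE * CHANNELS


def sine_pcm(frequency_hz: float, duration_ms: int, amplitude: float = 0.5) -> bytes:
    """Generate a sine tone as signed 16-bit little-endian mono PCM."""
    n = (SAMPLE_RATE_HZ * duration_ms) // 1000
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE_HZ))
        for i in range(n)
    ]
    # "<" selects little-endian, "h" a signed 16-bit integer.
    return struct.pack(f"<{n}h", *samples)
```

For example, `chunk_size_bytes(20)` gives 640 bytes, and `sine_pcm(440, 20)` returns a buffer of that length.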

Rate Limits

Resource	Limit
Token Requests	60/minute per device
Concurrent Connections	10 per device
Audio Streaming	No specific limit (bandwidth dependent)
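
To stay under the token request limit, a client can throttle itself before calling the API. The following is a sketch of a sliding-window limiter defaulting to 60 events per 60 seconds; how the server actually measures its window is not documented here, so treat this as a conservative client-side guard rather than a mirror of server behavior.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_events within any window_seconds interval."""

    def __init__(self, max_events: int = 60, window_seconds: float = 60.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def try_acquire(self, now: float) -> bool:
        """Record an event at time `now` if allowed; return whether it was."""
        # Drop events that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_events:
            return False
        self._timestamps.append(now)
        return True
```

Calling `try_acquire(time.monotonic())` before each token request, and backing off when it returns `False`, keeps a single device within the documented limit.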

Error Handling

All errors follow a consistent format:

{
  "error": {
    "code": "ERROR_CODE",
    "message": "Human-readable description"
  }
}
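
A client can turn this envelope into a typed exception. This is a minimal sketch: the `SavantsApiError` class and `parse_error` function are hypothetical names, and the `"UNKNOWN"` fallback code is a local placeholder for payloads that omit `code`, not a value defined by the API.

```python
import json
from typing import Optional


class SavantsApiError(Exception):
    """Client-side wrapper for the documented error envelope."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def parse_error(payload: str) -> Optional[SavantsApiError]:
    """Return an error object if payload matches the envelope, else None."""
    try:
        data = json.loads(payload)
    except json.JSONDecodeError:
        return None
    error = data.get("error") if isinstance(data, dict) else None
    if not isinstance(error, dict):
        return None
    # ASSUMPTION: "UNKNOWN" is a local fallback, not an API-defined code.
    return SavantsApiError(str(error.get("code", "UNKNOWN")), str(error.get("message", "")))
```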

SDK Support

Language	Status	Package
Flutter/Dart	✅ Official	Native implementation
JavaScript	🚧 Planned	Coming soon
Python	🚧 Planned	Coming soon

Security

Transport Security: All connections use TLS/SSL encryption
Authentication: JWT-based token authentication
Token Expiry: The 1-minute token lifespan limits the window in which a captured token can be replayed
No Data Storage: Audio streams are not stored server-side

Versioning

Current API version: v1

The API uses URL-based versioning:

https://your-server.com/api/v1/websocket-voice/token

Support

Technical Documentation: This reference guide
Integration Support: support@thesavants.ai
Response Time: 24-48 hours for technical inquiries
