Audio Requirements
Required Format
Raw PCM, 16,000 Hz, 16-bit Signed Little-Endian, Mono
Technical Specifications
| Parameter | Value | Description |
|---|---|---|
| Format | PCM (Pulse Code Modulation) | Uncompressed linear audio |
| Sample Rate | 16,000 Hz | 16,000 samples per second |
| Bit Depth | 16-bit | Signed 16-bit integers |
| Channels | 1 (Mono) | Single audio channel |
| Endianness | Little-Endian | Least significant byte first |
| Encoding | Signed Integer | Two’s complement representation |
| Byte Order | Low byte, High byte | For each 16-bit sample |
Sample Format Details
16-bit Signed Integer Range
| Value | Hex | Binary | Description |
|---|---|---|---|
| 32767 | 0x7FFF | 0111111111111111 | Maximum positive |
| 1 | 0x0001 | 0000000000000001 | Smallest positive |
| 0 | 0x0000 | 0000000000000000 | Digital silence |
| -1 | 0xFFFF | 1111111111111111 | Smallest negative |
| -32768 | 0x8000 | 1000000000000000 | Maximum negative |
Little-Endian Byte Order
For a 16-bit sample value of 0x1234:
- Low byte (LSB): 0x34 (stored first)
- High byte (MSB): 0x12 (stored second)
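The byte layout described above can be checked with Python's standard `struct` module, where the format string `<h` means "little-endian signed 16-bit integer". This is a minimal sketch; the helper names `pack_sample` and `unpack_sample` are illustrative and not part of any API described here.

```python
import struct


def pack_sample(value: int) -> bytes:
    """Pack one signed 16-bit sample as two little-endian bytes."""
    return struct.pack("<h", value)


def unpack_sample(data: bytes) -> int:
    """Read one signed 16-bit little-endian sample from two bytes."""
    return struct.unpack("<h", data)[0]


if __name__ == "__main__":
    print(pack_sample(0x1234).hex())   # 3412 (low byte 0x34 first)
    print(pack_sample(32767).hex())    # ff7f
    print(pack_sample(-1).hex())       # ffff
    print(pack_sample(-32768).hex())   # 0080
```

Values outside the range -32768 to 32767 make `struct.pack` raise `struct.error`, so out-of-range samples need to be clamped before packing.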
Data Rate Calculations
Bandwidth Requirements
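The raw data rate follows directly from the format parameters: sample rate × bytes per sample × channels. The sketch below works through that arithmetic for the required format. It covers the audio payload only; any transport overhead (framing, headers, encoding) is not part of this document and is not included.

```python
SAMPLE_RATE_HZ = 16_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit samples
CHANNELS = 1              # mono


def bytes_per_second() -> int:
    """Raw PCM payload produced per second of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def kilobits_per_second() -> float:
    """Same rate expressed in kbps (1 kbps = 1,000 bits per second)."""
    return bytes_per_second() * 8 / 1000


def bytes_per_minute() -> int:
    """Raw PCM payload produced per minute of audio."""
    return bytes_per_second() * 60


if __name__ == "__main__":
    print(bytes_per_second())     # 32000
    print(kilobits_per_second())  # 256.0
    print(bytes_per_minute())     # 1920000
```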
Buffer Size Recommendations
| Duration | Samples | Bytes | Use Case |
|---|---|---|---|
| 10ms | 160 | 320 | Minimal latency |
| 20ms | 320 | 640 | Recommended |
| 50ms | 800 | 1,600 | Standard buffer |
| 100ms | 1,600 | 3,200 | Large buffer |
Recommended: 20ms chunks (640 bytes) provide the best balance of latency and reliability.
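The byte counts in the table come from `samples = sample_rate × duration` and `bytes = samples × 2`. As a rough sketch, a recorded buffer can be split into fixed-duration chunks as shown below. The function names are illustrative, and how the final, shorter chunk should be handled (sent as-is, padded, or dropped) is an assumption left to the caller.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # 16-bit mono


def chunk_size_bytes(duration_ms: int) -> int:
    """Number of bytes holding `duration_ms` of 16 kHz, 16-bit mono audio."""
    samples = SAMPLE_RATE_HZ * duration_ms // 1000
    return samples * BYTES_PER_SAMPLE


def iter_chunks(pcm: bytes, duration_ms: int = 20) -> Iterator[bytes]:
    """Yield consecutive chunks; the last one may be shorter than the rest."""
    size = chunk_size_bytes(duration_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]


if __name__ == "__main__":
    for ms in (10, 20, 50, 100):
        print(ms, chunk_size_bytes(ms))  # 320, 640, 1600, 3200 bytes
```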
Implementation Examples
- Flutter/Dart Configuration
- JavaScript Web Audio API
- Python Audio Processing
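The platform-specific examples are not reproduced here. As a stand-in for the Python case, the sketch below converts floating-point samples in the range -1.0 to 1.0 (a common output of audio capture libraries) into the required 16-bit signed little-endian bytes. The function name `float_to_pcm16` and the choice of 32767 as the scale factor are assumptions for illustration.

```python
import struct
from typing import Sequence


def float_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to 16-bit signed LE PCM bytes."""
    ints = []
    for s in samples:
        s = max(-1.0, min(1.0, s))          # clamp to avoid integer overflow
        ints.append(int(round(s * 32767)))  # scale to the 16-bit range
    return struct.pack(f"<{len(ints)}h", *ints)


if __name__ == "__main__":
    pcm = float_to_pcm16([0.0, 1.0, -1.0])
    print(pcm.hex())  # 0000ff7f0180
```

Scaling by 32767 keeps the output symmetric around zero, so -1.0 maps to -32767 and the value -32768 is never produced.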
Audio Quality Guidelines
Recording Best Practices
Microphone Setup
Use a close-talking microphone placed 6-12 inches from the speaker's mouth
Environment
Record in a quiet environment with minimal background noise
Gain Control
Disable automatic gain control (AGC) so that levels stay consistent
Sample Quality
Maintain signal levels between -12 dB and -6 dB for optimal quality
Audio Processing Chain
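The stages of the chain are not listed in this section, so the following is an assumed minimal chain for source audio that does not already match the required format: downmix to mono, resample to 16,000 Hz, then clamp and pack as 16-bit little-endian. All function names are illustrative. The resampler uses plain linear interpolation with no anti-aliasing filter, which is only adequate as a demonstration; a dedicated resampling library is the safer choice for real input.

```python
import struct
from typing import List

TARGET_RATE_HZ = 16_000


def downmix_to_mono(interleaved: List[int], channels: int) -> List[int]:
    """Average the channels of each interleaved frame into one sample."""
    return [int(sum(interleaved[i:i + channels]) / channels)
            for i in range(0, len(interleaved) - channels + 1, channels)]


def resample_linear(samples: List[int], src_rate: int,
                    dst_rate: int = TARGET_RATE_HZ) -> List[int]:
    """Naive linear-interpolation resampler (no low-pass filtering)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for n in range(out_len):
        pos = n * src_rate / dst_rate          # position in the source signal
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(int(round(samples[i] * (1 - frac) + nxt * frac)))
    return out


def to_pcm16_le(samples: List[int]) -> bytes:
    """Clamp to the 16-bit range and pack as little-endian bytes."""
    clamped = [max(-32768, min(32767, s)) for s in samples]
    return struct.pack(f"<{len(clamped)}h", *clamped)


def process(interleaved: List[int], channels: int, src_rate: int) -> bytes:
    """Run the assumed chain: downmix, resample, pack."""
    mono = downmix_to_mono(interleaved, channels)
    return to_pcm16_le(resample_linear(mono, src_rate))
```

For example, `process(frames, channels=2, src_rate=48000)` would turn interleaved 48 kHz stereo integers into 16 kHz mono bytes.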
Validation and Testing
Format Validation
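Raw PCM carries no header, so its sample rate and channel count cannot be verified from the bytes alone. One workable approach, sketched below under that caveat, is to validate the source recording while it is still in a WAV container, using Python's standard `wave` module. The function name `check_wav_format` is illustrative.

```python
import wave
from typing import BinaryIO, List


def check_wav_format(wav_file: BinaryIO) -> List[str]:
    """Return a list of mismatches against 16 kHz, 16-bit, mono PCM."""
    problems = []
    # wave.open raises wave.Error for files it cannot parse as PCM WAV.
    with wave.open(wav_file, "rb") as w:
        if w.getframerate() != 16000:
            problems.append(f"sample rate is {w.getframerate()} Hz, expected 16000")
        if w.getsampwidth() != 2:
            problems.append(f"bit depth is {w.getsampwidth() * 8}-bit, expected 16-bit")
        if w.getnchannels() != 1:
            problems.append(f"{w.getnchannels()} channels, expected 1 (mono)")
    return problems
```

An empty list means the WAV header matches the required format. WAV stores 16-bit PCM samples in little-endian order, so the frames returned by `readframes` can then be used as raw PCM without byte swapping.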
Audio Level Monitoring
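A simple level monitor can report peak and RMS levels for each buffer. The sketch below assumes that the decibel figures in the recording guidelines are relative to digital full scale (dBFS), with 32768 as the reference; the source does not state the reference, so treat that as an assumption. The function name is illustrative.

```python
import math
import struct
from typing import Tuple

FULL_SCALE = 32768.0  # assumed 0 dBFS reference for 16-bit audio


def _to_dbfs(amplitude: float) -> float:
    """Convert a linear amplitude to dBFS; silence maps to -inf."""
    if amplitude <= 0:
        return float("-inf")
    return 20 * math.log10(amplitude / FULL_SCALE)


def pcm16_levels_dbfs(pcm: bytes) -> Tuple[float, float]:
    """Return (peak_dbfs, rms_dbfs) for 16-bit signed LE mono PCM."""
    count = len(pcm) // 2
    if count == 0:
        return float("-inf"), float("-inf")
    samples = struct.unpack(f"<{count}h", pcm[:count * 2])
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return _to_dbfs(peak), _to_dbfs(rms)
```

A buffer whose peak sits at half of full scale (sample value 16384) reads about -6 dBFS, which is the upper end of the range suggested in the recording guidelines.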
Common Issues and Solutions
Audio Distortion
Symptoms:
- Crackling or popping sounds
- Metallic audio quality
- Clipped audio
Common causes:
- Incorrect sample rate conversion
- Audio clipping (levels too high)
- Wrong endianness
- Incorrect bit depth
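Of the items listed above, clipping is the easiest to detect programmatically: clipped audio contains samples pinned at the extremes of the 16-bit range. The following is a minimal sketch; the function name and the idea of reporting a fraction are illustrative, and what fraction counts as a problem is left to the caller.

```python
import struct


def clipped_fraction(pcm: bytes) -> float:
    """Fraction of 16-bit LE samples sitting at full scale (32767 or -32768)."""
    count = len(pcm) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", pcm[:count * 2])
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    return clipped / count
```

A non-trivial fraction suggests the input gain is too high. Byte-swapped (wrong-endian) audio tends to sound like loud noise instead, which this check does not identify.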
No Audio / Silence
Symptoms:
- No sound transmitted or received
- Zero bytes in audio buffers
- WebSocket receives empty data
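The symptoms above can be told apart by inspecting a buffer before it is sent: an empty buffer points to a capture or wiring problem, while a buffer full of zero bytes is digital silence (for example, a muted or unavailable microphone). This sketch uses illustrative names and return values.

```python
def describe_buffer(pcm: bytes) -> str:
    """Classify a PCM buffer as 'empty', 'all-zero', or 'has-signal'."""
    if len(pcm) == 0:
        return "empty"       # nothing was captured or passed along
    if not any(pcm):
        return "all-zero"    # every byte is 0x00: digital silence
    return "has-signal"
```

Logging this classification for each outgoing chunk can show whether silence originates at capture time or later in the pipeline.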
Format Mismatch Errors
If the server reports a format mismatch error, verify each item in this checklist:
- ✅ Sample rate is exactly 16,000 Hz
- ✅ Bit depth is 16-bit (not 8-bit or 24-bit)
- ✅ Channel count is 1 (mono)
- ✅ Endianness is little-endian
- ✅ Format is raw PCM (not compressed)