Voice input with voice response (full voice-to-voice)
POST /chat/voice-response
Full voice-to-voice pipeline: STT (Whisper) → AI Agent → TTS (OpenAI) → Audio Response.
Streaming Mode: Add stream: "true" to the form data to receive an SSE stream that includes:
- transcription event (STT result)
- agent_thinking, token, and agent_response events
- tts_start and tts_complete events with base64 audio
- stream_end with full performance metrics
Note: This endpoint uses a different base path: /api/chat/voice-response
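In streaming mode, a client has to split the response body into individual events before it can react to transcription, token, or tts_complete. The following is a minimal sketch of that step, assuming the server uses standard SSE framing (an `event:` line, one or more `data:` lines, then a blank line). The function name `iter_sse_events` is hypothetical, and the payload shape of each event is not specified here, so the data is returned as a raw string.

```python
from typing import Iterable, Iterator, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from an iterable of SSE text lines.

    Assumes standard SSE framing; the event names (transcription, token,
    tts_complete, stream_end, ...) are passed through unchanged.
    """
    event = "message"          # SSE default when no "event:" line is sent
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event, "\n".join(data_lines)
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue           # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    # An event not terminated by a blank line is dropped, as in the SSE spec.
```

The lines can come from any HTTP client that exposes the response body line by line. A caller would typically stop reading once it sees stream_end.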
Request
Responses
- 200
Standard Mode: JSON response with transcription, AI response, and audio.
Streaming Mode: SSE stream with all events including TTS audio.
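Either mode returns audio that the client has to turn back into bytes before playing or saving it. The sketch below assumes the audio arrives as a base64 string, as the streaming TTS events do. Whether the standard-mode JSON encodes it the same way, and under which field name, is not stated here, so the helper `decode_audio` is hypothetical and takes the string directly.

```python
import base64


def decode_audio(b64_audio: str) -> bytes:
    """Decode a base64 audio string into raw bytes.

    Assumption: the audio is plain base64 with no data-URL prefix.
    validate=True raises binascii.Error on non-base64 characters
    instead of silently skipping them.
    """
    return base64.b64decode(b64_audio, validate=True)
```

The returned bytes can be written to a file opened in binary mode. The container format (MP3, WAV, or other) depends on the server's TTS settings.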