Run Speech-to-Text, Text-to-Speech, and LLM integration locally with an Ollama-inspired interface.
Available for macOS, Linux, and Windows
Multiple STT engines, including whisper.cpp (50x faster), faster-whisper with voice activity detection (VAD), and OpenAI Whisper
Multiple TTS engines with voice selection: Native OS TTS, Kokoro, SpeechT5, Bark, and XTTS for voice cloning
Voice-based conversational AI with Ollama integration for intelligent responses and streaming support
uv pip install localkin-service-audio
kin audio transcribe audio.wav
kin audio tts "Hello world" --model kokoro-82m
kin audio listen --llm ollama --tts --stream
kin audio run kokoro-82m --port 8001
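
The commands above can also be driven from a script. The following is a minimal sketch, not part of the project itself: it builds the same `kin audio transcribe` and `kin audio tts` invocations shown above and runs them with Python's standard `subprocess` module. The helper names (`transcribe_cmd`, `tts_cmd`, `run_kin`) are hypothetical, and the sketch assumes the CLI writes its result to standard output, which the text above does not confirm.

```python
import shutil
import subprocess


def transcribe_cmd(audio_path):
    """Build the argument list for `kin audio transcribe <file>`."""
    return ["kin", "audio", "transcribe", str(audio_path)]


def tts_cmd(text, model="kokoro-82m"):
    """Build the argument list for `kin audio tts "<text>" --model <model>`."""
    return ["kin", "audio", "tts", text, "--model", model]


def run_kin(argv):
    """Run a command and return its captured stdout.

    Returns None when the executable is not on PATH, so callers can
    tell "kin is not installed" apart from a failed run (which raises
    subprocess.CalledProcessError because check=True).
    Assumption: the CLI prints its result to stdout.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_kin(transcribe_cmd("audio.wav"))` would run the transcription command from above and return whatever it prints. Passing the arguments as a list rather than a single shell string avoids quoting problems when the text to speak contains spaces or quotes.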