vad

Here are 261 public repositories matching this topic...

modelscope / FunASR

Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.

Updated May 30, 2026
Python

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

voice-commands speech pytorch voice-recognition vad voice-control speech-processing voice-detection voice-activity-detection onnx onnxruntime onnx-runtime

Updated Mar 26, 2026
Python

smacke / ffsubsync

Automagically synchronize subtitles with video.

Updated May 24, 2026
Python

CheshireCC / faster-whisper-GUI

faster_whisper GUI with PySide6

openai vad whisper asr transcribe voice-transcription faster-whisper whisperx

Updated Dec 8, 2024
Python

TEN-framework / ten-vad

Voice Activity Detector (VAD): low-latency, high-performance, and lightweight

audio real-time voice-commands speech voice-recognition vad automatic-speech-recognition speech-processing conversational-ai voice-activity-detection voice-agent silero-vad

Updated Feb 2, 2026
C

FluidInference / FluidAudio

Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.

audio macos swift ios real-time avfoundation nvidia vad automatic-speech-recognition speech-to-text ane speaker-recognition asr speaker-diarization voice-activity-detection coreml speaker-identification speaker-embedding parakeet

Updated Jun 2, 2026
Swift

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

kotlin python c go csharp cpp speech-recognition vad asr voice-activity-detection

Updated Oct 20, 2025
C++

sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models

translation cross-platform dotnet dotnetcore speech-recognition vad speech-to-text voice-activity-detection

Updated Jun 1, 2026
C#

jtkim-kaist / VAD

Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.

data speech dnn lstm speech-recognition attention vad voice-detection voice-activity-detection bdnn acam speech-activity-detection

Updated Jun 9, 2021
MATLAB

amsehili / auditok

An audio/acoustic activity detection and audio segmentation tool

vad audio-data audio-activities audio-segmentation voice-detection voice-activity-detection

Updated May 14, 2026
Python

shashikg / WhisperS2T

An Optimized Speech-to-Text Pipeline for the Whisper Model Supporting Multiple Inference Engines

deep-learning speech-recognition vad speech-to-text whisper asr tensorrt voice-activity-detection tensorrt-llm

Updated Aug 27, 2024
Jupyter Notebook

FireRedTeam / FireRedASR2S

A SOTA Industrial-Grade All-in-One ASR system with ASR, VAD, LID, and Punc modules. FireRedASR2 supports Chinese (Mandarin, 20+ dialects/accents), English, code-switching, and both speech and singing ASR. FireRedVAD supports speech/singing/music in 100+ langs. FireRedLID supports 100+ langs and 20+ zh dialects. FireRedPunc supports zh and en.

open-source speech-recognition vad automatic-speech-recognition asr lid language-identification sota voice-activity-detection asr-pipeline punctuation-restoration audio-event-classification llm punctuation-prediction industrial-grade multimodal-llm speechllm audio-event-detection

Updated Jun 2, 2026
Python

DmitryRyumin / ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

Updated May 5, 2025
Python

gkonovalov / android-vad

Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.

Updated Jul 15, 2025
C

FireRedTeam / FireRedVAD

A SOTA industrial-grade voice activity detection and audio event detection system, supporting 100+ languages and outperforming Silero-VAD, TEN-VAD, FunASR-VAD, and WebRTC-VAD

vad voice-activity-detection aed sound-event-detection audio-event-classification audio-event-detection