3 releases
Version | Date |
---|---|
0.1.2 (new) | Apr 22, 2025 |
0.1.1 | Apr 22, 2025 |
0.1.0 | Apr 22, 2025 |
STT CLI (Speech-to-Text Command Line Interface)
A command-line tool for real-time speech-to-text transcription using AI providers (Groq and OpenAI).
Features
- Real-time audio capture from microphone
- Support for multiple transcription providers:
  - Groq (using whisper-large-v3)
  - OpenAI (using Whisper)
- Efficient audio processing with proper chunking
- Clean shutdown handling with Ctrl+C
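The chunking mentioned in the feature list can be illustrated with a minimal sketch. The README does not describe how stt-cli actually splits audio, so the helper names (`samples_per_chunk`, `chunk_samples`), the mono `f32` sample format, and the fixed-duration strategy below are all assumptions made for illustration only.

```rust
/// Hypothetical helper (not taken from stt-cli): number of samples that
/// make up `chunk_ms` milliseconds of mono audio at `sample_rate_hz`.
fn samples_per_chunk(sample_rate_hz: u32, chunk_ms: u32) -> usize {
    (sample_rate_hz as u64 * chunk_ms as u64 / 1000) as usize
}

/// Hypothetical helper: split captured samples into fixed-duration chunks.
/// The final chunk may be shorter than the others.
fn chunk_samples(samples: &[f32], sample_rate_hz: u32, chunk_ms: u32) -> Vec<&[f32]> {
    // Guard against a zero chunk size, which `chunks` would reject with a panic.
    let size = samples_per_chunk(sample_rate_hz, chunk_ms).max(1);
    samples.chunks(size).collect()
}
```

With 2.5 seconds of 16 kHz audio and a 1000 ms chunk length, `chunk_samples` yields two full chunks and one half-length chunk. Fixed-size chunks keep each API request bounded, though a real implementation might instead split on silence.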
Installation
- Make sure you have Rust installed on your system. If not, install it from rustup.rs
- Clone the repository:
  git clone https://github.com/TwistingTwists/stt-cli
  cd stt-cli
- Build the project:
  cargo build --release
Usage
The CLI supports different transcription providers through the -t or --transcription-provider flag:
# Using Groq
./target/release/stt-cli -t groq
# Using OpenAI
./target/release/stt-cli -t open-ai
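As a rough illustration of how the two flag values above could map onto a provider type, here is a standard-library-only sketch. The `Provider` enum and `provider_from_args` function are hypothetical names, and treating a missing flag as an error is an assumption; the README does not say how stt-cli parses its arguments or whether it has a default provider.

```rust
use std::str::FromStr;

/// Hypothetical provider type; the variants mirror the two documented flag values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    Groq,
    OpenAi,
}

impl FromStr for Provider {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "groq" => Ok(Provider::Groq),
            "open-ai" => Ok(Provider::OpenAi),
            other => Err(format!("unknown transcription provider: {other}")),
        }
    }
}

/// Hypothetical helper: find the value that follows -t or --transcription-provider.
/// Assumption: a missing flag is reported as an error rather than falling back to a default.
fn provider_from_args<I, S>(args: I) -> Result<Provider, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        let arg = arg.as_ref();
        if arg == "-t" || arg == "--transcription-provider" {
            let value = args
                .next()
                .ok_or_else(|| format!("{arg} requires a value"))?;
            return value.as_ref().parse::<Provider>();
        }
    }
    Err("missing -t/--transcription-provider".to_string())
}
```

In a binary this could be called as `provider_from_args(std::env::args())`. An argument-parsing crate would give the same behavior with generated help text; the manual loop is used here only to keep the sketch dependency-free.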
Environment Variables
Before running the application, make sure to set up the required API keys:
- For Groq:
  export GROQ_API_KEY='your-groq-api-key'
- For OpenAI:
  export OPENAI_API_KEY='your-openai-api-key'
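A tool like this would typically check for the key at startup and fail early with a clear message. The sketch below shows one way to do that with `std::env::var`; the function names `api_key_var` and `load_api_key` and the error wording are hypothetical, and only the two variable names come from the README.

```rust
use std::env;

/// Hypothetical helper: map a provider flag value to the environment
/// variable documented for it.
fn api_key_var(provider: &str) -> Option<&'static str> {
    match provider {
        "groq" => Some("GROQ_API_KEY"),
        "open-ai" => Some("OPENAI_API_KEY"),
        _ => None,
    }
}

/// Hypothetical helper: read the API key for `provider`, rejecting
/// unknown providers as well as unset or empty variables.
fn load_api_key(provider: &str) -> Result<String, String> {
    let var = api_key_var(provider)
        .ok_or_else(|| format!("unknown provider: {provider}"))?;
    match env::var(var) {
        Ok(key) if !key.trim().is_empty() => Ok(key),
        Ok(_) => Err(format!("{var} is set but empty")),
        Err(_) => Err(format!("{var} is not set; export it before running")),
    }
}
```

Checking for an empty value separately catches the common mistake of exporting the variable without pasting the key.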
Expected Output
When running the application, you'll see:
- Initialization messages for audio device setup
- Real-time transcription of your speech
- Status messages for audio processing and API requests
Example:
Initializing audio device...
Audio capture started. Speak into your microphone.
[Transcription] "Hello, this is a test of the speech to text system."
...
Press Ctrl+C to gracefully stop the application.
Contributing
Contributions are welcome! Please feel free to submit an issue.
License
This project is licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.