3 releases
0.1.2 | Oct 19, 2024 |
---|---|
0.1.1 | Oct 19, 2024 |
0.1.0 | Oct 17, 2024 |
Whisp
A lightweight desktop speech-to-text tool powered by modern models like OpenAI's Whisper. Whisp provides a simple interface for converting speech to text with minimal resource overhead.
Overview
Whisp offers an unobtrusive and customizable way to transcribe your voice into text. It operates as a globally available desktop application. Activate it via a hotkey, and it can automatically paste the transcribed text into any focused input field.
Design principles:
- Reliable: Built to be stable and to handle errors gracefully, with retry and recovery.
- Lightweight: Resource-efficient and simple, with minimal system impact.
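As an illustration of the "retry and recovery" principle, here is a minimal sketch of retrying a fallible operation with exponential backoff. It is not taken from Whisp's source: the function name `retry_with_backoff` and its parameters are hypothetical, and it uses only the Rust standard library.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not Whisp's actual code): runs `op` until it succeeds
/// or `max_attempts` is reached, doubling the delay after each failure.
/// `op` receives the 1-based attempt number.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut delay = initial_delay;
    let mut attempt = 1;
    loop {
        match op(attempt) {
            // Success: hand the value back immediately.
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            // Otherwise wait, double the delay, and try again.
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

A caller could wrap a network request in the closure, for example `retry_with_backoff(3, Duration::from_millis(200), |_| send_request())`, where `send_request` stands in for whatever fallible call is being protected.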
Installation
Currently, the only way to install Whisp is via cargo:
cargo install whisp
whisp
# if cargo bin is not in your path
~/.cargo/bin/whisp
Configuration
Configuration is managed through a whisp.toml file located in your system's configuration directory. The Whisp drop-down has an option to copy the configuration file path to the clipboard.
hotkey = "shift+super+Semicolon"
openai_key = "your-api-key"
language = "en"
model = "whisper-1"
restore_clipboard = true
auto_paste = false
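To make the `hotkey` format more concrete, the sketch below splits a string such as `shift+super+Semicolon` into modifiers and a single key. This is an assumption about how such a value could be interpreted, not Whisp's actual parser: the `Modifier` and `Hotkey` types, the `parse_hotkey` function, and the accepted modifier spellings are all hypothetical.

```rust
use std::collections::BTreeSet;

/// Modifier keys recognised by this sketch (hypothetical, not Whisp's types).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Modifier {
    Shift,
    Control,
    Alt,
    Super,
}

/// A parsed hotkey: a set of modifiers plus exactly one non-modifier key.
#[derive(Debug, PartialEq, Eq)]
pub struct Hotkey {
    pub modifiers: BTreeSet<Modifier>,
    pub key: String,
}

/// Parses strings like "shift+super+Semicolon".
/// The accepted modifier names below are assumptions made for illustration.
pub fn parse_hotkey(spec: &str) -> Result<Hotkey, String> {
    let mut modifiers = BTreeSet::new();
    let mut key: Option<String> = None;

    for part in spec.split('+') {
        let part = part.trim();
        if part.is_empty() {
            return Err(format!("empty segment in hotkey {spec:?}"));
        }
        // Modifier names are matched case-insensitively.
        let modifier = match part.to_ascii_lowercase().as_str() {
            "shift" => Some(Modifier::Shift),
            "ctrl" | "control" => Some(Modifier::Control),
            "alt" | "option" => Some(Modifier::Alt),
            "super" | "cmd" | "meta" => Some(Modifier::Super),
            _ => None,
        };
        match modifier {
            Some(m) => {
                modifiers.insert(m);
            }
            None => {
                // Anything that is not a modifier is treated as the key itself.
                if let Some(existing) = &key {
                    return Err(format!(
                        "more than one non-modifier key: {existing:?} and {part:?}"
                    ));
                }
                key = Some(part.to_string());
            }
        }
    }

    let key = key.ok_or_else(|| format!("no non-modifier key in hotkey {spec:?}"))?;
    Ok(Hotkey { modifiers, key })
}
```

Under these assumptions, `parse_hotkey("shift+super+Semicolon")` yields the modifiers `Shift` and `Super` with the key `Semicolon`, while a value with no non-modifier key, such as `shift+super`, is rejected with an error.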
Usage
To start using Whisp, define your preferred hotkey, configure the model, and run the application. You can then trigger voice recording via the hotkey and receive transcriptions automatically.
Common Use Cases
- Messaging: Quickly respond to messages in chat applications like Discord or Slack.
- Document Writing: Speak freely to draft large amounts of text quickly, then refine the text yourself or with the help of a language model.
- Code Commenting: Dictate comments directly into your editor. Note that this tool does not write code well, though that may change once automatic post-processing is added. Reach out if you are interested in contributing.
License
Whisp is licensed under the MIT license.