This is a speech-to-text application for WhatsApp that uses Whisper and whatsapp-web.js, running on Docker.
Once authenticated on WhatsApp Web, the worker uses Whisper to transcribe every voice message that you reply to with the command !tran. Currently, it is configured to transcribe only messages from contacts saved in your contact book.
Originally, the program used Google Cloud Speech, but it now uses Whisper, which is a lightweight, open-source speech recognition engine.
If you do not want to host the model directly on your computer, you can use the main_openai_api branch, which uses the OpenAI API to transcribe the audio.
If you want to contribute, just send a pull request.
Just reply to the voice message you want to transcribe with !tran.
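The trigger rule described above (a !tran reply to a voice message from a saved contact) could be sketched roughly as follows. This is an illustration, not the project's actual code: `shouldTranscribe` and `COMMAND` are hypothetical names, and the sketch assumes the contact being checked is the sender of the voice message. The properties `body`, `hasQuotedMsg`, `hasMedia`, `type`, and `isMyContact` are the ones exposed by whatsapp-web.js Message and Contact objects.

```javascript
// Hypothetical helper, not taken from node/index.js.
// `message` is the incoming reply, `quoted` is the message it replies to,
// and `contact` is the contact to check (assumed: sender of the voice message).
const COMMAND = '!tran';

function shouldTranscribe(message, quoted, contact) {
  // The command must be the whole message body (ignoring whitespace and case).
  const isCommand =
    typeof message.body === 'string' &&
    message.body.trim().toLowerCase() === COMMAND;

  // The command only makes sense as a reply to another message.
  const isReply = Boolean(message.hasQuotedMsg) && Boolean(quoted);

  // whatsapp-web.js reports voice notes as type 'ptt' and audio files as 'audio'.
  const isVoice =
    isReply &&
    Boolean(quoted.hasMedia) &&
    (quoted.type === 'ptt' || quoted.type === 'audio');

  // Only contacts saved in the contact book are served.
  const isSaved = Boolean(contact && contact.isMyContact);

  return Boolean(isCommand && isVoice && isSaved);
}
```

Keeping the check in one pure function makes it easy to test without a live WhatsApp session, since plain objects can stand in for the library's classes.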
docker-compose build
docker-compose up
(Do not detach; the QR code will be displayed in the terminal.)
GPU access for the container is configured in the docker-compose.yml file. The default values are:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
You can set up the message header for the automatic response through responseMsgHeader and responseMsgHeaderError inside node/index.js.
The quoted voice message is retrieved with fetchMessages() from whatsapp-web.js; the function that handles this is called downloadQuotedMedia().
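As a rough illustration of how these pieces might fit together, the sketch below looks up a quoted message through `chat.fetchMessages()` and prefixes a reply with one of the two headers. It is not the project's implementation of downloadQuotedMedia(): the function names `downloadQuotedMediaSketch` and `buildResponse`, the header strings, and the `limit` default of 50 are all assumptions made for the example. Only `fetchMessages({ limit })`, `id._serialized`, `hasMedia`, and `downloadMedia()` come from the whatsapp-web.js API.

```javascript
// Hypothetical header values; the real defaults live in node/index.js.
const responseMsgHeader = 'Transcription:';
const responseMsgHeaderError = 'Transcription failed:';

// Sketch: find the quoted message in the recent chat history and download its media.
// `chat` is a whatsapp-web.js Chat; `quotedId` is the quoted message's id._serialized.
// Returns the downloaded media, or null if the message is not found or has no media.
async function downloadQuotedMediaSketch(chat, quotedId, limit = 50) {
  const messages = await chat.fetchMessages({ limit });
  const quoted = messages.find((m) => m.id._serialized === quotedId);
  if (!quoted || !quoted.hasMedia) {
    return null;
  }
  return quoted.downloadMedia();
}

// Sketch: prefix the reply text with the success or error header.
function buildResponse(text, failed = false) {
  const header = failed ? responseMsgHeaderError : responseMsgHeader;
  return `${header}\n${text}`;
}
```

Note that a lookup like this only sees the most recent `limit` messages, so a reply to a much older voice message would return null unless the limit is raised.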