This project is a Streamlit-based application that allows users to download audio from YouTube videos, transcribe them using OpenAI's Whisper model, and display the transcriptions with pagination.
Check out the demo of the application: OpenAI Whisper Transcribe YouTube Videos
Clone this repository:
git clone https://github.com/RiteshGenAI/openai_whisper_transcribe_yt_videos.git
cd openai_whisper_transcribe_yt_videos
Install the required packages:
pip install -r requirements.txt
Install FFmpeg if it's not already on your system. Installation methods vary by operating system.
Run the Streamlit app:
streamlit run ./src/app.py
Enter a YouTube video URL in the provided input field.
The app will download the audio, transcribe it, and display the transcription with pagination.
Download Audio: The download_audio function uses yt-dlp to download the audio from the provided YouTube URL and saves it as a WAV file.
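As a rough illustration, the download step might look like the sketch below. It uses yt-dlp's documented Python API with an FFmpeg post-processor to produce a WAV file; the exact option values and the output_path default are assumptions, not the project's actual code.

```python
def download_audio(url, output_path="audio"):
    """Download a YouTube video's audio track and convert it to WAV.

    Minimal sketch of a download_audio-style helper; option names follow
    yt-dlp's Python API, but this project's real options may differ.
    """
    import yt_dlp  # imported lazily so the sketch stays self-contained

    ydl_opts = {
        "format": "bestaudio/best",           # best audio-only stream
        "outtmpl": f"{output_path}.%(ext)s",  # output filename template
        "postprocessors": [{
            "key": "FFmpegExtractAudio",      # requires FFmpeg on the PATH
            "preferredcodec": "wav",
        }],
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        ydl.download([url])
    return f"{output_path}.wav"
```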
Transcribe Audio: The transcribe_audio function uses OpenAI's Whisper model to transcribe the downloaded audio file.
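The transcription step could be as small as the sketch below, assuming the open-source whisper package (openai-whisper on PyPI); the default model name here is an assumption.

```python
def transcribe_audio(audio_path, model_name="base"):
    """Transcribe an audio file with OpenAI's Whisper.

    Sketch only: model_name can be "tiny", "base", "small", "medium",
    or "large"; larger models are slower but more accurate.
    """
    import whisper  # lazy import: only needed when transcription runs

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)
    return result["text"]
```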
Display Transcript: The display_transcript_with_pagination function splits the transcript into pages and displays them using Streamlit's UI components.
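A possible shape for that pagination logic is sketched below. The word-based splitting helper is plain Python; the Streamlit widget choices (a page number input plus st.write) are assumptions about the actual UI.

```python
def split_into_pages(transcript, tokens_per_page=500):
    """Split a transcript into pages of roughly tokens_per_page words."""
    tokens = transcript.split()
    return [
        " ".join(tokens[i:i + tokens_per_page])
        for i in range(0, len(tokens), tokens_per_page)
    ]

def display_transcript_with_pagination(transcript, tokens_per_page=500):
    """Render one page at a time; a sketch of the Streamlit side."""
    import streamlit as st

    pages = split_into_pages(transcript, tokens_per_page)
    page = st.number_input("Page", min_value=1,
                           max_value=max(len(pages), 1), value=1)
    st.write(pages[page - 1] if pages else "")
```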
Process Audio: The process_audio function orchestrates the entire pipeline, from downloading to transcribing and displaying the result.
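The orchestration amounts to chaining the three steps. The sketch below passes them in as callables so each stage can be swapped or tested in isolation; the real process_audio presumably calls download_audio, transcribe_audio, and display_transcript_with_pagination directly.

```python
def process_audio(url, downloader, transcriber, displayer):
    """Run download -> transcribe -> display for one YouTube URL."""
    audio_path = downloader(url)          # 1. fetch the audio as a WAV file
    transcript = transcriber(audio_path)  # 2. run Whisper on the WAV
    displayer(transcript)                 # 3. paginate the text in the UI
    return transcript
```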
To use a different Whisper model, change the model_name parameter in the transcribe_audio function.
Adjust the tokens_per_page parameter in display_transcript_with_pagination to change the amount of text displayed per page.
This application requires significant computational resources, especially for longer videos. A CUDA-enabled GPU can speed up transcription considerably.
MIT License