Vocalize is a pronunciation trainer made for language learners.
Vocalize is an application that provides pronunciation training for language learners. The user selects the language they would like to practice, either English or Spanish, and is then presented with practice words. The user can record their pronunciation of a word and submit it for comparison against the average pronunciation of that word. The user's pronunciation is then graphed against the average pronunciation.
The average pronunciation of each word is created by feeding YouTube videos into a custom audio processing algorithm. We first scrape audiobooks from YouTube and submit them to IBM Watson's Speech to Text API. We then use FFmpeg to create an audio file for each word in the audiobook. When a word appears multiple times, we average the word instances together using a custom Python module built on top of SciPy. We narrow the scope of our data by processing only the 1,000 most common words of each language. Once an average pronunciation has been created for a word, it is stored in Amazon S3.
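The averaging step described above can be sketched roughly as follows. This is only an illustration, not the project's actual module: the function name `average_instances`, the choice of the median length as the common length, and the use of `scipy.signal.resample` to align recordings are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import resample


def average_instances(instances, target_len=None):
    """Average several mono recordings of one word into a single waveform.

    `instances` is a list of 1-D sample arrays (hypothetical input format).
    """
    if not instances:
        raise ValueError("need at least one recording")
    # Assumption: use the median length as the common length unless one is given.
    if target_len is None:
        target_len = int(np.median([len(x) for x in instances]))
    # Stretch or compress every recording to the common length.
    aligned = [
        resample(np.asarray(x, dtype=np.float64), target_len)
        for x in instances
    ]
    # Sample-wise mean across all recordings.
    return np.mean(aligned, axis=0)
```

In practice the sample arrays would probably be loaded from the per-word WAV files (for example with `scipy.io.wavfile.read`) before being passed in. A plain sample-wise mean is the simplest possible choice; the real module may align or weight the recordings differently.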
Front End: React.js, React Native, Redux, D3.js
Back End: Node.js, Express, MongoDB, Amazon S3
Audio Processing: Python, SciPy, IBM Watson, FFmpeg
Testing: Chai, Mocha, pytest
Build Tools: Gulp, Browserify, Webpack
Deployment: DigitalOcean
brew install youtube-dl
npm install
gulp build
node server.js
In the data scraping directory you will find Node.js files that scrape YouTube videos (audiobooks) for WAV files of individual words.
npm install
node index.js scrape <youtube id> <language>
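Cutting a single word out of an audiobook with FFmpeg might look something like the sketch below. It is written in Python for illustration only (the scraper itself is Node.js), and it assumes that start and end times for each word are already known, for example from the speech recognition step. The function name `ffmpeg_cut_command`, the `words` output directory, and the file naming scheme are hypothetical.

```python
def ffmpeg_cut_command(source, word, start, end, out_dir="words"):
    """Build the FFmpeg argument list that extracts one word from `source`.

    `start` and `end` are times in seconds (assumed to be known already).
    """
    if end <= start:
        raise ValueError("end must be after start")
    # Hypothetical naming scheme: <word>_<start time>.wav
    out_path = f"{out_dir}/{word}_{start:.2f}.wav"
    return [
        "ffmpeg", "-y",               # overwrite the output file if it exists
        "-ss", f"{start:.3f}",        # seek to the start of the word
        "-i", source,                 # input audio file
        "-t", f"{end - start:.3f}",   # keep only the word's duration
        out_path,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to perform the cut, provided FFmpeg is installed and the output directory exists.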
There is also a script, average.sh, that runs the Python scripts to average the words and writes the results to an 'averaged' folder.