The project consists of two parts: a voice bot and a RESTful server for interacting with it.
To run the bot locally, run python3 bot.py (or run_bot.sh) and select the desired mode of operation in the proposed menu (more details here).
To start the RESTful server, which provides an interface for interacting with the voice bot modules, run python3 rest_server.py (or run_rest_server.sh) (more details here).
To build a Docker image based on the RESTful server, run sudo docker build -t voice_chatbot:0.1 . (more details here).
ATTENTION! This was my graduation project, so the architecture and code here are not very good, I understand this, and as soon as I have time, I will update everything.
A complete list of all dependencies required for operation:
Download Voice_ChatBot_data.zip (3 GB) from Google Drive and unpack it into the root of the project (the data and install_files folders). If you are using Ubuntu 16.04 or higher, you can use install_packages.sh (tested on Ubuntu 16.04 and 18.04) to install all packages. By default, TensorFlow for CPU is installed. If you have an NVIDIA graphics card with the official driver version 410 installed, you can install TensorFlow for GPU. To do this, pass the gpu parameter when running install_packages.sh. For example:
./install_packages.sh gpu
In this case, two archives will be downloaded from my Google Drive:
1. Install_CUDA10.0_cuDNN_for410.zip (2.0 GB) with CUDA 10.0 and cuDNN 7.5.0 (only if the gpu parameter was passed). The installation is performed automatically, but if something goes wrong, there is an Install.txt instruction in the downloaded archive.
2. Voice_ChatBot_data.zip (3 GB) with training data and ready-made models. It will be automatically unpacked into the data and install_files folders in the project root.
If you cannot or do not want to use the script to install all the required packages, you must manually install RHVoice and CMUclmtk_v0.7 using the instructions in install_files/Install RHVoice.txt and install_files/Install CMUclmtk.txt. You also need to copy the language model, acoustic model and dictionary files for PocketSphinx from temp/ to /usr/local/lib/python3.6/dist-packages/pocketsphinx/model (your path to python3.6 may be different). The language model prepared_questions_plays_ru.lm and the dictionary prepared_questions_plays_ru.dic must be renamed to ru_bot_plays_ru.lm and ru_bot_plays_ru.dic (or change their names in speech_to_text.py if you have your own language model and dictionary).
The bot is based on a recurrent neural network, the AttentionSeq2Seq model. In the current implementation, it consists of 2 bidirectional LSTM cells in the encoder, an attention layer, and 2 LSTM cells in the decoder. Using an attention model allows a "soft" correspondence to be established between the input and output sequences, which improves quality and performance. The input dimension in the latest configuration is 500 and the sequence length is 26 (i.e. the maximum length of sentences in the training set). Words are converted into vectors using the word2vec encoder (with a dictionary of 445,000 words) from the gensim library. The seq2seq model is implemented using Keras and RecurrentShop. The trained seq2seq model (whose weights are located in data/plays_ru/model_weights_plays_ru.h5) with the parameters specified in the source files has an accuracy of 99.19% (i.e. the bot will answer 1577 out of 1601 questions correctly).
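For reference, building a comparable model with the seq2seq package (which is based on Keras and RecurrentShop) could look roughly like the sketch below; the hidden layer size, loss and optimizer here are assumptions, and the exact configuration used in text_to_text.py may differ:
from seq2seq.models import AttentionSeq2Seq

model = AttentionSeq2Seq(input_dim=500,       # size of the word2vec word vectors
                         input_length=26,     # maximum sentence length in the training set
                         hidden_dim=500,      # size of the LSTM state (assumed)
                         output_length=26,
                         output_dim=500,
                         depth=(2, 2),        # 2 LSTM cells in the encoder and 2 in the decoder
                         bidirectional=True)  # bidirectional encoder
model.compile(loss='mse', optimizer='adam')   # assumed training settings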
At the moment, there are 3 data sets for training the bot: 1601 question-answer pairs from various plays (data/plays_ru), 136,000 pairs from various works (data/conversations_ru, thanks to NLP Datasets) and 2,500,000 pairs from subtitles for 347 TV series (data/subtitles_ru, more details in Russian subtitles dataset). The word2vec models are trained on all data sets, but the neural network is trained only on the plays data set.
Training the word2vec model and the neural network on the plays data set without changing parameters takes approximately 7.5 hours on an NVIDIA GTX 1070 and an Intel Core i7. Training on the data sets from works and subtitles on this hardware will take at least several days.
The bot can work in several modes, which are described below.
The training set consists of 1600 question %% answer pairs taken from various Russian plays. It is stored in the file data/plays_ru/plays_ru.txt. Each question %% answer pair is written on a new line, i.e. there is only one pair per line.
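For illustration, a few lines of such a file could look like this (the pairs below are made up and are not taken from the real data set):
Что случилось? %% Ничего страшного
Ты уезжаешь? %% Да, уже пора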
All stages necessary for training are performed by the prepare() or load_prepared() and train() methods of the TextToText class from the text_to_text.py module, and by the build_language_model() method of the LanguageModel class from the preparing_speech_to_text.py module. Or you can simply use the train() function of the bot.py module.
To run the bot in training mode, run bot.py with the train parameter. For example, like this:
python3 bot.py train
Or you can simply run bot.py (or run_bot.sh) and select mode 1 and 1 in the proposed menu.
The learning process consists of several stages:
1. Preparation of the training sample.
To prepare the training sample, the source_to_prepared.py module, consisting of the SourceToPrepared class, is used. This class reads a training set from a file, separates questions and answers, removes unsupported characters and punctuation, and converts the resulting questions and answers into fixed-size sequences (using <PAD> filler words). It also prepares questions for the network and processes its responses. For example:
Input: "Зачем нужен этот класс? %% Для подготовки данных"
Output: [['<PAD>', ..., '<PAD>', '?', 'класс', 'этот', 'нужен', 'Зачем', '<GO>'], ['Для', 'подготовки', 'данных', '<EOS>', '<PAD>', ..., '<PAD>']]
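For clarity, the transformation from this example can be sketched roughly as follows (an illustrative reimplementation based only on the example above; the real logic, token handling and length limits live in source_to_prepared.py):
def prepare_question(question, max_len=26):
    # The question is tokenized, reversed, left-padded with <PAD> and terminated with <GO>
    tokens = list(reversed(question.replace('?', ' ?').split()))
    return ['<PAD>'] * (max_len - len(tokens) - 1) + tokens + ['<GO>']

def prepare_answer(answer, max_len=26):
    # The answer keeps its word order, gets an <EOS> marker and is right-padded with <PAD>
    tokens = answer.split()
    return tokens + ['<EOS>'] + ['<PAD>'] * (max_len - len(tokens) - 1)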
The training sample is read from the file data/plays_ru/plays_ru.txt, and the converted [question, answer] pairs are saved to the file data/plays_ru/prepared_plays_ru.pkl. A histogram of the sizes of questions and answers is also built and saved to data/plays_ru/histogram_of_sizes_sentences_plays_ru.png.
To prepare a training sample from the data set based on plays, simply pass the name of the corresponding file to the prepare_all() method. To prepare a training sample from the data set based on works or subtitles, you must first call combine_conversations() or combine_subtitles() and then call prepare_all().
2. Translation of words into real vectors.
The word_to_vec.py module, consisting of the WordToVec class, is responsible for this stage. This class encodes fixed-size sequences (i.e. our questions and answers) into real-valued vectors using the word2vec encoder from the gensim library. The class implements methods for encoding all [question, answer] pairs from the training set into vectors at once, as well as for encoding a question to the network and decoding its answer. For example:
Input: [['<PAD>', ..., '<PAD>', '?', 'класс', 'этот', 'нужен', 'Зачем', '<GO>'], ['Для', 'кодирования', 'предложений', '<EOS>', '<PAD>', ..., '<PAD>']]
Output: [[[0.43271607, 0.52814275, 0.6504923, ...], [0.43271607, 0.52814275, 0.6504923, ...], ...], [[0.5464854, 1.01612, 0.15063584, ...], [0.88263285, 0.62758327, 0.6659863, ...], ...]]
(i.e. each word is encoded as a vector of length 500; this value can be changed via the size argument of the build_word2vec() method)
The [question, answer] pairs are read from the file data/plays_ru/prepared_plays_ru.pkl (which was obtained at the previous stage; to expand and improve the quality of the model, it is recommended to additionally pass the preprocessed data set from subtitles, data/subtitles_ru/prepared_subtitles_ru.pkl, to the build_word2vec() method), and the encoded pairs are saved to the file data/plays_ru/encoded_plays_ru.npz. During this process, a list of all used words, i.e. a dictionary, is also built and saved to the file data/plays_ru/w2v_vocabulary_plays_ru.txt. The trained word2vec model is saved in data/plays_ru/w2v_model_plays_ru.bin.
To translate the words of the training set into vectors, just pass the name of the corresponding file to the build_word2vec() method and set the desired parameters.
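As a rough illustration of what happens inside build_word2vec(), training a word2vec model with gensim could look like the sketch below (it assumes gensim 3.x, where the vector size argument is called size; the window, min_count and workers values are assumptions):
from gensim.models import Word2Vec

# sentences - the prepared fixed-size sequences, i.e. tokenized questions and answers
sentences = [['<PAD>', '?', 'класс', 'этот', 'нужен', 'Зачем', '<GO>'],
             ['Для', 'подготовки', 'данных', '<EOS>', '<PAD>']]
w2v_model = Word2Vec(sentences, size=500, window=5, min_count=1, workers=4)
w2v_model.wv.save_word2vec_format('data/plays_ru/w2v_model_plays_ru.bin', binary=True)

vector = w2v_model.wv['класс']  # a 500-dimensional vector for a single word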
3. Network training.
At this stage, the seq2seq model is trained on the previously prepared data. The text_to_text.py module, consisting of the TextToText class, is responsible for this. This class trains the network, saves the network model and weights, and allows you to conveniently interact with the trained model.
Training requires the file data/plays_ru/encoded_plays_ru.npz with the [question, answer] pairs encoded into vectors, which was obtained at the previous stage. During training, after every 5th epoch (this value can be changed), the latest intermediate result of network training is saved to the file data/plays_ru/model_weights_plays_ru_[iteration_number].h5, and at the last iteration to the file data/plays_ru/model_weights_plays_ru.h5 (an iteration is one network training cycle of a certain number of epochs, after which the weights are saved to a file and you can, for example, evaluate the accuracy of the network or display other parameters; by default, the number of epochs is 5 and the total number of iterations is 200). The network model is saved in the file data/plays_ru/model_plays_ru.json.
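Schematically, the iteration scheme described above could look roughly like the sketch below (illustrative only: the array names inside the .npz file and the batch size are assumptions, and the real training loop lives in text_to_text.py):
import numpy as np
from seq2seq.models import AttentionSeq2Seq

# Model comparable to the one described above (see the earlier sketch)
model = AttentionSeq2Seq(input_dim=500, input_length=26, hidden_dim=500,
                         output_length=26, output_dim=500, depth=(2, 2))
model.compile(loss='mse', optimizer='adam')

# Questions and answers encoded into vectors at the previous stage
training_data = np.load('data/plays_ru/encoded_plays_ru.npz')
x, y = training_data['x'], training_data['y']  # the array names inside the .npz are assumed

for iteration in range(200):                   # 200 iterations by default
    model.fit(x, y, batch_size=32, epochs=5)   # 5 epochs per iteration by default
    model.save_weights('data/plays_ru/model_weights_plays_ru_%i.h5' % iteration)

model.save_weights('data/plays_ru/model_weights_plays_ru.h5')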
After training the network, the quality of training is assessed by feeding all questions to the input of the trained network and comparing the network's answers with the reference answers from the training set. If the accuracy of the evaluated model is higher than 75%, the incorrect answers of the network are saved to the file data/plays_ru/wrong_answers_plays_ru.txt (so that they can be analyzed later).
To train the network, just pass the name of the corresponding file to the train() method and set the desired parameters.
4. Building a language model and dictionary for PocketSphinx.
This stage is needed if speech recognition will be used. At this stage, a static language model and a phonetic dictionary for PocketSphinx are created based on the questions from the training set (caution: the more questions in the training set, the longer it will take PocketSphinx to recognize speech). This is done by the build_language_model() method (which calls text2wfreq, wfreq2vocab, text2idngram and idngram2lm from CMUclmtk_v0.7) of the LanguageModel class from the preparing_speech_to_text.py module. This method uses the questions from the file with the original training set (before they are prepared by the source_to_prepared.py module), saves the language model to the file temp/prepared_questions_plays_ru.lm, and the dictionary to temp/prepared_questions_plays_ru.dic (plays_ru may change depending on which training set was used). At the end, the language model and dictionary are copied to /usr/local/lib/python3.x/dist-packages/pocketsphinx/model with the names ru_bot_plays_ru.lm and ru_bot_plays_ru.dic (plays_ru can change in the same way as in the previous step; you will need to enter the root user password).
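For reference, a typical CMUclmtk pipeline for building such a language model is sketched below via subprocess (illustration only: the intermediate file names are hypothetical, the phonetic .dic dictionary is built separately and is not shown, and the exact options used by build_language_model() may differ):
import subprocess

questions = 'temp/questions.txt'           # plain text file with one question per line (hypothetical name)
vocab = 'temp/questions.vocab'
idngram = 'temp/questions.idngram'
lm = 'temp/prepared_questions_plays_ru.lm'

# Count word frequencies and build a vocabulary
subprocess.run('text2wfreq < %s | wfreq2vocab > %s' % (questions, vocab), shell=True, check=True)
# Convert the text into id n-grams using that vocabulary
subprocess.run('text2idngram -vocab %s -idngram %s < %s' % (vocab, idngram, questions), shell=True, check=True)
# Build the ARPA-format language model
subprocess.run('idngram2lm -vocab_type 0 -idngram %s -vocab %s -arpa %s' % (idngram, vocab, lm), shell=True, check=True)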
The predict() function of the bot.py module (a wrapper over the predict() method of the TextToText class from the text_to_text.py module) is intended for interacting with the trained seq2seq model. This function supports several operating modes. In text mode, i.e. when the user enters a question from the keyboard and the network responds with text, only the predict() method of the TextToText class from the text_to_text.py module is used. This method accepts a string with a question to the network and returns a string with the network's response. To work, it needs: the file data/plays_ru/w2v_model_plays_ru.bin with the trained word2vec model, the file data/plays_ru/model_plays_ru.json with the parameters of the network model, and the file data/plays_ru/model_weights_plays_ru.h5 with the weights of the trained network.
To run the bot in this mode, run bot.py with the predict parameter. For example, like this:
python3 bot.py predict
You can also simply run bot.py (or run_bot.sh) and select mode 2 and 1 in the proposed menu.
This mode differs from the previous one in that the parameter speech_synthesis = True is passed to the predict() function of the bot.py module. This means that interaction with the network will proceed in the same way as in mode 2, but the network's response will additionally be voiced.
Voicing of answers, i.e. speech synthesis, is implemented in the get() method of the TextToSpeech class from the text_to_speech.py module. This class requires RHVoice-client to be installed and passes it the necessary parameters for speech synthesis via command line arguments (installation of RHVoice and examples of using RHVoice-client are described in install_files/Install RHVoice.txt). The get() method takes as input the string that needs to be converted to speech and, if required, the name of a .wav file in which the synthesized speech will be saved (with a sampling rate of 32 kHz, a depth of 16 bits, mono; if not specified, the speech is played immediately after synthesis). When creating an object of the TextToSpeech class, you can specify the name of the voice to use. 4 voices are supported: the male voice Aleksandr and three female voices, Anna, Elena and Irina (more details in the RHVoice Wiki).
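Based on the description above, using the class directly could look roughly like this (a purely illustrative sketch: the constructor and method signatures are defined in text_to_speech.py and the argument forms here are assumptions):
from text_to_speech import TextToSpeech

tts = TextToSpeech('anna')                   # name of the voice to use (assumed argument form)
tts.get('который час', 'temp/answer.wav')    # synthesize the string and save it to a .wav file
tts.get('который час')                       # without a file name the speech is played right away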
To run the bot in this mode, run bot.py with the predict -ss parameters. For example, like this:
python3 bot.py predict -ss
You can also simply run bot.py (or run_bot.sh) and select mode 3 and 1 in the proposed menu.
To work in this mode, you need to pass the parameter speech_recognition = True to the predict() function of the bot.py module. This means that interaction with the network, or rather the entering of questions, will be carried out by voice.
Speech recognition is implemented in the get() method of the SpeechToText class of the speech_to_text.py module. This class uses PocketSphinx and the language model with the dictionary (ru_bot_plays_ru.lm and ru_bot_plays_ru.dic) that were built in network training mode. The get() method can work in two modes: from_file, speech recognition from a .wav or .opus file with a sampling rate >=16 kHz, 16 bit, mono (the file name is passed as a function argument), and from_microphone, speech recognition from a microphone. The operating mode is set when creating an instance of the SpeechRecognition class, because loading the language model takes some time (the larger the model, the longer it takes to load).
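Based on the description above, usage could be sketched roughly as follows (purely illustrative: the class and argument names follow the text above and may differ from what speech_to_text.py actually defines):
from speech_to_text import SpeechToText

stt = SpeechToText('from_file')         # the operating mode is set when the object is created (assumed argument form)
text = stt.get('temp/question.wav')     # recognition from a .wav/.opus file (>=16 kHz, 16 bit, mono)

stt_mic = SpeechToText('from_microphone')
text = stt_mic.get()                    # recognition from the microphone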
To run the bot in this mode, run bot.py with the parameters predict -sr. For example, like this:
python3 bot.py predict -sr
You can also simply run bot.py (or run_bot.sh) and select mode 4 and 1 in the proposed menu.
This is a combination of modes 3 and 4.
To work in this mode, you need to pass the parameters speech_recognition = True and speech_synthesis = True to the predict() function of the bot.py module. This means that questions will be entered by voice and the network's responses will be spoken aloud. A description of the modules used can be found in the description of modes 3 and 4.
To run the bot in this mode, run bot.py with the parameters predict -ss -sr. For example, like this:
python3 bot.py predict -sr -ss
or
python3 bot.py predict -ss -sr
You can also simply run bot.py (or run_bot.sh) and select mode 5 and 1 in the proposed menu.
This server provides a REST API for interacting with the bot. When the server starts, a neural network trained on the data set from plays is loaded. Data sets from works and subtitles are not supported yet.
The server is implemented with Flask, and the multi-threaded mode (the production version) with gevent.pywsgi.WSGIServer. The server also has a limit on the size of received data in the request body equal to 16 MB. The implementation is in the rest_server.py module.
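For reference, a minimal sketch of such a setup is shown below (the actual routes, authorization and configuration live in rest_server.py; the example route is only an illustration):
from flask import Flask, jsonify
from gevent.pywsgi import WSGIServer

app = Flask(__name__)
app.config['MAX_CONTENT_LENGTH'] = 16 * 1024 * 1024   # 16 MB limit on the request body

@app.route('/chatbot/about', methods=['GET'])
def about():
    return jsonify({'text': 'Информация о проекте.'})

# Production mode: gevent.pywsgi.WSGIServer instead of Flask's built-in test server
http_server = WSGIServer(('0.0.0.0', 5000), app)
http_server.serve_forever()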
You can start the WSGI server by running run_rest_server.sh (it starts the WSGI server on 0.0.0.0:5000).
The server supports command line arguments that make starting it a little easier. The arguments have the following structure: [key(s)] [address:port].
Possible keys:
-d - launch a test Flask server (if the key is not specified, the WSGI server will be launched)
-s - launch the server with https support (uses a self-signed certificate obtained with openssl)
Valid address:port options:
host:port - launch on the specified host and port
localaddr:port - launch with auto-detection of the machine address on the local network and the specified port
host:0 or localaddr:0 - if port = 0, any available port will be selected automatically
List of possible combinations of command line arguments and their description:
Without arguments - launch the WSGI server with auto-detection of the machine address on the local network and port 5000. For example: python3 rest_server.py
host:port - launch the WSGI server on the specified host and port. For example: python3 rest_server.py 192.168.2.102:5000
-d - launch a test Flask server on 127.0.0.1:5000. For example: python3 rest_server.py -d
-d host:port - launch a test Flask server on the specified host and port. For example: python3 rest_server.py -d 192.168.2.102:5000
-d localaddr:port - launch a test Flask server with auto-detection of the machine address on the local network and the specified port. For example: python3 rest_server.py -d localaddr:5000
-s - launch the WSGI server with https support, auto-detection of the machine address on the local network and port 5000. For example: python3 rest_server.py -s
-s host:port - launch the WSGI server with https support on the specified host and port. For example: python3 rest_server.py -s 192.168.2.102:5000
-s -d - launch a test Flask server with https support on 127.0.0.1:5000. For example: python3 rest_server.py -s -d
-s -d host:port - launch a test Flask server with https support on the specified host and port. For example: python3 rest_server.py -s -d 192.168.2.102:5000
-s -d localaddr:port - launch a test Flask server with https support, auto-detection of the machine address on the local network and the specified port. For example: python3 rest_server.py -s -d localaddr:5000
The server can choose an available port itself; to do this, specify port 0 in host:port or localaddr:port (for example: python3 rest_server.py -d localaddr:0).
A total of 5 requests are supported:
/chatbot/about - returns information about the project
/chatbot/questions - returns a list of all supported questions
/chatbot/speech-to-text - accepts a .wav/.opus file and returns the recognized string
/chatbot/text-to-speech - accepts a string and returns a .wav file with synthesized speech
/chatbot/text-to-text - accepts a string and returns the bot's response as a string
1. The server has basic HTTP authorization, i.e. to gain access to the server you need to add to each request a header containing login:password encoded with base64 (login: bot, password: test_bot). Example in Python:
import requests
import base64
auth = base64.b64encode('testbot:test'.encode())
headers = {'Authorization' : "Basic " + auth.decode()}
It will look like this:
Authorization: Basic dGVzdGJvdDp0ZXN0
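Putting the pieces together, a complete request to one of the endpoints could look like this (using the server address 192.168.2.83:5000 from the examples below):
import requests
import base64

auth = base64.b64encode('testbot:test'.encode())
headers = {'Authorization' : "Basic " + auth.decode()}
r = requests.get('http://192.168.2.83:5000/chatbot/about', headers=headers)
print(r.json().get('text'))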
2. In the speech recognition request (which is number 3), the server expects a .wav or .opus file (>=16 kHz, 16 bit, mono) with recorded speech, which is also transmitted in JSON using base64 encoding (i.e. the .wav/.opus file is opened, read into a byte array, then encoded with base64, and the resulting byte array is decoded from its byte form into a utf-8 string and placed in the JSON). In Python it looks like this:
# Forming the request
import requests
import base64

addr = '192.168.2.83:5000'  # server address from the examples below
auth = base64.b64encode('testbot:test'.encode())
headers = {'Authorization' : "Basic " + auth.decode()}
with open('test.wav', 'rb') as audio:
    data = audio.read()
data = base64.b64encode(data)
data = {'wav' : data.decode()}
# Sending the request to the server
r = requests.post('http://' + addr + '/chatbot/speech-to-text', headers=headers, json=data)
# Parsing the response
data = r.json()
data = data.get('text')
print(data)
3. In the speech synthesis request (which is number 4), the server sends a JSON response with a .wav file (16 bit, 32 kHz, mono) containing the synthesized speech, encoded as described above (to decode it, get the desired string from the JSON into a byte array, then decode it using base64 and write it to a file or stream so that it can be played later). An example in Python:
# Forming the request
import requests
import base64

addr = '192.168.2.83:5000'  # server address from the examples below
auth = base64.b64encode('testbot:test'.encode())
headers = {'Authorization' : "Basic " + auth.decode()}
data = {'text':'который час'}
# Sending the request to the server
r = requests.post('http://' + addr + '/chatbot/text-to-speech', headers=headers, json=data)
# Parsing the response
data = r.json()
data = base64.b64decode(data.get('wav'))
with open('/home/vladislav/Проекты/Voice chat bot/temp/answer.wav', 'wb') as audio:
    audio.write(data)
All transmitted data is wrapped in JSON (including errors).
Answer to /chatbot/about:
{
"text" : "Информация о проекте."
}
Answer to /chatbot/questions:
{
"text" : ["Вопрос 1",
"Вопрос 2",
"Вопрос 3"]
}
Request to /chatbot/speech-to-text:
{
"wav" : "UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."
}
or
{
"opus" : "ZFZm10IBUklQVZFZm10IBARLASBAAEOpH..."
}
The server will reply:
{
"text" : "который час"
}
Request to /chatbot/text-to-speech:
{
"text" : "который час"
}
The server will reply:
{
"wav" : "UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."
}
Request to /chatbot/text-to-text:
{
"text" : "прощай"
}
The server will reply:
{
"text" : "это снова я"
}
1. GET request to /chatbot/about
An example of a request generated by python-requests:
GET /chatbot/about HTTP/1.1
Host: 192.168.2.83:5000
Connection: keep-alive
Accept-Encoding: gzip, deflate
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: python-requests/2.9.1
An example of a request generated by curl (curl -v -u testbot:test -i http://192.168.2.83:5000/chatbot/about):
GET /chatbot/about HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: curl/7.47.0
In both cases, the server replied:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 305
Date: Fri, 02 Nov 2018 15:13:21 GMT
{
"text" : "Информация о проекте."
}
2. GET request to /chatbot/questions
An example of a request generated by python-requests:
GET /chatbot/questions HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: python-requests/2.9.1
Connection: keep-alive
Accept-Encoding: gzip, deflate
An example of a request generated by curl (curl -v -u testbot:test -i http://192.168.2.83:5000/chatbot/questions):
GET /chatbot/questions HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: curl/7.47.0
In both cases, the server replied:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1086
Date: Fri, 02 Nov 2018 15:43:06 GMT
{
"text" : ["Что случилось?",
"Срочно нужна твоя помощь.",
"Ты уезжаешь?",
...]
}
3. POST request to /chatbot/speech-to-text
An example of a request generated by python-requests:
POST /chatbot/speech-to-text HTTP/1.1
Host: 192.168.2.83:5000
User-Agent: python-requests/2.9.1
Accept: */*
Content-Length: 10739
Connection: keep-alive
Content-Type: application/json
Authorization: Basic dGVzdGJvdDp0ZXN0
Accept-Encoding: gzip, deflate
{
"wav" : "UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."
}
An example of a request generated by curl (curl -v -u testbot:test -i -H "Content-Type: application/json" -X POST -d '{"wav":"UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."}' http://192.168.2.83:5000/chatbot/speech-to-text):
POST /chatbot/speech-to-text HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: curl/7.47.0
Accept: */*
Content-Type: application/json
Content-Length: 10739
{
"wav" : "UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."
}
The server replied:
HTTP/1.1 200 OK
Content-Length: 81
Date: Fri, 02 Nov 2018 15:57:13 GMT
Content-Type: application/json
{
"text" : "Распознные слова из аудиозаписи"
}
4. POST request to /chatbot/text-to-speech
An example of a request generated by python-requests:
POST /chatbot/text-to-speech HTTP/1.1
Host: 192.168.2.83:5000
Connection: keep-alive
Accept: */*
User-Agent: python-requests/2.9.1
Accept-Encoding: gzip, deflate
Content-Type: application/json
Content-Length: 73
Authorization: Basic dGVzdGJvdDp0ZXN0
{
"text" : "который час"
}
An example of a request generated by curl (curl -v -u testbot:test -i -H "Content-Type: application/json" -X POST -d '{"text":"который час"}' http://192.168.2.83:5000/chatbot/text-to-speech):
POST /chatbot/text-to-speech HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: curl/7.47.0
Accept: */*
Content-Type: application/json
Content-Length: 32
{
"text" : "который час"
}
The server replied:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 78151
Date: Fri, 02 Nov 2018 16:36:02 GMT
{
"wav" : "UklGRuTkAABXQVZFZm10IBAAAAABAAEAAH..."
}
5. POST request to /chatbot/text-to-text
An example of a request generated by python-requests:
POST /chatbot/text-to-text HTTP/1.1
Host: 192.168.2.83:5000
Accept-Encoding: gzip, deflate
Content-Type: application/json
User-Agent: python-requests/2.9.1
Connection: keep-alive
Content-Length: 48
Accept: */*
Authorization: Basic dGVzdGJvdDp0ZXN0
{
"text" : "прощай"
}
An example of a request generated by curl (curl -v -u testbot:test -i -H "Content-Type: application/json" -X POST -d '{"text":"прощай"}' http://192.168.2.83:5000/chatbot/text-to-text):
POST /chatbot/text-to-text HTTP/1.1
Host: 192.168.2.83:5000
Authorization: Basic dGVzdGJvdDp0ZXN0
User-Agent: curl/7.47.0
Accept: */*
Content-Type: application/json
Content-Length: 23
{
"text" : "прощай"
}
The server replied:
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 68
Date: Fri, 02 Nov 2018 16:41:22 GMT
{
"text" : "это снова я"
}
The project contains a Dockerfile, which allows you to build a Docker image based on this project. If you used install_packages.sh to install all the dependencies and have not installed Docker before, you will need to install it manually. For example, like this (tested on Ubuntu 16.04 and 18.04):
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
sudo apt-add-repository 'deb https://apt.dockerproject.org/repo ubuntu-xenial main' -y
sudo apt-get -y update
sudo apt-get install -y docker-engine
After installation, run sudo systemctl status docker to make sure that everything is installed and working (the output of this command should contain a line with green text active (running)).
To build the image, open a terminal in the project folder and run the command below ( -t sets the image name and its version, voice_chatbot:0.1, and . is the directory from which docker build is called; the dot means that all the files for the image are located in the current directory):
sudo docker build -t voice_chatbot:0.1 .
After this operation completes successfully, you can display a list of existing images by running:
sudo docker images
In the list you will see our image - voice_chatbot:0.1.
Now you can run this image ( -t allocates a terminal, -i is interactive mode, --rm removes the container after it finishes, -p 5000:5000 forwards all connections to port 5000 of the host machine to port 5000 of the container; you can also explicitly specify a different address to connect to from the outside, for example: -p 127.0.0.1:5000:5000 ):
sudo docker run -ti --rm -p 5000:5000 voice_chatbot:0.1
As a result, the server will start at 0.0.0.0:5000 and you can access it at the address shown in the terminal (unless you specified a different one when starting the image).
Note: the built Docker image weighs 5.2 GB. The project source files also include a .dockerignore file that lists the files that do not need to be added to the image. To minimize the size of the final image, all files related to the data sets from works and subtitles, as well as files with intermediate results of data processing and neural network training, were excluded from it. This means that the image contains only the files of the trained network and the raw source data sets.
Just in case, the project source files include a command_for_docker.txt file containing a minimal necessary set of commands for working with Docker.
If you have any questions or want to collaborate, you can write to me by email: [email protected] or on LinkedIn.