English | Chinese
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
Dense Passage Retrieval for Open-Domain Question Answering
APScheduler
A full-pipeline dialogue system that can be deployed and run
TensorFlow models:
- Transformer
- Seq2Seq
- SMN retrieval model
- Scheduled Sampling Transformer
- GPT2
- Task-oriented dialogue

PyTorch models:
- Transformer
- Seq2Seq
This project aims to build a dialogue system that can be deployed online, covering both open-domain and task-oriented dialogue, and reproduces the related models in both TensorFlow and PyTorch. The paper reading notes live in a separate project: nlp-paper.
The data directory in the repository contains toy data for each corpus, which can be used to verify that the system runs end to end. The complete corpora and papers can be viewed here:
- LCCC
- CrossWOZ
- Xiaohuangji
- Douban
- Ubuntu
- Qingyun
- Tieba
On Linux, execute run.sh; to check the project directory, execute check.sh (or check.py).
actuator.py in the root directory is the general execution entry point, run with the following command format (make sure requirements.txt is installed first):

```
python actuator.py --version [Options] --model [Options] ...
```
When executing through actuator.py in the root directory, --version, --model, and --act are required parameters: --version is the code version (tf/torch), --model is the model to run (transformer/smn/...), and --act is the execution mode (pre_treat by default). For more detailed command-line parameters, refer to the actuator.py under each model or to the corresponding JSON configuration file in the config directory.
The --act execution modes are described as follows:
- pre_treat is the text preprocessing mode. If there is no tokenized result set and dictionary yet, run pre_treat first.
- train is the training mode.
- evaluate is the metric evaluation mode.
- chat is the conversation mode; while chatting, enter ESC to exit the conversation.
The normal execution sequence is pre_treat->train->evaluate->chat
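For example, a full pass over the TensorFlow Transformer model could look like this (tf and transformer are sample option values; substitute your own):

```
python actuator.py --version tf --model transformer --act pre_treat
python actuator.py --version tf --model transformer --act train
python actuator.py --version tf --model transformer --act evaluate
python actuator.py --version tf --model transformer --act chat
```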
Each model also has a separate actuator.py, which can bypass the outer coupling for standalone execution and development; just take care to adjust the project directory paths when running it this way.
- dialogue holds the core code of the models, to ease later encapsulation and packaging.
- checkpoints is the save location for checkpoints.
- config is the directory where configuration files are kept.
- data is the storage location of the original data; intermediate data files generated while the models run are also saved here.
- models is the save directory for models.
- tensorflow and pytorch hold the core code for building the models and executing each module.
- preprocess_corpus.py is a corpus-processing script that handles single-turn and multi-turn dialogue for each corpus and standardizes a unified interface call.
- read_data.py serves the data-loading format calls of load_dataset.py.
- metrics.py is the script with the various evaluation metrics.
- tools.py is a utility script containing the tokenizer, log operations, checkpoint save/load, etc.
- docs holds the documentation, including the model paper reading notes.
- docker (mobile) holds the server (mobile terminal) deployment scripts.
- server is the UI service interface, built with Flask; just run the corresponding server.py (a minimal sketch follows this list).
- tools is the reserved tool directory.
- actuator.py (run.sh) is the overall executor entry point.
- check.py (check.sh) is the project directory check script.
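As referenced in the server item above, here is a minimal sketch of what a Flask-based server.py could look like; the /message route, the port, and the predict() hook are illustrative assumptions, not the project's actual API:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(utterance):
    # Placeholder for the deployed dialogue model's generate/retrieve step
    return "echo: " + utterance

@app.route("/message", methods=["POST"])
def message():
    # Expects a JSON body like {"utterance": "..."}
    utterance = request.get_json().get("utterance", "")
    return jsonify({"response": predict(utterance)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```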
Before using the SMN retrieval dialogue system, you need to prepare a Solr environment. Linux is recommended as the deployment environment for Solr, and container deployment (Docker recommended) is suggested. Prepare:
- Solr (8.6.3)
- pysolr (3.9.0)
A brief explanation follows; for more details, see the article: Obtaining candidate responses in a retrieval dialogue system - using pysolr to call Solr.
To ensure Solr runs stably online and to ease subsequent maintenance, deploy with a Dockerfile. Dockerfile download address: docker-solr
If you only want to test the model, you can use the simplest build commands below:
```shell
docker pull solr:8.6.3
# Start Solr
docker run -itd --name solr -p 8983:8983 solr:8.6.3
# Create the core selector, named smn here (optional)
docker exec -it --user=solr solr bin/solr create_core -c smn
```
As for tokenizers in Solr, there are IK Analyzer, Smartcn, the Pinyin tokenizer, etc.; download the corresponding jar and add its configuration to the Solr core configuration file managed-schema.
Special note: if you use TF-IDF, you also need to enable the similarity configuration in managed-schema.
After Solr is deployed online, connect to it from Python with pysolr:

```
pip install pysolr
```
Index data is added as follows (in general, run a health check first). Add an index for the response data: responses is a JSON list of the form [{},{},{},...], where each object is built according to your response needs:
```python
import pysolr

# solr_server is the URL of your Solr core, e.g. "http://localhost:8983/solr/smn"
solr = pysolr.Solr(url=solr_server, always_commit=True, timeout=10)
# Health check
solr.ping()
# Index the response documents
solr.add(docs=responses)
```
Queries are issued as follows. To query all utterances with TF-IDF, the query statement takes this form:
```
{!func}sum(product(idf(utterance,key1),tf(utterance,key1)),product(idf(utterance,key2),tf(utterance,key2)),...)
```
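As an illustration only, here is a minimal pysolr sketch that builds the scoring function above from a list of query terms and fetches candidates; solr_server, the core name, and the keys list are assumptions:

```python
import pysolr

# Hypothetical values for illustration
solr_server = "http://localhost:8983/solr/smn"
keys = ["key1", "key2"]  # tokens of the input utterance

solr = pysolr.Solr(url=solr_server, timeout=10)

# Build {!func}sum(product(idf(...),tf(...)),...) over all query terms
products = ",".join(
    "product(idf(utterance,{0}),tf(utterance,{0}))".format(k) for k in keys
)
query = "{{!func}}sum({0})".format(products)

# Top 10 candidate responses ranked by the summed TF-IDF score
for doc in solr.search(q=query, rows=10):
    print(doc)
```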
Note that the data must be added to Solr before use. To use this with the SMN model, just run the pre_treat mode first.
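For instance, following the command conventions above (tf and smn are sample option values):

```
python actuator.py --version tf --model smn --act pre_treat
```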
- Attention Is All You Need | Reading notes: the pioneering Transformer work, worth an intensive read | Vaswani et al., 2017
- Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots | Reading notes: the SMN retrieval dialogue model, extracting information at multiple layers and granularities | Wu et al., 2017
- Massive Exploration of Neural Machine Translation Architectures | Reading notes: the first large-scale analysis of NMT architecture hyperparameters; the experiments bring novel insights and practical suggestions for building and extending NMT architectures | Britz et al., 2017
- Scheduled Sampling for Transformers | Reading notes: applying scheduled sampling to the Transformer | Mihaylova et al., 2019
Licensed under the Apache License, Version 2.0. Copyright 2021 DengBoCong. Copy of the license.