# Transformers Data Augmentation

Version 1.0.0

Code associated with the paper [Data Augmentation using Pre-trained Transformer Models](https://www.aclweb.org/anthology/2020.lifelongnlp-1.3). The repository contains implementations of the data augmentation methods evaluated in the paper.
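The augmentation methods in the paper condition a pretrained transformer on the class label by prepending the label to each training sentence before fine-tuning and generation. A minimal sketch of that formatting step (the function name and separator are illustrative, not taken from this repository):

```python
def prepend_label(label: str, text: str, separator: str = " ") -> str:
    """Prepend the class label to a sentence so a pretrained LM can
    condition its generation on that label (the paper's "prepend" setup)."""
    return f"{label}{separator}{text}"

# Toy SNIPS-style (label, utterance) pairs for illustration only.
examples = [
    ("PlayMusic", "play the weather girls"),
    ("GetWeather", "will it rain today"),
]

conditioned = [prepend_label(label, text) for label, text in examples]
for line in conditioned:
    print(line)
```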
## Datasets

The paper uses three datasets, obtained from their original sources. Run the src/utils/download_and_prepare_datasets.sh script to prepare all of them: it downloads the datasets and prepares them for the experiments.
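As a rough illustration of the kind of preparation involved, the experiment scripts follow a `*_lower` naming convention, which suggests a lowercasing pass over the data. The sketch below assumes a simple label-TAB-sentence TSV format; the actual file layout used by the repository may differ:

```python
import csv
import io

def lowercase_tsv(raw_tsv: str) -> str:
    """Lowercase the sentence column of a label<TAB>sentence TSV,
    mirroring the *_lower convention in the experiment script names."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(raw_tsv), delimiter="\t")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for label, sentence in reader:
        writer.writerow([label, sentence.lower()])
    return out.getvalue()

raw = "PlayMusic\tPlay The Weather Girls\nGetWeather\tWill It Rain Today\n"
print(lowercase_tsv(raw))
```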
## Dependencies

To run this code, first install the required dependencies.
## How to run

To run a data augmentation experiment for a given dataset, run the corresponding bash script in the scripts folder. For example, to run data augmentation on the SNIPS dataset, use:

- scripts/bart_snips_lower.sh for the BART experiment
- scripts/bert_snips_lower.sh for the rest of the data augmentation methods

## Citation

@inproceedings{kumar-etal-2020-data,
title = "Data Augmentation using Pre-trained Transformer Models",
author = "Kumar, Varun and
Choudhary, Ashutosh and
Cho, Eunah",
booktitle = "Proceedings of the 2nd Workshop on Life-long Learning for Spoken Language Systems",
month = dec,
year = "2020",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.lifelongnlp-1.3",
pages = "18--26",
}
## Contact

Please reach out to [email protected] with any questions related to this code.
## License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 license.