Adversarial robustness under the "Explain Then Predict" setting

Overview

This repository contains the source code for our BlackBoxNLP 2024 @ EMNLP paper:

Enhancing adversarial robustness in Natural Language Inference using explanations

In this work, we investigate whether using intermediate explanations in the Natural Language Inference (NLI) task can serve as a model-agnostic defence strategy against adversarial attacks. Our claim is that the intermediate explanation can filter out potential noise that the adversarial attack superimposes on the input pair (premise, hypothesis). Through extensive experimentation, we show that conditioning the output label (entailment, contradiction, neutral) on an intermediate explanation that describes the inference relationship between the input premise and hypothesis does indeed achieve adversarial robustness.
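The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration and not the code in this repository: the function names (explain_then_predict, toy_explainer, toy_classifier) are hypothetical, the stand-in models are trivial rules rather than trained networks, and the sketch assumes that the label classifier sees only the generated explanation.

```python
from typing import Callable, Tuple

LABELS = ("entailment", "contradiction", "neutral")


def explain_then_predict(
    premise: str,
    hypothesis: str,
    explainer: Callable[[str, str], str],
    classifier: Callable[[str], str],
) -> Tuple[str, str]:
    """Generate an explanation first, then predict the label from it.

    Assumption: the classifier never sees the raw (premise, hypothesis)
    pair, so perturbations of the input can reach it only through the
    explanation text.
    """
    explanation = explainer(premise, hypothesis)
    label = classifier(explanation)
    if label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return explanation, label


# Toy stand-ins (hypothetical) so that the sketch runs end to end.
def toy_explainer(premise: str, hypothesis: str) -> str:
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    if hypothesis_words <= premise_words:
        return "the hypothesis follows from the premise"
    return "the premise does not support the hypothesis"


def toy_classifier(explanation: str) -> str:
    return "entailment" if "follows" in explanation else "neutral"


if __name__ == "__main__":
    print(explain_then_predict(
        "A man is playing a guitar on stage",
        "A man is playing a guitar",
        toy_explainer,
        toy_classifier,
    ))
```

In a real setup the two callables would wrap trained models, such as the ones produced by the code in the fine-tuning directory.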

Project structure

The repo is organized in the following core directories:

fine-tuning: Includes the code for training and evaluating all the models that are used in our experiments. See the README file located in the fine-tuning directory for more details.
adversarial_attacks: Includes the code for performing adversarial attacks against the aforementioned models. See the README file located in the adversarial_attacks directory for more details.

Installation

Create a local copy of the repo: git clone https://github.com/alexkoulakos/explain-then-predict.git
Navigate to the root directory: cd explain-then-predict
Create a virtual environment called venv: virtualenv --system-site-packages venv
Activate the virtual environment: source venv/bin/activate (for Linux/macOS) or ./venv/Scripts/activate.ps1 (for Windows)
Install necessary dependencies: pip install -r requirements.txt

Support and Issues

If you encounter any issues or bugs, or have questions, please feel free to open an issue on GitHub. Describe the problem you encountered, including:

A clear description of the issue or bug
Steps to reproduce the issue (if applicable)
Any relevant error messages or screenshots
Details about your environment (Python version, OS, library versions)
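The environment details requested above can be collected with a short script like the one below. It is only a convenience sketch: the helper name environment_report is hypothetical, and the package names passed to it (torch, transformers) are assumptions, so replace them with the entries actually listed in requirements.txt.

```python
import platform
import sys
from importlib import metadata
from typing import Iterable


def environment_report(packages: Iterable[str]) -> str:
    """Return Python version, OS, and installed package versions as text."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.platform()}",
    ]
    for name in packages:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            version = "not installed"
        lines.append(f"{name}: {version}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Assumed package names; adjust them to match requirements.txt.
    print(environment_report(["torch", "transformers"]))
```

Running it inside the activated virtual environment and pasting the output into the issue covers the last item of the list.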

We’ll do our best to respond quickly and help resolve any problems.

Citation

If you use our findings in your work, don't forget to cite our paper:

@inproceedings{koulakos-etal-2024-enhancing,
    title = "Enhancing adversarial robustness in Natural Language Inference using explanations",
    author = "Koulakos, Alexandros and Lymperaiou, Maria and Filandrianos, Giorgos and Stamou, Giorgos",
    editor = "Belinkov, Yonatan and Kim, Najoung and Jumelet, Jaap and Mohebbi, Hosein and Mueller, Aaron and Chen, Hanjie",
    booktitle = "Proceedings of the 7th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP",
    month = nov,
    year = "2024",
    address = "Miami, Florida, US",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.blackboxnlp-1.7",
    pages = "105--117"
}


Additional Information

Version 1.0.0
Type Other source code
Update Time 2024-11-30
size 63.59KB
From Github
