Toxic Classifiers developed for the GATE Cloud. Two models are available:
- kaggle: trained on the Kaggle Toxic Comments Challenge dataset.
- olid: trained on the OLIDv1 dataset from OffensEval 2019 (paper)
We fine-tuned a RoBERTa-base model using the simpletransformers toolkit.
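For reference, fine-tuning with simpletransformers looks roughly like the sketch below; the toy DataFrame, the output directory, and the default hyperparameters are illustrative assumptions, not the actual training script used for the released models:

```python
# Rough sketch of fine-tuning roberta-base as a binary toxic/non-toxic
# classifier with simpletransformers; the tiny DataFrame is a placeholder.
import pandas as pd
from simpletransformers.classification import ClassificationModel

train_df = pd.DataFrame(
    [["what a lovely day", 0], ["you are an idiot", 1]],
    columns=["text", "labels"],
)

model = ClassificationModel(
    "roberta",
    "roberta-base",
    num_labels=2,
    use_cuda=False,  # set to True to fine-tune on a GPU
)

# output_dir is an assumption; the released models ship as kaggle.tar.gz / olid.tar.gz
model.train_model(train_df, output_dir="models/en/kaggle")
```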
Requirements:
- python=3.8
- pandas
- tqdm
- pytorch
- simpletransformers
- conda: `conda env create -f environment.yml`
- pip: `pip install -r requirements.txt`
(If the above does not work, or if you want to use GPUs, you can follow the installation steps of simpletransformers: https://simpletransformers.ai/docs/installation/)
- Download the models from the latest release of this repository (currently available: `kaggle.tar.gz`, `olid.tar.gz`)
- Decompress the file inside `models/en/` (which will create `models/en/kaggle` or `models/en/olid` respectively)
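For example, assuming the release archives were downloaded to the repository root, the following commands would place them correctly (the tar invocation is only an illustration; any equivalent extraction works):

`tar -xzf kaggle.tar.gz -C models/en/`
`tar -xzf olid.tar.gz -C models/en/`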
`python __main__.py -t "This is a test"` (should return 0 = non-toxic)
`python __main__.py -t "Bastard!"` (should return 1 = toxic)
Arguments:
- `t`: text
- `l`: language (currently only supports "en")
- `c`: classifier (currently supports "kaggle" and "olid" -- default="kaggle")
- `g`: gpu (default=False)
The output consists of the predicted class and the probabilities of each class.
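As an illustration, a decompressed model could also be queried programmatically with the simpletransformers API; the sketch below is not the repository's own code, and the model path and the softmax step are assumptions:

```python
# Minimal sketch: load a decompressed model (models/en/kaggle is assumed here)
# and classify a single text.
from scipy.special import softmax
from simpletransformers.classification import ClassificationModel

model = ClassificationModel(
    "roberta",
    "models/en/kaggle",  # path created by decompressing kaggle.tar.gz
    use_cuda=False,      # set to True to run on a GPU
)

predictions, raw_outputs = model.predict(["This is a test"])
print(predictions[0])           # predicted class: 0 = non-toxic, 1 = toxic
print(softmax(raw_outputs[0]))  # per-class probabilities from the raw scores
```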
Pre-built Docker images are available for a REST service that accepts text and returns a classification according to the relevant model - see the "packages" section for more details.