---
library_name: transformers
license: openrail++
datasets:
- textdetox/multilingual_paradetox
- chameleon-lizard/synthetic-multilingual-paradetox
language:
- en
- ru
- uk
- am
- de
- es
- zh
- ar
- hi
pipeline_tag: text2text-generation
---
# Model Card for tox-mt0-xl
A finetune of the mt0-xl model for the text toxification task.
## Model Details
### Model Description
This is a finetune of the mt0-xl model for the text toxification task. It can be used to generate synthetic toxic data from non-toxic examples.
- **Developed by:** Nikita Sushko
- **Model type:** mt5-xl
- **Language(s) (NLP):** English, Russian, Ukrainian, Amharic, German, Spanish, Chinese, Arabic, Hindi
- **License:** OpenRail++
- **Finetuned from model:** mt0-xl
## Uses
This model is intended to be used for synthetic data generation from non-toxic examples.
### Direct Use
The model may be used directly for text toxification, i.e. rewriting a non-toxic sentence into a toxic one.
### Out-of-Scope Use
The model should not be used to generate toxic content aimed at harassing or abusing people; it is intended only for synthetic data generation and detoxification research.
## Bias, Risks, and Limitations
Because this model generates toxic versions of sentences, its outputs can be offensive, and the model may be misused to increase the toxicity of generated texts.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
import transformers
checkpoint = 'chameleon-lizard/tox-mt0-xl'
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(checkpoint, torch_dtype='auto', device_map="auto")
pipe = transformers.pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512,
    truncation=True,
)
language = 'English'
text = "That's dissapointing."
print(pipe(f'Rewrite the following text in {language} the most toxic and obscene version possible: {text}')[0]['generated_text'])
# Resulting text: "That's dissapointing, you stupid ass bitch."
```
Be sure to use the provided prompt format for best performance. If the target language is not specified in the prompt, the model may respond in an arbitrary language.
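For the synthetic data generation use case, the minimal sketch below toxifies a handful of non-toxic sentences from the textdetox/multilingual_paradetox dataset. The `en` split name and the `neutral_sentence` column name are assumptions about the dataset layout, not guaranteed by this card; adjust them to the actual schema.
```python
# Sketch: generate synthetic toxic counterparts for non-toxic sentences.
# Reuses the `pipe` object built in the example above. The split name "en"
# and the column name "neutral_sentence" are assumptions -- check the
# dataset card and adjust if needed.
import datasets

dataset = datasets.load_dataset('textdetox/multilingual_paradetox', split='en')

language = 'English'
prompt_template = 'Rewrite the following text in {language} the most toxic and obscene version possible: {text}'

synthetic_pairs = []
for example in dataset.select(range(8)):  # small sample for illustration
    neutral = example['neutral_sentence']  # assumed column name
    toxic = pipe(prompt_template.format(language=language, text=neutral))[0]['generated_text']
    synthetic_pairs.append({'neutral': neutral, 'toxic': toxic})

print(synthetic_pairs[0])
```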