---
license: mit
---
Reward model pretrained on openai/webgpt_comparison and a human-feedback summarization dataset. Unlike the other electra-large model, this model is trained with a rank loss and one additional dataset.

Results on the validation set are much more stable than usual.

See this [wandb run](https://wandb.ai/theblackcat102/reward-model/runs/1d4e4oi2?workspace=) for more details.
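The rank loss mentioned above usually refers to the pairwise ranking objective used in RLHF-style reward modeling: for each comparison pair, the loss is `-log(sigmoid(r_chosen - r_rejected))`, which pushes the reward of the preferred response above the rejected one. A minimal sketch in plain Python (the reward values below are hypothetical, not from this model):

```python
import math

def pairwise_rank_loss(chosen_rewards, rejected_rewards):
    """Average pairwise ranking loss over comparison pairs.

    -log(sigmoid(c - r)) == log(1 + exp(-(c - r)))
    """
    losses = [
        math.log(1.0 + math.exp(-(c - r)))
        for c, r in zip(chosen_rewards, rejected_rewards)
    ]
    return sum(losses) / len(losses)

# Hypothetical scalar rewards from the model for chosen vs. rejected answers
loss = pairwise_rank_loss([1.2, 0.5], [0.3, -0.1])
```

The loss shrinks toward zero as the margin between chosen and rejected rewards grows, so a well-separated pair contributes little gradient.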