Qwen2.5-7B-Instruct - a SpectralPO Collection

SpectralPO's Collections

DeepSeek-R1-Distill-Llama-8B

Qwen2.5-32B-Instruct

Qwen2.5-14B-Instruct

DeepSeek-R1-Distill-Qwen-7B

Qwen2.5-7B-Instruct

Offline RL with Neg Samples

Qwen2.5-7B-Instruct

Updated Apr 27