joshmiller656/Llama-3.1-Nemotron-70B-Instruct-AWQ-INT4

Does the RL lead this model to prefer giving answers within a certain length range?

#1 opened 7 days ago by