Archive-models/nq_hotpotqa_train-seven_test-search-r1-grpo-deepseekr1-7b-em-structureformat2 Updated 24 days ago • 10
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-deepseekr1-7b-em-structureformat2 Updated 24 days ago • 7
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-deepseekr1-14b-em-structureformat2 Updated 24 days ago • 6
Archive-models/nq_hotpotqa_train-seven_test-search-r1-grpo-deepseekr1-14b-em-structureformat2 Updated 24 days ago • 67
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-qwen2.5-3b-em-structureformat2-sample10000 Updated 24 days ago • 9
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-qwen2.5-3b-em-structureformat2-sample1000 Updated 24 days ago • 9
Archive-models/nq_hotpotqa_train-seven_test-search-r1-grpo-qwen2.5-3b-em-structureformat2-sample10000 Updated 24 days ago • 8
Archive-models/nq_hotpotqa_train-seven_test-search-r1-grpo-qwen2.5-3b-em-structureformat2-sample1 Updated 24 days ago • 8
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-qwen2.5-7b-em-randomretrieval Updated 24 days ago • 8
Archive-models/nq_hotpotqa_train-seven_test-search-r1-ppo-qwen2.5-3b-em-structureformat2-sample100 Updated 24 days ago • 8