RabotniKuma/Fast-OpenMath-Nemotron-14B

Safetensors

qwen2

RabotniKuma committed 17 days ago

Commit fe62ab3 (verified) · 1 Parent(s): 23ce9e3

Update README.md

Files changed (1)

README.md +15 -14


@@ -16,21 +16,22 @@ Technical details can be found in [our github repository](https://github.com/ana
 **Note:**
 This model likely inherits the ability to perform inference in TIR mode from the original model. However, all of our experiments were conducted in CoT mode, and its performance in TIR mode has not been evaluated.
-# Performance comparison
-<img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='300px'>
-<img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_nemotron.png?raw=true' max-height='300px'>
-|                            |              | AIME 2024        |                                 | AIME 2025        |                                 |
-| -------------------------- | ------------ | ---------------- | ------------------------------- | ---------------- | ------------------------------- |
-| Model                      | Token budget | Pass@1 (avg. 64) | Output tokens | Pass@1 (avg. 64) | Output tokens |
-| OpenMath-Nemotron-14B      | 24000        | **73.3**             | 12277                           | **64.4**             | 13027                           |
-|                            | 16384        | 66.4             | 8932                            | 53.8             | 11547                           |
-|                            | 12800        | 57               | 7000                            | 42.3             | 9996                            |
-|                            | 8192         | 37.4             | 4835                            | 28               | 7186                            |
-| Fast-OpenMath-Nemotron-14B | 24000        | 71.7             | 10545                           | 60.4             | 11053                           |
-|                            | 16384        | **68.2**             | 8270                            | **55.6**             | 10216                           |
-|                            | 12800        | **62.3**             | 6359                            | **47.7**             | 9052                            |
-|                            | 8192         | **47.6**             | 4299                            | **33.8**             | 6674                            |
+# Evaluation
+<img src='https://github.com/analokmaus/kaggle-aimo2-fast-math-r1/blob/master/assets/pass1_aime_all.png?raw=true' max-height='400px'>
+|                            |              | AIME 2024        |                    | AIME 2025        |                    |
+| -------------------------- | ------------ | ---------------- | ------------------ | ---------------- | ------------------ |
+| Model                      | Token budget | Pass@1 (avg. 64) | Mean output tokens | Pass@1 (avg. 64) | Mean output tokens |
+| OpenMath-Nemotron-14B      | 32000        | 76.2             | 11493              | 64.5             | 13414              |
+|                            | 24000        | 75.4             | 11417              | 63.4             | 13046              |
+|                            | 16000        | 66               | 10399              | 54.2             | 11422              |
+|                            | 12000        | 55               | 9053               | 40               | 9609               |
+|                            | 8000         | 36               | 6978               | 27.2             | 7083               |
+| Fast-OpenMath-Nemotron-14B | 32000        | 70.7             | 9603               | 61.4             | 11424              |
+|                            | 24000        | 70.6             | 9567               | 60.9             | 11271              |
+|                            | 16000        | 66.6             | 8954               | 55.3             | 10190              |
+|                            | 12000        | 59.4             | 7927               | 45.6             | 8752               |
+|                            | 8000         | 47.6             | 6282               | 33.8             | 6589               |
 # Inference
 ## vLLM