This is an 800M-parameter model pre-trained with [QuEST](https://arxiv.org/abs/2502.05003) on 80B C4 tokens in INT1 format (W1A1, i.e. 1-bit weights and 1-bit activations). The code to verify that this model works in INT1 can be found [here](https://github.com/IST-DASLab/QuEST/blob/main/src/HadamardBinaryTesting.ipynb).
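
To illustrate what W1A1 means for a single linear layer, here is a minimal NumPy sketch. It is not the QuEST implementation: the helper names `binarize` and `w1a1_linear` are hypothetical, and the per-row mean-absolute-value scale is a generic choice assumed here for illustration. For the actual quantization scheme, refer to the paper and notebook linked above.

```python
import numpy as np


def binarize(x):
    """Map each row of x to 1-bit codes in {-1, +1} plus one float scale per row.

    Assumption for illustration only: the scale is the row's mean absolute value.
    """
    codes = np.where(x >= 0, 1, -1).astype(np.int8)
    scale = np.abs(x).mean(axis=-1, keepdims=True)
    return codes, scale


def w1a1_linear(x, w):
    """Approximate x @ w.T with both operands reduced to 1-bit codes."""
    x_codes, x_scale = binarize(x)  # shapes: (batch, d), (batch, 1)
    w_codes, w_scale = binarize(w)  # shapes: (out, d), (out, 1)
    # The matrix product itself only touches +/-1 values (integer accumulation).
    acc = x_codes.astype(np.int32) @ w_codes.astype(np.int32).T
    # Floating-point scales are applied once per output element afterwards.
    return acc * x_scale * w_scale.T


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64))  # toy activations
w = rng.standard_normal((8, 64))  # toy weights
y = w1a1_linear(x, w)
print(y.shape)
```

The point of the sketch is that the inner product runs entirely over values in {-1, +1}, with full-precision scales applied only to the accumulated result.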