---
tags:
- kernel
---

# Optimizer

Optimizer is a Python package that provides PyTorch implementations of recent optimizer algorithms, with support for parallelism techniques for efficient large-scale training.

## Currently implemented
- [Parallel Muon with FSDP2](./docs/muon/parallel_muon.pdf)

## Usage

```python
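import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from kernels import get_kernel

# Load the optimizer package from the Hugging Face Hub.
optimizer = get_kernel("motif-technologies/optimizer")

# Replace None with your own torch.nn.Module before wrapping it.
# FSDP also needs an initialized torch.distributed process group
# (for example, when the script is launched with torchrun).
model = None  # your model here
fsdp_model = FSDP(model)

# Build the Muon optimizer over the sharded parameters.
optim = optimizer.Muon(
    fsdp_model.parameters(),
    lr=0.01,
    momentum=0.9,
    weight_decay=1e-4,
)
```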
```
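
For readers new to Muon, the core idea behind the optimizer configured above can be sketched on a single device. The example below is an illustrative NumPy sketch, not this package's implementation: the function names `newton_schulz_orthogonalize` and `muon_step` are hypothetical, the quintic coefficients are the ones used in the publicly available Muon reference code, and the decoupled weight decay is an assumption. It omits Nesterov momentum, shape-dependent scaling, and all of the FSDP-aware communication that the parallel version handles.

```python
import numpy as np


def newton_schulz_orthogonalize(g, steps=5, eps=1e-7):
    """Approximately orthogonalize a 2-D matrix with a quintic Newton-Schulz iteration."""
    # Coefficients taken from the public Muon reference code (assumed here,
    # not confirmed as what this package uses).
    a, b, c = 3.4445, -4.7750, 2.0315
    # Dividing by the Frobenius norm bounds every singular value by 1.
    x = g / (np.linalg.norm(g) + eps)
    # Iterate on the wide orientation so that x @ x.T is the smaller Gram matrix.
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * (s @ s)) @ x
    return x.T if transposed else x


def muon_step(weight, grad, momentum_buf, lr=0.01, momentum=0.9, weight_decay=1e-4):
    """One simplified Muon update for a single 2-D weight matrix."""
    # Accumulate momentum, then orthogonalize the accumulated direction.
    momentum_buf = momentum * momentum_buf + grad
    update = newton_schulz_orthogonalize(momentum_buf)
    # Decoupled weight decay followed by the orthogonalized step (an assumption).
    weight = weight * (1.0 - lr * weight_decay) - lr * update
    return weight, momentum_buf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 8))
    g = rng.standard_normal((4, 8))
    w, buf = muon_step(w, g, np.zeros_like(w))
    # Singular values of the orthogonalized update are pushed toward 1.
    print(np.linalg.svd(newton_schulz_orthogonalize(buf), compute_uv=False))
```

The iteration pushes the singular values of the momentum matrix toward 1 without computing an SVD, which is why the update depends only on matrix multiplications. Because the iteration needs the full matrix, a sharded setup has to gather or redistribute parameter shards before this step.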