lldacing committed · Commit eb7ccd8 · verified · 1 Parent(s): 29f5d34

Update README.md

Files changed (1): README.md +34 -3
---
license: bsd-3-clause
---

Windows wheels of [flash-attention](https://github.com/Dao-AILab/flash-attention)

Steps to build the CUDA wheel:
- First, clone the code:
```
git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention
```
- Check out a tag branch, such as `v2.7.0.post2` (you can list all available tags with `git tag -l`, or show the latest reachable tag with `git describe --tags`)

```
git checkout -b v2.7.0.post2 v2.7.0.post2
```
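If the tag-to-branch step looks unfamiliar: `git checkout -b <name> <start-point>` creates a new local branch pinned at the tag. A self-contained sketch on a throwaway repository (the tag name is taken from the step above; the temporary repo is only for the demo):

```shell
# Demo of checking out a tag into a same-named local branch, run on a
# throwaway repo so it works without network access.
set -e
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"
git -C "$repo" tag v2.7.0.post2
git -C "$repo" checkout -q -b v2.7.0.post2 v2.7.0.post2
git -C "$repo" branch --show-current   # prints: v2.7.0.post2
```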

- Download `WindowsWhlBuilder_cuda.bat` into the `flash-attention` directory

- To build with MSVC, open the "Native Tools Command Prompt for Visual Studio". The exact name depends on your version of Windows, Visual Studio, and CPU architecture (in my case it was "x64 Native Tools Command Prompt for VS 2022").
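A quick way to confirm you are in the right prompt is to check that the MSVC compiler (`cl.exe`) is on `PATH`, which is exactly what the Native Tools prompt sets up. A minimal sketch (the helper name is mine, not part of the build script):

```python
# Sketch: check whether cl.exe (the MSVC compiler) is reachable on PATH.
# If it is not, the flash-attention build will fail; open the
# "Native Tools Command Prompt" instead of a plain terminal.
import shutil

def msvc_compiler_path():
    """Return the path to cl.exe if it is on PATH, else None."""
    return shutil.which("cl")

path = msvc_compiler_path()
print(path or "cl.exe not found - open the Native Tools Command Prompt first")
```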

- Switch to the target Python environment and make sure the torch build matching your CUDA version is installed
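A small sketch to sanity-check that environment before starting the long build (the helper name is illustrative; it reports whether torch is present and which CUDA build it was compiled against):

```python
# Sketch: report the active Python and, if installed, the torch version and
# its CUDA build, so mismatches are caught before a ~30 minute compile.
import importlib.util
import sys

def build_env_status():
    """Return (python_version, torch_version_or_None, torch_cuda_or_None)."""
    py = sys.version.split()[0]
    if importlib.util.find_spec("torch") is None:
        return py, None, None
    import torch
    return py, torch.__version__, torch.version.cuda

py, torch_ver, cuda_ver = build_env_status()
print(f"python {py}, torch {torch_ver}, cuda {cuda_ver}")
```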

- Start the task:
```
# Build with 1 parallel worker (I used 8 workers on an i9-14900KF @ 3.20 GHz with 64 GB RAM, which took about 30 minutes.)
# To change the number of workers, edit `WindowsWhlBuilder_cuda.bat` and modify `set MAX_JOBS=1`. (I tried to set it via a parameter, but that failed.)
WindowsWhlBuilder_cuda.bat

# Enable cxx11abi
WindowsWhlBuilder_cuda.bat FORCE_CXX11_ABI=TRUE
```
- The wheel file will be placed in the `dist` directory
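To pick up the result programmatically, a tiny sketch (the helper name and the `dist` default are assumptions matching the step above, not part of the build script):

```python
# Sketch: find the newest wheel in dist/ and print the pip command that
# would install it. Adjust dist_dir if you built elsewhere.
from pathlib import Path

def wheel_install_command(dist_dir="dist"):
    """Return 'pip install <wheel>' for the newest wheel, or None if absent."""
    wheels = sorted(Path(dist_dir).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    if not wheels:
        return None
    return f"pip install {wheels[-1]}"

print(wheel_install_command() or "no wheel found - run the build first")
```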