zhengxuanzenwu committed on
Commit 39aa4c0 · verified · 1 Parent(s): 6030781

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -8,6 +8,8 @@ AxBench evaluates interpretability methods in terms of concept detection and mod
 
 # 2. What is `gemma-diffmean-9b-it-res`?
 
+It is a single dictionary of subspaces for 16K concepts and serves as a drop-in replacement for SAEs.
+
 - `gemma-`: Refer to Gemma 2 models
 - `diffmean-` : The dictionary learning model is taking the difference in mean between two contrastive groups.
 - `9b-it-`: The dictionary is for Gemma 2 9B instruction-tuning model
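
The `diffmean-` entry describes taking the difference in mean between two contrastive groups. A minimal sketch of that idea follows. It assumes each group is a list of equal-length activation vectors, one group collected with a concept present and one without. The function name `diff_in_means` and the toy numbers are hypothetical and are not taken from the repository or from AxBench.

```python
def diff_in_means(positives, negatives):
    """Return mean(positives) - mean(negatives), computed per dimension.

    Both arguments are lists of equal-length lists of floats
    (an assumed stand-in for model activations).
    """
    if not positives or not negatives:
        raise ValueError("both groups must be non-empty")
    dim = len(positives[0])

    def mean(rows):
        # Average each coordinate across all rows in the group.
        return [sum(row[i] for row in rows) / len(rows) for i in range(dim)]

    pos_mean = mean(positives)
    neg_mean = mean(negatives)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


# Toy 3-dimensional "activations" (made-up numbers, not real Gemma 2 data).
with_concept = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
without_concept = [[1.0, 0.0, 1.0], [1.0, 0.0, 1.0]]

direction = diff_in_means(with_concept, without_concept)
print(direction)  # [2.0, 0.0, 0.0]
```

In this toy case only the first coordinate differs between the two groups, so the resulting direction is non-zero only there. How the released dictionary normalizes or stores such vectors is not stated in the diff.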