martingenzel committed
Commit f50e653 · verified · 1 Parent(s): 9de62de

Add README.md

Files changed (1):
  1. README.md +7 -10
README.md CHANGED

@@ -1,12 +1,9 @@
 ---
 license: other
-datasets:
-- allenai/c4
-language:
-- en
-metrics:
-- perplexity
-- accuracy
+datasets: ['allenai/c4']
+language: ['en']
+metrics: ['perplexity', 'accuracy']
+tags: ['acip', 'pytorch']
 base_model:
 - jeffwan/llama-7b-hf
 pipeline_tag: text-generation
@@ -45,10 +42,10 @@ from transformers import AutoModel
 
 model = AutoModel.from_pretrained("MerantixMomentum/acip_llama1_7b", trust_remote_code=True)
 ```
-This will download and create a fully parameterized ACIP model that can be pruned to any compression ratio you wish.
+This will download and create a fully parameterized ACIP model that can be pruned to any compression rate you wish.
 For example,
 ```python
-model.prune_model_by_score(compression_ratio=0.4)
+model.prune_model_by_score(size_ratio=0.4)
 ```
 will prune `model` to 40% of its original size measured in number of parameters, i.e., 60% compression rate.
 A unique feature of ACIP is that this operation is revertible in the sense that you can rerun `model.prune_model_by_score` as often as you like to evaluate your model at different sizes. Finally, you can "commit" to a certain ratio and run
@@ -65,7 +62,7 @@ to save even more memory (we have only tested 4bit quantization with `bitsandbyt
 
 **🚀 That's it! You can now use your compressed model for inference or fine-tuning as any other Causal Language Model from 🤗 transformers.**
 
-**Note**: The parameter `compression_ratio` ranges from 1.0 to 0.0, indicating the model size after compression. For example, 0.4 means that the model has only 40% of the original number of parameters and 1.0 means no compression at all.
+**Note**: The parameter `size_ratio` ranges from 1.0 to 0.0, indicating the model size after compression. For example, 0.4 means that the model has only 40% of the original number of parameters and 1.0 means no compression at all. Alternatively, you can also set `compression_rate` in `prune_model_by_score`, which is equivalent to `size_ratio = 1.0 - compression_rate`.
 
 # Dependencies
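
The renamed parameter in this diff follows the relationship `size_ratio = 1.0 - compression_rate` stated in the updated note. As a quick sanity check, that conversion can be sketched in plain Python (the helper function below is purely illustrative and not part of the ACIP API):

```python
def size_ratio_from_compression_rate(compression_rate: float) -> float:
    """Illustrative helper (not an ACIP function): convert a compression
    rate into the equivalent size_ratio via size_ratio = 1.0 - compression_rate."""
    if not 0.0 <= compression_rate <= 1.0:
        raise ValueError("compression_rate must lie in [0.0, 1.0]")
    return 1.0 - compression_rate

# A 60% compression rate keeps 40% of the parameters:
print(size_ratio_from_compression_rate(0.6))  # 0.4
```

So calling `prune_model_by_score(size_ratio=0.4)` and `prune_model_by_score(compression_rate=0.6)` should describe the same target model size.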