Hemanth-thunder committed
Commit 1b4786f · verified · 1 parent: 950f753

Created README.md file for model description
Files changed (1): README.md (added, +47 −0)
---
license: apache-2.0
pipeline_tag: text-generation
language:
- ta
tags:
- pretrained
inference:
  parameters:
    temperature: 0.7
datasets:
- Hemanth-thunder/tamil-madlad-400
---
# Model Card for Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

For full details of this model, please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).
## Model Architecture

Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
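To illustrate the sliding-window attention choice above, here is a minimal sketch (not Mistral's actual implementation) of the boolean attention mask it implies: each query position attends causally, but only to the last `window` key positions. The function name and the convention that the window includes the current token are assumptions for illustration.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    # mask[i][j] is True when query position i may attend to key position j:
    # j must not be in the future (causal), and j must lie within the last
    # `window` positions, counting the current token itself.
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]
```

With Mistral's reported window of 4096, each token attends to at most the previous 4096 positions, which caps the per-token attention cost regardless of total sequence length.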
## Troubleshooting

- If you see the following error:
```
KeyError: 'mistral'
```
- Or:
```
NotImplementedError: Cannot copy out of meta tensor; no data!
```

Ensure you are using a stable version of Transformers, 4.34.0 or newer.

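Both errors typically mean the installed `transformers` release predates Mistral support. A quick programmatic check is sketched below; the helper name is ours, and the simple parser assumes a plain `x.y.z` version string.

```python
def supports_mistral(transformers_version: str) -> bool:
    # The "mistral" model type was added in transformers 4.34.0, so any older
    # release raises KeyError: 'mistral' when loading the model config.
    # Assumes a plain "x.y.z" version string (no rc/dev suffixes).
    parts = tuple(int(p) for p in transformers_version.split(".")[:3])
    return parts >= (4, 34, 0)
```

Usage: `import transformers; supports_mistral(transformers.__version__)` — upgrade with `pip install -U transformers` if it returns `False`.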
## Notice

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.
## The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.