Update README.md
README.md CHANGED
@@ -25,7 +25,7 @@ To further enhance the model's multimodal capabilities, we employ trainable spec
 2. Once the adapter has learned to map ViT's visual embeddings to the language model's textual space, we proceed to unfreeze Mistral for improved understanding of dialog formats and complex queries.
 
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/datasets.png" width="
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/datasets.png" width="50%">
 </p>
 
 ### Results
@@ -45,7 +45,7 @@ Model Performance on Visual Dialog Benchmark
 ### Examples
 
 <p align="left">
-<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/examples.png" width="
+<img src="https://raw.githubusercontent.com/AIRI-Institute/OmniFusion/main/content/examples.png" width="70%">
 </p>
 
 ### Future Plans
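For context on the staged training mentioned in the README excerpt above (train the adapter with the language model frozen, then unfreeze Mistral), below is a minimal PyTorch-style sketch of that freeze/unfreeze pattern. It is not the OmniFusion implementation: the `VisualAdapter` module, the stand-in language model, and the learning rates are illustrative assumptions only.

```python
# Minimal sketch (not the OmniFusion code) of two-stage training:
# stage 1 trains only the adapter with the LLM frozen; stage 2 unfreezes the LLM.
import torch
import torch.nn as nn


class VisualAdapter(nn.Module):
    """Hypothetical MLP adapter mapping ViT embeddings into the LLM's hidden space."""

    def __init__(self, vit_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vit_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, visual_embeds: torch.Tensor) -> torch.Tensor:
        return self.proj(visual_embeds)


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze all parameters of a module."""
    for p in module.parameters():
        p.requires_grad_(trainable)


# Stage 1: adapter-only training, language model frozen.
adapter = VisualAdapter()
language_model = nn.TransformerEncoder(  # stand-in for Mistral, for illustration
    nn.TransformerEncoderLayer(d_model=4096, nhead=32, batch_first=True),
    num_layers=2,
)
set_trainable(language_model, False)
optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-4)

# Stage 2: unfreeze the language model and continue training both components.
set_trainable(language_model, True)
optimizer = torch.optim.AdamW(
    list(adapter.parameters()) + list(language_model.parameters()), lr=2e-5
)
```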