[🏡 Project Page](https://garygutc.github.io/UniME) | [📄 Paper](https://arxiv.org/pdf/2504.17432) | [💻 Github](https://github.com/deepglint/UniME)

UniME achieves the top ranking on the MMEB leaderboard when trained at a 336×336 image resolution. (Screenshot captured at 08:00 UTC+8 on May 6, 2025.)

<p align="center">
<img src="figures/MMEB.png">
</p>

## 💡 Highlights

<p align="center">
<img src="figures/fig1.png">
</p>

To enhance the MLLM's embedding capability, we propose textual discriminative knowledge distillation. The training process involves decoupling the MLLM's LLM component and processing text with the prompt "Summarize the above sentences in one word.", followed by aligning the student (MLLM) and teacher (NV-Embed V2) embeddings via KL divergence on batch-wise similarity distributions. **Notably, only the LLM component is fine-tuned during this process, while all other parameters remain frozen**.
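
A minimal sketch of this distillation objective, assuming PyTorch (the function name, the temperature value, and the exact KL formulation are illustrative assumptions, not the released training code):

```python
import torch
import torch.nn.functional as F

def text_distill_loss(student_emb: torch.Tensor,
                      teacher_emb: torch.Tensor,
                      temperature: float = 0.05) -> torch.Tensor:
    """KL divergence between batch-wise similarity distributions.

    student_emb / teacher_emb: (batch, dim) embeddings of the same batch of
    sentences, each prompted with "Summarize the above sentences in one word.",
    from the student (the MLLM's LLM component) and the teacher (NV-Embed V2).
    The two embedding dimensions may differ; only batch-wise similarities
    are compared.
    """
    # L2-normalize so dot products become cosine similarities.
    s = F.normalize(student_emb, dim=-1)
    t = F.normalize(teacher_emb, dim=-1)

    # Batch-wise similarity matrices: row i holds sentence i's similarity
    # to every sentence in the batch.
    student_sim = s @ s.t() / temperature
    teacher_sim = t @ t.t() / temperature

    # Align the student's row-wise distribution with the teacher's.
    log_p_student = F.log_softmax(student_sim, dim=-1)
    p_teacher = F.softmax(teacher_sim, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean")
```

During this stage only the LLM component's parameters would receive gradients; the vision encoder and projector would stay frozen, e.g. via `requires_grad_(False)` before training.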
<p align="center">