Update README.md

README.md CHANGED

@@ -29,11 +29,7 @@ The HuggingFace `transformers` 🤗 implementation was contributed by Tony Wu ([
 
 ## Model Description
 
-
-We finetuned it to create [BiSigLIP](https://huggingface.co/vidore/bisiglip) and fed the patch-embeddings output by SigLIP to an LLM, [PaliGemma-3B](https://huggingface.co/google/paligemma-3b-mix-448), to create [BiPali](https://huggingface.co/vidore/bipali).
-
-One benefit of inputting image patch embeddings through a language model is that they are natively mapped to a latent space similar to textual input (query).
-This enables leveraging the [ColBERT](https://arxiv.org/abs/2004.12832) strategy to compute interactions between text tokens and image patches, which enables a step-change improvement in performance compared to BiPali.
+Read the `transformers` 🤗 model card: https://huggingface.co/docs/transformers/en/model_doc/colpali.
 
 ## Model Training
 
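The removed paragraph mentions the [ColBERT](https://arxiv.org/abs/2004.12832) late-interaction strategy between text tokens and image patches. As a minimal sketch of that idea (the function name, array shapes, and the unnormalized dot-product similarity are illustrative assumptions, not the exact scoring used by the model), the MaxSim operation sums, over query tokens, the maximum similarity to any document patch:

```python
import numpy as np

def late_interaction_score(query_embs: np.ndarray, doc_embs: np.ndarray) -> float:
    """ColBERT-style MaxSim: for each query-token embedding, take the
    maximum similarity over all document (image-patch) embeddings,
    then sum those per-token maxima into a single relevance score."""
    sims = query_embs @ doc_embs.T      # (n_query_tokens, n_patches)
    return float(sims.max(axis=1).sum())

# Toy example: 2 query-token embeddings, 3 patch embeddings, dim 2.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
d = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
score = late_interaction_score(q, d)  # per-token maxima 2.0 and 3.0 -> 5.0
```

Because each query token matches its best patch independently, this interaction is finer-grained than comparing a single pooled query vector against a single pooled document vector, which is the intuition behind the performance gap over BiPali described in the diff.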