allen-li1231 and nielsr (HF Staff) committed
Commit 4ae4690 (verified) · 1 parent: 2fa3291

Add pipeline tag text-ranking (#1)


- Add pipeline tag text-ranking (9753821b1cdcb9de65bd14e5c493e996ef310471)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +6 -3
README.md CHANGED
```diff
@@ -1,21 +1,22 @@
 ---
+base_model:
+- BAAI/bge-m3
 library_name: treehop-rag
 license: mit
+pipeline_tag: text-ranking
 tags:
 - Information Retrieval
 - Retrieval-Augmented Generation
 - model_hub_mixin
 - multi-hop question answering
 - pytorch_model_hub_mixin
-base_model:
-- BAAI/bge-m3
 ---
 
-
 # TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering
 
 
 [![arXiv](https://img.shields.io/badge/arXiv-2504.20114-b31b1b.svg?style=flat)](https://arxiv.org/abs/2504.20114)
+[![HuggingFace](https://img.shields.io/badge/HuggingFace-Model-blue.svg)](https://huggingface.co/allen-li1231/treehop-rag)
 [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://img.shields.io/badge/license-MIT-blue)
 [![Python 3.9+](https://img.shields.io/badge/Python-3.9+-green.svg)](https://www.python.org/downloads/)
 
@@ -36,6 +37,7 @@ base_model:
 ## Introduction
 TreeHop is a lightweight, embedding-level framework designed to address the computational inefficiencies of the traditional recursive retrieval paradigm in Retrieval-Augmented Generation (RAG). By eliminating the need for iterative LLM-based query rewriting, TreeHop significantly reduces latency while maintaining state-of-the-art performance. It achieves this through dynamic query embedding updates and pruning strategies, enabling a streamlined "Retrieve-Embed-Retrieve" workflow.
 
+![Simplified Iteration Enabled by TreeHop in RAG system](pics/TreeHop_iteration.png)
 
 ## Why TreeHop for Multi-hop Retrieval?
 - **Handle Complex Queries**: Real-world questions often require multiple hops to retrieve relevant information, which traditional retrieval methods struggle with.
@@ -43,6 +45,7 @@ TreeHop is a lightweight, embedding-level framework designed to address the comp
 - **Speed**: 99% faster inference compared to iterative LLM approaches, ideal for industrial applications where response speed is crucial.
 - **Performant**: Maintains high recall with a controlled number of retrieved passages, ensuring relevance without overwhelming the system.
 
+![Main Experiment](pics/main_experiment.png)
 
 
 ## System Requirement
```
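For reference, applying this change yields the following README front matter (assembled directly from the diff; the `pipeline_tag` and `base_model` fields are the new additions):

```yaml
---
base_model:
- BAAI/bge-m3
library_name: treehop-rag
license: mit
pipeline_tag: text-ranking
tags:
- Information Retrieval
- Retrieval-Augmented Generation
- model_hub_mixin
- multi-hop question answering
- pytorch_model_hub_mixin
---
```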