Update README.md
README.md
CHANGED
@@ -15,15 +15,15 @@ We fine-tuned the [Open-Orca/OpenOrca-Platypus2-13B](https://huggingface.co/Open
 Its performance is competitive, rivaling previous state-of-the-art algorithms and LLMs such as OpenAI's GPT 3.5 and GPT 4 ([as demonstrated in our earlier studies](https://arxiv.org/abs/2308.16361)).
 It is notable that, as a 13B model, Jellyfish allows for cost-effective local execution without compromising data security.

-We release two distinct versions of Jellyfish: Jellyfish-13B (the main branch) and Jellyfish-13B-
+We release two distinct versions of Jellyfish: Jellyfish-13B (the main branch) and Jellyfish-13B-Interpreter.
 As the names suggest, Jellyfish-13B is tailored to deliver precise, straightforward answers.
-In contrast, Jellyfish-13B-
+In contrast, Jellyfish-13B-Interpreter, is fine-tuned with data that includes reasoning and sequential thought processes for handling data preprocessing tasks, distilling knowledge from GPT-4.

 The two versions are designed for different application scenarios.
 Jellyfish-13B is suitable for integration into larger data management systems due to its simple and clear responses that can be easily transformed into code.
-On the other hand, Jellyfish-13B-
+On the other hand, Jellyfish-13B-Interpreter is more user-oriented, with responses that provide them with in-depth data insights without the necessity for advanced coding skills or an intricate grasp of statistics.

-| Task | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B| Jellyfish-13B-
+| Task | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B| Jellyfish-13B-Interpreter | Jellyfish-13B-1.1<sup>3</sup> |
 | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
 | Entity Matching | Fodors-Zagats | 100 | 100 | 100 | 100 | 100 | 100 |
 | Entity Matching | Beer | 94.37| 96.30 | 100 | 93.33 | 100 | 96.55 |
@@ -39,7 +39,7 @@ On the other hand, Jellyfish-13B-Reasoning is more user-oriented, with responses
 | Schema Matching | Sythea | 38.50| 57.14 | 66.67 | 36.36 | 30.77 | NA |

 _Accuracy as the metric for data imputation and the F1 score for other tasks._
-_For GPT-3.5, GPT-4 we used the few-shot approach, while for Jellyfish and Jellyfish-
+_For GPT-3.5, GPT-4 we used the few-shot approach, while for Jellyfish and Jellyfish-Interpreter, the zero-shot approach was employed._
 1.
 [Ditto](https://arxiv.org/abs/2004.00584) for Entity Matching
 [SMAT](https://www.researchgate.net/publication/353920530_SMAT_An_Attention-Based_Deep_Learning_Solution_to_the_Automation_of_Schema_Matching) for Schema Matching
@@ -136,7 +136,7 @@ Attribute B is [name: {value of name}, description: {value of description}].
 Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].
 ```

-### JellyFish-13B-
+### JellyFish-13B-Interpreter
 #### For Entity Matching
 ```
 You are tasked with determining whether two products listed below are the same based on the information provided.
@@ -185,7 +185,7 @@ Attribute B is [name: {value of name}, description: {value of description}].
 After your reasoning, finish your response in a separate line with and ONLY with your final answer. Choose your final answer from [Yes, No].
 ```

-## Sample Responses from Jellyfish-13B-
+## Sample Responses from Jellyfish-13B-Interpreter
 We provide a few sample responses from Jellyfish-13B-Reasoning to demonstrate its performance.

 _For easier readability, we display the raw data record instead of the entire prompt._
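
The Jellyfish-13B-Interpreter prompt in the diff above asks the model to finish its response on a separate line containing only the final answer, chosen from [Yes, No]. A caller could pull that answer out roughly as follows. This is a minimal sketch and not part of the README: the function name `parse_final_answer` is hypothetical, and the tolerance for trailing punctuation or quotes is an assumption about how real responses may deviate from the instruction.

```python
from typing import Optional


def parse_final_answer(response: str) -> Optional[str]:
    """Return "Yes" or "No" taken from the last non-empty line, else None.

    Assumption: the model follows the prompt and puts only the final
    answer on its own last line; reasoning text comes before it.
    """
    # Drop blank lines so trailing newlines do not hide the answer line.
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    if not lines:
        return None
    # Tolerate minor decoration such as "No." or "**Yes**" (an assumption).
    last = lines[-1].strip(" .*\"'").lower()
    if last == "yes":
        return "Yes"
    if last == "no":
        return "No"
    # The last line is not a bare answer; let the caller decide what to do.
    return None
```

Returning `None` instead of guessing keeps malformed responses visible to the caller, which matters if the answers feed an evaluation such as the F1 scores reported in the table.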