NECOUDBFM/Jellyfish-13B

Text Generation

Transformers

PyTorch

English

llama

text-generation-inference

HCZhang committed on Oct 26, 2023

Commit fd9f8a3

1 Parent(s): 5a66fdc

Update README.md

Files changed (1)

README.md +7 -7

@@ -15,15 +15,15 @@ We fine-tuned the [Open-Orca/OpenOrca-Platypus2-13B](https://huggingface.co/Open
 Its performance is competitive, rivaling previous state-of-the-art algorithms and LLMs such as OpenAI's GPT 3.5 and GPT 4 ([as demonstrated in our earlier studies](https://arxiv.org/abs/2308.16361)).
 It is notable that, as a 13B model, Jellyfish allows for cost-effective local execution without compromising data security.
-We release two distinct versions of Jellyfish: Jellyfish-13B (the main branch) and Jellyfish-13B-Reasoning.
 As the names suggest, Jellyfish-13B is tailored to deliver precise, straightforward answers.
-In contrast, Jellyfish-13B-Reasoning, is fine-tuned with data that includes reasoning and sequential thought processes for handling data preprocessing tasks, distilling knowledge from GPT-4.
 The two versions are designed for different application scenarios.
 Jellyfish-13B is suitable for integration into larger data management systems due to its simple and clear responses that can be easily transformed into code.
-On the other hand, Jellyfish-13B-Reasoning is more user-oriented, with responses that provide them with in-depth data insights without the necessity for advanced coding skills or an intricate grasp of statistics.
-|  Task  | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B| Jellyfish-13B-Resoning | Jellyfish-13B-1.1<sup>3</sup> |
 | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
 | Entity Matching  | Fodors-Zagats  | 100  | 100 | 100 | 100 | 100 | 100 |
 | Entity Matching  | Beer           | 94.37| 96.30 | 100 | 93.33 | 100 | 96.55 |
@@ -39,7 +39,7 @@ On the other hand, Jellyfish-13B-Reasoning is more user-oriented, with responses
 | Schema Matching  |  Sythea        | 38.50| 57.14 | 66.67 | 36.36 | 30.77 | NA |
 _Accuracy as the metric for data imputation and the F1 score for other tasks._
-_For GPT-3.5, GPT-4 we used the few-shot approach, while for Jellyfish and Jellyfish-Reasoning, the zero-shot approach was employed._
 1.
   [Ditto](https://arxiv.org/abs/2004.00584) for Entity Matching
   [SMAT](https://www.researchgate.net/publication/353920530_SMAT_An_Attention-Based_Deep_Learning_Solution_to_the_Automation_of_Schema_Matching) for Schema Matching
@@ -136,7 +136,7 @@ Attribute B is [name: {value of name}, description: {value of description}].
 Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].
 ```
-### JellyFish-13B-reasoning
 #### For Entity Matching
 ```
 You are tasked with determining whether two products listed below are the same based on the information provided.
@@ -185,7 +185,7 @@ Attribute B is [name: {value of name}, description: {value of description}].
 After your reasoning, finish your response in a separate line with and ONLY with your final answer. Choose your final answer from [Yes, No].
 ```
-## Sample Responses from Jellyfish-13B-Reasoning
 We provide a few sample responses from Jellyfish-13B-Reasoning to demonstrate its performance.
 _For easier readability, we display the raw data record instead of the entire prompt._

 Its performance is competitive, rivaling previous state-of-the-art algorithms and LLMs such as OpenAI's GPT 3.5 and GPT 4 ([as demonstrated in our earlier studies](https://arxiv.org/abs/2308.16361)).
 It is notable that, as a 13B model, Jellyfish allows for cost-effective local execution without compromising data security.
+We release two distinct versions of Jellyfish: Jellyfish-13B (the main branch) and Jellyfish-13B-Interpreter.
 As the names suggest, Jellyfish-13B is tailored to deliver precise, straightforward answers.
+In contrast, Jellyfish-13B-Interpreter, is fine-tuned with data that includes reasoning and sequential thought processes for handling data preprocessing tasks, distilling knowledge from GPT-4.
 The two versions are designed for different application scenarios.
 Jellyfish-13B is suitable for integration into larger data management systems due to its simple and clear responses that can be easily transformed into code.
+On the other hand, Jellyfish-13B-Interpreter is more user-oriented, with responses that provide them with in-depth data insights without the necessity for advanced coding skills or an intricate grasp of statistics.
+|  Task  | Dataset | Non-LLM SoTA<sup>1</sup> | GPT-3.5<sup>2</sup> | GPT-4<sup>2</sup> | Jellyfish-13B| Jellyfish-13B-Interpreter | Jellyfish-13B-1.1<sup>3</sup> |
 | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
 | Entity Matching  | Fodors-Zagats  | 100  | 100 | 100 | 100 | 100 | 100 |
 | Entity Matching  | Beer           | 94.37| 96.30 | 100 | 93.33 | 100 | 96.55 |
 | Schema Matching  |  Sythea        | 38.50| 57.14 | 66.67 | 36.36 | 30.77 | NA |
 _Accuracy as the metric for data imputation and the F1 score for other tasks._
+_For GPT-3.5, GPT-4 we used the few-shot approach, while for Jellyfish and Jellyfish-Interpreter, the zero-shot approach was employed._
 1.
   [Ditto](https://arxiv.org/abs/2004.00584) for Entity Matching
   [SMAT](https://www.researchgate.net/publication/353920530_SMAT_An_Attention-Based_Deep_Learning_Solution_to_the_Automation_of_Schema_Matching) for Schema Matching
 Are Attribute A and Attribute B semantically equivalent? Choose your answer from: [Yes, No].
 ```
+### JellyFish-13B-Interpreter
 #### For Entity Matching
 ```
 You are tasked with determining whether two products listed below are the same based on the information provided.
 After your reasoning, finish your response in a separate line with and ONLY with your final answer. Choose your final answer from [Yes, No].
 ```
+## Sample Responses from Jellyfish-13B-Interpreter
 We provide a few sample responses from Jellyfish-13B-Reasoning to demonstrate its performance.
 _For easier readability, we display the raw data record instead of the entire prompt._
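
The Jellyfish-13B-Interpreter prompts in this diff ask the model to finish its response with a separate line containing only the final answer, chosen from [Yes, No]. The following is a minimal sketch of how downstream code might pull that answer out of a response, assuming the response arrives as plain text. The helper name `extract_final_answer` is hypothetical and is not part of the Jellyfish repository.

```python
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return "Yes" or "No" from the last non-empty line, or None if absent.

    Assumption: the model follows the prompt and puts the final answer on
    its own line at the end; a trailing period is tolerated.
    """
    # Keep only non-empty lines so trailing blank lines do not hide the answer.
    lines = [line.strip() for line in response.splitlines() if line.strip()]
    if not lines:
        return None
    last = lines[-1].rstrip(".").strip().lower()
    if last == "yes":
        return "Yes"
    if last == "no":
        return "No"
    # The last line is not a bare Yes/No, so the response did not follow the format.
    return None
```

Returning `None` instead of guessing lets the caller decide whether to retry or flag the record when the model does not follow the requested format.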