base_model: BAAI/bge-m3
tags:
  - datadreamer
  - datadreamer-0.46.0
  - synthetic
  - sentence-transformers
  - feature-extraction
  - sentence-similarity
library_name: sentence-transformers
pipeline_tag: sentence-similarity
widget:
  - example_title: Example 1
    source_sentence: >-
      Tammy Tran, an undergraduate student in the Preston Lab, received the
      Rapaport-King Thesis Scholarship from the College of Liberal Arts. The
      Rapaport-King Scholarship is awarded to Honors Program students in the
      College of Liberal Arts who are conducting research and writing a senior
      thesis. Tammy also received an Undergraduate Research Fellowship awarded
      to students for […]

      Dr. Ila Fiete recently published her paper entitled “Fundamental limits on
      persistent activity in networks of noisy neurons” in PNAS. The research
      investigates memory, diffusion and information-diffusion inequality in the
      brain. Y. Burak and I. R. Fiete. (2012). Fundamental limits on persistent
      activity in networks of noisy neurons. PNAS Early Edition (Oct. 9).

      Dr. Hiroshi Nishiyama received a R01 grant from the NINDS for his project
      entitled “CNS Mechanisms of developmental synapse elimination”. This
      project investigates how the precision of synaptic circuitry is created in
      the developing mammalian brain by observing the process in the intact,
      live animals.

      John Widloski, graduate student in the Fiete Lab, received the
      Burroughs-Welcome Fund Award to attend the Methods in Computational
      Neuroscience summer course in Woods Hole, MA.

      Akram Bakkor, graduate student in the Poldrack lab, received a National
      Defense Science & Engineering Graduate Fellowship. This fellowship
      supports Akram’s research project investigating the neural mechanisms
      underlying how learned behaviors are changed. Using behavioral testing,
      modeling and fMRI analyses on human subjects the project will shed light
      on why habits can be so difficult […]

      Dr. Boris Zemelman is the recipient of a Human Frontiers in Science
      Program Grant. This grant entitled “In vivo functional imaging and
      high-resolution manipulations of hippocampal memory circuits” is a
      collaborative project that will investigate how the brain encodes and
      processes spatial memory. Dr. Zemelman will use genetic tools for
      activation and silencing neurons and in […]
    sentences:
      - >-
        A document that provides information about academic achievements,
        research funding, and scholarly publications of students and faculty
        members in a specific institution or department, such as the Preston
        Lab, Fiete Lab, or Poldrack lab, would be relevant. The document should
        contain specific details about the awards, scholarships, or grants
        received by the individuals, including the name of the award, the
        recipient, and the purpose or focus of the research project, to allow
        for a determination of the areas of study and research interests. This
        could include announcements, press releases, or news articles from
        academic institutions, research organizations, or scientific journals,
        and should provide a clear explanation of the research projects,
        enabling a reader to understand the objectives, methods, and
        significance of the studies. Additionally, the document would include
        information about the research topics, such as memory, diffusion, and
        information-diffusion inequality in the brain, or the neural mechanisms
        underlying learned behaviors, and would discuss the methodologies and
        techniques used, such as behavioral testing, modeling, and fMRI
        analyses. The document should also provide information about the funding
        sources, such as the Rapaport-King Thesis Scholarship, the National
        Defense Science & Engineering Graduate Fellowship, or the Human
        Frontiers in Science Program Grant, and explain the criteria or
        selection process for these awards. Furthermore, the document would
        describe the collaborative projects, such as the investigation of how
        the brain encodes and processes spatial memory, and the use of genetic
        tools for activation and silencing neurons, and would discuss the
        potential impact or contributions of the research to the field. Overall,
        a document that provides a comprehensive and detailed account of the
        academic achievements, research funding, and scholarly publications of
        students and faculty members would be able to provide an answer to the
        question.
  - example_title: Example 2
    source_sentence: >-
      Question: Yang, could you tell about yourself?

      Yang: I was born in Nanjing, now I live in the capital of China - Beijing.
      When I was 8, my father brought me to a chess center in Nanjing. There
      were three kinds of chess: Chinese chess, chess and I-go. We decided to
      choose chess: despite the popularity of Chinese chess in our country, they
      are not popular abroad.

      Now I study in Tsinghua University University, which is one of our best,
      at economics and management faculty. I am the second-year student.

      Q: Will you choose economics or chess as your main profession?

      Yang: I used to be a professional chessplayer, but now I spend some time
      for studying. I will make the final decision after my graduation. If I can
      improve my level, I will go on playing chess.

      Q: How do you divide your time between chess and other things?

      Yang: I spend half of my time on chess and half on study.

      Q: What are you interested in?

      Yang: I like to read, listen to the music and write stories. When I was in
      my childhood, I wrote some cartoons, flesh-stories. Now I write novels.

      Q: What are your preferences in the literature and music?

      Yang: Light and classic music. About literature: usually I prefer Chinese
      books. Recently I got very interested in environment subjects. I learn
      some materials and environment issues.

      Q: Do you read some chess literature?

      Yang: Very few.

      Q: Do you take any sports activities?

      Yang: I do a little yoga. I like swimming, but I cannot swim often.

      Q: You travel a lot - which country do you like most?

      Yang: I like all the countries I have visited. Every place has its beauty,
      its own unique culture and rich history. Human history.

      Q: Do you collect any information about the new country before your visit?

      Yang: Yes, sometimes when I check it in the Internet.

      Q: Do you have some goals for the nearest future?

      Yang: My main goal is connected with chess: I have some problems in my
      career. I always blunder in good positions. It lasts the last several
      years. My goal is to cover it.
    sentences:
      - >-
        A document that provides a personal and introspective account of an
        individual's life, interests, and goals, particularly focusing on their
        background, education, and passions, would be suitable. The document
        should contain detailed information about the individual's birthplace,
        current residence, and educational institution, as well as their field
        of study and faculty, and should discuss their early introduction to
        chess and their decision to pursue it despite its relatively low
        popularity abroad. It should also delve into the individual's
        profession, including their experience as a professional chess player
        and their current balance between studying and playing chess, as well as
        their future plans and aspirations. Additionally, the document should
        explore the individual's hobbies and interests, including reading,
        listening to music, and writing stories, and should provide insight into
        their preferences in literature and music, including their fondness for
        light and classic music and Chinese books. The document should also
        touch on the individual's sports activities, such as yoga and swimming,
        and their travel experiences, including their approach to learning about
        new countries before visiting them. Furthermore, the document should
        discuss the individual's goals and challenges, particularly in relation
        to their chess career, including their struggles with blundering in good
        positions and their desire to improve. The document would offer a
        comprehensive and personal portrait of the individual, including their
        thoughts, feelings, and experiences, and would provide a unique
        perspective on their life and aspirations. Additionally, the document
        would be written in a conversational style, with a question-and-answer
        format, making it an engaging and relatable read. Overall, the document
        should provide a nuanced and detailed understanding of the individual's
        life, interests, and goals, allowing readers to gain insight into their
        thoughts, feelings, and experiences.
  - example_title: Example 3
    source_sentence: >-
      A document that provides guidance on the self-moderation of the
      adventurous activity permit scheme within Scouting in the UK, would be
      relevant, and should include detailed information on the moderation
      process, the roles and responsibilities of Managers of the Activity Permit
      Scheme (MAPS) and County Commissioners, and the importance of ensuring the
      scheme's effectiveness and robustness. This document should offer a
      comprehensive overview of the moderation scheme, including its design,
      purpose, and benefits, and would cover the key aspects of the scheme, such
      as the minimum standards and good practice areas that Counties must adhere
      to, as well as the process for identifying and addressing areas for
      improvement. The document should also provide information on the County
      Self Moderation form, its structure, and how it is used to record and
      track progress, including the ability to record action plans for areas not
      met, and would explain the requirements for implementing action plans,
      particularly for minimum standards that are not met. Additionally, the
      document would discuss the role of the UK Activities Team in providing
      support to Counties that are not meeting one or more standards, and the
      process for requesting and receiving support, including the development of
      action plans and the provision of guidance and resources. Furthermore, the
      document should cover the sampling process, where a selection of
      self-moderations are reviewed each year, and would explain the purpose of
      this process, which is to identify trends, document the operation of the
      permit scheme, and demonstrate The Scout Association's ability to manage
      the provision of adventurous activities internally. Overall, the document
      should be a detailed and informative guide for MAPS, County Commissioners,
      and other stakeholders, providing a clear understanding of the
      self-moderation process and its importance in ensuring the safe and
      effective delivery of adventurous activities within Scouting.
    sentences:
      - >-
        A document that provides information on data parsing and extraction
        methods, specifically focusing on the efficient handling of
        <fi>description of the input data</fi> to obtain <fi>specific
        information or value</fi>, and discusses the use of <fi>programming
        language or tool</fi> for this purpose, would be suitable. This document
        should include examples or representations of input data, such as
        <fi>representation of the input data</fi>, and clearly outline the
        expected output or result, like `<fi>expected output or result</fi>`, to
        guide the extraction process. It may come from various domains,
        including but not limited to, computer science, data analysis, and
        software development, and could be in the form of a web page, article,
        book, or essay, as long as it offers detailed insights into efficient
        data parsing techniques and the application of specific programming
        languages or tools to achieve the desired outcome. Furthermore, the
        document should cover potential challenges or considerations in the
        parsing and extraction process, ensuring that the reader can adapt the
        methods to different scenarios involving <fi>description of the input
        data</fi> and <fi>programming language or tool</fi>. The document must
        also demonstrate how to work with the specified <fi>input data</fi> to
        produce the intended `<fi>expected output or result</fi>`, serving as a
        comprehensive resource for individuals seeking to efficiently parse and
        extract specific information using <fi>programming language or
        tool</fi>. Additionally, it should be able to discuss the relevance of
        efficiently parsing <fi>description of the input data</fi> and the
        benefits of using <fi>programming language or tool</fi> for the
        extraction of <fi>specific information or value</fi>, providing a
        well-rounded understanding of the topic. Overall, a suitable document
        would be one that not only provides technical guidance but also
        contextual understanding and practical applications of data parsing and
        extraction techniques.

Model Card

Add more information here

Example Usage

from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

model = SentenceTransformer('fineinstructions/matching_embedding') # Load model

input = model.encode('Tammy Tran, an undergraduate student in the Preston Lab, received the Rapaport-King Thesis Scholarship from the College of Liberal Arts. The Rapaport-King Scholarship is awarded to Honors Program students in the College of Liberal Arts who are conducting research and writing a senior thesis. Tammy also received an Undergraduate Research Fellowship awarded to students for […]\nDr. Ila Fiete recently published her paper entitled “Fundamental limits on persistent activity in networks of noisy neurons” in PNAS. The research investigates memory, diffusion and information-diffusion inequality in the brain. Y. Burak and I. R. Fiete. (2012). Fundamental limits on persistent activity in networks of noisy neurons. PNAS Early Edition (Oct. 9).\nDr. Hiroshi Nishiyama received a R01 grant from the NINDS for his project entitled “CNS Mechanisms of developmental synapse elimination”. This project investigates how the precision of synaptic circuitry is created in the developing mammalian brain by observing the process in the intact, live animals.\nJohn Widloski, graduate student in the Fiete Lab, received the Burroughs-Welcome Fund Award to attend the Methods in Computational Neuroscience summer course in Woods Hole, MA.\nAkram Bakkor, graduate student in the Poldrack lab, received a National Defense Science & Engineering Graduate Fellowship. This fellowship supports Akram’s research project investigating the neural mechanisms underlying how learned behaviors are changed. Using behavioral testing, modeling and fMRI analyses on human subjects the project will shed light on why habits can be so difficult […]\nDr. Boris Zemelman is the recipient of a Human Frontiers in Science Program Grant. This grant entitled “In vivo functional imaging and high-resolution manipulations of hippocampal memory circuits” is a collaborative project that will investigate how the brain encodes and processes spatial memory. Dr. Zemelman will use genetic tools for activation and silencing neurons and in […]')
others = model.encode(['A document that provides information about academic achievements, research funding, and scholarly publications of students and faculty members in a specific institution or department, such as the Preston Lab, Fiete Lab, or Poldrack lab, would be relevant. The document should contain specific details about the awards, scholarships, or grants received by the individuals, including the name of the award, the recipient, and the purpose or focus of the research project, to allow for a determination of the areas of study and research interests. This could include announcements, press releases, or news articles from academic institutions, research organizations, or scientific journals, and should provide a clear explanation of the research projects, enabling a reader to understand the objectives, methods, and significance of the studies. Additionally, the document would include information about the research topics, such as memory, diffusion, and information-diffusion inequality in the brain, or the neural mechanisms underlying learned behaviors, and would discuss the methodologies and techniques used, such as behavioral testing, modeling, and fMRI analyses. The document should also provide information about the funding sources, such as the Rapaport-King Thesis Scholarship, the National Defense Science & Engineering Graduate Fellowship, or the Human Frontiers in Science Program Grant, and explain the criteria or selection process for these awards. Furthermore, the document would describe the collaborative projects, such as the investigation of how the brain encodes and processes spatial memory, and the use of genetic tools for activation and silencing neurons, and would discuss the potential impact or contributions of the research to the field. Overall, a document that provides a comprehensive and detailed account of the academic achievements, research funding, and scholarly publications of students and faculty members would be able to provide an answer to the question.'])
print(cos_sim(input, others))
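
The `cos_sim` call above scores how closely the encoded document aligns with each encoded description. As a rough illustration of what that score measures, and of how several candidate descriptions could be ranked against one document, here is a minimal pure-Python sketch. The helper names `cosine` and `rank_candidates` and the toy 3-dimensional vectors are hypothetical stand-ins chosen for this example; they are not part of sentence-transformers, and real outputs of `model.encode` have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (assumption: illustration only).
document = [1.0, 0.0, 1.0]
descriptions = [
    [0.0, 1.0, 0.0],  # orthogonal to the document
    [2.0, 0.0, 2.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # partially aligned
]

ranking = rank_candidates(document, descriptions)
print(ranking)
```

Because cosine similarity depends only on direction, the second toy vector ranks first even though its length differs from the document vector. With real embeddings, the same ranking step could be applied to the rows of the score matrix that `cos_sim` returns.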

This model was trained on a synthetic dataset using DataDreamer 🤖💤. The synthetic dataset card and model card can be found here. The training arguments can be found here.