
Open-ASQA-Speech for R1-A

Now supports:

  • MOSEI
  • LibriTTS
  • IEMOCAP

Dataset Usage

MOSEI

You can access the data with datasets/affect/get_data.py from https://github.com/pliang279/MultiBench, which will return [vision, audio, text, ind, label].

# Example code (run from the MultiBench repository root so the import resolves)
from datasets.affect.get_data import get_dataloader

traindata, validdata, test_robust = get_dataloader('./mosei_raw.pkl', data_type='mosei')
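
A minimal sketch of consuming one training batch, assuming each batch follows the [vision, audio, text, ind, label] layout described above (exact tensor shapes depend on the dataloader options):

# Sketch: unpack a single batch according to the layout described above
for vision, audio, text, ind, label in traindata:
    print(vision.shape, audio.shape, text.shape, label.shape)
    break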

LibriTTS

LibriTTS is a multi-speaker English corpus of approximately 585 hours of read English speech at a 24 kHz sampling rate.

There are 7 splits (dots replace dashes from the original dataset to comply with HF naming requirements):

  • dev.clean dev.other
  • test.clean test.other
  • train.clean.100 train.clean.360 train.other.500

**Configurations**: The default configuration is "all".

  • "dev": only the "dev.clean" split (good for testing the dataset quickly)
  • "clean": contains only "clean" splits
  • "other": contains only "other" splits
  • "all": contains only "all" splits

# Example code
from datasets import load_dataset

load_dataset("{your path}/libritts", "clean", split="train.clean.100")
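
The "dev" configuration listed above is the quickest way to sanity-check the loader, since it only pulls the "dev.clean" split; a minimal sketch, reusing the same placeholder path:

# Example code (sketch: small "dev" configuration for a quick check)
from datasets import load_dataset

dev_ds = load_dataset("{your path}/libritts", "dev", split="dev.clean")
print(dev_ds)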

IEMOCAP

The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database is an acted, multimodal, multispeaker database collected at the SAIL lab at USC. It contains approximately 12 hours of audiovisual data, including video, speech, facial motion capture, and text transcriptions. The IEMOCAP database is annotated by multiple annotators with categorical labels, such as anger, happiness, sadness, and neutrality, as well as dimensional labels such as valence, activation, and dominance.
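
A purely illustrative sketch of handling these labels (the label names come from the description above; the actual annotation format and field names in this repo are not documented here):

# Hypothetical sketch: label handling based only on the description above
CATEGORICAL_EMOTIONS = ["anger", "happiness", "sadness", "neutrality"]
EMOTION_TO_ID = {name: i for i, name in enumerate(CATEGORICAL_EMOTIONS)}

DIMENSIONAL_LABELS = ["valence", "activation", "dominance"]  # continuous targets

print(EMOTION_TO_ID["sadness"])  # -> 2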
