ziyue / open-asqa-speech


ziyue committed on Mar 13

Commit d15450b · verified · 1 Parent(s): 50dc7f6

Create README.md

Files changed (1)

README.md +35 -0

README.md ADDED

	@@ -0,0 +1,35 @@

+# Open-ASQA-Speech for R1-A
+Currently supported datasets:
+- LibriTTS
+- MOSEI
+## Dataset Usage
+### MOSEI
+You can access the data with `datasets/affect/get_data.py` from `https://github.com/pliang279/MultiBench`; its `get_dataloader` function returns train, validation, and test dataloaders whose batches have the form [vision, audio, text, ind, label].
+``` python
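+# Example code (assumes you run it from the MultiBench repository root)
+from datasets.affect.get_data import get_dataloader
+traindata, validdata, test_robust = get_dataloader('./mosei_raw.pkl', data_type='mosei')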
+```
+### LibriTTS
+LibriTTS is a multi-speaker English corpus of approximately 585 hours of read speech at a 24 kHz sampling rate.
+There are 7 splits (dots replace the dashes used in the original dataset, to comply with Hugging Face naming requirements):
+- dev.clean dev.other
+- test.clean test.other
+- train.clean.100 train.clean.360 train.other.500
+**Configurations**
+The default configuration is "all".
+- "dev": only the "dev.clean" split (good for testing the dataset quickly)
+- "clean": contains only "clean" splits
+- "other": contains only "other" splits
+- "all": contains all splits
+``` python
+# Example code
+from datasets import load_dataset
+dataset = load_dataset("blabble-io/libritts", "clean", split="train.clean.100")
+```
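
The split naming and configuration rules in the LibriTTS section of the README can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `to_hub_split` and `splits_for_config` are hypothetical helpers, not part of the `datasets` library or of the dataset repository, and the configuration-to-split mapping simply follows the descriptions given in the README.

```python
# Original LibriTTS subset names, which use dashes.
ORIGINAL_SUBSETS = [
    "dev-clean", "dev-other",
    "test-clean", "test-other",
    "train-clean-100", "train-clean-360", "train-other-500",
]


def to_hub_split(name: str) -> str:
    """Convert an original subset name to its dotted split name."""
    return name.replace("-", ".")


def splits_for_config(config: str = "all") -> list:
    """Hypothetical helper: list the splits a configuration is described as containing."""
    splits = [to_hub_split(name) for name in ORIGINAL_SUBSETS]
    if config == "all":
        return splits
    if config == "dev":
        # Described in the README as only the "dev.clean" split.
        return ["dev.clean"]
    if config in ("clean", "other"):
        # Keep the splits whose dotted name has the config as one component.
        return [s for s in splits if config in s.split(".")]
    raise ValueError(f"unknown configuration: {config!r}")


print(splits_for_config("clean"))
# ['dev.clean', 'test.clean', 'train.clean.100', 'train.clean.360']
```

A helper like this can be used to check that a requested `split` argument belongs to the chosen configuration before calling `load_dataset`.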