Kimang18/whisper-tiny-khmer-mlx-fp32

@@ -1,28 +1,51 @@
 ---
 library_name: mlx
 license: apache-2.0
 datasets:
-- google/fleurs
-- seanghay/khmer_mpwt_speech
 - seanghay/km-speech-corpus
-- openslr/openslr
-metrics:
-- wer
 tags:
-- mlx
 - Khmer
 ---
 # whisper-tiny-khmer-mlx-fp32
-This model was converted to MLX format from [`openai-whisper-tiny`](https://github.com/openai/whisper), then fine tuned to Khmer language using two datasets:
-- [seanghay/khmer_mpwt_speech](https://huggingface.co/datasets/seanghay/khmer_mpwt_speech)
-- [seanghay/km-speech-corpus](https://huggingface.co/datasets/seanghay/km-speech-corpus)
 It achieves the following __word error rate__ (`wer`) on 2 popular datasets:
-- 0.938 on [google/fleurs](https://huggingface.co/datasets/google/fleurs) `km-kh`, `test` split
-- 0.697 on [openslr/openslr](https://huggingface.co/datasets/openslr/openslr) `SLR42`, `train` split
- __NOTE__ MLX format is usable for M-chip series of Apple.
 ## Use with mlx
 ```bash
@@ -38,12 +61,11 @@ result = mlx_whisper.transcribe(
     path_or_hf_repo="Kimang18/whisper-tiny-khmer-mlx-fp32",
     fp16=False
 )
-print(result['text']) # print khmer text in SPEECH_FILE_NAME
 ```
 Then run this script, `example.py`, to see the result.
 You can also run the transcription from the command line:
 ```bash
 mlx_whisper --model Kimang18/whisper-tiny-khmer-mlx-fp32 --task transcribe SPEECH_FILE_NAME --fp16 False
-```

 ---
 library_name: mlx
 license: apache-2.0
+language:
+- km
+pipeline_tag: automatic-speech-recognition
 datasets:
 - seanghay/km-speech-corpus
+- seanghay/khmer_mpwt_speech
 tags:
 - Khmer
+- mlx
+base_model: openai/whisper-tiny
+model-index:
+- name: whisper-tiny-khmer-mlx-fp32 by Kimang KHUN
+  results:
+  - task:
+      type: automatic-speech-recognition
+      name: Speech Recognition
+    dataset:
+      name: test split of "km_kh" in google/fleurs
+      type: google/fleurs
+    metrics:
+    - type: wer
+      value: 93.8%
+      name: test
+  - task:
+      type: automatic-speech-recognition
+      name: Speech Recognition
+    dataset:
+      name: train split of "SLR42" in openslr/openslr
+      type: openslr/openslr
+    metrics:
+    - type: wer
+      value: 69.7%
+      name: test
 ---
 # whisper-tiny-khmer-mlx-fp32
+This model was converted to MLX format from [`openai-whisper-tiny`](https://github.com/openai/whisper), then fine-tuned on Khmer speech using two datasets:
+- [seanghay/khmer_mpwt_speech](https://huggingface.co/datasets/seanghay/khmer_mpwt_speech)
+- [seanghay/km-speech-corpus](https://huggingface.co/datasets/seanghay/km-speech-corpus)
 It achieves the following __word error rate__ (`wer`) on 2 popular datasets:
+- ??? on `test` split of [google/fleurs](https://huggingface.co/datasets/google/fleurs) `km-kh`
+- ??? on `train` split of [openslr/openslr](https://huggingface.co/datasets/openslr/openslr) `SLR42`
+__NOTE__: The MLX format runs on Apple silicon (M-series chips).
 ## Use with mlx
 ```bash
     path_or_hf_repo="Kimang18/whisper-tiny-khmer-mlx-fp32",
     fp16=False
 )
+print(result['text'])
 ```
 Then run this script, `example.py`, to see the result.
 You can also run the transcription from the command line:
 ```bash
 mlx_whisper --model Kimang18/whisper-tiny-khmer-mlx-fp32 --task transcribe SPEECH_FILE_NAME --fp16 False
+```
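
The card reports word error rate (`wer`) but does not say how it was computed. As a rough illustration only, the sketch below uses the common definition: word-level edit distance divided by the number of reference words. The function name `word_error_rate` is hypothetical, and splitting on whitespace is an assumption; Khmer text is usually written without spaces between words, so a real evaluation would need a word segmenter or a dedicated metrics library.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: both strings are already segmented into words separated
    by whitespace. This is not necessarily how the card's numbers were produced.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Under this definition the result is a fraction, so 0.938 corresponds to the 93.8% listed in the model index. Because insertions count as errors, the value can exceed 1 when the hypothesis is much longer than the reference.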