---
datasets:
- seungheondoh/LP-MusicCaps-MSD
- DynamicSuperb/MusicGenreClassification_FMA
- DynamicSuperb/MARBLEMusicTagging_MagnaTagATune
- agkphysics/AudioSet
language:
- en
license: mit
pipeline_tag: text-to-audio
tags:
- music
- art
- text-to-audio
model_type: diffusers
library_name: diffusers
---
## Model Description
QA-MDT is a text-to-music generation model that is straightforward to set up and use. It incorporates a quality-aware training strategy to improve the fidelity of the generated music.
## How to Use
A Hugging Face Diffusers implementation is available in this model repository and the accompanying Space; a minimal loading sketch is shown below. For more detailed instructions and the official PyTorch implementation, please refer to the project's GitHub repository and project page.
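The following is a minimal, untested sketch of loading the Diffusers port and generating a clip from a text prompt. The repository id placeholder, the `trust_remote_code` requirement, the output attribute (`.audios`), and the 16 kHz sample rate are assumptions; check this model page and the Space for the exact usage.

```python
import scipy.io.wavfile
import torch
from diffusers import DiffusionPipeline

# Hypothetical repo id -- replace with this model's actual Hub id.
model_id = "<this-model-repo-id>"

# trust_remote_code is assumed to be required if the pipeline ships custom code.
pipe = DiffusionPipeline.from_pretrained(model_id, trust_remote_code=True)
pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = "A melodic acoustic guitar piece with a calm, nostalgic mood"
result = pipe(prompt, num_inference_steps=100)

# Many Diffusers text-to-audio pipelines expose the waveform as `.audios`;
# adjust the attribute and sample rate if this pipeline differs.
audio = result.audios[0]
scipy.io.wavfile.write("generated.wav", rate=16000, data=audio)
```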
The model was presented in the paper QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation.