Update README.md
README.md
CHANGED
@@ -1,3 +1,11 @@
1 |
# 🎵 GhostAI Music Generator 🎸
|
2 |
|
3 |
Welcome to the GhostAI Music Generator! This web-based tool utilizes Meta AI's `musicgen-medium` model to craft high-quality instrumental tracks across genres such as Rock, Techno, Jazz, Classical, and Hip-Hop. The application structures compositions with sections like intros, verses, and choruses, all accessible through an intuitive Gradio interface. Outputs are high-quality MP3 files at 320 kbps, complete with embedded metadata. To enhance audio quality, we've integrated processing features including equalization (EQ), a chorus effect, and peak limiting for a polished sound.
|
@@ -27,3 +35,139 @@ To get started, ensure your system meets the following requirements:
|
|
27 |
```bash
|
28 |
git clone https://huggingface.co/your-username/ghostai-music-generator
|
29 |
cd ghostai-music-generator
1 |
+
---
|
2 |
+
license: mit
|
3 |
+
language:
|
4 |
+
- en
|
5 |
+
tags:
|
6 |
+
- python
|
7 |
+
- ai
|
8 |
+
---
|
9 |
# 🎵 GhostAI Music Generator 🎸
|
10 |
|
11 |
Welcome to the GhostAI Music Generator! This web-based tool utilizes Meta AI's `musicgen-medium` model to craft high-quality instrumental tracks across genres such as Rock, Techno, Jazz, Classical, and Hip-Hop. The application structures compositions with sections like intros, verses, and choruses, all accessible through an intuitive Gradio interface. Outputs are high-quality MP3 files at 320 kbps, complete with embedded metadata. To enhance audio quality, we've integrated processing features including equalization (EQ), a chorus effect, and peak limiting for a polished sound.
|
|
|
35 |
```bash
|
36 |
git clone https://huggingface.co/your-username/ghostai-music-generator
|
37 |
cd ghostai-music-generator
|
38 |
+
```
|
39 |
+
|
40 |
+
2. **Set Up a Virtual Environment**:
|
41 |
+
```bash
|
42 |
+
python3 -m venv venv
|
43 |
+
source venv/bin/activate
|
44 |
+
```
|
45 |
+
|
46 |
+
3. **Install PyTorch**:
|
47 |
+
For CUDA 12.1:
|
48 |
+
```bash
|
49 |
+
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
|
50 |
+
```
|
51 |
+
For other CUDA versions, refer to the [PyTorch installation guide](https://pytorch.org/get-started/locally/).
|
52 |
+
|
53 |
+
4. **Install Other Dependencies**:
|
54 |
+
```bash
|
55 |
+
pip install -r requirements.txt
|
56 |
+
```
|
57 |
+
|
58 |
+
5. **Install ffmpeg**:
|
59 |
+
```bash
|
60 |
+
sudo apt-get install ffmpeg
|
61 |
+
```
|
62 |
+
|
63 |
+
6. **Authenticate with Hugging Face**:
|
64 |
+
```bash
|
65 |
+
huggingface-cli login
|
66 |
+
```
|
67 |
+
Retrieve your token from [Hugging Face Tokens](https://huggingface.co/settings/tokens).
|
68 |
+
|
69 |
+
7. **Request Access to the Model**:
|
70 |
+
Visit [facebook/musicgen-medium](https://huggingface.co/facebook/musicgen-medium) and request access.
|
71 |
+
|
72 |
+
8. **Download and Place Model Weights**:
|
73 |
+
```bash
|
74 |
+
mkdir -p /home/ubuntu/ghostai_music_generator/models/musicgen-medium
|
75 |
+
```
|
76 |
+
Place the model weights in the directory above. If you store the model elsewhere, update the `local_model_path` in `app.py` accordingly.
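
As a rough illustration of the path handling described in step 8, the sketch below prefers a local weights directory when it exists and contains files, and otherwise falls back to the Hub model ID. The helper name `resolve_model_source` is hypothetical, and how `app.py` actually uses `local_model_path` is an assumption; only the directory path and the model ID come from this README.

```python
from pathlib import Path

# Values taken from this README; adjust if the weights live elsewhere.
LOCAL_MODEL_PATH = "/home/ubuntu/ghostai_music_generator/models/musicgen-medium"
HUB_MODEL_ID = "facebook/musicgen-medium"


def resolve_model_source(local_path: str, hub_id: str) -> str:
    """Return local_path if it is a non-empty directory, else hub_id.

    Hypothetical helper: app.py may resolve the model differently.
    """
    path = Path(local_path)
    if path.is_dir() and any(path.iterdir()):
        return str(path)
    return hub_id


if __name__ == "__main__":
    print(resolve_model_source(LOCAL_MODEL_PATH, HUB_MODEL_ID))
```

A check like this makes a missing or empty weights directory visible at startup instead of failing later during model loading.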
|
77 |
+
|
78 |
+
## Running the Application
|
79 |
+
|
80 |
+
Start the application by executing:
|
81 |
+
```bash
|
82 |
+
python app.py
|
83 |
+
```
|
84 |
+
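This launches a Gradio UI bound to `0.0.0.0:9999`, so it listens on all network interfaces. Open `http://localhost:9999` in a browser on the same machine, or replace `localhost` with the server's IP address when connecting from another machine.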
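
The bind address and the address typed into a browser are not always the same thing. The sketch below, built around a hypothetical helper named `browser_url`, shows one way to turn a wildcard bind address into a URL that works locally. It is an illustration only and is not part of `app.py`.

```python
def browser_url(bind_host: str, port: int) -> str:
    """Map a server bind address to a URL usable in a local browser."""
    # 0.0.0.0 (IPv4) and :: (IPv6) mean "listen on all interfaces"; they are
    # not reliable destinations, so point the browser at localhost instead.
    host = "localhost" if bind_host in ("0.0.0.0", "::") else bind_host
    return f"http://{host}:{port}"


if __name__ == "__main__":
    print(browser_url("0.0.0.0", 9999))
```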
|
85 |
+
|
86 |
+
## Using the Interface
|
87 |
+
|
88 |
+
Within the Gradio interface:
|
89 |
+
|
90 |
+
- **Select a Genre**: Choose from Rock, Techno, Jazz, Classical, or Hip-Hop.
|
91 |
+
- **Custom Prompt**: Enter a custom prompt, such as:
|
92 |
+
```
|
93 |
+
Hard rock with a dynamic intro, expressive verse, and powerful chorus, featuring electric guitars, steady heavy drums, and deep bass.
|
94 |
+
```
|
95 |
+
- **Adjust Parameters**:
|
96 |
+
- **Guidance Scale (CFG)**: Default is 3.0.
|
97 |
+
- **Top-K Sampling**: Default is 300.
|
98 |
+
- **Top-P Sampling**: Default is 0.95.
|
99 |
+
- **Temperature**: Default is 1.0.
|
100 |
+
- **Total Duration**: Set to 30 seconds (range: 10-60).
|
101 |
+
- **Crossfade Duration**: Set to 500 ms (range: 100-2000).
|
102 |
+
- **Generate Music**: Click "Generate Music" to create the track. The output will be saved as `output_cleaned.mp3` and played within Gradio.
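
The defaults and ranges listed above can be collected in one place, which helps when calling the generator from a script rather than the UI. This is a minimal sketch: the class and field names are hypothetical and do not come from `app.py`, and only the two ranges stated in this README are enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationSettings:
    """Defaults and ranges as listed in this README (names are hypothetical)."""

    cfg_scale: float = 3.0       # Guidance Scale (CFG)
    top_k: int = 300             # Top-K Sampling
    top_p: float = 0.95          # Top-P Sampling
    temperature: float = 1.0
    total_duration_s: int = 30   # allowed range: 10-60 seconds
    crossfade_ms: int = 500      # allowed range: 100-2000 ms

    def validate(self) -> None:
        """Raise ValueError if a value is outside the documented range."""
        if not 10 <= self.total_duration_s <= 60:
            raise ValueError("total_duration_s must be within 10-60 seconds")
        if not 100 <= self.crossfade_ms <= 2000:
            raise ValueError("crossfade_ms must be within 100-2000 ms")


if __name__ == "__main__":
    settings = GenerationSettings(temperature=1.2, top_k=350)
    settings.validate()
    print(settings)
```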
|
103 |
+
|
104 |
+
Monitor the terminal output for GPU memory (VRAM) usage to confirm that generation stays within your GPU's capacity.
|
105 |
+
|
106 |
+
## Troubleshooting and Customization
|
107 |
+
|
108 |
+
- **Quiet Spots in Waveform**: Edit `app.py` to increase gain before crossfading:
|
109 |
+
```python
|
110 |
+
next_segment = next_segment + 3  # +3 dB gain, assuming next_segment is a pydub AudioSegment
|
111 |
+
```
|
112 |
+
Use tools like Audacity to inspect and adjust the waveform.
|
113 |
+
|
114 |
+
- **Enhancing the Chorus**: Modify the second chunk prompt to:
|
115 |
+
```
|
116 |
+
explosive chorus with soaring guitars and pounding drums
|
117 |
+
```
|
118 |
+
Alternatively, raise the temperature to 1.2 and `top_k` to 350 in the UI.
|
119 |
+
|
120 |
+
- **Audio Distortion**: Reduce the chorus effect gain in `apply_chorus`:
|
121 |
+
```python
|
122 |
+
delayed = segment - 6  # -6 dB gain, assuming segment is a pydub AudioSegment
|
123 |
+
```
|
124 |
+
Adjust EQ settings in `apply_eq` with a high-pass at 80 Hz and low-pass at 5000 Hz.
|
125 |
+
|
126 |
+
- **MP3 Export Issues**: Ensure `ffmpeg` is installed:
|
127 |
+
```bash
|
128 |
+
sudo apt-get install ffmpeg
|
129 |
+
```
|
130 |
+
Then check that the `chunk_{i}.mp3` files and `output_cleaned.mp3` exist after a generation run.
|
131 |
+
|
132 |
+
- **VRAM Constraints**: Reduce the total duration to 20 seconds, use `nvidia-smi` to find other GPU-intensive applications and close them, and monitor usage with:
|
133 |
+
```python
|
134 |
+
print(torch.cuda.memory_summary())
|
135 |
+
```
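
The `+ 3` and `- 6` adjustments in the gain tips above appear to rely on pydub's `AudioSegment` arithmetic, where adding or subtracting a number changes the gain in decibels. That reading is an assumption based on pydub being the listed audio library. Under that assumption, the sketch below converts a decibel change into a linear amplitude ratio, which can help when judging how large a step is.

```python
def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain change in decibels to a linear amplitude ratio."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    for gain_db in (3, -6):
        ratio = db_to_amplitude_ratio(gain_db)
        print(f"{gain_db:+d} dB -> amplitude x{ratio:.3f}")
```

A 3 dB boost multiplies the amplitude by about 1.41, while a 6 dB cut roughly halves it, so small changes to these numbers have an audible effect.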
|
136 |
+
|
137 |
+
## Customization Options
|
138 |
+
|
139 |
+
- **Lock Dependencies**:
|
140 |
+
```bash
|
141 |
+
pip freeze > requirements.txt
|
142 |
+
```
|
143 |
+
|
144 |
+
- **Add New Genres**: In `app.py`, define a new genre prompt:
|
145 |
+
```python
|
146 |
+
def set_pop_prompt():
|
147 |
+
return "Pop with a catchy intro, upbeat verse, and anthemic chorus, featuring bright synths, punchy drums, and groovy bass"
|
148 |
+
```
|
149 |
+
Add a button for the new genre:
|
150 |
+
```python
|
151 |
+
pop_btn = gr.Button("Pop", elem_classes="genre-btn")
|
152 |
+
pop_btn.click(set_pop_prompt, inputs=None, outputs=[instrumental_prompt])
|
153 |
+
```
|
154 |
+
|
155 |
+
- **Edit MP3 Files**: Use Audacity or similar tools for more control over the final output.
|
156 |
+
|
157 |
+
- **Use a Smaller Model**: If VRAM is limited, switch to `musicgen-small` by updating `app.py`:
|
158 |
+
```python
|
159 |
+
musicgen_model = MusicGen.get_pretrained('facebook/musicgen-small', device=device)
|
160 |
+
```
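
When several genres are added, one function per genre becomes repetitive. A possible alternative, sketched here, keeps the prompts in a single dictionary. The names `GENRE_PROMPTS` and `get_genre_prompt` are hypothetical, and the Rock entry reuses the custom prompt example from this README, which may differ from the prompt actually used in `app.py`.

```python
# Hypothetical registry of genre prompts; extend it with one entry per genre.
GENRE_PROMPTS = {
    "Rock": (
        "Hard rock with a dynamic intro, expressive verse, and powerful "
        "chorus, featuring electric guitars, steady heavy drums, and deep bass."
    ),
    "Pop": (
        "Pop with a catchy intro, upbeat verse, and anthemic chorus, "
        "featuring bright synths, punchy drums, and groovy bass"
    ),
}


def get_genre_prompt(genre: str) -> str:
    """Return the prompt for a genre, or raise ValueError if it is unknown."""
    try:
        return GENRE_PROMPTS[genre]
    except KeyError:
        raise ValueError(f"Unknown genre: {genre!r}") from None


if __name__ == "__main__":
    print(get_genre_prompt("Pop"))
```

A button handler could then call `get_genre_prompt("Pop")` instead of a dedicated `set_pop_prompt` function, so adding a genre only requires a new dictionary entry and a new button.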
|
161 |
+
|
162 |
+
## License and Acknowledgments
|
163 |
+
|
164 |
+
This project is licensed under the MIT License. Please include a LICENSE file with the MIT License text.
|
165 |
+
|
166 |
+
Special thanks to:
|
167 |
+
- Meta AI for `musicgen-medium` and Audiocraft.
|
168 |
+
- Hugging Face for hosting and CLI tools.
|
169 |
+
- Gradio for the web interface.
|
170 |
+
- pydub for audio processing and MP3 export.
|
171 |
+
- xAI for their support.
|
172 |
+
|
173 |
+
Enjoy creating music! If you have questions or suggestions, feel free to open an issue on the repository. Let's make some tunes! 🎉
|