ghostai1/GHOSTSONAFB


ghostai1 committed 13 days ago

Commit f9202ba (verified), 1 parent: 1b053c8

Update README.md

Files changed (1): README.md (+144 -0)

@@ -1,3 +1,11 @@
+---
+license: mit
+language:
+- en
+tags:
+- python
+- ai
+---
 # 🎵 GhostAI Music Generator 🎸
 Welcome to the GhostAI Music Generator! This web-based tool utilizes Meta AI's `musicgen-medium` model to craft high-quality instrumental tracks across genres such as Rock, Techno, Jazz, Classical, and Hip-Hop. The application structures compositions with sections like intros, verses, and choruses, all accessible through an intuitive Gradio interface. Outputs are high-quality MP3 files at 320 kbps, complete with embedded metadata. To enhance audio quality, we've integrated processing features including equalization (EQ), a chorus effect, and peak limiting for a polished sound.
@@ -27,3 +35,139 @@ To get started, ensure your system meets the following requirements:
    ```bash
    git clone https://huggingface.co/your-username/ghostai-music-generator
    cd ghostai-music-generator
+   ```
+2. **Set Up a Virtual Environment**:
+   ```bash
+   python3 -m venv venv
+   source venv/bin/activate
+   ```
+3. **Install PyTorch**:
+   For CUDA 12.1:
+   ```bash
+   pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu121
+   ```
+   For other CUDA versions, refer to the [PyTorch installation guide](https://pytorch.org/get-started/locally/).
+4. **Install Other Dependencies**:
+   ```bash
+   pip install -r requirements.txt
+   ```
+5. **Install ffmpeg**:
+   ```bash
+   sudo apt-get install ffmpeg
+   ```
+6. **Authenticate with Hugging Face**:
+   ```bash
+   huggingface-cli login
+   ```
+   Retrieve your token from [Hugging Face Tokens](https://huggingface.co/settings/tokens).
+7. **Request Access to the Model**:
+   Visit [facebook/musicgen-medium](https://huggingface.co/facebook/musicgen-medium) and request access.
+8. **Download and Place Model Weights**:
+   ```bash
+   mkdir -p /home/ubuntu/ghostai_music_generator/models/musicgen-medium
+   ```
+   Place the model weights in the directory above. If you store the model elsewhere, update the `local_model_path` in `app.py` accordingly.
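The checks implied by steps 5 and 8 above (ffmpeg installed, model weights in place) can be sketched as a small pre-flight script. This is only an illustration: the `preflight` function and the `MODEL_DIR` constant are hypothetical names, not part of `app.py`, and the path is the one shown in step 8.

```python
import shutil
from pathlib import Path

# Path taken from step 8; change it if your weights live elsewhere.
MODEL_DIR = Path("/home/ubuntu/ghostai_music_generator/models/musicgen-medium")


def preflight(model_dir: Path = MODEL_DIR) -> list[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    # MP3 export relies on the ffmpeg binary, so it must be on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # The model directory must exist and contain at least one entry.
    if not model_dir.is_dir() or not any(model_dir.iterdir()):
        problems.append(f"no model files in {model_dir}")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Running it before `python app.py` surfaces the two most common setup gaps without loading the model. It does not verify that the files in the directory are valid weights.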
+## Running the Application
+Start the application by executing:
+```bash
+python app.py
+```
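+This launches a Gradio UI listening on `0.0.0.0:9999`, that is, on all network interfaces at port 9999. Open `http://localhost:9999` in a browser on the same machine, or replace `localhost` with the server's IP address when connecting remotely.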
+## Using the Interface
+Within the Gradio interface:
+- **Select a Genre**: Choose from Rock, Techno, Jazz, Classical, or Hip-Hop.
+- **Custom Prompt**: Enter a custom prompt, such as:
+  ```
+  Hard rock with a dynamic intro, expressive verse, and powerful chorus, featuring electric guitars, steady heavy drums, and deep bass.
+  ```
+- **Adjust Parameters**:
+  - **Guidance Scale (CFG)**: Default is 3.0.
+  - **Top-K Sampling**: Default is 300.
+  - **Top-P Sampling**: Default is 0.95.
+  - **Temperature**: Default is 1.0.
+  - **Total Duration**: Default is 30 seconds (range: 10-60 seconds).
+  - **Crossfade Duration**: Default is 500 ms (range: 100-2000 ms).
+- **Generate Music**: Click "Generate Music" to create the track. The output will be saved as `output_cleaned.mp3` and played within Gradio.
+Monitor the terminal output for GPU memory (VRAM) usage to ensure smooth operation.
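The defaults and ranges listed above can be captured in one place. The sketch below is an assumption about how that might look, not the code in `app.py`: `GenerationSettings` and its field names are hypothetical, while the keys returned by `sampling_kwargs` follow the keyword names of Audiocraft's `MusicGen.set_generation_params`.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    """UI defaults from the list above; the class itself is hypothetical."""
    cfg_scale: float = 3.0
    top_k: int = 300
    top_p: float = 0.95
    temperature: float = 1.0
    total_duration_s: int = 30
    crossfade_ms: int = 500

    def validate(self) -> None:
        """Raise ValueError if a value falls outside the documented range."""
        if not 10 <= self.total_duration_s <= 60:
            raise ValueError("total duration must be 10-60 seconds")
        if not 100 <= self.crossfade_ms <= 2000:
            raise ValueError("crossfade must be 100-2000 ms")
        if not 0.0 <= self.top_p <= 1.0:
            raise ValueError("top_p must be between 0 and 1")
        if self.top_k < 0 or self.temperature <= 0:
            raise ValueError("top_k must be >= 0 and temperature > 0")

    def sampling_kwargs(self) -> dict:
        """Keyword arguments named as in MusicGen.set_generation_params."""
        self.validate()
        return {
            "use_sampling": True,
            "cfg_coef": self.cfg_scale,
            "top_k": self.top_k,
            "top_p": self.top_p,
            "temperature": self.temperature,
        }


print(GenerationSettings().sampling_kwargs())
```

The per-chunk `duration` is left out because how `app.py` splits the total duration into chunks is not shown here. Audiocraft's docstring for `set_generation_params` notes that `top_k` is used only when `top_p` is 0, so with a `top_p` of 0.95 the Top-K setting may have no effect; this is worth verifying against the installed version.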
+## Troubleshooting and Customization
+- **Quiet Spots in Waveform**: Edit `app.py` to boost the next segment by 3 dB before crossfading:
+  ```python
+  next_segment = next_segment + 3  # assuming a pydub AudioSegment: "+ 3" adds 3 dB of gain
+  ```
+  Use tools like Audacity to inspect and adjust the waveform.
+- **Enhancing the Chorus**: Modify the second chunk prompt to:
+  ```
+  explosive chorus with soaring guitars and pounding drums
+  ```
+  Or increase the temperature to 1.2 and `top_k` to 350 in the UI.
+- **Audio Distortion**: Lower the level of the delayed copy in `apply_chorus`, here by 6 dB:
+  ```python
+  delayed = segment - 6  # assuming a pydub AudioSegment: "- 6" lowers the level by 6 dB
+  ```
+  Adjust EQ settings in `apply_eq` with a high-pass at 80 Hz and low-pass at 5000 Hz.
+- **MP3 Export Issues**: Ensure `ffmpeg` is installed:
+  ```bash
+  sudo apt-get install ffmpeg
+  ```
+  Then check that the `chunk_{i}.mp3` and `output_cleaned.mp3` files exist.
+- **VRAM Constraints**: Reduce the total duration to 20 seconds, use `nvidia-smi` to find and close other GPU-intensive applications, and monitor usage from Python with:
+  ```python
+  import torch  # skip if app.py already imports it
+  print(torch.cuda.memory_summary())
+  ```
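The `+ 3` and `- 6` adjustments in the troubleshooting items above are gain changes in decibels, assuming `next_segment` and `segment` are pydub `AudioSegment` objects (the README credits pydub for audio processing). The helper below is a hypothetical illustration of what those numbers mean for signal amplitude; it is not taken from `app.py`.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a gain change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


for db in (3, -6):
    print(f"{db:+d} dB -> amplitude x {db_to_amplitude_ratio(db):.3f}")
# +3 dB -> amplitude x 1.413
# -6 dB -> amplitude x 0.501
```

In other words, a 3 dB boost raises amplitude by roughly 41%, and a 6 dB cut roughly halves it. Increase gain in small steps, since a boosted segment can clip before the peak limiter is applied.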
+## Customization Options
+- **Lock Dependencies**:
+  ```bash
+  pip freeze > requirements.txt
+  ```
+- **Add New Genres**: In `app.py`, define a new genre prompt:
+  ```python
+  def set_pop_prompt():
+      return "Pop with a catchy intro, upbeat verse, and anthemic chorus, featuring bright synths, punchy drums, and groovy bass"
+  ```
+  Add a button for the new genre:
+  ```python
+  pop_btn = gr.Button("Pop", elem_classes="genre-btn")
+  pop_btn.click(set_pop_prompt, inputs=None, outputs=[instrumental_prompt])
+  ```
+- **Edit MP3 Files**: Use Audacity or similar tools for more control over the final output.
+- **Use a Smaller Model**: If VRAM is limited, switch to `musicgen-small` by updating `app.py`:
+  ```python
+  musicgen_model = MusicGen.get_pretrained('facebook/musicgen-small', device=device)
+  ```
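If several genres are added as described above, one function per genre becomes repetitive. A possible alternative, sketched here under the assumption that prompts are plain strings, is a single lookup table. `GENRE_PROMPTS` and `get_genre_prompt` are hypothetical names; the Rock text reuses the custom prompt example from this README and may differ from the Rock prompt actually defined in `app.py`.

```python
# Hypothetical registry: one entry per genre button.
GENRE_PROMPTS = {
    "Rock": (
        "Hard rock with a dynamic intro, expressive verse, and powerful chorus, "
        "featuring electric guitars, steady heavy drums, and deep bass"
    ),
    "Pop": (
        "Pop with a catchy intro, upbeat verse, and anthemic chorus, "
        "featuring bright synths, punchy drums, and groovy bass"
    ),
}


def get_genre_prompt(genre: str) -> str:
    """Return the prompt for a genre, or raise ValueError for an unknown one."""
    try:
        return GENRE_PROMPTS[genre]
    except KeyError:
        known = ", ".join(sorted(GENRE_PROMPTS))
        raise ValueError(f"unknown genre {genre!r}; choose from: {known}") from None


print(get_genre_prompt("Pop"))
```

Each button could then be wired in a loop, for example `gr.Button(name, elem_classes="genre-btn").click(lambda g=name: get_genre_prompt(g), inputs=None, outputs=[instrumental_prompt])`, where the `g=name` default binds the current genre to each callback.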
+## License and Acknowledgments
+This project is licensed under the MIT License. Please include a LICENSE file with the MIT License text.
+Special thanks to:
+- Meta AI for `musicgen-medium` and Audiocraft.
+- Hugging Face for hosting and CLI tools.
+- Gradio for the web interface.
+- pydub for audio processing and MP3 export.
+- xAI for their support.
+Enjoy creating music! If you have questions or suggestions, feel free to open an issue on the repository. Let's make some tunes! 🎉