Update README.md
README.md
CHANGED
@@ -10,6 +10,11 @@ tags:

Welcome to the GhostAI Music Generator! This web-based tool utilizes Meta AI's `musicgen-medium` model to craft high-quality instrumental tracks across genres such as Rock, Techno, Jazz, Classical, and Hip-Hop. The application structures compositions with sections like intros, verses, and choruses, all accessible through an intuitive Gradio interface. Outputs are high-quality MP3 files at 320 kbps, complete with embedded metadata. To enhance audio quality, we've integrated processing features including equalization (EQ), a chorus effect, and peak limiting for a polished sound.

+
+
+
+
+
## Project Evolution and Optimization

Initially, the project faced VRAM limitations on an NVIDIA RTX 3060 Ti with 7.69 GiB. To address this, we divided 30-second tracks into manageable chunks—first into three 10-second segments, then into two 15-second segments—to optimize memory usage. The Bark model was removed to focus solely on instrumental generation, and we standardized the output format to MP3 for broader compatibility. To achieve a more natural song flow, we varied prompts for each chunk. For instance, the first chunk might use "dynamic intro and expressive verse," while the second employs "powerful chorus and energetic outro," providing a realistic song structure.
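
The chunking strategy described in the README text above (a 30-second track split into two 15-second chunks, each with its own section prompt) could be planned roughly as follows. This is a minimal sketch, not the project's actual code: the `Chunk` dataclass, the `plan_chunks` helper, and the `"instrumental rock"` base prompt are all hypothetical names introduced for illustration, and the sketch only builds the per-chunk plan without calling any model.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """One generation request: where it starts, how long it is, and its prompt."""
    index: int
    start_s: float
    duration_s: float
    prompt: str


def plan_chunks(base_prompt, total_s, section_prompts):
    """Split total_s seconds into equal chunks, one per section prompt.

    Hypothetical helper: each chunk's prompt is the base genre prompt
    followed by that chunk's section description.
    """
    if not section_prompts:
        raise ValueError("need at least one section prompt")
    duration = total_s / len(section_prompts)
    return [
        Chunk(i, i * duration, duration, f"{base_prompt}, {section}")
        for i, section in enumerate(section_prompts)
    ]


# Two 15-second chunks for a 30-second track, using the section prompts
# quoted in the README; the base prompt is an assumed example.
plan = plan_chunks(
    "instrumental rock",
    30,
    ["dynamic intro and expressive verse", "powerful chorus and energetic outro"],
)
for chunk in plan:
    print(chunk)
```

Keeping the plan separate from generation means the chunk count can be changed (for example, back to three 10-second segments) by editing only the list of section prompts.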