# FramePack Dancing Image-to-Video Generation
This repository contains the steps and scripts needed to generate videos with the Yi Chen Dancing image-to-video LoRA. The model combines LoRA (Low-Rank Adaptation) weights with pre-trained FramePack and HunyuanVideo components to create anime-style dance videos from a source image and a textual prompt.
## Prerequisites
Before proceeding, ensure that you have the following installed on your system:
• **Ubuntu** (or a compatible Linux distribution)
• **Python 3.x**
• **pip** (Python package manager)
• **Git**
• **Git LFS** (Git Large File Storage)
• **FFmpeg**
## Installation
1. **Update and Install Dependencies**
```bash
sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
```
2. **Clone the Repository**
```bash
git clone https://huggingface.co/svjack/YiChen_FramePack_lora_early
cd YiChen_FramePack_lora_early
```
3. **Install Python Dependencies**
```bash
pip install torch torchvision
pip install -r requirements.txt
pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
pip install moviepy==1.0.3
pip install sageattention==1.0.6
```
4. **Download Model Weights**
```bash
git clone https://huggingface.co/lllyasviel/FramePackI2V_HY
git clone https://huggingface.co/hunyuanvideo-community/HunyuanVideo
git clone https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged
git clone https://huggingface.co/Comfy-Org/sigclip_vision_384
```
## Usage
To generate a video, run the `fpack_generate_video.py` script with the appropriate parameters. The examples below use the Dancing LoRA; each pairs a different source image with a motion prompt, and only `--image_path` and `--prompt` change between them.
### 1. Furina
- Source Image
```bash
python fpack_generate_video.py \
--dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
--vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
--text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
--text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
--image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
--image_path fln.png \
--prompt "In the style of Yi Chen Dancing White Background , The character's movements shift dynamically throughout the video, transitioning from poised stillness to lively dance steps. Her expressions evolve seamlessly—starting with focused determination, then flashing surprise as she executes a quick spin, before breaking into a joyful smile mid-leap. Her hands flow through choreographed positions, sometimes extending gracefully like unfolding wings, other times clapping rhythmically against her wrists. During a dramatic hip sway, her fingers fan open near her cheek, then sweep downward as her whole body dips into a playful crouch, the sequins on her costume catching the light with every motion." \
--video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
--attn_mode sdpa --fp8_scaled \
--vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
--save_path save --output_type both \
--seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors
```
- Without Lora
- With Lora
### 2. Roper
- Source Image
```bash
python fpack_generate_video.py \
--dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
--vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
--text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
--text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
--image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
--image_path shengjiang.png \
--prompt "In the style of Yi Chen Dancing White Background , The character's movements shift dynamically throughout the video, transitioning from poised stillness to lively dance steps. Her expressions evolve seamlessly—starting with focused determination, then flashing surprise as she executes a quick spin, before breaking into a joyful smile mid-leap. Her hands flow through choreographed positions, sometimes extending gracefully like unfolding wings, other times clapping rhythmically against her wrists. During a dramatic hip sway, her fingers fan open near her cheek, then sweep downward as her whole body dips into a playful crouch, the sequins on her costume catching the light with every motion." \
--video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
--attn_mode sdpa --fp8_scaled \
--vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
--save_path save --output_type both \
--seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors
```
- With Lora
### 3. Varesa
- Source Image
```bash
python fpack_generate_video.py \
--dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
--vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
--text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
--text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
--image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
--image_path waliesha.jpg \
--prompt "In the style of Yi Chen Dancing White Background , The dancer’s energy pulses in waves—one moment a statue, poised and precise, the next a whirl of motion as her feet flicker across the floor. Her face tells its own story: brows knit in concentration, then eyes widening mid-turn as if startled by her own speed, before dissolving into laughter as she springs upward, weightless. Her arms carve the air—now arcing like ribbons unfurling, now snapping sharp as a whip’s crack, palms meeting wrists in staccato beats. A roll of her hips sends her fingers fluttering near her temple, then cascading down as she folds into a teasing dip, the beads on her dress scattering light like sparks." \
--video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
--attn_mode sdpa --fp8_scaled \
--vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
--save_path save --output_type both \
--seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors
```
- With Lora
### 4. Scaramouche
- Source Image
```bash
python fpack_generate_video.py \
--dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
--vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
--text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
--text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
--image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
--image_path shanbing.jpg \
--prompt "In the style of Yi Chen Dancing White Background , The dancer’s energy pulses in waves—one moment a statue, poised and precise, the next a whirl of motion as her feet flicker across the floor. Her face tells its own story: brows knit in concentration, then eyes widening mid-turn as if startled by her own speed, before dissolving into laughter as she springs upward, weightless. Her arms carve the air—now arcing like ribbons unfurling, now snapping sharp as a whip’s crack, palms meeting wrists in staccato beats. A roll of her hips sends her fingers fluttering near her temple, then cascading down as she folds into a teasing dip, the beads on her dress scattering light like sparks." \
--video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
--attn_mode sdpa --fp8_scaled \
--vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
--save_path save --output_type both \
--seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors
```
- With Lora
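Since the four commands above differ only in `--image_path` and `--prompt`, they can be driven by a simple loop. A hedged batch sketch, written as a dry run: each command is `echo`ed instead of executed, and `PROMPT` is a shortened placeholder for one of the full prompts above.

```shell
# Dry-run batch over the example source images; remove "echo" to actually run.
# PROMPT is abbreviated here -- paste one of the full prompts from above.
PROMPT="In the style of Yi Chen Dancing White Background , ..."
for img in fln.png shengjiang.png waliesha.jpg shanbing.jpg; do
  echo python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path "$img" --prompt "$PROMPT" \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 \
    --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors
done
```

In a real run you would typically vary the prompt per character as the examples above do, for instance by looping over paired `image:prompt` entries instead of a flat image list.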
## Parameters
* `--dit`: Path to the FramePack DiT (diffusion transformer) weights.
* `--vae`: Path to the VAE weights.
* `--text_encoder1` / `--text_encoder2`: Paths to the LLaVA-LLaMA-3 and CLIP-L text encoder weights.
* `--image_encoder`: Path to the SigLIP vision encoder weights.
* `--image_path`: Source image for image-to-video generation.
* `--prompt`: Textual prompt describing the desired motion.
* `--video_size`: Resolution of the generated video (e.g., `960 544`).
* `--video_seconds`: Length of the video in seconds.
* `--fps`: Frame rate of the output video.
* `--infer_steps`: Number of inference steps.
* `--attn_mode`: Attention implementation (e.g., `sdpa`).
* `--fp8_scaled`: Enable scaled FP8 precision to reduce memory use (optional).
* `--vae_chunk_size` / `--vae_spatial_tile_sample_min_size`: VAE chunking and tiling options that lower peak memory during decoding.
* `--save_path`: Directory to save the generated video.
* `--output_type`: Output type (e.g., `both`).
* `--seed`: Random seed for reproducible results.
* `--lora_weight`: Path to the LoRA weights.
* `--lora_multiplier`: Strength multiplier for the LoRA weights (e.g., `1.0`).
## Output
The generated video and frames will be saved in the specified `save_path` directory.
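FFmpeg (listed in the prerequisites) can turn a result into a lightweight preview GIF for sharing. A small sketch; the exact output filename depends on the run, so this grabs the newest `.mp4` in `save/`:

```shell
# Make a preview GIF from the most recently generated video, if any exists.
latest=$(ls -t save/*.mp4 2>/dev/null | head -n 1)
if [ -n "$latest" ]; then
  # 15 fps, 480 px wide, height scaled to keep the aspect ratio
  ffmpeg -y -i "$latest" -vf "fps=15,scale=480:-1" preview.gif
else
  echo "no videos found in save/"
fi
```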
## Troubleshooting
• Ensure all dependencies are correctly installed.
• Verify that the model weights are downloaded and placed in the correct locations.
• Check for any missing Python packages and install them using `pip`.
## License
This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
## Acknowledgments
• **Hugging Face** for hosting the model weights.
• **lllyasviel** for FramePack and the FramePackI2V_HY weights.
• The **hunyuanvideo-community** and **Comfy-Org** for the repackaged HunyuanVideo and SigLIP vision components.
## Contact
For any questions or issues, please open an issue on the repository or contact the maintainer.
---