# FramePack Dancing Image-to-Video Generation

This repository contains the necessary steps and scripts to generate videos using the Dancing image-to-video model. The model leverages LoRA (Low-Rank Adaptation) weights and pre-trained components to create high-quality anime-style videos based on textual prompts.

## Prerequisites

Before proceeding, ensure that you have the following installed on your system:

• **Ubuntu** (or a compatible Linux distribution)
• **Python 3.x**
• **pip** (Python package manager)
• **Git**
• **Git LFS** (Git Large File Storage)
• **FFmpeg**
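
A quick way to confirm these tools are available (a small sketch; it only reports what is on `PATH`, it does not install anything):

```shell
# Report which of the required tools are on PATH.
for tool in python3 pip git git-lfs ffmpeg; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "missing: $tool"
  fi
done
```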

## Installation

1. **Update and Install Dependencies**

   ```bash
   sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
   ```

2. **Clone the Repository**

   ```bash
   git clone https://huggingface.co/svjack/YiChen_FramePack_lora_early
   cd YiChen_FramePack_lora_early
   ```

3. **Install Python Dependencies**

   ```bash
   pip install torch torchvision
   pip install -r requirements.txt
   pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
   pip install moviepy==1.0.3
   pip install sageattention==1.0.6
   ```

4. **Download Model Weights**

   ```bash
   git clone https://huggingface.co/lllyasviel/FramePackI2V_HY
   git clone https://huggingface.co/hunyuanvideo-community/HunyuanVideo
   git clone https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged
   git clone https://huggingface.co/Comfy-Org/sigclip_vision_384
   ```
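
These repositories are large, so before running generation it is worth checking that the specific weight files referenced by the commands below actually exist locally (a small sketch that only reports what is present):

```shell
# Report whether each weight file used by the example commands exists.
for f in \
  FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
  HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
  HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
  HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
  sigclip_vision_384/sigclip_vision_patch14_384.safetensors
do
  if [ -f "$f" ]; then
    echo "found: $f"
  else
    echo "missing: $f (did git-lfs pull the large files?)"
  fi
done
```

If files are reported missing, run `git lfs pull` inside the corresponding repository.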

## Usage

To generate a video, use the `fpack_generate_video.py` script with the appropriate parameters. Below are examples of how to generate videos using the Dancing model.



### 1. Furina
- Source Image


```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path fln.png \
    --prompt "In the style of Yi Chen Dancing White Background , The character's movements shift dynamically throughout the video, transitioning from poised stillness to lively dance steps. Her expressions evolve seamlessly—starting with focused determination, then flashing surprise as she executes a quick spin, before breaking into a joyful smile mid-leap. Her hands flow through choreographed positions, sometimes extending gracefully like unfolding wings, other times clapping rhythmically against her wrists. During a dramatic hip sway, her fingers fan open near her cheek, then sweep downward as her whole body dips into a playful crouch, the sequins on her costume catching the light with every motion." \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors


```

- Without Lora

- With Lora


### 2. Roper 
- Source Image



```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path shengjiang.png \
    --prompt "In the style of Yi Chen Dancing White Background , The character's movements shift dynamically throughout the video, transitioning from poised stillness to lively dance steps. Her expressions evolve seamlessly—starting with focused determination, then flashing surprise as she executes a quick spin, before breaking into a joyful smile mid-leap. Her hands flow through choreographed positions, sometimes extending gracefully like unfolding wings, other times clapping rhythmically against her wrists. During a dramatic hip sway, her fingers fan open near her cheek, then sweep downward as her whole body dips into a playful crouch, the sequins on her costume catching the light with every motion." \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors

``` 

- With Lora



### 3. Varesa
- Source Image


```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path waliesha.jpg \
    --prompt "In the style of Yi Chen Dancing White Background , The dancer’s energy pulses in waves—one moment a statue, poised and precise, the next a whirl of motion as her feet flicker across the floor. Her face tells its own story: brows knit in concentration, then eyes widening mid-turn as if startled by her own speed, before dissolving into laughter as she springs upward, weightless. Her arms carve the air—now arcing like ribbons unfurling, now snapping sharp as a whip’s crack, palms meeting wrists in staccato beats. A roll of her hips sends her fingers fluttering near her temple, then cascading down as she folds into a teasing dip, the beads on her dress scattering light like sparks." \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors

```
- With Lora



### 4. Scaramouche  
- Source Image


```bash
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path shanbing.jpg \
    --prompt "In the style of Yi Chen Dancing White Background , The dancer’s energy pulses in waves—one moment a statue, poised and precise, the next a whirl of motion as her feet flicker across the floor. Her face tells its own story: brows knit in concentration, then eyes widening mid-turn as if startled by her own speed, before dissolving into laughter as she springs upward, weightless. Her arms carve the air—now arcing like ribbons unfurling, now snapping sharp as a whip’s crack, palms meeting wrists in staccato beats. A roll of her hips sends her fingers fluttering near her temple, then cascading down as she folds into a teasing dip, the beads on her dress scattering light like sparks." \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors

```

- With Lora
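
The four example commands differ only in `--image_path` and `--prompt`. A batch run can be sketched as below; `PROMPT` is a placeholder to fill with one of the prompt strings above, and the loop only prints each command (swap `echo` for `eval` to actually execute them):

```shell
# Sketch: batch the per-character runs above over the four source images.
# PROMPT is a placeholder; replace it with a prompt string from the examples.
PROMPT="replace with a prompt from the examples above"
for img in fln.png shengjiang.png waliesha.jpg shanbing.jpg; do
  cmd="python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path $img --prompt \"$PROMPT\" \
    --video_size 960 544 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type both \
    --seed 1234 --lora_multiplier 1.0 \
    --lora_weight framepack_yichen_output/framepack-yichen-lora-000006.safetensors"
  echo "$cmd"   # replace echo with: eval "$cmd"
done
```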




## Parameters

* `--dit`: Path to the FramePack diffusion transformer (DiT) weights.
* `--vae`: Path to the VAE weights.
* `--text_encoder1`: Path to the first text encoder (LLaVA-LLaMA-3).
* `--text_encoder2`: Path to the second text encoder (CLIP-L).
* `--image_encoder`: Path to the SigLIP vision encoder weights.
* `--image_path`: Source image for image-to-video generation.
* `--video_size`: Resolution of the generated video (e.g., `960 544`).
* `--video_seconds`: Length of the video in seconds.
* `--fps`: Frame rate of the output video.
* `--infer_steps`: Number of inference steps.
* `--attn_mode`: Attention implementation (e.g., `sdpa`).
* `--fp8_scaled`: Enable scaled FP8 precision to reduce memory use (optional).
* `--vae_chunk_size` and `--vae_spatial_tile_sample_min_size`: VAE chunking and spatial tiling settings that lower memory use.
* `--save_path`: Directory to save the generated output.
* `--output_type`: What to save (e.g., `both`).
* `--seed`: Random seed for reproducibility.
* `--lora_weight`: Path to the LoRA weights.
* `--lora_multiplier`: Multiplier applied to the LoRA weights.
* `--prompt`: Textual prompt describing the desired motion.
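
As a quick sanity check on the timing flags: the number of generated frames is roughly `--video_seconds` × `--fps`, so the example commands (3 s at 30 fps) produce about 90 frames:

```shell
# Rough frame count for the example settings (3 seconds at 30 fps).
FPS=30
DURATION_SECONDS=3
echo "$((FPS * DURATION_SECONDS)) frames"   # prints: 90 frames
```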



## Output

The generated video and frames will be saved in the specified `save_path` directory.

## Troubleshooting

• Ensure all dependencies are correctly installed.
• Verify that the model weights are downloaded and placed in the correct locations.
• Check for any missing Python packages and install them using `pip`.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

• **Hugging Face** for hosting the model weights.
• **lllyasviel** for the FramePack model.
• **hunyuanvideo-community** and **Comfy-Org** for the pre-trained HunyuanVideo components.

## Contact

For any questions or issues, please open an issue on the repository or contact the maintainer.

---