cai-qi committed (verified) · Commit 4c21cbc · 1 Parent(s): a52cf31

Update README.md

Files changed (1):
  1. README.md +20 -15
README.md CHANGED
@@ -9,17 +9,16 @@ pipeline_tag: text-to-image
  library_name: diffusers
  ---
 
- `HiDream-I1` is a series of state-of-the-art open-source image generation models featuring a 16 billion parameter rectified flow transformer with Mixture of Experts architecture, designed to create high-quality images from text prompts.
+ `HiDream-I1` is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.
 
  ## Key Features
-
- - ✨ **Superior Image Quality** - Produces exceptional results across multiple styles including photorealistic, cartoon, artistic, and more
- - 🎯 **Best-in-Class Prompt Following** - Achieves industry-leading scores on GenEval and DPG benchmarks, outperforming all other open source models
- - 🔓 **Open Source** - Released under MIT license to foster scientific advancement and enable creative innovation
- - 💼 **Commercial-Friendly** - Generated images can be freely used for personal projects, scientific research, and commercial applications
+ - ✨ **Superior Image Quality** - Produces exceptional results across multiple styles including photorealistic, cartoon, artistic, and more. Achieves state-of-the-art HPS v2.1 score, which aligns with human preferences.
+ - 🎯 **Best-in-Class Prompt Following** - Achieves industry-leading scores on GenEval and DPG benchmarks, outperforming all other open-source models.
+ - 🔓 **Open Source** - Released under the MIT license to foster scientific advancement and enable creative innovation.
+ - 💼 **Commercial-Friendly** - Generated images can be freely used for personal projects, scientific research, and commercial applications.
 
  ## Quick Start
- Please make sure you have installed [Flash Attention](https://github.com/Dao-AILab/flash-attention). We recommend CUDA versions 12.4 for the manual installation.
+ Please make sure you have installed [Flash Attention](https://github.com/Dao-AILab/flash-attention). We recommend CUDA version 12.4 for the manual installation.
  ```
  pip install -r requirements.txt
  ```
@@ -30,19 +29,25 @@ git clone https://github.com/HiDream-ai/HiDream-I1
 
  Then you can run the inference scripts to generate images:
 
- ``` python
-
+ ```python
  # For full model inference
- python ./inference.py
+ python ./inference.py --model_type full
+
  # For distilled dev model inference
- INFERENCE_STEP=28 PRETRAINED_MODEL_NAME_OR_PATH=XXX python inference_distilled.py
+ python ./inference.py --model_type dev
 
  # For distilled fast model inference
- INFERENCE_STEP=16 PRETRAINED_MODEL_NAME_OR_PATH=XXX python inference_distilled.py
-
+ python ./inference.py --model_type fast
  ```
  > **Note:** The inference script will automatically download `meta-llama/Meta-Llama-3.1-8B-Instruct` model files. If you encounter network issues, you can download these files ahead of time and place them in the appropriate cache directory to avoid download failures during inference.
 
+ ## Gradio Demo
+
+ We also provide a Gradio demo for interactive image generation. You can run the demo with:
+
+ ```python
+ python gradio_demo.py
+ ```
 
  ## Evaluation Metrics
 
@@ -88,9 +93,9 @@ INFERENCE_STEP=16 PRETRAINED_MODEL_NAME_OR_PATH=XXX python inference_distilled.p
 
 
  ## License Agreement
- The Transformer models in this repository are licensed under the MIT License. The VAE is from `FLUX.1 [dev]`, and text encoders from `google/t5-v1_1-xxl` and `meta-llama/Meta-Llama-3.1-8B-Instruct`. Please follow the license terms specified for these components. You own all content you create with this model. You can use your generated content freely, but must comply with this license agreement. You are responsible for how you use the models. Do not create illegal content, harmful material, personal information that could harm others, false information, or content targeting vulnerable groups.
+ The Transformer models in this repository are licensed under the MIT License. The VAE is from `FLUX.1 [schnell]`, and the text encoders from `google/t5-v1_1-xxl` and `meta-llama/Meta-Llama-3.1-8B-Instruct`. Please follow the license terms specified for these components. You own all content you create with this model. You can use your generated content freely, but you must comply with this license agreement. You are responsible for how you use the models. Do not create illegal content, harmful material, personal information that could harm others, false information, or content targeting vulnerable groups.
 
 
  ## Acknowledgements
- - The VAE component is from `FLUX.1 [dev]` Model, licensed under the `FLUX.1 [dev]` Non-Commercial License by Black Forest Labs, Inc.
+ - The VAE component is from `FLUX.1 [schnell]`, licensed under Apache 2.0.
  - The text encoders are from `google/t5-v1_1-xxl` (licensed under Apache 2.0) and `meta-llama/Meta-Llama-3.1-8B-Instruct` (licensed under the Llama 3.1 Community License Agreement).
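The Quick Start changes in the diff above assume Flash Attention and a CUDA 12.4 toolchain are in place before `pip install -r requirements.txt` is run. As a purely illustrative sanity check (not part of the repository; it assumes PyTorch and the `flash-attn` package are what `requirements.txt` pulls in), you can confirm the environment before launching the inference scripts:

```python
# Illustrative environment check; not part of the HiDream-I1 repository.
# Assumes PyTorch and flash-attn were installed via requirements.txt.
import torch

print("CUDA available:", torch.cuda.is_available())
print("CUDA version seen by PyTorch:", torch.version.cuda)  # expect something like "12.4"

try:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn not found; see https://github.com/Dao-AILab/flash-attention")
```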
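The note in the diff above points out that the inference scripts fetch `meta-llama/Meta-Llama-3.1-8B-Instruct` on first run. Below is a minimal sketch of pre-downloading those files into the local Hugging Face cache, assuming `huggingface_hub` is installed and your account has accepted the Llama 3.1 license; the snippet itself is not part of the repository:

```python
# Illustrative pre-download of the Llama text encoder into the local HF cache.
# Assumes `huggingface_hub` is installed and the Llama 3.1 license has been accepted.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
    # token="hf_...",  # or authenticate beforehand with `huggingface-cli login`
)
```

With the files cached (by default under `~/.cache/huggingface`), the inference scripts should find them locally instead of downloading them during inference.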
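The diff above also adds a Gradio demo (`python gradio_demo.py`). For orientation only, a text-to-image Gradio app of that kind is typically a thin wrapper around the model's generation call; the sketch below is a hypothetical stand-in, not the repository's actual `gradio_demo.py`, and `generate_image` is a placeholder:

```python
# Hypothetical outline of a Gradio text-to-image demo; not the repository's gradio_demo.py.
import gradio as gr
from PIL import Image

def generate_image(prompt: str) -> Image.Image:
    # Placeholder: a real demo would call the HiDream-I1 inference pipeline here
    # and return the generated image. This stub returns a blank canvas instead.
    return Image.new("RGB", (1024, 1024), color="white")

demo = gr.Interface(
    fn=generate_image,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated image"),
    title="HiDream-I1 demo (illustrative sketch)",
)

if __name__ == "__main__":
    demo.launch()
```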