prithivMLmods committed on
Commit
965d1cb
·
verified ·
1 Parent(s): 1460cb8

Update README.md

Files changed (1)
  1. README.md +87 -1
README.md CHANGED
@@ -9,6 +9,13 @@ base_model:
  pipeline_tag: image-classification
  library_name: transformers
  ---
  ```py
  Classification Report:
  precision recall f1-score support
@@ -23,4 +30,83 @@ Cartoon Portrait 0.9964 0.9926 0.9945 4444
  weighted avg 0.9972 0.9972 0.9972 17776
  ```
 
- ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/vOq9EHhJGLzRJSQJ5_liQ.png)
  pipeline_tag: image-classification
  library_name: transformers
  ---
+
+ ![zdfghgdftg.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/N8f6dbYHMmE02vq5wseAE.png)
+
+ # **Multilabel-Portrait-SigLIP2**
+
+ > **Multilabel-Portrait-SigLIP2** is an image classification model fine-tuned from the vision-language encoder [**google/siglip2-base-patch16-224**](https://huggingface.co/google/siglip2-base-patch16-224) using the `SiglipForImageClassification` architecture. It classifies portrait-style images into one of four **visual portrait categories**; per-class metrics are shown in the classification report below:
+
  ```py
  Classification Report:
  precision recall f1-score support
 
  weighted avg 0.9972 0.9972 0.9972 17776
  ```
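As a quick consistency check on the report, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it for the Cartoon Portrait row that appears in the diff's hunk header (precision 0.9964, recall 0.9926). The helper name `f1_score` is illustrative and not part of the model card.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Cartoon Portrait row from the classification report
cartoon_f1 = f1_score(0.9964, 0.9926)
print(round(cartoon_f1, 4))  # 0.9945, matching the reported f1-score
```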
 
+ ![download.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/vOq9EHhJGLzRJSQJ5_liQ.png)
+
+ ---
+
+ # **Model Objective**
+
+ The model is designed to **analyze portrait images** and categorize them into **one of four distinct portrait types**:
+
+ - **0:** Anime Portrait
+ - **1:** Cartoon Portrait
+ - **2:** Real Portrait
+ - **3:** Sketch Portrait
+
+ ---
+
+ # **Try it with Transformers 🤗**
+
+ Install dependencies:
+
+ ```bash
+ pip install -q transformers torch pillow gradio
+ ```
+
+ Run the model with the following script:
+
+ ```python
+ import gradio as gr
+ from transformers import AutoImageProcessor, SiglipForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load model and processor
+ model_name = "prithivMLmods/Multilabel-Portrait-SigLIP2"  # Replace with actual HF model path
+ model = SiglipForImageClassification.from_pretrained(model_name)
+ processor = AutoImageProcessor.from_pretrained(model_name)
+
+ # Label mapping
+ id2label = {
+     0: "Anime Portrait",
+     1: "Cartoon Portrait",
+     2: "Real Portrait",
+     3: "Sketch Portrait"
+ }
+
+ def classify_portrait(image):
+     """Predict the type of portrait style from an image."""
+     image = Image.fromarray(image).convert("RGB")
+     inputs = processor(images=image, return_tensors="pt")
+
+     with torch.no_grad():
+         outputs = model(**inputs)
+         logits = outputs.logits
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+
+     predictions = {id2label[i]: round(probs[i], 3) for i in range(len(probs))}
+     predictions = dict(sorted(predictions.items(), key=lambda item: item[1], reverse=True))
+     return predictions
+
+ # Gradio interface
+ iface = gr.Interface(
+     fn=classify_portrait,
+     inputs=gr.Image(type="numpy"),
+     outputs=gr.Label(label="Portrait Type Prediction Scores"),
+     title="Multilabel-Portrait-SigLIP2",
+     description="Upload a portrait-style image (anime, cartoon, real, or sketch) to predict its most likely visual category."
+ )
+
+ if __name__ == "__main__":
+     iface.launch()
+ ```
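The post-processing inside `classify_portrait` (softmax over the four logits, rounding, then sorting by score) can be exercised without downloading the model. The following is a minimal pure-Python sketch of that step only. The `rank_labels` helper and the example logits are hypothetical and serve purely to illustrate the logic.

```python
import math

ID2LABEL = {
    0: "Anime Portrait",
    1: "Cartoon Portrait",
    2: "Real Portrait",
    3: "Sketch Portrait",
}


def rank_labels(logits):
    """Softmax the raw logits and return {label: score}, highest score first."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    scores = {ID2LABEL[i]: round(e / total, 3) for i, e in enumerate(exps)}
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


# Made-up logits, not real model output
ranked = rank_labels([0.2, -1.0, 3.5, 0.1])
print(next(iter(ranked)))  # Real Portrait
```

Because softmax normalizes across all four classes, the scores sum to roughly 1 and each image receives a single best category, which matches the "one of four" framing in the model card.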
+
+ ---
+
+ # **Intended Use Cases**
+
+ - **AI Art Curation**: Automatically organize large-scale datasets of artistic portraits.
+ - **Style-based Portrait Analysis**: Determine the artistic style of user-uploaded or curated portrait datasets.
+ - **Content Filtering for Platforms**: Group and recommend content based on visual style.
+ - **Dataset Pre-labeling**: Reduce manual effort in annotation tasks.
+ - **User Avatar Classification**: Categorize profile avatars on social or gaming platforms.
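
For the dataset pre-labeling use case, one possible pattern is to accept the top prediction only when its score clears a confidence threshold and to send everything else to manual review. The sketch below assumes score dictionaries shaped like the output of `classify_portrait`. The `prelabel` helper, the 0.9 threshold, and the file names are all hypothetical.

```python
def prelabel(scores, threshold=0.9):
    """Return the top label if its score meets the threshold, else None."""
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= threshold else None


# Hypothetical scores for two images
batch = {
    "avatar_001.png": {"Anime Portrait": 0.97, "Cartoon Portrait": 0.02,
                       "Real Portrait": 0.005, "Sketch Portrait": 0.005},
    "avatar_002.png": {"Anime Portrait": 0.48, "Cartoon Portrait": 0.45,
                       "Real Portrait": 0.04, "Sketch Portrait": 0.03},
}

auto_labeled = {}
needs_review = []
for name, scores in batch.items():
    label = prelabel(scores)
    if label is None:
        needs_review.append(name)  # ambiguous: leave for a human annotator
    else:
        auto_labeled[name] = label

print(auto_labeled)  # {'avatar_001.png': 'Anime Portrait'}
print(needs_review)  # ['avatar_002.png']
```

The threshold is a tuning knob rather than a recommendation: a higher value sends more images to review, while a lower value accepts more automatic labels.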