File size: 1,395 Bytes
d3dea6a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
---
license: cc-by-4.0
language:
- en
base_model: Qwen/Qwen2.5-7B-Instruct
---

# Safe-o1 Model Card πŸ€–βœ¨

## Model Overview πŸ“
`Safe-o1` is an innovative language model that introduces a **self-monitoring thinking process** to detect and filter unsafe content, achieving more robust safety performance πŸš€.

---

## Features and Highlights 🌟
- **Safety First** πŸ”’: Through a self-monitoring mechanism, it detects potential unsafe content in the thinking process in real-time, ensuring outputs consistently align with ethical and safety standards.  
- **Enhanced Robustness** πŸ’‘: Compared to traditional models, `Safe-o1` performs more stably in complex scenarios, reducing unexpected "derailments."  
- **User-Friendly** 😊: Designed to provide users with a trustworthy conversational partner, suitable for various application scenarios, striking a balance between helpfulness and harmfulness.  

---

## Usage πŸš€
You can load `Safe-o1` using the Hugging Face `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("PKU-Alignment/Safe-o1")
model = AutoModelForCausalLM.from_pretrained("PKU-Alignment/Safe-o1")

input_text = "Hello, World!"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

```