File size: 1,395 Bytes

d3dea6a

---
license: cc-by-4.0
language:
- en
base_model: Qwen/Qwen2.5-7B-Instruct
---

# Safe-o1 Model Card 🤖✨

## Model Overview 📝
`Safe-o1` is an innovative language model that introduces a **self-monitoring thinking process** to detect and filter unsafe content, achieving more robust safety performance 🚀.

---

## Features and Highlights 🌟
- **Safety First** 🔒: Through a self-monitoring mechanism, it detects potential unsafe content in the thinking process in real-time, ensuring outputs consistently align with ethical and safety standards.  
- **Enhanced Robustness** 💡: Compared to traditional models, `Safe-o1` performs more stably in complex scenarios, reducing unexpected "derailments."  
- **User-Friendly** 😊: Designed to provide users with a trustworthy conversational partner, suitable for various application scenarios, striking a balance between helpfulness and harmfulness.  

---

## Usage 🚀
You can load `Safe-o1` using the Hugging Face `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("PKU-Alignment/Safe-o1")
model = AutoModelForCausalLM.from_pretrained("PKU-Alignment/Safe-o1")

input_text = "Hello, World!"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

```