---
license: cc-by-4.0
datasets:
- ai4privacy/pii-masking-400k
language:
- it
- en
- fr
- nl
- es
base_model:
- distilbert/distilbert-base-multilingual-cased
pipeline_tag: token-classification
library_name: transformers
---

# Neural Wave - Hackathon 2024 - Lugano

This repository contains the code produced by the `Molise.ai` team in the Neural Wave Hackathon 2024 competition in
Lugano.

## Challenge

Here is a brief explanation of the challenge:

The challenge was proposed by **Ai4Privacy**, a company that builds global solutions to enhance **privacy protections**
in the rapidly evolving world of **Artificial Intelligence**.

The goal of the challenge is to create a machine learning model capable of detecting and masking **PII** (Personally
Identifiable Information) in text data across several languages and locales. The task requires working with a synthetic
dataset to train models that can automatically identify and redact **17 types of PII** in natural language texts. The
solution should aim for high accuracy while maintaining the **usability** of the underlying data.

The final solution could be integrated into various systems to enhance privacy protections across industries,
including client support, legal, and general data anonymization tools. Success in this project will contribute to
scaling privacy-conscious AI systems without compromising user experience or operational performance.
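
To make the "detect, then mask" idea concrete, here is a minimal sketch of the masking step alone. It assumes that a detector (for example, a token-classification model) has already produced non-overlapping character spans. The function name `mask_pii`, the `(start, end, label)` span format, and the `NAME` and `EMAIL` labels are hypothetical choices for illustration, not the team's implementation or the dataset's label set.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, label), end exclusive.
Span = Tuple[int, int, str]


def mask_pii(text: str, spans: List[Span]) -> str:
    """Replace each detected span with a [LABEL] placeholder.

    Assumes the spans do not overlap.
    """
    masked = text
    # Work from the rightmost span to the leftmost so that the offsets of
    # spans not yet processed stay valid after each replacement.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        masked = masked[:start] + f"[{label}]" + masked[end:]
    return masked


# Illustrative input; the spans are built by hand here instead of by a model.
example = "Contact Maria Rossi at maria.rossi@example.com."
name = "Maria Rossi"
email = "maria.rossi@example.com"
name_start = example.index(name)
email_start = example.index(email)
spans = [
    (name_start, name_start + len(name), "NAME"),
    (email_start, email_start + len(email), "EMAIL"),
]

print(mask_pii(example, spans))
```

Replacing spans with typed placeholders, instead of deleting them, keeps the sentence readable, which is one way to preserve the usability of the underlying data mentioned above.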

## Disclaimer

The publisher of this repository is not affiliated with Ai4Privacy or Ai Suisse SA.