Class Attribute Inference Attacks: Inferring Sensitive Class Information by Diffusion-Based Attribute Manipulations Paper • 2303.09289 • Published Mar 16, 2023
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge Paper • 2309.11575 • Published Sep 20, 2023
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation Paper • 2305.15296 • Published May 24, 2023
Mitigating Inappropriateness in Image Generation: Can there be Value in Reflecting the World's Ugliness? Paper • 2305.18398 • Published May 28, 2023
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis Paper • 2209.08891 • Published Sep 19, 2022
The Stable Artist: Steering Semantics in Diffusion Latent Space Paper • 2212.06013 • Published Dec 12, 2022
LLavaGuard: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment Paper • 2406.05113 • Published Jun 7, 2024
AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation Paper • 2301.08110 • Published Jan 19, 2023
SCAR: Sparse Conditioned Autoencoders for Concept Detection and Steering in LLMs Paper • 2411.07122 • Published Nov 11, 2024
Judging Quality Across Languages: A Multilingual Approach to Pretraining Data Filtering with Language Models Paper • 2505.22232 • Published 3 days ago
Tokenizer Choice For LLM Training: Negligible or Crucial? Paper • 2310.08754 • Published Oct 12, 2023
Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings Paper • 2202.06671 • Published Feb 14, 2022
Specialized Document Embeddings for Aspect-based Similarity of Research Papers Paper • 2203.14541 • Published Mar 28, 2022
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning Paper • 2301.09626 • Published Jan 23, 2023