Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks Paper • 2501.16902 • Published Jan 28 • 1
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers Paper • 2502.18460 • Published Feb 25 • 3
ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations Paper • 2504.00824 • Published 23 days ago • 40
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Paper • 2504.08837 • Published 13 days ago • 42
DRAMA Collection • A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 6
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 25 days ago • 128
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers Paper • 2503.11579 • Published Mar 14 • 20
ABC: Achieving Better Control of Multimodal Embeddings using VLMs Paper • 2503.00329 • Published Mar 1 • 19
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29 • 59
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale Paper • 2412.05237 • Published Dec 6, 2024 • 48
OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision Paper • 2411.07199 • Published Nov 11, 2024 • 50
Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering Paper • 2309.02233 • Published Sep 5, 2023 • 1
WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild Paper • 2409.03753 • Published Sep 5, 2024 • 19