🪜 LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers



📌 Summary

LADDER is a general framework for automatically discovering subpopulations (or "slices") of data on which a vision classifier underperforms, without requiring group annotations. It uses vision-language representations and the reasoning capabilities of large language models (LLMs) to detect and rectify bias-inducing features in both natural and medical imaging domains.


🧠 Architecture & Components

  • 🔍 Slice Discovery using:
    • CLIP, Mammo-CLIP, and CXR-CLIP features
    • BLIP and GPT-4o-generated captions
  • 🧠 Hypothesis Generation using:
    • GPT-4o, Claude, Gemini, LLaMA
  • ✅ Bias Mitigation via reweighting & pseudo-labeling
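
To make the reweighting idea concrete, here is a minimal sketch of generic inverse-frequency sample weighting over discovered slices. It is an illustration under stated assumptions, not necessarily the weighting scheme LADDER itself uses: the function name `inverse_frequency_weights` and the slice names are hypothetical, and the only input is one slice identifier per training sample.

```python
from collections import Counter


def inverse_frequency_weights(slice_ids):
    """Return one weight per sample, inversely proportional to its slice size.

    The weights average to 1.0 over the dataset, and every slice receives
    the same total weight regardless of how many samples it contains.
    (Generic technique shown for illustration; not taken from the LADDER code.)
    """
    counts = Counter(slice_ids)
    n_samples = len(slice_ids)
    n_slices = len(counts)
    return [n_samples / (n_slices * counts[s]) for s in slice_ids]


# Hypothetical slice assignments: one common slice and one rare slice.
slice_ids = ["water_bg"] * 8 + ["land_bg"] * 2
weights = inverse_frequency_weights(slice_ids)
```

Weights like these could be passed to a weighted loss or a weighted sampler so that rare, error-prone slices are not drowned out by the majority slice during fine-tuning.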

📊 Datasets Used

  • Natural Images: Waterbirds, CelebA, MetaShift
  • Medical Images: NIH ChestX-ray, RSNA Mammograms, VinDr Mammograms

📦 Files Included

| File | Description |
|------|-------------|
| model.pt | Pretrained model checkpoint |
| feature_cache.pkl | Cached representations (CLIP/Mammo-CLIP/CXR-CLIP) |
| metadata.csv | Metadata with discovered slice labels |
| caption_blip.json | BLIP-generated captions |
| caption_gpt4o.json | GPT-4o-generated captions |
| predictions.json | Model predictions on test set |
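
As a rough illustration of how the metadata and predictions files might be combined, the sketch below computes accuracy per discovered slice. The schemas are assumptions, since this card does not document them: `metadata.csv` is assumed to have `image_id`, `label`, and `slice` columns, and `predictions.json` is assumed to map each image ID to a predicted label. Inline strings with made-up values stand in for the real files, so check the actual column names before adapting it.

```python
import csv
import io
import json
from collections import defaultdict


def per_slice_accuracy(metadata_file, predictions_file):
    """Join metadata rows with predictions and return accuracy per slice.

    Assumed (hypothetical) schemas:
      metadata: CSV with columns image_id, label, slice
      predictions: JSON object {image_id: predicted_label}
    """
    predictions = json.load(predictions_file)
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(metadata_file):
        slice_name = row["slice"]
        total[slice_name] += 1
        # CSV values are strings, so compare labels as strings.
        if str(predictions[row["image_id"]]) == row["label"]:
            correct[slice_name] += 1
    return {name: correct[name] / total[name] for name in total}


# Inline stand-ins for metadata.csv and predictions.json (made-up values).
metadata_text = "image_id,label,slice\na,1,0\nb,1,0\nc,0,1\nd,1,1\n"
predictions_text = '{"a": 1, "b": 1, "c": 1, "d": 1}'
accuracy = per_slice_accuracy(io.StringIO(metadata_text), io.StringIO(predictions_text))
```

With the real files, the same function could be called with `open("metadata.csv", newline="")` and `open("predictions.json")`, provided the columns match the assumptions above. Slices with markedly lower accuracy than the rest are the candidates for mitigation.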

🧪 Benchmarks

LADDER outperforms traditional slice discovery methods (Domino, FACTS) across 6 datasets and >200 classifiers. It is especially effective in:

  • Discovering hidden biases without explicit attribute labels
  • Reasoning about non-visual factors (e.g., preprocessing artifacts)
  • Operating without human-written captions

📜 Citation

@article{ghosh2024ladder,
  title={LADDER: Language Driven Slice Discovery and Error Rectification},
  author={Ghosh, Shantanu and Syed, Rayan and Wang, Chenyu and Poynton, Clare B and Visweswaran, Shyam and Batmanghelich, Kayhan},
  journal={arXiv preprint arXiv:2408.07832},
  year={2024}
}

🀝 Acknowledgements

Boston University, Stanford University, BUMC, and the University of Pittsburgh.
