CoMP: Continual Multimodal Pre-training for Vision Foundation Models. Paper • 2503.18931 • Published Mar 24
LiFT-HRA Collection. LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment • 3 items • Updated 29 days ago