You Li

Michael4933

AI & ML interests

NLP, Multi-modal LLM

Recent Activity

upvoted a paper about 24 hours ago

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL

upvoted a paper about 24 hours ago

DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning

updated a Space 13 days ago

Michael4933/Migician

Organizations

None yet

Michael4933's activity

upvoted 2 papers about 24 hours ago

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL

Paper • 2505.15436 • Published 15 days ago • 1

DeepEyes: Incentivizing "Thinking with Images" via Reinforcement Learning

Paper • 2505.14362 • Published 16 days ago • 1

updated a Space 13 days ago

Migician

💬

Demo for Multi-image Grounding model: Migician

published a Space 13 days ago

Migician

💬

Demo for Multi-image Grounding model: Migician

updated a model 15 days ago

Michael4933/Migician

Image-Text-to-Text • Updated 15 days ago • 317 • 1

liked a model about 2 months ago

Michael4933/Migician

Image-Text-to-Text • Updated 15 days ago • 317 • 1

New activity in Michael4933/Migician 2 months ago

Add pipeline tag and library name

#1 opened 4 months ago by nielsr

upvoted a paper 3 months ago

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Paper • 2503.12797 • Published Mar 17 • 30

liked a dataset 3 months ago

OS-Copilot/OS-Atlas-data

Updated Dec 4, 2024 • 2.69k • 30

New activity in Michael4933/MIG-Bench 3 months ago

Update dataset card with paper link, task category

#2 opened 4 months ago by nielsr

updated a dataset 3 months ago

Michael4933/MGrounding-630k

Updated Feb 22 • 391 • 2

New activity in Michael4933/MGrounding-630k 3 months ago

Add link to Github repo

#1 opened 4 months ago by nielsr

authored 2 papers 5 months ago

A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers

Paper • 2405.10936 • Published May 17, 2024 • 1

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 30

updated a dataset 5 months ago

Michael4933/MIG-Bench

Viewer • Updated Feb 22 • 5.89k • 41 • 1

upvoted a paper 5 months ago

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Paper • 2501.05767 • Published Jan 10 • 30

updated a model 5 months ago

Michael4933/Migician

Image-Text-to-Text • Updated 15 days ago • 317 • 1

updated 3 datasets 5 months ago