Shengqiong Wu

ChocoWu

https://chocowu.github.io/

AI & ML interests

Large Language Models, Multimodal Learning, Scene Graph Generation

Recent Activity

authored a paper 7 days ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

upvoted a paper 8 days ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

updated a dataset 9 days ago

General-Level/General-Bench-Closeset

ChocoWu's activity

upvoted a paper 8 days ago

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published 8 days ago • 21

upvoted a paper 18 days ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 25 days ago • 258

upvoted a paper 21 days ago

JavisDiT: Joint Audio-Video Diffusion Transformer with Hierarchical Spatio-Temporal Prior Synchronization

Paper • 2503.23377 • Published 26 days ago • 52

upvoted a paper 24 days ago

Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation

Paper • 2503.24379 • Published 25 days ago • 75

upvoted 2 papers about 1 month ago

Position: Interactive Generative Video as Next-Generation Game Engine

Paper • 2503.17359 • Published Mar 21 • 62

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16 • 34

upvoted a paper 10 months ago

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Paper • 2406.19389 • Published Jun 27, 2024 • 55