17 11

gao

ym9

AI & ML interests

None yet

Recent Activity

liked a Space 27 days ago

black-forest-labs/FLUX.1-dev

liked a Space 29 days ago

yanze/PuLID-FLUX

ym9's activity

upvoted a paper 20 days ago

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published 20 days ago • 55

upvoted 4 papers about 1 month ago

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18 • 27

Unleashing Vecset Diffusion Model for Fast Shape Generation

Paper • 2503.16302 • Published Mar 20 • 44

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Paper • 2503.08677 • Published Mar 11 • 29

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12 • 45

upvoted a paper about 2 months ago

How far can we go with ImageNet for Text-to-Image generation?

Paper • 2502.21318 • Published Feb 28 • 26

upvoted 2 papers 2 months ago

Fast Video Generation with Sliding Tile Attention

Paper • 2502.04507 • Published Feb 6 • 51

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 65

upvoted 3 papers 3 months ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 385

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Paper • 2501.12202 • Published Jan 21 • 43

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Paper • 2501.12224 • Published Jan 21 • 48

upvoted 2 papers 4 months ago

1.58-bit FLUX

Paper • 2412.18653 • Published Dec 24, 2024 • 84

Large Motion Video Autoencoding with Cross-modal Video VAE

Paper • 2412.17805 • Published Dec 23, 2024 • 24

upvoted a paper 5 months ago

TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper • 2412.03069 • Published Dec 4, 2024 • 35

upvoted a collection 8 months ago

Papers I want to read

Papers in my to-read list • 259 items • Updated Jan 10 • 31

upvoted a paper 10 months ago

MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions

Paper • 2407.06358 • Published Jul 8, 2024 • 19

upvoted a paper about 1 year ago

GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

Paper • 2403.14621 • Published Mar 21, 2024 • 16