Anthonny Olime

Aviv-anthonnyolime

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Tina: Tiny Reasoning Models via LoRA

upvoted a paper 1 day ago

Describe Anything: Detailed Localized Image and Video Captioning

liked a model 1 day ago

nvidia/DAM-3B-Video

Aviv-anthonnyolime's activity

upvoted 2 papers 1 day ago

Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published 2 days ago • 15

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published 2 days ago • 46

upvoted a paper 2 days ago

HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation

Paper • 2504.13072 • Published 7 days ago • 11

upvoted a collection 2 days ago

blt

4 items • Updated 7 days ago • 17

upvoted a paper 10 days ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published 23 days ago • 82

upvoted 2 papers 14 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 17 days ago • 171

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published 17 days ago • 98

upvoted a paper about 1 month ago

Vision-Speech Models: Teaching Speech Models to Converse about Images

Paper • 2503.15633 • Published Mar 19 • 1

upvoted 6 papers about 2 months ago

Self-Guided Diffusion Models

Paper • 2210.06462 • Published Oct 12, 2022 • 3

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 85

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 80

Thinking Preference Optimization

Paper • 2502.13173 • Published Feb 17 • 17

S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Paper • 2502.12853 • Published Feb 18 • 29

One-step Diffusion Models with f-Divergence Distribution Matching

Paper • 2502.15681 • Published Feb 21 • 7

upvoted an article about 2 months ago

SigLIP 2: A better multilingual vision language encoder

Article • Feb 21 • 153

upvoted 3 papers 2 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 143

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 182

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Paper • 2502.04328 • Published Feb 6 • 30