Zhengzhong Tu

vztu

https://vztu.github.io

_vztu
vztu

AI & ML interests

Generative AI, Multimodal AI, Trustworthy AI

vztu's activity

upvoted a paper 2 days ago

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Paper • 2411.16832 • Published Nov 25, 2024 • 2

authored 19 papers 2 days ago

MULLER: Multilayer Laplacian Resizer for Vision

Paper • 2304.02859 • Published Apr 6, 2023

MAXIM: Multi-Axis MLP for Image Processing

Paper • 2201.02973 • Published Jan 9, 2022

MaxViT: Multi-Axis Vision Transformer

Paper • 2204.01697 • Published Apr 4, 2022

Bigger is not Always Better: Scaling Properties of Latent Diffusion Models

Paper • 2404.01367 • Published Apr 1, 2024 • 23

Unraveling Cross-Modality Knowledge Conflict in Large Vision-Language Models

Paper • 2410.03659 • Published Oct 4, 2024 • 6

AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results

Paper • 2404.16205 • Published Apr 24, 2024

Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing

Paper • 2411.16832 • Published Nov 25, 2024 • 2

AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Paper • 2412.15206 • Published Dec 19, 2024

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization

Paper • 2502.13146 • Published Feb 18 • 1

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Paper • 2502.14296 • Published Feb 20 • 46

Complex LLM Planning via Automated Heuristics Discovery

Paper • 2502.19295 • Published Feb 26 • 1

Can Large Vision Language Models Read Maps Like a Human?

Paper • 2503.14607 • Published Mar 18 • 9

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

Paper • 2412.15208 • Published Dec 19, 2024

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Paper • 2503.24381 • Published Mar 31 • 1

NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results

Paper • 2505.03007 • Published about 1 month ago

The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

Paper • 2504.10686 • Published Apr 14

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization

Paper • 2505.12366 • Published 18 days ago

VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction

Paper • 2505.20279 • Published 10 days ago • 4

Generative AI for Autonomous Driving: Frontiers and Opportunities

Paper • 2505.08854 • Published 23 days ago • 1