Collections of ICLR 2026 paper: "OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models"
Zekun Qi
qizekun
AI & ML interests
Embodied Intelligence, Large Language Model, 3D Computer Vision
Recent Activity
authored a paper about 23 hours ago
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation
authored a paper about 23 hours ago
VLA-JEPA: Enhancing Vision-Language-Action Model with Latent World Model
authored a paper about 23 hours ago
ReWorld: Multi-Dimensional Reward Modeling for Embodied World Models