Beyond the Last Layer: Multi-Layer Representation Fusion for Visual Tokenization • Paper • 2605.10780 • Published 1 day ago • 24
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture • Paper • 2605.12500 • Published 1 day ago • 123
δ-mem: Efficient Online Memory for Large Language Models • Paper • 2605.12357 • Published 1 day ago • 84
Prox-E: Fine-Grained 3D Shape Editing via Primitive-Based Abstractions • Paper • 2604.23774 • Published 15 days ago • 15
Let ViT Speak: Generative Language-Image Pre-training • Paper • 2605.00809 • Published 13 days ago • 32
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons • Paper • 2604.28130 • Published 14 days ago • 22
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation • Paper • 2604.19636 • Published 23 days ago • 87
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation • Paper • 2604.18486 • Published 24 days ago • 90
The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping • Paper • 2604.11297 • Published about 1 month ago • 142
Training a Student Expert via Semi-Supervised Foundation Model Distillation • Paper • 2604.03841 • Published Apr 4 • 10
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability • Paper • 2604.06628 • Published Apr 8 • 324