- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search
  Paper • 2412.18319 • Published • 39
- Token-Budget-Aware LLM Reasoning
  Paper • 2412.18547 • Published • 46
- Efficiently Serving LLM Reasoning Programs with Certaindex
  Paper • 2412.20993 • Published • 36
- B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
  Paper • 2412.17256 • Published • 47
Kai Zuberbühler
kaizuberbuehler
AI & ML interests
language models, agents, image generation, music generation
Recent Activity
liked a model about 22 hours ago
black-forest-labs/FLUX.2-klein-9B
upvoted a paper 3 months ago
SWE-Universe: Scale Real-World Verifiable Environments to Millions
updated a collection 6 months ago
Reasoning, Thinking, RL and Test-Time Scaling
Organizations
None yet