Episodes

Latest Episode
MiMo-VL Technical Report

Episode 877 · 19:19

🤗 Upvotes: 58 | cs.CL Authors: Xiaomi LLM-Core Team: Zihao Yue, Zhenru Lin, Yifan Song, Weikun Wang, Shuhuai Ren, Shuhao Gu, Shicheng Li, Pei...

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Episode 876 · 20:09

🤗 Upvotes: 41 | cs.LG, cs.AI, cs.CL, cs.CV Authors: Shuang Chen, Yue Guo, Zhaochen Su, Yafu Li, Yulun Wu, Jiacheng Chen, Jiayu Chen, Weijie Wang...

AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

Episode 875 · 20:56

🤗 Upvotes: 39 | cs.LG, cs.AI, cs.CL, cs.RO Authors: Anastasiia Ivanova, Eva Bakaeva, Zoya Volovikova, Alexey K. Kovalev, Aleksandr I. Panov ...

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

Episode 874 · 22:48

🤗 Upvotes: 35 | cs.AR, cs.AI, cs.CL, cs.LG, cs.PL Authors: Ahmed Heakl, Sarim Hashmi, Gustavo Bertolo Stahl, Seung Hun Eddie Han, Salman Khan, A...

A Controllable Examination for Long-Context Language Models

Episode 873 · 21:39

🤗 Upvotes: 30 | cs.CL Authors: Yijun Yang, Zeyu Huang, Wenhao Zhu, Zihan Qiu, Fei Yuan, Jeff Z. Pan, Ivan Titov Title: ...

MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos

Episode 872 · 23:16

🤗 Upvotes: 25 | cs.CV, cs.CL Authors: Kejian Zhu, Zhuoran Jin, Hongbang Yuan, Jiachun Li, Shangqing Tu, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zh...

Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis

Episode 871 · 20:17

🤗 Upvotes: 23 | cs.CL Authors: Kejian Zhu, Shangqing Tu, Zhuoran Jin, Lei Hou, Juanzi Li, Jun Zhao Title: Establishing ...

SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models

Episode 870 · 22:04

🤗 Upvotes: 23 | cs.CL Authors: Yuhao Wu, Yushi Bai, Zhiqiang Hu, Juanzi Li, Roy Ka-Wei Lee Title: SuperWriter: Reflecti...

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Episode 869 · 22:54

🤗 Upvotes: 144 | cs.CL Authors: Shelly Bensal, Umar Jamil, Christopher Bryant, Melisa Russak, Kiran Kamble, Dmytro Mozolevskyi, Muayad Ali, Wase...

VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments

Episode 868 · 24:23

🤗 Upvotes: 51 | cs.AI Authors: Zelai Xu, Zhexuan Xu, Xiangmin Yi, Huining Yuan, Xinlei Chen, Yi Wu, Chao Yu, Yu Wang Title: ...

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Episode 867 · 19:11

🤗 Upvotes: 49 | cs.CV, cs.AI, cs.CL Authors: Bin Lin, Zongjian Li, Xinhua Cheng, Yuwei Niu, Yang Ye, Xianyi He, Shenghai Yuan, Wangbo Yu, Shaodo...

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Episode 866 · 18:18

🤗 Upvotes: 46 | cs.LG, cs.CL, cs.CV Authors: Zijian Wu, Jinjie Ni, Xiangyan Liu, Zichen Liu, Hang Yan, Michael Qizhe Shieh Title: ...

CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs

Episode 865 · 21:03

🤗 Upvotes: 43 | cs.CV, cs.AI Authors: Ai Jian, Weijie Qiu, Xiaokun Wang, Peiyu Wang, Yunzhuo Hao, Jiangbo Pei, Yichen Wei, Yi Peng, Xuchen Song ...

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Episode 864 · 22:23

🤗 Upvotes: 29 | cs.CL, cs.AI, cs.CV Authors: Qianhui Wu, Kanzhi Cheng, Rui Yang, Chaoyun Zhang, Jianwei Yang, Huiqiang Jiang, Jian Mu, Baolin Pe...

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Episode 863 · 22:17

🤗 Upvotes: 29 | cs.CV, cs.RO Authors: Gen Luo, Ganlin Yang, Ziyang Gong, Guanzhou Chen, Haonan Duan, Erfei Cui, Ronglei Tong, Zhi Hou, Tianyi Zh...

OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Episode 862 · 24:28

🤗 Upvotes: 28 | cs.AI Authors: Shengjia Zhang, Junjie Wu, Jiawei Chen, Changwang Zhang, Xingyu Lou, Wangchunshu Zhou, Sheng Zhou, Can Wang, Jun ...

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Episode 861 · 22:08

🤗 Upvotes: 99 | cs.CL, cs.AI, cs.LG Authors: Shenzhi Wang, Le Yu, Chang Gao, Chujie Zheng, Shixuan Liu, Rui Lu, Kai Dang, Xionghui Chen, Jianxin...

REASONING GYM: Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Episode 860 · 21:38

🤗 Upvotes: 52 | cs.LG, cs.AI, cs.CL Authors: Zafir Stojanovski, Oliver Stanley, Joe Sharratt, Richard Jones, Abdulhakeem Adefioye, Jean Kaddour,...

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Episode 859 · 21:06

🤗 Upvotes: 48 | cs.LG, cs.RO Authors: Mustafa Shukor, Dana Aubakirova, Francesco Capuano, Pepijn Kooijmans, Steven Palma, Adil Zouitine, Michel ...

Taming LLMs by Scaling Learning Rates with Gradient Grouping

Episode 858 · 20:20

🤗 Upvotes: 33 | cs.LG, cs.AI Authors: Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu Title: ...

ARIA: Training Language Agents with Intention-Driven Reward Aggregation

Episode 857 · 23:45

🤗 Upvotes: 26 | cs.CL Authors: Ruihan Yang, Yikai Zhang, Aili Chen, Xintao Wang, Siyu Yuan, Jiangjie Chen, Deqing Yang, Yanghua Xiao ...

Temporal In-Context Fine-Tuning for Versatile Control of Video Diffusion Models

Episode 856 · 20:03

🤗 Upvotes: 24 | cs.CV Authors: Kinam Kim, Junha Hyung, Jaegul Choo Title: Temporal In-Context Fine-Tuning for Versatile...

LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks

Episode 855 · 19:27

🤗 Upvotes: 24 | cs.RO, cs.AI Authors: Yi Yang, Jiaxuan Sun, Siqi Kou, Yihan Wang, Zhijie Deng Title: LoHoVLA: A Unified...

Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles

Episode 854 · 25:00

🤗 Upvotes: 24 | cs.CV, cs.AI, cs.CL Authors: Zifu Wang, Junyi Zhu, Bo Tang, Zhiyu Li, Feiyu Xiong, Jiaqian Yu, Matthew B. Blaschko ...

ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding

Episode 853 · 22:06

🤗 Upvotes: 23 | cs.CV Authors: Junliang Ye, Zhengyi Wang, Ruowen Zhao, Shenghao Xie, Jun Zhu Title: ShapeLLM-Omni: A Na...

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

Episode 852 · 20:39

🤗 Upvotes: 21 | cs.CL Authors: Zhongwei Wan, Zhihao Dou, Che Liu, Yu Zhang, Dongfei Cui, Qinjian Zhao, Hui Shen, Jing Xiong, Yi Xin, Yifan Jiang...

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Episode 851 · 21:28

🤗 Upvotes: 83 | cs.CL, cs.AI Authors: Mingjie Liu, Shizhe Diao, Ximing Lu, Jian Hu, Xin Dong, Yejin Choi, Jan Kautz, Yi Dong Title:...

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Episode 850 · 20:57

🤗 Upvotes: 63 | cs.CL Authors: Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, S...

Time Blindness: Why Video-Language Models Can't See What Humans Can?

Episode 849 · 22:28

🤗 Upvotes: 59 | cs.CV, cs.AI Authors: Ujjwal Upadhyay, Mukul Ranjan, Zhiqiang Shen, Mohamed Elhoseiny Title: Time Blind...

HardTests: Synthesizing High-Quality Test Cases for LLM Coding

Episode 848 · 21:40

🤗 Upvotes: 37 | cs.CL Authors: Zhongmou He, Yee Man Choi, Kexun Zhang, Jiabao Ji, Junting Zhou, Dejia Xu, Ivan Bercovich, Aidan Zhang, Lei Li ...