Episodes

Latest Episode
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs

Episode 1203 · · 21:00

🤗 Upvotes: 58 | cs.CL Authors: Yuhan Song, Linhao Zhang, Chuhan Wu, Aiwei Liu, Wei Jia, Houfeng Wang, Xiao Zhou Title: ...

Multiplayer Nash Preference Optimization

Episode 1202 · · 26:13

🤗 Upvotes: 52 | cs.AI, cs.CL Authors: Fang Wu, Xu Huang, Weihao Xuan, Zhiwei Zhang, Yijia Xiao, Guancheng Wan, Xiaomin Li, Bing Hu, Peng Xia, Ju...

RealUnify: Do Unified Models Truly Benefit from Unification? A Comprehensive Benchmark

Episode 1201 · · 25:04

🤗 Upvotes: 41 | cs.AI Authors: Yang Shi, Yuhao Dong, Yue Ding, Yuran Wang, Xuanyu Zhu, Sheng Zhou, Wenting Liu, Haochen Tian, Rundong Wang, Huan...

Beyond the Exploration-Exploitation Trade-off: A Hidden State Approach for LLM Reasoning in RLVR

Episode 1200 · · 24:39

🤗 Upvotes: 39 | cs.LG, cs.CL Authors: Fanding Huang, Guanbo Huang, Xiao Fan, Yi He, Xiao Liang, Xiao Chen, Qinting Jiang, Faisal Nadeem Khan, Ji...

OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing

Episode 1199 · · 19:35

🤗 Upvotes: 38 | cs.CV, cs.AI Authors: Zhihong Chen, Xuehai Bai, Yang Shi, Chaoyou Fu, Huanyu Zhang, Haotian Wang, Xiaoyan Sun, Zhang Zhang, Lian...

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer

Episode 1198 · · 26:23

🤗 Upvotes: 36 | cs.CV, cs.AI Authors: Junsong Chen, Yuyang Zhao, Jincheng Yu, Ruihang Chu, Junyu Chen, Shuai Yang, Xianbang Wang, Yicheng Pan, D...

Democratizing AI scientists using ToolUniverse

Episode 1197 · · 26:05

🤗 Upvotes: 33 | cs.AI, cs.LG Authors: Shanghua Gao, Richard Zhu, Pengwei Sui, Zhenglun Kong, Sufian Aldogom, Yepeng Huang, Ayush Noori, Reza Sha...

Visual Jigsaw Post-Training Improves MLLMs

Episode 1196 · · 23:16

🤗 Upvotes: 33 | cs.CV Authors: Penghao Wu, Yushan Zhang, Haiwen Diao, Bo Li, Lewei Lu, Ziwei Liu Title: Visual Jigsaw P...

When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance

Episode 1195 · · 24:47

🤗 Upvotes: 29 | cs.CL Authors: Nicolas Boizard, Hippolyte Gisserot-Boukhlef, Kevin El-Haddad, Céline Hudelot, Pierre Colombo Title:...

LongLive: Real-time Interactive Long Video Generation

Episode 1194 · · 24:54

🤗 Upvotes: 136 | cs.CV Authors: Shuai Yang, Wei Huang, Ruihang Chu, Yicheng Xiao, Yuyang Zhao, Xianbang Wang, Muyang Li, Enze Xie, Yingcong Chen...

Quantile Advantage Estimation for Entropy-Safe Reasoning

Episode 1193 · · 23:16

🤗 Upvotes: 102 | cs.LG, cs.AI Authors: Junkang Wu, Kexin Huang, Jiancan Wu, An Zhang, Xiang Wang, Xiangnan He Title: Qu...

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Episode 1192 · · 27:27

🤗 Upvotes: 98 | cs.LG, cs.CL Authors: Xu Wujiang, Wentian Zhao, Zhenting Wang, Li Yu-Jhe, Jin Can, Jin Mingyu, Mei Kai, Wan Kun, Metaxas Dimitri...

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Episode 1191 · · 25:00

🤗 Upvotes: 81 | cs.CV, cs.CL Authors: Junbo Niu, Zheng Liu, Zhuangcheng Gu, Bin Wang, Linke Ouyang, Zhiyuan Zhao, Tao Chu, Tianyao He, Fan Wu, Q...

ReviewScore: Misinformed Peer Review Detection with Large Language Models

Episode 1190 · · 21:58

🤗 Upvotes: 54 | cs.CL Authors: Hyun Ryu, Doohyuk Jang, Hyemin S. Lee, Joonhyun Jeong, Gyeongman Kim, Donghyeon Cho, Gyouk Chu, Minyeong Hwang, H...

Variational Reasoning for Language Models

Episode 1189 · · 22:33

🤗 Upvotes: 51 | cs.CL, cs.AI, cs.LG Authors: Xiangxin Zhou, Zichen Liu, Haonan Wang, Chao Du, Min Lin, Chongxuan Li, Liang Wang, Tianyu Pang ...

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Episode 1188 · · 23:30

🤗 Upvotes: 48 | cs.CL, cs.AI, cs.LG Authors: Renjie Luo, Zichen Liu, Xiangyan Liu, Chao Du, Min Lin, Wenhu Chen, Wei Lu, Tianyu Pang ...

MesaTask: Towards Task-Driven Tabletop Scene Generation via 3D Spatial Reasoning

Episode 1187 · · 25:33

🤗 Upvotes: 28 | cs.CV, cs.RO Authors: Jinkun Hao, Naifu Liang, Zhen Luo, Xudong Xu, Weipeng Zhong, Ran Yi, Yichen Jin, Zhaoyang Lyu, Feng Zheng,...

CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

Episode 1186 · · 23:54

🤗 Upvotes: 28 | cs.CV, cs.AI, cs.CL Authors: Long Xing, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jianze Liang, Qidong Huang, Jiaqi Wang, Feng Wu, D...

No Prompt Left Behind: Exploiting Zero-Variance Prompts in LLM Reinforcement Learning via Entropy-Guided Advantage Shaping

Episode 1185 · · 27:53

🤗 Upvotes: 27 | cs.CL, cs.AI, cs.LG Authors: Thanh-Long V. Le, Myeongho Jeon, Kim Vu, Viet Lai, Eunho Yang Title: No Pr...

VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models

Episode 1184 · · 22:17

🤗 Upvotes: 95 | cs.LG, cs.CL Authors: Guochao Jiang, Wenfeng Feng, Guofeng Quan, Chuzhan Hao, Yuewei Zhang, Guohua Liu, Hao Wang Ti...