Episodes

Latest Episode
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Episode 787 · 22:49

🤗 Upvotes: 80 | cs.CL Authors: Hyungjoo Chae, Sunghwan Kim, Junhee Cho, Seungone Kim, Seungjun Moon, Gyeom Hwangbo, Dongha Lim, Minjin Kim, Yeon...

MMaDA: Multimodal Large Diffusion Language Models

Episode 786 · 21:00

🤗 Upvotes: 56 | cs.CV Authors: Ling Yang, Ye Tian, Bowen Li, Xinchen Zhang, Ke Shen, Yunhai Tong, Mengdi Wang Title: MM...

Scaling Law for Quantization-Aware Training

Episode 785 · 19:44

🤗 Upvotes: 56 | cs.LG, cs.CL Authors: Mengzhao Chen, Chaoyi Zhang, Jing Liu, Yutao Zeng, Zeyue Xue, Zhiheng Liu, Yunshui Li, Jin Ma, Jie Huang, ...

UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning

Episode 784 · 18:15

🤗 Upvotes: 43 | cs.CV Authors: Sule Bai, Mingxing Li, Yong Liu, Jing Tang, Haoji Zhang, Lei Sun, Xiangxiang Chu, Yansong Tang Title...

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

Episode 783 · 20:47

🤗 Upvotes: 41 | cs.CL Authors: Siyue Zhang, Yilun Zhao, Liyuan Geng, Arman Cohan, Anh Tuan Luu, Chen Zhao Title: Diffus...

Efficient Agent Training for Computer Use

Episode 782 · 23:07

🤗 Upvotes: 32 | cs.AI, cs.CL, cs.LG Authors: Yanheng He, Jiahe Jin, Pengfei Liu Title: Efficient Agent Training for Com...

This Time is Different: An Observability Perspective on Time Series Foundation Models

Episode 781 · 22:05

🤗 Upvotes: 28 | cs.LG, cs.AI Authors: Ben Cohen, Emaad Khwaja, Youssef Doubli, Salahidine Lemaachi, Chris Lettieri, Charles Masson, Hugo Miccini...

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Episode 780 · 17:46

🤗 Upvotes: 24 | cs.CL, cs.AI, cs.LG Authors: Wei Liu, Ruochen Zhou, Yiyun Deng, Yuzhen Huang, Junteng Liu, Yuntian Deng, Yizhe Zhang, Junxian He...

Emerging Properties in Unified Multimodal Pretraining

Episode 779 · 22:46

🤗 Upvotes: 87 | cs.CV Authors: Chaorui Deng, Deyao Zhu, Kunchang Li, Chenhui Gou, Feng Li, Zeyu Wang, Shu Zhong, Weihao Yu, Xiaonan Nie, Ziang S...

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Episode 778 · 21:11

🤗 Upvotes: 48 | cs.LG, cs.AI, cs.AR, cs.CV, cs.PF Authors: Jintao Zhang, Jia Wei, Pengle Zhang, Xiaoming Xu, Haofeng Huang, Haoxu Wang, Kai Jian...

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Episode 777 · 21:59

🤗 Upvotes: 30 | cs.LG, cs.AI, cs.CL Authors: Penghui Qi, Zichen Liu, Tianyu Pang, Chao Du, Wee Sun Lee, Min Lin Title: ...

VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank

Episode 776 · 20:39

🤗 Upvotes: 28 | cs.CV Authors: Tianhe Wu, Jian Zou, Jie Liang, Lei Zhang, Kede Ma Title: VisualQuality-R1: Reasoning-In...

Visual Agentic Reinforcement Fine-Tuning

Episode 775 · 23:31

🤗 Upvotes: 26 | cs.CV, cs.AI Authors: Ziyu Liu, Yuhang Zang, Yushan Zou, Zijian Liang, Xiaoyi Dong, Yuhang Cao, Haodong Duan, Dahua Lin, Jiaqi W...

Neurosymbolic Diffusion Models

Episode 774 · 23:44

🤗 Upvotes: 25 | cs.LG Authors: Emile van Krieken, Pasquale Minervini, Edoardo Ponti, Antonio Vergari Title: Neurosymbol...

Chain-of-Model Learning for Language Model

Episode 773 · 23:37

🤗 Upvotes: 70 | cs.CL Authors: Kaitao Song, Xiaohua Wang, Xu Tan, Huiqiang Jiang, Chengruidong Zhang, Yongliang Shen, Cen LU, Zihao Li, Zifan So...

AdaptThink: Reasoning Models Can Learn When to Think

Episode 772 · 20:31

🤗 Upvotes: 58 | cs.CL, cs.AI, cs.LG Authors: Jiajie Zhang, Nianyi Lin, Lei Hou, Ling Feng, Juanzi Li Title: AdaptThink:...

AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning

Episode 771 · 20:57

🤗 Upvotes: 46 | cs.LG, cs.AI Authors: Chenwei Lou, Zewei Sun, Xinnian Liang, Meng Qu, Wei Shen, Wenqi Wang, Yuntao Li, Qingping Yang, Shuangzhi ...

Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Episode 770 · 20:30

🤗 Upvotes: 39 | cs.LG Authors: Jeffrey Willette, Heejun Lee, Sung Ju Hwang Title: Delta Attention: Fast and Accurate Sp...

Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis

Episode 769 · 22:05

🤗 Upvotes: 34 | cs.AI, cs.CL, cs.CV, cs.HC Authors: Tianbao Xie, Jiaqi Deng, Xiaochuan Li, Junlin Yang, Haoyuan Wu, Jixuan Chen, Wenjing Hu, Xin...

Faster Video Diffusion with Trainable Sparse Attention

Episode 768 · 25:40

🤗 Upvotes: 29 | cs.CV Authors: Peiyuan Zhang, Haofeng Huang, Yongqi Chen, Will Lin, Zhengzhong Liu, Ion Stoica, Eric P. Xing, Hao Zhang ...

Thinkless: LLM Learns When to Think

Episode 767 · 17:59

🤗 Upvotes: 28 | cs.CL, cs.AI Authors: Gongfan Fang, Xinyin Ma, Xinchao Wang Title: Thinkless: LLM Learns When to Think ...

Model Merging in Pre-training of Large Language Models

Episode 766 · 23:02

🤗 Upvotes: 27 | cs.CL, cs.LG Authors: Yunshui Li, Yiyuan Ma, Shen Yan, Chaoyi Zhang, Jing Liu, Jianqiao Lu, Ziwen Xu, Mengzhao Chen, Minrui Wang...

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Episode 765 · 24:52

🤗 Upvotes: 23 | cs.LG, cs.AI, cs.CL Authors: Hengli Li, Chenxi Li, Tong Wu, Xuekai Zhu, Yuxuan Wang, Zhaoxin Yu, Eric Hanchen Jiang, Song-Chun Z...

Qwen3 Technical Report

Episode 764 · 21:31

🤗 Upvotes: 117 | cs.CL Authors: An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chen...

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Episode 763 · 23:50

🤗 Upvotes: 43 | cs.AI, cs.CR Authors: Yue Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, Hongcheng Gao, Cheng Wang, Xinfeng Li, Kun Wang,...

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Episode 762 · 19:26

🤗 Upvotes: 42 | cs.CV, cs.CL Authors: Zhaowei Wang, Wenhao Yu, Xiyu Ren, Jipeng Zhang, Yu Zhao, Rohit Saxena, Liang Cheng, Ginny Wong, Simon See...

Visual Planning: Let's Think Only with Images

Episode 761 · 21:56

🤗 Upvotes: 33 | cs.LG, cs.AI, cs.CL, cs.CV Authors: Yi Xu, Chengzu Li, Han Zhou, Xingchen Wan, Caiqi Zhang, Anna Korhonen, Ivan Vulić ...

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Episode 760 · 21:52

🤗 Upvotes: 76 | cs.CL Authors: Zhiyuan Hu, Yibo Wang, Hanze Dong, Yuhui Xu, Amrita Saha, Caiming Xiong, Bryan Hooi, Junnan Li Title...

System Prompt Optimization with Meta-Learning

Episode 759 · 21:40

🤗 Upvotes: 48 | cs.CL, cs.AI, cs.LG Authors: Yumin Choi, Jinheon Baek, Sung Ju Hwang Title: System Prompt Optimization ...

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Episode 758 · 19:27

🤗 Upvotes: 49 | cs.CV, cs.AI Authors: Jiuhai Chen, Zhiyang Xu, Xichen Pan, Yushi Hu, Can Qin, Tom Goldstein, Lifu Huang, Tianyi Zhou, Saining Xi...