Episodes

Latest Episode
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Episode 402 · 25:15

🤗 Upvotes: 59 | cs.CV, cs.AI, cs.CL Authors: Yilun Zhao, Lujing Xie, Haowei Zhang, Guo Gan, Yitao Long, Zhiyuan Hu, Tongyan Hu, Weiyuan Chen, Ch...

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Episode 401 · 23:40

🤗 Upvotes: 51 | cs.LG, cs.CL Authors: Zihan Qiu, Zeyu Huang, Bo Zheng, Kaiyue Wen, Zekun Wang, Rui Men, Ivan Titov, Dayiheng Liu, Jingren Zhou, ...

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Episode 400 · 26:00

🤗 Upvotes: 32 | cs.CV Authors: Daniel Garibi, Shahar Yadin, Roni Paiss, Omer Tov, Shiran Zada, Ariel Ephrat, Tomer Michaeli, Inbar Mosseri, Tali...

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Episode 399 · 20:57

🤗 Upvotes: 31 | cs.AI, cs.CL, cs.CV, cs.HC Authors: Yujia Qin, Yining Ye, Junjie Fang, Haoming Wang, Shihao Liang, Shizuo Tian, Junda Zhang, Jia...

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Episode 398 · 21:26

🤗 Upvotes: 26 | cs.CV, cs.CL Authors: Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Ziyu Liu, Shengyuan Ding, Shenxi Wu, Yubo Ma, Haodong Dua...

Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks

Episode 397 · 23:22

🤗 Upvotes: 20 | cs.CL, cs.CV Authors: Zhenhailong Wang, Haiyang Xu, Junyang Wang, Xi Zhang, Ming Yan, Ji Zhang, Fei Huang, Heng Ji ...

Reasoning Language Models: A Blueprint

Episode 396 · 21:43

🤗 Upvotes: 18 | cs.AI, cs.CL Authors: Maciej Besta, Julia Barth, Eric Schreiber, Ales Kubicek, Afonso Catarino, Robert Gerstenberger, Piotr Nycz...

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation

Episode 395 · 20:46

🤗 Upvotes: 16 | cs.CV Authors: Zibo Zhao, Zeqiang Lai, Qingxiang Lin, Yunfei Zhao, Haolin Liu, Shuhui Yang, Yifei Feng, Mingxin Yang, Sheng Zhan...

Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments

Episode 394 · 19:28

🤗 Upvotes: 15 | cs.LG, cs.AI Authors: Hongjin Su, Ruoxi Sun, Jinsung Yoon, Pengcheng Yin, Tao Yu, Sercan Ö. Arık Title: ...

GameFactory: Creating New Games with Generative Interactive Videos

Episode 393 · 22:49

🤗 Upvotes: 48 | cs.CV Authors: Jiwen Yu, Yiran Qin, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu Title: GameFactory: C...

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos

Episode 392 · 19:25

🤗 Upvotes: 8 | cs.CV Authors: Zhongwei Ren, Yunchao Wei, Xun Guo, Yao Zhao, Bingyi Kang, Jiashi Feng, Xiaojie Jin Title: ...

SEAL: Entangled White-box Watermarks on Low-Rank Adaptation

Episode 391 · 22:16

🤗 Upvotes: 2 | cs.AI, cs.CR Authors: Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu Title...

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Episode 390 · 18:53

🤗 Upvotes: 53 | cs.CL, cs.AI, cs.LG Authors: Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren ...

Tensor Product Attention Is All You Need

Episode 389 · 20:57

🤗 Upvotes: 38 | cs.CL, cs.AI, cs.LG Authors: Yifan Zhang, Yifeng Liu, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew Chi-Chih Yao ...

$\text{Transformer}^2$: Self-adaptive LLMs

Episode 388 · 26:40

🤗 Upvotes: 25 | cs.LG, cs.AI, cs.CL Authors: Qi Sun, Edoardo Cetin, Yujin Tang Title: $\text{Transformer}^2$: Self-adap...

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Episode 387 · 23:32

🤗 Upvotes: 21 | cs.CL, cs.AI, cs.HC, cs.SD, eess.AS Authors: Qian Chen, Yafeng Chen, Yanni Chen, Mengzhe Chen, Yingda Chen, Chong Deng, Zhihao D...

VideoAuteur: Towards Long Narrative Video Generation

Episode 386 · 21:43

🤗 Upvotes: 21 | cs.CV Authors: Junfei Xiao, Feng Cheng, Lu Qi, Liangke Gui, Jiepeng Cen, Zhibei Ma, Alan Yuille, Lu Jiang Title: ...

O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Episode 385 · 24:56

🤗 Upvotes: 18 | cs.CL Authors: Zhongzhen Huang, Gui Geng, Shengyi Hua, Zhen Huang, Haoyang Zou, Shaoting Zhang, Pengfei Liu, Xiaofan Zhang ...

WebWalker: Benchmarking LLMs in Web Traversal

Episode 384 · 24:03

🤗 Upvotes: 16 | cs.CL, cs.AI Authors: Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zh...

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Episode 383 · 20:43

🤗 Upvotes: 12 | cs.LG, cs.AI, cs.CL Authors: Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu Title: ...

UnCommon Objects in 3D

Episode 382 · 21:50

🤗 Upvotes: 8 | cs.CV, cs.AI, cs.GR Authors: Xingchen Liu, Piyush Tayal, Jianyuan Wang, Jesus Zarzar, Tom Monnier, Konstantinos Tertikas, Jiali D...

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Episode 381 · 22:20

🤗 Upvotes: 43 | cs.CV, cs.AI, cs.CL, cs.IR, cs.LG Authors: Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang Title: ...

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Episode 380 · 22:27

🤗 Upvotes: 29 | cs.CV, cs.AI Authors: Yifei Li, Junbo Niu, Ziyang Miao, Chunjiang Ge, Yuanhang Zhou, Qihao He, Xiaoyi Dong, Haodong Duan, Shuang...

Enabling Scalable Oversight via Self-Evolving Critic

Episode 379 · 27:51

🤗 Upvotes: 22 | cs.CL, cs.AI, cs.LG Authors: Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang...

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Episode 378 · 21:58

🤗 Upvotes: 14 | cs.CL, cs.AI, cs.CV Authors: You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li...

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding

Episode 377 · 22:48

🤗 Upvotes: 10 | cs.CV, cs.CL Authors: Xingyu Fu, Minqian Liu, Zhengyuan Yang, John Corring, Yijuan Lu, Jianwei Yang, Dan Roth, Dinei Florencio, ...

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Episode 376 · 23:39

🤗 Upvotes: 10 | cs.CV Authors: Yuzhou Huang, Ziyang Yuan, Quande Liu, Qiulin Wang, Xintao Wang, Ruimao Zhang, Pengfei Wan, Di Zhang, Kun Gai ...

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains

Episode 375 · 22:24

🤗 Upvotes: 8 | cs.CL, cs.AI, cs.LG Authors: Vighnesh Subramaniam, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Shuang Li, Igor Mordatch ...

The GAN is dead; long live the GAN! A Modern GAN Baseline

Episode 374 · 20:12

🤗 Upvotes: 27 | cs.LG, cs.CV Authors: Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin Title: The GAN is ...

An Empirical Study of Autoregressive Pre-training from Videos

Episode 373 · 21:47

🤗 Upvotes: 17 | cs.CV, cs.AI Authors: Jathushan Rajasegaran, Ilija Radosavovic, Rahul Ravishankar, Yossi Gandelsman, Christoph Feichtenhofer, Ji...