Episodes

Latest Episode
$\text{Transformer}^2$: Self-adaptive LLMs

Episode 388 · 26:40

🤗 Upvotes: 25 | cs.LG, cs.AI, cs.CL Authors: Qi Sun, Edoardo Cetin, Yujin Tang Title: $\text{Transformer}^2$: Self-adap...

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Episode 387 · 23:32

🤗 Upvotes: 21 | cs.CL, cs.AI, cs.HC, cs.SD, eess.AS Authors: Qian Chen, Yafeng Chen, Yanni Chen, Mengzhe Chen, Yingda Chen, Chong Deng, Zhihao D...

VideoAuteur: Towards Long Narrative Video Generation

Episode 386 · 21:43

🤗 Upvotes: 21 | cs.CV Authors: Junfei Xiao, Feng Cheng, Lu Qi, Liangke Gui, Jiepeng Cen, Zhibei Ma, Alan Yuille, Lu Jiang Title: ...

O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning

Episode 385 · 24:56

🤗 Upvotes: 18 | cs.CL Authors: Zhongzhen Huang, Gui Geng, Shengyi Hua, Zhen Huang, Haoyang Zou, Shaoting Zhang, Pengfei Liu, Xiaofan Zhang ...

WebWalker: Benchmarking LLMs in Web Traversal

Episode 384 · 24:03

🤗 Upvotes: 16 | cs.CL, cs.AI Authors: Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zh...

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Episode 383 · 20:43

🤗 Upvotes: 12 | cs.LG, cs.AI, cs.CL Authors: Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu Title: ...

UnCommon Objects in 3D

Episode 382 · 21:50

🤗 Upvotes: 8 | cs.CV, cs.AI, cs.GR Authors: Xingchen Liu, Piyush Tayal, Jianyuan Wang, Jesus Zarzar, Tom Monnier, Konstantinos Tertikas, Jiali D...

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Episode 381 · 22:20

🤗 Upvotes: 43 | cs.CV, cs.AI, cs.CL, cs.IR, cs.LG Authors: Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang Title: ...

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Episode 380 · 22:27

🤗 Upvotes: 29 | cs.CV, cs.AI Authors: Yifei Li, Junbo Niu, Ziyang Miao, Chunjiang Ge, Yuanhang Zhou, Qihao He, Xiaoyi Dong, Haodong Duan, Shuang...

Enabling Scalable Oversight via Self-Evolving Critic

Episode 379 · 27:51

🤗 Upvotes: 22 | cs.CL, cs.AI, cs.LG Authors: Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang...

Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models

Episode 378 · 21:58

🤗 Upvotes: 14 | cs.CL, cs.AI, cs.CV Authors: You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li...

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding

Episode 377 · 22:48

🤗 Upvotes: 10 | cs.CV, cs.CL Authors: Xingyu Fu, Minqian Liu, Zhengyuan Yang, John Corring, Yijuan Lu, Jianwei Yang, Dan Roth, Dinei Florencio, ...

ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning

Episode 376 · 23:39

🤗 Upvotes: 10 | cs.CV Authors: Yuzhou Huang, Ziyang Yuan, Quande Liu, Qiulin Wang, Xintao Wang, Ruimao Zhang, Pengfei Wan, Di Zhang, Kun Gai ...

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains

Episode 375 · 22:24

🤗 Upvotes: 8 | cs.CL, cs.AI, cs.LG Authors: Vighnesh Subramaniam, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Shuang Li, Igor Mordatch ...

The GAN is dead; long live the GAN! A Modern GAN Baseline

Episode 374 · 20:12

🤗 Upvotes: 27 | cs.LG, cs.CV Authors: Yiwen Huang, Aaron Gokaslan, Volodymyr Kuleshov, James Tompkin Title: The GAN is ...

An Empirical Study of Autoregressive Pre-training from Videos

Episode 373 · 21:47

🤗 Upvotes: 17 | cs.CV, cs.AI Authors: Jathushan Rajasegaran, Ilija Radosavovic, Rahul Ravishankar, Yossi Gandelsman, Christoph Feichtenhofer, Ji...

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

Episode 372 · 21:45

🤗 Upvotes: 10 | cs.CV, cs.RO Authors: Shaoyuan Xie, Lingdong Kong, Yuhao Dong, Chonghao Sima, Wenwei Zhang, Qi Alfred Chen, Ziwei Liu, Liang Pan...

Entropy-Guided Attention for Private LLMs

Episode 371 · 24:10

🤗 Upvotes: 6 | cs.LG, cs.CR Authors: Nandan Kumar Jha, Brandon Reagen Title: Entropy-Guided Attention for Private LLMs ...

On Computational Limits and Provably Efficient Criteria of Visual Autoregressive Models: A Fine-Grained Complexity Analysis

Episode 370 · 19:15

🤗 Upvotes: 5 | cs.LG, cs.AI, cs.CC, cs.CV Authors: Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song Title: ...

Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model

Episode 369 · 22:09

🤗 Upvotes: 5 | cs.CL, cs.CV Authors: Gregor Geigle, Florian Schneider, Carolin Holtermann, Chris Biemann, Radu Timofte, Anne Lauscher, Goran Gla...

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Episode 368 · 21:42

🤗 Upvotes: 4 | cs.CL Authors: Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen Title: SWE-Fixer:...

Building Foundations for Natural Language Processing of Historical Turkish: Resources and Models

Episode 367 · 26:35

🤗 Upvotes: 3 | cs.CL Authors: Şaziye Betül Özateş, Tarık Emre Tıraş, Ece Elif Adak, Berat Doğan, Fatih Burak Karagöz, Efe Eren Genç, Esma F. Bil...

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Episode 366 · 27:07

🤗 Upvotes: 116 | cs.CL Authors: Xinyu Guan, Li Lyna Zhang, Yifei Liu, Ning Shang, Youran Sun, Yi Zhu, Fan Yang, Mao Yang Title: ...

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Episode 365 · 24:54

🤗 Upvotes: 47 | cs.AI, cs.CL Authors: Violet Xiang, Charlie Snell, Kanishk Gandhi, Alon Albalak, Anikait Singh, Chase Blagden, Duy Phung, Rafael...

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Episode 364 · 23:38

🤗 Upvotes: 38 | cs.CL, cs.AI, cs.LG Authors: Ruilin Luo, Zhuofan Zheng, Yifan Wang, Yiyao Yu, Xinzhe Ni, Zicheng Lin, Jin Zeng, Yujiu Yang ...

Agent Laboratory: Using LLM Agents as Research Assistants

Episode 363 · 23:32

🤗 Upvotes: 38 | cs.HC, cs.AI, cs.CL, cs.LG Authors: Samuel Schmidgall, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu, Xiaodong Yu, Jiang Liu, Zich...

LLM4SR: A Survey on Large Language Models for Scientific Research

Episode 362 · 25:14

🤗 Upvotes: 21 | cs.CL, cs.DL Authors: Ziming Luo, Zonglin Yang, Zexin Xu, Wei Yang, Xinya Du Title: LLM4SR: A Survey on...

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection

Episode 361 · 21:01

🤗 Upvotes: 16 | cs.AI, cs.CL, cs.HC Authors: Yuhang Liu, Pengxiang Li, Zishu Wei, Congkai Xie, Xueyu Hu, Xinchen Xu, Shengyu Zhang, Xiaotian Han...

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Episode 360 · 22:59

🤗 Upvotes: 12 | cs.CV, cs.GR Authors: Zixuan Huang, Mark Boss, Aaryaman Vasishta, James M. Rehg, Varun Jampani Title: S...

GeAR: Generation Augmented Retrieval

Episode 359 · 22:10

🤗 Upvotes: 12 | cs.IR, cs.CL Authors: Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, Qi Zhang ...