Episodes

Latest Episode
PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models

Episode 283 · 26:24

🤗 Upvotes: 5 | cs.CV Authors: Minghao Chen, Roman Shapovalov, Iro Laina, Tom Monnier, Jianyuan Wang, David Novotny, Andrea Vedaldi ...

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Episode 282 · 22:35

🤗 Upvotes: 3 | cs.CV, cs.AI Authors: Shijie Wang, Samaneh Azadi, Rohit Girdhar, Saketh Rambhatla, Chen Sun, Xi Yin Title: ...

Bridging the Data Provenance Gap Across Text, Speech and Video

Episode 281 · 25:29

🤗 Upvotes: 3 | cs.AI, cs.CL, cs.CY, cs.LG, cs.MM Authors: Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, Will...

RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response

Episode 280 · 21:39

🤗 Upvotes: 64 | cs.CL, cs.AI Authors: Junyu Luo, Xiao Luo, Kaize Ding, Jingyang Yuan, Zhiping Xiao, Ming Zhang Title: R...

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Episode 279 · 20:41

🤗 Upvotes: 29 | cs.AI, cs.CL, cs.LG Authors: Weihao Zeng, Yuzhen Huang, Lulu Zhao, Yijun Wang, Zifei Shan, Junxian He Title: ...

Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching

Episode 278 · 24:00

🤗 Upvotes: 26 | cs.CV, cs.LG Authors: Enshu Liu, Xuefei Ning, Yu Wang, Zinan Lin Title: Distilled Decoding 1: One-step ...

Diving into Self-Evolving Training for Multimodal Reasoning

Episode 277 · 21:08

🤗 Upvotes: 23 | cs.CL, cs.AI, cs.CV, cs.LG Authors: Wei Liu, Junlong Li, Xiwen Zhang, Fan Zhou, Yu Cheng, Junxian He Title: ...

Deliberation in Latent Space via Differentiable Cache Augmentation

Episode 276 · 22:28

🤗 Upvotes: 16 | cs.CL, cs.AI, cs.LG Authors: Luyang Liu, Jonas Pfeiffer, Jiaxing Wu, Jun Xie, Arthur Szlam Title: Delib...

Large Motion Video Autoencoding with Cross-modal Video VAE

Episode 275 · 25:08

🤗 Upvotes: 15 | cs.CV Authors: Yazhou Xing, Yang Fei, Yingqing He, Jingye Chen, Jiaxin Xie, Xiaowei Chi, Qifeng Chen Title: ...

OpenAI o1 System Card

Episode 274 · 25:01

🤗 Upvotes: 12 | cs.AI Authors: OpenAI, Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksan...

Revisiting In-Context Learning with Long Context Language Models

Episode 273 · 23:40

🤗 Upvotes: 12 | cs.CL, cs.AI, cs.LG Authors: Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob, Oh, Siddharth Dalmia, Prateek Kolhar ...

Outcome-Refining Process Supervision for Code Generation

Episode 272 · 21:12

🤗 Upvotes: 11 | cs.CL, cs.AI, cs.LG, cs.SE Authors: Zhuohao Yu, Weizheng Gu, Yidong Wang, Zhengran Zeng, Jindong Wang, Wei Ye, Shikun Zhang ...

LearnLM: Improving Gemini for Learning

Episode 271 · 27:18

🤗 Upvotes: 9 | cs.CY, cs.AI, cs.LG Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Brett Wiltshire...

Parallelized Autoregressive Visual Generation

Episode 270 · 22:32

🤗 Upvotes: 34 | cs.CV Authors: Yuqing Wang, Shuhuai Ren, Zhijie Lin, Yujin Han, Haoyuan Guo, Zhenheng Yang, Difan Zou, Jiashi Feng, Xihui Liu ...

Offline Reinforcement Learning for LLM Multi-Step Reasoning

Episode 269 · 20:59

🤗 Upvotes: 19 | cs.LG, cs.AI, cs.CL Authors: Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu Title:...

SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation

Episode 268 · 22:00

🤗 Upvotes: 17 | cs.CL Authors: Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, Deyu Zhou Title: SCOPE: O...

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Episode 267 · 25:23

🤗 Upvotes: 13 | cs.CV Authors: Songhua Liu, Zhenxiong Tan, Xinchao Wang Title: CLEAR: Conv-Like Linearization Revs Pre-...

Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Episode 266 · 23:23

🤗 Upvotes: 12 | cs.CV, cs.LG, cs.SD, eess.AS Authors: Ho Kei Cheng, Masato Ishii, Akio Hayakawa, Takashi Shibuya, Alexander Schwing, Yuki Mitsuf...

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

Episode 265 · 28:03

🤗 Upvotes: 9 | cs.CV Authors: Saehyung Lee, Seunghyun Yoon, Trung Bui, Jing Shi, Sungroh Yoon Title: Toward Robust Hype...

Sequence Matters: Harnessing Video Models in 3D Super-Resolution

Episode 264 · 22:10

🤗 Upvotes: 6 | cs.CV, 68U10, 68T10, I.4.5; I.2.10 Authors: Hyun-kyu Ko, Dongheok Park, Youngin Park, Byeonghyeon Lee, Juhee Han, Eunbyung Park ...

TRecViT: A Recurrent Video Transformer

Episode 263 · 25:04

🤗 Upvotes: 5 | cs.CV, cs.LG Authors: Viorica Pătrăucean, Xu Owen He, Joseph Heyward, Chuhan Zhang, Mehdi S. M. Sajjadi, George-Cristian Muraru, ...

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Episode 262 · 23:01

🤗 Upvotes: 4 | cs.LG Authors: Zhen Zheng, Xiaonan Song, Chuanjie Liu Title: MixLLM: LLM Quantization with Global Mixed-...

Multi-LLM Text Summarization

Episode 261 · 23:19

🤗 Upvotes: 3 | cs.CL Authors: Jiangnan Fang, Cheng-Tse Liu, Jieun Kim, Yash Bhedaru, Ethan Liu, Nikhil Singh, Nedim Lipka, Puneet Mathur, Nesree...

Qwen2.5 Technical Report

Episode 260 · 25:31

🤗 Upvotes: 236 | cs.CL Authors: Qwen, An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei ...

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Episode 259 · 23:02

🤗 Upvotes: 44 | cs.CV, cs.CL Authors: Junjie Zhou, Zheng Liu, Ze Liu, Shitao Xiao, Yueze Wang, Bo Zhao, Chen Jason Zhang, Defu Lian, Yongping Xi...

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Episode 258 · 23:11

🤗 Upvotes: 23 | cs.CL, cs.AI Authors: Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Y...

How to Synthesize Text Data without Model Collapse?

Episode 257 · 24:20

🤗 Upvotes: 19 | cs.CL, cs.AI, cs.LG Authors: Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Z...

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Episode 256 · 19:57

🤗 Upvotes: 17 | cs.CV Authors: Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh Title: Flowing from Words to ...

Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion

Episode 255 · 20:44

🤗 Upvotes: 13 | cs.CV Authors: Jixuan He, Wanhua Li, Ye Liu, Junsik Kim, Donglai Wei, Hanspeter Pfister Title: Affordan...

LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis

Episode 254 · 21:08

🤗 Upvotes: 12 | cs.CV Authors: Hanlin Wang, Hao Ouyang, Qiuyu Wang, Wen Wang, Ka Leong Cheng, Qifeng Chen, Yujun Shen, Limin Wang T...