Episodes

Latest Episode
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation

Episode 268 · · 22:00

🤗 Upvotes: 17 | cs.CL Authors: Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, Deyu Zhou Title: SCOPE: O...

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Episode 267 · · 25:23

🤗 Upvotes: 13 | cs.CV Authors: Songhua Liu, Zhenxiong Tan, Xinchao Wang Title: CLEAR: Conv-Like Linearization Revs Pre-...

Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Episode 266 · · 23:23

🤗 Upvotes: 12 | cs.CV, cs.LG, cs.SD, eess.AS Authors: Ho Kei Cheng, Masato Ishii, Akio Hayakawa, Takashi Shibuya, Alexander Schwing, Yuki Mitsuf...

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

Episode 265 · · 28:03

🤗 Upvotes: 9 | cs.CV Authors: Saehyung Lee, Seunghyun Yoon, Trung Bui, Jing Shi, Sungroh Yoon Title: Toward Robust Hype...

Sequence Matters: Harnessing Video Models in 3D Super-Resolution

Episode 264 · · 22:10

🤗 Upvotes: 6 | cs.CV, 68U10, 68T10, I.4.5; I.2.10 Authors: Hyun-kyu Ko, Dongheok Park, Youngin Park, Byeonghyeon Lee, Juhee Han, Eunbyung Park ...

TRecViT: A Recurrent Video Transformer

Episode 263 · · 25:04

🤗 Upvotes: 5 | cs.CV, cs.LG Authors: Viorica Pătrăucean, Xu Owen He, Joseph Heyward, Chuhan Zhang, Mehdi S. M. Sajjadi, George-Cristian Muraru, ...

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Episode 262 · · 23:01

🤗 Upvotes: 4 | cs.LG Authors: Zhen Zheng, Xiaonan Song, Chuanjie Liu Title: MixLLM: LLM Quantization with Global Mixed-...

Multi-LLM Text Summarization

Episode 261 · · 23:19

🤗 Upvotes: 3 | cs.CL Authors: Jiangnan Fang, Cheng-Tse Liu, Jieun Kim, Yash Bhedaru, Ethan Liu, Nikhil Singh, Nedim Lipka, Puneet Mathur, Nesree...

Qwen2.5 Technical Report

Episode 260 · · 25:31

🤗 Upvotes: 236 | cs.CL Authors: Qwen, An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei ...

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Episode 259 · · 23:02

🤗 Upvotes: 44 | cs.CV, cs.CL Authors: Junjie Zhou, Zheng Liu, Ze Liu, Shitao Xiao, Yueze Wang, Bo Zhao, Chen Jason Zhang, Defu Lian, Yongping Xi...

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks

Episode 258 · · 23:11

🤗 Upvotes: 23 | cs.CL, cs.AI Authors: Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Y...

How to Synthesize Text Data without Model Collapse?

Episode 257 · · 24:20

🤗 Upvotes: 19 | cs.CL, cs.AI, cs.LG Authors: Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Z...

Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Episode 256 · · 19:57

🤗 Upvotes: 17 | cs.CV Authors: Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh Title: Flowing from Words to ...

Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion

Episode 255 · · 20:44

🤗 Upvotes: 13 | cs.CV Authors: Jixuan He, Wanhua Li, Ye Liu, Junsik Kim, Donglai Wei, Hanspeter Pfister Title: Affordan...

LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis

Episode 254 · · 21:08

🤗 Upvotes: 12 | cs.CV Authors: Hanlin Wang, Hao Ouyang, Qiuyu Wang, Wen Wang, Ka Leong Cheng, Qifeng Chen, Yujun Shen, Limin Wang T...

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Episode 253 · · 23:08

🤗 Upvotes: 8 | cs.CV, cs.AI, cs.GR Authors: Wang Zhao, Yan-Pei Cao, Jiale Xu, Yuejiang Dong, Ying Shan Title: DI-PCG: D...

AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

Episode 252 · · 24:09

🤗 Upvotes: 7 | cs.CL, cs.AI, cs.LG Authors: Zihan Liu, Yang Chen, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping Title: Ac...

No More Adam: Learning Rate Scaling at Initialization is All You Need

Episode 251 · · 21:59

🤗 Upvotes: 177 | cs.LG, cs.AI Authors: Minghao Xu, Lichuan Xiang, Xu Cai, Hongkai Wen Title: No More Adam: Learning Rat...

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Episode 250 · · 21:56

🤗 Upvotes: 36 | cs.CL, cs.AI Authors: Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis...

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Episode 249 · · 24:45

🤗 Upvotes: 30 | cs.CL Authors: Frank F. Xu, Yufan Song, Boxuan Li, Yuxuan Tang, Kritanjali Jain, Mengxue Bao, Zora Z. Wang, Xuhui Zhou, Zhitong ...

AniDoc: Animation Creation Made Easier

Episode 248 · · 22:20

🤗 Upvotes: 29 | cs.CV Authors: Yihao Meng, Hao Ouyang, Hanlin Wang, Qiuyu Wang, Wen Wang, Ka Leong Cheng, Zhiheng Liu, Yujun Shen, Huamin Qu ...

FashionComposer: Compositional Fashion Image Generation

Episode 247 · · 19:47

🤗 Upvotes: 13 | cs.CV Authors: Sihui Ji, Yiyang Wang, Xi Chen, Xiaogang Xu, Hao Luo, Hengshuang Zhao Title: FashionComp...

GUI Agents: A Survey

Episode 246 · · 21:01

🤗 Upvotes: 11 | cs.AI, cs.HC Authors: Dang Nguyen, Jian Chen, Yu Wang, Gang Wu, Namyong Park, Zhengmian Hu, Hanjia Lyu, Junda Wu, Ryan Aponte, Y...

Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning

Episode 245 · · 22:42

🤗 Upvotes: 10 | cs.LG, cs.RO Authors: Moritz Reuss, Jyothish Pari, Pulkit Agrawal, Rudolf Lioutikov Title: Efficient Di...

Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation

Episode 244 · · 20:41

🤗 Upvotes: 10 | cs.CV Authors: Haotong Lin, Sida Peng, Jingxiao Chen, Songyou Peng, Jiaming Sun, Minghuan Liu, Hujun Bao, Jiashi Feng, Xiaowei Z...

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Episode 243 · · 20:52

🤗 Upvotes: 9 | cs.CV Authors: Jihan Yang, Shusheng Yang, Anjali W. Gupta, Rilyn Han, Li Fei-Fei, Saining Xie Title: Thi...

Are Your LLMs Capable of Stable Reasoning?

Episode 242 · · 24:11

🤗 Upvotes: 61 | cs.AI, cs.CL Authors: Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, ...

Multi-Dimensional Insights: Benchmarking Real-World Personalization in Large Multimodal Models

Episode 241 · · 22:34

🤗 Upvotes: 29 | cs.AI, cs.CL, cs.CV Authors: YiFan Zhang, Shanglin Lei, Runqi Qiao, Zhuoma GongQue, Xiaoshuai Song, Guanting Dong, Qiuna Tan, Zh...

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain

Episode 240 · · 23:15

🤗 Upvotes: 29 | cs.CL Authors: Shuting Wang, Jiejun Tan, Zhicheng Dou, Ji-Rong Wen Title: OmniEval: An Omnidirectional ...

Compressed Chain of Thought: Efficient Reasoning Through Dense Representations

Episode 239 · · 23:05

🤗 Upvotes: 21 | cs.CL Authors: Jeffrey Cheng, Benjamin Van Durme Title: Compressed Chain of Thought: Efficient Reasonin...