Episodes

Latest Episode
Optimizing Large Language Model Training Using FP4 Quantization

Episode 448 · 22:09

🤗 Upvotes: 15 | cs.LG, cs.CL Authors: Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng ...

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Episode 447 · 23:00

🤗 Upvotes: 11 | cs.CV Authors: Chenguo Lin, Panwang Pan, Bangbang Yang, Zeming Li, Yadong Mu Title: DiffSplat: Repurpos...

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Episode 446 · 23:23

🤗 Upvotes: 10 | cs.CL, cs.LG Authors: Hongzhi Huang, Defa Zhu, Banggu Wu, Yutao Zeng, Ya Wang, Qiyang Min, Xun Zhou Title: ...

Open Problems in Mechanistic Interpretability

Episode 445 · 25:48

🤗 Upvotes: 10 | cs.LG Authors: Lee Sharkey, Bilal Chughtai, Joshua Batson, Jack Lindsey, Jeff Wu, Lucius Bushnaq, Nicholas Goldowsky-Dill, Stefa...

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Episode 444 · 22:26

🤗 Upvotes: 5 | cs.LG, cs.AI, cs.CL Authors: J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain Title: Low-Rank Adapters Meet Neur...

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

Episode 443 · 20:02

🤗 Upvotes: 4 | cs.CL, cs.AI Authors: Sankalp KJ, Ashutosh Kumar, Laxmaan Balaji, Nikunj Kotecha, Vinija Jain, Aman Chadha, Sreyoshi Bhaduri ...

Histoires Morales: A French Dataset for Assessing Moral Alignment

Episode 442 · 20:44

🤗 Upvotes: 3 | cs.CL, cs.AI Authors: Thibaud Leteno, Irina Proskurina, Antoine Gourru, Julien Velcin, Charlotte Laclau, Guillaume Metzler, Chris...

Qwen2.5-1M Technical Report

Episode 441 · 24:17

🤗 Upvotes: 26 | cs.CL Authors: An Yang, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoyan Huang, Jiandong Jiang, Jianhong Tu, Jianwei Zhan...

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer

Episode 440 · 20:45

🤗 Upvotes: 13 | cs.CL Authors: Lin Yueyu, Li Zhiyuan, Peter Yue, Liu Xiao Title: ARWKV: Pretrain is not what we need, a...

Towards General-Purpose Model-Free Reinforcement Learning

Episode 439 · 20:53

🤗 Upvotes: 13 | cs.LG, cs.AI Authors: Scott Fujimoto, Pierluca D'Oro, Amy Zhang, Yuandong Tian, Michael Rabbat Title: T...

Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation

Episode 438 · 22:02

🤗 Upvotes: 11 | cs.SD, cs.CL, eess.AS Authors: Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, J...

iFormer: Integrating ConvNet and Transformer for Mobile Application

Episode 437 · 24:01

🤗 Upvotes: 9 | cs.CV, cs.AI Authors: Chuanyang Zheng Title: iFormer: Integrating ConvNet and Transformer for Mobile App...

Are Vision Language Models Texture or Shape Biased and Can We Steer Them?

Episode 436 · 24:59

🤗 Upvotes: 7 | cs.CV, cs.AI, cs.LG, q-bio.NC Authors: Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, Bianca Lamm, Muhammad Jehanze...

CodeMonkeys: Scaling Test-Time Compute for Software Engineering

Episode 435 · 23:04

🤗 Upvotes: 5 | cs.LG Authors: Ryan Ehrlich, Bradley Brown, Jordan Juravsky, Ronald Clark, Christopher Ré, Azalia Mirhoseini Title: ...

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Episode 434 · 21:21

🤗 Upvotes: 4 | cs.LG, cs.AI Authors: Samira Abnar, Harshay Shah, Dan Busbridge, Alaaeldin Mohamed Elnouby Ali, Josh Susskind, Vimal Thilak ...

Humanity's Last Exam

Episode 433 · 22:51

🤗 Upvotes: 33 | cs.LG, cs.AI, cs.CL Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Sean Shi, Michael Choi, ...

Chain-of-Retrieval Augmented Generation

Episode 432 · 23:23

🤗 Upvotes: 26 | cs.IR, cs.CL Authors: Liang Wang, Haonan Chen, Nan Yang, Xiaolong Huang, Zhicheng Dou, Furu Wei Title: ...

Redundancy Principles for MLLMs Benchmarks

Episode 431 · 22:20

🤗 Upvotes: 22 | cs.CL, cs.AI Authors: Zicheng Zhang, Xiangyu Zhao, Xinyu Fang, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Haodong Duan, Kai Chen, Gu...

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques

Episode 430 · 23:35

🤗 Upvotes: 13 | cs.CL, cs.AI, cs.LG Authors: Zhengyang Tang, Ziniu Li, Zhenyang Xiao, Tian Ding, Ruoyu Sun, Benyou Wang, Dayiheng Liu, Fei Huang...

RL + Transformer = A General-Purpose Problem Solver

Episode 429 · 24:24

🤗 Upvotes: 7 | cs.LG, cs.AI Authors: Micah Rentschler, Jesse Roberts Title: RL + Transformer = A General-Purpose Proble...

Relightable Full-Body Gaussian Codec Avatars

Episode 428 · 20:53

🤗 Upvotes: 5 | cs.CV, cs.GR Authors: Shaofei Wang, Tomas Simon, Igor Santesteban, Timur Bagautdinov, Junxuan Li, Vasu Agrawal, Fabian Prada, Sho...

Question Answering on Patient Medical Records with Private Fine-Tuned LLMs

Episode 427 · 21:57

🤗 Upvotes: 4 | cs.CL, cs.AI Authors: Sara Kothari, Ayush Gupta Title: Question Answering on Patient Medical Records wit...

GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing

Episode 426 · 23:27

🤗 Upvotes: 3 | cs.CV Authors: Akashah Shabbir, Mohammed Zumri, Mohammed Bennamoun, Fahad S. Khan, Salman Khan Title: Ge...

AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation

Episode 425 · 21:05

🤗 Upvotes: 2 | cs.CV Authors: Yuning Cui, Syed Waqas Zamir, Salman Khan, Alois Knoll, Mubarak Shah, Fahad Shahbaz Khan Title: ...

Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning

Episode 424 · 24:19

🤗 Upvotes: 2 | cs.CV Authors: Yang You, Yixin Li, Congyue Deng, Yue Wang, Leonidas Guibas Title: Multiview Equivariance...

SRMT: Shared Memory for Multi-agent Lifelong Pathfinding

Episode 423 · 23:50

🤗 Upvotes: 46 | cs.LG, cs.AI, cs.MA, I.2.11 Authors: Alsu Sagirova, Yuri Kuratov, Mikhail Burtsev Title: SRMT: Shared M...

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Episode 422 · 20:44

🤗 Upvotes: 33 | cs.CL Authors: Zhenghao Lin, Zihao Tang, Xiao Liu, Yeyun Gong, Yi Cheng, Qi Chen, Hang Li, Ying Xin, Ziyue Yang, Kailai Yang, Yu...

Improving Video Generation with Human Feedback

Episode 421 · 24:20

🤗 Upvotes: 30 | cs.CV, cs.AI, cs.GR, cs.LG Authors: Jie Liu, Gongye Liu, Jiajun Liang, Ziyang Yuan, Xiaokun Liu, Mingwu Zheng, Xiele Wu, Qiulin ...

Temporal Preference Optimization for Long-Form Video Understanding

Episode 420 · 24:47

🤗 Upvotes: 15 | cs.CV, cs.AI, cs.CL, cs.LG, cs.RO Authors: Rui Li, Xiaohan Wang, Yuhui Zhang, Zeyu Wang, Serena Yeung-Levy Title: ...

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Episode 419 · 21:04

🤗 Upvotes: 14 | cs.CV, cs.AI, cs.CL Authors: Ziyu Guo, Renrui Zhang, Chengzhuo Tong, Zhizheng Zhao, Peng Gao, Hongsheng Li, Pheng-Ann Heng ...