Episodes

Latest Episode
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming

Episode 469 · 20:55

🤗 Upvotes: 6 | cs.CL, cs.AI, cs.CR, cs.LG Authors: Mrinank Sharma, Meg Tong, Jesse Mu, Jerry Wei, Jorrit Kruthoff, Scott Goodfriend, Euan Ong, A...

Scalable-Softmax Is Superior for Attention

Episode 468 · 23:31

🤗 Upvotes: 6 | cs.CL, cs.AI, cs.LG Authors: Ken M. Nakanishi Title: Scalable-Softmax Is Superior for Attention ...

The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

Episode 467 · 21:51

🤗 Upvotes: 3 | cs.LG, math.OC, stat.ML Authors: Fabian Schaipp, Alexander Hägele, Adrien Taylor, Umut Simsekli, Francis Bach Title:...

SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Episode 466 · 20:09

🤗 Upvotes: 3 | cs.LG, cs.AI Authors: Bartosz Cywiński, Kamil Deja Title: SAeUron: Interpretable Concept Unlearning in D...

GuardReasoner: Towards Reasoning-based LLM Safeguards

Episode 465 · 21:00

🤗 Upvotes: 46 | cs.CR, cs.AI, cs.LG Authors: Yue Liu, Hongcheng Gao, Shengfang Zhai, Jun Xia, Tianyi Wu, Zhiwei Xue, Yulin Chen, Kenji Kawaguchi...

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Episode 464 · 23:01

🤗 Upvotes: 22 | cs.CL Authors: Yue Wang, Qiuzhi Liu, Jiahao Xu, Tian Liang, Xingyu Chen, Zhiwei He, Linfeng Song, Dian Yu, Juntao Li, Zhuosheng ...

Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch

Episode 463 · 23:36

🤗 Upvotes: 15 | cs.CL Authors: Arthur Douillard, Yanislav Donchev, Keith Rush, Satyen Kale, Zachary Charles, Zachary Garrett, Gabriel Teston, Da...

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

Episode 462 · 19:26

🤗 Upvotes: 15 | cs.AI, cs.CL, cs.CV, cs.LG Authors: Yuxin Zuo, Shang Qu, Yifei Li, Zhangren Chen, Xuekai Zhu, Ermo Hua, Kaiyan Zhang, Ning Ding,...

Large Language Models Think Too Fast To Explore Effectively

Episode 461 · 25:52

🤗 Upvotes: 10 | cs.AI, q-bio.NC Authors: Lan Pan, Hanbo Xie, Robert C. Wilson Title: Large Language Models Think Too Fa...

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

Episode 460 · 20:15

🤗 Upvotes: 10 | cs.LG, cs.CL Authors: Benjamin Feuer, Chinmay Hegde Title: WILDCHAT-50M: A Deep Dive Into the Role of S...

PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

Episode 459 · 24:32

🤗 Upvotes: 10 | cs.CV, cs.AI, cs.CL, cs.LG, cs.RO Authors: Wei Chow, Jiageng Mao, Boyi Li, Daniel Seita, Vitor Guizilini, Yue Wang ...

o3-mini vs DeepSeek-R1: Which One is Safer?

Episode 458 · 20:01

🤗 Upvotes: 6 | cs.SE, cs.AI Authors: Aitor Arrieta, Miriam Ugarte, Pablo Valle, José Antonio Parejo, Sergio Segura Title: ...

CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation

Episode 457 · 21:14

🤗 Upvotes: 1 | cs.AI, cs.CL, cs.HC Authors: Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig ...

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

Episode 456 · 22:30

🤗 Upvotes: 28 | cs.CL Authors: Yubo Wang, Xiang Yue, Wenhu Chen Title: Critique Fine-Tuning: Learning to Critique is Mo...

Atla Selene Mini: A General Purpose Evaluation Model

Episode 455 · 25:28

🤗 Upvotes: 24 | cs.CL, cs.AI Authors: Andrei Alexandru, Antonia Calvi, Henry Broomfield, Jackson Golden, Kyle Dai, Mathias Leys, Maurice Burger,...

Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts

Episode 454 · 28:39

🤗 Upvotes: 14 | cs.AI, cs.CY, cs.LG Authors: Clément Desroches, Martin Chauvin, Louis Ladan, Caroline Vateau, Simon Gosset, Philippe Cordier ...

Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation

Episode 453 · 22:03

🤗 Upvotes: 8 | cs.SE, cs.AI Authors: Aitor Arrieta, Miriam Ugarte, Pablo Valle, José Antonio Parejo, Sergio Segura Title: ...

Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks

Episode 452 · 22:14

🤗 Upvotes: 8 | cs.CV Authors: Hailong Guo, Bohan Zeng, Yiren Song, Wentao Zhang, Chuang Zhang, Jiaming Liu Title: Any2A...

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

Episode 451 · 21:44

🤗 Upvotes: 6 | cs.CR, cs.AI, cs.CL, cs.LG Authors: Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu Title: ...

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Episode 450 · 19:42

🤗 Upvotes: 6 | cs.CL, cs.AI Authors: Jenna Russell, Marzena Karpinska, Mohit Iyyer Title: People who frequently use Cha...

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Episode 449 · 23:17

🤗 Upvotes: 29 | cs.AI, cs.CV, cs.LG Authors: Tianzhe Chu, Yuexiang Zhai, Jihan Yang, Shengbang Tong, Saining Xie, Dale Schuurmans, Quoc V. Le, S...

Optimizing Large Language Model Training Using FP4 Quantization

Episode 448 · 22:09

🤗 Upvotes: 15 | cs.LG, cs.CL Authors: Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, Peng Cheng ...

DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

Episode 447 · 23:00

🤗 Upvotes: 11 | cs.CV Authors: Chenguo Lin, Panwang Pan, Bangbang Yang, Zeming Li, Yadong Mu Title: DiffSplat: Repurpos...

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Episode 446 · 23:23

🤗 Upvotes: 10 | cs.CL, cs.LG Authors: Hongzhi Huang, Defa Zhu, Banggu Wu, Yutao Zeng, Ya Wang, Qiyang Min, Xun Zhou Title: ...

Open Problems in Mechanistic Interpretability

Episode 445 · 25:48

🤗 Upvotes: 10 | cs.LG Authors: Lee Sharkey, Bilal Chughtai, Joshua Batson, Jack Lindsey, Jeff Wu, Lucius Bushnaq, Nicholas Goldowsky-Dill, Stefa...

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Episode 444 · 22:26

🤗 Upvotes: 5 | cs.LG, cs.AI, cs.CL Authors: J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain Title: Low-Rank Adapters Meet Neur...

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

Episode 443 · 20:02

🤗 Upvotes: 4 | cs.CL, cs.AI Authors: Sankalp KJ, Ashutosh Kumar, Laxmaan Balaji, Nikunj Kotecha, Vinija Jain, Aman Chadha, Sreyoshi Bhaduri ...

Histoires Morales: A French Dataset for Assessing Moral Alignment

Episode 442 · 20:44

🤗 Upvotes: 3 | cs.CL, cs.AI Authors: Thibaud Leteno, Irina Proskurina, Antoine Gourru, Julien Velcin, Charlotte Laclau, Guillaume Metzler, Chris...

Qwen2.5-1M Technical Report

Episode 441 · 24:17

🤗 Upvotes: 26 | cs.CL Authors: An Yang, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoyan Huang, Jiandong Jiang, Jianhong Tu, Jianwei Zhan...

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer

Episode 440 · 20:45

🤗 Upvotes: 13 | cs.CL Authors: Lin Yueyu, Li Zhiyuan, Peter Yue, Liu Xiao Title: ARWKV: Pretrain is not what we need, a...