Episodes

Latest Episode
Continuous Diffusion Model for Language Modeling

Episode 582 · 18:43

🤗 Upvotes: 44 | cs.LG Authors: Jaehyeong Jo, Sung Ju Hwang Title: Continuous Diffusion Model for Language Modeling ...

Phantom: Subject-consistent video generation via cross-modal alignment

Episode 581 · 21:18

🤗 Upvotes: 42 | cs.CV, cs.AI Authors: Lijie Liu, Tianxiang Ma, Bingchuan Li, Zhuowei Chen, Jiawei Liu, Qian He, Xinglong Wu Title: ...

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Episode 580 · 22:39

🤗 Upvotes: 33 | cs.AI, cs.CL Authors: Feng Luo, Rui Yang, Hao Sun, Chunyuan Deng, Jiarui Yao, Jingyan Shen, Huan Zhang, Hanjie Chen ...

Magma: A Foundation Model for Multimodal AI Agents

Episode 579 · 23:02

🤗 Upvotes: 30 | cs.CV, cs.AI, cs.HC, cs.LG, cs.RO Authors: Jianwei Yang, Reuben Tan, Qianhui Wu, Ruijie Zheng, Baolin Peng, Yongyuan Liang, Yu G...

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

Episode 578 · 21:52

🤗 Upvotes: 29 | cs.CV Authors: Bencheng Liao, Hongyuan Tao, Qian Zhang, Tianheng Cheng, Yingyue Li, Haoran Yin, Wenyu Liu, Xinggang Wang ...

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Episode 577 · 21:29

🤗 Upvotes: 27 | cs.RO, cs.AI, cs.CV Authors: Zekun Qi, Wenyao Zhang, Yufei Ding, Runpei Dong, Xinqiang Yu, Jingwen Li, Lingyun Xu, Baoyu Li, Xia...

SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models

Episode 576 · 20:32

🤗 Upvotes: 26 | cs.CL Authors: Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang ...

You Do Not Fully Utilize Transformer's Representation Capacity

Episode 575 · 20:56

🤗 Upvotes: 25 | cs.LG, cs.CL Authors: Gleb Gerasimov, Yaroslav Aksenov, Nikita Balagansky, Viacheslav Sinii, Daniil Gavrilov Title:...

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Episode 574 · 23:03

🤗 Upvotes: 68 | cs.CL, cs.AI, cs.LG Authors: Jingyang Yuan, Huazuo Gao, Damai Dai, Junyu Luo, Liang Zhao, Zhengyan Zhang, Zhenda Xie, Y. X. Wei,...

Learning Getting-Up Policies for Real-World Humanoid Robots

Episode 573 · 24:40

🤗 Upvotes: 32 | cs.RO, cs.LG Authors: Xialin He, Runpei Dong, Zixuan Chen, Saurabh Gupta Title: Learning Getting-Up Pol...

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

Episode 572 · 21:53

🤗 Upvotes: 27 | cs.LG, cs.SE Authors: Samuel Miserendino, Michele Wang, Tejal Patwardhan, Johannes Heidecke Title: SWE-...

CRANE: Reasoning with constrained LLM generation

Episode 571 · 21:23

🤗 Upvotes: 17 | cs.PL, cs.LG Authors: Debangshu Banerjee, Tarun Suresh, Shubham Ugare, Sasa Misailovic, Gagandeep Singh Title: ...

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Episode 570 · 24:39

🤗 Upvotes: 16 | cs.LG, cs.AI, cs.CL, cs.CV, cs.HC Authors: Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, H...

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Episode 569 · 20:01

🤗 Upvotes: 15 | cs.CV Authors: Ling Yang, Xinchen Zhang, Ye Tian, Chenming Shang, Minghao Xu, Wentao Zhang, Bin Cui Title: ...

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Episode 568 · 22:23

🤗 Upvotes: 14 | cs.LG, cs.AI Authors: Zhenxing Mi, Kuan-Chieh Wang, Guocheng Qian, Hanrong Ye, Runtao Liu, Sergey Tulyakov, Kfir Aberman, Dan Xu...

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

Episode 567 · 23:50

🤗 Upvotes: 11 | cs.LG, cs.CL Authors: Bohan Lyu, Siqiao Huang, Zichen Liang Title: SURGE: On the Potential of Large Lan...

Region-Adaptive Sampling for Diffusion Transformers

Episode 566 · 22:49

🤗 Upvotes: 46 | cs.CV, cs.AI Authors: Ziming Liu, Yifan Yang, Chengruidong Zhang, Yiqi Zhang, Lili Qiu, Yang You, Yuqing Yang Title...

Large Language Diffusion Models

Episode 565 · 18:51

🤗 Upvotes: 44 | cs.CL, cs.LG Authors: Shen Nie, Fengqi Zhu, Zebin You, Xiaolu Zhang, Jingyang Ou, Jun Hu, Jun Zhou, Yankai Lin, Ji-Rong Wen, Cho...

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Episode 564 · 27:29

🤗 Upvotes: 41 | cs.AI Authors: Alejandro Cuadron, Dacheng Li, Wenjie Ma, Xingyao Wang, Yichuan Wang, Siyuan Zhuang, Shu Liu, Luis Gaspar Schroed...

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Episode 563 · 23:13

🤗 Upvotes: 38 | cs.CV, cs.CL Authors: Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiao...

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Episode 562 · 22:33

🤗 Upvotes: 27 | cs.CV Authors: Jonathan Roberts, Mohammad Reza Taesiri, Ansh Sharma, Akash Gupta, Samuel Roberts, Ioana Croitoru, Simion-Vlad Bo...

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Episode 561 · 23:35

🤗 Upvotes: 22 | cs.CL, cs.CV Authors: Yi-Fan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang...

ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation

Episode 560 · 23:22

🤗 Upvotes: 12 | cs.CV, cs.GR Authors: Rotem Shalev-Arkushin, Rinon Gal, Amit H. Bermano, Ohad Fried Title: ImageRAG: Dy...

Diverse Inference and Verification for Advanced Reasoning

Episode 559 · 22:54

🤗 Upvotes: 11 | cs.AI Authors: Iddo Drori, Gaston Longhitano, Mao Mao, Seunghwan Hyun, Yuke Zhang, Sungjun Park, Zachary Meeks, Xin-Yu Zhang, Be...

Precise Parameter Localization for Textual Generation in Diffusion Models

Episode 558 · 21:50

🤗 Upvotes: 10 | cs.CV Authors: Łukasz Staniszewski, Bartosz Cywiński, Franziska Boenisch, Kamil Deja, Adam Dziedzic Title: ...

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Episode 557 · 17:24

🤗 Upvotes: 9 | cs.LG, cs.CL Authors: Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh Title: ...

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Episode 556 · 21:13

🤗 Upvotes: 62 | cs.CL, cs.LG Authors: Heejun Lee, Geon Park, Jaduk Suh, Sung Ju Hwang Title: InfiniteHiP: Extending Lan...

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Episode 555 · 21:39

🤗 Upvotes: 35 | cs.CL, cs.AI, cs.CV, cs.LG Authors: Mo Yu, Lemao Liu, Junjie Wu, Tsz Ting Chung, Shunchi Zhang, Jiangnan Li, Dit-Yan Yeung, Jie ...

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Episode 554 · 19:30

🤗 Upvotes: 28 | cs.LG, cs.AI, cs.CV Authors: Hoigi Seo, Wongi Jeong, Jae-sun Seo, Se Young Chun Title: Skrr: Skip and R...

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Episode 553 · 22:10

🤗 Upvotes: 22 | cs.CL, cs.AI, cs.LG Authors: Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, J...