Daily Paper Cast | All Episodes

Continuous Diffusion Model for Language Modeling

Episode 582 · February 19, 2025 · 18:43

🤗 Upvotes: 44 | cs.LG | Authors: Jaehyeong Jo, Sung Ju Hwang | Title: Continuous Diffusion Model for Language Modeling ...

Phantom: Subject-consistent video generation via cross-modal alignment

Episode 581 · February 19, 2025 · 21:18

🤗 Upvotes: 42 | cs.CV, cs.AI | Authors: Lijie Liu, Tianxiang Ma, Bingchuan Li, Zhuowei Chen, Jiawei Liu, Qian He, Xinglong Wu | Title: ...

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Episode 580 · February 19, 2025 · 22:39

🤗 Upvotes: 33 | cs.AI, cs.CL | Authors: Feng Luo, Rui Yang, Hao Sun, Chunyuan Deng, Jiarui Yao, Jingyan Shen, Huan Zhang, Hanjie Chen ...

Magma: A Foundation Model for Multimodal AI Agents

Episode 579 · February 19, 2025 · 23:02

🤗 Upvotes: 30 | cs.CV, cs.AI, cs.HC, cs.LG, cs.RO | Authors: Jianwei Yang, Reuben Tan, Qianhui Wu, Ruijie Zheng, Baolin Peng, Yongyuan Liang, Yu G...

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

Episode 578 · February 19, 2025 · 21:52

🤗 Upvotes: 29 | cs.CV | Authors: Bencheng Liao, Hongyuan Tao, Qian Zhang, Tianheng Cheng, Yingyue Li, Haoran Yin, Wenyu Liu, Xinggang Wang ...

SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

Episode 577 · February 19, 2025 · 21:29

🤗 Upvotes: 27 | cs.RO, cs.AI, cs.CV | Authors: Zekun Qi, Wenyao Zhang, Yufei Ding, Runpei Dong, Xinqiang Yu, Jingwen Li, Lingyun Xu, Baoyu Li, Xia...

SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models

Episode 576 · February 19, 2025 · 20:32

🤗 Upvotes: 26 | cs.CL | Authors: Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang ...

You Do Not Fully Utilize Transformer's Representation Capacity

Episode 575 · February 19, 2025 · 20:56

🤗 Upvotes: 25 | cs.LG, cs.CL | Authors: Gleb Gerasimov, Yaroslav Aksenov, Nikita Balagansky, Viacheslav Sinii, Daniil Gavrilov | Title:...

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Episode 574 · February 18, 2025 · 23:03

🤗 Upvotes: 68 | cs.CL, cs.AI, cs.LG | Authors: Jingyang Yuan, Huazuo Gao, Damai Dai, Junyu Luo, Liang Zhao, Zhengyan Zhang, Zhenda Xie, Y. X. Wei,...

Learning Getting-Up Policies for Real-World Humanoid Robots

Episode 573 · February 18, 2025 · 24:40

🤗 Upvotes: 32 | cs.RO, cs.LG | Authors: Xialin He, Runpei Dong, Zixuan Chen, Saurabh Gupta | Title: Learning Getting-Up Pol...

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?

Episode 572 · February 18, 2025 · 21:53

🤗 Upvotes: 27 | cs.LG, cs.SE | Authors: Samuel Miserendino, Michele Wang, Tejal Patwardhan, Johannes Heidecke | Title: SWE-...

CRANE: Reasoning with constrained LLM generation

Episode 571 · February 18, 2025 · 21:23

🤗 Upvotes: 17 | cs.PL, cs.LG | Authors: Debangshu Banerjee, Tarun Suresh, Shubham Ugare, Sasa Misailovic, Gagandeep Singh | Title: ...

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Episode 570 · February 18, 2025 · 24:39

🤗 Upvotes: 16 | cs.LG, cs.AI, cs.CL, cs.CV, cs.HC | Authors: Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, H...

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Episode 569 · February 18, 2025 · 20:01

🤗 Upvotes: 15 | cs.CV | Authors: Ling Yang, Xinchen Zhang, Ye Tian, Chenming Shang, Minghao Xu, Wentao Zhang, Bin Cui | Title: ...

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Episode 568 · February 18, 2025 · 22:23

🤗 Upvotes: 14 | cs.LG, cs.AI | Authors: Zhenxing Mi, Kuan-Chieh Wang, Guocheng Qian, Hanrong Ye, Runtao Liu, Sergey Tulyakov, Kfir Aberman, Dan Xu...

SURGE: On the Potential of Large Language Models as General-Purpose Surrogate Code Executors

Episode 567 · February 18, 2025 · 23:50

🤗 Upvotes: 11 | cs.LG, cs.CL | Authors: Bohan Lyu, Siqiao Huang, Zichen Liang | Title: SURGE: On the Potential of Large Lan...

Region-Adaptive Sampling for Diffusion Transformers

Episode 566 · February 17, 2025 · 22:49

🤗 Upvotes: 46 | cs.CV, cs.AI | Authors: Ziming Liu, Yifan Yang, Chengruidong Zhang, Yiqi Zhang, Lili Qiu, Yang You, Yuqing Yang | Title...

Large Language Diffusion Models

Episode 565 · February 17, 2025 · 18:51

🤗 Upvotes: 44 | cs.CL, cs.LG | Authors: Shen Nie, Fengqi Zhu, Zebin You, Xiaolu Zhang, Jingyang Ou, Jun Hu, Jun Zhou, Yankai Lin, Ji-Rong Wen, Cho...

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Episode 564 · February 17, 2025 · 27:29

🤗 Upvotes: 41 | cs.AI | Authors: Alejandro Cuadron, Dacheng Li, Wenjie Ma, Xingyao Wang, Yichuan Wang, Siyuan Zhuang, Shu Liu, Luis Gaspar Schroed...

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Episode 563 · February 17, 2025 · 23:13

🤗 Upvotes: 38 | cs.CV, cs.CL | Authors: Guoqing Ma, Haoyang Huang, Kun Yan, Liangyu Chen, Nan Duan, Shengming Yin, Changyi Wan, Ranchen Ming, Xiao...

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Episode 562 · February 17, 2025 · 22:33

🤗 Upvotes: 27 | cs.CV | Authors: Jonathan Roberts, Mohammad Reza Taesiri, Ansh Sharma, Akash Gupta, Samuel Roberts, Ioana Croitoru, Simion-Vlad Bo...

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Episode 561 · February 17, 2025 · 23:35

🤗 Upvotes: 22 | cs.CL, cs.CV | Authors: Yi-Fan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang...

ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation

Episode 560 · February 17, 2025 · 23:22

🤗 Upvotes: 12 | cs.CV, cs.GR | Authors: Rotem Shalev-Arkushin, Rinon Gal, Amit H. Bermano, Ohad Fried | Title: ImageRAG: Dy...

Diverse Inference and Verification for Advanced Reasoning

Episode 559 · February 17, 2025 · 22:54

🤗 Upvotes: 11 | cs.AI | Authors: Iddo Drori, Gaston Longhitano, Mao Mao, Seunghwan Hyun, Yuke Zhang, Sungjun Park, Zachary Meeks, Xin-Yu Zhang, Be...

Precise Parameter Localization for Textual Generation in Diffusion Models

Episode 558 · February 17, 2025 · 21:50

🤗 Upvotes: 10 | cs.CV | Authors: Łukasz Staniszewski, Bartosz Cywiński, Franziska Boenisch, Kamil Deja, Adam Dziedzic | Title: ...

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Episode 557 · February 17, 2025 · 17:24

🤗 Upvotes: 9 | cs.LG, cs.CL | Authors: Shengkun Tang, Oliver Sieberling, Eldar Kurtic, Zhiqiang Shen, Dan Alistarh | Title: ...

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Episode 556 · February 14, 2025 · 21:13

🤗 Upvotes: 62 | cs.CL, cs.LG | Authors: Heejun Lee, Geon Park, Jaduk Suh, Sung Ju Hwang | Title: InfiniteHiP: Extending Lan...

The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Episode 555 · February 14, 2025 · 21:39

🤗 Upvotes: 35 | cs.CL, cs.AI, cs.CV, cs.LG | Authors: Mo Yu, Lemao Liu, Junjie Wu, Tsz Ting Chung, Shunchi Zhang, Jiangnan Li, Dit-Yan Yeung, Jie ...

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Episode 554 · February 14, 2025 · 19:30

🤗 Upvotes: 28 | cs.LG, cs.AI, cs.CV | Authors: Hoigi Seo, Wongi Jeong, Jae-sun Seo, Se Young Chun | Title: Skrr: Skip and R...

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Episode 553 · February 14, 2025 · 22:10

🤗 Upvotes: 22 | cs.CL, cs.AI, cs.LG | Authors: Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, J...