Episodes

Latest Episode
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution

Episode 38 · 19:06

🤗 Paper Upvotes: 10 | cs.RO, cs.AI, cs.LG Authors: Yang Yue, Yulin Wang, Bingyi Kang, Yizeng Han, Shenzhi Wang, Shiji Song, Jiashi Feng, Gao Hua...

Controlling Language and Diffusion Models by Transporting Activations

Episode 37 · 22:46

🤗 Paper Upvotes: 8 | cs.LG, cs.AI, cs.CL, cs.CV, 68T07, 49Q22, I.2.6; I.2.7; I.4.8 Authors: Pau Rodriguez, Arno Blaas, Michal Klein, Luca Zappel...

Sample-Efficient Alignment for LLMs

Episode 36 · 21:03

🤗 Paper Upvotes: 8 | cs.LG, cs.AI, cs.CL Authors: Zichen Liu, Changyu Chen, Chao Du, Wee Sun Lee, Min Lin Title: Sample...

DreamPolish: Domain Score Distillation With Progressive Geometry Generation

Episode 35 · 18:07

🤗 Paper Upvotes: 6 | cs.CV, cs.AI Authors: Yean Cheng, Ziqi Cai, Ming Ding, Wendi Zheng, Shiyu Huang, Yuxiao Dong, Jie Tang, Boxin Shi ...

Adaptive Length Image Tokenization via Recurrent Allocation

Episode 34 · 21:08

🤗 Paper Upvotes: 4 | cs.CV, cs.AI, cs.LG, cs.RO Authors: Shivam Duggal, Phillip Isola, Antonio Torralba, William T. Freeman Title: ...

GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details

Episode 33 · 19:08

🤗 Paper Upvotes: 3 | cs.CV, cs.GR Authors: Zhongjin Luo, Haolin Liu, Chenghong Li, Wanghao Du, Zirong Jin, Wanhu Sun, Yinyu Nie, Weikai Chen, Xi...

Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge

Episode 32 · 25:57

🤗 Paper Upvotes: 3 | cs.CL Authors: Karthik Soman, Andrew Langdon, Catalina Villouta, Chinmay Agrawal, Lashaw Salta, Braian Peetoom, Gianmarco B...

Inference Optimal VLMs Need Only One Visual Token but Larger Models

Episode 31 · 22:08

🤗 Paper Upvotes: 2 | cs.CV, cs.AI, cs.LG Authors: Kevin Y. Li, Sachin Goyal, Joao D. Semedo, J. Zico Kolter Title: Infe...

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

Episode 30 · 22:48

🤗 Paper Upvotes: 40 | cs.AI Authors: Yifan Xu, Xiao Liu, Xueqiao Sun, Siyi Cheng, Hao Yu, Hanyu Lai, Shudan Zhang, Dan Zhang, Jie Tang, Yuxiao D...

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Episode 29 · 24:54

🤗 Paper Upvotes: 28 | cs.LG, cs.AI Authors: Eldar Kurtic, Alexandre Marques, Shubhra Pandit, Mark Kurtz, Dan Alistarh Title: ...

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Episode 28 · 22:09

🤗 Paper Upvotes: 25 | cs.CL Authors: Zehan Qi, Xiao Liu, Iat Long Iong, Hanyu Lai, Xueqiao Sun, Xinyue Yang, Jiadai Sun, Yu Yang, Shuntian Yao, ...

MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

Episode 27 · 21:18

🤗 Paper Upvotes: 20 | cs.CV Authors: Wei Cheng, Juncheng Mu, Xianfang Zeng, Xin Chen, Anqi Pang, Chi Zhang, Zhibin Wang, Bin Fu, Gang Yu, Ziwei ...

Training-free Regional Prompting for Diffusion Transformers

Episode 26 · 17:11

🤗 Paper Upvotes: 19 | cs.CV Authors: Anthony Chen, Jianjin Xu, Wenzhao Zheng, Gaole Dai, Yida Wang, Renrui Zhang, Haofan Wang, Shanghang Zhang ...

How Far is Video Generation from World Model: A Physical Law Perspective

Episode 25 · 23:13

🤗 Paper Upvotes: 19 | cs.CV, cs.AI Authors: Bingyi Kang, Yang Yue, Rui Lu, Zhijie Lin, Yang Zhao, Kaixin Wang, Gao Huang, Jiashi Feng ...

Survey of Cultural Awareness in Language Models: Text and Beyond

Episode 24 · 23:43

🤗 Paper Upvotes: 19 | cs.CL, cs.CV Authors: Siddhesh Pawar, Junyeong Park, Jiho Jin, Arnav Arora, Junho Myung, Srishti Yadav, Faiz Ghifari Hazni...

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Episode 23 · 18:12

🤗 Paper Upvotes: 16 | cs.CL, cs.AI Authors: Xingwu Sun, Yanfeng Chen, Yiqing Huang, Ruobing Xie, Jiaqi Zhu, Kai Zhang, Shuaipeng Li, Zhen Yang, ...

GenXD: Generating Any 3D and 4D Scenes

Episode 22 · 22:09

🤗 Paper Upvotes: 13 | cs.CV, cs.AI Authors: Yuyang Zhao, Chung-Ching Lin, Kevin Lin, Zhiwen Yan, Linjie Li, Zhengyuan Yang, Jianfeng Wang, Gim H...

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Episode 21 · 19:22

🤗 Paper Upvotes: 13 | cs.CV, cs.AI, cs.CL Authors: Chengke Zou, Xingang Guo, Rui Yang, Junyu Zhang, Bin Hu, Huan Zhang Title: ...

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents

Episode 20 · 20:10

🤗 Paper Upvotes: 32 | cs.CL, cs.CV, cs.HC Authors: Zhiyong Wu, Zhenyu Wu, Fangzhi Xu, Yian Wang, Qiushi Sun, Chengyou Jia, Kanzhi Cheng, Zichen ...

Personalization of Large Language Models: A Survey

Episode 19 · 25:40

🤗 Paper Upvotes: 14 | cs.CL Authors: Zhehao Zhang, Ryan A. Rossi, Branislav Kveton, Yijia Shao, Diyi Yang, Hamed Zamani, Franck Dernoncourt, Joe...

Constant Acceleration Flow

Episode 18 · 21:13

🤗 Paper Upvotes: 14 | cs.LG, cs.AI, cs.CV Authors: Dogyun Park, Sojin Lee, Sihyeon Kim, Taehoon Lee, Youngjoon Hong, Hyunwoo J. Kim ...

TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models

Episode 17 · 24:08

🤗 Paper Upvotes: 13 | cs.CV, cs.AI, cs.CL Authors: Ziyao Shangguan, Chuhan Li, Yuxuan Ding, Yanan Zheng, Yilun Zhao, Tesca Fitzgerald, Arman Coh...

Randomized Autoregressive Visual Generation

Episode 16 · 20:14

🤗 Paper Upvotes: 10 | cs.CV Authors: Qihang Yu, Ju He, Xueqing Deng, Xiaohui Shen, Liang-Chieh Chen Title: Randomized A...

Survey of User Interface Design and Interaction Techniques in Generative AI Applications

Episode 15 · 23:41

🤗 Paper Upvotes: 8 | cs.HC, cs.AI, cs.CL, cs.LG Authors: Reuben Luera, Ryan A. Rossi, Alexa Siu, Franck Dernoncourt, Tong Yu, Sungchul Kim, Ruiy...

Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation

Episode 14 · 20:54

🤗 Paper Upvotes: 8 | cs.LG, cs.AI, cs.CL, I.2.6; I.2.7 Authors: Bohan Lyu, Yadi Cao, Duncan Watson-Parris, Leon Bergen, Taylor Berg-Kirkpatrick,...

In-Context LoRA for Diffusion Transformers

Episode 13 · 20:21

🤗 Paper Upvotes: 7 | cs.CV, cs.GR Authors: Lianghua Huang, Wei Wang, Zhi-Fan Wu, Yupeng Shi, Huanzhang Dou, Chen Liang, Yutong Feng, Yu Liu, Jin...

Physics in Next-token Prediction

Episode 12 · 18:44

🤗 Paper Upvotes: 7 | cs.LG, cs.AI Authors: Hongjun An, Yiliang Song, Xuelong Li Title: Physics in Next-token Prediction...

CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes

Episode 11 · 20:24

🤗 Paper Upvotes: 5 | cs.CV Authors: Yang Liu, Chuanchen Luo, Zhongkai Mao, Junran Peng, Zhaoxiang Zhang Title: CityGaus...

Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders

Episode 10 · 23:07

🤗 Daily Paper Upvotes: 57 Authors: Viacheslav Surkov, Chris Wendler, Mikhail Terekhov, Justin Deschenaux, Robert West, Caglar Gulcehre Categories: cs.LG, cs.AI, cs.CV Arxiv: http://arxi...

What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective

Episode 9 · 20:47

🤗 Daily Paper Upvotes: 45 Authors: Ming Li, Yanhong Li, Tianyi Zhou Categories: cs.CL, cs.AI, cs.LG Arxiv: http://arxiv.org/abs/2410.23743v1 Title: What Happened in LLMs Layers w...