Episodes

Latest Episode
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Episode 328 · 23:32

🤗 Upvotes: 30 | cs.CL Authors: Shanghaoran Quan, Jiaxi Yang, Bowen Yu, Bo Zheng, Dayiheng Liu, An Yang, Xuancheng Ren, Bofei Gao, Yibo Miao, Yun...

VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control

Episode 327 · 19:15

🤗 Upvotes: 30 | cs.CV Authors: Yuanpeng Tu, Hao Luo, Xi Chen, Sihui Ji, Xiang Bai, Hengshuang Zhao Title: VideoAnydoor:...

Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Episode 326 · 24:49

🤗 Upvotes: 25 | cs.CV, cs.LG Authors: Jingfeng Yao, Xinggang Wang Title: Reconstruction vs. Generation: Taming Optimiza...

ProgCo: Program Helps Self-Correction of Large Language Models

Episode 325 · 20:19

🤗 Upvotes: 17 | cs.CL, cs.AI, cs.LG Authors: Xiaoshuai Song, Yanan Wu, Weixun Wang, Jiaheng Liu, Wenbo Su, Bo Zheng Title: ...

MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

Episode 324 · 25:32

🤗 Upvotes: 16 | cs.CL Authors: Mahir Labib Dihan, Md Tanvir Hassan, Md Tanvir Parvez, Md Hasebul Hasan, Md Almash Alam, Muhammad Aamir Cheema, M...

A3: Android Agent Arena for Mobile GUI Agents

Episode 323 · 23:35

🤗 Upvotes: 15 | cs.AI Authors: Yuxiang Chai, Hanhao Li, Jiayu Zhang, Liang Liu, Guozhi Wang, Shuai Ren, Siyuan Huang, Hongsheng Li ...

MLLM-as-a-Judge for Image Safety without Human Labeling

Episode 322 · 22:20

🤗 Upvotes: 14 | cs.CV, cs.CL, cs.CY, cs.LG Authors: Zhenting Wang, Shuming Hu, Shiyu Zhao, Xiaowen Lin, Felix Juefei-Xu, Zhuowei Li, Ligong Han,...

Dynamic Scaling of Unit Tests for Code Reward Modeling

Episode 321 · 21:52

🤗 Upvotes: 13 | cs.CL, cs.SE Authors: Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang Title: Dynami...

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Episode 320 · 22:38

🤗 Upvotes: 52 | cs.AI, cs.CL, cs.CV, cs.HC Authors: Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chen...

Xmodel-2 Technical Report

Episode 319 · 17:16

🤗 Upvotes: 13 | cs.AI Authors: Wang Qun, Liu Yang, Lin Qingquan, Qu Zhijiu, Jiang Ling Title: Xmodel-2 Technical Report...

Are Vision-Language Models Truly Understanding Multi-vision Sensor?

Episode 318 · 24:50

🤗 Upvotes: 9 | cs.CV Authors: Sangyun Chung, Youngjoon Yu, Youngchae Chee, Se Yeon Kim, Byung-Kwan Lee, Yong Man Ro Title: ...

HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving

Episode 317 · 20:48

🤗 Upvotes: 4 | cs.AI, cs.CL Authors: Yang Li, Dong Du, Linfeng Song, Chen Li, Weikang Wang, Tao Yang, Haitao Mi Title: ...

VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control

Episode 316 · 22:06

🤗 Upvotes: 2 | cs.CV Authors: Shaojin Wu, Fei Ding, Mengqi Huang, Wei Liu, Qian He Title: VMix: Improving Text-to-Image...

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Episode 315 · 20:07

🤗 Upvotes: 13 | cs.CL Authors: Xingyu Chen, Jiahao Xu, Tian Liang, Zhiwei He, Jianhui Pang, Dian Yu, Linfeng Song, Qiuzhi Liu, Mengfei Zhou, Zhu...

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Episode 314 · 18:53

🤗 Upvotes: 11 | cs.CL, cs.AI, cs.DB, cs.IR, cs.LG Authors: Yujie Luo, Xiangyuan Ru, Kangwei Liu, Lin Yuan, Mengshu Sun, Ningyu Zhang, Lei Liang,...

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Episode 313 · 25:04

🤗 Upvotes: 39 | cs.CV Authors: Yang Shen, Xiu-Shen Wei, Yifan Sun, Yuxin Song, Tao Yuan, Jian Jin, Heyang Xu, Yazhou Yao, Errui Ding ...

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Episode 312 · 22:45

🤗 Upvotes: 29 | cs.CV, cs.AI, cs.CL, cs.LG Authors: Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize C...

Bringing Objects to Life: 4D generation from 3D objects

Episode 311 · 21:48

🤗 Upvotes: 24 | cs.CV Authors: Ohad Rahamim, Ori Malca, Dvir Samuel, Gal Chechik Title: Bringing Objects to Life: 4D ge...

Efficiently Serving LLM Reasoning Programs with Certaindex

Episode 310 · 20:19

🤗 Upvotes: 20 | cs.LG, cs.CL Authors: Yichao Fu, Junda Chen, Siqi Zhu, Zheyu Fu, Zhongdongming Dai, Aurick Qiao, Hao Zhang Title: ...

TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization

Episode 309 · 21:15

🤗 Upvotes: 14 | cs.SD, cs.AI, cs.CL, eess.AS Authors: Chia-Yu Hung, Navonil Majumder, Zhifeng Kong, Ambuj Mehrish, Rafael Valle, Bryan Catanzaro...

Edicho: Consistent Image Editing in the Wild

Episode 308 · 22:47

🤗 Upvotes: 13 | cs.CV Authors: Qingyan Bai, Hao Ouyang, Yinghao Xu, Qiuyu Wang, Ceyuan Yang, Ka Leong Cheng, Yujun Shen, Qifeng Chen ...

Facilitating large language model Russian adaptation with Learned Embedding Propagation

Episode 307 · 22:12

🤗 Upvotes: 6 | cs.CL, cs.AI Authors: Mikhail Tikhomirov, Daniil Chernyshev Title: Facilitating large language model Rus...

Training Software Engineering Agents and Verifiers with SWE-Gym

Episode 306 · 26:54

🤗 Upvotes: 6 | cs.SE, cs.CL Authors: Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang Title...

HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation

Episode 305 · 20:54

🤗 Upvotes: 5 | cs.SE, cs.CL Authors: Zhaojian Yu, Yilun Zhao, Arman Cohan, Xiao-Ping Zhang Title: HumanEval Pro and MBP...

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Episode 304 · 23:19

🤗 Upvotes: 5 | cs.CV Authors: Haoran Wei, Youyang Yin, Yumeng Li, Jia Wang, Liang Zhao, Jianjian Sun, Zheng Ge, Xiangyu Zhang Title...

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Episode 303 · 23:19

🤗 Upvotes: 53 | cs.CL, cs.AI, cs.LG Authors: Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Jianye Hou, Benyou Wan...

1.58-bit FLUX

Episode 302 · 22:59

🤗 Upvotes: 24 | cs.CV, cs.AI, cs.LG Authors: Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen ...

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Episode 301 · 17:30

🤗 Upvotes: 17 | cs.CL, cs.AI, cs.CV, cs.LG, cs.MM, eess.AS Authors: Liang Chen, Zekun Wang, Shuhuai Ren, Lei Li, Haozhe Zhao, Yunshui Li, Zefan ...

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Episode 300 · 23:16

🤗 Upvotes: 11 | cs.CV Authors: Zehan Wang, Ziang Zhang, Tianyu Pang, Chao Du, Hengshuang Zhao, Zhou Zhao Title: Orient ...

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Episode 299 · 25:03

🤗 Upvotes: 11 | cs.CV Authors: Ziang Yan, Zhilin Li, Yinan He, Chenting Wang, Kunchang Li, Xinhao Li, Xiangyu Zeng, Zilei Wang, Yali Wang, Yu Qi...