Episodes

Latest Episode
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Episode 1420 · 24:43

🤗 Upvotes: 78 | cs.CV Authors: Z-Image Team, Huanqia Cai, Sihan Cao, Ruoyi Du, Peng Gao, Steven Hoi, Shijie Huang, Zhaohui Hou, Dengyang Jiang, ...

REASONEDIT: Towards Reasoning-Enhanced Image Editing Models

Episode 1419 · 21:40

🤗 Upvotes: 40 | cs.CV Authors: Fukun Yin, Shiyu Liu, Yucheng Han, Zhibo Wang, Peng Xing, Rui Wang, Wei Cheng, Yingming Wang, Aojie Li, Zixin Yin...

Vision Bridge Transformer at Scale

Episode 1418 · 21:48

🤗 Upvotes: 31 | cs.CV, cs.AI Authors: Zhenxiong Tan, Zeqing Wang, Xingyi Yang, Songhua Liu, Xinchao Wang

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Episode 1417 · 21:10

🤗 Upvotes: 25 | cs.AI, cs.CL Authors: Zhihong Shao, Yuxiang Luo, Chengda Lu, Z. Z. Ren, Jiewen Hu, Tian Ye, Zhibin Gou, Shirong Ma, Xiaokang Zha...

Architecture Decoupling Is Not All You Need For Unified Multimodal Model

Episode 1416 · 20:55

🤗 Upvotes: 23 | cs.CV Authors: Dian Zheng, Manyuan Zhang, Hongyu Li, Kai Zou, Hongbo Liu, Ziyu Guo, Kaituo Feng, Yexin Liu, Ying Luo, Yan Feng, ...

Multimodal Evaluation of Russian-language Architectures

Episode 1415 · 24:48

🤗 Upvotes: 71 | cs.CL, cs.AI, cs.CV Authors: Artem Chervyakov, Ulyana Isaeva, Anton Emelyanov, Artem Safin, Maria Tikhonova, Alexander Kharitono...

Latent Collaboration in Multi-Agent Systems

Episode 1414 · 26:12

🤗 Upvotes: 60 | cs.CL, cs.AI, cs.LG Authors: Jiaru Zou, Xiyuan Yang, Ruizhong Qiu, Gaotang Li, Katherine Tieu, Pan Lu, Ke Shen, Hanghang Tong, Y...

Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation

Episode 1413 · 18:26

🤗 Upvotes: 37 | cs.CV, cs.AI Authors: Inferix Team, Tianyu Feng, Yizeng Han, Jiahao He, Yuanyu He, Xi Lin, Teng Liu, Hanfeng Lu, Jiasheng Tang, ...

GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms

Episode 1412 · 23:40

🤗 Upvotes: 87 | cs.NE, cs.AI, cs.LG Authors: Valentin Khrulkov, Andrey Galichin, Denis Bashkirov, Dmitry Vinichenko, Oleg Travkin, Roman Alferov...

MedSAM3: Delving into Segment Anything with Medical Concepts

Episode 1411 · 24:21

🤗 Upvotes: 38 | cs.CV, cs.AI Authors: Anglin Liu, Rundong Xue, Xu R. Cao, Yifan Shen, Yi Lu, Xiang Li, Qianqian Chen, Jintai Chen

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

Episode 1410 · 22:42

🤗 Upvotes: 37 | cs.CV, cs.AI Authors: Jiaqi Liu, Kaiwen Xiong, Peng Xia, Yiyang Zhou, Haonian Ji, Lu Feng, Siwei Han, Mingyu Ding, Huaxiu Yao ...

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Episode 1409 · 19:11

🤗 Upvotes: 37 | cs.CV Authors: Jiaming Zhang, Shengming Cao, Rui Li, Xiaotong Zhao, Yutao Cui, Xinglin Hou, Gangshan Wu, Haolan Chen, Yu Xu, Lim...

iMontage: Unified, Versatile, Highly Dynamic Many-to-many Image Generation

Episode 1408 · 25:05

🤗 Upvotes: 28 | cs.CV Authors: Zhoujie Fu, Xianfang Zeng, Jinghong Lan, Xinyao Liao, Cheng Chen, Junyi Chen, Jiacheng Wei, Wei Cheng, Shiyu Liu,...

Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward

Episode 1407 · 25:10

🤗 Upvotes: 26 | cs.CV, cs.CL Authors: Yuwei Niu, Weiyang Jin, Jiaqi Liao, Chaoran Feng, Peng Jin, Bin Lin, Zongjian Li, Bin Zhu, Weihao Yu, Li Y...

GigaWorld-0: World Models as Data Engine to Empower Embodied AI

Episode 1406 · 22:27

🤗 Upvotes: 24 | cs.CV, cs.RO Authors: GigaWorld Team, Angen Ye, Boyuan Wang, Chaojun Ni, Guan Huang, Guosheng Zhao, Haoyun Li, Jiagang Zhu, Keru...

SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space

Episode 1405 · 21:32

🤗 Upvotes: 23 | cs.CL Authors: Zhenyi Shen, Junru Lu, Lin Gui, Jiazheng Li, Yulan He, Di Yin, Xing Sun

Soft Adaptive Policy Optimization

Episode 1404 · 24:08

🤗 Upvotes: 23 | cs.LG, cs.AI, cs.CL Authors: Chang Gao, Chujie Zheng, Xiong-Hui Chen, Kai Dang, Shixuan Liu, Bowen Yu, An Yang, Shuai Bai, Jingr...

General Agentic Memory Via Deep Research

Episode 1403 · 25:36

🤗 Upvotes: 121 | cs.CL, cs.AI, cs.IR, cs.LG Authors: B. Y. Yan, Chaofan Li, Hongjin Qian, Shuqi Lu, Zheng Liu

AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning

Episode 1402 · 23:00

🤗 Upvotes: 79 | cs.AI, cs.CL, cs.LG Authors: Jiayi Zhang, Yiran Peng, Fanqi Kong, Yang Cheng, Yifan Wu, Zhaoyang Yu, Jinyu Xiang, Jianhao Ruan, ...

Computer-Use Agents as Judges for Generative User Interface

Episode 1401 · 25:50

🤗 Upvotes: 47 | cs.CV, cs.CL, cs.HC Authors: Kevin Qinghong Lin, Siyuan Hu, Linjie Li, Zhengyuan Yang, Lijuan Wang, Philip Torr, Mike Zheng Shou...