Episodes

Latest Episode
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Episode 1283 · 24:17

🤗 Upvotes: 106 | cs.LG, cs.CL, cs.CV Authors: Wei Huang, Yi Ge, Shuai Yang, Yicheng Xiao, Huizi Mao, Yujun Lin, Hanrong Ye, Sifei Liu, Ka Chun C...

Diffusion Transformers with Representation Autoencoders

Episode 1282 · 24:28

🤗 Upvotes: 93 | cs.CV, cs.LG Authors: Boyang Zheng, Nanye Ma, Shengbang Tong, Saining Xie Title: Diffusion Transformers...

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

Episode 1281 · 26:46

🤗 Upvotes: 39 | cs.AI Authors: Caorui Li, Yu Chen, Yiyan Ji, Jin Xu, Zhenyu Cui, Shihao Li, Yuanxing Zhang, Jiafu Tang, Zhenghao Song, Dingling ...

Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

Episode 1280 · 25:11

🤗 Upvotes: 37 | cs.CL Authors: Qinglin Zhu, Yizhen Yao, Runcong Zhao, Yanzheng Xiang, Amrutha Saseendran, Chen Jin, Philip Alexander Teare, Bin ...

Spotlight on Token Perception for Multimodal Reinforcement Learning

Episode 1279 · 23:52

🤗 Upvotes: 31 | cs.CV Authors: Siyuan Huang, Xiaoye Qu, Yafu Li, Yun Luo, Zefeng He, Daizong Liu, Yu Cheng Title: Spotl...

RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

Episode 1278 · 24:01

🤗 Upvotes: 31 | cs.LG, cs.AI, cs.CL Authors: Jinghao Zhang, Naishan Zheng, Ruilin Li, Dongzhou Cheng, Zheming Liang, Feng Zhao, Jiaqi Wang ...

DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training

Episode 1277 · 22:13

🤗 Upvotes: 26 | cs.CV Authors: Haoran Feng, Dizhe Zhang, Xiangtai Li, Bo Du, Lu Qi Title: DiT360: High-Fidelity Panoram...

AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

Episode 1276 · 24:24

🤗 Upvotes: 26 | cs.CV Authors: Xinlong Chen, Yue Ding, Weihong Lin, Jingyun Hua, Linli Yao, Yang Shi, Bozhou Li, Yuanxing Zhang, Qiang Liu, Peng...

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

Episode 1275 · 25:21

🤗 Upvotes: 25 | cs.CV Authors: Haomin Wang, Jinhui Yin, Qi Wei, Wenguang Zeng, Lixin Gu, Shenglong Ye, Zhangwei Gao, Yaohui Wang, Yanting Zhang,...

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Episode 1274 · 21:38

🤗 Upvotes: 25 | cs.CL, cs.AI Authors: Tao Yu, Zhengbo Zhang, Zhiheng Lyu, Junhao Gong, Hongzhu Yi, Xinming Wang, Yuxuan Zhou, Jiabing Yang, Ping...

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Episode 1273 · 23:49

🤗 Upvotes: 104 | cs.AI, cs.CV, cs.RO Authors: Suwhan Choi, Jaeyoon Jung, Haebin Seong, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yub...

Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation

Episode 1272 · 23:18

🤗 Upvotes: 86 | cs.CV Authors: Kang Liao, Size Wu, Zhonghua Wu, Linyi Jin, Chao Wang, Yikai Wang, Fei Wang, Wei Li, Chen Change Loy ...

TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling

Episode 1271 · 21:39

🤗 Upvotes: 38 | cs.CV Authors: Hyunmin Cho, Donghoon Ahn, Susung Hong, Jee Eun Kim, Seungryong Kim, Kyong Hwan Jin Title: ...

AutoPR: Let's Automate Your Academic Promotion!

Episode 1270 · 22:35

🤗 Upvotes: 38 | cs.CL Authors: Qiguang Chen, Zheng Yan, Mingda Yang, Libo Qin, Yixin Yuan, Hanjing Li, Jinhao Liu, Yiyan Ji, Dengyun Peng, Jiann...

Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs

Episode 1269 · 25:37

🤗 Upvotes: 37 | cs.LG, cs.AI, cs.CL Authors: Yumin Choi, Dongki Kim, Jinheon Baek, Sung Ju Hwang Title: Multimodal Prom...

BEAR: Benchmarking and Enhancing Multimodal Language Models for Atomic Embodied Capabilities

Episode 1268 · 26:40

🤗 Upvotes: 30 | cs.CV, cs.RO Authors: Yu Qi, Haibo Zhao, Ziyu Guo, Siyuan Ma, Ziyan Chen, Yaokun Han, Renrui Zhang, Zitiantao Lin, Shiji Xin, Yi...

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Episode 1267 · 21:22

🤗 Upvotes: 26 | cs.CV, cs.AI, cs.CL Authors: Ruyi Xu, Guangxuan Xiao, Yukang Chen, Liuning He, Kelly Peng, Yao Lu, Song Han Title: ...

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Episode 1266 · 24:41

🤗 Upvotes: 22 | cs.CL, cs.AI Authors: Zhepeng Cen, Haolin Chen, Shiyu Wang, Zuxin Liu, Zhiwei Liu, Ding Zhao, Silvio Savarese, Caiming Xiong, Hu...

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Episode 1265 · 22:59

🤗 Upvotes: 22 | cs.SE, cs.AI, cs.CL Authors: Terry Yue Zhuo, Xiaolong Jin, Hange Liu, Juyong Jiang, Tianyang Liu, Chen Gong, Bhupesh Bishnoi, Va...

R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?

Episode 1264 · 24:42

🤗 Upvotes: 22 | cs.AI, cs.CL Authors: Yi Lu, Jianing Wang, Linsen Guo, Wei He, Hongyin Tang, Tao Gui, Xuanjing Huang, Xuezhi Cao, Wei Wang, Xunl...