Episodes

Latest Episode
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework

Episode 68 · 20:43

🤗 Paper Upvotes: 28 | cs.CL Authors: Yew Ken Chia, Liying Cheng, Hou Pong Chan, Chaoqun Liu, Maojia Song, Sharifah Mahani Aljunied, Soujanya Por...

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Episode 67 · 24:47

🤗 Paper Upvotes: 21 | cs.CV, cs.LG Authors: NVIDIA: Yuval Atzmon, Maciej Bala, Yogesh Balaji, Tiffany Cai, Yin Cui, Jiaojiao Fan, Yunhao Ge, ...

GitChameleon: Unmasking the Version-Switching Capabilities of Code Generation Models

Episode 66 · 24:32

🤗 Paper Upvotes: 18 | cs.SE, cs.LG Authors: Nizar Islah, Justine Gehring, Diganta Misra, Eilif Muller, Irina Rish, Terry Yue Zhuo, Massimo Cacci...

Watermark Anything with Localized Messages

Episode 65 · 23:25

🤗 Paper Upvotes: 11 | cs.CV, cs.CR Authors: Tom Sander, Pierre Fernandez, Alain Durmus, Teddy Furon, Matthijs Douze Title: ...

Autoregressive Models in Vision: A Survey

Episode 64 · 22:52

🤗 Paper Upvotes: 3 | cs.CV, cs.CL Authors: Jing Xiong, Gongye Liu, Lun Huang, Chengyue Wu, Taiqiang Wu, Yao Mu, Yuan Yao, Hui Shen, Zhongwei Wan...

LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation

Episode 63 · 25:29

🤗 Paper Upvotes: 15 | cs.CV, cs.CL Authors: Weiquan Huang, Aoqi Wu, Yifan Yang, Xufang Luo, Yuqing Yang, Liang Hu, Qi Dai, Xiyang Dai, Dongdong ...

Balancing Pipeline Parallelism with Vocabulary Parallelism

Episode 62 · 23:34

🤗 Paper Upvotes: 10 | cs.DC Authors: Man Tsung Yeung, Penghui Qi, Min Lin, Xinyi Wan Title: Balancing Pipeline Parallel...

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Episode 61 · 21:47

🤗 Paper Upvotes: 10 | cs.CV Authors: Yuze He, Yanning Zhou, Wang Zhao, Zhongkai Wu, Kaiwen Xiao, Wei Yang, Yong-Jin Liu, Xiao Han T...

DELIFT: Data Efficient Language model Instruction Fine Tuning

Episode 60 · 21:17

🤗 Paper Upvotes: 5 | cs.CL Authors: Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa, Marina Danilevsky Title: DELI...

Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study

Episode 59 · 25:06

🤗 Paper Upvotes: 4 | cs.SE, cs.AI, cs.LG Authors: André Storhaug, Jingyue Li Title: Parameter-Efficient Fine-Tuning of ...

RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models

Episode 58 · 22:22

🤗 Paper Upvotes: 3 | cs.CV, cs.AI Authors: Maya Varma, Jean-Benoit Delbrouck, Zhihong Chen, Akshay Chaudhari, Curtis Langlotz Title...

The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities

Episode 57 · 24:01

🤗 Paper Upvotes: 3 | cs.CL Authors: Zhaofeng Wu, Xinyan Velocity Yu, Dani Yogatama, Jiasen Lu, Yoon Kim Title: The Sema...

Improving the detection of technical debt in Java source code with an enriched dataset

Episode 56 · 26:17

🤗 Paper Upvotes: 2 | cs.SE Authors: Nam Le Hai, Anh M. T. Bui, Phuong T. Nguyen, Davide Di Ruscio, Rick Kazman Title: I...

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Episode 55 · 22:46

🤗 Paper Upvotes: 69 | cs.CL, cs.PL Authors: Siming Huang, Tianhao Cheng, Jason Klein Liu, Jiaran Hao, Liuyihan Song, Yang Xu, J. Yang, J. H. Liu...

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Episode 54 · 19:53

🤗 Paper Upvotes: 50 | cs.CV, cs.AI, cs.GR, cs.LG Authors: David Junhao Zhang, Roni Paiss, Shiran Zada, Nikhil Karnad, David E. Jacobs, Yael Prit...

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Episode 53 · 25:23

🤗 Paper Upvotes: 41 | cs.CL, cs.LG Authors: Hongyu Wang, Shuming Ma, Furu Wei Title: BitNet a4.8: 4-bit Activations for...

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Episode 52 · 23:01

🤗 Paper Upvotes: 27 | cs.CV, cs.AI, cs.GR Authors: Wenqiang Sun, Shuo Chen, Fangfu Liu, Zilong Chen, Yueqi Duan, Jun Zhang, Yikai Wang ...

Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Episode 51 · 24:52

🤗 Paper Upvotes: 25 | cs.CL Authors: Weixin Liang, Lili Yu, Liang Luo, Srinivasan Iyer, Ning Dong, Chunting Zhou, Gargi Ghosh, Mike Lewis, Wen-t...

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Episode 50 · 24:39

🤗 Paper Upvotes: 20 | cs.CV Authors: Wenhao Wang, Yi Yang Title: TIP-I2V: A Million-Scale Real Text and Image Prompt Da...

Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Episode 49 · 22:30

🤗 Paper Upvotes: 15 | cs.CL Authors: Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Ho-Jin Choi Title: Thanos...

Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?

Episode 48 · 22:01

🤗 Paper Upvotes: 14 | cs.CL Authors: Jonathan Roberts, Kai Han, Samuel Albanie Title: Needle Threading: Can LLMs Follow...

DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation

Episode 47 · 21:13

🤗 Paper Upvotes: 12 | cs.RO, cs.LG Authors: Peiqi Liu, Zhanqiu Guo, Mohit Warke, Soumith Chintala, Chris Paxton, Nur Muhammad Mahi Shafiullah, L...

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Episode 46 · 27:45

🤗 Paper Upvotes: 12 | cs.CV Authors: Shehan Munasinghe, Hanan Gani, Wenqi Zhu, Jiale Cao, Eric Xing, Fahad Shahbaz Khan, Salman Khan ...

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Episode 45 · 23:53

🤗 Paper Upvotes: 33 | cs.CV, cs.AI, cs.CL, cs.MM Authors: Dingjie Song, Sicheng Lai, Shunian Chen, Lichao Sun, Benyou Wang Title: ...

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Episode 44 · 20:13

🤗 Paper Upvotes: 26 | cs.LG, cs.AI Authors: Antoine Grosnit, Alexandre Maraval, James Doran, Giuseppe Paolo, Albert Thomas, Refinath Shahul Hame...

Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models

Episode 43 · 23:19

🤗 Paper Upvotes: 10 | cs.CL, cs.AI, cs.LG Authors: Zhijian Zhuo, Ya Wang, Yutao Zeng, Xiaoqing Li, Xun Zhou, Jinwen Ma Title: ...

Self-Consistency Preference Optimization

Episode 42 · 20:41

🤗 Paper Upvotes: 5 | cs.CL, cs.AI, cs.LG Authors: Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang, Jing Xu, Maryam Fazel-Zarandi, Mohit Bansal...

From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond

Episode 41 · 17:13

🤗 Paper Upvotes: 3 | cs.CL Authors: Harsha Nori, Naoto Usuyama, Nicholas King, Scott Mayer McKinney, Xavier Fernandes, Sheng Zhang, Eric Horvitz...

HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems

Episode 40 · 21:04

🤗 Paper Upvotes: 34 | cs.IR Authors: Jiejun Tan, Zhicheng Dou, Wen Wang, Mang Wang, Weipeng Chen, Ji-Rong Wen Title: Ht...

LLaMo: Large Language Model-based Molecular Graph Assistant

Episode 39 · 24:53

🤗 Paper Upvotes: 13 | cs.LG, cs.AI, q-bio.MN Authors: Jinyoung Park, Minseong Bae, Dohwan Ko, Hyunwoo J. Kim Title: LLa...