ICLR 2025 Workshop on
Scalable Optimization for Efficient and Adaptive Foundation Models
(SCOPE)


Monday, April 28th, 2025

co-located with ICLR 2025 in Singapore

Accepted Papers

The full list of accepted papers can be found on SCOPE - OpenReview.

Oral Accepts

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu Cheng

STIV: Scalable Text and Image Conditioned Video Generation
Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang

Overtrained Language Models Are Harder to Fine-Tune
Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen, Tanishq Kumar, Xiang Yue, Sadhika Malladi, Graham Neubig, Aditi Raghunathan

LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
Sihwan Park, Doohyuk Jang, Sung-Yub Kim, Souvik Kundu, Eunho Yang

ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals
Utkarsh Saxena, Sayeh Sharify, Kaushik Roy, Xin Wang

SageAttention2: Efficient Attention with Smoothing Q and Per-thread Quantization
Jintao Zhang, Haofeng Huang, Pengle Zhang, Jia Wei, Jun Zhu, Jianfei Chen

Towards Infinite-Long Prefix in Transformers
Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang

M2R2: Efficient Transformers with Mixture of Multi-Rate Residuals
Nikhil Bhendawade, Mahyar Najibi, Devang Naik, Irina Belousova

Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
Weixin Liang, Junhong Shen, Genghan Zhang, Ning Dong, Luke Zettlemoyer, Lili Yu

Poster Accepts

Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models
Andy Zhou, Ron Arel

Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong, Sayak Paul, Noah Lee, Kashif Rasul, James Thorne, Jongheon Jeong

The Curse of Depth in Large Language Models
Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu

DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
Zhen Tan, Daize Dong, Xinyu Zhao, Jianing Cai, Jie Peng, Yu Cheng, Tianlong Chen

QMambaExtend: Improving Long-Context Extension of Memory-Efficient Mamba Models
Seyedarmin Azizi, Souvik Kundu, Mohammad Erfan Sadeghi, Massoud Pedram

Efficient Open-set Test Time Adaptation of Vision Language Models
Manogna Sreenivas, Soma Biswas

Effortless Efficiency: Low-Cost Pruning of Diffusion Models
Yang Zhang, Er Jin, Yanfei Dong, Ashkan Khakzar, Philip Torr, Johannes Stegmaier, Kenji Kawaguchi

Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xiang Li, Li Shen, Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu

Revisiting Associative Recall in Modern Recurrent Models
Destiny Okpekpe, Antonio Orvieto

PENCIL: Long Thoughts with Short Memory
Chenxiao Yang, Nathan Srebro, David McAllester, Zhiyuan Li

Universal LLM Routing with Correctness-Based Representation
Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Jeevesh Juneja, Zifeng Wang, Chen-Yu Lee, Pradeep Shenoy, Rina Panigrahy, Aditya Krishna Menon, Sanjiv Kumar

AsymLoRA: Unlocking the Power of Multimodal LLMs via Asymmetric LoRA
Xuyang Wei, Chunlin Tian, Li Li

Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention
Zhendong Zhang

Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2
Steven Abreu, Sumit Bam Shrestha, Rui-Jie Zhu, Jason Eshraghian

Graph Low-Rank Adapters of High Regularity for Graph Neural Networks and Graph Transformers
Pantelis Papageorgiou, Haitz Sáez de Ocáriz Borde, Anastasis Kratsios, Michael M. Bronstein

MixER: Better Mixture of Experts Routing for Hierarchical Meta-Learning
Roussel Desmond Nzoyem, Grant Stevens, Amarpal Sahota, David A.W. Barton, Tom Deakin

Adaptive Length Image Tokenization via Recurrent Allocation
Shivam Duggal, Phillip Isola, Antonio Torralba, William T. Freeman

Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto

Relevance Isn't All You Need: Scaling RAG Systems With Inference-Time Compute Via Multi-Criteria Reranking
Will LeVine, Bijan Varjavand

Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
Kevin Li, Sachin Goyal, João D. Semedo, J Zico Kolter

Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Lyubaykin Nikita, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov

FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models
Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma

ChameleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters
Kamer Ali Yuksel, Hassan Sawaf

A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck, Maximilian Baader, Martin Vechev

Conformal Transformations for Symmetric Power Transformers
Saurabh Kumar, Jacob Buckman, Carles Gelada, Xiaowen Zhang

Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
Fenglu Hong, Ravi Shanker Raju, Jonathan Lingjie Li, Bo Li, Urmish Thakker, Avinash Ravichandran, Swayambhoo Jain, Changran Hu

Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao, Xiaoyu Li, Zhao Song

Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song

Domain-Invariant Prompt Learning for Vision-Language Models
Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt

Low-Rank Continual Personalization of Diffusion Models
Łukasz Staniszewski, Katarzyna Zaleska, Kamil Deja

Efficient Distributed Optimization under Heavy-Tailed Noise
Su Hyeong Lee, Manzil Zaheer, Tian Li

Enhanced Continual Learning of Vision-Language Models with Model Fusion
Haoyuan Gao, Zicong Zhang, Yuqi Wei, Linglan Zhao, Guilin Li, Yexin Li, Linghe Kong, Weiran Huang

RecurFormer: Not All Transformer Heads Need Self-Attention
Ruiqing Yan, Linghan Zheng, Xingbo Du, Han Zou, Yufeng Guo, Jianfei Yang

XAMBA: Enabling Efficient State Space Models on Resource-Constrained Neural Processing Units
Arghadip Das, Arnab Raha, Shamik Kundu, Soumendu Kumar Ghosh, Deepak Mathaikutty, Vijay Raghunathan

DARS: Robust Sparse Fine-Tuning with Regularized Subspace Disalignment
Sumin Park, Noseong Park

In-batch Ensemble Drafting: Robust Speculative Decoding for LVLMs
Minjae Lee, Wonjun Kang, Byeongkeun Ahn, Christian Classen, Minghao Yan, Hyung Il Koo, Kangwook Lee

OPPA: Optimizing Parallelism for Language Model Training
Apivich Hemachandra, Yizhan Han, See-Kiong Ng, Bryan Kian Hsiang Low

N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman, Alexander Nikulin, Viacheslav Sinii, Denis Tarasov, Lyubaykin Nikita, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horváth, Praneeth Vepakomma

AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting
Abdelhakim Benechehab, Vasilii Feofanov, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl

UniForm: A Reuse Attention Mechanism for Efficient Transformers on Resource-Constrained Edge Devices
Seul-Ki Yeom, Tae-Ho Kim

KV Prediction for Improved Time to First Token
Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi

Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing
Aviv Bick, Tobias Katsch, Nimit Sharad Sohoni, Arjun D Desai, Albert Gu

On Vanishing Variance in Transformer Length Generalization
Ruining Li, Gabrijel Boduljak, Jensen Zhou

Attention Is All You Need For Mixture-of-Depths Routing
Advait Gadhikar, Souptik Kumar Majumdar, Niclas Popp, Piyapat Saranrittichai, Martin Rapp, Lukas Schott

Context Is All You Need: Efficient Retrieval Augmented Generation for Domain Specific AI
Peixi Xiong, Chaunte W. Lacewell, Sameh Gobriel, Nilesh Jain