ICLR 2025 Workshop on Scalable Optimization for Efficient and Adaptive Foundation Models (SCOPE)
Monday, April 28th, 2025
co-located with ICLR 2025 in Singapore
The full list of accepted papers can be found on the SCOPE OpenReview page.
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
Weigao Sun, Disen Lan, Tong Zhu, Xiaoye Qu, Yu Cheng
STIV: Scalable Text and Image Conditioned Video Generation
Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang
Overtrained Language Models Are Harder to Fine-Tune
Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen, Tanishq Kumar, Xiang Yue, Sadhika Malladi, Graham Neubig, Aditi Raghunathan
LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
Sihwan Park, Doohyuk Jang, Sung-Yub Kim, Souvik Kundu, Eunho Yang
ResQ: Mixed-Precision Quantization of Large Language Models with Low-Rank Residuals
Utkarsh Saxena, Sayeh Sharify, Kaushik Roy, Xin Wang
SageAttention2: Efficient Attention with Smoothing Q and Per-thread Quantization
Jintao Zhang, Haofeng Huang, Pengle Zhang, Jia Wei, Jun Zhu, Jianfei Chen
Towards Infinite-Long Prefix in Transformers
Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang
M2R2: Efficient Transformers with Mixture of Multi-Rate Residuals
Nikhil Bhendawade, Mahyar Najibi, Devang Naik, Irina Belousova
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
Weixin Liang, Junhong Shen, Genghan Zhang, Ning Dong, Luke Zettlemoyer, Lili Yu
Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models
Andy Zhou, Ron Arel
Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
Jiwoo Hong, Sayak Paul, Noah Lee, Kashif Rasul, James Thorne, Jongheon Jeong
The Curse of Depth in Large Language Models
Wenfang Sun, Xinyuan Song, Pengxiang Li, Lu Yin, Yefeng Zheng, Shiwei Liu
DLO: Dynamic Layer Operation for Efficient Vertical Scaling of LLMs
Zhen Tan, Daize Dong, Xinyu Zhao, Jianing Cai, Jie Peng, Yu Cheng, Tianlong Chen
QMambaExtend: Improving Long-Context Extension of Memory-Efficient Mamba Models
Seyedarmin Azizi, Souvik Kundu, Mohammad Erfan Sadeghi, Massoud Pedram
Efficient Open-set Test Time Adaptation of Vision Language Models
Manogna Sreenivas, Soma Biswas
Effortless Efficiency: Low-Cost Pruning of Diffusion Models
Yang Zhang, Er Jin, Yanfei Dong, Ashkan Khakzar, Philip Torr, Johannes Stegmaier, Kenji Kawaguchi
Stable-SPAM: How to Train in 4-Bit More Stably than 16-Bit Adam
Tianjin Huang, Haotian Hu, Zhenyu Zhang, Gaojie Jin, Xiang Li, Li Shen, Tianlong Chen, Lu Liu, Qingsong Wen, Zhangyang Wang, Shiwei Liu
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Tianjin Huang, Ziquan Zhu, Gaojie Jin, Lu Liu, Zhangyang Wang, Shiwei Liu
Revisiting Associative Recall in Modern Recurrent Models
Destiny Okpekpe, Antonio Orvieto
PENCIL: Long Thoughts with Short Memory
Chenxiao Yang, Nathan Srebro, David McAllester, Zhiyuan Li
Universal LLM Routing with Correctness-Based Representation
Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Jeevesh Juneja, Zifeng Wang, Chen-Yu Lee, Pradeep Shenoy, Rina Panigrahy, Aditya Krishna Menon, Sanjiv Kumar
AsymLoRA: Unlocking the Power of Multimodal LLMs via Asymmetric LoRA
Xuyang Wei, Chunlin Tian, Li Li
Acceleration Multiple Heads Decoding for LLM via Dynamic Tree Attention
Zhendong Zhang
Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2
Steven Abreu, Sumit Bam Shrestha, Rui-Jie Zhu, Jason Eshraghian
Graph Low-Rank Adapters of High Regularity for Graph Neural Networks and Graph Transformers
Pantelis Papageorgiou, Haitz Sáez de Ocáriz Borde, Anastasis Kratsios, Michael M. Bronstein
MixER: Better Mixture of Experts Routing for Hierarchical Meta-Learning
Roussel Desmond Nzoyem, Grant Stevens, Amarpal Sahota, David A.W. Barton, Tom Deakin
Adaptive Length Image Tokenization via Recurrent Allocation
Shivam Duggal, Phillip Isola, Antonio Torralba, William T. Freeman
Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto
Relevance Isn't All You Need: Scaling RAG Systems With Inference-Time Compute Via Multi-Criteria Reranking
Will LeVine, Bijan Varjavand
Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters
Kevin Li, Sachin Goyal, João D. Semedo, J Zico Kolter
Yes, Q-learning Helps Offline In-Context RL
Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Lyubaykin Nikita, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov
FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models
Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma
ChameleonLLM: Batch-Aware Dynamic Low-Rank Adaptation via Inference-Time Clusters
Kamer Ali Yuksel, Hassan Sawaf
A Unified Approach to Routing and Cascading for LLMs
Jasper Dekoninck, Maximilian Baader, Martin Vechev
Conformal Transformations for Symmetric Power Transformers
Saurabh Kumar, Jacob Buckman, Carles Gelada, Xiaowen Zhang
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
Fenglu Hong, Ravi Shanker Raju, Jonathan Lingjie Li, Bo Li, Urmish Thakker, Avinash Ravichandran, Swayambhoo Jain, Changran Hu
Grams: Gradient Descent with Adaptive Momentum Scaling
Yang Cao, Xiaoyu Li, Zhao Song
Fast Gradient Computation for RoPE Attention in Almost Linear Time
Yifang Chen, Jiayan Huo, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
Domain-Invariant Prompt Learning for Vision-Language Models
Arsham Gholamzadeh Khoee, Yinan Yu, Robert Feldt
Low-Rank Continual Personalization of Diffusion Models
Łukasz Staniszewski, Katarzyna Zaleska, Kamil Deja
Efficient Distributed Optimization under Heavy-Tailed Noise
Su Hyeong Lee, Manzil Zaheer, Tian Li
Enhanced Continual Learning of Vision-Language Models with Model Fusion
Haoyuan Gao, Zicong Zhang, Yuqi Wei, Linglan Zhao, Guilin Li, Yexin Li, Linghe Kong, Weiran Huang
RecurFormer: Not All Transformer Heads Need Self-Attention
Ruiqing Yan, Linghan Zheng, Xingbo Du, Han Zou, Yufeng Guo, Jianfei Yang
XAMBA: Enabling Efficient State Space Models on Resource-Constrained Neural Processing Units
Arghadip Das, Arnab Raha, Shamik Kundu, Soumendu Kumar Ghosh, Deepak Mathaikutty, Vijay Raghunathan
DARS: Robust Sparse Fine-Tuning with Regularized Subspace Disalignment
Sumin Park, Noseong Park
In-batch Ensemble Drafting: Robust Speculative Decoding for LVLMs
Minjae Lee, Wonjun Kang, Byeongkeun Ahn, Christian Classen, Minghao Yan, Hyung Il Koo, Kangwook Lee
OPPA: Optimizing Parallelism for Language Model Training
Apivich Hemachandra, Yizhan Han, See-Kiong Ng, Bryan Kian Hsiang Low
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
Ilya Zisman, Alexander Nikulin, Viacheslav Sinii, Denis Tarasov, Lyubaykin Nikita, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
Kaustubh Ponkshe, Raghav Singhal, Eduard Gorbunov, Alexey Tumanov, Samuel Horváth, Praneeth Vepakomma
AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting
Abdelhakim Benechehab, Vasilii Feofanov, Giuseppe Paolo, Albert Thomas, Maurizio Filippone, Balázs Kégl
UniForm: A Reuse Attention Mechanism for Efficient Transformers on Resource-Constrained Edge Devices
Seul-Ki Yeom, Tae-Ho Kim
KV Prediction for Improved Time to First Token
Maxwell Horton, Qingqing Cao, Chenfan Sun, Yanzi Jin, Sachin Mehta, Mohammad Rastegari, Moin Nabi
Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing
Aviv Bick, Tobias Katsch, Nimit Sharad Sohoni, Arjun D Desai, Albert Gu
On Vanishing Variance in Transformer Length Generalization
Ruining Li, Gabrijel Boduljak, Jensen Zhou
Attention Is All You Need For Mixture-of-Depths Routing
Advait Gadhikar, Souptik Kumar Majumdar, Niclas Popp, Piyapat Saranrittichai, Martin Rapp, Lukas Schott
Context Is All You Need: Efficient Retrieval Augmented Generation for Domain Specific AI
Peixi Xiong, Chaunte W. Lacewell, Sameh Gobriel, Nilesh Jain