About Me
Hi! I am an undergraduate student majoring in Mathematical Sciences at KAIST, where I am fortunate to be advised by Chulhee Yun. I was also a visiting student at the University of Washington, working with Simon Shaolei Du and Sewoong Oh.
My research interests broadly span the foundations of deep learning, with the goal of bridging theory and practice. Recently, I have been focusing on understanding the optimization dynamics in deep learning, particularly in the pre-training and post-training of language models, and leveraging these insights to design principled and efficient optimization algorithms.
Update: Starting in Fall 2026, I will join Stanford University as a PhD student in Computer Science, supported by the Stanford School of Engineering Fellowship.
Research Interests
- DL/RL/LLM Theory
- Optimization
News
- [Apr. 2026] Two papers (Zeroth-Order Edge of Stability, Dichotomy of RLHF and DPO) are accepted to ICML 2026.
- [Apr. 2026] I am giving a contributed talk on Zeroth-Order Edge of Stability at the ICLR 2026 Workshop on Scientific Methods for Understanding Deep Learning in Rio, Brazil.
- [Feb. 2026] Our paper won the Best Student Paper Award at ALT 2026.
- [Jan. 2026] Our paper on the implicit bias of per-sample Adam on separable data is accepted to ICLR 2026.
- [Dec. 2025] Our paper on the spurious alignment of SGD in ill-conditioned high-dimensional quadratics is accepted to ALT 2026.
- [Oct. 2025] I was selected as a Top Reviewer (top 8% of reviewers) at NeurIPS 2025.
- [Sep. 2025] Our paper on understanding the benefit of the Schedule-Free optimizer through the river-valley loss landscape is accepted to NeurIPS 2025.
- [Jun. 2025] I joined Sewoong Oh’s group as a visiting student researcher at the University of Washington.
- [May 2025] Our paper on how datasets, network architectures, and optimizers influence progressive sharpening is accepted to ICML 2025.
- [Jan. 2025] Our paper on identifying the spurious alignment of SGD in an ill-conditioned valley (a.k.a. river-valley) loss landscape is accepted to ICLR 2025.
- [Jan. 2025] I joined Simon Shaolei Du’s group as a visiting student researcher at the University of Washington.
- [Jan. 2024] Our paper on the optimization characteristics of linear Transformers is accepted to ICLR 2024.
- [Sep. 2023] Our paper on understanding the Edge of Stability in deep learning is accepted to NeurIPS 2023.
Publications
(* denotes equal contribution)
-
Minhak Song, Liang Zhang, Bingcong Li, Niao He, Michael Muehlebach, Sewoong Oh
International Conference on Machine Learning (ICML) 2026
ICLR 2026 Workshop on Scientific Methods for Understanding Deep Learning (Oral)
-
Ruizhe Shi*, Minhak Song*, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon S. Du
International Conference on Machine Learning (ICML) 2026
-
Beomhan Baek*, Minhak Song*, Chulhee Yun
International Conference on Learning Representations (ICLR) 2026
NeurIPS 2025 Workshop on Optimization for Machine Learning
-
Shenyang Deng, Boyao Liao, Zhuoli Ouyang, Tianyu Pang, Minhak Song, Yaoqing Yang
International Conference on Algorithmic Learning Theory (ALT) 2026 (Best Student Paper)
-
Minhak Song*, Beomhan Baek*, Kwangjun Ahn, Chulhee Yun
Neural Information Processing Systems (NeurIPS) 2025
ICML 2025 Workshop on High-dimensional Learning Dynamics
-
Geonhui Yoo, Minhak Song, Chulhee Yun
International Conference on Machine Learning (ICML) 2025
-
Minhak Song, Kwangjun Ahn, Chulhee Yun
International Conference on Learning Representations (ICLR) 2025
ICML 2024 Workshop on High-dimensional Learning Dynamics
-
Kwangjun Ahn*, Xiang Cheng*, Minhak Song*, Chulhee Yun, Ali Jadbabaie, Suvrit Sra
International Conference on Learning Representations (ICLR) 2024
NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning (Oral)
-
Minhak Song, Chulhee Yun
Neural Information Processing Systems (NeurIPS) 2023
Services
Conference/Workshop Reviewer