Research papers

Links to DBLP and Google Scholar. The code for reproducing our empirical work is available on GitHub. Acknowledgement: Our research has been generously supported by funds from the National Science Foundation and from JP Morgan Chase.

Representative publications

These selected publications include my job market paper, more recent work with my PhD students at Northeastern, and collaborative work.

Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
Yuanzhi Li*, Tengyu Ma*, and Hongyang R. Zhang*
Conference on Learning Theory (COLT), 2018. Best Paper Award

In this paper, we showed that in the over-parameterized matrix sensing problem, gradient descent starting from a small random initialization converges to the ground-truth matrix without explicit regularization added to the loss objective. We recently utilized techniques from this line of work to tackle matrix completion from ultra-sparse samples.

Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees
Haotian Ju†, Dongyue Li†, and Hongyang R. Zhang
International Conference on Machine Learning (ICML), 2022

We analyze the supervised fine-tuning algorithm, which starts from a pretrained model instead of a random initialization. We identify a Hessian-based generalization measure (derived from a PAC-Bayes noise perturbation analysis) that gives a non-vacuous bound on the generalization gap, and empirically validate this bound for various neural network architectures.

Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion
Haotian Ju†, Dongyue Li†, Aneesh Sharma, and Hongyang R. Zhang
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023

We improve the state-of-the-art generalization bound for graph neural networks, specifically message-passing neural networks. We prove bounds based on the spectral norm of the graph diffusion matrix, whereas prior work shows bounds depending on the maximum degree of the graphs.

Identification of Negative Transfers in Multitask Learning Using Surrogate Models
Dongyue Li†, Huy L. Nguyen, and Hongyang R. Zhang
Transactions on Machine Learning Research (TMLR), 2023. Featured Certification

We rigorously formulate the problem of accurately identifying negative transfer in multitask learning. We introduce linear surrogate models for predicting the outcomes of multitask training and prove linear sample complexity bounds. This problem formulation has generated numerous follow-up works, including our latest work on task attribution at ICLR'26.

Precise High-Dimensional Asymptotics for Quantifying Heterogeneous Transfers
Fan Yang*, Hongyang R. Zhang*, Sen Wu, Christopher Ré, and Weijie Su
Journal of Machine Learning Research (JMLR), 2025

We give a precise quantification of negative transfer in the proportional limit regime for two linear regression tasks. We rigorously prove a phase transition from positive to negative transfer as the number of source-task samples increases, improving upon our initial analysis at ICLR'20.

* indicates alphabetical author order.

† indicates a Northeastern student co-author who was advised by me.