Recent years have witnessed major advances in data science, driven by increasingly large neural networks. While these models hold great promise, modern machine learning applications motivate fine-grained performance control, spanning fairness, robustness, and accuracy. This talk will introduce new strategies to ensure these models behave the way we want.
I will first discuss optimizing fairness objectives on datasets with imbalanced or sensitive groups. We observe that a large model can achieve seemingly perfect fairness on the training data yet fail dramatically at test time. We show that this is due to overfitting as well as the ineffectiveness of traditional approaches. To address this, we propose a new family of fairness-seeking loss functions that better guide the training process, and we use them to achieve state-of-the-art performance.
I will then motivate bilevel problem formulations as a key ingredient of generalizable learning: they help avoid overfitting, select the right model, and overcome distribution shift between training and validation data. Finally, I will introduce new bilevel optimization methods for federated learning across decentralized clients. These methods achieve provable communication and computation efficiency, overcome client heterogeneity, and gracefully specialize to min-max optimization.
I will conclude the talk with a discussion of future research on the theoretical foundations of deep learning, resource-efficient ML architectures, and data-driven control problems.