Structural information and domain knowledge are both essential for training a machine learning model that performs well in its target application. This tutorial summarizes how to use optimization as a differentiable building block to incorporate non-trivial operational information from applications into machine learning models.
Machine learning models have achieved significant success in many industrial applications and social challenges, including natural language processing, computer vision, time series analysis, and recommendation systems. Adapting a model to a particular application requires incorporating that application's structural information and domain knowledge into the training process. In practice, this often relies on fine-tuning and feature engineering, with no systematic approach that carries over from one application to the next. Operations research, on the other hand, is application-driven: optimization problems are formulated from the knowledge and constraints of the target application to derive actionable solutions. Optimization formulations can capture structural information and domain knowledge, but their non-differentiability and complex operational processes make them hard to integrate into machine learning models.
This tutorial starts from the foundations of differentiable optimization and discusses how to convert optimization problems into differentiable building blocks that can be used in larger architectures. The direct benefit of differentiable optimization is that the structural information and domain knowledge encoded in optimization formulations can be integrated into machine learning models. The first part of the tutorial covers a variety of applications that use optimization as differentiable units in machine learning models to handle operational tasks in reinforcement learning, control, optimal transport, and geometry. Experiments demonstrate that differentiable optimization can model operational processes more efficiently than neural networks. The second part of the tutorial focuses on integrating various industrial and social challenges into the training pipeline as differentiable optimization layers. This integration of machine learning models and application-driven optimization leads to an end-to-end approach known as decision-focused learning, which trains models to directly optimize performance in the target application. Lastly, the tutorial concludes with a series of applications of differentiable optimization, its computational limitations, and open directions left to the audience.
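To make the idea of an optimization problem as a differentiable building block concrete, here is a minimal sketch, not taken from the tutorial material. It assumes a toy unconstrained quadratic program, x*(theta) = argmin_x 0.5 x^T Q x - theta^T x, whose matrix `Q`, loss `task_loss`, and helper names are all hypothetical. The backward pass follows from implicitly differentiating the stationarity condition Q x* - theta = 0.

```python
# Toy differentiable optimization layer (illustrative assumption, pure Python).
# Layer: x*(theta) = argmin_x 0.5 * x^T Q x - theta^T x, with Q fixed, 2x2,
# symmetric positive definite.
Q = [[2.0, 0.5],
     [0.5, 1.0]]

def solve_2x2(A, b):
    """Solve A x = b for a 2x2 matrix A using Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (b[0] * A[1][1] - A[0][1] * b[1]) / det,
        (A[0][0] * b[1] - A[1][0] * b[0]) / det,
    ]

def forward(theta):
    """Forward pass: stationarity Q x - theta = 0 gives x* = Q^{-1} theta."""
    return solve_2x2(Q, theta)

def backward(grad_x):
    """Backward pass via the implicit function theorem.

    Differentiating Q x* - theta = 0 gives dx*/dtheta = Q^{-1},
    so dL/dtheta = Q^{-T} dL/dx*.
    """
    Q_T = [[Q[0][0], Q[1][0]],
           [Q[0][1], Q[1][1]]]
    return solve_2x2(Q_T, grad_x)

def task_loss(x, target):
    """Hypothetical downstream loss evaluated on the optimizer's decision x."""
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))

def loss_and_grad(theta, target):
    """Loss on the layer output and its gradient with respect to theta."""
    x = forward(theta)
    grad_x = [xi - ti for xi, ti in zip(x, target)]  # dL/dx*
    return task_loss(x, target), backward(grad_x)

def numeric_grad(theta, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        diff = task_loss(forward(plus), target) - task_loss(forward(minus), target)
        grads.append(diff / (2 * eps))
    return grads

theta0 = [1.0, -2.0]
target0 = [0.3, 0.7]
loss0, grad0 = loss_and_grad(theta0, target0)
print("loss:", loss0)
print("analytic gradient:", grad0)
print("numeric gradient: ", numeric_grad(theta0, target0))
```

In a real pipeline, theta would typically be produced by a neural network, and the gradient returned by `backward` would continue flowing into that network's parameters. Constrained problems need the same implicit differentiation applied to their optimality (KKT) conditions rather than to a plain stationarity equation.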
[Part 1] "Differentiable optimization-based modeling for machine learning", Brandon Amos
[Part 2] "Decision-focused learning: theory, applications, challenges", Andrew Perrault
[Part 3] "Scalability challenges and solutions to decision-focused learning", Kai Wang
I am a Ph.D. candidate in Computer Science at Harvard University, working with Professor Milind Tambe. My research focuses on using artificial intelligence to address social challenges, including wildlife conservation and healthcare. I formulate these challenges as multi-agent systems and use machine learning to learn the unknown information from data. I apply and generalize a new learning algorithm, decision-focused learning, to integrate domain knowledge from social challenges into machine learning pipelines.
I am an assistant professor in the Department of Computer Science and Engineering at The Ohio State University. My research focuses on multi-agent interactions that arise in combating societal challenges, especially in the areas of conservation and public health. These interactions often involve uncertainty about the environment and about the agents' utility functions, which calls for approaches that can handle scarce data. To this end, I combine methodologies from game theory and multi-agent systems with machine learning, robust planning, and optimization techniques.
I am a research scientist at Facebook AI (FAIR) in NYC and study foundational topics in machine learning and optimization, recently involving reinforcement learning, control, optimal transport, and geometry. My research is on learning systems that understand and interact with our world and focuses on integrating structural information and domain knowledge into these systems to represent non-trivial reasoning operations. A key theme of my work in this space involves the use of optimization as a differentiable building block in larger architectures that are end-to-end learned.