Bayesian optimization

Bayesian optimization is a sequential design strategy for global optimization of black-box functions[1] that doesn't require derivatives.

History

The term is generally attributed to Jonas Mockus and is coined in his work from a series of publications on global optimization in the 1970s and 1980s.[2][3][4]

Strategy

Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a prior over it. The prior captures our beliefs about the behaviour of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the posterior distribution over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be.

Examples

Examples of acquisition functions include probability of improvement, expected improvement, Bayesian expected losses, upper confidence bounds (UCB), Thompson sampling and mixtures of these.[5] They all trade-off exploration and exploitation so as to minimize the number of function queries. As such, Bayesian optimization is well suited for functions that are very expensive to evaluate.

Solution methods

The maximum of the acquisition function is typically found by resorting to discretization or by means of an auxiliary optimizer.

Applications

The approach has been applied to solve a wide range of problems,[6] including learning to rank,[7] interactive animation,[8] robotics,[9][10][11] sensor networks,[12][13] automatic algorithm configuration,[14] automatic machine learning toolboxes,[15][16][17] reinforcement learning, planning, visual attention, architecture configuration in deep learning, etc.

See also

References

  1. Jonas Mockus (2013). Bayesian approach to global optimization: theory and applications. Kluwer Academic.
  2. Jonas Mockus: On Bayesian Methods for Seeking the Extremum. Optimization Techniques 1974: 400-404
  3. Jonas Mockus: On Bayesian Methods for Seeking the Extremum and their Application. IFIP Congress 1977: 195-200
  4. J. Mockus, Bayesian Approach to Global Optimization. Kluwer Academic Publishers, Dordrecht, 1989
  5. Matthew W. Hoffman, Eric Brochu, Nando de Freitas: Portfolio Allocation for Bayesian Optimization. Uncertainty in Artificial Intelligence (UAI): 327–336 (2011)
  6. Eric Brochu, Vlad M. Cora, Nando de Freitas: A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. CoRR abs/1012.2599 (2010)
  7. Eric Brochu, Nando de Freitas, Abhijeet Ghosh: Active Preference Learning with Discrete Choice Data. NIPS 2007
  8. Eric Brochu, Tyson Brochu, Nando de Freitas: A Bayesian Interactive Optimization Approach to Procedural Animation Design. Symposium on Computer Animation 2010: 103–112
  9. Daniel J. Lizotte, Tao Wang, Michael H. Bowling, Dale Schuurmans: Automatic Gait Optimization with Gaussian Process Regression. IJCAI 2007: 944–949
  10. Ruben Martinez-Cantin, Nando de Freitas, Eric Brochu, Jose Castellanos and Arnaud Doucet. A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot. Autonomous Robots. Volume 27, Issue 2, pp 93–103 (2009)
  11. Scott Kuindersma, Roderic Grupen, and Andrew Barto. Variable Risk Control via Stochastic Optimization. International Journal of Robotics Research, volume 32, number 7, pp 806–825 (2013)
  12. Niranjan Srinivas, Andreas Krause, Sham M. Kakade, Matthias W. Seeger: Information-Theoretic Regret Bounds for Gaussian Process Optimization in the Bandit Setting. IEEE Transactions on Information Theory 58(5):3250–3265 (2012)
  13. Roman Garnett, Michael A. Osborne, Stephen J. Roberts: Bayesian optimization for sensor set selection. IPSN 2010: 209–219
  14. Frank Hutter, Holger Hoos, and Kevin Leyton-Brown (2011). Sequential model-based optimization for general algorithm configuration, Learning and Intelligent Optimization
  15. J. Bergstra, D. Yamins, D. D. Cox (2013). Hyperopt: A Python Library for Optimizing the Hyperparameters of Machine Learning Algorithms. Proc. SciPy 2013.
  16. Chris Thornton, Frank Hutter, Holger H. Hoos, Kevin Leyton-Brown: Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. KDD 2013: 847–855
  17. Jasper Snoek, Hugo Larochelle and Ryan Prescott Adams. Practical Bayesian Optimization of Machine Learning Algorithms. Neural Information Processing Systems, 2012

External links

This article is issued from Wikipedia - version of the 9/10/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.