Topics: Deep Learning on Graphs and Manifolds
Michael Bronstein (PhD 2007, Technion, Israel) is a professor at Imperial College London, where he holds the Chair in Machine Learning and Pattern Recognition and a Royal Society Wolfson Merit Award. He holds/has held visiting appointments at Stanford, Harvard, MIT, and TUM. Michael’s main research interest is in theoretical and computational methods for geometric data analysis. He is a Fellow of IEEE and IAPR, and an ACM Distinguished Speaker. He is the recipient of four ERC grants, two Google Faculty awards, and the 2018 Facebook Computational Social Science award. Besides academic work, Michael was a co-founder and technology executive at Novafora (2005-2009), developing large-scale video analysis methods, and one of the chief technologists at Invision (2009-2012), developing low-cost 3D sensors. Following the multi-million acquisition of Invision by Intel in 2012, Michael was one of the key developers of the Intel RealSense technology in the role of Principal Engineer. His most recent venture is Fabula AI, a startup dedicated to algorithmic detection of fake news using geometric deep learning.
In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in various fields, from computer vision to speech recognition. So far, research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean structured data, such as graphs and manifolds. Such data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, high energy physics, recommendation systems, and social media analysis. Until recently, the adoption of deep learning in these fields has lagged behind, primarily because the non-Euclidean nature of the objects involved makes the very definition of the basic operations used in deep networks rather elusive. In this talk, I will introduce the emerging field of geometric deep learning on graphs and manifolds, overview existing solutions, and outline the key difficulties and future research directions. As examples of applications, I will show problems from the domains of computer vision, graphics, and fake news detection.
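To make the "basic operations" concrete, here is a minimal sketch (my own illustration, not code from the talk) of one graph convolution layer in the style popularised by Kipf and Welling; the symmetric normalisation scheme and the toy path graph are assumptions chosen for illustration:

```python
import numpy as np

def graph_conv(A, X, W):
    """One simple graph convolution layer: add self-loops, normalise the
    adjacency symmetrically (D^{-1/2} A D^{-1/2}), mix features, apply ReLU."""
    A_hat = A + np.eye(A.shape[0])               # adjacency with self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

# Toy 4-node path graph, 3-dimensional node features, 2 output channels.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 2))
H = graph_conv(A, X, W)   # new node embeddings, shape (4, 2)
```

The key point is that, unlike an image convolution, the "neighbourhood" here comes from the graph's adjacency rather than a regular grid.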
Topics: Constraint-Based Approaches to Machine Learning
Marco Gori received the Ph.D. degree in 1990 from Università di Bologna, Italy, while working partly as a visiting student at the School of Computer Science, McGill University – Montréal. In 1992, he became an associate professor of Computer Science at Università di Firenze and, in November 1995, he joined the Università di Siena, where he is currently full professor of computer science. His main interests are in machine learning, computer vision, and natural language processing. He was the leader of the WebCrow project, supported by Google, for the automatic solving of crosswords, which outperformed human competitors in an official competition within the ECAI-06 conference. He has just published the book “Machine Learning: A Constraint-Based Approach,” where you can find his view on the field.
He has been an Associate Editor of a number of journals in his area of expertise, including IEEE Transactions on Neural Networks and Neural Networks, and he has been the Chairman of the Italian Chapter of the IEEE Computational Intelligence Society and the President of the Italian Association for Artificial Intelligence. He is a fellow of ECCAI (EurAI, the European Coordinating Committee for Artificial Intelligence), a fellow of the IEEE, and of IAPR. He is in the list of top Italian scientists kept by VIA-Academy.
Learning over temporal environments can be formulated as a variational problem with subsidiary conditions. In particular, one can regard the neurons as individual constraints and minimize an appropriate functional risk under the satisfaction of the neural architectural constraints. A method for discovering the Lagrange multipliers is given, so that we recover Backpropagation in the case where the Lagrangian (loss) term depends only on the weights. More generally, the solution of the problem arises as a neural differential equation on the weights, which overcomes the well-known memory-space problem of Backpropagation Through Time. This suggests that classic perceptual problems in speech and vision can likely benefit from fully exploiting the central role of time coherence that arises from this formulation of learning.
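Schematically (the notation below is my own plausible reconstruction, not the speaker's exact formulation), the idea is a constrained variational problem over time, where each neuron contributes one architectural constraint:

```latex
\min_{w}\; \int_{0}^{T} L\big(x(t), w(t)\big)\,dt
\qquad \text{subject to} \qquad
x_i(t) - \sigma\!\big(w_i(t)^{\top} x(t)\big) = 0 \quad \forall i,
```

with the associated Lagrangian $\mathcal{L} = L + \sum_i \lambda_i \big(x_i - \sigma(w_i^{\top} x)\big)$. Stationarity conditions on $\mathcal{L}$ then yield differential equations on the weights, and Backpropagation is recovered in the special case the abstract describes.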
Topics: Kernel Methods to Reveal Properties and Relations in Data
Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit, CSML, UCL, which he joined in 2010. He received degrees in physics and systems engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He worked from 2002-2012 at the MPI for Biological Cybernetics, and from 2009-2010 at the Machine Learning Department, Carnegie Mellon University.
Arthur’s research interests in machine learning include kernel methods, statistical learning theory, nonparametric hypothesis testing, and generative modelling. He was an associate editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013, and has been an Action Editor for JMLR since April 2013, a member of the NeurIPS Program Committee in 2008 and 2009, a Senior Area Chair for NeurIPS in 2018, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013. Arthur was co-chair of AISTATS in 2016 (with Christian Robert), co-tutorials chair of ICML in 2018 (with Ruslan Salakhutdinov), co-workshop chair for ICML 2019 (with Honglak Lee), co-organiser of the Dali workshop in 2019 (with Krikamol Muandet and Shakir Mohamed), and co-organiser of MLSS 2019 in London (with Marc Deisenroth).
Generative adversarial networks (GANs) use neural networks as generative models, creating realistic samples that mimic real-life reference samples (for instance, images of faces, bedrooms, and more). These networks require an adaptive critic function while training, to teach the networks how to improve their samples to better match the reference data. I will describe a kernel divergence measure, the maximum mean discrepancy, which represents one such critic function. With gradient regularisation, the MMD is used to obtain strong performance on challenging image generation tasks, including 160 × 160 CelebA and 64 × 64 ImageNet. In addition to adversarial network training, I’ll discuss the challenge of benchmarking GAN performance.
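As a concrete illustration (my own sketch, not code from the talk), the maximum mean discrepancy between two samples can be estimated with a Gaussian kernel; the bandwidth choice and the toy Gaussian samples here are assumptions for illustration:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def mmd_squared(X, Y, sigma=1.0):
    """Biased estimator of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    k_xx = np.mean([gaussian_kernel(a, b, sigma) for a in X for b in X])
    k_yy = np.mean([gaussian_kernel(a, b, sigma) for a in Y for b in Y])
    k_xy = np.mean([gaussian_kernel(a, b, sigma) for a in X for b in Y])
    return k_xx + k_yy - 2.0 * k_xy

rng = np.random.default_rng(0)
# Two samples from the same distribution vs. two from shifted distributions.
same = mmd_squared(rng.normal(0, 1, (50, 2)), rng.normal(0, 1, (50, 2)))
diff = mmd_squared(rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2)))
```

The shifted pair yields a much larger discrepancy, which is why the MMD can act as a critic: the generator is trained to drive this quantity down.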
Topics: General Reinforcement Learning Algorithms
Arthur Guez is currently a research scientist at DeepMind since 2014, initially in London and now based in Montreal. He focuses on reinforcement learning methods and played a key role in the development of AlphaGo and its later generalizations. He studied for his PhD at the Gatsby Computational Neuroscience Unit (UCL, London) under the supervision of Peter Dayan and David Silver. Before that, he studied for his undergraduate and MSc at McGill University in Montreal, under the supervision of Joelle Pineau.
Carefully reasoning about the consequences of one’s actions is one of the hallmarks of intelligence. Traditionally, we distinguish model-free approaches, that gradually learn a reactive policy by trial-and-error, from model-based approaches that leverage an environment model to plan using more computation steps (e.g., by using the true model in AlphaGo/Zero). After introducing these concepts, I will talk about how that distinction can be blurred in different ways, for example with model-free agents that learn to plan without explicit models or with learned models being employed within a model-free architecture. I will discuss how that suggests new ways to obtain the desirable effects of planning (the foresight) while forgoing some of the classic prescribed mechanisms (e.g. state-based lookahead).
Topics: Multiobjective Optimization & Decision Analytics
Kaisa Miettinen is Professor of Industrial Optimization at the University of Jyvaskyla. Her research interests include the theory, methods, applications and software of nonlinear multiobjective optimization, including interactive and evolutionary approaches. She heads the Research Group on Industrial Optimization and is the director of the thematic research area called Decision Analytics utilizing Causal Models and Multiobjective Optimization (DEMO, www.jyu.fi/demo). She has authored over 170 refereed journal, proceedings and collection papers, edited 14 proceedings, collections and special issues, and written the monograph “Nonlinear Multiobjective Optimization.” She is a member of the Finnish Academy of Science and Letters, Section of Science, and the Immediate-Past President of the International Society on Multiple Criteria Decision Making (MCDM). She belongs to the editorial boards of eight international journals. She has previously worked at IIASA, the International Institute for Applied Systems Analysis in Austria, KTH Royal Institute of Technology in Stockholm, Sweden, and the Helsinki School of Economics, Finland. In 2017, she received the Georg Cantor Award of the International Society on MCDM for independent inquiry in developing innovative ideas in the theory and methodology.
Digitalization enables collecting and having access to different types of data. We can use descriptive analytics to understand the data or predictive analytics to make predictions. However, this does not necessarily make the most of the data. To make recommendations based on the data, we need prescriptive or decision analytics. We can fit models to the data and derive decision problems. If we have multiple conflicting objectives to be optimized, we get a multiobjective optimization problem to be solved to support decision making. In this talk, we discuss the different elements of a seamless chain from data to decision support.
Lot sizing is an example of a data-driven optimization problem, where a decision maker needs support in production planning and inventory management. We formulate a multiobjective optimization problem relying on stochastic demand forecasts based on the sales data available and solve it with interactive multiobjective optimization methods. In interactive methods, a decision maker directs the search for the best balance between the conflicting objectives by providing preference information. In this way, (s)he gains insight into the phenomena involved, learns what kinds of solutions are available, and learns about the feasibility of his/her preferences. We demonstrate how switching the interactive method can be beneficial in different stages of the solution process, by first applying a navigation-based and then a classification-based method.
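To make the notion of conflicting objectives concrete, here is a minimal sketch (my own illustration, not from the talk) of Pareto dominance and the non-dominated front for a toy bi-objective instance; the instance data are assumptions:

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (all objectives minimised):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only the non-dominated solutions."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o != s)]

# Toy instance: (production cost, delivery delay), both to be minimised.
sols = [(3, 5), (4, 4), (5, 1), (6, 6), (3, 6)]
front = pareto_front(sols)   # the trade-off solutions the DM chooses among
```

No solution on the front is best in both objectives at once, which is exactly why interactive methods ask the decision maker for preference information to pick among them.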
Topics: Intelligent Autonomous Systems, Robotics & Machine Learning
Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt and, at the same time, a senior research scientist and group leader at the Max-Planck Institute for Intelligent Systems, where he heads the interdepartmental Robot Learning Group. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems – Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society’s Early Career Award, as well as numerous best paper awards. In 2015, he received an ERC Starting Grant, and in 2019 he was appointed as an IEEE Fellow.
Autonomous robots that can assist humans in situations of daily life have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. A first step towards this goal is to create robots that can learn tasks triggered by environmental context or higher-level instruction. However, learning techniques have yet to live up to this promise, as only few methods manage to scale to high-dimensional manipulators or humanoid robots. In this talk, we investigate a general framework suitable for learning motor skills in robotics which is based on the principles behind many analytical robotics approaches. It involves generating a representation of motor skills by parameterized motor primitive policies acting as building blocks of movement generation, and a learned task execution module that transforms these movements into motor commands. We discuss learning on three different levels of abstraction: learning for accurate control is needed to execute movements, learning of motor primitives is needed to acquire simple movements, and learning the task-dependent “hyperparameters” of these motor primitives allows learning complex tasks. We discuss task-appropriate learning approaches for imitation learning, model learning and reinforcement learning for robots with many degrees of freedom. Empirical evaluations on several robot systems illustrate the effectiveness and applicability to learning control on an anthropomorphic robot arm. These robot motor skills range from toy examples (e.g., paddling a ball, ball-in-a-cup) to playing robot table tennis against a human being and manipulation of various objects.
Topics: Combinatorial Optimization & Heuristics
Principal Research Scientist at Amazon, Seattle.
Mauricio G. C. Resende grew up in Rio de Janeiro (BR), West Lafayette (IN-US), and Amherst (MA-US). He did his undergraduate training in electrical engineering (systems engineering concentration) at the Pontifical Catholic U. of Rio de Janeiro. He obtained an MS in operations research from Georgia Tech and a PhD in operations research from the U. of California, Berkeley. He is most known for his work with metaheuristics, in particular GRASP and biased random-key genetic algorithms, as well as for his work with interior point methods for linear programming and network flows. Dr. Resende has published over 200 papers on optimization and holds 15 U.S. patents. He has edited four handbooks, including the “Handbook of Heuristics,” the “Handbook of Applied Optimization,” and the “Handbook of Optimization in Telecommunications,” and is coauthor of the book “Optimization by GRASP.” He sits on the editorial boards of several optimization journals, including Networks, Discrete Optimization, J. of Global Optimization, R.A.I.R.O., and International Transactions in Operational Research.
Prior to joining Amazon.com in 2014 as a Principal Research Scientist in the transportation area, Dr. Resende was a Lead Inventive Scientist at the Mathematical Foundations of Computing Department of AT&T Bell Labs and at the Algorithms and Optimization Research Department of AT&T Labs Research in New Jersey. Since 2016, Dr. Resende has also been Affiliate Professor of Industrial and Systems Engineering at the University of Washington in Seattle. He is a Fellow of INFORMS.
A biased random-key genetic algorithm (BRKGA) is a general search metaheuristic for finding optimal or near-optimal solutions to hard combinatorial optimization problems. It is derived from the random-key genetic algorithm of Bean (1994), differing in the way solutions are combined to produce offspring. BRKGAs have three key features that specialize genetic algorithms: 1) a fixed chromosome encoding using a vector of N random keys or alleles over the real interval [0, 1), where the value of N depends on the instance of the optimization problem; 2) a well-defined evolutionary process adopting parameterized uniform crossover to generate offspring and thus evolve the population; 3) the introduction of new chromosomes called mutants in place of the mutation operator usually found in evolutionary algorithms. Such features simplify and standardize the metaheuristic into a set of self-contained tasks, of which only one is problem-dependent: that of decoding a chromosome, i.e., using the keys to construct a solution to the underlying optimization problem, from which the objective function value or fitness can be computed. In this talk we review the basic components of a BRKGA and introduce an Application Programming Interface (API) for quick implementations of BRKGA heuristics. We then apply the framework to a number of packing and layout problems, including 1) 2D and 3D constrained orthogonal packing, 2) 2D and 3D bin packing, and 3) unequal area facility layout. We conclude with a brief review of other domains where BRKGAs have been applied.
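As an illustration of the one problem-dependent task, here is a hedged sketch (my own, not the speaker's API) of a random-key decoder and parameterized uniform crossover for a toy single-machine sequencing problem; the instance data and the elite-inheritance probability rho = 0.7 are assumptions:

```python
import random

def decode_permutation(chromosome):
    """Decode a vector of random keys into a permutation:
    items are ordered by increasing key value (an argsort)."""
    return sorted(range(len(chromosome)), key=lambda i: chromosome[i])

def crossover(elite, non_elite, rho=0.7, rng=random):
    """Parameterized uniform crossover: each allele is inherited
    from the elite parent with probability rho."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]

# Toy instance: minimise total completion time of 4 jobs on one machine.
times = [4, 2, 7, 1]

def fitness(chrom):
    order = decode_permutation(chrom)   # the only problem-dependent step
    t = cost = 0
    for job in order:
        t += times[job]
        cost += t
    return cost

pop = [[random.random() for _ in times] for _ in range(10)]
pop.sort(key=fitness)                   # best (lowest-cost) chromosome first
child = crossover(pop[0], pop[-1])      # mate elite with non-elite
```

Everything except `fitness`/`decode_permutation` is problem-independent, which is exactly what makes a reusable BRKGA API possible.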
Dr. Richard E. Turner is a Reader in Machine Learning at the University of Cambridge and a Visiting Researcher at Microsoft Research Cambridge. His research fuses probabilistic machine learning and deep learning to develop robust, data-efficient, flexible and automated learning systems. Richard helps lead Cambridge’s renowned Machine Learning Group, the Machine Learning and Machine Intelligence MPhil, the Centre for Doctoral Training in AI for Environmental Risk, and the Cambridge Big Data Strategic Initiative. He studied for his PhD at the Gatsby Computational Neuroscience Unit at UCL and spent his Postdoctoral Fellowship at New York University in the Laboratory for Computational Vision. He has been awarded the Cambridge Students’ Union Teaching Award for Lecturing and his work has featured on BBC Radio 5 Live’s The Naked Scientist, BBC World Service’s Click and in Wired Magazine.
Past Keynote Speakers
The Keynote Speakers of the previous editions:
- Jörg Bornschein, DeepMind, London, UK
- Nello Cristianini, University of Bristol, UK
- Peter Flach, University of Bristol, UK, and EiC of the Machine Learning Journal
- Yi-Ke Guo, Imperial College London, UK
- George Karypis, University of Minnesota, USA
- Vipin Kumar, University of Minnesota, USA
- George Michailidis, University of Florida, USA
- Stephen Muggleton, Imperial College London, UK
- Panos Pardalos, University of Florida, USA
- Jun Pei, Hefei University of Technology, China
- Tomaso Poggio, MIT, USA
- Andrey Raygorodsky, Moscow Institute of Physics and Technology, Russia
- Ruslan Salakhutdinov, Carnegie Mellon University, USA, and AI Research at Apple
- Vincenzo Sciacca, Almawave, Italy
- My Thai, University of Florida, USA