Topics: Deep Learning on Graphs and Manifolds
Michael Bronstein (PhD 2007, Technion, Israel) is a professor at Imperial College London, where he holds the Chair in Machine Learning and Pattern Recognition and a Royal Society Wolfson Merit Award. He holds/has held visiting appointments at Stanford, Harvard, MIT, and TUM. Michael’s main research interest is in theoretical and computational methods for geometric data analysis. He is a Fellow of IEEE and IAPR, and an ACM Distinguished Speaker. He is the recipient of four ERC grants, two Google Faculty awards, and the 2018 Facebook Computational Social Science award. Besides academic work, Michael was a co-founder and technology executive at Novafora (2005-2009), developing large-scale video analysis methods, and one of the chief technologists at Invision (2009-2012), developing low-cost 3D sensors. Following the multi-million acquisition of Invision by Intel in 2012, Michael was one of the key developers of the Intel RealSense technology in the role of Principal Engineer. His most recent venture is Fabula AI, a startup dedicated to algorithmic detection of fake news using geometric deep learning.
In the past decade, deep learning methods have achieved unprecedented performance on a broad range of problems in various fields, from computer vision to speech recognition. So far, research has mainly focused on developing deep learning methods for Euclidean-structured data. However, many important applications have to deal with non-Euclidean structured data, such as graphs and manifolds. Such data are becoming increasingly important in computer graphics and 3D vision, sensor networks, drug design, biomedicine, high energy physics, recommendation systems, and social media analysis. Until recently, the adoption of deep learning in these fields has lagged behind, primarily because the non-Euclidean nature of the objects involved makes the very definition of the basic operations used in deep networks rather elusive. In this talk, I will introduce the emerging field of geometric deep learning on graphs and manifolds, overview existing solutions, and outline the key difficulties and future research directions. As examples of applications, I will show problems from the domains of computer vision, graphics, and fake news detection.
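To make the "basic operations" concrete, here is a minimal sketch (my own illustration, not code from the talk) of one graph convolution layer in the style popularised by Kipf and Welling; the symmetric normalisation scheme and the toy path graph are assumptions chosen for illustration:

```python
import numpy as np

def graph_conv(A, X, W):
    """One simple graph convolution layer: add self-loops, normalise the
    adjacency symmetrically (D^{-1/2} A D^{-1/2}), mix features, apply ReLU."""
    A_hat = A + np.eye(A.shape[0])               # adjacency with self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

# Toy 4-node path graph, 3-dimensional node features, 2 output channels.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 2))
H = graph_conv(A, X, W)   # new node embeddings, shape (4, 2)
```

The key point is that, unlike an image convolution, the "neighbourhood" here comes from the graph's adjacency rather than a regular grid.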
Topics: Constraint-Based Approaches to Machine Learning
Marco Gori received the Ph.D. degree in 1990 from Università di Bologna, Italy, while working partly as a visiting student at the School of Computer Science, McGill University – Montréal. In 1992, he became an associate professor of Computer Science at Università di Firenze and, in November 1995, he joined the Università di Siena, where he is currently full professor of computer science. His main interests are in machine learning, computer vision, and natural language processing. He was the leader of the WebCrow project, supported by Google, for the automatic solving of crosswords, which outperformed human competitors in an official competition within the ECAI-06 conference. He has just published the book “Machine Learning: A Constraint-Based Approach,” where you can find his view on the field.
He has been an Associate Editor of a number of journals in his area of expertise, including IEEE Transactions on Neural Networks and Neural Networks, and he has been the Chairman of the Italian Chapter of the IEEE Computational Intelligence Society and the President of the Italian Association for Artificial Intelligence. He is a fellow of ECCAI (EurAI, the European Coordinating Committee for Artificial Intelligence), a fellow of the IEEE, and of IAPR. He is in the list of top Italian scientists kept by VIA-Academy.
Learning over temporal environments can be formulated as a variational problem with subsidiary conditions. In particular, one can regard the neurons as individual constraints and minimize an appropriate functional risk under the satisfaction of the neural architectural constraints. A method for discovering the Lagrange multipliers is given, so that we recover Backpropagation in the case where the Lagrangian (loss) term depends only on the weights. More generally, the solution of the problem arises as a neural differential equation on the weights, which overcomes the well-known memory-space problem of Backpropagation Through Time. This suggests that classic perceptual problems in speech and vision can likely benefit from fully exploiting the central role of time coherence that arises from this formulation of learning.
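Schematically (the notation below is my own plausible reconstruction, not the speaker's exact formulation), the idea is a constrained variational problem over time, where each neuron contributes one architectural constraint:

```latex
\min_{w}\; \int_{0}^{T} L\big(x(t), w(t)\big)\,dt
\qquad \text{subject to} \qquad
x_i(t) - \sigma\!\big(w_i(t)^{\top} x(t)\big) = 0 \quad \forall i,
```

with the associated Lagrangian $\mathcal{L} = L + \sum_i \lambda_i \big(x_i - \sigma(w_i^{\top} x)\big)$. Stationarity conditions on $\mathcal{L}$ then yield differential equations on the weights, and Backpropagation is recovered in the special case the abstract describes.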
Topics: Kernel Methods to Reveal Properties and Relations in Data
Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit, CSML, UCL, which he joined in 2010. He received degrees in physics and systems engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He worked from 2002-2012 at the MPI for Biological Cybernetics, and from 2009-2010 at the Machine Learning Department, Carnegie Mellon University.
Arthur’s research interests in machine learning include kernel methods, statistical learning theory, nonparametric hypothesis testing, and generative modelling. He was an associate editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013, and has been an Action Editor for JMLR since April 2013, a member of the NeurIPS Program Committee in 2008 and 2009, a Senior Area Chair for NeurIPS in 2018, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013. Arthur was co-chair of AISTATS in 2016 (with Christian Robert), co-tutorials chair of ICML in 2018 (with Ruslan Salakhutdinov), co-workshop chair for ICML 2019 (with Honglak Lee), co-organiser of the Dali workshop in 2019 (with Krikamol Muandet and Shakir Mohamed), and co-organiser of MLSS 2019 in London (with Marc Deisenroth).
Generative adversarial networks (GANs) use neural networks as generative models, creating realistic samples that mimic real-life reference samples (for instance, images of faces, bedrooms, and more). These networks require an adaptive critic function while training, to teach the networks how to improve their samples to better match the reference data. I will describe a kernel divergence measure, the maximum mean discrepancy, which represents one such critic function. With gradient regularisation, the MMD is used to obtain strong performance on challenging image generation tasks, including 160 × 160 CelebA and 64 × 64 ImageNet. In addition to adversarial network training, I’ll discuss the challenge of benchmarking GAN performance.
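As a concrete illustration (my own sketch, not code from the talk), the maximum mean discrepancy between two samples can be estimated with a Gaussian kernel; the bandwidth choice and the toy Gaussian samples here are assumptions for illustration:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def mmd_squared(X, Y, sigma=1.0):
    """Biased estimator of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    k_xx = np.mean([gaussian_kernel(a, b, sigma) for a in X for b in X])
    k_yy = np.mean([gaussian_kernel(a, b, sigma) for a in Y for b in Y])
    k_xy = np.mean([gaussian_kernel(a, b, sigma) for a in X for b in Y])
    return k_xx + k_yy - 2.0 * k_xy

rng = np.random.default_rng(0)
# Two samples from the same distribution vs. two from shifted distributions.
same = mmd_squared(rng.normal(0, 1, (50, 2)), rng.normal(0, 1, (50, 2)))
diff = mmd_squared(rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2)))
```

The shifted pair yields a much larger discrepancy, which is why the MMD can act as a critic: the generator is trained to drive this quantity down.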
Topics: General Reinforcement Learning Algorithms
Arthur Guez is currently a research scientist at DeepMind since 2014, initially in London and now based in Montreal. He focuses on reinforcement learning methods and played a key role in the development of AlphaGo and its later generalizations. He studied for his PhD at the Gatsby Computational Neuroscience Unit (UCL, London) under the supervision of Peter Dayan and David Silver. Before that, he studied for his undergraduate and MSc at McGill University in Montreal, under the supervision of Joelle Pineau.
Carefully reasoning about the consequences of one’s actions is one of the hallmarks of intelligence. Traditionally, we distinguish model-free approaches, that gradually learn a reactive policy by trial-and-error, from model-based approaches that leverage an environment model to plan using more computation steps (e.g., by using the true model in AlphaGo/Zero). After introducing these concepts, I will talk about how that distinction can be blurred in different ways, for example with model-free agents that learn to plan without explicit models or with learned models being employed within a model-free architecture. I will discuss how that suggests new ways to obtain the desirable effects of planning (the foresight) while forgoing some of the classic prescribed mechanisms (e.g. state-based lookahead).
Topics: Multiobjective Optimization & Decision Analytics
Kaisa Miettinen is Professor of Industrial Optimization at the University of Jyvaskyla. Her research interests include the theory, methods, applications and software of nonlinear multiobjective optimization, including interactive and evolutionary approaches. She heads the Research Group on Industrial Optimization and is the director of the thematic research area called Decision Analytics utilizing Causal Models and Multiobjective Optimization (DEMO, www.jyu.fi/demo). She has authored over 170 refereed journal, proceedings and collection papers, edited 14 proceedings, collections and special issues, and written the monograph “Nonlinear Multiobjective Optimization.” She is a member of the Finnish Academy of Science and Letters, Section of Science, and the Immediate-Past President of the International Society on Multiple Criteria Decision Making (MCDM). She belongs to the editorial boards of eight international journals. She has previously worked at IIASA, the International Institute for Applied Systems Analysis in Austria, KTH Royal Institute of Technology in Stockholm, Sweden, and the Helsinki School of Economics, Finland. In 2017, she received the Georg Cantor Award of the International Society on MCDM for independent inquiry in developing innovative ideas in the theory and methodology.
Digitalization enables collecting and having access to different types of data. We can use descriptive analytics to understand the data or predictive analytics to make predictions. However, this does not necessarily make the most of the data. To make recommendations based on the data, we need prescriptive or decision analytics. We can fit models to the data and derive decision problems. If we have multiple conflicting objectives to be optimized, we get a multiobjective optimization problem to be solved to support decision making. In this talk, we discuss the different elements of a seamless chain from data to decision support.
Lot sizing is an example of a data-driven optimization problem, where a decision maker needs support in production planning and inventory management. We formulate a multiobjective optimization problem relying on stochastic demand forecasts based on the sales data available and solve it with interactive multiobjective optimization methods. In interactive methods, a decision maker directs the search for the best balance between the conflicting objectives by providing preference information. In this way, (s)he gains insight into the phenomena involved, learns what kinds of solutions are available, and learns about the feasibility of his/her preferences. We demonstrate how switching the interactive method can be beneficial in different stages of the solution process, by first applying a navigation-based and then a classification-based method.
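To make the notion of conflicting objectives concrete, here is a minimal sketch (my own illustration, not from the talk) of Pareto dominance and the non-dominated front for a toy bi-objective instance; the instance data are assumptions:

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b (all objectives minimised):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep only the non-dominated solutions."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o != s)]

# Toy instance: (production cost, delivery delay), both to be minimised.
sols = [(3, 5), (4, 4), (5, 1), (6, 6), (3, 6)]
front = pareto_front(sols)   # the trade-off solutions the DM chooses among
```

No solution on the front is best in both objectives at once, which is exactly why interactive methods ask the decision maker for preference information to pick among them.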
Topics: Intelligent Autonomous Systems, Robotics & Machine Learning
Jan Peters is a full professor (W3) for Intelligent Autonomous Systems at the Computer Science Department of the Technische Universitaet Darmstadt and, at the same time, a senior research scientist and group leader at the Max-Planck Institute for Intelligent Systems, where he heads the interdepartmental Robot Learning Group. Jan Peters has received the Dick Volz Best 2007 US PhD Thesis Runner-Up Award, the Robotics: Science & Systems – Early Career Spotlight, the INNS Young Investigator Award, and the IEEE Robotics & Automation Society’s Early Career Award, as well as numerous best paper awards. In 2015, he received an ERC Starting Grant, and in 2019 he was appointed as an IEEE Fellow.
Autonomous robots that can assist humans in situations of daily life have been a long-standing vision of robotics, artificial intelligence, and the cognitive sciences. A first step towards this goal is to create robots that can learn tasks triggered by environmental context or higher-level instruction. However, learning techniques have yet to live up to this promise, as only few methods manage to scale to high-dimensional manipulators or humanoid robots. In this talk, we investigate a general framework suitable for learning motor skills in robotics which is based on the principles behind many analytical robotics approaches. It involves generating a representation of motor skills by parameterized motor primitive policies acting as building blocks of movement generation, and a learned task execution module that transforms these movements into motor commands. We discuss learning on three different levels of abstraction: learning for accurate control is needed to execute movements, learning of motor primitives is needed to acquire simple movements, and learning the task-dependent “hyperparameters” of these motor primitives allows learning complex tasks. We discuss task-appropriate learning approaches for imitation learning, model learning and reinforcement learning for robots with many degrees of freedom. Empirical evaluations on several robot systems illustrate the effectiveness and applicability to learning control on an anthropomorphic robot arm. These robot motor skills range from toy examples (e.g., paddling a ball, ball-in-a-cup) to playing robot table tennis against a human being and manipulation of various objects.
Topics: Combinatorial Optimization & Heuristics
Principal Research Scientist at Amazon, Seattle.
Mauricio G. C. Resende grew up in Rio de Janeiro (BR), West Lafayette (IN-US), and Amherst (MA-US). He did his undergraduate training in electrical engineering (systems engineering concentration) at the Pontifical Catholic U. of Rio de Janeiro. He obtained an MS in operations research from Georgia Tech and a PhD in operations research from the U. of California, Berkeley. He is most known for his work with metaheuristics, in particular GRASP and biased random-key genetic algorithms, as well as for his work with interior point methods for linear programming and network flows. Dr. Resende has published over 200 papers on optimization and holds 15 U.S. patents. He has edited four handbooks, including the “Handbook of Heuristics,” the “Handbook of Applied Optimization,” and the “Handbook of Optimization in Telecommunications,” and is coauthor of the book “Optimization by GRASP.” He sits on the editorial boards of several optimization journals, including Networks, Discrete Optimization, J. of Global Optimization, R.A.I.R.O., and International Transactions in Operational Research.
Prior to joining Amazon.com in 2014 as a Principal Research Scientist in the transportation area, Dr. Resende was a Lead Inventive Scientist at the Mathematical Foundations of Computing Department of AT&T Bell Labs and at the Algorithms and Optimization Research Department of AT&T Labs Research in New Jersey. Since 2016, Dr. Resende has also been Affiliate Professor of Industrial and Systems Engineering at the University of Washington in Seattle. He is a Fellow of INFORMS.
A biased random-key genetic algorithm (BRKGA) is a general search metaheuristic for finding optimal or near-optimal solutions to hard combinatorial optimization problems. It is derived from the random-key genetic algorithm of Bean (1994), differing in the way solutions are combined to produce offspring. BRKGAs have three key features that specialize genetic algorithms: 1) a fixed chromosome encoding using a vector of N random keys or alleles over the real interval [0, 1), where the value of N depends on the instance of the optimization problem; 2) a well-defined evolutionary process adopting parameterized uniform crossover to generate offspring and thus evolve the population; 3) the introduction of new chromosomes called mutants in place of the mutation operator usually found in evolutionary algorithms. Such features simplify and standardize the metaheuristic into a set of self-contained tasks, of which only one is problem-dependent: that of decoding a chromosome, i.e., using the keys to construct a solution to the underlying optimization problem, from which the objective function value or fitness can be computed. In this talk we review the basic components of a BRKGA and introduce an Application Programming Interface (API) for quick implementations of BRKGA heuristics. We then apply the framework to a number of packing and layout problems, including 1) 2D and 3D constrained orthogonal packing, 2) 2D and 3D bin packing, and 3) unequal area facility layout. We conclude with a brief review of other domains where BRKGAs have been applied.
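As an illustration of the one problem-dependent task, here is a hedged sketch (my own, not the speaker's API) of a random-key decoder and parameterized uniform crossover for a toy single-machine sequencing problem; the instance data and the elite-inheritance probability rho = 0.7 are assumptions:

```python
import random

def decode_permutation(chromosome):
    """Decode a vector of random keys into a permutation:
    items are ordered by increasing key value (an argsort)."""
    return sorted(range(len(chromosome)), key=lambda i: chromosome[i])

def crossover(elite, non_elite, rho=0.7, rng=random):
    """Parameterized uniform crossover: each allele is inherited
    from the elite parent with probability rho."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]

# Toy instance: minimise total completion time of 4 jobs on one machine.
times = [4, 2, 7, 1]

def fitness(chrom):
    order = decode_permutation(chrom)   # the only problem-dependent step
    t = cost = 0
    for job in order:
        t += times[job]
        cost += t
    return cost

pop = [[random.random() for _ in times] for _ in range(10)]
pop.sort(key=fitness)                   # best (lowest-cost) chromosome first
child = crossover(pop[0], pop[-1])      # mate elite with non-elite
```

Everything except `fitness`/`decode_permutation` is problem-independent, which is exactly what makes a reusable BRKGA API possible.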
Dr. Richard E. Turner is a Reader in Machine Learning at the University of Cambridge and a Visiting Researcher at Microsoft Research Cambridge. His research fuses probabilistic machine learning and deep learning to develop robust, data-efficient, flexible and automated learning systems. Richard helps lead Cambridge’s renowned Machine Learning Group, the Machine Learning and Machine Intelligence MPhil, the Centre for Doctoral Training in AI for Environmental Risk, and the Cambridge Big Data Strategic Initiative. He studied for his PhD at the Gatsby Computational Neuroscience Unit at UCL and spent his Postdoctoral Fellowship at New York University in the Laboratory for Computational Vision. He has been awarded the Cambridge Students’ Union Teaching Award for Lecturing and his work has featured on BBC Radio 5 Live’s The Naked Scientist, BBC World Service’s Click and in Wired Magazine.
Past Keynote Speakers
The Keynote Speakers of the previous editions:
- Jörg Bornschein, DeepMind, London, UK
- Nello Cristianini, University of Bristol, UK
- Peter Flach, University of Bristol, UK, and EiC of the Machine Learning Journal
- Yi-Ke Guo, Imperial College London, UK
- George Karypis, University of Minnesota, USA
- Vipin Kumar, University of Minnesota, USA
- George Michailidis, University of Florida, USA
- Stephen Muggleton, Imperial College London, UK
- Panos Pardalos, University of Florida, USA
- Jun Pei, Hefei University of Technology, China
- Tomaso Poggio, MIT, USA
- Andrey Raygorodsky, Moscow Institute of Physics and Technology, Russia
- Ruslan Salakhutdinov, Carnegie Mellon University, USA, and AI Research at Apple
- Vincenzo Sciacca, Almawave, Italy
- My Thai, University of Florida, USA