
Abstracts

On Demystifying Adversarial Learning

Lawrence Carin, Duke University
Friday, August 28, 3:30 - 4:30 p.m. EDT

There has been significant recent interest within the machine learning community in adversarial learning. In this setup an “actor,” or model, seeks to synthesize samples that are similar to those in the training set, while a “critic” seeks to distinguish the synthesized samples from the real data. The actor and critic play an adversarial “game,” and if this game is run effectively, highly realistic samples are produced. In this talk we will derive such an adversarial learning setup from statistical first principles, and via this foundational perspective we gain insight into the failure mechanisms of such learning and develop methods to mitigate them. We also demonstrate adversarial learning across a diverse set of applications.
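
As a concrete illustration of the actor/critic game described above, here is a minimal GAN-style training sketch on toy 2-D data (the network sizes, learning rates, and synthetic data distribution are assumptions for illustration, not details from the talk; the actor and critic correspond to the generator and discriminator of a GAN):

```python
import torch
import torch.nn as nn

# Actor synthesizes 2-D samples from noise; critic scores real vs. synthetic.
actor = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
critic = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

real = torch.randn(64, 2) * 0.5 + 2.0  # stand-in for the training set

for step in range(200):
    # Critic step: distinguish real samples (label 1) from synthesized (label 0).
    fake = actor(torch.randn(64, 8)).detach()
    loss_c = bce(critic(real), torch.ones(64, 1)) + \
             bce(critic(fake), torch.zeros(64, 1))
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    # Actor step: try to make the critic label synthesized samples as real.
    fake = actor(torch.randn(64, 8))
    loss_a = bce(critic(fake), torch.ones(64, 1))
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
```

Alternating the two updates is the adversarial “game”; when it is balanced well, the actor's samples become hard to distinguish from the real data.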


Distributed Machine Learning

Heng Huang, University of Pittsburgh
Friday, September 4, 10:30 - 11:30 a.m. EDT

Machine learning is gaining fresh momentum and has helped us enhance not only many industrial and professional processes but also our everyday lives. The recent success of machine learning relies heavily on the surge of big data, big models, and big computing. However, inefficient algorithms restrict the application of machine learning to big data mining tasks. In terms of big data, serious concerns such as communication overhead and data privacy must be rigorously addressed when we train models using large amounts of data located on multiple devices. In terms of big models, training a model that is too big to fit on a single device remains an underexplored research area. To address these challenging problems, we have focused on designing new large-scale machine learning models and efficient optimization and training methods for big data mining, with new findings in both theory and applications.

For the challenges raised by big data, we proposed several new asynchronous distributed stochastic gradient descent and coordinate descent methods for efficiently solving convex and non-convex problems. We also designed new large-batch training methods for deep learning models that significantly reduce computation time while achieving better generalization performance. For the challenges raised by big models, we scaled up deep learning models by parallelizing the layer-wise computations with a theoretical guarantee; this is the first algorithm to break the locking imposed by the backpropagation mechanism, so that large-scale deep learning models can be dramatically accelerated.
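
To make the asynchronous-update idea concrete, here is a minimal Hogwild!-style sketch in which several workers apply stochastic gradient steps to a shared parameter vector without locking (the least-squares model, worker count, and step size are assumptions for illustration; Python's GIL serializes the threads, but the lock-free, possibly-stale update pattern is the point):

```python
import threading
import numpy as np

# Shared least-squares problem: recover w_true from noisy observations.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.01 * rng.normal(size=1000)

w = np.zeros(5)  # shared parameters, updated by all workers without a lock

def worker(seed, steps=2000, lr=0.01, batch=8):
    r = np.random.default_rng(seed)
    for _ in range(steps):
        idx = r.integers(0, len(X), size=batch)
        grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch  # may read a stale w
        w[:] -= lr * grad  # in-place asynchronous update on the shared vector

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("recovered weights:", np.round(w, 2))  # close to [1, 2, 3, 4, 5]
```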

A Representational Model of Grid Cells Based on Matrix Lie Algebras

Ying Nian Wu, UCLA
Friday, September 11, 10:30 - 11:30 a.m. EDT

A key perspective of deep learning is representation learning, where concepts are embedded in a latent space and represented by latent vectors whose units can be interpreted as neurons. In this talk, I will explain a representational model for navigation in the mammalian brain that involves grid cells and place cells. The grid cells in the mammalian medial entorhinal cortex exhibit striking hexagonal firing patterns when the agent (e.g., a rat or a human) navigates in an open field. It is hypothesized that the grid cells are involved in path integration, whereby the agent keeps track of its self-position by accumulating its self-motion. Assuming the grid cells form a vector representation of self-position, we elucidate a minimally simple recurrent model for path integration, which models the change of the vector representation given the self-motion, and we uncover two matrix Lie algebras and their Lie groups that are naturally coupled together. This enables us to connect the path integration model to the dimension-reduction model for place cells via the group representation theory of harmonic analysis. By reconstructing the kernel functions for place cells, our model learns hexagonal grid patterns that characterize the grid cells. The learned model is capable of near-perfect path integration, and it is also capable of error correction. Joint work with Ruiqi Gao, Jianwen Xie, and Song-Chun Zhu.
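
As a toy version of the coupling between a vector representation of self-position and a matrix Lie group acting on it, the sketch below encodes position in a few 2-D rotation blocks and applies self-motion through matrix exponentials of skew-symmetric generators (the block structure and frequencies are illustrative assumptions, not the learned model from the talk):

```python
import numpy as np
from scipy.linalg import expm

# Lie algebra generator of 2-D rotations; one block per (freq_x, freq_y) pair.
J = np.array([[0.0, -1.0], [1.0, 0.0]])
freqs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def M(dx, dy):
    """Group element exp((wx*dx + wy*dy) * J), block-diagonal over frequencies."""
    n = 2 * len(freqs)
    out = np.zeros((n, n))
    for i, (wx, wy) in enumerate(freqs):
        out[2*i:2*i+2, 2*i:2*i+2] = expm((wx * dx + wy * dy) * J)
    return out

v = np.tile([1.0, 0.0], len(freqs))           # vector representation of the origin
path = [(0.1, 0.0), (0.0, 0.2), (-0.1, 0.1)]  # sequence of self-motions
for dx, dy in path:
    v = M(dx, dy) @ v                          # accumulate self-motion

# v now encodes the net displacement (0.0, 0.3): each block's phase is w . x.
```

Because each block rotates by the angle w . dx, the accumulated phase after a path is w . x, so the final vector depends only on the net displacement, which is exactly the path-integration property.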


Integrating Domain-Knowledge into Deep Learning

Ruslan Salakhutdinov, Carnegie Mellon University
Friday, September 18, 3:30 - 4:30 p.m. EDT

I will first discuss deep learning models that can find semantically meaningful representations of words and learn to read documents and answer questions about their content. I will introduce methods that augment neural representations of text with structured data from Knowledge Bases (KBs) for question answering, and show how we can answer complex multi-hop questions using a text corpus as a virtual KB. In the second part of the talk, I will show how we can design modular hierarchical reinforcement learning agents for visual navigation that carry out tasks specified by natural language instructions, explore efficiently, plan over long horizons, and learn to build a map of the environment, while generalizing across domains and tasks.
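
As a minimal illustration of treating a text corpus as a virtual KB for multi-hop question answering (the triples, entities, and hop rule below are hypothetical stand-ins for the learned extraction and retrieval components):

```python
# Triples that, in the real system, would be read off a text corpus by a
# learned reader rather than listed by hand.
triples = [
    ("purdue", "located_in", "west_lafayette"),
    ("west_lafayette", "located_in", "indiana"),
    ("indiana", "located_in", "united_states"),
]

def hop(entities, relation):
    """Follow one relation from a set of entities to their neighbors."""
    return {o for s, r, o in triples if r == relation and s in entities}

# "In which state is Purdue?" resolved as two located_in hops from {purdue}.
answers = hop(hop({"purdue"}, "located_in"), "located_in")
print(answers)  # {'indiana'}
```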
