Plenary lectures – iTWIST'20

Vincent Duval (INRIA Paris, France):
The BLASSO: continuous dictionaries for sparser reconstructions

In this talk, I will give an overview of the main properties of the Beurling LASSO (BLASSO), a sparse reconstruction method which has drawn a lot of attention since the pioneering works of De Castro and Gamboa, Bredies and Pikkarainen, Candès and Fernandez-Granda… The method consists in performing an analogue of ℓ¹ minimization in the space of Radon measures. Using this continuous framework instead of introducing an artificial finite grid for sparse recovery is not only relevant for modelling many physical problems but also provides interesting properties such as support stability, sparsity of the solutions, and efficient minimization algorithms.

Mark Plumbley (Centre for Vision, Speech and Signal Processing, University of Surrey, UK):
AI for Sound: From Independent Component Analysis and Sparse Representations to Deep Learning

Imagine you are standing on a street corner in a city. Close your eyes: what do you hear? Perhaps some cars and buses driving on the road, footsteps of people on the pavement, beeps from a pedestrian crossing, rustling and clonks from shopping bags and boxes, and the hubbub of talking shoppers. You can do the same in a kitchen as someone is making breakfast, or as you are travelling in a vehicle. Now, following the success of machine learning technologies for speech and image recognition, we are beginning to build computer systems to tackle this challenging task: to automatically recognize real-world sound scenes and events. In this talk, I will discuss some of the techniques and approaches that we have been using to analyze and recognize different types of sounds, including independent component analysis, nonnegative matrix factorization, sparse representations and deep learning. I will also discuss how we are using data challenges to help develop a community of researchers in recognition of real-world sound scenes and events, explore some of the work going on in this rapidly expanding research area, and touch on some of the key issues for the future, including privacy for sound sensors and the need for low-complexity models. We will discuss some of the potential applications emerging for sound recognition, from home security and assisted living to exploring sound archives, and we will close with some pointers to more information about this research area.

Christopher J Rozell (Georgia Institute of Technology, USA):
Leveraging low-dimensional models for human-in-the-loop machine learning tasks

While modern machine learning has made significant gains in focused tasks such as object recognition from images, it is clear that future advances in machine learning for more complex and subtle tasks will require richer human-machine interactions that must be as efficient and effective as possible. In this talk we will examine ways that the low-dimensional structure of natural data can be leveraged to enable performance improvements in three human-in-the-loop machine learning tasks. First, we will highlight work developing active learning approaches to posing relational queries to humans for tasks such as similarity learning and preference search. Second, we will demonstrate new manifold learning approaches based on generative models, where human input specifies the data invariances to be learned as identity-preserving transformations. Finally, we will show how dimensionality reduction in deep generative models can be used to explain to humans the behavior of black-box machine learning classifiers. Taken together, these examples will demonstrate the power of low-dimensional models in emerging human-in-the-loop machine learning tasks.

Karin Schnass (University of Innsbruck, Austria):
The landscape of dictionary learning

In this talk we will visit the landscape of dictionary learning via iterative thresholding and K residual means. For a given generating dictionary we will have a look at the basin of attraction, the regions of contraction, and spurious attractive points.

Time permitting, we will also discuss heuristics for escaping from spurious attractive points and jumping directly into the basin of attraction.

Irene Waldspurger (CNRS, University Paris-Dauphine, France):
Rank optimality for the Burer-Monteiro factorization

The Burer-Monteiro factorization is a classical heuristic used to speed up the solution of large-scale semidefinite programs when the solution is expected to be low rank: one writes the solution as the product of thinner matrices, and optimizes over the (low-dimensional) factors instead of over the full matrix. Even though the factorized problem is non-convex, one observes that standard first-order algorithms can often solve it to global optimality. This has been rigorously proved by Boumal, Voroninski and Bandeira, but only under the assumption that the factorization rank is large enough, larger than what numerical experiments suggest. We will describe this result, and investigate its optimality. More specifically, we will show that, up to a minor improvement, it is optimal: without additional hypotheses on the semidefinite problem at hand, first-order algorithms can fail if the factorization rank is smaller than predicted by current theory.

Rebecca Willett (University of Chicago, USA):
A function space view of overparameterized neural networks

Contrary to classical bias/variance tradeoffs, deep learning practitioners have observed that vastly overparameterized neural networks with the capacity to fit virtually any labels nevertheless generalize well when trained on real data. One possible explanation of this phenomenon is that complexity control is being achieved by implicitly or explicitly controlling the magnitude of the weights of the network. This raises the question: What functions are well-approximated by neural networks whose weights are bounded in norm? In this talk, I will give some partial answers to this question. In particular, I will give a precise characterization of the space of functions realizable as a two-layer (i.e., one hidden layer) neural network with ReLU activations having an unbounded number of units, but where the Euclidean norm of the weights in the network remains bounded. Surprisingly, this characterization is naturally posed in terms of the Radon transform as used in computational imaging, and I will show how tools from Radon transform analysis yield novel insights about learning with two- and three-layer ReLU networks. This is joint work with Greg Ongie, Daniel Soudry, and Nati Srebro.

David Wipf (Visual Computing Group, Microsoft Research, Beijing, China):
On the underappreciated role of sparsity in deep variational autoencoder models

This talk will trace the progression of Bayesian-inspired models for finding low-dimensional structure in data, from simple frameworks like robust PCA and Bayesian compressive sensing, to more complex heirs such as the variational autoencoder (VAE). The latter represents a popular, flexible form of deep generative model that can be stochastically fit to observed samples from a given random process using a variational bound on the underlying log-likelihood. Although originally motivated as a way of generating new samples that approximate an unknown distribution, the VAE can also be leveraged to find low-dimensional manifold structure in training data.

Despite the lack of a canonical sparsity-promoting penalty as commonly adopted by classical methods, I will highlight how parsimony naturally emerges from the VAE and its predecessors, often with distinct provable advantages over deterministic alternatives. For example, subtle mechanisms will be discussed that allow such models to robustly dismiss outliers and smooth away bad local minima, all while adapting to an unknown inlier manifold of arbitrary dimension. As a byproduct of this process, in certain settings the VAE in particular can also generate realistic samples that mirror the data distribution within such manifolds, devoid of outliers.