Channel: Optimization – Machine Learning Research Blog
Browsing latest articles


Information theory with kernel methods

In last month's blog post, I presented the von Neumann entropy. It is defined as a spectral function on positive semi-definite (PSD) matrices, and leads to a Bregman divergence called the von Neumann...




Rethinking SGD’s noise

It seemed a bit unfair to devote a blog to machine learning (ML) without talking about its current core algorithm: stochastic gradient descent (SGD). Indeed, SGD has become, year after year, the basic...




Rethinking SGD’s noise – II: Implicit Bias

In the previous post, we showed (or at least tried to!) how the inherent noise of the stochastic gradient descent algorithm (SGD), in the context of modern overparametrised architectures, is...



Sums-of-squares for dummies: a view from the Fourier domain

Over the last two years, I have been intensively studying sum-of-squares relaxations for optimization, learning a lot from many great research papers [1, 2], review papers [3], books [4, 5, 6, 7, 8],...



Discrete, continuous and continuized accelerations

In optimization, acceleration is the art of modifying an algorithm in order to obtain faster convergence. Building accelerations and explaining their performance have been the subject of a countless...




Non-convex quadratic optimization problems

Among continuous optimization problems, convex problems (with convex objectives and convex constraints) define a class that can be solved efficiently with a variety of algorithms and with arbitrary...



Revisiting the classics: Jensen’s inequality

There are a few mathematical results that any researcher in applied mathematics uses on a daily basis. One of them is Jensen’s inequality, which allows bounding expectations of functions of random...



Unraveling spectral properties of kernel matrices – I

Since my early PhD years, I have plotted and studied eigenvalues of kernel matrices. In the simplest setting, take independent and identically distributed (i.i.d.) data, such as in the cube below in 2...




Scaling laws of optimization

Scaling laws have been one of the key achievements of theoretical analysis in various fields of applied mathematics and computer science, answering the following key question: How fast does my method...




My book is (at last) out!

Just in time for Christmas, I received the first hard copies of my book two days ago! It is a mix of feelings of relief and pride after 3 years of work. As most book writers will probably acknowledge,...
