Speaker 1: Dr. Cheng Ong (Data61, Australia)
Title: Studying Compositional Data using Bregman Divergence
Summary: Compositional data consist of collections of nonnegative parts
that sum to a constant value. Examples of compositional data include
histograms and proportions, which appear in many practical applications.
Since the parts of the collection are statistically dependent,
many standard tools cannot be applied directly; compositional data must
first be transformed before analysis.
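As one concrete illustration of such a transform (a standard device from the compositional-data literature, not necessarily the one used in the talk), the centered log-ratio (clr) map sends a composition into unconstrained Euclidean space:

```python
import numpy as np

# A composition: proportions over 4 parts, summing to 1.
x = np.array([0.1, 0.2, 0.3, 0.4])

# Centered log-ratio (clr) transform: a standard way to map a
# composition into unconstrained coordinates before analysis.
def clr(x):
    g = np.exp(np.mean(np.log(x)))  # geometric mean of the parts
    return np.log(x / g)

z = clr(x)
print(np.isclose(z.sum(), 0.0))  # clr coordinates sum to zero
```

The sum-to-constant constraint reappears as a linear constraint (the clr coordinates sum to zero), which is why ordinary tools still need care after the transform.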
We study machine learning problems on compositional data using the
recently proved scaled Bregman theorem, which relates
the perspective transform of a Bregman divergence to the Bregman divergence
of the perspective transform plus a remainder conformal divergence.
This equality enables us to transform machine learning problems on compositional
data to new problems which are easier to optimise. We apply this to
principal component analysis and linear regression, and show promising
results on microbiome data.
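A minimal numerical sketch of the two ingredients named above, assuming the standard definitions: the Bregman divergence of a convex generator $F$, and the perspective transform $\check{F}(x, z) = z\,F(x/z)$ (the construction the scaled Bregman theorem is stated in terms of). The choice of negative Shannon entropy as generator is illustrative only; its Bregman divergence is the generalised KL divergence.

```python
import numpy as np

# Bregman divergence D_F(x, y) = F(x) - F(y) - <grad F(y), x - y>
# for a convex generator F; here F is negative Shannon entropy,
# whose Bregman divergence is the generalised KL divergence.
def F(x):
    return np.sum(x * np.log(x))

def grad_F(x):
    return np.log(x) + 1.0

def bregman(x, y):
    return F(x) - F(y) - np.dot(grad_F(y), x - y)

# Perspective transform of F: F_persp(x, z) = z * F(x / z).
def F_persp(x, z):
    return z * F(x / z)

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])
print(bregman(x, y))  # nonnegative, zero iff x == y
```

For compositions (both arguments summing to one) this divergence reduces to the ordinary KL divergence, which is why Bregman machinery is a natural fit for compositional data.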
Speaker 2: Dr. Richard Nock (Data61, Australia)
Title: Lossless or Quantized Boosting with Integer Arithmetic
Summary: In supervised learning, efficiency often starts with the choice of a good loss:
support vector machines popularised the hinge loss, AdaBoost popularised
the exponential loss, and so on. Recent trends in machine learning have
highlighted the necessity for training routines to meet
tight requirements on communication, bandwidth, energy, operations,
encoding, among others. Fitting the often decades-old state-of-the-art
training routines into these new constraints rarely comes without pain, uncertainty,
or a reduction in the original guarantees.
We tackle this problem from the core with the design of a new loss, the Q-loss. While enjoying many of
the usual desirable properties of losses (it is strictly proper canonical and twice differentiable), it
also has the key property that its mirror update over (arbitrary) rational inputs uses only integer arithmetic:
more precisely, the sole operations $+, -, \times, /, |\cdot|$. We build a
learning algorithm which, under mild assumptions, achieves
lossless boosting-compliant training. We give conditions under which its main
memory footprint, the weights, can be quantized while keeping the whole algorithm boosting-compliant.
Experiments show that the algorithm can achieve fast convergence
during the early boosting rounds compared to AdaBoost, even with a weight storage
that can be more than 30 times smaller. Lastly, we show that the Bayes risk of the
Q-loss can be used as a node-splitting criterion for decision trees and turns out to
guarantee optimal boosting convergence.
This is joint work with Bob Williamson, to be presented at ICML 2019.
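The integer-arithmetic claim can be illustrated with Python's exact rationals (this is an illustration of exact arithmetic under the listed operations, not the actual Q-loss mirror update from the talk, and the update rule below is hypothetical): applying only $+, -, \times, /, |\cdot|$ to rational weights keeps them exact, so no floating-point error accumulates over rounds.

```python
from fractions import Fraction

# Hypothetical update loop using only +, -, *, / and |.|
# on exact rationals; illustrates the arithmetic claim, not the
# Q-loss update itself.
w = Fraction(1, 3)
edges = [Fraction(1, 7), Fraction(-2, 5), Fraction(3, 11)]
for e in edges:
    w = abs(w - e) / (1 + abs(e))  # closed over the rationals

print(w)  # still an exact Fraction, no rounding error
```

Because every listed operation maps rationals to rationals, the weights can be stored as integer numerator/denominator pairs throughout, which is what makes quantized storage analysable.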
Date: May 17, 2019 (Fri) 10:00 - 12:00