Talks by Dr. Cheng Ong and Dr. Richard Nock (Data61, Australia)

May 6, 2019 16:20

Abstract

Speaker 1: Dr. Cheng Ong (Data61, Australia)

Title: Studying Compositional Data using Bregman Divergence

Summary: Compositional data consist of collections of nonnegative values
that sum to a constant. Examples of compositional data include
histograms and proportions, which appear in many practical applications.
Since the parts of each collection are statistically dependent,
many standard tools cannot be applied directly. Instead, compositional data must
first be transformed before analysis.
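
As an illustration of such a transformation, the sketch below applies the centered log-ratio (clr) transform, one common choice in compositional data analysis. The summary does not say which transform the talk uses, so the function names `closure` and `clr` and the sample counts are assumptions made for this example.

```python
import numpy as np

def closure(x):
    """Rescale nonnegative rows so that each row sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(p):
    """Centered log-ratio: log of each part minus the mean log of its row.

    Assumes strictly positive parts (zeros need separate handling).
    """
    log_p = np.log(np.asarray(p, dtype=float))
    return log_p - log_p.mean(axis=-1, keepdims=True)

# Hypothetical counts for two samples with three parts each.
counts = np.array([[10.0, 30.0, 60.0],
                   [5.0, 5.0, 90.0]])
props = closure(counts)  # proportions, each row sums to 1
z = clr(props)           # unconstrained coordinates, each row sums to 0
```

The clr output no longer lives on the simplex, and it is unchanged if a row is rescaled by a positive constant, which is why it can be computed from raw counts or from proportions.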

We study machine learning problems on compositional data using the scaled Bregman
theorem, which was recently proved. The scaled Bregman theorem relates
the perspective transform of a Bregman divergence to the Bregman divergence
of a perspective transform and a remainder conformal divergence.
This equality enables us to transform machine learning problems on compositional
data to new problems which are easier to optimise. We apply this to
principal component analysis and linear regression, and show promising
results on microbiome data.
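
For readers unfamiliar with the term, the Bregman divergence generated by a convex differentiable function $\varphi$ is $D_\varphi(x \| y) = \varphi(x) - \varphi(y) - \langle \nabla\varphi(y), x - y \rangle$. The sketch below illustrates only this standard definition, not the scaled Bregman theorem itself: with negative entropy as the generator, the divergence between two compositions that sum to one equals the Kullback-Leibler divergence. The function names and the sample vectors are assumptions made for this example.

```python
import numpy as np

def bregman(phi, grad_phi, x, y):
    """Bregman divergence D_phi(x || y) for a convex differentiable phi."""
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

def neg_entropy(p):
    """Generator phi(p) = sum_i p_i log p_i (requires p_i > 0)."""
    return np.sum(p * np.log(p))

def grad_neg_entropy(p):
    """Gradient of the negative entropy: log p_i + 1."""
    return np.log(p) + 1.0

# Two hypothetical compositions, each summing to 1.
p = np.array([0.2, 0.3, 0.5])
q = np.array([0.1, 0.6, 0.3])

d = bregman(neg_entropy, grad_neg_entropy, p, q)
kl = np.sum(p * np.log(p / q))  # KL divergence computed directly
```

Here `d` and `kl` agree up to floating-point error because the linear terms cancel when both vectors sum to the same constant.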

====================================================
Speaker 2: Dr. Richard Nock (Data61, Australia)

Title: Lossless or Quantized Boosting with Integer Arithmetic

Summary: In supervised learning, efficiency often starts with the choice of a good loss:
support vector machines popularised the hinge loss, AdaBoost popularised
the exponential loss, and so on. Recent trends in machine learning have
highlighted the necessity for training routines to meet
tight requirements on communication, bandwidth, energy, operations, and
encoding, among others. Fitting the often decades-old state-of-the-art
training routines into these new constraints does not come without pain, uncertainty, or
a reduction in the original guarantees.

We started to tackle this problem from the core with the design of a new loss, the Q-loss. While enjoying many
of the usual desirable properties for losses (it is strictly proper canonical and twice differentiable), it
also has the key property that its mirror update over (arbitrary) rational inputs uses only integer arithmetic:
more precisely, the sole use of $+, -, /, \times, |\cdot|$. We build a
learning algorithm which is able, under mild assumptions, to achieve
lossless boosting-compliant training. We give conditions under which its main
memory footprint, the weights, can be quantized while keeping the whole algorithm boosting-compliant.
Experiments show that the algorithm can converge quickly
during the early boosting rounds compared to AdaBoost, even with a weight storage
that can be 30+ times smaller. Lastly, we show that the Bayes risk of the
Q-loss can be used as a node-splitting criterion for decision trees and turns out to
guarantee optimal boosting convergence.

This is joint work with Bob Williamson, to be presented at ICML 2019.

More Information

Date	May 17, 2019 (Fri) 10:00 - 12:00
URL	https://c5dc59ed978213830355fc8978.doorkeeper.jp/events/91291

Venue

〒103-0027 Nihonbashi 1-chome Mitsui Building, 15th floor, 1-4-1 Nihonbashi, Chuo-ku, Tokyo


Center for Advanced Intelligence Project
