
Boosted and Differentially Private Ensembles of Decision Trees
A Google TechTalk, 2020/7/29, presented by Richard Nock, Data61 and The Australian National University
ABSTRACT: Ensembles of decision tree (DT) classifiers are hugely popular in
both the private and non-private settings, yet they display a striking
split: while boosting and its offspring top international competitions
in the non-private setting, random forests reign supreme
when differential privacy (DP) is at stake. There is no middle ground
that would combine the convergence rates of boosting with the randomness that DP
commands.
commands. In this talk, we summarise (i) the existence of a privacy vs boosting dilemma for top-down induction of DTs in the context of statistical decision theory, (ii) an algorithm to navigate this dilemma with the introduction of a new specifically designed proper loss and a way to tune it at training time to make the most of boosting under DP constraints, (iii) formal boosting convergence results under DP constraints and (iv) experiments comparing our approach to differentially private random forests.