オムライスの備忘録

数学・統計学・機械学習・プログラミングに関することを記す

【機械学習】ブースティング / Boosting

この記事の読者

機械学習・マシンラーニングの手法の１つである「ブースティング / Boosting」について知りたい.

Index

Index
ブースティング / Boosting
弱識別器
誤差関数
種類
参考
- 書籍

ブースティング / Boosting

適応基底関数モデル

Schapire / Freund から発表された、適応基底関数モデルにフィッティングするためのアルゴリズム.

適応基底関数モデル
- yhayato1320.hatenablog.com

弱学習器を基底関数として構成されている.

ブースティング / Boosting とは

ブースティング / Boosting は、複数の弱識別器を用意して、学習を直列的にし、前の弱識別器の学習結果を参考にしながら一つずつ弱識別器を学習するアンサブル手法の一つ.

アンサンブル学習 / Ensemble Learning
- yhayato1320.hatenablog.com

次の弱識別器の学習データは、それまでの学習結果から次の学習にとってもっとも有益なものが選ばれる.

各弱識別器は、学習データに対する誤り率が $\epsilon \leq \displaystyle \frac{1}{2} - \gamma\ \ (\gamma > 0)$ を満たすように学習が行われる.

二値分類では、ランダム識別器でも0.5 の精度 (誤り率)を出すことができる. それよりも少しでも良い精度 (誤り率でいうと $\epsilon \leq 0.5$ )をだせれば、全体として十分な識別器が作るれる.

目的

$\displaystyle \min_{f} \displaystyle \sum_{i=1}^{N} L(y_{i},\ f(x_{i}))$

$L$ : Loss Function
$y$ : ラベル・グランドトルース
$f(x)$ : 予測

処理フロー

Input

データ (分布) : $D$
基本学習アルゴリズム : $L$
学習反復回数 : $T$

Process

$D_{1}\ =\ D$ データ (分布) の初期化
for $t\ =\ 1,\ \cdots,\ T:$

$h_{t}\ =\ L(D_{t})$ データを利用した学習
$\epsilon_{t}\ =\ P_{x\ \sim\ D} ( h_{t}(x)\ \neq\ f(x))$ 誤差を評価
$D_{t+1}\ =\$ Adjust_Distribution $(D_{t},\ \epsilon_{t})$ 次の学習器のためのデータ (分布) を作成

end

$x$ : 観測データ点
$f(x)$ : ラベル・グランドトルース

Output

$H(x)\ =\$ Combine_Outputs $( \{ h_1 (x),\ \cdots,\ h_{t}(x) \} )$

弱識別器

弱識別器のひとつとして利用されるのが決定木である.

決定木
- yhayato1320.hatenablog.com

誤差関数

学習器の評価方法として、誤差関数を用いる.

Name	Loss	Algorithm
Squared Error	$\frac{1}{2}(\ y_{i}\ -\ f(x_{i})\ )^{2}$	L2 Boosting
Absolute Error	$\|\ y_{i}\ -\ f(x_{i})\ \|$	Gradient Boosting
Exponential Loss	$\exp (\ -\ \tilde{y}_{i}\ f(x_{i})\ )$	Ada Boosting
Log Loss	$\log (1\ +\ e^{-\ \tilde{y}_{i}\ f(x_{i})})$	Logit Boost

種類

ブースティングにはいくつか種類がある.

L2 Boosting
Gradient Boosting
アダブースト / Ada Boost (Adaptive Boosting)
Logit Boosting

Gradient Boosting / 2000

Gradient Boosting
- 勾配法を用いて、パラメータを更新する
- yhayato1320.hatenablog.com

アダブースト / Ada Boost (Adaptive Boosting)

アダブースト / Ada Boost (Adaptive Boosting)
- 各データ点にも重みを設け、各識別器の重みと同様に更新する
- yhayato1320.hatenablog.com

XGBoost / 2016

XGBoost
- yhayato1320.hatenablog.com

LightGBM / 2016

LightGBM
- yhayato1320.hatenablog.com

参考

Boosting: Foundations and Algorithms
- Schapire / Freund 2012
- 発表論文

書籍

Machine Learning A Probabilistic Perspective
- 16 Adaptive basis function models
  - 16.4 Boosting
- Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series)
  - 作者:Murphy, Kevin P.
  - The MIT Press
  Amazon
アンサンブル法による機械学習
- 2 ブースティング
  - 2.1 一般的なブースティング法
  - 2.2 AdaBoost (アダブースト) アルゴリズム
- [asin:B073F12L1V:detail]