Information About

Adaboost





GENERALISED FORM OF THE ALGORITHM

Given: (x_{1},y_{1}),\ldots,(x_{m},y_{m}) where x_{i} \in X,\, y_{i} \in Y = \{-1, +1\}

Initialise D_{1}(i) = rac{1}{m}.

For t = 1,\ldots,T:

  • Find the classifier h_{t} : X o \{-1,+1\} that minimizes the error with respect to the distribution D_{t}: h_{t} = \arg \min_{h_{j} \in \mathcal{H}} \epsilon_{j} = \sum_{i=1}^{m} D_{t}(i)[y(i)

  • e h_{j}(x_{i})]

  • Prerequisite: \epsilon_{t} < 0.5, otherwise stop.

  • Choose \alpha_{t} \in \mathbf{R}, typically \alpha_{t}= rac{1}{2} extrm{ln} rac{1-\epsilon_{t}}{\epsilon_{t}} where \epsilon_{t} is the weighted error rate of classifier h_{t}.

  • Update:

  • D_{t+1}(i) = rac{ D_{t}(i) \, extrm{exp}(- \alpha_{t} y_{i} h_{t}(x_{i})) }{ Z_{t} }

where Z_{t} is a normalisation factor (chosen so that D_{t+1} will be a distribution).

Output the final classifier:

H(x) = extrm{sign}\left( \sum_{t=1}^{T} \alpha_{t}h_{t}(x) ight)

The equation to update the distribution D_{t} in constructed so that:

\exp (- \alpha_{t} y_{i} h_{t}(x_{i})) \begin{cases} <1, & y(i)=h_{t}(x_{i}) \ >1, & y(i)
e h_{t}(x_{i}) \end{cases}

Thus, after selecting an optimal classifier h_{t} \, for the distribution D_{t} \,, the examples x_{i} \, that the classifier h_{t} \, identified correctly are weighted less and those that it identified incorrectly are weighted more. Therefore, when the algorithm is testing the classifiers on the distribution D_{t+1} \,, it will select a classifier that better identifies those examples that the previous classifer missed.


EXTERNAL LINKS