Maximum likelihood model selection for 1-norm soft margin SVMs with multiple parameters
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
Maximum likelihood model selection for 1-norm soft margin SVMs with multiple parameters. / Glasmachers, T.; Igel, Christian.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 32, No. 8, 2010, p. 1522-1528.
RIS
TY - JOUR
T1 - Maximum likelihood model selection for 1-norm soft margin SVMs with multiple parameters
AU - Glasmachers, T.
AU - Igel, Christian
PY - 2010
Y1 - 2010
N2 - Adapting the hyperparameters of support vector machines (SVMs) is a challenging model selection problem, especially when flexible kernels are to be adapted and data are scarce. We present a coherent framework for regularized model selection of 1-norm soft margin SVMs for binary classification. It is proposed to use gradient-ascent on a likelihood function of the hyperparameters. The likelihood function is based on logistic regression for robustly estimating the class conditional probabilities and can be computed efficiently. Overfitting is an important issue in SVM model selection and can be addressed in our framework by incorporating suitable prior distributions over the hyperparameters. We show empirically that gradient-based optimization of the likelihood function is able to adapt multiple kernel parameters and leads to better models than four concurrent state-of-the-art methods.
U2 - 10.1109/TPAMI.2010.95
DO - 10.1109/TPAMI.2010.95
M3 - Journal article
VL - 32
SP - 1522
EP - 1528
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
SN - 0162-8828
IS - 8
ER -