Efficient hold-out for subset of regressors

Tapio Pahikkala, Hanna Suominen, Jorma Boberg, Tapio I. Salakoski

Research output: A Conference proceeding or a Chapter in Book › Conference contribution

2 Citations (Scopus)

Abstract

Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance for sparse regularized least-squares (RLS) when the method has already been trained with the whole training set. The computational complexity of performing the hold-out is O(|H|³ + |H|²n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m³/N² + (m²n)/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn²) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.
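
The complexity claims compose as one would expect: running the hold-out once per fold over N folds of size m/N costs N · O((m/N)³ + (m/N)²n) = O(m³/N² + (m²n)/N), and setting N = m recovers the O(mn) leave-one-out bound. As a reading aid only, here is a minimal NumPy sketch of how such a shortcut can be obtained, assuming a subset-of-regressors parameterization and a Woodbury downdate of the model trained on all m examples; the function and variable names (train_sparse_rls, K_bm, and so on) are illustrative assumptions, not taken from the paper, whose own derivation may differ in detail.

import numpy as np

def train_sparse_rls(K_bm, K_bb, y, lam):
    # Hypothetical trainer. K_bm: (n, m) kernel block between the n basis
    # vectors and the m training examples; K_bb: (n, n) block among the
    # basis vectors; y: (m,) labels; lam: regularization parameter.
    A = K_bm @ K_bm.T + lam * K_bb   # (n, n) system matrix, built in O(mn²)
    V = np.linalg.solve(A, K_bm)     # V = A^{-1} K_bm, cached for hold-out use
    a = V @ y                        # coefficients of the model trained on all data
    return a, V

def holdout_predictions(a, V, K_bm, y, H):
    # Predictions on the index set H as if H had been left out of training,
    # in O(|H|³ + |H|²n) time given the cached V.
    C_H = K_bm[:, H]                 # (n, |H|) kernel block for the hold-out set
    S = C_H.T @ V[:, H]              # small "hat" block, computed in O(|H|²n)
    g = C_H.T @ a                    # full-model predictions on H, O(|H|n)
    # The Woodbury downdate of A - C_H C_H^T collapses to one |H| x |H| solve:
    return np.linalg.solve(np.eye(len(H)) - S, g - S @ y[H])

Retraining from scratch with H removed and then predicting on H yields the same values up to round-off, which gives an easy sanity check of the downdate formula; for |H| = 1 the last line reduces to the familiar leave-one-out shortcut form (f_i − S_ii·y_i)/(1 − S_ii).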

Original language: English
Title of host publication: Adaptive and Natural Computing Algorithms - 9th International Conference, ICANNGA 2009, Revised Selected Papers
Subtitle of host publication: Lecture Notes in Computer Science
Pages: 350-359
Number of pages: 10
Volume: 5495
DOIs: 10.1007/978-3-642-04921-7_36
Publication status: Published - 2009
Externally published: Yes
Event: 9th International Conference on Adaptive and Natural Computing Algorithms, ICANNGA 2009 - Kuopio, Finland
Duration: 23 Apr 2009 → 25 Apr 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5495 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 9th International Conference on Adaptive and Natural Computing Algorithms, ICANNGA 2009
Country: Finland
City: Kuopio
Period: 23/04/09 → 25/04/09

Cite this

Pahikkala, T., Suominen, H., Boberg, J., & Salakoski, T. I. (2009). Efficient hold-out for subset of regressors. In Adaptive and Natural Computing Algorithms - 9th International Conference, ICANNGA 2009, Revised Selected Papers: Lecture Notes in Computer Science (Vol. 5495, pp. 350-359). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5495 LNCS). https://doi.org/10.1007/978-3-642-04921-7_36
@inproceedings{ca85adb9b1fe46f2b2665cd2cfe79755,
title = "Efficient hold-out for subset of regressors",
abstract = "Hold-out and cross-validation are among the most useful methods for model selection and performance assessment of machine learning algorithms. In this paper, we present a computationally efficient algorithm for calculating the hold-out performance for sparse regularized least-squares (RLS) when the method has already been trained with the whole training set. The computational complexity of performing the hold-out is O(|H|³ + |H|²n), where |H| is the size of the hold-out set and n is the number of basis vectors. The algorithm can thus be used to calculate various types of cross-validation estimates efficiently. For example, when m is the number of training examples, the complexities of N-fold and leave-one-out cross-validation are O(m³/N² + (m²n)/N) and O(mn), respectively. Further, since sparse RLS can be trained in O(mn²) time for several regularization parameter values in parallel, the fast hold-out algorithm enables efficient selection of the optimal parameter value.",
author = "Tapio Pahikkala and Hanna Suominen and Jorma Boberg and Salakoski, {Tapio I.}",
year = "2009",
doi = "10.1007/978-3-642-04921-7_36",
language = "English",
isbn = "3642049206",
volume = "5495",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "350--359",
booktitle = "Adaptive and Natural Computing Algorithms - 9th International Conference, ICANNGA 2009, Revised Selected Papers",
url = "http://www.scopus.com/inward/record.url?scp=78650747993&partnerID=8YFLogxK",
}
