All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation

Research output: Contribution to journal > Journal article > Research > peer-review

Standard

All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation. / Camacho, J.; Smilde, A. K.; Saccenti, E.; Westerhuis, J. A.; Bro, Rasmus.

In: Chemometrics and Intelligent Laboratory Systems, Vol. 208, 104212, 2021.

Harvard

Camacho, J, Smilde, AK, Saccenti, E, Westerhuis, JA & Bro, R 2021, 'All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation', Chemometrics and Intelligent Laboratory Systems, vol. 208, 104212. https://doi.org/10.1016/j.chemolab.2020.104212

APA

Camacho, J., Smilde, A. K., Saccenti, E., Westerhuis, J. A., & Bro, R. (2021). All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation. Chemometrics and Intelligent Laboratory Systems, 208, Article 104212. https://doi.org/10.1016/j.chemolab.2020.104212

Vancouver

Camacho J, Smilde AK, Saccenti E, Westerhuis JA, Bro R. All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation. Chemometrics and Intelligent Laboratory Systems. 2021;208:104212. https://doi.org/10.1016/j.chemolab.2020.104212

Author

Camacho, J.; Smilde, A. K.; Saccenti, E.; Westerhuis, J. A.; Bro, Rasmus. / All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation. In: Chemometrics and Intelligent Laboratory Systems. 2021; Vol. 208.

Bibtex

@article{66d9ba1a32044cbdb55bb1cf16e3147e,
title = "All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation",
abstract = "Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is to handle high-dimensional data, for example biological omics data. In Part I of this series, we illustrated limitations of several state-of-the-art sPCA algorithms when modeling noise-free data, simulated following an exact sPCA model. In this Part II we provide a thorough analysis of the limitations of sPCA methods that use deflation for calculating subsequent, higher order, components. We show, both theoretically and numerically, that deflation can lead to problems in the model interpretation, even for noise free data. In addition, we contribute diagnostics to identify modeling problems in real-data analysis.",
keywords = "Artifacts, Data interpretation, Exploratory data analysis, Model interpretation, Sparse principal component analysis, Sparsity",
author = "J. Camacho and Smilde, {A. K.} and E. Saccenti and Westerhuis, {J. A.} and Rasmus Bro",
year = "2021",
doi = "10.1016/j.chemolab.2020.104212",
language = "English",
volume = "208",
journal = "Chemometrics and Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier",

}

RIS

TY - JOUR

T1 - All sparse PCA models are wrong, but some are useful

T2 - Part II: Limitations and problems of deflation

AU - Camacho, J.

AU - Smilde, A. K.

AU - Saccenti, E.

AU - Westerhuis, J. A.

AU - Bro, Rasmus

PY - 2021

Y1 - 2021

N2 - Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is to handle high-dimensional data, for example biological omics data. In Part I of this series, we illustrated limitations of several state-of-the-art sPCA algorithms when modeling noise-free data, simulated following an exact sPCA model. In this Part II we provide a thorough analysis of the limitations of sPCA methods that use deflation for calculating subsequent, higher order, components. We show, both theoretically and numerically, that deflation can lead to problems in the model interpretation, even for noise free data. In addition, we contribute diagnostics to identify modeling problems in real-data analysis.

AB - Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is to handle high-dimensional data, for example biological omics data. In Part I of this series, we illustrated limitations of several state-of-the-art sPCA algorithms when modeling noise-free data, simulated following an exact sPCA model. In this Part II we provide a thorough analysis of the limitations of sPCA methods that use deflation for calculating subsequent, higher order, components. We show, both theoretically and numerically, that deflation can lead to problems in the model interpretation, even for noise free data. In addition, we contribute diagnostics to identify modeling problems in real-data analysis.

KW - Artifacts

KW - Data interpretation

KW - Exploratory data analysis

KW - Model interpretation

KW - Sparse principal component analysis

KW - Sparsity

U2 - 10.1016/j.chemolab.2020.104212

DO - 10.1016/j.chemolab.2020.104212

M3 - Journal article

AN - SCOPUS:85098168203

VL - 208

JO - Chemometrics and Intelligent Laboratory Systems

JF - Chemometrics and Intelligent Laboratory Systems

SN - 0169-7439

M1 - 104212

ER -

ID: 254720978