All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation
Research output: Contribution to journal › Journal article › Research › peer-review
Standard
All sparse PCA models are wrong, but some are useful: Part II: Limitations and problems of deflation. / Camacho, J.; Smilde, A. K.; Saccenti, E.; Westerhuis, J. A.; Bro, Rasmus.
In: Chemometrics and Intelligent Laboratory Systems, Vol. 208, 104212, 2021.
RIS
TY - JOUR
T1 - All sparse PCA models are wrong, but some are useful
T2 - Part II: Limitations and problems of deflation
AU - Camacho, J.
AU - Smilde, A. K.
AU - Saccenti, E.
AU - Westerhuis, J. A.
AU - Bro, Rasmus
PY - 2021
Y1 - 2021
AB - Sparse Principal Component Analysis (sPCA) is a popular matrix factorization approach based on Principal Component Analysis (PCA). It combines variance maximization and sparsity with the ultimate goal of improving data interpretation. A main application of sPCA is to handle high-dimensional data, for example biological omics data. In Part I of this series, we illustrated limitations of several state-of-the-art sPCA algorithms when modeling noise-free data, simulated following an exact sPCA model. In this Part II we provide a thorough analysis of the limitations of sPCA methods that use deflation for calculating subsequent, higher-order components. We show, both theoretically and numerically, that deflation can lead to problems in the model interpretation, even for noise-free data. In addition, we contribute diagnostics to identify modeling problems in real-data analysis.
KW - Artifacts
KW - Data interpretation
KW - Exploratory data analysis
KW - Model interpretation
KW - Sparse principal component analysis
KW - Sparsity
U2 - 10.1016/j.chemolab.2020.104212
DO - 10.1016/j.chemolab.2020.104212
M3 - Journal article
AN - SCOPUS:85098168203
VL - 208
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
SN - 0169-7439
M1 - 104212
ER -
ID: 254720978