Making sense of multiple distance matrices through common and distinct components

Department of Food Science (KU FOOD)

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Making sense of multiple distance matrices through common and distinct components. / Solberg, Lars Erik; Dahl, Tobias; Naes, Tormod.

In: Journal of Chemometrics, Vol. 35, No. 11, 3372, 2021.

Harvard

Solberg, LE, Dahl, T & Naes, T 2021, 'Making sense of multiple distance matrices through common and distinct components', Journal of Chemometrics, vol. 35, no. 11, 3372. https://doi.org/10.1002/cem.3372

APA

Solberg, L. E., Dahl, T., & Naes, T. (2021). Making sense of multiple distance matrices through common and distinct components. Journal of Chemometrics, 35(11), [3372]. https://doi.org/10.1002/cem.3372

Vancouver

Solberg LE, Dahl T, Naes T. Making sense of multiple distance matrices through common and distinct components. Journal of Chemometrics. 2021;35(11). 3372. https://doi.org/10.1002/cem.3372

Author

Solberg, Lars Erik ; Dahl, Tobias ; Naes, Tormod. / Making sense of multiple distance matrices through common and distinct components. In: Journal of Chemometrics. 2021 ; Vol. 35, No. 11.

Bibtex

@article{dc55c267331c4d99baadf95f4019177b,

title = "Making sense of multiple distance matrices through common and distinct components",

abstract = "Multiblock analysis attacks the problem of how to combine data from various data sources for purposes such as prediction, classification, clustering, or visual data analysis. A key concept is the distinction between “common” and “distinct” parts, that is, what information repeats itself across the blocks and what is unique to an individual block. The statistical field of multiblock analysis holds many different approaches, which leads to different treatments both of the terms distinct and common themselves and to differences in the numerical results. In this article, we extend the discussion of distinct and common in multiblock analysis to the domain of distance matrices, that is, the situation where data point sets, so-called configurations, are analyzed via relative distances either because configurations are not available directly or because a distance representation is favorable. Situations typical for chemometrics will be highlighted and illustrated in examples. When analyzing different methods, we have focused on three key aspects. First, during the transition from the distance to configuration domains, one needs to consider how multiple distance matrices are treated. Second, when extracting common and distinct parts, one needs to manage a tradeoff between explaining variance and ensuring similarity between subspaces. Third, there is a design choice to be made as to whether the subspace containing the common parts is “shared” between blocks or if separate subspaces are associated with each individual block. The three aspects help to categorize and explain well-known methods in the field. A selection of methods was analyzed and subsequently applied to examples.",

keywords = "common, consensus, distances, distinct, multiblock, multidimensional scaling",

author = "Solberg, {Lars Erik} and Tobias Dahl and Tormod Naes",

year = "2021",

doi = "10.1002/cem.3372",

language = "English",

volume = "35",

journal = "Journal of Chemometrics",

issn = "0886-9383",

publisher = "Wiley",

number = "11",

}

RIS

TY - JOUR

T1 - Making sense of multiple distance matrices through common and distinct components

AU - Solberg, Lars Erik

AU - Dahl, Tobias

AU - Naes, Tormod

PY - 2021

Y1 - 2021

N2 - Multiblock analysis attacks the problem of how to combine data from various data sources for purposes such as prediction, classification, clustering, or visual data analysis. A key concept is the distinction between “common” and “distinct” parts, that is, what information repeats itself across the blocks and what is unique to an individual block. The statistical field of multiblock analysis holds many different approaches, which leads to different treatments both of the terms distinct and common themselves and to differences in the numerical results. In this article, we extend the discussion of distinct and common in multiblock analysis to the domain of distance matrices, that is, the situation where data point sets, so-called configurations, are analyzed via relative distances either because configurations are not available directly or because a distance representation is favorable. Situations typical for chemometrics will be highlighted and illustrated in examples. When analyzing different methods, we have focused on three key aspects. First, during the transition from the distance to configuration domains, one needs to consider how multiple distance matrices are treated. Second, when extracting common and distinct parts, one needs to manage a tradeoff between explaining variance and ensuring similarity between subspaces. Third, there is a design choice to be made as to whether the subspace containing the common parts is “shared” between blocks or if separate subspaces are associated with each individual block. The three aspects help to categorize and explain well-known methods in the field. A selection of methods was analyzed and subsequently applied to examples.

AB - Multiblock analysis attacks the problem of how to combine data from various data sources for purposes such as prediction, classification, clustering, or visual data analysis. A key concept is the distinction between “common” and “distinct” parts, that is, what information repeats itself across the blocks and what is unique to an individual block. The statistical field of multiblock analysis holds many different approaches, which leads to different treatments both of the terms distinct and common themselves and to differences in the numerical results. In this article, we extend the discussion of distinct and common in multiblock analysis to the domain of distance matrices, that is, the situation where data point sets, so-called configurations, are analyzed via relative distances either because configurations are not available directly or because a distance representation is favorable. Situations typical for chemometrics will be highlighted and illustrated in examples. When analyzing different methods, we have focused on three key aspects. First, during the transition from the distance to configuration domains, one needs to consider how multiple distance matrices are treated. Second, when extracting common and distinct parts, one needs to manage a tradeoff between explaining variance and ensuring similarity between subspaces. Third, there is a design choice to be made as to whether the subspace containing the common parts is “shared” between blocks or if separate subspaces are associated with each individual block. The three aspects help to categorize and explain well-known methods in the field. A selection of methods was analyzed and subsequently applied to examples.

KW - common

KW - consensus

KW - distances

KW - distinct

KW - multiblock

KW - multidimensional scaling

U2 - 10.1002/cem.3372

DO - 10.1002/cem.3372

M3 - Journal article

VL - 35

JO - Journal of Chemometrics

JF - Journal of Chemometrics

SN - 0886-9383

IS - 11

M1 - 3372

ER -

ID: 285870320