Clustering consumers based on product discrimination in check-all-that-apply (CATA) data
Publication: Contribution to journal › Journal article › Research › peer-reviewed
Standard
Clustering consumers based on product discrimination in check-all-that-apply (CATA) data. / Castura, J. C.; Meyners, M.; Varela, P.; Næs, T.
In: Food Quality and Preference, Vol. 99, 104564, 2022.
RIS
TY - JOUR
T1 - Clustering consumers based on product discrimination in check-all-that-apply (CATA) data
AU - Castura, J. C.
AU - Meyners, M.
AU - Varela, P.
AU - Næs, T.
PY - 2022
Y1 - 2022
N2 - Consumers can be clustered based on their product-related check-all-that-apply (CATA) responses. We identify two paradoxes that can occur if these clusters are derived from conventional similarity coefficients. The first paradox is that clustering similar consumers can nullify within-cluster sensory differentiation of products. The second paradox is that consumers who check many attributes yet disagree can be clustered together, whereas consumers who check fewer attributes without disagreement can be split into different clusters. After illustrating these paradoxes with toy data sets, we propose "b-cluster analysis", in which consumers are clustered according to how they differentiate products. We define performance metrics to compare cluster analysis solutions. By design, b-cluster analysis is expected to give different results than CLUSCATA, since the objective of CLUSCATA is to cluster consumers who characterize products similarly, not according to how they differentiate products. We apply b-cluster analysis to the same toy data sets and show that the identified paradoxes do not occur. Then we apply both b-cluster analysis and CLUSCATA to a real consumer data set. We find that the b-cluster analysis solutions have better within-cluster sensory differentiation, better sensory discrimination, and less redundant clusters than CLUSCATA solutions. To investigate the sensitivity of b-cluster analysis to the initial (random) cluster membership allocations, we obtained 10,000 two-cluster solutions, each initialized with a different random partitioning of consumers. The best solution, which retains the most sensory differentiation, was observed in 21.4% of the runs. As a best practice, we recommend running b-cluster analysis several times and choosing the best solution. The proposed b-cluster analysis approach can be extended to other types of sensometric data and may have applications in other fields.
KW - Cluster analysis
KW - Unsupervised classification
KW - Binary data
KW - Sensory evaluation
KW - Consumer testing
KW - Agreement
KW - TRAINED ASSESSORS
KW - QUESTIONS
KW - ASSOCIATION
KW - ERROR
KW - ORDER
U2 - 10.1016/j.foodqual.2022.104564
DO - 10.1016/j.foodqual.2022.104564
M3 - Journal article
VL - 99
JO - Food Quality and Preference
JF - Food Quality and Preference
SN - 0950-3293
M1 - 104564
ER -
ID: 312640102