Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring. / Camacho, Jose; Wasielewska, Katarzyna; Bro, Rasmus; Kotz, David.

In: IEEE Transactions on Network and Service Management, 2024.

Harvard

Camacho, J, Wasielewska, K, Bro, R & Kotz, D 2024, 'Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring', IEEE Transactions on Network and Service Management. https://doi.org/10.1109/TNSM.2024.3368501

APA

Camacho, J., Wasielewska, K., Bro, R., & Kotz, D. (2024). Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring. IEEE Transactions on Network and Service Management. https://doi.org/10.1109/TNSM.2024.3368501

Vancouver

Camacho J, Wasielewska K, Bro R, Kotz D. Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring. IEEE Transactions on Network and Service Management. 2024. https://doi.org/10.1109/TNSM.2024.3368501

Author

Camacho, Jose ; Wasielewska, Katarzyna ; Bro, Rasmus ; Kotz, David. / Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring. In: IEEE Transactions on Network and Service Management. 2024.

Bibtex

@article{508f371b19d3472dabb4f6048dbc78c8,
title = "Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring",
abstract = "There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows us to detect and diagnose disparate network anomalies, with a data-analysis workflow that combines the advantages of interpretable and interactive models with the power of parallel processing. We apply the extended MBDA to two case studies: UGR{\textquoteright}16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth{\textquoteright}18, the longest and largest Wi-Fi trace known to date.",
keywords = "Analytical models, Anomaly Detection, Big Data, Dartmouth Campus Wi-Fi, Data models, Data visualization, Interpretable Machine Learning, Monitoring, Multivariate Big Data Analysis, Network Monitoring, Principal component analysis, Representation learning, UGR{\textquoteright}16",
author = "Jose Camacho and Katarzyna Wasielewska and Rasmus Bro and David Kotz",
note = "Publisher Copyright: Authors",
year = "2024",
doi = "10.1109/TNSM.2024.3368501",
language = "English",
journal = "IEEE Transactions on Network and Service Management",
issn = "1932-4537",
publisher = "Institute of Electrical and Electronics Engineers",
}

RIS

TY - JOUR

T1 - Interpretable Feature Learning in Multivariate Big Data Analysis for Network Monitoring

AU - Camacho, Jose

AU - Wasielewska, Katarzyna

AU - Bro, Rasmus

AU - Kotz, David

N1 - Publisher Copyright: Authors

PY - 2024

Y1 - 2024

AB - There is an increasing interest in the development of new data-driven models useful to assess the performance of communication networks. For many applications, like network monitoring and troubleshooting, a data model is of little use if it cannot be interpreted by a human operator. In this paper, we present an extension of the Multivariate Big Data Analysis (MBDA) methodology, a recently proposed interpretable data analysis tool. In this extension, we propose a solution to the automatic derivation of features, a cornerstone step for the application of MBDA when the amount of data is massive. The resulting network monitoring approach allows us to detect and diagnose disparate network anomalies, with a data-analysis workflow that combines the advantages of interpretable and interactive models with the power of parallel processing. We apply the extended MBDA to two case studies: UGR’16, a benchmark flow-based real-traffic dataset for anomaly detection, and Dartmouth’18, the longest and largest Wi-Fi trace known to date.

KW - Analytical models

KW - Anomaly Detection

KW - Big Data

KW - Dartmouth Campus Wi-Fi

KW - Data models

KW - Data visualization

KW - Interpretable Machine Learning

KW - Monitoring

KW - Multivariate Big Data Analysis

KW - Network Monitoring

KW - Principal component analysis

KW - Representation learning

KW - UGR’16

U2 - 10.1109/TNSM.2024.3368501

DO - 10.1109/TNSM.2024.3368501

M3 - Journal article

AN - SCOPUS:85186994445

JO - IEEE Transactions on Network and Service Management

JF - IEEE Transactions on Network and Service Management

SN - 1932-4537

ER -
