Traffic representations for network measurements

Azorin, Raphaël
Thesis

Measurements are essential to operate and manage computer networks, as they are critical to analyze performance and establish diagnosis. In particular, per-flow mon-itoring consists in computing metrics that characterize the individual data streams traversing the network. To develop relevant traffic representations, operators need to select suitable flow characteristics and carefully relate their cost of extraction with their expressiveness for the downstream tasks considered. In this thesis, we propose novel methodologies to extract appropriate traffic representations. In particular, we posit that Machine Learning can enhance measurement systems, thanks to its abil-ity to learn patterns from data, in order to provide predictions of pertinent traffic characteristics. The first contribution of this thesis is a framework for sketch-based measurements systems to exploit the skewed nature of network traffic. Specifically, we propose a novel data structure representation that leverages sketches’ under-utilization, re-ducing per-flow measurements memory footprint by storing only relevant coun-ters. The second contribution is a Machine Learning-assisted monitoring system that integrates a lightweight traffic classifier. In particular, we segregate large and small flows in the data plane, before processing them separately with dedicated data structures for various use cases. The last contributions address the design of a uni-fied Deep Learning measurement pipeline that extracts rich representations from traffic data for network analysis. We first draw from recent advances in sequence modeling to learn representations from both numerical and categorical traffic data. These representations serve as input to solve complex networking tasks such as clickstream identification and mobile terminal movement prediction in WLAN. Fi-nally, we present an empirical study of task affinity to assess when two tasks would benefit from being learned together.


HAL
Type:
Thèse
Date:
2024-06-18
Department:
Data Science
Eurecom Ref:
7683
Copyright:
© EURECOM. Personal use of this material is permitted. The definitive version of this paper was published in Thesis and is available at :
See also:

PERMALINK : https://www.eurecom.fr/publication/7683