Case study of WebAssembly Runtimes for AI Applications on the Edge

Khelifa, Saif Edine; Bagaa, Miloud; Messaoud, Ahmed Ouameur; Ksentini, Adlen
GIIS 2024, Global Information Infrastructure and Networking Symposium, 19-21 February 2024, Dubai, United Arab Emirates

In Artificial Intelligence (AI), the need for immediate response times has given rise to the Cloud Edge Computing Continuum (CECC). This new paradigm, aided by emerging technologies, addresses latency and network delays while promoting portability, security, and efficiency, thereby enhancing Quality of Service (QoS). A noteworthy technology in this context is WebAssembly (Wasm), originally conceived to improve web performance. It has transitioned to the CECC, primarily due to key enablers such as the WebAssembly System Interface (WASI) and standalone Wasm runtimes. Besides offering heightened security through its sandboxing mechanism, WebAssembly produces compact code that paves the way for rapid cold start times and seamless migration of AI applications. However, because WebAssembly's integration into the CECC is still nascent, several questions arise. Prominent among them is the efficiency of deploying AI tasks in Wasm binary format, particularly the performance of Wasm runtimes on AI-centric tasks and the factors that may affect such executions. To address these questions, our study examines various deep-learning models on standalone WebAssembly runtimes. Our findings indicate that, for smaller networks with optimized parameters, standalone runtimes approach native performance, presenting just a 1.1x overhead on average. In contrast, networks with a large number of parameters exhibited pronounced overheads. We also identified multiple factors, associated with both the runtimes and the neural networks, offering insights for future research.
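
As a reading aid for the "1.1x overhead" figure, the sketch below shows one common way such a factor can be computed: the mean Wasm inference time divided by the mean native inference time. This definition is an assumption, since the abstract does not state how the overhead was derived. The function name `overhead_factor` and the timing values are hypothetical placeholders, not measurements from the study.

```python
from statistics import mean


def overhead_factor(native_seconds, wasm_seconds):
    """Return mean(wasm) / mean(native); 1.0 means parity with native.

    Assumed definition of "overhead": the abstract does not specify one.
    """
    if not native_seconds or not wasm_seconds:
        raise ValueError("both timing lists must be non-empty")
    native_mean = mean(native_seconds)
    if native_mean <= 0:
        raise ValueError("native timings must be positive")
    return mean(wasm_seconds) / native_mean


# Placeholder timings in seconds, invented purely for illustration.
native = [0.20, 0.22, 0.18]
wasm = [0.25, 0.27, 0.23]

factor = overhead_factor(native, wasm)
print(f"overhead: {factor:.2f}x")
```

A factor close to 1.0 would correspond to the near-native behaviour reported for smaller networks, while larger values would correspond to the pronounced overheads reported for networks with many parameters.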


Type:
Conference
City:
Dubai
Date:
2024-02-19
Department:
Communication Systems
Eurecom Ref:
7631
Copyright:
© 2024 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PERMALINK : https://www.eurecom.fr/publication/7631