Elastic DNN Inference with Unpredictable Exit in Edge Computing

Publication
IEEE Transactions on Mobile Computing

Multi-exit neural networks have gained popularity in edge computing because they can exploit the computing power of diverse devices. However, real-time tasks in edge applications often face frequent unpredictable exits caused by power outages or high-priority preemptions, a problem that multi-exit models have largely overlooked. To address this challenge, it is crucial to choose appropriate exit points in the multi-exit model so that a usable result is available whenever an unpredictable exit occurs. In this paper, we propose EINet, a sample-wise planner for real-time multi-exit deep neural networks. EINet enables efficient Elastic Inference with unpredictable exits while ensuring best-effort accuracy on various edge platforms. Our approach partitions a trained deep neural network into multiple blocks, each with its own exit. EINet also uses block-wise model profiles, which record the accuracy and inference time of each block. Using these profiles, EINet dynamically determines the exit plan for each sample during inference. We introduce Confidence Score Predictors to adapt to the characteristics of individual input samples and employ the Search Engine to efficiently find near-optimal plans for elastic inference. Extensive evaluations of EINet on multiple deep neural networks and datasets with unpredictable exits demonstrate its superior performance. EINet improves accuracy by 0.13%-16.5% compared to static plans, by 0.79%-4.1% compared to other dynamic plans, and by over 50% compared to predictable inference in typical scenarios.
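The planning idea in the abstract can be illustrated with a small Python sketch. It is not EINet's Search Engine or its Confidence Score Predictors. It is a toy model under stated assumptions: the block-wise profile is given as three lists (per-block time, per-exit time, per-exit accuracy), the unpredictable exit time is uniformly distributed on [0, horizon], the result available at that moment is the output of the most recently completed exit, and the plan is found by brute force. The names `expected_accuracy`, `best_plan`, `block_time`, `exit_time`, `exit_acc`, and `horizon` are hypothetical.

```python
from itertools import product


def expected_accuracy(plan, block_time, exit_time, exit_acc, horizon):
    """Expected accuracy of the latest finished exit when inference is
    killed at a time drawn uniformly from [0, horizon] (an assumption)."""
    t = 0.0        # elapsed inference time
    current = 0.0  # accuracy of the most recently completed exit (0 if none)
    area = 0.0     # integral of `current` over time
    for run_exit, bt, et, acc in zip(plan, block_time, exit_time, exit_acc):
        # Time at which this block (plus its exit branch, if planned) finishes.
        t_next = t + bt + (et if run_exit else 0.0)
        area += current * (min(t_next, horizon) - t)
        if t_next > horizon:
            # The horizon ends before this step completes.
            return area / horizon
        t = t_next
        if run_exit:
            current = acc
    # After the last block, the final result stays available until the horizon.
    area += current * (horizon - t)
    return area / horizon


def best_plan(block_time, exit_time, exit_acc, horizon):
    """Brute-force search over all 2^n exit plans (feasible only for small n)."""
    n = len(block_time)
    return max(
        product((False, True), repeat=n),
        key=lambda p: expected_accuracy(p, block_time, exit_time, exit_acc, horizon),
    )


if __name__ == "__main__":
    # Hypothetical profile for a three-block network.
    block_time = [1.0, 1.0, 1.0]
    exit_time = [0.5, 0.5, 0.5]
    exit_acc = [0.6, 0.8, 0.9]
    plan = best_plan(block_time, exit_time, exit_acc, horizon=6.0)
    print(plan, expected_accuracy(plan, block_time, exit_time, exit_acc, 6.0))
```

The sketch makes the trade-off visible: running an exit branch costs time that delays later, more accurate exits, but skipping it leaves a longer window in which only an older result (or none) is available. A per-sample planner would replace the fixed `exit_acc` values with predicted confidence scores for the current input.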

Yi Gao (高艺)
Professor

Yi Gao (高艺) is a professor and doctoral supervisor at the College of Computer Science, Zhejiang University.

Wei Dong (董玮)
Professor

Wei Dong (董玮) is a professor and doctoral supervisor at the College of Computer Science, Zhejiang University.