
Radio-Adaptive Strategies for Efficient Deployment of Deep Learning on Resource-Constrained IoT Devices

Monday, 03/04/2024 2:00pm to 4:00pm
Hybrid - LGRC A311 and Zoom
PhD Dissertation Proposal Defense
Speaker: Jin Huang

Recent advancements in low-power radio technologies have significantly expanded the potential for model deployment on Internet of Things (IoT) devices, leading to innovative deployment strategies that optimize for energy and latency efficiency in real-time applications. However, the intrinsic limitations of low-power radios, including restricted throughput and duty-cycled operation, pose new challenges for model deployment. The traditional approach -- where IoT devices primarily gather sensor data and radios are tasked with transmitting this data to the cloud for processing -- fails to fully consider the impact of radio characteristics on the efficiency of IoT-cloud pipelines.

This thesis claims that deep learning models for IoT-based real-time cloud applications must be designed from the outset to accommodate the bandwidth constraints and duty-cycling behavior of low-power radios, as well as the variability of wireless communication at the edge. It addresses three key challenges: 1) the limited memory and computational capacity of IoT devices for deploying deep learning models; 2) dynamic, short-term fluctuations in network connections; and 3) the use of low-power, duty-cycled radios on IoT devices, which results in reduced throughput.

This thesis makes several contributions by introducing systems that account for radio constraints and duty-cycled operation while improving latency and energy efficiency within the capabilities of IoT devices. First, we introduce the CLIO pipeline, which integrates deep learning model partitioning with progressive feature transmission to accommodate bandwidth variations. Second, we present the FLEET pipeline, which offloads early-exit model computations to the cloud while simultaneously transmitting features through duty-cycled radios. By using cloud resources to improve early-exit performance, FLEET reduces both latency and energy consumption. Lastly, we propose a speculative inference pipeline that uses a generative deep learning model for reconstruction. This pipeline is designed to tackle the asymmetric transmission delays across data modalities caused by dynamic network conditions, enabling speculative inference to proceed without being held up by delayed modalities.

Through these contributions, this thesis advances deep learning model deployment on IoT devices by demonstrating how deep learning models and IoT-cloud pipelines can be optimized to meet the constraints and challenges of low-power radio communication, paving the way for more efficient and reliable real-time IoT applications.
 

Advisor: Deepak Ganesan
