Internet of Things (IoT) devices are constrained in communication, computation, size, weight, battery life, and power consumption. Machine learning (ML) algorithms on IoT devices therefore run with limited computational resources, resulting in poor performance. Techniques such as model pruning, offloading, and data compression can reduce computational, network, and storage costs and improve performance, at the expense of inference quality. This study examines the performance of ML inference tasks and adaptations on IoT devices using statistical learning methods such as ridge regression and random forests. We aim to understand the trade-offs of inference adaptations across a range of operational regimes, including constraints on available bandwidth, distance from the data, and available compute. Our results indicate that the task configuration (offloaded or on-device) and the available bandwidth are the most critical factors in determining inference performance, while the percentage of the model pruned is the least important. These findings demonstrate how statistical learning can be used to better understand the drivers of inference task performance on IoT devices and offer insights into which inference adaptations provide the largest improvements in ML inference tasks to support real-time requirements.
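As a minimal illustrative sketch (not the authors' code), the analysis described above could be framed as fitting ridge regression and a random forest to measurements of inference latency against candidate factors. The column names (configuration, bandwidth_mbps, prune_pct, latency_ms) and the synthetic data-generating process below are hypothetical placeholders for the kinds of factors named in the abstract.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: offload vs. on-device, bandwidth, and pruning level.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "configuration": rng.choice(["on_device", "offload"], size=n),
    "bandwidth_mbps": rng.uniform(1, 100, size=n),
    "prune_pct": rng.uniform(0, 90, size=n),
})
# Toy latency model: offloading is bandwidth-sensitive; pruning matters little.
df["latency_ms"] = (
    50
    + 40 * (df["configuration"] == "offload") * (10 / df["bandwidth_mbps"])
    - 0.05 * df["prune_pct"]
    + rng.normal(0, 2, size=n)
)

X = pd.get_dummies(df.drop(columns="latency_ms"), drop_first=True)
y = df["latency_ms"]

# Ridge regression: coefficients give a linear view of each factor's effect
# (features would need standardizing for magnitudes to be directly comparable).
ridge = Ridge(alpha=1.0).fit(X, y)
print(dict(zip(X.columns, ridge.coef_.round(3))))

# Random forest: impurity-based importances capture non-linear effects and
# interactions, e.g. configuration x bandwidth, and rank the factors.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print(dict(zip(X.columns, forest.feature_importances_.round(3))))
```

Under this framing, the feature importances and coefficient magnitudes provide the ranking of factors (configuration and bandwidth first, pruning last) reported in the results.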