KEYWORDS: Performance modeling, Image processing, Neural networks, Signal processing, Neurons, Tunable filters, Feature extraction, Digital signal processing, Systems modeling, Sensors
Keyword Spotting (KWS), i.e. the ability to identify vocal commands as they are spoken, is becoming one of the most important features of Human-Machine Interfaces (HMI), thanks in part to the widespread availability of high-performance, very small MEMS audio sensors. In-Sensor Computing (ISC) appears to be the most viable way to take full advantage of KWS, since it keeps MEMS microphones small and minimally invasive. ISC represents the extreme evolution of the edge computing paradigm: the processing circuits are moved close to the audio sensor, integrated either into its auxiliary circuitry or into the same package. However, ISC imposes severe area and power constraints, which must be traded off against the processing speed needed for the real-time operation that KWS inherently requires. In this work, we present a neural network-based KWS system suitable for ISC contexts, in which audio sensor data are converted into Mel spectrogram images and processed by a Depthwise Separable Convolutional Neural Network (DSCNN) with feature extraction capabilities. To show the advantages of this approach, the DSCNN is compared with an alternative Fully Connected Neural Network (FCNN) that operates on audio signals not converted into images. Both models have been profiled on a microcontroller and implemented on an FPGA, and their performance is compared in terms of classification accuracy and hardware resources. The comparison shows that the FCNN is very far from meeting the ISC real-time processing requirements: when mapped to a Xilinx Zynq UltraScale+ MPSoC, its parameter count and frame latency are respectively 3 and 1 orders of magnitude higher than those of the DSCNN.
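As a rough illustration of why a depthwise separable design tends to need far fewer parameters than a dense layer applied to raw audio, the sketch below counts the weights of single layers. The layer sizes (a 3x3 kernel, 64 channels, a 16000-sample input, 128 hidden units) are hypothetical values chosen for the example; they are not the configurations evaluated in this work, and the helper names are illustrative only.

```python
def conv2d_params(k, c_in, c_out, bias=True):
    """Weights of a standard k x k convolution with c_in -> c_out channels."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(k, c_in, c_out, bias=True):
    """Weights of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in + (c_in if bias else 0)    # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)   # 1 x 1 convolution mixing channels
    return depthwise + pointwise


def dense_params(n_in, n_out, bias=True):
    """Weights of a fully connected layer with n_in inputs and n_out outputs."""
    return n_in * n_out + (n_out if bias else 0)


# Hypothetical layer sizes, assumed only for this illustration.
standard = conv2d_params(k=3, c_in=64, c_out=64)
separable = depthwise_separable_params(k=3, c_in=64, c_out=64)
dense = dense_params(n_in=16000, n_out=128)  # e.g. 1 s of 16 kHz audio fed to 128 units

print(f"standard conv:       {standard}")
print(f"depthwise separable: {separable}")
print(f"dense on raw audio:  {dense}")
```

With these assumed sizes, the separable layer uses roughly an eighth of the weights of the standard convolution, while a single dense layer on raw samples already runs into the millions. The actual gap between the two networks depends on the full architectures, which are not detailed here.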