1UCLA Samueli School of Engineering (United States) 2National Institute of Information and Communications Technology (Japan) 3Hamamatsu Photonics (Japan)
This PDF file contains the front matter associated with SPIE Proceedings Volume 12438, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access. Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Computational Imaging/ONN: Joint Session with Conferences 12435 and 12438
Existing photonic matrix processors are too small to tackle relevant problems. Here, I review our group's recent work on scaling up analog photonic platforms. This work includes iterative advances to established approaches (accurate methods to calibrate MZI meshes), experimental demonstrations of recent proposals (a VCSEL array-based coherent-detection ONN and a single-shot ONN based on reconfigurable free-space optical fan-out and weighting), and entirely new architectures (WDM-powered and RF-photonic fiber circuits for edge computing). The lessons learned from studying this diverse array of approaches help inform the future development of photonic hardware for computation.
Dynamic real-time optical processing has significant potential for accelerating specific tensor algebra. Here we present the first demonstration of simultaneous amplitude and phase modulation of a two-dimensional optical signal in the Fourier plane of a thin lens. Two spatial light modulators (SLMs) arranged in a Michelson interferometer modulate the amplitude and the phase while sitting simultaneously in the focal planes of two Fourier lenses. The lenses frame the interferometer in a 4f system, enabling full modulation in the Fourier domain of a telescope. The main sources of phase noise and loss are discussed, including the non-linear inter-pixel crosstalk native to SLMs, variability in modulation efficiency as a function of projected mask parameters, and Fresnel optics limitations. Such a system is of great utility in the rapidly progressing fields of optical computing, hardware acceleration, encryption, and machine learning, where neglecting phase modulation can lead to impractical bit-error rates.
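The decomposition behind such dual-SLM complex modulation can be sketched numerically: assuming two phase-only arms of equal amplitude, any target complex value with modulus at most 1 splits into two pure phase settings (a toy illustration, not the authors' calibration procedure; the function name is ours):

```python
import numpy as np

def phases_for_complex(c):
    """Decompose a target complex value |c| <= 1 into two phase-only
    settings such that c = (exp(i*phi1) + exp(i*phi2)) / 2."""
    r = np.abs(c)
    theta = np.angle(c)
    delta = np.arccos(np.clip(r, 0.0, 1.0))  # arm phase offset
    return theta + delta, theta - delta

# Example: synthesize an arbitrary complex mask value from two phases
c_target = 0.6 * np.exp(1j * 0.8)
phi1, phi2 = phases_for_complex(c_target)
c_real = 0.5 * (np.exp(1j * phi1) + np.exp(1j * phi2))
print(np.allclose(c_real, c_target))  # True
```

At |c| = 1 the two phases coincide; darker mask values push the two arms toward destructive interference.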
We report ultrahigh-bandwidth applications of Kerr microcombs at data rates beyond 10 Terabits/s. Optical neural networks can dramatically accelerate computing speed to overcome the inherent bandwidth bottleneck of electronics. At the same time, digital signal processing has become central to many fields, from coherent optical telecommunications, where it is used to compensate signal impairments, to image processing, important for observational astronomy, medical diagnosis, autonomous driving, big data and particularly artificial intelligence. Digital signal processing has traditionally been performed electronically, but new applications, particularly those involving real-time video image processing, are creating unprecedented demand for ultrahigh performance, including bandwidth and reduced energy consumption. We use a new and powerful class of micro-comb called soliton crystals that exhibit robust operation and stable generation as well as high intrinsic efficiency with a low spacing of 48.9 GHz. We demonstrate a universal optical vector convolutional accelerator operating at 11 tera-operations per second (TOPS) on 250,000-pixel images for 10 kernels simultaneously, enough for facial image recognition. We use the same hardware to sequentially form a deep optical CNN with ten output neurons, achieving successful recognition of all ten digits from 900-pixel handwritten-digit images. Finally, we demonstrate a photonic digital signal processor operating at 18 Tb/s and use it to process multiple simultaneous video signals in real time. The system processes 400,000 video signals concurrently, performing 34 functions simultaneously that are key to object edge detection, edge enhancement and motion blur. Compared with the spatial-light devices used for image processing, our system is not only ultra-high speed but highly reconfigurable and programmable, able to perform many different functions without any change to the physical hardware.
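The tap-delay picture underlying such convolutional accelerators (each comb line carries a kernel-weighted copy of the data stream, dispersion delays it, and a photodetector sums the lines) can be sketched in a few lines of numpy. This is a toy model under those assumptions, not the experimental system:

```python
import numpy as np

def tap_delay_convolve(x, kernel):
    """Toy model of a comb-based convolver: each wavelength carries the
    data stream scaled by one kernel tap; dispersion delays line k by k
    samples; a single photodetector sums all lines."""
    n_taps = len(kernel)
    out = np.zeros(len(x) + n_taps - 1)
    for k, w in enumerate(kernel):
        out[k:k + len(x)] += w * x  # delayed, weighted copy of the stream
    return out

x = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.array([0.5, -1.0, 0.25])
print(np.allclose(tap_delay_convolve(x, kernel), np.convolve(x, kernel)))  # True
```

The delayed-and-summed output is exactly the discrete convolution of the stream with the tap weights, which is why a single dispersive fiber plus a weighted comb implements a kernel.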
Our approach, based on an integrated Kerr soliton crystal microcomb, opens up new avenues for ultrafast robotic vision and machine learning.
Spintronic devices have received considerable attention recently due to their potential to address the present-day challenge of increasing power dissipation. Among spintronic devices, domain-wall synaptic devices are speed- and energy-efficient for image classification, speech recognition, and other problems. In this paper, a fully connected neural network (FCNN) is implemented using energy-efficient domain-wall-based synaptic devices and transistor-based feedback circuits. The designed FCNN is trained on-chip for the classification of Fisher's Iris dataset. The proposed neural network achieves an accuracy of 95%. Compared to previously proposed hardware for on-chip learning, the proposed FCNN is 96% and 83.3% more efficient in terms of energy and latency, respectively.
Advancements in nanophotonics have raised the bar for optoelectronic devices, demanding ultra-compact size, fast speeds, high efficiency, and low energy consumption. Emerging materials hold the potential to meet these demands, enabling the creation of high-performing optoelectronic devices. We present our latest breakthroughs and demonstrate device prototypes made from various materials, pushing the boundaries of optoelectronic performance.
Optical neural networks hold tremendous potential for energy efficiency and low latency. Despite this potential, the mismatch between simulation-based training and the experimental setup can degrade system performance. To address this issue, we present a local search method for training optical neural networks in situ. This training method allows optical neural networks to reach state-of-the-art performance within the constraints of current experimental settings.
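One common in-system training loop of this kind is a simple (1+1) local search: perturb the weights at random and keep the perturbation only if the experimentally measured loss improves. A minimal numpy sketch with an idealized stand-in for the hardware forward pass (not the authors' exact method; all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def hardware_loss(w, X, y):
    """Stand-in for the experimental forward pass; in practice this
    would query the optical system and measure the loss."""
    return np.mean((X @ w - y) ** 2)

# Toy regression task with a known ground-truth weight vector
X = rng.normal(size=(64, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true

# Local search: accept a random perturbation only if the loss drops
w = np.zeros(4)
best = hardware_loss(w, X, y)
for _ in range(2000):
    cand = w + 0.3 * rng.normal(size=4)
    cand_loss = hardware_loss(cand, X, y)
    if cand_loss < best:
        w, best = cand, cand_loss
print(best < 0.5)  # loss falls far below the initial value (~14)
```

Because only measured losses are compared, the loop never needs gradients of the physical system, which is what makes it robust to simulation-experiment mismatch.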
With their recognized advantages such as system-level size, weight and power (SWaP) benefits, minimal monochromatic aberration, polarization discrimination capacity, and low-cost at scale, metasurfaces have emerged as a transformative optics technology. Here we present the applications of polarization-multiplexed, multifunctional metasurface optics for imaging and sensing. Specifically, depth resolved imaging using metalenses with custom-engineered point spread functions will be discussed.
We demonstrated a large-scale space-time-multiplexed homodyne optical neural network (ONN) using arrays of high-speed (GHz) vertical-cavity surface-emitting lasers (VCSELs). Injection locking enables precise phase control over tens of VCSEL devices simultaneously, facilitating photoelectric-multiplication-based matrix operations and all-optical nonlinearity, operating at the quantum-noise limit. Our VCSEL transmitters exhibit ultra-high electro-optic conversion efficiency (Vπ = 4 mV), allowing neural encoding at 5 attojoules/symbol. Three-dimensional neural connectivity allows parallel computing. The full-system energy efficiency reaches 7 fJ/operation, more than 100-fold better than state-of-the-art digital microprocessors and other ONN demonstrations. Digit classification is achieved with an accuracy of 98% relative to the ground truth.
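Photoelectric multiplication in a homodyne ONN can be illustrated with a two-port interference model: the balanced photocurrent of a 50/50 combiner is proportional to the product of the interfering fields, so integrating over time-multiplexed symbols yields a dot product. A toy real-valued numpy sketch (not the experimental system):

```python
import numpy as np

def homodyne_dot(x, w):
    """Balanced homodyne model: interfering a signal field x_i with a
    weight field w_i gives a difference photocurrent ~ x_i * w_i; the
    integrator sums over the time-multiplexed symbols."""
    e_plus = np.abs(x + w) ** 2 / 2   # one output port of a 50/50 coupler
    e_minus = np.abs(x - w) ** 2 / 2  # the other output port
    return np.sum(e_plus - e_minus) / 2  # balanced, integrated current

x = np.array([0.3, -0.7, 1.1])
w = np.array([0.5, 0.2, -0.4])
print(np.isclose(homodyne_dot(x, w), x @ w))  # True
```

The identity |x+w|² − |x−w|² = 4xw is what turns square-law photodetection into multiplication, with negative weights handled naturally by the sign of the field.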
Neural networks and other advanced image-processing algorithms excel in a wide variety of computer vision and imaging applications, but their high performance comes at a high computational cost, which sometimes limits their success. Here, we explore hybrid optical-digital strategies for computational imaging that outsource parts of the algorithm into the optical domain. Using such a co-design of optics and image processing, we can learn application-domain-specific cameras with modern artificial intelligence techniques or compute parts of a convolutional neural network in optics. Optical computing happens at the speed of light and without any memory or power requirements, thereby opening new directions for intelligent imaging systems.
Reservoir Computers (RCs) are brain-inspired algorithms that use partially untrained recurrent neural networks in which only the output connections are tuned. RCs can perform signal-analysis tasks such as distortion compensation. We recently demonstrated a photonic RC in which neurons are encoded in a frequency comb, untrained interconnections are realized by phase modulation, and trained output connections are realized by spectral filters. Here, we present a further development of this scheme in which the same substrate is used to implement two RCs simultaneously. The two RCs can either be used in parallel on different tasks, or in series, thereby implementing a "deep" RC.
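The defining property of a reservoir computer is that only the output weights are trained, typically by linear regression on recorded reservoir states. A conventional echo-state sketch in numpy on a toy distortion-compensation task (an illustrative software analogue, not the photonic implementation; all parameters are our choices):

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed, untrained reservoir: a random recurrent network
n_res = 100
W_in = rng.uniform(-0.5, 0.5, (n_res,))
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

def run_reservoir(u):
    """Drive the reservoir with input sequence u, record all states."""
    states = np.zeros((len(u), n_res))
    x = np.zeros(n_res)
    for t, ut in enumerate(u):
        x = np.tanh(W @ x + W_in * ut)
        states[t] = x
    return states

# Task: recover a clean signal from a nonlinearly distorted, noisy copy
u_clean = np.sin(np.linspace(0, 40, 1000))
u_dist = np.tanh(2 * u_clean) + 0.05 * rng.normal(size=1000)

S = run_reservoir(u_dist)
# Only the output weights are trained, here via ridge regression
W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ u_clean)
err = np.sqrt(np.mean((S @ W_out - u_clean) ** 2))
print(err < 0.2)  # readout undoes most of the distortion
```

In the photonic version the states live on comb lines and the ridge-regression weights become programmable spectral filter settings, but the training problem is the same linear one.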
Reservoir Computers (RCs) are brain-inspired algorithms based on recurrent neural networks where only output weights are tuned, while internal weights remain untrained. We recently demonstrated a photonic frequency-multiplexing RC encoding neurons in the lines of a frequency comb. We also demonstrated a single-layer feed-forward neural network based on a similar frequency-multiplexing principle. Here we present the design for an integrated optical output layer for such frequency multiplexing based photonic neural networks. The all-optical output layer uses wavelength (de)multiplexers and wavelength converters to apply signed weights to neurons encoded in comb lines.
Photonic Tensor Cores (PTCs) have emerged as one of the major candidates for accelerating neural network hardware, taking advantage of the high bandwidth, low latency, and energy efficiency that propagating light has over its electrical counterpart. However, current solutions still rely on external bulk components, such as lasers, modulators, and photodetectors. In this work, we show the first fully integrated hybrid Silicon Photonics PTC for computing matrix-vector multiplication (MVM), one of the main steps of any neural network layer. The PTC is formed by a 3-fold WDM InP laser array connected to an active silicon photonic PIC through photonic wire bonds. The CW lasers are modulated by high-speed Mach-Zehnder modulators to generate the input vector. The combined signals are sent into a 3×3 matrix formed by high-speed add-drop coupled microring resonators, whose tuning signals are the weights of the matrix. Outputs are collected by a bank of high-speed integrated photodetectors. The whole photonic IC has a footprint of 4.1×1.7 mm², including the lasers, allowing purely electrical I/O. The full integration of input modulators, weights, and photodetectors allows the PIC to operate at over 20 GHz bandwidth with extremely low latency. This integration is a key step toward the actual deployment of photonics as an NN accelerator for future AI systems.
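The weighting step of a microring-based MVM can be sketched as follows: physical drop-port transmissions live in [0, 1], so signed matrix entries are carried by an affine mapping that is undone electronically after detection. A toy numpy model with our own mapping convention, not the device calibration:

```python
import numpy as np

def mrr_mvm(W, x):
    """Toy microring weight bank: each matrix entry is realized as a
    drop-port transmission in [0, 1]; the affine map used to fit signed
    weights into that range is inverted after photodetection.
    Assumes W is not constant (max > min)."""
    T = (W - W.min()) / (W.max() - W.min())  # weights -> transmissions
    y_drop = T @ x                           # drop-port photocurrents
    y_ref = np.full(W.shape[0], x.sum())     # common-mode reference
    return (W.max() - W.min()) * y_drop + W.min() * y_ref

W = np.array([[0.5, -1.0, 0.2],
              [1.5, 0.3, -0.7],
              [-0.2, 0.8, 1.0]])
x = np.array([1.0, 2.0, 0.5])
print(np.allclose(mrr_mvm(W, x), W @ x))  # True
```

Real weight banks often use balanced through/drop detection for the sign instead; the point here is only that a non-negative physical medium plus a cheap electronic correction suffices for signed MVM.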
Artificial intelligence (AI) and machine learning (ML) have tremendous potential for increasing the scale and reach of the photonics industry. We present how the use of AI/ML has revolutionized the field of photonic integrated circuit design and manufacturing, and resulted in mass deployments of high-performance optical chips for multiple classes of datacom and telecom applications. First, we discuss our use of a deep neural network multivariate regression model to optimize the individual design parameters of hundreds of optical chips on a given mask. This work successfully addresses the systematic processing variations within a wafer, resulting in an unprecedented homogeneity of performance of optical chips in a high-volume production environment. Second, we present our approach of using ML to predict the performance of optical devices by wafer probing. This novel approach eliminates the expensive and time-consuming process of optical chip testing and instead relies on a wafer probe measurement to infer the performance of hundreds of chips on a wafer. We discuss the complexity of the problem of predicting the performance in multi-dimensional parameter space, the inherent challenges that cannot be overcome by traditional methods, and the reasons why ML is an essential tool to solve this problem. The support vector machine (SVM) that we developed performs nonlinear binary classification based on a regression from the probe measurement, allowing unprecedented control over our process, including in-situ monitoring of wafer fabrication and real-time process adjustments, and thus achieving consistently high performance of optical chips at high production volumes.
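The probe-to-performance idea can be illustrated with a much simpler stand-in than the authors' SVM: regress a chip metric from the probe measurement, then threshold the regression output against a spec limit. All data, names, and values below are synthetic and hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in data: probe reading vs. optical insertion loss (dB)
probe = rng.uniform(0.0, 1.0, 200)
true_loss = 3.0 + 4.0 * (probe - 0.5) ** 2
loss_db = true_loss + 0.1 * rng.normal(size=200)  # noisy measurement

# Regress the performance metric from the probe measurement...
coef = np.polyfit(probe, loss_db, deg=2)
predicted = np.polyval(coef, probe)

# ...then classify pass/fail against a spec limit
spec_limit = 3.5
passed = predicted < spec_limit
true_pass = true_loss < spec_limit
accuracy = np.mean(passed == true_pass)
print(accuracy > 0.9)  # nearly all chips are binned correctly
```

The appeal of the approach is that the electrical probe is cheap and fast, while the optical measurement it replaces is the expensive one; misclassifications concentrate near the spec boundary.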
A shift in the way we generate and distribute electricity, from a few large-scale units to many small-scale, decentralized units, has created power-distribution complications that cannot be managed by one centralized authority in reasonable time. We hypothesize that, through the use of photonic-based computing, physics-based hierarchical machine-learning frameworks will allow for optimized grid operation at lower computational cost. Behavioral analysis of trained agents demonstrates that local, modular control of microgrids could provide a reliable alternative to the existing method of power-grid control. It is our vision that these agents could discover new methods for optimal grid control.
In typical artificial neural networks, neurons adjust according to global calculations of a central processor, but in the brain neurons and synapses self-adjust based on local information. A man-made self-adjusting (distributed) system capable of performing machine-learning tasks would have substantial scaling advantages over typical computational neural networks in power consumption, speed, and robustness to damage. Furthermore, such a system would allow us to study physical learning without the added complexity of biology. Here we unveil the second-generation design of such a system: a transistor-based self-adjusting analog network that trains itself to perform a wide variety of tasks. We demonstrate basic features of the system, including the ability to monitor all internal states. This platform is already faster than a simulation of itself, and is thus an exciting platform for the investigation of physical learning.
Machine Learning for Optical Sensing and Metrology
Long-distance ranging with existing coherent lidar techniques is limited by the coherence length of the laser. Here we present a coherent multi-tone continuous-wave (MTCW) lidar technique that performs single-shot simultaneous ranging and velocimetry with high resolution at distances far beyond the coherence length of a CW laser, without frequency or phase sweeping. The proposed technique utilizes the relative phase accumulations of phase-locked RF sidebands, together with their Doppler shifts, to identify the range and velocity of the target after heterodyne detection of the beating of the echo signal with an unmodulated CW optical local oscillator (LO). The predefined RF sidebands enable ultra-narrow-bandwidth RF filters in the analog or digital domain to suppress noise and achieve high-SNR ranging and velocimetry. To date, we have demonstrated that the MTCW lidar can perform ranging 500× beyond the coherence length of the laser with <1 cm precision. In a quasi-CW configuration, >1 km ranging is realized with <3 cm precision. Moreover, we incorporate machine-learning algorithms into the MTCW lidar to identify reflections from multiple targets and improve the range resolution. Since the relative phases of the RF sidebands are used for ranging, and common phase noise can be suppressed in signal processing, the LO in the heterodyne detection does not have to come from the same laser source; a separate free-running laser can be used. This approach paves the way for novel optical localization. As a proof of concept, we show that a receiver with a free-running CW LO can determine its relative distance to a remote transmitter 1.5 km away with <5 cm accuracy.
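The core ranging relation of the MTCW scheme is that each RF sideband at frequency f_i accumulates phase phi_i = 2*pi*f_i*(2R/c) over the round trip, so the slope of phase versus tone frequency gives the range independently of the optical carrier phase. A short, noise-free numpy check with our own illustrative tone plan, not the experimental parameters:

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def mtcw_range(rf_tones_hz, measured_phases):
    """Estimate range from the phase accumulated by each RF sideband:
    phi_i = 2*pi*f_i*(2R/c). The slope of phase vs. tone frequency gives
    the range; a common phase offset drops out of the fit."""
    slope = np.polyfit(rf_tones_hz, np.unwrap(measured_phases), 1)[0]
    return slope * C / (4 * np.pi)

R_true = 152.7  # metres
# Tone spacing kept below c/(4R) so adjacent phase steps stay under pi
tones = np.array([1e5, 2e5, 3e5, 4e5])
phases = np.mod(2 * np.pi * tones * (2 * R_true / C), 2 * np.pi)
print(np.isclose(mtcw_range(tones, phases), R_true))  # True
```

Larger tone spacings improve precision but wrap the phase; in practice a coarse tone grid resolves the ambiguity of a fine one, which is one reason multiple sidebands are used.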
In many optical experiments, a long measurement time is necessary to collect enough information and improve the signal-to-noise ratio. This happens, for example, in total luminescence spectroscopy (TLS), where the data is acquired as excitation-emission matrices (EEMs). An EEM is a unique chemical fingerprint of the analyzed substance that allows its comprehensive characterization. To collect a high-resolution EEM, it is necessary to scan both the excitation and the emission wavelengths in small steps and, for each step, to collect the light for a long time to maximize the signal-to-noise ratio. Acquiring a high-resolution EEM can therefore take more than an hour, depending on the size of the wavelength steps, the intensity of the signal, and the spectral range to be analyzed. This paper proposes a new method to reconstruct a high-resolution EEM from a low-resolution one using deep-learning super-resolution techniques. Specifically, this work proposes a new artificial neural network architecture, a sub-pixel convolutional neural network, designed to be applied to fluorescence EEM images. The code used is made available via a GitHub repository with instructions for applying transfer learning to different types of images.
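The sub-pixel step such a network ends with rearranges C·r² feature channels into an r-times-larger image (depth-to-space). A minimal numpy version of that rearrangement, following the usual pixel-shuffle convention rather than necessarily the paper's exact layout:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Depth-to-space rearrangement used by sub-pixel convolution:
    input (C*r*r, H, W) -> output (C, H*r, W*r)."""
    c2, h, w = x.shape
    c = c2 // (r * r)
    x = x.reshape(c, r, r, h, w)       # split channels into (C, r, r)
    x = x.transpose(0, 3, 1, 4, 2)     # interleave: (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

x = np.arange(2 * 4 * 3 * 3, dtype=float).reshape(8, 3, 3)
y = pixel_shuffle(x, 2)
print(y.shape)  # (2, 6, 6)
```

The upscaling happens only in this cheap rearrangement, so all convolutions run at the low input resolution, which is what makes sub-pixel networks fast.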
Three-dimensional (3D) imaging captures depth information from a given scene and is used in a wide range of fields, including industrial environments, smartphones, and autonomous driving. This paper summarises the results of a depth-video super-resolution scheme tailored for single-photon avalanche diode (SPAD) image sensors, which produces 3D maps at frame rates above 100 FPS (32×64 pixels). Consecutive frames are used to super-resolve and denoise depth maps via 3D convolutional neural networks with an upscaling factor of 4. Owing to the lack of noise-free, high-resolution depth maps captured with high-speed cameras, the neural network is trained on synthetic data generated in Unreal Engine, which is then processed to resemble the output of a SPAD sensor. The model is tested on video sequences captured with a high-speed SPAD dToF sensor, which processes frames at more than 30 frames per second. The super-resolved data shows a significant reduction in noise and enhanced edge details in objects. We believe these results are relevant for improving the accuracy of object detection in autonomous vehicles for collision avoidance, as well as in AR/VR systems.
The demand for efficient actuators in photonics has surged with the increasing popularity of large-scale, general-purpose programmable photonic circuits. We present our work to enhance an established silicon photonics platform with low-power micro-electromechanical systems (MEMS) and liquid crystal (LC) actuators to enable large-scale programmable photonic integrated circuits (PICs).
In this paper, a fast and robust infrared remote-target detection network based on deep learning is proposed. Furthermore, we construct our own IR image database, imitating humans in remote maritime rescue situations, using a FLIR M232 IR camera. First, the IR image is preprocessed with contrast enhancement for data augmentation and to increase the signal-to-noise ratio (SNR). Second, multi-scale feature extraction is performed, combining fixed weighted kernels with convolutional neural network layers. Lastly, the feature map is mapped into a likelihood map indicating the potential locations of the targets. Experimental results reveal that the proposed method can detect remote targets even under complex backgrounds, surpassing previous methods by a significant margin of +0.62 in terms of mIoU.
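The mIoU metric reported above is the intersection-over-union of predicted and ground-truth masks, averaged over images. For reference, a minimal numpy implementation:

```python
import numpy as np

def iou(pred, target):
    """Intersection-over-union between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

def mean_iou(preds, targets):
    """mIoU: average IoU over a set of mask pairs."""
    return float(np.mean([iou(p, t) for p, t in zip(preds, targets)]))

# Two 4x4 squares offset by one pixel: overlap 3x3 = 9, union 23
a = np.zeros((8, 8), bool); a[2:6, 2:6] = True
b = np.zeros((8, 8), bool); b[3:7, 3:7] = True
print(iou(a, b))  # 9/23 ≈ 0.391
```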
Optical imaging sensors suffer from distortions caused by atmospheric particles such as dust, mist, fog, haze, and smoke, resulting in degraded object detection and recognition. To circumvent these issues, image dehazing is an essential preprocessing stage for various real-time applications. Several conventional dehazing methods rely on haze-formation models that are inherently dependent on a large number of variables, placing a huge computational burden on the processor. This severely affects dehazing performance and also restricts real-time processing. To overcome these issues, this work presents an end-to-end real-time dehazing architecture based on a lightweight Convolutional Neural Network (CNN) and a Generative Adversarial Network (GAN). The proposed depthwise separable and residual (DSR) block is used instead of convolution layers, significantly lowering the parameter and computation counts. Furthermore, sigmoid and bilateral ReLU activation functions are exploited to prevent oversaturation of the dehazed images. The proposed model achieves significant enhancements in peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) for both synthetic and real-world hazy images when compared to other architectures such as dark channel prior (DCP) and DehazeNet. The performance of the CNN- and GAN-based dehazing architectures is analyzed and compared.
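The parameter saving of a depthwise-separable block comes from the standard factorization of a k×k convolution into a k×k depthwise convolution followed by a 1×1 pointwise convolution. The bias-free counts for some illustrative channel sizes (the channel numbers are our own example, not the paper's configuration):

```python
def conv_params(c_in, c_out, k):
    """Parameter counts (no bias) for a standard vs. a depthwise
    separable convolution with kernel size k."""
    standard = k * k * c_in * c_out            # one dense k x k kernel
    separable = k * k * c_in + c_in * c_out    # depthwise + pointwise
    return standard, separable

std, sep = conv_params(64, 128, 3)
print(std, sep, round(std / sep, 1))  # 73728 8768 8.4
```

For a 3×3 kernel the saving approaches a factor of 9 as the output channel count grows, which is why such blocks dominate lightweight real-time architectures.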
The accurate and efficient detection of molecular absorption signatures in FTIR output spectra is a challenging task for traditional filter- and statistics-based methods, especially where the quantification of density and robustness to the presence of multiple molecules are concerned. Cross-correlation, matched-filter and support-vector-machine techniques generalise poorly to unseen variations of the input. In this work, we employ the powerful embedding capabilities of deep learning models to extract path-integrated concentrations of target gases from complex spectra generated by HITRAN simulation in the mid-infrared. A quantitative study compares the applicability of the common neural network types MLP, CNN, and LSTM. The results confirm that convolutional layers are highly effective at capturing the "spatial" information present in characteristic absorption spectra. Furthermore, we show that such neural networks are robust to noise, temperature and concentration variations, and interference from the presence of other molecules.
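The forward model being inverted here is Beer-Lambert: absorbance is a weighted sum of per-gas cross-sections, with the path-integrated concentrations N_j·L as the weights. A numpy sketch with made-up Gaussian lines (not HITRAN data) shows the linear least-squares baseline that learned models must beat when lines overlap:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic absorption cross-sections of three gases on a spectral grid
nu = np.linspace(0.0, 1.0, 400)
def line(center, width):
    return np.exp(-((nu - center) / width) ** 2)
sigma = np.stack([line(0.30, 0.02),   # isolated line
                  line(0.50, 0.03),   # two strongly overlapping lines
                  line(0.52, 0.02)])

# Beer-Lambert: absorbance(nu) = sum_j sigma_j(nu) * (N_j * L)
nl_true = np.array([2.0, 0.5, 1.2])   # path-integrated concentrations
absorbance = sigma.T @ nl_true + 0.01 * rng.normal(size=nu.size)

# Linear retrieval by least squares over the whole spectrum
nl_est, *_ = np.linalg.lstsq(sigma.T, absorbance, rcond=None)
print(np.allclose(nl_est, nl_true, atol=0.1))  # True
```

The linear retrieval works while the model stays linear and the lines are known; temperature-dependent line shapes and saturation break that linearity, which is where the learned embeddings earn their keep.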
The power of artificial neural networks to determine the quality and properties of olive oil has been demonstrated by several studies in recent years. Less clear, however, is how the neural network extracts useful information from the input data. This work investigates the learning mechanism of one-dimensional convolutional neural networks (1D-CNNs) trained to predict the physicochemical properties of olive oil from single fluorescence spectra. Such a 1D-CNN can successfully predict the parameters relevant to quality assessment: acidity, peroxide value, and UV absorbance. To go beyond a simple quality-assessment algorithm, it is important to identify which spectral features in the measured spectra are correlated with each chemical parameter and therefore with the quality of olive oil. To obtain this information, explainability techniques can be applied to the latent feature space generated by the intermediate layers of the trained network. This work analyses in detail the common features that the 1D-CNN uses to predict two physicochemical parameters: acidity and K232.
Non-contact measurements using digital cameras require a reliable camera calibration, typically based on a pinhole camera model with a few lower-order distortions (in OpenCV, typically up to 14 parameters). The quality of the calibration is usually assessed with the re-projection error (RPE). We propose a different quality measure, the forward propagation error (FPE), which determines the deviation in real-world coordinates using the parameters from the camera calibration. In addition, we introduce a machine learning-inspired method for a more reliable camera calibration. We explore the quality of our camera calibration using the RPE, the FPE, and the machine learning method across a series of checkerboard patterns (sparse points) and phase-shifting patterns (dense points), different camera types, and different camera models, comparing results from simulations and experiments. The machine learning-inspired method helps to identify outliers, which can easily be removed from the calibration process, ensuring a reliable camera calibration. Our investigation shows that the better the camera, the better the camera calibration. We found that the 5-parameter OpenCV model was sufficient for our camera calibration. In addition, the 5-parameter model with the dense phase-shifting pattern was precise and accurate, limited in terms of FPE only by our target, an 8k monitor with a pixel pitch of about 0.09 mm. We already obtained a good calibration from about 20 correctly generated poses with a checkerboard pattern. The checkerboard pattern shows good results and can easily be interpreted with the FPE.
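To make the distinction between the two error measures concrete, the following sketch computes an RPE in pixels and an FPE in world units for a planar target. It uses an idealized, distortion-free pinhole model with an invented intrinsic matrix and pose (not the calibration procedure of this work), and noise-free detections, so both errors come out near zero:

```python
import numpy as np

# Hypothetical pinhole camera: intrinsics K, pose R, t (world -> camera).
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])

# Planar checkerboard corners in world coordinates (z = 0), 30 mm pitch.
gx, gy = np.meshgrid(np.arange(6), np.arange(5))
world = np.stack([gx.ravel() * 0.03, gy.ravel() * 0.03,
                  np.zeros(gx.size)], axis=1)

def project(X):
    """Project world points to pixel coordinates (RPE works here)."""
    cam = X @ R.T + t
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def back_project_to_plane(uv):
    """Intersect pixel rays with the plane z_world = 0 (FPE works here).
    The constant-depth step is valid because R is the identity."""
    rays = (np.linalg.inv(K) @ np.column_stack([uv, np.ones(len(uv))]).T).T
    scale = t[2] / rays[:, 2:3]            # plane sits at camera depth t_z
    return (rays * scale - t) @ R          # back to world coordinates

observed = project(world)                  # noise-free "detections"

rpe = np.mean(np.linalg.norm(project(world) - observed, axis=1))  # pixels
fpe = np.mean(np.linalg.norm(back_project_to_plane(observed) - world,
                             axis=1))      # metres
print(rpe, fpe)  # both ~0 for a perfect calibration
```

With noisy detections or biased calibration parameters, the RPE reports the misfit in image space, whereas the FPE reports what actually matters for measurement: the deviation in real-world coordinates.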
Si-based image sensors have proliferated in recent decades as their cost has fallen and their performance has reached extraordinary levels. This talk will highlight advancements made in Si-compatible materials for imaging (2D and 3D) at wavelengths extending into the longwave infrared (LWIR). Monolithic integration of low dark current Ge photodetectors on Si extends the sensitivity of integrated sensors to about 1.6 μm. Recently, researchers have grown high quality epitaxial GeSn alloys on Ge where the cutoff wavelength approaches 5.2 μm. Using these new materials, it will be possible to create high performance midwave infrared (MWIR) and LWIR image sensors monolithically integrated on Si.
State-of-the-art style-based generative adversarial networks (StyleGANs) synthesize high-quality images by learning a mapping from a disentangled latent space onto the image manifold. The learned representations can thus be analyzed by interpreting the latent space and used subsequently to control the properties of the synthesized industrial machine vision data. StyleGANs combined with an embedding into the latent space enable the assessment of the properties of embedded images by means of their latent space representations; however, a trade-off between the dimensionality of the StyleGAN’s latent space and the quality of generated images must be found. While a smaller latent space is easier to interpret, it might not capture all quality characteristics if lossless compression cannot be achieved. This work presents an evaluation scheme that uses statistical hypothesis testing to identify an advantageous dimensionality of the latent space for industrial machine vision applications. As a quality measure of images synthesized by GANs, the Fréchet inception distance (FID), based on features learned from the ImageNet dataset, is often used. However, the features of the underlying Inception network are opaque and might not be representative of application-specific quality characteristics. Herein, synthetic data is instead evaluated by means of a Fréchet distance based on selected, application-specific features extracted from the industrial machine vision dataset in use. With these application-specific features, the image quality of multiple StyleGANs trained with different latent space dimensionalities is compared using statistical tests to select an advantageous latent space dimension.
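An application-specific Fréchet distance uses the same closed form as the FID, only with different feature extractors. A minimal NumPy sketch, with synthetic Gaussian feature sets standing in for extracted features:

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between Gaussians fitted to two feature
    sets (rows = samples) -- same formula as the FID, but the features
    can come from any application-specific extractor."""
    mu_a, mu_b = feats_a.mean(0), feats_b.mean(0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr((cov_a cov_b)^1/2) via the symmetric product A^1/2 B A^1/2,
    # which has the same eigenvalues as cov_a @ cov_b.
    w, v = np.linalg.eigh(cov_a)
    sqrt_a = (v * np.sqrt(np.clip(w, 0, None))) @ v.T
    eig = np.linalg.eigvalsh(sqrt_a @ cov_b @ sqrt_a)
    tr_sqrt = np.sqrt(np.clip(eig, 0, None)).sum()
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2 * tr_sqrt

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 8))          # "real" dataset features
fake_good = rng.normal(size=(500, 8))     # well-matched synthetic data
fake_bad = rng.normal(loc=1.5, size=(500, 8))  # shifted synthetic data
print(frechet_distance(real, fake_good) < frechet_distance(real, fake_bad))
# True: the shifted distribution is farther from the real features
```

Computing this distance per feature set for each candidate latent dimensionality, over repeated samples, yields the populations that the statistical hypothesis tests compare.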
Low-coherence interferometry (LCI) is a powerful and widely used measurement approach in the fields of biomedicine, surface analysis, and imaging. Many techniques, such as optical coherence tomography (OCT) or Dispersion-Encoded LCI (DE-LCI), derive from LCI. This work focuses on the DE-LCI measurement approach for profilometry. Axial displacement in the measuring arm was estimated from extremely low-resolution DE-LCI spectrograms using an artificial intelligence (AI) based analysis technique, which proved effective even for spectrograms that partially fall below the Nyquist criterion. The presented estimation strategy treats the heavily distorted low-resolution data as a kind of “fingerprint” of the complete initial signal, which cannot be interpreted directly by classic deterministic models. It was shown that this resolution limitation can be exceeded under certain boundary conditions with the introduced artificial neural network topology. The benefits of the proposed AI model are demonstrated in a series of reference measurements of the surface topography of an uncoated Si reference object, with the spectral resolution varied throughout the process. The relation between the absolute axial resolution and the full measurement range was used to evaluate the final measurement dynamics. Robustness to noise and mechanical displacement, in particular displacement of the reference and typical detector noise, is presented and discussed for the proposed estimation approach. The described novel method overcomes a central instrumental limitation, namely the spectral resolution of the instrument used, while significantly increasing the measurement dynamics. This is crucial for DE-LCI sensor applications as well as for in-line and high-speed DE-LCI metrology.
Face recognition (FR) and license plate recognition (LPR) are crucial algorithms for the identification of humans and vehicles in applications such as surveillance, traffic monitoring, and access control. Advances in small single-board computers with high parallel processing power and the use of low-power Neural Processing Units (NPUs) inside embedded Systems on Chips (SoCs) enable real-time face detection (FD) and LPR at the edge. On the other hand, running multiple algorithms concurrently with high accuracy and prompt execution (high frame rates) remains a challenge and requires very efficient software and video analytics development. Both FR and LPR algorithms need two-stage processing involving detection and recognition. In this study, we propose a method that enables simultaneous LPR and face detection with associated landmark and quality information at the edge. The FD pipeline detects and tracks faces, extracts facial landmarks and quality scores to select appropriate faces for recognition, and then sends them to a face recognition server. The LPR algorithm consecutively performs detection and recognition on the embedded platform. An extended YOLO model is utilized for face selection, while pruned YOLO and LPRNet models are exploited for license plate detection and license plate reading, respectively. To enable real-time performance with high accuracy, optimized AI models and software architecture are used. As a result of this study, we obtain a high-performance, high-precision, real-time combined face/license-plate recognition system that can be very useful for surveillance and security applications.
Data assimilation is a well-established technique that combines computational models with observational data; it is now widely used in a variety of fields and holds great promise for many digital twin applications. It refers to the estimation of the state of a physical system from models and measurements by fitting models of physical systems to data. In this talk, we introduce our recent attempt at internal structure modeling of waveguide devices using data assimilation. The proposed approach evaluates the waveguide structure by fitting models of waveguide structures to a measured nonlinear spectral change induced by optical pulse propagation.
Water quality monitoring in sewer networks remains a technical challenge even though water pollution control has been a high priority for decades. Current water quality monitoring usually analyzes samples in laboratories, allowing only sporadic measurements, or uses sensors immersed in the wastewater, which leads to clogging and sensor fouling and thus to high maintenance costs. Both techniques therefore have serious limitations. Previous research showed that UV-Vis reflectance spectrometry can be used for non-contact monitoring of turbidity (TUR) and Chemical Oxygen Demand (COD), two key water quality indicators. Although spectrometers achieve high spectral resolution, their limited spatial field of view is problematic for highly inhomogeneous surfaces such as wastewater. In this study, we obtain beyond-state-of-the-art measurement accuracies by combining machine learning techniques with increased spatial field-of-view Multi-Spectral Imaging (MSI) whilst substantially reducing the spectral resolution. We designed and built a dedicated setup with a monochromatic camera and active illumination from thirteen LEDs covering the spectral range of 200-700 nm. We acquired and calibrated data on 27 samples with different concentrations of TUR and COD. Machine learning regression models were trained and evaluated on the extracted spectra. We tested Partial Least Squares (PLS), Support Vector Machine (SVM) and Random Forest (RF) regressors. PLS regression performed best, with excellent correlation coefficients (R²) of 0.99 for TUR and 0.93 for COD. We obtained similar results with the SVM algorithm (R² = 0.99 and 0.92), whilst RF scored lower (R² = 0.96 and 0.71).
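As a sketch of the regression step, the following implements a minimal PLS1 (NIPALS) model in NumPy on synthetic 13-band "spectra". The data, dimensions, and noise-free linear response are invented for the sanity check; in practice a library implementation such as scikit-learn's PLSRegression would be used:

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Minimal PLS1 (NIPALS) regression: sequentially extract latent
    components that maximize covariance with y, then deflate."""
    x_mean, y_mean = X.mean(0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc                     # covariance direction
        w /= np.linalg.norm(w)
        t = Xc @ w                        # scores
        tt = t @ t
        p = Xc.T @ t / tt                 # loadings
        q_a = (yc @ t) / tt
        Xc = Xc - np.outer(t, p)          # deflate X and y
        yc = yc - q_a * t
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return lambda Xnew: (Xnew - x_mean) @ beta + y_mean

# Synthetic "spectra" (13 LED bands, 27 samples) with a linear
# concentration response and no noise, as a sanity check.
rng = np.random.default_rng(1)
X = rng.normal(size=(27, 13))
y = X @ rng.normal(size=13)
predict = pls1_fit(X, y, n_components=13)
print(np.abs(predict(X) - y).max())       # ~0: full-rank PLS recovers OLS
```

With noisy real spectra the number of components is chosen by cross-validation, well below the number of bands, which is where PLS gains over ordinary least squares.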
Integrated optoelectronic devices represent a fundamental building block of hardware accelerators for photonic neural networks. Nanophotonic electro-optic modulators and detectors have significant performance advantages in power efficiency, communication bandwidth, and parallelism compared to conventional free-space photonics. Here, we present strategies and experimental validations of novel high-performance nanophotonic optoelectronic devices, involving heterogeneous integration of emerging materials into silicon photonic integrated circuits to exploit new functionality and device-scaling laws for efficient and ultrafast modulators, detectors, and photonic nonvolatile memory. Optoelectronic implementations of neural networks are demonstrated, significantly extending the spectrum of information processing capabilities.
Integrated optical phased arrays (OPAs), fabricated in advanced silicon-photonics platforms, enable manipulation and dynamic control of free-space light in a compact form factor, at low cost, and in a non-mechanical way. In this talk, I will highlight our work on developing OPA-based platforms, devices, and systems that enable chip-based solutions to high-impact problems in areas including augmented-reality displays, LiDAR sensing for autonomous vehicles, optical trapping for biophotonics, 3D printing, and trapped-ion quantum engineering.
Practical imaging systems form images with spatially-varying blur, making it challenging to deblur them and recover critical scene features. To address such systems, we introduce SeidelNet, a deep-learning approach for spatially varying deblurring which learns to invert an imaging system’s blurring process from a single calibration image. SeidelNet leverages the rotational symmetry present in most imaging systems by incorporating the primary Seidel aberration coefficients into the deblurring pipeline. We train and test SeidelNet on synthetically blurred images from the CARE fluorescence microscopy dataset, and find that, despite relatively few parameters, SeidelNet outperforms both analytical methods as well as a standard deblurring neural network.
We developed and implemented a deep optical neural network (ONN) design capable of performing large-scale training and inference in situ. For each elementary building block in the ONN, we introduce trainable parameters in a programmable device, weight mixing with a diffuser, and nonlinear detection on the camera for activation and optical readout. With automated reconfigurable neural architecture search, we optimized the architecture of deep ONNs that can perform multiple tasks at high speed and at large scale. The task accuracies achieved by our experiments are close to state-of-the-art benchmarks with conventional multilayer neural networks.
Face recognition has been widely implemented in public places for security purposes. However, face photos are sensitive biometric data, and their privacy is a common concern, so they often need to be protected via cryptosystems. Popular software-based cryptosystems are limited to short secret key lengths, posing a significant threat in the face of high-performance quantum computing. Recently, in order to achieve higher-level security, hardware-based optical cryptosystems have been investigated. However, due to the complexity of optical system designs, it is difficult to integrate the extensively studied optical double random phase encryption into current face recognition systems. Speckle-based cryptosystems, on the contrary, afford high-level security with high adaptivity, high speed, and low cost, using simpler optical setups. In this study, a speckle-based optical cryptosystem for face recognition is proposed, and encrypted face recognition is experimentally demonstrated. During encryption, a scattering ground glass is utilized as the only physical secret key, with a key length of 17.2 gigabits, to encrypt face images via random optical speckles at the speed of light. During decryption, a specially designed neural network, pre-trained to reconstruct face images from speckles with high fidelity, allows up to 98% accuracy in the subsequent face recognition process. Beyond face recognition, the proposed speckle-based optical cryptosystem can also be transferred to other applications owing to its high security, high adaptivity, fast speed, and low cost.
Holographic imaging and projection are increasingly used for important applications such as augmented reality [1], 3D microscopy [2] and imaging through optical fibres [3]. However, there are emerging applications requiring control or detection of phase in which deep learning techniques are used as faster alternatives to conventional hologram generation or phase-retrieval algorithms [4]. Although conventional loss functions such as the mean absolute error (MAE) or the mean squared error (MSE) can directly compare complex values for absolute control of phase, there is a class of problems whose solutions are degenerate up to a global phase factor but whose relative phase between pixels must be preserved. In such cases, MAE is not suitable because it is sensitive to global phase differences. We therefore develop a ‘global phase insensitive’ loss function that estimates the global phase factor between predicted and target outputs and normalises the predicted output to remove this factor before calculating the MAE. As a case study we demonstrate ≤ 0.1% error in the recovery of complex-valued optical fibre transmission matrices via a neural network. This global-phase-insensitive loss function offers new opportunities for deep learning-based holographic image reconstruction, 3D holographic projection for augmented reality, and coherent imaging through optical fibres.
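A minimal NumPy sketch of such a loss. Here the global phase is estimated as the angle of the inner product between target and prediction (the least-squares-optimal alignment) and removed before taking the MAE; the test fields are synthetic:

```python
import numpy as np

def phase_insensitive_mae(pred, target):
    """MAE after removing the best-fit global phase between two
    complex-valued fields."""
    # The global phase that best aligns pred to target (in the
    # least-squares sense) is the angle of <target, pred>.
    phi = np.angle(np.vdot(target, pred))
    return np.mean(np.abs(pred * np.exp(-1j * phi) - target))

rng = np.random.default_rng(0)
target = rng.normal(size=64) + 1j * rng.normal(size=64)
pred = target * np.exp(1j * 0.7)   # same field, shifted global phase

print(np.mean(np.abs(pred - target)))      # plain MAE: large
print(phase_insensitive_mae(pred, target)) # ~0: phase factor removed
```

A plain MAE penalizes the physically meaningless global phase shift, whereas the normalized loss correctly scores the prediction as perfect; in training, the same normalization is applied inside the loss so gradients ignore the degenerate direction.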
We describe an LSTM-based autoencoder for inversely designing an achromatic metalens comprised of cylindrical unit cells. The training data for our model has phase and transmission values corresponding to the heights and radii of each meta-unit. We use multiple data sequences (phase and transmission) to train the model and a multi-output model framework. The autoencoder is trained for 2500 iterations using the Adam optimizer with a learning rate of 0.001 and is subsequently used for inversely predicting the meta-unit dimensions at each radial position of the lens. Our model is validated via simulations as well as experiments.
The vast majority of medical endoscopes used today are based on optical fibers. Although most endoscopes are used to retrieve the image of the object at the distal end of the fiber, recent developments enable imaging the surroundings of the endoscope, for example with swallowable tethered capsules. However, these capsules are much thicker than the optical fiber itself and require mechanical rotation in order to scan their surroundings. Hence, the ability to image the surroundings of a standard single-core fiber would be a major improvement over current capabilities, since it would allow much more convenient and possibly faster operation of the device and would enable reaching places that are unreachable with currently available technology. A mechanism that may enable this ability is the Rayleigh scattering present in standard optical fibers, which scatters light propagating inside the fiber in all directions. In this work we discuss two tasks we have recently investigated towards imaging the surroundings of a standard step-index multi-mode fiber. The first is retrieving visual data that was input to the fiber from Rayleigh side-scattered light using deep learning. The second is focusing Rayleigh side-scattered light using wave-front shaping, a possible means to overcome the very low intensity of Rayleigh scattering.
Face pose estimation is essential for interactive remote video communication as well as human-computer interaction. In the case of a vision system for video communication using multiple cameras, not only precise but also fast estimation is required for the switching control of the camera views. However, most of the methods based on facial landmarks are not fast enough due to the calculation cost for the detection and alignment of the landmarks. This paper proposes a straightforward method to directly estimate the face pose from input camera images, using multiple camera views and deep learning.
The visual system of arthropods, the compound eye, has distinctive features such as a wide field of view, high-speed motion detection, and an infinite depth of field. These features have attracted researchers to build artificial compound eyes. However, the compound eye is limited in spatial resolution by structural constraints such as the number and size of the ommatidia that compose it. These constraints can also be found in existing artificial compound eyes. In previous work, a design method overcame these limitations and achieved resolution improvements by increasing the acceptance angle of the ommatidia and using numerical optimization based on compressive sensing (CS). However, prior information such as a sparsifying basis is needed to solve the numerical optimization problem, and obtaining its solution is computationally time-consuming. In this paper, we propose a deep learning-based artificial compound eye. The deep learning architecture takes a measurement from the compound eye as input and learns how to reconstruct the original image. The experimental results demonstrate that the proposed deep learning approach improves image reconstruction performance for the artificial compound eye.
Computer vision algorithms can quickly analyze numerous images and identify useful information with high accuracy. Recently, computer vision has been used to identify 2D materials in microscope images. 2D materials have important fundamental properties allowing for their use in many potential applications, including many in quantum information science and engineering. In order to use these materials for research and product development, single-layer 2D crystallites must be prepared through an exfoliation procedure and then identified using reflected light optical microscopy. Performing these searches manually is a time-consuming and tedious task. Deploying deep learning-based computer vision algorithms for 2D material search can automate the flake detection task with minimal need for human intervention. In this work, we have implemented a new deep learning pipeline to classify crystallites of 2D materials based on coarse thickness classifications in reflected-light optical micrographs. We have used DetectorRS as the object detector and trained it on 177 images containing hexagonal boron nitride (hBN) flakes of varying thickness. The trained model achieved a high detection accuracy for the rare category of thin flakes (< 50 atomic layers thick).
Recordings of local field potentials (LFPs) and calcium signals are strongly influenced by external stimuli. However, stimulation artifacts are difficult to isolate from the signal, and current studies generally screen out trial segments with high stimulus amplitude. To retain more information from the traces, we propose a method to remove the negative effects of deep brain stimulation (DBS) on LFPs. The method performs well on periodic artifacts with high amplitude, large pulse width, and high stimulation frequency, and may also alleviate artifacts introduced into neuronal calcium signaling trajectories during imaging.
A computational imaging technique using a lens and Lucy-Richardson-Rosen algorithm (LRRA) has been developed for 3D imaging. A deep 3D point spread function (PSF) was recorded in the first step. A single camera shot of an object was recorded next. Using the 3D PSF and the LRRA, the complete 3D information of the object was reconstructed. In this configuration, direct imaging and indirect imaging concepts co-exist: when the imaging condition is satisfied, an image of the object is directly obtained and in other cases it is indirectly obtained. The proposed single lens incoherent digital holography system will be attractive for numerous imaging applications.
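For context, the core of the LRRA is the classic Lucy-Richardson iteration; the sketch below implements only that iteration (1D, FFT-based, on invented synthetic data) and omits the Rosen nonlinear correlation step that distinguishes the LRRA:

```python
import numpy as np

def richardson_lucy(measured, psf, n_iter=100):
    """Classic Lucy-Richardson deconvolution via FFT (circular
    convolution).  The LRRA augments this multiplicative update with a
    nonlinear-correlation step, which is omitted here."""
    psf = psf / psf.sum()
    otf = np.fft.fft(psf)
    est = np.full_like(measured, measured.mean())   # flat positive start
    for _ in range(n_iter):
        conv = np.real(np.fft.ifft(np.fft.fft(est) * otf))
        ratio = measured / np.maximum(conv, 1e-12)
        # Correlate the ratio with the PSF (conjugate OTF = mirrored PSF).
        est = est * np.real(np.fft.ifft(np.fft.fft(ratio) * np.conj(otf)))
    return est

# Toy 1D scene: two point sources blurred by a Gaussian PSF.
n = 128
x = np.arange(n)
scene = np.zeros(n); scene[40] = 1.0; scene[70] = 0.5
psf = np.exp(-0.5 * (np.minimum(x, n - x) / 3.0) ** 2)   # centered at 0
blurred = np.real(np.fft.ifft(np.fft.fft(scene) *
                              np.fft.fft(psf / psf.sum())))

restored = richardson_lucy(blurred, psf)
print(np.linalg.norm(restored - scene) < np.linalg.norm(blurred - scene))
# True: iterations sharpen the measurement toward the scene
```

In the 3D case described above, the same update is run against the recorded 3D PSF at each depth plane, which is how a single camera shot yields the full 3D reconstruction.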