This PDF file contains the front matter associated with SPIE Proceedings Volume 13179, including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee information.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Artificially prepared microbial spores have a special core-shell structure and excellent physicochemical properties, and can attenuate electromagnetic waves through absorption and scattering. Applying bioaerosols to light extinction offers the advantages of safety and environmental friendliness. In this study, we selected three laboratory-prepared microbial spores and measured the reflectance spectra of compressed spore tablets in the 0.25-2.4 μm and 3-12 μm wavebands. Using the Kramers-Kronig (K-K) algorithm and Mie scattering theory, two important optical parameters, the complex refractive index (CRI) and the mass extinction coefficient of the microbial spores, were calculated and analyzed. We then measured the transmittance of the bioaerosols under dynamic conditions from the ultraviolet to the far infrared and obtained their mass extinction coefficients over the full optical band. The results showed that the optical properties of different microbial spores varied significantly in specific wavebands. This study helps to comprehensively understand and quantitatively characterize the general and specific optical properties of microbial spores, and provides new ideas for research on novel extinction materials.
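As a rough illustration of how a mass extinction coefficient relates to Mie theory outputs, the sketch below converts an extinction efficiency Q_ext into a per-mass coefficient for monodisperse spheres; all numbers are illustrative assumptions, not values from the study.

```python
def mass_extinction_coefficient(q_ext, density, diameter):
    """Mass extinction coefficient (m^2/kg) of monodisperse spheres:
    alpha = 3 * Q_ext / (2 * rho * d), with density in kg/m^3, diameter in m."""
    return 3.0 * q_ext / (2.0 * density * diameter)

# Illustrative numbers only: Q_ext ~ 2 in the large-particle limit,
# an assumed spore density of 1.3 g/cm^3, and a 1 um diameter.
alpha = mass_extinction_coefficient(2.0, 1300.0, 1.0e-6)  # ~2308 m^2/kg
```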
Optical feedback can cause accelerated degradation and catastrophic optical damage in high-power laser diodes, directly limiting their output power and lifetime. Changes in the near-field distribution caused by optical feedback are highly relevant to device reliability and merit study. In this study, the influence of optical feedback on the near-field distribution of a laser diode is investigated, together with its influence on device failure. A feedback-light testing system integrating power monitoring, spectral measurement, and near-field assessment was established. The investigation showed that feedback light induces instability in the near-field distribution, leading to temporal variations. Under strong feedback, a stable near-field peak emerged, and at even higher currents a clear correspondence was identified between the near-field peak and the point of failure. These findings offer valuable insight into the influence of optical feedback on the near-field distribution of laser diodes and their reliability.
To address the poor illumination uniformity and low efficiency of existing plant lamps, this paper designs a high-uniformity hyperboloid lens module based on the fundamentals of optics and the principles of non-imaging optics. The hyperboloid curvature of the free-form lens divides the light emitted by the LED into refracted and transmitted parts, which increases the coupling of the light and improves the optical efficiency and the uniformity of the exit surface. The plant lamp uses 440 1 W LEDs in 3535 packages as its light source; within the 170 mm × 350 mm lamp, the LEDs are arranged in 11 rows of 40. Ray-tracing simulation of the free-form lens module was carried out in TracePro. The results show that when the maximum diameter of the free-form lens is 15.0 mm, its height 9.16 mm, and its thickness 3.8 mm, the beam angle of the lamp is 90°, and, with the receiving surface 0.5 m from the light-emitting surface, an illumination distribution with over 90% uniformity and 90% optical efficiency is obtained over a 500 mm × 500 mm illumination area.
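One common way to quantify the illumination uniformity reported above is the min-to-average ratio of illuminance over the target plane; a minimal sketch with hypothetical illuminance values:

```python
import numpy as np

def illumination_uniformity(illuminance_grid):
    """Min-to-average uniformity on the target plane (1.0 = perfectly flat)."""
    e = np.asarray(illuminance_grid, dtype=float)
    return e.min() / e.mean()

# Toy 3x3 illuminance map in lux (hypothetical values, not simulation output)
grid = [[95, 100, 96], [98, 105, 99], [94, 101, 97]]
u = illumination_uniformity(grid)  # ~0.956, i.e. "over 90% uniformity"
```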
Lithium niobate is a multifunctional crystalline material exhibiting numerous excellent properties, including photoelectric, acousto-optic, and ferroelectric characteristics; it is widely used in optical communication, photonic integrated circuits, and high-frequency filters. Femtosecond laser processing offers extremely high peak power, a very small heat-affected zone, and flexible, controllable three-dimensional machining. To date, femtosecond lasers have fabricated a variety of structures in lithium niobate crystals, including photonic crystals, nanogratings, waveguides, and surface periodic stripes, so investigating the interaction mechanism between femtosecond lasers and lithium niobate is essential. This paper employs time-resolved reflective pump-probe techniques to obtain transient reflectivity evolution images on femtosecond-to-nanosecond time scales. It analyzes ultrafast processes such as photon-electron and electron-phonon coupling and laser-induced phase transitions at the material surface, revealing the ultrafast dynamics of near-infrared femtosecond laser interactions with lithium niobate single crystals of different orientations. These results are of significant importance for promoting the femtosecond-laser fabrication of photonic devices such as electro-optic modulators, frequency converters, and beam splitters on lithium niobate.
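Transient-reflectivity traces like those described are often summarized by a relaxation time; a minimal sketch fitting a single-exponential decay by log-linear least squares, on synthetic (not measured) data with an assumed 3 ps lifetime:

```python
import numpy as np

# Synthetic transient-reflectivity trace: single-exponential relaxation.
# Amplitude and lifetime are illustrative, not values from the experiment.
t_ps = np.linspace(0.1, 20.0, 200)
tau_true = 3.0  # ps
dr = 0.05 * np.exp(-t_ps / tau_true)  # Delta R / R

# Log-linear least squares recovers the relaxation time from the decay
slope, _ = np.polyfit(t_ps, np.log(dr), 1)
tau_fit = -1.0 / slope  # ~3.0 ps
```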
This paper presents the design of a non-contact laser confocal system and its mechanical structure, seeking a relatively simple optical layout while meeting the design specifications. The lens groups were designed and their imaging quality analyzed in Zemax. The results show that the focused spot diameter of the micro-focusing system is less than 3 μm with an RMS radius below 1 μm; the focused spot diameter of the detection and acquisition system is less than 15 μm with an RMS radius below 5 μm; and the MTF curves of all subsystems are close to the diffraction limit, indicating good optical transmission quality. In addition, the mechanical structure of the optical path is designed to keep the whole system compact and simple. Finally, physical experiments and software simulation show that the designed confocal optical system has good imaging performance.
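The RMS spot radius quoted for each subsystem is the root-mean-square distance of ray intercepts from their centroid on the image plane; a minimal sketch with hypothetical intercepts:

```python
import numpy as np

def rms_spot_radius(x, y):
    """RMS radius of ray intercepts about their centroid (same units as input)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return np.sqrt(np.mean(dx**2 + dy**2))

# Four hypothetical intercepts (um) forming a square around the centroid
r = rms_spot_radius([1, -1, 1, -1], [1, 1, -1, -1])  # sqrt(2) um
```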
Objects with high reflectivity are used in precision bearings, wafers, and cell-phone covers, but their quality is affected by morphological defects such as pits and scratches. Here, image acquisition and segmentation of the samples were performed with a double-plate shear laser interference (DPSLI) system and the ConvNeXt convolutional neural network, respectively. The DPSLI system can map a 40.4 mm × 29.6 mm sample within 15 μs. The defect detection results showed that training the ConvNeXt network for 5977 batches on 100 training samples and segmenting 273 actual samples on RTX-3060 hardware yields accurate classification of surface defects in images with a resolution of 540 × 428 within 15 ms. The detection accuracies for pits and scratches were 98.2% and 95.52%, respectively, and the overall defect-recognition accuracy was 95.6%. The method enables fast and accurate detection of surface defects on highly reflective objects.
In defect detection of objects with highly reflective surfaces, diffuse light sources suffer from reflected-light interference, which blurs the edges of defect regions. To address the difficulty that diffuse light sources have in highlighting surface defects on highly reflective objects, an LED planar-array light-source system is designed. It consists of four tilted planar light sources that produce highly directional, high-brightness striped structured light on the target surface, increasing the gray-value contrast between surface defects and the background. At the same time, to counter the intensity variation caused by the oblique off-axis illumination of the planar sources, a nonlinear constraint is used to optimize illumination uniformity along the tilted, non-focusing direction. Through software simulation and ray tracing, a striped structured-light scheme with a period of 12.5 mm over a 200 mm × 300 mm area was designed, and the corresponding planar-array light-source system was built for defect-detection experiments. The experimental results show detection accuracies of 96.8%, 91.6%, and 95.8% for pits, scratches, and abrasions, respectively.
We have demonstrated a 26.6 μJ microsecond GHz-tunable burst-mode pulsed laser. The burst has a duration of 5.5 μs, and the intra-burst repetition rate can be adjusted from 1 to 2 GHz. To mitigate amplified spontaneous emission and nonlinear effects, we implemented a synchronous pumping scheme in the secondary pre-amplifier and achieved a maximum energy output of 26.6 μJ. Furthermore, the envelope of the burst-mode pulses is kept uniform by a pre-compensating acousto-optic modulator.
In ICF research, CsI is the preferred photocathode material for X-ray streak cameras owing to its excellent properties, but the deliquescence of CsI photocathodes in air significantly reduces their quantum yield. In this paper, CsI photocathodes were prepared, and the morphology of fresh and deliquescent photocathodes was measured by SPM and XRD. The measurements showed that the thickness of the CsI crystals increased and that their surface changed from polycrystalline to crystalline and became sparse. The mechanism by which deliquescence affects the quantum yield of the CsI photocathode is analyzed: the changes in surface morphology and crystalline state lengthen the photoelectron transport paths and reduce the photoemission area, which are important factors in the attenuation of the CsI photocathode.
A terahertz refractive-index sensor with high, actively tunable sensitivity, governed by bound states in the continuum, is designed. The sensor is built from an asymmetric graphene double split-ring; by breaking the symmetry of the structure, the resonance is transformed into a high-Q leaky-mode resonance that is very sensitive to variations in the refractive index of the surrounding medium. The effects of the chemical potential of the graphene and of the refractive index and thickness of the analyte on the transmission spectrum were simulated numerically by the finite element method. The results show that the Q-factor of the graphene metasurface reaches 10^5, its sensitivity is 288 GHz/RIU, and its figure of merit (FOM) is 21. The proposed structure has potential applications in highly sensitive photonic sensors in the terahertz range.
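For such resonance-shift sensors, the figure of merit is conventionally the sensitivity divided by the resonance linewidth (FWHM); a minimal sketch consistent with the reported S = 288 GHz/RIU and FOM = 21 (the implied ~13.7 GHz linewidth is an inference, not stated in the abstract):

```python
def figure_of_merit(sensitivity_ghz_per_riu, fwhm_ghz):
    """FOM = S / FWHM, in 1/RIU, for a resonance-shift refractive-index sensor."""
    return sensitivity_ghz_per_riu / fwhm_ghz

# With S = 288 GHz/RIU, a FOM of 21 corresponds to a ~13.7 GHz linewidth
fom = figure_of_merit(288.0, 288.0 / 21.0)  # -> 21.0
```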
In this paper, a visible/near-infrared imaging-spectrometer optical system is designed for detecting seawater constituents, providing a useful monitoring tool for the protection of marine resources. Based on the application requirements of ocean-color detection instruments, the instrument parameters are determined and a high-resolution imaging-spectrometer system is designed. The front telescope adopts an off-axis three-mirror anastigmat without an intermediate real image, with a focal length of 590.75 mm and an entrance diameter of 118.15 mm. The system includes an innovative spectrometer design with 0.6× magnification: the de-magnifying design allows the telescope to operate at F/5 while the spectrometers are built in a more compact F/3 arrangement. The primary and tertiary mirrors use Zernike Fringe Sag surfaces to compensate for the aberrations introduced by departing from the classical concentric, equidistant Offner configuration. The whole optical system comprises a single telescope feeding two functionally identical spectrometers. The spectral range is 0.4-0.9 μm, the field of view is 19°, the spectral resolution is 5 nm, and the instantaneous field of view is 0.048 mrad; the MTF over the full field and band exceeds 0.8, close to the diffraction limit, the RMS radius is less than 4 μm, and the smile and keystone of the spectrometer are less than 10% of a pixel.
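Two of the quoted parameters can be cross-checked directly: the telescope F-number follows from the focal length and entrance diameter, and the 5 nm resolution over 0.4-0.9 μm implies the channel count; a quick sketch:

```python
def f_number(focal_length_mm, entrance_diameter_mm):
    """F-number of the telescope: F# = f / D."""
    return focal_length_mm / entrance_diameter_mm

fnum = f_number(590.75, 118.15)                # 5.0, matching the stated F/5
bands = int(round((0.9 - 0.4) / 0.005))        # 100 channels at 5 nm sampling
```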
Fiber Bragg grating sensors enable fully optical measurement unaffected by electromagnetic radiation and other interference. They are lightweight, compact, highly sensitive, and corrosion-resistant, and exhibit high structural stability and long-term durability. Because the measurand is encoded in the central wavelength of the reflected light, the measurement is insensitive to factors such as fiber microbending, light-source power fluctuations, and coupling losses. Furthermore, fiber Bragg gratings can be written directly into the fiber core, giving low insertion loss, easy full optical integration, good wavelength selectivity, and absolute wavelength encoding of the sensed information. They have therefore become a research hotspot in the sensor and fiber-optic fields both domestically and internationally.
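The wavelength encoding mentioned above rests on the Bragg condition, lambda_B = 2 * n_eff * Lambda; a minimal sketch with typical silica-fiber values (assumed, not from the text):

```python
def bragg_wavelength(n_eff, period_nm):
    """Bragg condition for a fiber grating: lambda_B = 2 * n_eff * Lambda."""
    return 2.0 * n_eff * period_nm

# Typical effective index and grating period for a telecom-band FBG
lam = bragg_wavelength(1.447, 535.0)  # ~1548.3 nm, in the C-band
```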
Projector calibration is essential in existing monocular structured-light measurement systems: its parameters are not only the basis of measurement, but their accuracy directly affects the measurement results. Because the projector cannot photograph the calibration plate directly, a camera is generally needed to give the projector that capability, making the calibration process relatively complex. In this paper, the projector is calibrated by computing the projector-image coordinates of the calibration-plate corner points using a color tessellated calibration plate and a method based on radial basis function interpolation. Only two photographs are needed to find the projector-image coordinates of the corner points, and the method offers high flexibility and accuracy. Its performance is verified experimentally with a calibration reprojection error of 0.0808 pixels; compared with the original method, the number of images used is reduced by 90% and the calibration time by 95%. The method can be used to calibrate projectors in structured-light systems.
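Radial basis function interpolation of the kind the method relies on can be sketched with a hand-rolled Gaussian RBF fit; the camera-to-projector coordinate mapping and point values below are toy stand-ins, not the paper's calibration data:

```python
import numpy as np

def rbf_fit(centers, values, eps=1.0):
    """Solve for Gaussian RBF weights so the interpolant passes through the data."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    phi = np.exp(-(eps * d) ** 2)
    return np.linalg.solve(phi, values)

def rbf_eval(x, centers, weights, eps=1.0):
    """Evaluate the fitted RBF interpolant at query points x."""
    d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
    return np.exp(-(eps * d) ** 2) @ weights

# Toy mapping from camera-image corner coordinates to projector x-coordinates
cam = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
proj_x = np.array([10.0, 110.0, 12.0, 112.0])
w = rbf_fit(cam, proj_x)
x_at_corner = rbf_eval(cam[:1], cam, w)[0]  # interpolant reproduces the data
```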
In the selective laser melting (SLM) process, the melt-pool temperature field reflects product quality, and an unstable temperature field can produce defects. This paper employs photodiodes for coaxial detection of the melt pool with a time resolution of 70 μs. A two-color thermometry method is used for temperature estimation, effectively eliminating the influence of factors such as emissivity. In experiments with TA1 material, optical errors induced by the F-θ lens and the galvanometer scanner were identified, and the introduction of a position coefficient successfully corrected them. Temperature fields under different laser powers were measured, validating the accuracy of the proposed method. The approach provides precise real-time temperature information, supporting subsequent closed-loop control of the temperature field.
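Two-color thermometry infers temperature from the ratio of signals at two wavelengths, so a wavelength-independent emissivity cancels (gray-body assumption); a minimal Wien-approximation sketch with hypothetical photodiode bands, not the paper's actual setup:

```python
import math

C2 = 1.4388e-2  # second radiation constant, m*K

def wien_ratio(T, lam1, lam2):
    """Forward model: intensity ratio I(lam1)/I(lam2) from Wien's law (lam in m)."""
    return (lam2 / lam1) ** 5 * math.exp(-C2 / T * (1.0 / lam1 - 1.0 / lam2))

def two_color_temperature(ratio, lam1, lam2):
    """Invert the Wien ratio for temperature; emissivity cancels if equal
    at the two wavelengths (gray-body assumption)."""
    return C2 * (1.0 / lam1 - 1.0 / lam2) / (5.0 * math.log(lam2 / lam1) - math.log(ratio))

# Round-trip check at 2000 K with hypothetical 700 nm / 900 nm bands
r = wien_ratio(2000.0, 700e-9, 900e-9)
T = two_color_temperature(r, 700e-9, 900e-9)  # recovers ~2000 K
```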
This study investigates the influence of truncation on measurement of the beam-quality β factor of Gaussian beams. Using an annular-energy approach, the paper analyzes the method of calculating the beam width from the half-radius used in the β factor. Different truncation ratios of Gaussian beams are considered, and coefficients for precise β-factor calculation are derived. A critical truncation ratio is identified beyond which diffraction effects are minimal, offering insight into beam measurement and production processes. The study provides key data and in-depth analysis for understanding the truncation-induced behavior of the beam-quality β factor of Gaussian beams, laying a foundation for further research.
Narrow-linewidth ultraviolet (UV) lasers are widely used in high-resolution spectroscopy. Owing to the limitations of gain materials, UV lasers are usually generated via nonlinear optical processes. In this paper, the noise characteristics of UV lasers generated in a cavity-enhanced second-harmonic generator (SHG) are studied both theoretically and experimentally. The evolution of the laser noise is calculated with a noise-ellipse rotation model. Owing to the interaction with the optical cavity, the noise of the two quadratures of the input field becomes intercoupled; therefore, the power spectral density (PSD) of the generated second harmonic can deviate significantly from that of the fundamental. This effect is verified by measuring the PSD of a 388 nm SHG pumped by a common diode laser. The results give insight into noise suppression for UV lasers.
In this paper, an inserted U-tapered fiber-optic probe is proposed; it has a simple structure and small size, is easy to manufacture and store, and can detect parameters in confined spaces. We first optimize the parameters of the linear-taper sensor through theoretical simulation and experiment, then fabricate the optimized linear-taper structure into a U-tapered fiber-optic probe and explore the influence of different bend radii on the interference spectrum. The results show that when the bend radius of the U-tapered probe is 1.75 mm, the sensor responds well to refractive index (RI), with an RI sensitivity of 386.23 nm/RIU and a linearity of 0.998; the measured temperature sensitivity is -0.04 nm/°C, so the temperature response is not significant. This work provides a feasible route to RI detection in narrow spaces.
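An RI sensitivity like the reported 386.23 nm/RIU is simply the slope of a linear fit of resonance wavelength against refractive index; a minimal sketch on synthetic points (not the paper's measurements):

```python
import numpy as np

# Hypothetical (RI, resonance-wavelength) pairs lying on a 386 nm/RIU line
ri = np.array([1.333, 1.343, 1.353, 1.363])
wavelength_nm = np.array([1550.00, 1553.86, 1557.72, 1561.58])

# Slope of the linear fit = sensitivity in nm/RIU
slope, intercept = np.polyfit(ri, wavelength_nm, 1)
sensitivity = slope  # ~386 nm/RIU for these synthetic points
```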
Titanium alloys are widely used as biomedical implant materials because of their outstanding mechanical properties and biocompatibility. The surface properties of implants strongly affect cell activity on the surface. To investigate the effect of surface nanostructures on wettability and biocompatibility, laser-induced periodic surface structures (LIPSS) were produced on titanium alloy with femtosecond lasers at 1030 nm and 343 nm. The surface morphology was observed by scanning electron microscopy (SEM), parameters such as depth and roughness were obtained by atomic force microscopy (AFM), and wettability was evaluated by static contact-angle measurement. The results revealed that LIPSS reduced the contact angle to varying degrees, depending on the period and depth of the ripples. MC3T3-E1 pre-osteoblasts were cultured on the structured surfaces to assess biocompatibility: different LIPSS showed different biocompatibility, and the period and depth of the LIPSS determined the orientation and elongation of the spreading cells. The alkaline phosphatase concentration of the cells suggests that LIPSS affect osteogenic differentiation.
Target Detection and Feature Recognition Technology
With the development of China's space station, rendezvous and docking between spacecraft and the station have become more frequent, and a smooth, safe docking speed is important for mission effectiveness. In this context, vision-based docking-speed measurement comes into view. Visual measurement is a commonly used non-contact method that measures the structure under test through optical measurement principles and equipment. We propose an improved ellipse-detection method based on arc-support line segments (LSs). The method first forms arc-support groups; then, using the prior knowledge that the elliptical cross target is always at the center of the image, it verifies the groups and sets a prior box to narrow the detection range. It next generates an initial ellipse set using two complementary methods, selects the salient ellipse candidates, and refines them as detection points, achieving efficient, high-quality ellipse detection. A docking-speed formula was established from the physical imaging model. The method is validated on our own docking-simulation video and on the public videos of the real Shenzhou XVI and Shenzhou XVII spacecraft dockings, achieving a recall of 0.9353 and 8.513 FPS on the simulation video, more efficient and of higher quality than other traditional ellipse-detection methods; the speed-measurement errors are 5.8% and 3.6% on the two public videos, improving the robustness of spacecraft docking-speed measurement.
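In the simplest pinhole-camera case, a docking-speed formula of the kind described scales pixel displacement by depth over focal length; a minimal sketch with hypothetical numbers (the paper's actual imaging model is not shown):

```python
def docking_speed(pixel_disp, depth_m, focal_px, dt_s):
    """Lateral speed from the pinhole model: real displacement = pixels * Z / f."""
    return pixel_disp * depth_m / focal_px / dt_s

# Hypothetical values: 4 px of target motion between frames 0.1 s apart,
# at 10 m range with a 2000 px focal length
v = docking_speed(4.0, 10.0, 2000.0, 0.1)  # 0.2 m/s
```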
Currently, in the field of ship perception, datasets lack 3D information fusing images and point clouds, and real datasets face difficulties such as collecting data under extreme working conditions and low labeling accuracy. In this paper, a synthetic dataset, SSP3D5000, for 3D ship perception is constructed using virtual synthesis technology. The dataset contains three data types: binocular images, depth images, and point clouds. A total of 5000 sets of binocular and depth images covering 90 ship models, together with 325 accompanying point clouds, were acquired under varying ambient lighting, weather, viewing angles, and sea-surface conditions. SSP3D5000 provides 12-dimensional labels including categories and 2D/3D bounding boxes. The virtual images and point clouds are evaluated with a variety of image-based 2D detection and point-cloud-based 3D detection baseline models. The results show the feasibility and effectiveness of synthetic data for ship perception, which can support various computer-vision tasks in this domain.
We revisit the relationship between attention mechanisms and large-kernel ConvNets in vision transformers and propose a new spatial attention named Large Kernel Convolution Attention (LKCA). It simplifies the attention operation by replacing it with a single large-kernel convolution. LKCA combines the advantages of convolutional neural networks and vision transformers, possessing a large receptive field, locality, and parameter sharing. We explain the superiority of LKCA from both the convolution and the attention perspectives, providing equivalent code implementations for each view. Experiments confirm that LKCA implemented from the two perspectives performs equivalently. We conducted experiments with LKCA on a wide range of ViT variants, consistently improving classification performance compared with various models and showcasing LKCA's capabilities. Our code will be made publicly available.
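The core operation LKCA substitutes for spatial attention is a single large-kernel depthwise convolution (one kernel per channel); a naive NumPy sketch with random stand-in weights, not the paper's actual implementation:

```python
import numpy as np

def large_kernel_depthwise_conv(x, kernel):
    """Per-channel ('depthwise') 2-D cross-correlation with one large kernel per
    channel; zero padding preserves the spatial size, as in LKCA-style mixing."""
    c, h, w = x.shape
    k = kernel.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros_like(x)
    for ch in range(c):
        for i in range(h):
            for j in range(w):
                out[ch, i, j] = np.sum(xp[ch, i:i + k, j:j + k] * kernel[ch])
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8))
kernel = rng.standard_normal((2, 7, 7))  # one 7x7 kernel per channel
y = large_kernel_depthwise_conv(x, kernel)  # shape preserved: (2, 8, 8)
```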
The HKG-07C infrared pulse sensor is used to sample pulse signals. The main ascending segment of each pulse signal is removed, and the main descending-trend segment is extracted. The least-squares method is then used to fit a polynomial to the extracted signal. Four features are first extracted, including the mean and variance of the fitting error and the sums of the absolute values of the ten largest and ten smallest fitting-error points. A differencing operation is then applied to the fitting error, and four further features are extracted, including the mean and variance of the differenced data and the corresponding sums of its largest and smallest absolute values. A one-dimensional convolutional neural network is constructed and trained and tested on the eight extracted features. The experimental results show that the accuracy of this pregnancy pulse-recognition method exceeds 85%.
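The residual-based feature extraction can be sketched as follows; the polynomial degree and the "top ten" count are assumptions, since the abstract does not fix them, and only the largest-magnitude residuals are summed here:

```python
import numpy as np

def residual_features(signal, degree=5, k=10):
    """Mean and variance of polynomial-fit residuals, plus the sum of the k
    largest-magnitude residuals (degree and k are assumed, not from the paper)."""
    t = np.arange(len(signal))
    coeffs = np.polyfit(t, signal, degree)
    resid = signal - np.polyval(coeffs, t)
    top_k = np.sort(np.abs(resid))[-k:].sum()
    return resid.mean(), resid.var(), top_k

# Synthetic stand-in for a descending pulse segment with small oscillations
sig = np.sin(np.linspace(0, 3, 200)) + 0.01 * np.cos(np.linspace(0, 40, 200))
mean_r, var_r, top_r = residual_features(sig)
```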
To address the challenges of significant variations in target scale, high computational complexity, low accuracy, and slow inference speed in target detection for remote sensing images, this study proposes a multi-scale dense target detection network based on an improved YOLOv5. First, the network adopts partial convolution in feature extraction to improve inference speed on remote sensing images. Second, the ASPPF module is introduced to address the loss of multi-scale feature information during feature fusion in the pyramid network. Finally, the Wise-IoU function is introduced to compute the IoU loss, which reduces the negative impact of densely arranged targets on the accuracy of horizontal-box detection. Experimental evaluations on the DOTAv1.0 dataset show that the improved model achieves an mAP@0.5 of 0.73, with improvements in parameters, FLOPs, FPS, and other aspects compared to YOLOv5.
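For context, Wise-IoU re-weights the standard IoU loss with a distance-based focusing term; only the base IoU that it builds on is sketched here (the focusing term itself is omitted, and the corner-coordinate box format is an assumption):

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).
    Wise-IoU adds a dynamic focusing weight on top of this value."""
    xa1, ya1, xa2, ya2 = box_a
    xb1, yb1, xb2, yb2 = box_b
    iw = max(0.0, min(xa2, xb2) - max(xa1, xb1))   # intersection width
    ih = max(0.0, min(ya2, yb2) - max(ya1, yb1))   # intersection height
    inter = iw * ih
    union = (xa2 - xa1) * (ya2 - ya1) + (xb2 - xb1) * (yb2 - yb1) - inter
    return inter / union if union > 0 else 0.0
```

For two unit-overlap 2x2 boxes, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7.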
This paper presents an improved stereo vision measurement system with enhanced binocular rectification. Traditional binocular vision algorithms do not constrain the two cameras to be perpendicular to the measurement baseline during the rectification phase, leading to inaccurate vertical distance measurements between two points. To address this issue, a binocular rectification method based on plane-normal constraints is proposed. First, camera calibration yields the initial correction transformation matrix. The correction transformation matrices of both cameras are then refined using the normal of the reference plane. Finally, the rectified images are obtained using the refined projection matrices, and the spatial coordinates of the measured points are computed through stereo matching. Experimental results demonstrate that the improved method significantly enhances measurement accuracy.
Marine target recognition and tracking are of great significance for achieving intelligent perception of the marine environment, avoiding ship collisions, and maintaining navigation safety. We propose a maritime multi-target tracking model based on the fusion of visual, AIS, and radar data to achieve real-time monitoring of navigating targets at sea. First, an image-based maritime target recognition model is established based on the YOLOv3 algorithm to automatically recognize navigating targets around the ship. Second, a maritime target tracking model is established based on the SORT framework to track multiple targets at sea in real time. Results from tests on a real ship show that the average accuracy, average detection time, average tracking accuracy, and tracking precision of the multi-target detection results under different weather conditions can provide effective technical support for multi-target tracking tasks at sea.
In the current field of autonomous driving, millimeter-wave radar serves as an important complement to optical sensors in Simultaneous Localization and Mapping (SLAM) technology. Due to its ability to penetrate visual obstacles such as dense smoke, millimeter-wave radar has become a key tool for localization and navigation in adverse weather conditions such as rain and snow. In particular, the emergence of 4D millimeter-wave radar has extended radar point clouds from two dimensions to three. However, research on 4D millimeter-wave radar in SLAM is still lacking. Due to its limited resolution and point cloud density, 4D millimeter-wave radar struggles to extract geometric features such as edges and planes, so contemporary approaches predominantly rely on spatial statistical distributions. However, these methodologies do not adequately exploit scattering features in synergy with the SLAM process. This paper proposes a SLAM algorithm based on a scattering angle feature model. In the algorithm's front end, scattering angle feature constraints are introduced to enhance semantic information recognition during scan matching. The paper presents a computational method for three-dimensional scattering angle features, which is applied in front-end scan matching. Finally, using 4D millimeter-wave radar data collected from real-world scenarios, we verify that scattering angle features improve SLAM performance and enhance the accuracy of pose estimation.
Aiming at the problem of low automation in pig farms, this paper proposes a new pig posture estimation method for breeding scenarios to support intelligent monitoring. First, video image data of indoor and outdoor scenes in pig breeding environments were collected and labeled, and a self-constructed pig posture estimation dataset was built. Second, with ResNet50, VGG16, and MobileNetV2 as backbone networks, three approaches based on coordinate regression, heat maps, and simple coordinate classification were compared experimentally, and the best-performing Simple Coordinate Classification (SimCC) algorithm was selected for extracting pig keypoints. Finally, we adopted the High-Resolution Network (HRNet) and HRFormer, which incorporates Transformer modules, as backbone networks and combined them with SimCC to form an effective pig pose estimation framework. The experimental results show that the mAP of HRFormer-SimCC reaches 83.2%, an average improvement of 7.2% over traditional CNN models and 0.4% over HRNet-SimCC, while its floating-point computation and parameter counts are only 45.05% and 36.48% of those of HRNet-SimCC. The method is therefore better suited to deployment in breeding environments and provides a theoretical basis for intelligent monitoring of pig farms.
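The core idea of SimCC is to treat keypoint localization as two 1-D classifications, one over discretized x bins and one over y bins. A minimal decoding sketch (the function name, bin width, and toy logits are assumptions; the real method also upscales the axes for sub-pixel accuracy):

```python
import numpy as np

def simcc_decode(x_logits, y_logits, bin_width=1.0):
    """SimCC-style decoding: argmax over the x-axis bins and over the
    y-axis bins gives the keypoint coordinate."""
    x = np.argmax(x_logits) * bin_width
    y = np.argmax(y_logits) * bin_width
    return x, y

# toy logits for one keypoint on 64-bin axes (peaks at bins 20 and 33)
xl = np.zeros(64); xl[20] = 5.0
yl = np.zeros(64); yl[33] = 5.0
```

Replacing a 2-D heat map with two 1-D vectors is what makes the head cheap enough to pair with heavier backbones such as HRFormer.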
With the emergence of deep learning methodologies, vision-based gesture recognition technology has continuously advanced. This paper primarily delves into four main stages of vision-based gesture recognition: gesture segmentation, gesture tracking, feature extraction, and gesture classification. It sequentially introduces pertinent techniques from representative literature spanning from 2018 to 2023. Based on this analysis, the current status of vision-based gesture recognition technology is examined, paving the way for predicting its future trends and developments.
To address missed detections and poor detection of small targets in autonomous driving scenarios, a road target detection algorithm based on an improved YOLOv8 is proposed. First, the backbone network is replaced with FasterNet, which combines a multi-scale attention mechanism and depthwise separable convolution to improve feature expression and the receptive field. Second, CBAM is integrated as the attention module, combining channel attention with spatial attention to form a new convolutional block structure for better feature fusion. Finally, to address the fact that the CIoU loss does not account for the mismatch between the ground-truth and predicted boxes, the Inner-SIoU loss is introduced, effectively improving inference accuracy. Experimental results on the public Udacity dataset show that the proposed algorithm improves detection accuracy by 2.9% while maintaining the same detection speed as the original algorithm.
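CBAM's channel-then-spatial gating can be sketched in a few lines. This is a deliberately simplified illustration, not the module as published: the real CBAM uses a shared two-layer MLP for the channel gate and a 7x7 convolution for the spatial gate, whereas here a single shared weight matrix and a plain mean/max sum stand in for them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam_sketch(x, w):
    """Simplified CBAM on a (C, H, W) feature map: a channel gate from
    pooled statistics through a shared weight w, then a spatial gate
    from channel-wise mean and max."""
    avg = x.mean(axis=(1, 2))                       # (C,) avg-pooled
    mx = x.max(axis=(1, 2))                         # (C,) max-pooled
    ch_gate = sigmoid(w @ avg + w @ mx)             # channel attention
    x = x * ch_gate[:, None, None]
    sp_gate = sigmoid(x.mean(axis=0) + x.max(axis=0))  # (H, W) spatial
    return x * sp_gate[None, :, :]

feat = np.random.default_rng(1).normal(size=(4, 6, 6))
out = cbam_sketch(feat, np.eye(4))
```

The output keeps the input shape, so the block can be dropped between any two convolutional stages.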
To fully perceive the environment in maritime navigation, multi-target detection is required. Commonly used methods for extracting candidate boxes, a pre-task of object detection in high-resolution sea-and-sky images, are inefficient and computationally intensive. This article combines Color Names nonlinear color mapping with gray-level co-occurrence matrix information to design handcrafted features. An SVDD-TSVM joint classifier is built to reduce the missed detection rate of foreground targets, and a grid-based candidate box extraction method is constructed. A self-made dataset and comparison methods are used for simulation verification. The results show that the missed detection rate of the proposed SVDD-TSVM joint classifier is somewhat lower than that of the comparison methods, and the proposed candidate box extraction method reduces the number of invalid candidate boxes by an average of 87% and the detection time by an average of 69.5%. The self-made dataset NAME-D and some code have been uploaded to https://github.com/guanhar/tolabin/.
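The gray-level co-occurrence matrix (GLCM) underlying the texture features counts how often a gray level i is adjacent to a gray level j for a fixed pixel offset. A minimal sketch (the offset convention and `levels` parameter are assumptions for illustration):

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for one pixel offset (dx, dy):
    m[i, j] counts pixel pairs where level i is followed by level j."""
    m = np.zeros((levels, levels), dtype=int)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < h and 0 <= j2 < w:
                m[img[i, j], img[i2, j2]] += 1
    return m

img = np.array([[0, 0, 1],
                [1, 2, 2],
                [2, 2, 3]])
m = glcm(img, levels=4)
```

Texture statistics such as contrast, energy, and homogeneity are then computed from the normalized matrix.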
With the continuous development of intelligent systems and analysis technology, the research and development of video surveillance systems with a degree of intelligence has received widespread attention in many fields. The detection and tracking of moving objects has become an important research topic because of its wide application prospects. Based on moving object detection technology, this paper studies an efficient and practical traffic flow detection algorithm. The software and hardware design of the detection method is completed on a system based on TI's TMS320DM6437 platform, and the algorithm is optimized for the characteristics of DSP devices.
Small object detection is a challenging problem in object detection. In practice, we found that detection results for a special kind of small object are very poor; we call it a camouflaged small object, whose color is similar to the surrounding environment, i.e., the contrast between the object and its surroundings is low. We therefore propose stretching the grayscale map to directly increase the contrast between the object and the surrounding environment. To effectively fuse the object features in the grayscale map with those in the original RGB image, we propose a dual-channel feature fusion module, DCF, which lets the features extracted from the grayscale map enhance those extracted from the original RGB image. We also constructed a camouflaged small object detection dataset, SSD: a smoking object detection dataset with a single object class, containing 4349 images and 4705 object instances. Our proposed DCF framework can be easily embedded into existing object detection frameworks. We ran experiments with several strong detectors, including Faster R-CNN, RetinaNet, and YOLOv8, and the results show that our method is effective for all of them.
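The grayscale stretch mentioned above is, in its simplest linear form, a remapping of the image's observed intensity range onto the full output range. A minimal sketch (the paper may use a different stretch curve; this linear version is an assumption):

```python
import numpy as np

def stretch(gray, lo=0.0, hi=255.0):
    """Linear grayscale stretch: map [min, max] of the image onto
    [lo, hi] to raise object/background contrast."""
    g = gray.astype(float)
    gmin, gmax = g.min(), g.max()
    if gmax == gmin:                      # flat image: nothing to stretch
        return np.full_like(g, lo)
    return (g - gmin) / (gmax - gmin) * (hi - lo) + lo

# low-contrast patch occupying only [100, 130] of the gray range
g = np.array([[100, 110], [120, 130]])
s = stretch(g)
```

A 30-level intensity difference becomes a 255-level one, which is exactly what helps a low-contrast camouflaged object stand out.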
Structured light 3D reconstruction technology is currently widely used in 3D measurement. In the past few years, there have been many studies on calibration and reconstruction methods for structured light systems, but papers often lack descriptions of implementation details. In this work, we combine monocular structured light with binocular vision and propose a new monocular structured light 3D reconstruction method. The pseudo-code of the core algorithm is provided to help researchers reproduce this work.
In this paper, drawing on domain knowledge of tower solar power stations, a nonlinear optimization model is established through physical and geometric equations and solved with an adaptive chaotic particle swarm optimization algorithm. The solution gives a heliostat mirror size of 7 m × 7 m, an installation height of 5 m, and a total of 1271 heliostats, with the annual average output power reaching a maximum of 59.42 MW, providing a reference for the layout optimization of heliostat fields.
Aiming at the low accuracy of intelligent apple fruit detection, an improved YOLOv8 is proposed. A multi-head self-attention mechanism (MHSA) is introduced to improve detection accuracy, and the model is verified on a public apple dataset. Compared with the original YOLOv8, mAP@0.5 increased by 1% and mAP@0.5:0.95 increased by 4.5%. Compared with the popular YOLOv5 and YOLOv7 algorithms, the experimental results show that the proposed algorithm achieves an mAP@0.5 of 95.1% and an mAP@0.5:0.95 of 54.3%, outperforming the comparison algorithms. This shows that the improved YOLOv8 locates apples with high precision and efficiency and can serve apple-picking robots.
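The MHSA block splits the channel dimension into heads, runs scaled dot-product attention in each, and concatenates the results. A minimal numpy sketch with identity Q/K/V projections (the real block learns those projections; the identity choice and toy sizes are assumptions):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mhsa(x, n_heads=2):
    """Multi-head self-attention over (tokens, channels) with identity
    Q/K/V projections: split channels into heads, attend, concatenate."""
    n, d = x.shape
    dh = d // n_heads
    outs = []
    for h in range(n_heads):
        q = k = v = x[:, h * dh:(h + 1) * dh]
        attn = softmax(q @ k.T / np.sqrt(dh))   # (n, n) attention weights
        outs.append(attn @ v)
    return np.concatenate(outs, axis=1)

tokens = np.random.default_rng(2).normal(size=(5, 8))
y = mhsa(tokens)
```

Each output token is a weighted mixture of all input tokens, which is the global context that a convolution-only backbone lacks.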
Accurate detection of water surface targets using radar is of great significance for ensuring water traffic safety. Traditional detection methods are often limited by complex and changing water environments and clutter interference, resulting in missed and false detections of targets. To address this problem, a multi-frame radar target detection method based on target association is proposed. First, the noise in the radar images is removed using preprocessing techniques. Then, a constant false alarm rate (CFAR) method is used for single-frame detection, resulting in candidate targets. Next, taking into account the randomness of clutter, a sliding window approach is used to perform target association within each window. By utilizing the association of the targets, strong targets, potential targets, and clutter are identified from the candidate target set. For the potential targets, their signal is enhanced by accumulating multiple frames. Ultimately, it achieves precise detection of radar targets. In real ship detection experiments, the proposed detector has demonstrated superior performance compared to traditional radar detectors and visual detectors.
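The single-frame CFAR step can be sketched with the classic cell-averaging variant on a 1-D range profile. This is a generic CA-CFAR, not necessarily the paper's exact detector; the guard/training sizes and threshold scale are illustrative assumptions:

```python
import numpy as np

def ca_cfar(x, guard=2, train=8, scale=3.0):
    """Cell-averaging CFAR: for each cell, average the training cells
    on both sides (skipping guard cells) and declare a detection when
    the cell exceeds scale * local noise average."""
    n = len(x)
    hits = np.zeros(n, dtype=bool)
    for i in range(guard + train, n - guard - train):
        left = x[i - guard - train:i - guard]
        right = x[i + guard + 1:i + guard + 1 + train]
        noise = np.mean(np.concatenate([left, right]))
        hits[i] = x[i] > scale * noise
    return hits

profile = np.ones(64)
profile[30] = 20.0          # one strong target in flat clutter
det = ca_cfar(profile)
```

The multi-frame association stage then decides which of these single-frame hits are persistent targets rather than clutter spikes.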
Deep aspherical surfaces are often tested using computer-generated holograms (CGHs). This method measures surface shape with high accuracy, but a unique CGH must be designed for each asphere. To address this problem, this paper proposes using a multifocal lens to measure deep aspherical surface shapes. First, the key parameters of the multifocal lens are calculated; second, simulation experiments on detecting aspherical surfaces with the multifocal lens are carried out; finally, surface shape measurements of an aspherical surface are performed with the multifocal lens. The proposed method can effectively measure deep aspherical surface shapes without requiring an individual compensator for each surface.
Image dehazing is an important task in computer vision, transforming unclear images taken in hazy weather into clear ones. Some deep-learning-based dehazing models struggle to balance the convolutional kernel's receptive field against the number of parameters when improving performance. We propose dilated convolutional kernels with a kernel size of 13 to balance the receptive field and the parameter count. Furthermore, some deep-learning-based dehazing methods employ many channel attention and spatial attention mechanisms, which can share a common fully connected layer. Our proposed fused attention mechanism incorporates the effects of both channel and spatial attention while reducing the number of parameters. Combining these two modules, we introduce a lightweight attention-mechanism dehazing network (LAMNet). The network demonstrates effective dehazing results while maintaining a relatively low parameter count.
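The receptive-field/parameter trade-off that dilation buys can be made concrete with the standard effective-kernel formula (a general fact about dilated convolution, not a detail from the paper):

```python
def effective_kernel(k, d):
    """Effective span of a k-tap convolution with dilation d:
    k + (k - 1) * (d - 1).  Dilation enlarges the receptive field
    without adding weights."""
    return k + (k - 1) * (d - 1)

# a 5-tap kernel at dilation 3 spans the same 13 pixels as a dense
# 13-tap kernel, with 5 weights instead of 13
span = effective_kernel(5, 3)
```

This is why a "size 13" receptive field need not cost 13 (or 13 x 13) parameters per channel.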
Target recognition is an essential line of study in computer vision, since it helps recognize and localize specific target objects in images. Scene context has been found to influence target recognition performance; what role do object and background orientation play in the recognition process? In the present study, consistent and inconsistent objects were shown in upright and inverted scenes. In the experiments, whether scenes were upright or inverted, response times for object recognition were shorter in consistent scenes than in inconsistent ones. When the scenes were consistent and the object orientation upright, background orientation made a significant difference, with response times for an upright background significantly lower than for an inverted one. Meanwhile, when the scenes were inconsistent and the background orientation upright, upright objects yielded considerably shorter reaction times than inverted objects. Taken together, these findings indicate that the orientation properties of both the object and the background influence object recognition in a scene.
An infrared dim target detection method based on the human visual system and low-rank matrix factorization is proposed in this paper. First, the sparse component of a small-target infrared image is obtained through fast matrix decomposition based on weighted scene priors, achieving a preliminary screening of small-target regions. Then, using prior knowledge of the shape and contrast distribution of small targets, a three-layer sliding window with oversampled sub-windows is applied to further suppress non-target areas in the sparse-component image. Finally, an adaptive threshold segmentation method is applied to accurately extract the small targets. The experimental results show that, while preserving good real-time performance, the proposed approach outperforms traditional infrared small target detection methods in terms of BSF and SCRG.
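The final adaptive-threshold step is commonly a mean-plus-k-sigma rule on the residual image. A minimal sketch under that assumption (the paper's exact threshold form is not specified here):

```python
import numpy as np

def adaptive_threshold(img, k=3.0):
    """Adaptive threshold segmentation as used at the end of many
    small-target pipelines: keep pixels above mean + k * std."""
    t = img.mean() + k * img.std()
    return img > t

frame = np.zeros((16, 16))
frame[8, 8] = 10.0          # dim point target on a flat background
mask = adaptive_threshold(frame)
```

Because the threshold adapts to the frame's own statistics, the same rule works across scenes with different clutter levels.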
A swept source optical coherence tomography (SS-OCT) imaging system is proposed. The swept source used in the system is a Fourier-domain mode-locked (FDML) laser which has a narrow instantaneous linewidth due to the nonlinear spectrum narrowing effect generated in the FDML laser ring cavity. Experimental results are presented, which demonstrate the PSFs diagram corresponding to different imaging depths and show the two-dimensional imaging of the cover slip.
Hyperspectral image (HSI) data consist of images with numerous contiguous spectral bands, enabling extensive applications in remote sensing. Recent approaches based on the Vision Transformer (ViT) have achieved remarkable performance in HSI classification tasks due to their ability to extract global spatial features and model long-range dependencies. However, ViT has a complex network structure, is challenging to train, and lacks adequate consideration of the local spatial and spectral receptive fields in hyperspectral data. To address these problems, we propose a lightweight network model, the Groupwise Separable Convolution and Vision Transformer (GSCViT). First, we introduce a parameter-free attention for spectral calibration (SC). Then, we design a novel convolutional approach named Groupwise Separable Convolution (GSC), which greatly reduces the number of convolutional kernel parameters and effectively captures local spatial-spectral information in HSI data. In addition, we employ Groupwise Separable Multi-Head Self-Attention (GSSA) to replace the traditional Multi-Head Self-Attention (MSA) in ViT, which can simultaneously attend to local and global spatial features in HSI with a lower computational burden. Experiments on two benchmark HSI datasets demonstrate that our GSCViT model achieves excellent accuracy with relatively small training samples and outperforms several existing HSI classification algorithms.
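The parameter saving that separable convolutions provide, which groupwise separable convolution builds on, reduces to simple counting (a general fact about the decomposition, not a figure from the paper):

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution."""
    return c_in * c_out * k * k

def separable_params(c_in, c_out, k):
    """Depthwise (c_in * k * k) plus pointwise (c_in * c_out) weights."""
    return c_in * k * k + c_in * c_out

std = conv_params(64, 64, 3)        # dense 3x3 conv, 64 -> 64 channels
sep = separable_params(64, 64, 3)   # depthwise + pointwise factorization
```

For 64-to-64 channels with a 3x3 kernel, the dense convolution needs 36864 weights versus 4672 for the separable factorization, roughly an 8x reduction, which is what makes the HSI model lightweight.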
Optical projection tomography (OPT) is a three-dimensional (3D) imaging technique for biological samples, capable of visualizing tissues, embryos, and organs at the 1 mm to 10 mm scale. Filtered back projection (FBP) is an extensively used 3D reconstruction algorithm for OPT with densely sampled data from all view angles. In-vivo OPT can reduce inspection time by using equally spaced sparse-angle projections, mitigating the side effects of phototoxicity and anesthetics. This work compares sparse-angle OPT reconstruction results from different algorithms, including the FBP algorithm and two compressive sensing (CS) algorithms, at different projection numbers. We also built an OPT testbed to verify these algorithms with experimental data. The CS algorithms yield reconstructed images with fewer artifacts than the FBP algorithm, and their advantage becomes more obvious as the projection number is reduced.
LiDAR point cloud data plays a vital role in autonomous driving systems by enabling essential 3D perception tasks such as 3D object detection and segmentation. However, the scarcity of labeled LiDAR data hampers the development of robust deep learning algorithms for these tasks. Data augmentation has been widely used to increase labeled data in various ways, such as geometric transformation, mixup, and inserting synthetic objects. In this paper, we focus on exploring more effective online LiDAR data augmentation techniques. We propose OL-Aug, which contains two online augmentation modules, Swap-GT and GT-Aug++, to enhance the realism and usefulness of augmented data. Unlike previous offline LiDAR data augmentation approaches, our Swap-GT module swaps objects in the current scene with objects of the closest location and size from an object database in an online manner. In addition, the GT-Aug++ module not only inserts objects from the database but also removes the occluded background points. To evaluate the effectiveness of OL-Aug, we conduct experiments on the KITTI dataset for 3D object detection. The results demonstrate that OL-Aug outperforms previous state-of-the-art LiDAR data augmentation methods.
The multispectral imaging system captures images under multiple light sources. As the camera switches between light sources, the wavelength of the light changes, shifting the camera's focal length and thus the information captured in the image. To address this issue, this paper analyzes existing evaluation functions and modifies the Tenengrad function to extract image gradient information from multiple directions. The paper then proposes the SIFTQuad_Tenen image clarity evaluation function, which is combined with the SIFT feature point extraction algorithm. Experiments were conducted with three light sources: red, green, and blue. The resulting clarity evaluation curves and related indicators were compared with those of other evaluation functions. The results show that the proposed evaluation function performs well under all three lighting conditions, with better stability and higher sensitivity than the other evaluation functions.
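The baseline Tenengrad measure that the paper modifies sums squared Sobel gradient magnitudes; a sharper (better-focused) image scores higher. A minimal two-direction sketch (the paper's multi-direction extension and SIFT coupling are not reproduced here):

```python
import numpy as np

def tenengrad(img):
    """Classic Tenengrad focus measure: sum of squared Sobel gradient
    magnitudes over the image interior."""
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    sy = sx.T
    h, w = img.shape
    score = 0.0
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            gx = np.sum(patch * sx)
            gy = np.sum(patch * sy)
            score += gx * gx + gy * gy
    return score

sharp = np.kron(np.eye(2), np.ones((4, 4)))   # blocky image, strong edges
flat = np.full((8, 8), sharp.mean())          # defocus limit: no edges
```

Sweeping the focus motor and maximizing this score is the basic autofocus loop the clarity function plugs into.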
This study applies hyperspectral imaging technology and machine learning methods to classify and identify whether tobacco leaves have undergone mold contamination. Visible-near-infrared hyperspectral imaging was employed, and various preprocessing techniques, including normalization, standard normal variate (SNV), multiplicative scatter correction (MSC), first derivative (FD), and Savitzky-Golay convolutional smoothing (SG), were applied to the spectral data. Feature wavelength selection was carried out using the successive projections algorithm (SPA) and principal component analysis loadings (PCA loadings). Classification models were built using random forest (RF), Softmax, and support vector machine (SVM). Among the preprocessing methods, SNV was identified as the optimal spectral preprocessing technique. The RF model built on feature wavelengths selected by SPA demonstrated the best performance, with training and testing accuracies of 98.82% and 98.64%, respectively. The combination of hyperspectral imaging technology with the SPA-RF model proved effective in accurately classifying and identifying mold contamination in tobacco leaves.
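SNV, the preprocessing step the study found optimal, is a standard per-spectrum transform: each spectrum is centered by its own mean and scaled by its own standard deviation, removing multiplicative scatter effects. A minimal sketch:

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    mu = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, keepdims=True)
    return (spectra - mu) / sd
```

After SNV, every spectrum has zero mean and unit standard deviation, so downstream models compare spectral shape rather than baseline offset or scale.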
Ceramic cultural relics are an important way for people to understand and engage with history. Efficient classification of large numbers of ceramic relic fragments through deep learning methods is therefore of great research significance. Current classification methods suffer from low accuracy and complex network structures. This article therefore proposes HySiam, a contrastive learning method based on a dual-channel attention mechanism. Firstly, various data augmentation strategies, such as random cropping and flipping of input images, are used to enrich the semantic features of the data. Secondly, this article proposes a plug-and-play dual-channel attention mechanism, HAC (Hybrid Attention in CNN), which improves network performance and reduces the number of model parameters when combined with convolutional neural networks to extract semantic features. Finally, a mean-square loss function is used for iterative contrastive-learning optimization. The experimental results show that, compared with unsupervised learning methods such as SimSiam and SimCLR, the classification accuracy of our method on CRMI, the ceramic microscopic image dataset collected in this paper, improves by up to 3%, reaching 96.8%. In addition, adding the HAC module to the supervised classification networks ResNet and ConvNeXt and performing linear classification on the CRMI dataset improves accuracy by 2.4% and 0.8%, respectively.
Brain tumors can be life-threatening. Early detection and accurate determination of the type and location of a brain tumor are crucial for timely intervention and for saving lives. In areas with limited medical facilities and physician resources, obtaining a timely diagnosis from medical experts can be challenging, resulting in missed treatment opportunities. Using the medical Internet of Things for remote diagnosis of brain tumors, including automatic diagnosis, is one solution to the global imbalance in the distribution of equipment and expert resources. It is therefore crucial to develop a robust and highly accurate intelligent diagnosis system for brain tumors. Acquiring and annotating brain tumor MR image data is time-consuming and expensive owing to the large image sizes and the limited annotated samples available; these factors pose significant challenges to model robustness and accuracy. To address these issues, we propose a new deep semi-supervised learning approach that fully utilizes unlabeled sample data. Data augmentation is used to increase the training data, avoid overfitting, and improve the model's generalization ability. A new sliding-window feature extraction method avoids resizing images, which can cause small features to be lost or overlooked, so that small, hard-to-recognize brain tumor lesions can be diagnosed accurately. The feature extraction backbone introduces the Convolutional Block Attention Module (CBAM), enabling the network to exploit image information in both the spatial and channel dimensions and enhancing its perception of key features. Experiments on publicly available brain tumor datasets validate the advantages of our method.
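The sliding-window idea is to process fixed-size crops of the full-resolution image rather than downscaling it. A minimal sketch of patch extraction, with `win` and `stride` as illustrative parameters (the paper's exact window scheme is not given here):

```python
import numpy as np

def sliding_windows(img, win, stride):
    """Extract fixed-size patches instead of resizing the whole image.

    Keeping the native resolution preserves small structures that a
    global resize would blur away. `win` and `stride` are illustrative.
    """
    H, W = img.shape
    patches = [img[r:r + win, c:c + win]
               for r in range(0, H - win + 1, stride)
               for c in range(0, W - win + 1, stride)]
    return np.stack(patches)
```

Each patch can then be fed to the backbone at its original resolution, so small lesions occupy enough pixels to be detected.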
This research paper explores the application of image enhancement processing algorithms in the context of computer art education. The paper acknowledges the significance of high-quality images in art education and addresses the challenges in enhancing image quality due to factors like equipment performance, environmental conditions, and photographic techniques. It delves into various image enhancement methods, such as those based on physical models for dehazing and underwater image enhancement, including the dark channel prior theory and methods that do not require prior knowledge of camera parameters. The study also covers the importance of objective and precise evaluation of image quality changes before and after processing, highlighting the necessity for reliable image quality evaluation methods. It introduces the Unsupervised Low-Light Enhancement Algorithm (ULEA) based on attention mechanisms, detailing its network structure, loss functions, and the impact of these functions on low-light image enhancement processing. Through experimental research and comparative analysis, the paper demonstrates the effectiveness of ULEA in enhancing low-light images, significantly improving overall brightness and texture detail compared to traditional and deep learning-based methods. In conclusion, the paper presents a compelling case for the practical application of advanced image enhancement algorithms in computer art education, demonstrating their effectiveness in enhancing the visual quality of images and their applicability in a variety of conditions.
A digital micromirror device (DMD) based lithography system, which generates the mask pattern via a spatial light modulator, is increasingly applied in micro-nano fabrication due to its high flexibility and low cost. However, the exposure image is subject to distortion because of the optical proximity effect and non-ideal system conditions. Correcting the mask pattern with a calibrated imaging model is an essential approach to improving the image fidelity of a DMD-based lithography system. This paper introduces an imaging model calibration method for the DMD-based lithography testbed established by our group. The error convolution kernel and the point spread function in the imaging model are optimized using the batch gradient descent algorithm to fit a set of training data that represent the non-ideal imaging behavior of the testbed. Based on the calibrated imaging model, the steepest descent algorithm is used to correct the mask pattern, thereby improving the image fidelity of the testbed. Experiments demonstrate the effectiveness of the proposed model calibration method and show that, within a certain range, the size of the error convolution kernel significantly influences the accuracy of the calibrated imaging model. Finally, the effectiveness of the mask correction method is confirmed by experimental results.
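Fitting a convolution kernel to measured data by batch gradient descent reduces to linear least squares, since convolution is linear in the kernel. A 1D toy version of that fitting step (the paper calibrates a 2D kernel plus a point spread function; this sketch only shows the batch-gradient mechanics):

```python
import numpy as np

def calibrate_kernel(mask_signals, measured_signals, ksize, lr=0.2, iters=500):
    """Fit a 1D error convolution kernel by batch gradient descent.

    Writes cross-correlation as a matrix product so measured = X @ k for
    the true kernel; gradient descent on 0.5*MSE then recovers k. A 1D
    illustrative analogue of the paper's 2D calibration.
    """
    X = np.vstack([np.lib.stride_tricks.sliding_window_view(x, ksize)
                   for x in mask_signals])
    y = np.concatenate(measured_signals)
    k = np.zeros(ksize)
    for _ in range(iters):
        grad = X.T @ (X @ k - y) / len(y)   # gradient of 0.5 * MSE
        k -= lr * grad
    return k
```

With enough training pairs, the recovered kernel matches the one that generated the data, which is exactly the property the calibrated imaging model needs before mask correction.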
In the past year or two, various implicit representations have made significant breakthroughs in reconstructing heads, faces, and objects; for example, an implicit digital avatar can be created from a video. However, such methods have an obvious problem: they suffer from poor compatibility and editability. The method we propose effectively solves this problem, producing a high-fidelity explicit geometric head avatar from the same input data. Because the avatar has explicit geometry, it can be edited by changing vertex positions and texture colors, making the model convenient to modify. The explicit geometry also allows the avatar to be easily used in various rendering engines and frameworks. At the same time, purely explicit geometric methods naturally suffer from limited expressive ability. Our method therefore combines explicit and implicit representations, using probability theory and related tools to build a model that retains the advantages of both, and it indeed performs well where each individual approach falls short.
Digital humans are extensively utilized across various industries, including gaming, films, and AR/VR. The reconstruction of the human head from images or videos is a significant area of research. Existing methods could generally be categorized into two groups. One approach involves fitting a 3D Morphable Model (3DMM), which offers fast reconstruction results but often lacks precision and struggles with accurately capturing the hair region. The other approach employs neural implicit representations to model the human head, resulting in more detailed geometry but requiring a lengthy optimization process. Moreover, both methods have limitations in terms of practical application due to specific constraints on the data format used. To address these challenges, we propose a hybrid methodology. Our approach utilizes a triangular mesh to represent head geometry and optimizes per-vertex offsets while employing a well-designed neural appearance field to capture the texture. Leveraging this representation, we develop a differentiable renderer and perform joint optimization of geometry and texture. Experimental results demonstrate that our method achieves high-precision head reconstruction within 5 minutes based on head videos captured under arbitrary lighting conditions. The reconstruction encompasses the hair region, and all the required data can be captured exclusively using the front-facing camera of a smartphone.
Image matting is a widely used image processing technique that aims to accurately separate the foreground from an image. However, this is a challenging and ill-posed problem that demands additional input, such as trimaps and background images, to provide prior knowledge. The manual annotation of trimaps requires substantial labor, limiting the application of trimap-based methods. Some trimap-free methods explore alternatives with low labor requirements by utilizing captured background images, including background-based methods. However, the quality of alpha mattes predicted by trimap-free methods still falls short of that of trimap-based methods. To reduce the performance gap between background-based and trimap-based methods, we present the Trimap Generation from Background Image (TG-BG) method, which generates trimaps from the input image and a captured background image. It provides an economical solution that facilitates the application of trimap-based methods, allowing low-cost, high-quality alpha matte predictions. TG-BG leverages a ViT backbone for feature extraction and employs the Image and Background Detail Fusion Stream (IBDFS) to capture multi-scale detail information. The introduction of a foreground impact loss encourages the network to pay more attention to the foreground of the image. We validate the trimap prediction performance of TG-BG by comparing the alpha matte quality obtained by background-based methods with that obtained by trimap-based methods integrated with TG-BG. The experimental results demonstrate that TG-BG can generate high-quality trimaps from a background image, and trimap-based methods integrated with TG-BG outperform the state-of-the-art background-based methods on four alpha matte quality metrics.
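For readers unfamiliar with the task's inputs and outputs: a trimap labels each pixel as definite foreground, definite background, or unknown. The naive baseline TG-BG improves on is simple background subtraction with two thresholds, sketched below (this is only an illustrative baseline; TG-BG itself is a learned network, and the threshold values are arbitrary):

```python
import numpy as np

def trimap_from_background(img, bg, fg_thresh=40, unk_thresh=10):
    """Naive trimap via thresholded background subtraction.

    Illustrates the task's input/output only; not the TG-BG method.
    """
    diff = np.abs(img.astype(np.float64) - bg.astype(np.float64))
    if diff.ndim == 3:                           # reduce color channels
        diff = diff.max(axis=2)
    trimap = np.full(diff.shape, 128, np.uint8)  # unknown band
    trimap[diff >= fg_thresh] = 255              # confident foreground
    trimap[diff <= unk_thresh] = 0               # confident background
    return trimap
```

The generated trimap can then be fed to any trimap-based matting method, which is the integration scheme the abstract evaluates.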
To improve the utilisation of large numbers of unlabelled samples and the accuracy of classification from only a few labelled ship images, this paper proposes a semi-supervised ship image classification network based on double-threshold FixMatch (DT-FixMatch). Firstly, a double threshold is introduced into FixMatch: the model predicts the unlabelled images after weak augmentation; for results whose confidence exceeds the high threshold, the predicted category is retained and converted into a pseudo-label, while for results above only the low threshold, the softmax output of the weakly augmented prediction is compared directly with the model's softmax output for the strongly augmented version of the same image and a mean square error (MSE) loss is computed, allowing low-confidence samples to be fully utilised. Secondly, to reduce the influence of noisy pseudo-labels, we apply label smoothing and consistency regularization. Additionally, we incorporate an improved attention mechanism, I_CBAM, into the backbone network, which enhances ResNeXt's ability to extract potentially critical features from fuzzy ship images. Experimental results on DataCastle's public dataset indicate that the model achieves a classification accuracy of 92.86%, a precision of 87.68%, and an F1-score of 81.67% when using only 50 labelled images in each of the ten ship categories.
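The double-threshold rule splits each unlabelled batch into a high-confidence band (hard pseudo-label plus cross-entropy on the strong view) and a mid-confidence band (MSE between the two softmax outputs). A numpy sketch of that selection logic, with illustrative threshold values rather than the paper's:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def dt_fixmatch_losses(weak_logits, strong_logits, tau_hi=0.95, tau_lo=0.6):
    """Double-threshold pseudo-labeling on a batch of unlabelled images.

    Above tau_hi: hard pseudo-label + cross-entropy on the strong view.
    Between tau_lo and tau_hi: MSE between weak and strong softmax
    outputs, so low-confidence samples still contribute. Thresholds are
    illustrative, not the paper's values.
    """
    p_w, p_s = softmax(weak_logits), softmax(strong_logits)
    conf = p_w.max(axis=1)
    pseudo = p_w.argmax(axis=1)
    hi = conf >= tau_hi
    lo = (conf >= tau_lo) & ~hi
    ce = -np.log(p_s[hi, pseudo[hi]] + 1e-12).mean() if hi.any() else 0.0
    mse = ((p_s[lo] - p_w[lo]) ** 2).mean() if lo.any() else 0.0
    return ce, mse
```

Samples below the low threshold contribute to neither loss, which is what keeps very uncertain predictions from injecting noise.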
With the wide application of large-diameter optical components, the low measurement efficiency of single-point chromatic confocal systems makes it impossible to inspect the surface defects of such components efficiently. Line-scanning chromatic confocal technology has been widely adopted because it enables fast, low-time-cost measurements. This study designs and analyzes a line-scanning confocal system for optical component defect inspection based on the chromatic confocal principle. Firstly, the basic principle of the line-scanning chromatic confocal system is elaborated; secondly, the dispersive objective of the measurement system is designed, enabling the system to realize high-precision, wide-range measurements. The theoretical analysis results show that the axial resolution of the system can reach 0.5 μm, providing technical support for the detection of sub-micron surface defects on optical components.
Free-form-surface optical design can effectively reduce the size of the system structure and improve imaging quality. Through modulation transfer function diagrams and spot diagrams, the rationality and superiority of the high-order aspheric lens design of a complex optical-mechanical aerial camera system are analyzed. Considering the surface-shape errors introduced when free-form optical elements are machined, the effects of these errors on the optical performance of the complex optical-mechanical system are studied by ray tracing and wavefront fitting, and the governing influence law is revealed.
In remote sensing imagery, the detection of fine-grained objects is a challenging problem. Existing deep learning-based methods face three difficulties that are not yet well addressed: adapting to multi-scale objects, slow convergence, and limited performance on scarce datasets. To deal with these difficulties, we propose a deep convolutional neural network (CNN) with a residual-structure backbone to extract deep-level details from the image, following the mainstream detection paradigm for multi-scale fine-grained objects in complex remote sensing scenes. In addition, we introduce a multi-scale region generation network to overcome the limitations of fixed-receptive-field convolution kernels and enable multi-scale object detection. Lastly, we replace the fully connected layers in the fully convolutional region classification network with 1×1 convolutional layers to improve detection efficiency and speed. To overcome the limitation of scarce datasets, we conducted experiments on the FAIR1M dataset, currently the largest fine-grained object detection dataset in the remote sensing field. Simulation results show that the proposed detection method achieves the highest average precision (35.86%) among all benchmarks and outperforms the classic Faster R-CNN-based method by 3.44%.
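The FC-to-1×1-convolution replacement works because a 1×1 convolution is exactly a fully connected layer applied independently at every spatial location, with the weights shared across locations. A minimal sketch of that equivalence:

```python
import numpy as np

def conv1x1(fmap, weight, bias):
    """1x1 convolution: one linear map applied independently at every pixel.

    fmap: (H, W, C_in), weight: (C_in, C_out), bias: (C_out,). Because it
    shares weights across all locations, it can replace a per-region fully
    connected layer while keeping the network fully convolutional.
    """
    return fmap @ weight + bias
```

Since the same small weight matrix serves every location, the classifier runs on the whole feature map in one pass instead of per cropped region, which is where the efficiency gain comes from.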
Studying how to quickly and efficiently match LED lights to obtain the standard night-mode white light required by the national military standard is of great significance for improving the scientific research and production efficiency of special LCD displays. This article designs a visual color matching calculation program by establishing a color matching model and combining it with MATLAB software. Given the initial parameters and the target color parameters, the RGB three-color brightness levels that meet the color matching requirements can be generated automatically, completing the color matching task quickly and efficiently. The experimental results show that, for the standard white light required by the national military standard, the generated color matching parameters achieve the white-light display very well, with a color coordinate error within 0.4%, which meets the needs of engineering practice.
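The core of such a color matching model is the textbook additive-mixing calculation: given the chromaticity coordinates of the three primaries and of the target white point, the required primary luminances follow from a 3×3 linear system. A sketch of that calculation (the paper's specific MATLAB program and military-standard parameters are not reproduced here):

```python
import numpy as np

def primary_luminances(prim_xy, target_xy, target_Y=1.0):
    """Solve for the luminances of three primaries that mix to a target color.

    prim_xy: chromaticities (x, y) of the R, G, B primaries; target_xy:
    chromaticity of the desired white point. A primary at unit luminance
    has XYZ = (x/y, 1, (1-x-y)/y); since mixing is additive in XYZ, the
    three luminances solve a 3x3 linear system.
    """
    M = np.array([[x / y, 1.0, (1.0 - x - y) / y] for x, y in prim_xy]).T
    xt, yt = target_xy
    target = target_Y * np.array([xt / yt, 1.0, (1.0 - xt - yt) / yt])
    return np.linalg.solve(M, target)
```

For example, with sRGB primaries and a D65 white target, the solved luminance ratios reproduce the familiar 0.2126/0.7152/0.0722 weighting; the same calculation with the military-standard target chromaticity would yield the required RGB brightness levels.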
A fringe projection profilometry method with a large-depth-range geometric constraint is proposed, which periodically encodes the background without reducing the fringe amplitude to achieve high-frequency phase unwrapping. This method makes full use of the redundant part of the sinusoidal signal to embed the period information into the background, ensuring that the fringes have high signal-to-noise ratios. Furthermore, to realize high-frequency phase unwrapping, we introduce geometric constraints into the fringe projection system, whereby the depth range of the 3D measurement system is effectively extended. In addition, a phase-based compensation method is proposed to compensate for period orders at the edges of the periodic background encoding. Compared with the conventional phase coding method, this method performs period encoding of the background and avoids incorrect period orders caused by system nonlinearity. The measurement depth range under geometric constraints is also significantly extended without needing any prior information about the object. Experimentally, the 3D profiles of a standard plane, a complex object, and separated objects were measured using the proposed method. The results show that the method achieves fast and high-precision measurement of object contours using four fringe patterns.
Generating stylized captions for images is a challenging task. To address it, this paper proposes M-tag, a new stylized image captioning method. We design a memory module M containing a set of embedding vectors that encode style-related phrases from a training corpus. To obtain these style-related phrases, we develop a sentence decomposition algorithm P, which divides each stylized sentence into a style-related part reflecting the linguistic style and a content-related part containing the visual content. A Transformer encoder encodes the object features within the image, and a Swin Transformer encoder encodes the relational features within the image, so that different aspects of the image are jointly encoded from different perspectives. The stylized object features from the Transformer encoder are fused with the relational features from the Swin Transformer by concatenation, achieving a fusion of intra-image relational features and local object features. When generating a caption, content-related style knowledge is first retrieved from the memory module through an attention mechanism; the extracted style features are then integrated into the language model, and finally the fused encoded features are decoded by the Transformer decoder to generate the image description.
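Retrieving style knowledge from a memory module via attention usually means scaled dot-product attention over the memory slots: content features act as queries, and the output is a content-conditioned mixture of the stored style embeddings. A minimal sketch under that standard assumption (the paper's exact retrieval design is not detailed in the abstract):

```python
import numpy as np

def retrieve_style(queries, memory):
    """Scaled dot-product attention over a bank of style embeddings.

    queries: (n, d) content features; memory: (m, d) learned style slots.
    Returns a content-conditioned mixture of the style vectors.
    """
    d = memory.shape[1]
    scores = queries @ memory.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ memory
```

A query aligned with one memory slot retrieves (approximately) that slot's embedding, which the language model can then blend into the generated caption.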
With the development of artificial intelligence, the use of drones in everyday life is becoming increasingly popular, but there is still room for improvement in detection accuracy and speed when using drone-captured aerial images for small target detection. This paper proposes a target detection and recognition method for UAV aerial photography based on an improved YOLOv7. First, drawing on the MobileOne backbone, the MobileOne module is added to the YOLOv7 model to replace some of the standard convolutional layers in the YOLOv7 backbone, effectively speeding up inference and reducing the number of model parameters. Secondly, based on the ConvNeXt structure, a CNeB module is constructed to improve the feature extraction capability and detection performance of the YOLOv7 network. Finally, the Wise-IoU loss function is introduced to reduce the competitiveness of high-quality anchor boxes while minimizing the harmful gradients generated by low-quality examples. The improved method is compared with traditional target detection algorithms such as YOLOv7 on the VisDrone2019 dataset, and the results show that it achieves better detection performance.