We present a model for real-time pedestrian detection based on a deep learning framework. For the base network used for feature extraction, we improve on MobileNet, a simple and fast convolutional neural network: we use only the front part of its network and then build several new multi-scale convolutional layers to compute multi-scale feature maps. For the detection network that follows feature extraction, we use a simplified SSD (single shot multibox detector) model that detects pedestrians with fewer feature maps. In addition, we design detection boxes with specific sizes according to the shape characteristics of pedestrians. To avoid overfitting, we apply data augmentation and dropout during training. Experimental results on PASCAL VOC and KITTI confirm that the speed of our detection model increases by 22.2% while precision remains almost unchanged. Our approach makes a trade-off between speed and precision and has a clear speed advantage over other detection approaches.
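The idea of pedestrian-specific detection boxes can be illustrated with a minimal sketch of SSD-style default box generation. The function name, the scale, and the tall width/height ratios below are assumptions chosen for illustration; the abstract does not state the values the authors used.

```python
from math import sqrt

def pedestrian_default_boxes(fmap_size, scale, aspect_ratios=(0.3, 0.41, 0.5)):
    """Return SSD-style default boxes (cx, cy, w, h), normalized to [0, 1].

    aspect_ratios are width/height values. The tall ratios used here are
    illustrative assumptions, not values taken from the paper.
    """
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Box center sits in the middle of each feature map cell.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Keep the box area equal to scale**2 while changing its shape.
                w = scale * sqrt(ar)
                h = scale / sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes

boxes = pedestrian_default_boxes(fmap_size=4, scale=0.3)
```

Restricting the ratios to tall shapes reduces the number of boxes per cell compared with a general-purpose configuration, which is one plausible source of the speed gain described above.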
The application of deep learning in traditional industries has not gained much attention, yet deep learning has great potential to be transferred to other fields. We apply two deep learning techniques, object detection and tracking, to dynamic object counting, and test the approach on one of the basic problems in the steel industry: rebar counting. We use an infrared camera to collect video of rebar on site so that the rebar can be clearly distinguished from the background, and then use the video to perform the counting. We divide the counting process into two parts: detection and tracking. We improve the SSD model to meet the accuracy and speed demands of detection, and use KCF for tracking. Because the rebar objects in the video do not vary in scale, we reduce the number of feature maps as well as the number of anchors and gain a considerable speed-up without worsening the accuracy. To eliminate the error caused by vibration of the conveyor belt, we improve the tracking algorithm and obtain satisfactory results. Our object counting system is not limited to rebar counting and can be transferred to other fields.
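Counting by tracking can be sketched as counting each tracked object once when it crosses a counting line. The hysteresis margin below is a hypothetical way to tolerate jitter around the line; it is not the authors' tracking improvement, which the abstract does not describe. The function name and the track data are likewise assumptions.

```python
def count_crossings(tracks, line_x, margin):
    """Count tracks that move from left of (line_x - margin) to right of (line_x + margin).

    tracks: dict mapping a track id to the list of box-center x-coordinates,
    one per frame. The margin forms a dead band so that small oscillations
    around the line are not counted.
    """
    count = 0
    for xs in tracks.values():
        armed = False  # True once the object has been seen clearly before the line
        for x in xs:
            if x < line_x - margin:
                armed = True
            elif x > line_x + margin and armed:
                count += 1
                break  # each track is counted at most once
    return count

tracks = {
    1: [80, 95, 101, 98, 103, 99, 120],  # jitters around the line, then crosses
    2: [60, 70, 97, 102, 96],            # jitters but never clearly crosses
    3: [85, 110, 130],                   # clean crossing
}
total = count_crossings(tracks, line_x=100, margin=5)  # total == 2
```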
With the improvement of 3D reconstruction theory and the rapid development of computer hardware, reconstructed 3D models are growing in scale and complexity. Models with tens of thousands of 3D points or triangular meshes are common in practical applications. Because of storage and computing power limitations, it is difficult for common 3D display software, such as MeshLab, to achieve real-time display of and interaction with large-scale 3D models. In this paper, we propose a display system for large-scale 3D scene models. We construct the LOD (Levels of Detail) model of the reconstructed 3D scene in advance, and then use an out-of-core, view-dependent, multi-resolution rendering scheme to display the large-scale 3D model in real time. With the proposed method, our display system renders in real time while roaming through the reconstructed scene, and 3D camera poses can also be displayed. Furthermore, memory consumption is significantly decreased via an internal and external memory exchange mechanism, so that a large-scale reconstructed scene with millions of 3D points or triangular meshes can be displayed on a regular PC with only 4GB RAM.
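View-dependent level selection can be sketched with a screen-space error test, a common criterion in LOD rendering. The abstract does not say which criterion the system uses, so the functions, the per-level geometric errors, and the one-pixel tolerance below are all assumptions.

```python
from math import tan, radians

def screen_space_error(geometric_error, distance, fov_deg, viewport_height_px):
    """Project a world-space error (scene units) to pixels for a perspective camera."""
    return geometric_error * viewport_height_px / (2.0 * distance * tan(radians(fov_deg) / 2.0))

def select_lod(level_errors, distance, fov_deg=60.0, viewport_height_px=1080, tolerance_px=1.0):
    """Return the index of the coarsest level whose projected error is within tolerance.

    level_errors is ordered from coarsest (largest error) to finest (smallest error).
    If no level is accurate enough, the finest level is returned.
    """
    for level, err in enumerate(level_errors):
        if screen_space_error(err, distance, fov_deg, viewport_height_px) <= tolerance_px:
            return level
    return len(level_errors) - 1

errors = [1.0, 0.1, 0.01]          # hypothetical geometric error per level
far = select_lod(errors, 2000.0)   # coarsest level is enough far away
near = select_lod(errors, 10.0)    # finest level is needed up close
```

In an out-of-core setting, only the nodes at the selected levels need to be resident in memory, which is what allows the rest of the model to stay on disk.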
We present a fast feature matching approach based on coherence and geometry constraints. Our method first estimates the epipolar geometry between the images from a small number of feature points, then uses the epipolar geometry constraint and the coherence among the matches to guide the matching of the remaining features. For the remaining feature points, we first narrow the set of candidate matching points according to the epipolar geometry constraint. We then apply the coherence constraint, which requires the matching points of neighboring feature points to be neighbors as well, to further reduce the number of candidate matching points. This strategy effectively reduces the matching time and retains more of the correct matches that would otherwise be rejected by David Lowe’s ratio test. Finally, we roughly remove the mismatches using the coherence among the matches. We validate the effectiveness of our method through matching and SfM results on various public datasets.
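The epipolar filtering step can be sketched as keeping only the candidates that lie close to the epipolar line of a feature. The function names, the 2-pixel tolerance, the 0.8 ratio, and the example fundamental matrix (that of an ideal rectified pair) are assumptions for illustration, not details from the paper.

```python
from math import hypot

def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 (image 2) to the epipolar line F @ [p1, 1]."""
    x, y = p1
    a = F[0][0] * x + F[0][1] * y + F[0][2]
    b = F[1][0] * x + F[1][1] * y + F[1][2]
    c = F[2][0] * x + F[2][1] * y + F[2][2]
    # Assumes the line is not degenerate, i.e. a and b are not both zero.
    return abs(a * p2[0] + b * p2[1] + c) / hypot(a, b)

def candidates_near_epipolar_line(F, p1, points2, max_dist=2.0):
    """Indices of points in image 2 that satisfy the epipolar constraint for p1."""
    return [j for j, p2 in enumerate(points2) if epipolar_distance(F, p1, p2) <= max_dist]

def passes_ratio_test(best, second, ratio=0.8):
    """Lowe's ratio test on the two smallest descriptor distances."""
    return best < ratio * second

# Fundamental matrix of an ideal rectified pair: epipolar lines are image rows.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
points2 = [(50.0, 20.5), (30.0, 40.0), (5.0, 19.0)]
kept = candidates_near_epipolar_line(F, (10.0, 20.0), points2)  # kept == [0, 2]
```

Because the ratio test is applied only among the candidates that survive the geometric filter, a correct match is less likely to be rejected because of a similar-looking feature elsewhere in the image.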
Tracking deforming objects involves estimating the global motion of the object and its local deformations as a function
of time. Tracking algorithms using graph cut have been proposed. However, because graph cut is globally optimal, it is
prone to capturing outlying areas that are similar to the object of interest. Many methods therefore introduce a constraint
to confine the standard graph cut technique. In this paper, a narrow band is introduced to constrain the standard graph cut,
and the graph is built on that band. First, with the narrow band constraint, the segmentation is guaranteed to remain inside
the band, which is useful when the result of standard graph cut is not robust. Second, because the graph is built only on the
band region, the computational cost is reduced compared with the traditional method. Third, the formulations are adjusted to
distinguish between different parts of the narrow band, which improves the segmentation quality.
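One way to picture the narrow band is as the set of pixels between an eroded and a dilated copy of the previous frame's object mask. The sketch below builds such a band with 4-connected morphology on pixel sets; the construction, the function names, and the band half-width are assumptions, since the abstract does not specify how the band is formed.

```python
def neighbors4(p):
    r, c = p
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]

def dilate(pixels, shape):
    """Grow the pixel set by one step of 4-connected dilation, clipped to the image."""
    rows, cols = shape
    out = set(pixels)
    for p in pixels:
        for r, c in neighbors4(p):
            if 0 <= r < rows and 0 <= c < cols:
                out.add((r, c))
    return out

def erode(pixels):
    """Keep pixels whose four neighbors are all in the set."""
    return {p for p in pixels if all(q in pixels for q in neighbors4(p))}

def narrow_band(mask, shape, half_width=1):
    """Pixels within half_width steps of the mask boundary, on either side."""
    outer, inner = set(mask), set(mask)
    for _ in range(half_width):
        outer = dilate(outer, shape)
        inner = erode(inner)
    return outer - inner

# 3x3 object in a 7x7 image.
mask = {(r, c) for r in range(2, 5) for c in range(2, 5)}
band = narrow_band(mask, shape=(7, 7), half_width=1)
```

A graph cut restricted to `band` only has to label those pixels: everything inside the inner region can be fixed as object and everything outside the outer region as background, which is why the graph is smaller than one built on the whole image.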
Content-Based Image Retrieval (CBIR) is an important research topic in information retrieval,
involving computer graphics, image processing, data mining and pattern recognition. To make
content-based image retrieval suitable for large-scale image databases, we develop an effective dynamic
hierarchical clustering index scheme. Although the system uses hierarchical clustering, as
the number of cluster centers increases, finding the centers becomes slow and turns into a system
performance bottleneck. In this paper, an in-memory index of image content features is built. This method
effectively improves the retrieval speed without loss of precision. Moreover, the clustering model is
improved by integrating the content features and textual features of images, which greatly improves the
accuracy of the clustering and thus significantly improves the system precision.
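A hierarchical clustering index can be sketched as a tree of cluster centers that a query descends by always following the nearest center. The `Node` class, the `search` function, and the toy 2-D features are hypothetical; the abstract does not describe the actual index layout.

```python
from dataclasses import dataclass, field
from math import dist

@dataclass
class Node:
    center: tuple
    children: list = field(default_factory=list)
    image_ids: list = field(default_factory=list)  # filled only at leaves

def search(root, query):
    """Descend the cluster tree, following the nearest center at each level."""
    node = root
    while node.children:
        node = min(node.children, key=lambda child: dist(child.center, query))
    return node.image_ids

root = Node((5, 5), children=[
    Node((0, 0), children=[Node((0, 1), image_ids=["a"]), Node((1, 0), image_ids=["b"])]),
    Node((10, 10), children=[Node((9, 9), image_ids=["c"]), Node((11, 11), image_ids=["d"])]),
])
hits = search(root, (10.6, 10.9))
```

Each query compares against only the children along one root-to-leaf path instead of every cluster center, which is the reason a tree-shaped index kept in memory can relieve the center lookup bottleneck.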
Image matching is a critical issue in many image processing applications. It is very likely that the real-time sensed image and the reference image differ significantly in brightness, contrast, and angle because of changes in the parameters of the imaging devices, the illumination conditions, and the view angles. This greatly affects the precision and efficiency of target recognition tasks in remote sensing. We construct a novel moment invariant as a confidence measure for the image matching task. Using the distance between the template and a local region of the real-time image in the feature space of the moment invariant, we implement a target detection algorithm that does not rely on either the imaging angles or the illumination conditions. Based on the phases of complex moments, the direction of each matching region in relation to the template can also be obtained. The experiments show that the algorithm can be used for target identification under changing brightness, contrast, and rotation angle in relation to the template.
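The use of complex moment phases to recover direction can be illustrated with standard complex moments; this is not the novel invariant the authors construct, which the abstract does not define. Under a rotation by θ, the centered moment c_pq is multiplied by e^{i(p-q)θ}, so its magnitude is unchanged and its phase shifts by (p-q)θ. The function names and sample points are assumptions.

```python
import cmath

def complex_moment(points, p, q):
    """c_pq = sum of w * (x + iy)^p * (x - iy)^q about the weighted centroid.

    points: iterable of (x, y, w), with w the pixel intensity.
    """
    total = sum(w for _, _, w in points)
    cx = sum(x * w for x, _, w in points) / total
    cy = sum(y * w for _, y, w in points) / total
    m = 0j
    for x, y, w in points:
        z = complex(x - cx, y - cy)
        m += w * z ** p * z.conjugate() ** q
    return m

def rotation_from_moments(template, region, p=2, q=0):
    """Estimate the rotation of region relative to template from the phase of c_pq.

    With p - q = 2 the angle is recovered modulo pi.
    """
    ratio = complex_moment(region, p, q) / complex_moment(template, p, q)
    return cmath.phase(ratio) / (p - q)

def rotate(points, theta, shift=(0.0, 0.0)):
    """Rotate points about the origin by theta and translate them (test helper)."""
    r = cmath.exp(1j * theta)
    out = []
    for x, y, w in points:
        z = complex(x, y) * r
        out.append((z.real + shift[0], z.imag + shift[1], w))
    return out

template = [(0, 0, 1), (4, 0, 2), (1, 3, 1), (5, 2, 3)]
region = rotate(template, 0.4, shift=(7.0, -3.0))
angle = rotation_from_moments(template, region)  # approximately 0.4 rad
```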
We propose a unified approach that combines the mean shift-based image segmentation algorithm with the SST (shortest spanning tree)-minmax-based graph grouping method to achieve effective IR object segmentation suitable for real-time application. The approach preprocesses an image with the mean shift algorithm to form segmented regions, which both removes noise and preserves the desirable discontinuity characteristics of the ship object. The segmented regions then represent the original image effectively as a graph structure, and we apply the SST-minmax method to perform the merging procedure that forms the final segmented regions. Because of the good discontinuity-preserving filtering characteristic, we can effectively remove the clutter of the sea background without losing the IR ship object information, and significantly reduce the number of basic image entities. The region merging based on SST-minmax can therefore produce excellent segmentation performance at low computational cost, thanks to less clutter and fewer region nodes. The superiority of the proposed method is examined and demonstrated through a large number of experiments using a real IR ship image sequence.
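Region merging along a shortest spanning tree can be sketched with Kruskal's algorithm on a region adjacency graph, stopping when the desired number of segments remains. This is a simplification that omits the minmax criterion, which the abstract does not define; the function name, edge weights, and region count are assumptions.

```python
def sst_merge(num_regions, edges, num_segments):
    """Merge regions along the shortest spanning tree until num_segments remain.

    edges: list of (weight, region_a, region_b) from a region adjacency graph,
    where weight measures dissimilarity (for example, mean intensity difference).
    Returns a list mapping each region index to a segment label.
    """
    parent = list(range(num_regions))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    remaining = num_regions
    for _, a, b in sorted(edges):
        if remaining <= num_segments:
            break
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra  # merge the two most similar adjacent segments
            remaining -= 1
    return [find(i) for i in range(num_regions)]

# Five mean shift regions: three sea-like (0, 1, 2) and two ship-like (3, 4).
edges = [(2, 0, 1), (1, 1, 2), (189, 2, 3), (5, 3, 4), (190, 0, 3)]
labels = sst_merge(num_regions=5, edges=edges, num_segments=2)
```

Running the merge on mean shift regions rather than on pixels keeps the graph small, which matches the low computational cost the text attributes to having fewer region nodes.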
A method for automatic segmentation and detection of small targets against an earth-to-sky background was presented in this paper. When a small target is at a far distance, objects on the ground may enter the scene. This often resulted in false detections in automatic detection systems if the small object was segmented using only its intensity. To remove the large blocks and disconnected parts, a gradient image was computed, because the gradient scale of the small target and that of the background objects may be comparatively similar. An adaptive threshold method was then adopted, applying the image mean as a threshold several times to segment objects in the gradient image. After that, successive over-relaxation was used to merge the dispersed regions so that nearby isolated points connected to the large blocks. The lowest nonzero valley value was then located and used as the threshold to produce a binary image. The connected clutter could thus be removed simply by counting the number of pixels in each connected region. Finally, the pipeline target detection algorithm was used to process the sequential images and detect the real small target automatically.
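The clutter removal step, which discards connected regions by pixel count, can be sketched with a breadth-first connected component search. The 8-connectivity, the function name, and the size limit are assumptions; the abstract does not give these details.

```python
from collections import deque

def remove_large_components(binary, max_pixels):
    """Zero out 8-connected components that contain more than max_pixels pixels."""
    rows, cols = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Collect one connected component with breadth-first search.
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Large blobs are treated as ground clutter and removed.
            if len(component) > max_pixels:
                for y, x in component:
                    out[y][x] = 0
    return out

binary = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cleaned = remove_large_components(binary, max_pixels=2)
```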
In this paper, an object recognition method based on a genetic algorithm was presented. One of the most difficult problems in object recognition is to decide correctly whether two dissimilar images originate from different items or belong to the same object viewed from different camera positions. Invariant object recognition identifies an object independently of its position (translated or rotated) and size (larger or smaller). It was found that images of an object observed from two different viewpoints can be approximately related by an affine transformation if the camera is placed sufficiently far away. This paper proposes an affine and projective invariant ship object recognition approach and employs a genetic algorithm (GA) to find the appropriate affine transformation parameters for matching the reference object to the scene object. Experimental results show good performance of the proposed method.
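Searching for affine parameters with a genetic algorithm can be sketched as follows. The encoding (six real-valued genes), the cost function, the known point correspondences, and every hyperparameter are assumptions made for illustration; the abstract does not describe the authors' GA design.

```python
import random

def apply_affine(params, pts):
    a, b, c, d, tx, ty = params
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in pts]

def cost(params, ref, scene):
    """Sum of squared distances between transformed reference and scene points."""
    moved = apply_affine(params, ref)
    return sum((u - x) ** 2 + (v - y) ** 2 for (u, v), (x, y) in zip(moved, scene))

def ga_affine(ref, scene, pop_size=60, generations=200, sigma=0.3, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-2, 2) for _ in range(6)] for _ in range(pop_size)]
    history = []  # best cost at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda p: cost(p, ref, scene))
        history.append(cost(pop[0], ref, scene))
        elite = pop[: pop_size // 4]
        children = [pop[0][:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(mum, dad)]  # uniform crossover
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.2 else g
                     for g in child]  # Gaussian mutation on about 20% of genes
            children.append(child)
        pop = children
    best = min(pop, key=lambda p: cost(p, ref, scene))
    return best, cost(best, ref, scene), history

ref = [(0, 0), (1, 0), (0, 1), (1, 1)]
true_params = (0.9, -0.2, 0.3, 1.1, 0.5, -0.4)  # hypothetical ground truth
scene = apply_affine(true_params, ref)
best, best_cost, history = ga_affine(ref, scene)
```

Because the best individual is copied unchanged into each new generation, the best cost can never increase from one generation to the next.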
In this paper, a FLIR image segmentation algorithm based on a genetic algorithm and fuzzy set theory was presented. Image processing has to deal with many ambiguous situations, and fuzzy set theory is a useful mathematical tool for handling ambiguity or uncertainty. A fuzzy entropy is a functional on fuzzy sets that becomes smaller when the sharpness of its argument fuzzy set is improved. The paper defines different membership functions for the object and the background to transform the image into the fuzzy domain, choosing the Z-function and the S-function as the membership functions for the object and the background respectively, and thresholds the image into object and background by maximizing the fuzzy entropy. The search for the combination of a, b and c is implemented by a genetic algorithm with an appropriate coding method to avoid useless chromosomes. The experimental results show that the proposed method performs better than other general methods and, by using the genetic algorithm, achieves good real-time performance.
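The objective that the genetic algorithm maximizes can be sketched with one common parameterization of the S-function and an entropy over the two fuzzy events. Both the piecewise form of the S-function (with a free midpoint b) and the entropy definition are assumptions, since the abstract gives neither formula; only the roles of the Z-function (object) and S-function (background) come from the text.

```python
from math import log

def s_function(x, a, b, c):
    """S-shaped membership with parameters a < b < c (assumed parameterization)."""
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) ** 2 / ((b - a) * (c - a))
    if x <= c:
        return 1.0 - (x - c) ** 2 / ((c - b) * (c - a))
    return 1.0

def z_function(x, a, b, c):
    """Z-shaped membership, the complement of the S-function."""
    return 1.0 - s_function(x, a, b, c)

def fuzzy_entropy(histogram, a, b, c):
    """Entropy of the fuzzy events 'object' (Z) and 'background' (S).

    histogram: dict mapping gray level to its relative frequency (sums to 1).
    """
    p_obj = sum(h * z_function(g, a, b, c) for g, h in histogram.items())
    p_bg = sum(h * s_function(g, a, b, c) for g, h in histogram.items())
    return -sum(p * log(p) for p in (p_obj, p_bg) if p > 0.0)

histogram = {0: 0.5, 255: 0.5}  # toy two-level image
entropy = fuzzy_entropy(histogram, a=50, b=100, c=200)
```

A genetic algorithm would evaluate `fuzzy_entropy` for each candidate (a, b, c) and keep the combinations with the highest value; chromosomes that violate a < b < c are the kind of useless candidates an appropriate coding can rule out.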