A theory of 3-D visual perception and figure/ground separation by visual cortex is described. Called FACADE Theory, it suggests a solution of the 3-D figure/ground problem for biological vision and makes many predictions whereby it can be tested. The theory further develops my 3-D vision theory of 1987, which used multiple receptive field sizes, or scales, to define multiple copies of two interacting systems: a Boundary Contour System (BCS) for generating emergent boundary segmentations of edges, textures, and shading, and a Feature Contour System (FCS) for discounting the illuminant and filling in surface representations of Form-And-Color-And-DEpth, or FACADEs. The 1987 theory did not posit interactions between the several scales of the BCS and FCS. The present theory suggests how competitive and cooperative interactions that were previously defined within each scale also act between scales.
There is growing interest in using the complex logarithmic mapping for depth determination in motion stereo applications. This has led to a need for a comprehensive error analysis. Rather than just giving an analytic description of the errors inherent in the approach, an attempt will be made to characterize the errors that occur when using the mapping with real images. Techniques to reduce the impact of these errors will also be discussed.
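As a rough illustration of the mapping under discussion, the sketch below resamples a grey-scale image into complex-logarithmic (log-polar) coordinates about an assumed fixation point; the function name, sampling choices, and default parameters are ours, not the authors'.

import numpy as np

def complex_log_map(image, center, n_rings=64, n_wedges=128, r_min=1.0):
    """Resample a grey-scale image into (log r, theta) coordinates.

    Under camera translation along the optical axis, features move radially
    in the image, so in this mapping they move along one axis only, which is
    what makes the representation attractive for motion stereo."""
    h, w = image.shape
    cy, cx = center
    r_max = np.hypot(max(cy, h - cy), max(cx, w - cx))
    rho = np.exp(np.linspace(np.log(r_min), np.log(r_max), n_rings))   # log-spaced radii
    theta = np.linspace(0.0, 2.0 * np.pi, n_wedges, endpoint=False)    # uniform angles
    rr, tt = np.meshgrid(rho, theta, indexing="ij")
    ys = np.clip((cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip((cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return image[ys, xs]          # shape (n_rings, n_wedges), nearest-neighbour sampling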
Our fds-theory accepts context effects in perception. It takes account of the early perception phase by establishing the concept of the "Structural IDentity" of a k-norm, "SkID". Early perception detects the "maximally abstracted" holistic view of an object by representing it in terms of an attributed semantic graph-word. The SkID graph-word, along with the contextual, abstraction-level-dependent description of perceptual characteristics, constitutes the "formal description schema" (fds) models of the norm-objects.
Adaptive resonance theory (ART) has been used to develop neural network architectures that self-organize pattern recognition codes stably, in real time, in response to random input sequences of patterns. A brief background of the motivations and design considerations underlying the development of adaptive resonance networks, an outline of their basic operation, a new idea for improving the model, and some experimental results are discussed in this article.
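For readers unfamiliar with the basic ART operation mentioned above, the following is a deliberately simplified ART-1-style search/resonance cycle for binary inputs (no choice parameter, fast learning); it is a sketch of the general mechanism, not the authors' implementation.

import numpy as np

def art1_step(x, weights, rho=0.75):
    """One simplified ART-1 search/resonance cycle for a binary input x.

    weights: list of binary prototype vectors (the learned categories).
    rho: vigilance; higher values force finer categories.
    Returns the index of the resonating category (a new one if none passes)."""
    order = sorted(range(len(weights)),
                   key=lambda j: -np.sum(np.minimum(x, weights[j])))
    for j in order:
        match = np.sum(np.minimum(x, weights[j])) / max(np.sum(x), 1)
        if match >= rho:                       # resonance: refine the prototype
            weights[j] = np.minimum(x, weights[j])
            return j
    weights.append(x.copy())                   # mismatch everywhere: recruit a new category
    return len(weights) - 1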
The aim of this paper is to propose a neural network architecture as an approach to the feature matching problem in stereo vision. The model is based on the principle of shunting feedback competitive equations studied in depth by Grossberg and his colleagues. Psychophysical constraints used in the early computational models of Marr, Poggio, and Grimson, of Pollard, Mayhew, and Frisby, and of Prazdny serve as the basis for the architecture of our network and for the selection of candidate matches. Competition and cooperation take place among the candidate matches and provide strong and natural disambiguation power.
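For reference, a shunting competitive equation of the Grossberg type has the general form sketched below; the parameter names and step size are illustrative, and the actual network of the paper wires many such units together with competitive and cooperative feedback.

def shunting_step(x, excit, inhib, A=1.0, B=1.0, C=1.0, dt=0.01):
    """One Euler step of a shunting competitive equation
        dx/dt = -A*x + (B - x)*excit - (C + x)*inhib,
    which keeps the activity x bounded in [-C, B] no matter how large the
    excitatory and inhibitory inputs become."""
    return x + dt * (-A * x + (B - x) * excit - (C + x) * inhib)

# Example: x = shunting_step(x, excit=total_support, inhib=total_competition)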
Image enhancement as an aid for the visually impaired may be used to improve the visibility of broadcast TV programs and to provide a portable visual aid. Initial work in this area was based on a linear model. The finite dynamic range available in the video display and contamination of the enhanced image by high-spatial-frequency noise limited the usefulness of this model. I propose a new enhancement method to address some of the limitations of the original model. It considers the nonlinear response of the visual system and requires enhancement of sub-threshold spatial information only. This modification increases the available dynamic range by decreasing the range previously used by the linear models to enhance already visible details. Implementation of an image-enhancing visual aid in a head-mounted, binocular, full-field virtual vision device may cause substantial difficulties. Adaptation may be difficult for the patient because of head movement and the interaction of the vestibular system response with the head-mounted display. I propose an alternate bioptic design in which the display is positioned above or below the line of sight, to be viewed intermittently, possibly in a freeze-frame mode. Such an implementation is also likely to be less expensive, enabling more users to access the device.
A multi-task dynamic neural network that can be programmed for storing, processing, and encoding spatio-temporal visual information is presented in this paper. This dynamic neural network, called the PN-network, is comprised of numerous densely interconnected neural subpopulations which reside in one of two coupled sublayers, P or N. The subpopulations in the P-sublayer transmit an excitatory, or positive, influence onto all interconnected units, whereas the subpopulations in the N-sublayer transmit an inhibitory, or negative, influence. The dynamical activity generated by each subpopulation is given by a nonlinear first-order system. By varying the coupling strength between these different subpopulations it is possible to generate three distinct modes of dynamical behavior useful for performing vision-related tasks. It is postulated that the PN-network can function as a basic programmable processor for novel vision machine systems.
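A minimal sketch of one excitatory/inhibitory subpopulation pair obeying a nonlinear first-order system, in the spirit of the description above; the saturating nonlinearity, coupling names, and time constants are placeholders rather than the paper's actual equations.

import numpy as np

def pn_step(p, n, w_pp, w_pn, w_np, w_nn, ext_p=0.0, ext_n=0.0,
            tau=10.0, dt=1.0):
    """One Euler step for a coupled excitatory (P) / inhibitory (N) pair.

    Each subpopulation follows a nonlinear first-order equation; changing the
    coupling strengths w_* moves the pair between distinct dynamical regimes
    (steady, oscillatory, switching)."""
    sigma = lambda s: 1.0 / (1.0 + np.exp(-s))          # saturating nonlinearity
    dp = (-p + sigma(w_pp * p - w_pn * n + ext_p)) / tau
    dn = (-n + sigma(w_np * p - w_nn * n + ext_n)) / tau
    return p + dt * dp, n + dt * dn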
A robust vision model has been developed and implemented with a self-organizing, unsupervised artificial neural network (ANN) classifier, KART, which is a novel hybrid of a modified Kohonen feature map and the Carpenter/Grossberg ART architecture. The six moment invariants have been mapped onto a 7-dimensional unit hypersphere and applied to the KART classifier. In this paper the KART model is presented. The non-adaptive neural implementations of the image processing and the moment invariant feature extraction are discussed. In addition, simulation results that illustrate the capabilities of this model are provided.
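One common way to place a 6-vector of moment invariants on a 7-dimensional unit hypersphere is to append a slack coordinate that makes the norm constant and then normalize; whether this is the exact mapping used in the paper is not stated, so the sketch below is only illustrative.

import numpy as np

def to_unit_hypersphere(features, radius=None):
    """Map an n-vector onto the (n+1)-dimensional unit hypersphere by
    appending a slack coordinate and normalizing (n = 6 for the six moment
    invariants gives a 7-D unit vector)."""
    f = np.asarray(features, dtype=float)
    if radius is None:
        radius = np.linalg.norm(f) + 1.0       # any bound >= ||f|| works
    slack = np.sqrt(radius**2 - np.dot(f, f))  # makes the augmented norm equal radius
    v = np.append(f, slack)
    return v / np.linalg.norm(v)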
The biological basis for machine vision is a notion used extensively in the development of machine vision systems for various applications. In this paper we attempt to emulate the receptive fields that exist in the biological visual channels. In particular, we exploit the notion of receptive fields to develop mathematical functions, named discriminant functions, for the extraction of transition information from signals, multi-dimensional signals, and images. These functions are found to be useful for the development of artificial receptive fields for neuro-vision systems.
Neural network models have been studied for a number of years for achieving human-like performance in the fields of image and speech recognition. There has been a recent resurgence in the field of neural networks caused by new topologies and algorithms, analog VLSI implementation techniques, and the belief that massive parallelism is essential for high-performance image and speech recognition. This paper presents an approach to implementing neural networks with Boolean programmable logic modules. Although the approach does not adopt the continuous analog framework commonly used in related research, it can handle a variety of neural network applications and avoids some of the limitations of threshold logic networks. Dynamically programmable logic modules (DPLMs) can be implemented with digital multiplexers. Each node performs a dynamically assigned Boolean function of its input vector. Therefore, the overall network is a combinational circuit and its outputs are Boolean global functions of the network's input variables.
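A node of this kind behaves like a multiplexer acting as a look-up table: the input vector selects one bit of a programming word, and reprogramming the node simply means supplying a new word. The following sketch (function and argument names are ours) illustrates the idea.

def dplm_node(inputs, program):
    """A dynamically programmable logic module modeled as a multiplexer.

    inputs  : tuple of bits forming the node's input vector, e.g. (1, 0, 1)
    program : list of 2**len(inputs) bits defining the Boolean function."""
    address = 0
    for bit in inputs:                 # the inputs act as the select lines
        address = (address << 1) | (bit & 1)
    return program[address]

# Example: a 2-input node programmed as XOR (truth table 0, 1, 1, 0).
assert dplm_node((1, 0), [0, 1, 1, 0]) == 1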
We investigate the use of chromatic information in dense stereo correspondence. Specifically, the chromatic photometric constraint, which is used to specify a mathematical optimality criterion for solving the dense stereo correspondence problem, is developed. The result is a theoretical construction for developing dense stereo correspondence algorithms that use chromatic information. The efficacy of using chromatic information via this construction is tested by implementing single- and multi-resolution versions of a stereo correspondence algorithm that uses simulated annealing to solve the optimization problem. Results demonstrate that the use of chromatic information can significantly improve the performance of dense stereo correspondence.
Insects use a relatively simple visual system to navigate and avoid obstacles. In particular, they use self-motion to determine the range to objects from the angular velocities of the contrasts across the retinal array. Adopting principles learnt from studying insect behaviour and neurophysiology, we have turned aspects of the motion detection mechanism of an insect visual system into a means of categorising edges, computing their motion, and thus determining range. Copying insect motion perception, a camera is scanned across a scene and a temporal sequence of line images is captured. The 8-bit grey-scale image is immediately reduced to a log2(3) ≈ 1.6 bit image by saturating the contrast. Behind each pixel, one state is formed by increasing intensity, one by decreasing intensity, and a third is indeterminate. Pairs of receptors at two consecutive times, forming a 2 by 2 template in space-time, give a finite number of combinations, of which only a small subset is found to provide useful motion information. Combining selected templates results in a distribution of template responses that is amenable to analysis by the Hough transform. Running the model on real scenes reveals the value of lateral inhibition, as well as insights into the effect of different edge types and the use of parallax. The model suggests a possible new neurophysiological construction that can be copied in hardware to provide a fast means of inferring 3-D structure in a scene where the observer is moving with a known velocity.
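A rough sketch of the front end described above: each grey-level line is reduced to three states per pixel (increasing, decreasing, indeterminate), and 2 by 2 space-time templates are formed from adjacent receptors at consecutive times. The contrast threshold is a placeholder, and the selection of the useful template subset is left to downstream processing.

import numpy as np

def ternary_states(prev_line, curr_line, thresh=8):
    """+1 if intensity increased, -1 if it decreased, 0 if indeterminate."""
    diff = curr_line.astype(int) - prev_line.astype(int)
    return np.sign(diff) * (np.abs(diff) >= thresh)

def space_time_templates(states_t0, states_t1):
    """Enumerate 2x2 space-time templates: pairs of adjacent receptors at two
    consecutive times. Each template is a 4-tuple of ternary states; only a
    small subset of the 81 possible combinations carries motion information."""
    return [(states_t0[i], states_t0[i + 1], states_t1[i], states_t1[i + 1])
            for i in range(len(states_t0) - 1)]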
Biological sensor design has long provided inspiration for sensor design in machine vision. However, relatively little attention has been paid to the actual design parameters provided by biological systems, as opposed to the general nature of biological vision architectures. In the present paper we review current knowledge of primate spatial vision design parameters and present recent experimental and modeling work from our lab which demonstrates that a numerical conformal mapping, a refinement of our previous complex logarithmic model, provides the best current summary of this feature of the primate visual system. In particular, we review experimental and modeling studies which indicate that the global spatial architecture of primate visual cortex is well summarized by a numerical conformal mapping whose simplest analytic approximation is the complex logarithm function, and that the columnar sub-structure of primate visual cortex can be well summarized by a model based on band-pass filtered white noise. We also refer to ongoing work in our lab which demonstrates that the joint columnar/map structure of primate visual cortex can be modeled and summarized in terms of a new algorithm, the 'proto-column' algorithm. This work provides a reference point for current engineering approaches to novel architectures for
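The simplest analytic approximation referred to above can be written as a one-line mapping from retinal position (expressed as a complex number) to cortical position; the constants in the sketch are placeholders, not the measured parameters reported by the authors.

import numpy as np

def retina_to_cortex(z, a=0.5, k=1.0):
    """Complex-logarithmic approximation of the retinotopic map:
    cortical position w = k * log(z + a), with z the retinal position in
    complex form and a a foveal smoothing term."""
    return k * np.log(z + a)

# Example: map a point at 2 degrees eccentricity on the horizontal meridian.
w = retina_to_cortex(2.0 + 0.0j)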
There is a growing need for progress in the technology for the management of knowledge derived from sensory data. In this paper we address the structure and operation of a cellular "Experiential Knowledge Base" (E*KB) system, which acts as a cognitive prosthesis for the decision maker. We concern ourselves with the experiential knowledge representation techniques and with the methods and tools required for the extraction (relevance filtering), the contextual abstraction (data compression and generalization), and the classification and storage of real-time sensory knowledge in the cellular architecture of the E*KB.
We advance new associative processor (AP) synthesis algorithms and performance measures. We compare 10 different 1:1 APs (where each input key vector is associated with a different output recollection vector). We find that, unless new output recollection vector encoding techniques are used, APs are not competitive. We find the robust Ho-Kashyap-2 CAAP (content addressable AP) to be preferable, and that the parameter used in it should not be chosen larger than necessary.
We explore the idea that the visual system uses specialized processes to extract critical information about 3-D motion and structure for visually guided navigation. We first consider the computation of three essential properties: the relative 3-D direction of heading of the observer, the time-to-collision with approaching object surfaces, and the locations of object boundaries defined by motion discontinuities. We then focus on the heading computation, relating current algorithms to the perception of heading direction.
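For the time-to-collision property, the classical relation is that an approaching surface patch of angular size theta has time-to-collision approximately theta divided by its rate of expansion; the short numerical sketch below illustrates this relation and is not taken from the paper itself.

def time_to_collision(theta_prev, theta_curr, dt):
    """Estimate time-to-collision from the angular size of an approaching
    object in two successive frames: tau ~= theta / (d theta / dt)."""
    dtheta_dt = (theta_curr - theta_prev) / dt
    if dtheta_dt <= 0:
        return float("inf")            # not approaching
    return theta_curr / dtheta_dt

# Example: an object growing from 0.10 to 0.11 rad in 0.1 s gives tau ~= 1.1 s.
print(time_to_collision(0.10, 0.11, 0.1))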
This paper discusses the application of connectionist learning systems (e.g., artificial neural networks) in the design and implementation of automatic feedback control systems. The benefits of this approach are primarily realized in applications involving nonlinear dynamical systems. For such problems, connectionist learning systems may be used advantageously to: (i) facilitate the control system design and tuning process, (ii) improve performance by reducing delays that might otherwise be associated with gain or parameter adaptation, and (iii) improve robustness by providing an on-line capability for accommodating some unmodeled dynamics (e.g., nonlinear and time-varying behavior). Several control architectures and connectionist learning systems are described. Preliminary experimental results are also presented.
The brain can perform the tasks of associative recall, detection, recognition, and optimization. In this paper, space-time system field models of the brain are introduced. They are called the space-time maximum likelihood associative memory system (ST-ML-AMS) and the space-time adaptive learning system (ST-ALS). Performance of the system is analyzed using the probability of error in memory recall (PEMR) and the space-time neural capacity (ST-NC).
For autonomous machines equipped with vision capabilities and operating in a controlled environment, 3-D model-based object identification methodologies will in general solve rigid body recognition problems. In an uncontrolled environment, however, several factors pose difficulties for correct identification. We have addressed the problem of 3-D object recognition using a number of methods, including neural network classifiers and a Bayesian-like classifier for matching image data with model projection-derived data [1, 2]. The neural network classifiers began operation as simple feature vector classifiers. However, unmodelled signal behavior was learned with additional samples, yielding great improvement in classification rates. The model analysis drastically shortened the training time of both classification systems. In an environment where signal behavior is not accurately modelled, two separate forms of learning give the systems the ability to update estimates of this behavior. Required, of course, are sufficient samples to learn this new information. Given sufficient information and a well-controlled environment, identification of 3-D objects from a limited number of classes is indeed possible.
Recent successes of neural networks have led to an optimistic outlook for neural network applications in image processing (IP). This paper presents a general architecture for performing comparative studies of neural processing and more conventional IP techniques, as well as hybrid pattern recognition (PR) systems. Two hybrid PR systems have been simulated, each of which incorporates both conventional and neural processing techniques.
In this paper, a stereo vision matching algorithm, implemented via a neural network architecture, is described. The stereo matching problem, that is, finding the correspondence of features between two images, can be cast as a constraint satisfaction problem. The algorithm uses image edge features and assumes a parallel-axis camera geometry such that corresponding image points must lie on the same scanline. Intra-scanline constraints are used to perform multiple-constraint satisfaction searches for the correct match. Further, inter-scanline constraints are used to enforce consistent matches by eliminating those that do not receive enough support from neighboring scanlines. The inter-scanline constraints are implemented in a 3-D neural network formed by a stack of 2-D neuron layers. First, a multilayered network is designed to extract the feature points for matching using a static neural network. A similarity measure is defined for each pair of feature-point matches, which are then passed on to the second stage of the algorithm. The purpose of the second stage is to turn the difficult correspondence problem into a constraint satisfaction problem by imposing relational constraints. Results of computer simulations are presented to demonstrate the effectiveness of the approach.
Mutually inhibitory networks are the fundamental building blocks of many complex systems. Despite their apparent simplicity, they exhibit interesting behavior. We analyze a special class of such networks and provide parameters for reliable K-winner performance. We model the network dynamics using interactive activation and compare our results to the sigmoid model. When the external inputs are all equal, we can derive network parameters that reliably select the units with the larger initial activations, because the network converges to the nearest stable state. Conversely, when the initial activations are all equal, we can derive networks that reliably select the units with larger external inputs, because the network converges to the lowest-energy stable state. But when we mix initial activations with external inputs, we get anomalous behavior. We analyze these discrepancies, giving several examples. We also derive restrictions on initial states which ensure accurate K-winner performance when unequal external inputs are used. Much of this work was motivated by the K-winner networks described by Majani et al. in [1]. They use the sigmoid model and provide parameters for reliable K-winner performance. Their approach is based primarily on choosing an appropriate external input, the same for all units, that depends on K. We extend their work to the interactive activation model and analyze external inputs that are constant but possibly different for each unit more closely. Furthermore, we observe a parametric duality in that changing
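A rough sketch of the interactive-activation dynamics discussed here: each unit receives its own external input plus inhibition from all other units, and with suitable parameters the K most active units survive. The parameter values below are illustrative, not the ones derived in the paper.

import numpy as np

def k_winner_step(a, ext, inhibition=0.2, decay=0.1, dt=0.1,
                  a_min=0.0, a_max=1.0):
    """One interactive-activation update of a mutually inhibitory network.

    a   : current activations (1-D array)
    ext : constant external input to each unit (may differ per unit)
    Each unit is inhibited by the summed activity of all the other units."""
    others = a.sum() - a                      # lateral inhibition from the rest
    net = ext - inhibition * others
    grow = np.where(net > 0, net * (a_max - a), net * (a - a_min))
    a_new = a + dt * (grow - decay * a)
    return np.clip(a_new, a_min, a_max)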
There is an increasing demand for fast and reliable pattern recognition methods in many fields of industry, in particular in inspection. Conventional pattern recognition systems are mostly not capable of coping with such tasks. Neural nets seem to be well suited for most of the requirements. Preprocessing steps to reduce the number of neurons in cognitive units are essential in applying neural paradigms to vision. Joint space/spatial-frequency representations are discussed in view of their application to such image preprocessing. A system is proposed consisting of a low-level unit and a cognitive unit of the Hopfield type. Experimental results obtained with a simulation of this system are demonstrated. With the system, the joint recognition of separately learned patterns is possible.
Skeletons provide a compact and elegant description of the shape of binary objects. They are usually obtained by performing a distance transformation on the original binary data or by thinning. In this paper we summarize some of the existing techniques in this area and introduce iterative neural networks for skeletonization and thinning. The networks are trained to learn a deletion rule, and they iteratively delete points from the objects until only the skeleton remains.
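A sketch of the iterative scheme described above: a classifier (here a stand-in callable) inspects each object pixel's 3x3 neighbourhood and decides whether the pixel may be deleted, and the process repeats until nothing changes. The trained network itself is not reproduced; delete_rule is a placeholder for it.

import numpy as np

def iterative_thinning(binary, delete_rule, max_iters=100):
    """Iteratively delete object pixels flagged by delete_rule until stable.

    binary      : 2-D array of 0/1 pixels
    delete_rule : callable taking a 3x3 neighbourhood (centre pixel set) and
                  returning True if the centre pixel may be removed; in the
                  paper this rule is learned by a neural network."""
    img = binary.copy()
    for _ in range(max_iters):
        changed = False
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                if img[y, x] and delete_rule(img[y-1:y+2, x-1:x+2]):
                    img[y, x] = 0
                    changed = True
        if not changed:
            break
    return img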
Edge linearization operators are often used in computer vision and in neural network models of vision to reconstruct noisy or incomplete edges. Such operators gather evidence for the presence of an edge at various orientations across all image locations and then choose the orientation that best fits the data at each point. One disadvantage of such methods is that they often function in a winner-take-all fashion: only a single orientation can be represented at any point, so multiple edges cannot be represented where they intersect. For example, the neural Boundary Contour System of Grossberg and Mingolla implements a form of winner-take-all competition between orthogonal orientations at each spatial location to promote sharpening of noisy, uncertain image data. But that competition may produce rivalry, oscillation, instability, or mutual suppression when intersecting edges (e.g., a cross) are present. This "cross problem" exists for all techniques, including Markov Random Fields, in which a representation of a chosen, favored orientation suppresses representations of alternate orientations. A new adaptive technique, using both an inhibitory learning rule and an excitatory learning rule, weakens inhibition between neurons representing poorly correlated orientations. It may reasonably be assumed that neurons coding dissimilar orientations are less likely to be coactivated than neurons coding similar orientations. Multiplexing by superposition is then ordinarily generated: combinations of intersecting edges become represented by simultaneous activation of multiple neurons, each of which represents a
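A sketch of the kind of rule described: the inhibitory weight between two orientation-tuned units decays toward zero when their activities are poorly correlated, so intersecting edges stop suppressing each other, and stays strong otherwise. The learning rate, threshold, and update form are our illustrative choices, not the paper's exact rule.

import numpy as np

def update_inhibition(w_inhib, corr, eta=0.05, corr_threshold=0.2):
    """Adapt pairwise inhibitory weights from an orientation-unit correlation
    matrix: weaken inhibition between poorly correlated (rarely coactive)
    orientations, keep or strengthen it between well-correlated ones."""
    weaken = corr < corr_threshold
    w_new = np.where(weaken,
                     w_inhib * (1.0 - eta),          # decay toward zero
                     w_inhib + eta * corr)           # reinforce
    np.fill_diagonal(w_new, 0.0)                     # no self-inhibition
    return np.clip(w_new, 0.0, None)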
This paper introduces an understanding of the role of certain types of neural cells in the vertebrate retina as a process for edge detection and localization. A design for an electronic neural edge detector is proposed and analyzed. Our hope is that this and similar efforts will eventually lead to the formation of engineering principles that will assist developments in the science and technology of electronic hardware vision and machine perception. Neural networks are a natural choice for the hardware implementation of the edge detector because the requirement for parallel processing is satisfied; this is similar to the role played by the physical neurons in the vertebrate retina while processing images. A computer simulation is used to test the performance of this approach.
We consider high-capacity mean square error (MSE) associative processors (APs) and the first use of multiple APs. The application considered is multi-class distortion-invariant pattern recognition (PR).
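As background for the MSE associative processor, the weight matrix can be obtained as the least-squares solution mapping stored key vectors to their recollection vectors via a pseudoinverse; the sketch below shows only this generic construction, with the encoding of the recollection vectors left open.

import numpy as np

def train_mse_ap(keys, recollections):
    """Solve W X ~= Y in the minimum-squared-error sense, where X and Y hold
    the key and recollection vectors as columns."""
    X = np.column_stack(keys)
    Y = np.column_stack(recollections)
    return Y @ np.linalg.pinv(X)

def recall(W, key):
    """Associative recall of the output vector for a (possibly noisy) key."""
    return W @ key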
Reconstruction, Description, and Modeling of 3-D Surfaces
This paper deals with quantitative aspects of camera fixation for reconstruction of a static scene. In general, when the camera undergoes translation and rotation, there is an infinite number of points that produce equal optical flow at any instant in time. For the case where the rotation axis of the camera is perpendicular to the instantaneous translation vector, these points form a circle (called the Equal Flow Circle, or simply EFC) and a line. A special case of the EFCs is the Zero Flow Circle (ZFC), where both components of the optical flow are equal to zero. A fixation point is the intersection of all the ZFCs. Points inside and outside the ZFC are quantitatively mapped using the EFCs. We show how to find the exact location of points in space during fixation.
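The equal- and zero-flow loci follow from the standard instantaneous flow equations for a translating and rotating camera. The sketch below evaluates the flow of a 3-D point for a unit focal length under one common sign convention (which may differ from the paper's) and could be used to search numerically for zero-flow points.

import numpy as np

def optical_flow(P, t, omega):
    """Instantaneous image flow of 3-D point P = (X, Y, Z) for camera
    translation t = (tx, ty, tz) and rotation omega = (wx, wy, wz), with
    unit focal length. Points where both components vanish lie on the
    Zero Flow Circle discussed above."""
    X, Y, Z = P
    x, y = X / Z, Y / Z
    tx, ty, tz = t
    wx, wy, wz = omega
    u = (-tx + x * tz) / Z + (x * y * wx - (1 + x**2) * wy + y * wz)
    v = (-ty + y * tz) / Z + ((1 + y**2) * wx - x * y * wy - x * wz)
    return np.array([u, v])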
Analyzing sensor data to describe the shape of unknown three-dimensional objects randomly jumbled together is an area of great research interest. It is encountered in a large variety of industrial tasks of the bin-picking type. Classical approaches to bin-picking use strong object models. However, a priori models are not available in many unstructured material handling applications such as mailpiece singulation, random or mixed part feeding, scavenging, and other similar tasks. In such applications the key vision problem is determining how the partially visible objects relate to each other and to other invisible objects that may be underneath. The shapes of the partially visible objects are constrained by the invisible contacts between the objects, the forces such as friction and gravity acting at these contacts, and the assumed solidity (impenetrability) of the objects. This paper shows how heuristics such as object symmetry and assumptions such as general viewpoint can be used to generate initial hypotheses about the shapes of partially visible objects. These hypotheses are then iteratively expanded to determine the possible extents of the objects using criteria such as coplanarity of disconnected surfaces and intersection of swept volumes. A detailed example that illustrates the methods is described.
Image flow, the apparent motion of brightness patterns on the image plane, can provide important visual information such as distance, shape, surface orientation, and boundaries. It can be determined by either feature tracking or spatio-temporal analysis. The optical flow thus determined can be used to reconstruct the 3-D scene by determining the depth from the camera of every point in the scene. However, the optical flow determined by either of the methods mentioned above will be noisy. As a result, the depth information obtained from optical flow cannot be successfully used in practical applications such as image segmentation, 3-D reconstruction, path planning, etc. By using temporal integration we can increase the accuracy of both the optical flow and the depth determined from it. In this work we describe an incremental integration scheme, called the running average method, to temporally integrate the image flow. We integrate the depth from the camera obtained using optical flow determined from gradient-based methods, and show that the results of temporal integration are much more useful in practical applications than the results from local edge operators. Finally, we consider an image segmentation example and show the advantages of temporal integration.
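A minimal sketch of the running-average style of incremental integration described above, applied here to a per-pixel depth map; the update form is the generic incremental mean, which may differ in detail from the paper's scheme.

import numpy as np

def running_average_update(depth_est, depth_new, n):
    """Incrementally fold a new (noisy) depth map into the running estimate:
        est <- est + (new - est) / (n + 1),
    which equals the mean of all frames seen so far. n is the number of
    frames already integrated."""
    return depth_est + (depth_new - depth_est) / (n + 1), n + 1

# Usage: est, n = running_average_update(est, current_frame_depth, n)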
Three-dimensional (3-D) position estimation using a single passive sensor, particularly vision, has frequently suffered from unreliability and has involved complex processing methods. Past research has combined vision with other, active sensors, with the emphasis on data fusion. This paper attempts to integrate multiple passive 3-D cues - camera focus, camera vergence, and stereo disparity - using a single sensor. We argue that in the active vision paradigm an estimate of the position is obtained in the process of fixation, in which the imaging parameters are dynamically controlled to direct the attention of the imaging system at the point of interest. Fixation involves integrating the passive cues in a mutually consistent way in order to overcome the deficiencies of any individual cue and to reduce the complexity of processing. Taking into account their reliabilities, the individual position estimates from the different cues are combined to form a final overall estimate.
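One common way to combine cue estimates "taking into account their reliabilities" is inverse-variance weighting; whether this is exactly the rule used in the paper is not stated, so the sketch is only illustrative.

import numpy as np

def fuse_estimates(estimates, variances):
    """Fuse 3-D position estimates (e.g. from focus, vergence, and disparity)
    by weighting each estimate with the inverse of its error variance."""
    estimates = np.asarray(estimates, dtype=float)   # shape (n_cues, 3)
    weights = 1.0 / np.asarray(variances, dtype=float)
    fused = (weights[:, None] * estimates).sum(axis=0) / weights.sum()
    fused_var = 1.0 / weights.sum()
    return fused, fused_var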
The tasks of navigation and exploration require a robot to acquire and utilize knowledge about its environment. The robot's sensor data, often noisy or incomplete, must be augmented with additional constraints to derive a reasonable environmental description. Models of generic objects and their geometric relationships can provide such constraints. This paper describes an application of geometric reasoning that uses generic object and relationship models to merge images from three points of view and hypothesize missing elements of the scene.
The reconstruction of curves and surfaces from sparse data is an important task in many applications. In computer vision problems, the reconstructed curves and surfaces generally represent some physical property of a real object in a scene. Thus the characteristics of the reconstruction process differ from straightforward fitting of smooth curves and surfaces to a set of data. Since the collected data are represented in an arbitrarily chosen coordinate system, the reconstruction process should be invariant to the choice of the coordinate system (except for the transformation between the two coordinate systems). In this paper, reconstruction algorithms are presented for obtaining invariant estimates of both curves and surfaces. The reconstruction problem is cast as an ill-posed inverse problem which must be stabilized using a priori information about the constraint formation. Tikhonov regularization is used to form a well-posed mathematical problem statement. Examples of typical reconstructed objects are also given.
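A one-dimensional sketch of Tikhonov-regularized reconstruction: a curve sampled at sparse points is recovered by minimizing data misfit plus a smoothness penalty. The second-difference stabilizer and the value of lambda are illustrative choices, not the (invariant) stabilizers developed in the paper.

import numpy as np

def tikhonov_curve(n, sample_idx, sample_vals, lam=1.0):
    """Reconstruct an n-point curve z from sparse samples by solving
        min_z ||A z - d||^2 + lam * ||D z||^2,
    where A selects the sampled points and D is a second-difference
    (curvature) operator acting as the stabilizer."""
    A = np.zeros((len(sample_idx), n))
    A[np.arange(len(sample_idx)), sample_idx] = 1.0
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    lhs = A.T @ A + lam * D.T @ D
    rhs = A.T @ np.asarray(sample_vals, dtype=float)
    return np.linalg.solve(lhs, rhs)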
Image flow, the apparent motion of brightness patterns on the image plane, can provide important visual information such as distance, shape, surface orientation, and boundaries. It can be determined by either feature tracking or spatio-temporal analysis. We consider spatio-temporal methods and show how differential range can be estimated from time-space imagery. We generate a time-space image by considering only one scan line of the image obtained from a camera moving in the horizontal direction at each time interval. At the next instant of time we shift the previous line up by one pixel and obtain another line from the image. We continue the procedure to obtain a time-space image where each horizontal line represents the spatial relationship of the pixels and each vertical line the temporal relationship. Each feature along the horizontal scan line generates an edge in the time-space image, the slope of which depends upon the distance of the feature from the camera. We apply two mutually perpendicular edge operators to the time-space image and determine the slope of each edge. We show that this corresponds to optical flow. We use the result to obtain the differential range and show how this can be implemented on the Pipelined Image Processing Engine (PIPE). We use a simple technique to calibrate the camera and show how depth can be obtained from optical flow. We provide a statistical analysis of the
This paper presents a new technique for the three-dimensional recognition of symmetric objects from range images. Beginning from the implicit representation of quadrics, a set of ten coefficients is determined for symmetric objects like spheres, cones, cylinders, ellipsoids, and parallelepipeds. Instead of using these ten coefficients to fit smooth surfaces (patches) in the traditional way of determining curvatures, a new approach based on two-dimensional geometry is utilized. For each symmetric object, a unique set of two-dimensional curves is obtained from the various angles at which the object is intersected with a plane. Utilizing the same ten coefficients obtained earlier, and based on the discriminant method, each of these curves is classified as a parabola, a circle, an ellipse, or a hyperbola. Each symmetric object is found to possess a unique set of these two-dimensional curves, whereby it can be differentiated from the others. In other words, it is shown that instead of using the three-dimensional discriminant, which involves evaluating the rank of its matrix, it is sufficient to use the two-dimensional discriminant, which requires only three arithmetic operations. This approach appears to be more accurate and computationally inexpensive compared to the traditional approaches.
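The two-dimensional discriminant referred to above is the classical conic test on a cross-section curve Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0; the sketch below applies it, treating the circle as a special ellipse. The tolerance handling is ours.

def classify_conic(A, B, C, tol=1e-9):
    """Classify a planar cross-section Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0
    with the discriminant B^2 - 4AC: negative -> ellipse (circle if A == C
    and B == 0), zero -> parabola, positive -> hyperbola. Only three
    arithmetic operations are needed, as noted in the abstract."""
    disc = B * B - 4.0 * A * C
    if disc < -tol:
        return "circle" if abs(A - C) < tol and abs(B) < tol else "ellipse"
    if disc > tol:
        return "hyperbola"
    return "parabola"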
Topographic mappings are neighbourhood-preserving transformations between two-dimensional data structures. Mappings of this type are a general means of information processing in the vertebrate visual system. In this paper we present an application of a special topographic mapping, termed the inverse perspective mapping, to the computation of stereo and motion. More specifically, we study a class of algorithms for the detection of deviations from an expected "normal" situation. These expectations concern the global space-variance of certain image parameters (e.g., disparity or speed of feature motion) and can thus be implemented in the mapping rule. The resulting algorithms are minimal in the sense that no irrelevant information is extracted from the scene. In a technical application, we use topographic mappings for a stereo obstacle detection system. The implementation has been tested on an automatically guided vehicle (AGV) in an industrial environment.
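A sketch of the stereo use of an inverse perspective mapping: both images are warped onto an assumed ground plane, where the two warped views agree wherever the scene really is the ground plane, and large differences flag obstacles. The homographies are assumed known from calibration; the function names and threshold are ours.

import numpy as np

def warp_to_ground_plane(image, H_inv, out_shape):
    """Backward-warp an image onto the ground plane using the inverse of the
    ground-plane homography (nearest-neighbour sampling for brevity)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = H_inv @ pts
    sx = np.clip((src[0] / src[2]).astype(int), 0, image.shape[1] - 1)
    sy = np.clip((src[1] / src[2]).astype(int), 0, image.shape[0] - 1)
    return image[sy, sx].reshape(h, w)

def obstacle_mask(left, right, H_left_inv, H_right_inv, shape, thresh=20):
    """Pixels where the two ground-plane views disagree violate the expected
    'everything lies on the ground plane' situation, i.e. likely obstacles."""
    wl = warp_to_ground_plane(left, H_left_inv, shape).astype(int)
    wr = warp_to_ground_plane(right, H_right_inv, shape).astype(int)
    return np.abs(wl - wr) > thresh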
In [1] we have introduced a new pose estimation algorithm based on a tetrahedra-volume measurement method. In this paper we present an extended theoretical and experimental evaluation of this algorithm. The pose estimation technique exploits the redundancy in the geometric information inherent in the data to minimize errors in the recovery of the pose of a vision system. The results of this study show, among other things, that errors in the pose recovery are sensitive to the exact coordinates of image points. These errors increase drastically as the camera is moved away from the target. To correct for this sensitivity, an enhanced version of this algorithm, based on image/shape restoration, is introduced. This enhancement utilizes a Conjugate Gradient based technique to minimize the effect of errors inherent in the image data. This enhanced technique is tested on images of synthetic data and of real world scenes. These experiments show that a significant improvement of the solution to the pose can be achieved by this technique.
This paper addresses the problems of how to efficiently extract information from different data sources and how to fuse that information to achieve a more complete and accurate interpretation of the underlying 3-D scene. The paper describes an integrated approach to 3-D image interpretation. The approach is to first obtain region information by applying a multi-resolution segmentation technique to a monocular image. Elevation information is then extracted by applying an edge-based matching technique to a stereo pair covering the same scene. The region information, the elevation information, and a priori knowledge about the scene are then integrated using a rule-based scheme to classify the various objects in the scene. Results of applying this integrated technique to an overhead visible urban scene are presented.
This paper presents an algorithm and implementation for recovering range data from stereo images of edge points. The recovered data are used for object identification and localization. Parallel laser light planes are projected onto a polyhedral object. The light planes appear as sets of broken straight segments in the images. Discontinuities along these straight segments correspond to normal discontinuities on the underlying surfaces. Points of discontinuity in the images are extracted as edge points, and they lie on edges of the underlying object. Matching edge points between the stereo images gives range data. The matching algorithm uses the epipolar constraint, a relational constraint, and an ordering constraint. The range data can be arranged in order according to the light planes to which they belong. Experimental results of the matching process and the accuracy of the recovered data are presented.
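Once edge points are matched across the stereo pair, range follows from the usual triangulation relation for parallel cameras; the sketch below assumes rectified images (so the epipolar constraint reduces to "same scanline") and is a generic illustration rather than the paper's code.

def depth_from_match(x_left, x_right, y_left, y_right,
                     focal_px, baseline, y_tol=1):
    """Triangulate depth for a matched edge-point pair under a parallel-axis
    geometry: Z = f * b / disparity, with f in pixels and b in scene units."""
    if abs(y_left - y_right) > y_tol:
        raise ValueError("match violates the epipolar (same-scanline) constraint")
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("non-positive disparity; not a valid match")
    return focal_px * baseline / disparity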
The paper deals with the dynamic acquisition of range data using a multiple proximity sensor system located in a robot gripper, for the purpose of pose estimation of 3-D regular objects. A sensor structure is proposed and an algorithm for dynamic sensing is presented. Pose estimation of regular objects, including cylindrical objects, using Newton's method is described, and results of experiments are presented and discussed.
This paper is concerned with recovering qualitative descriptors of three-dimensional surface shape from binocular stereo disparity. In particular, the emphasis is on recovering the sign of Gaussian curvature of binocularly projected three-dimensional surface patches (i.e., surface patches that are locally parabolic, elliptic, or hyperbolic in shape). It is shown that this information can be recovered from patterns in the differential orientation of projected surface detail (e.g., texture).