KEYWORDS: Video compression, Imaging systems, Image processing, Video, Data processing, Mobile communications, Detection and tracking algorithms, Image segmentation, Image processing algorithms and systems, Algorithm development
Ambient Intelligence is expected to become one of the key driving factors of the semiconductor industry in this decade. One of the most promising areas in this respect is the advent of embedded smart imaging in a variety of consumer applications, such as mobile communication devices and the automotive domain. An efficient VLSI implementation of these applications requires architectural concepts that enable the extraction of objects and associated information from video sequences in real time. The main architectural challenge is to find an appropriate trade-off between architectural flexibility and scalability, needed to cope with moderate variations of the applied smart imaging algorithms, on the one hand and cost efficiency of the implementation on the other. This paper describes the algorithmic and architectural requirements for implementing smart imaging applications in these fields. The target system is presented; it is based on an embedded RISC processor, embedded memory, and cores that accelerate essential functions such as morphological operations, connected component labeling, and motion extraction. The functional system partitioning relies on hardware acceleration of core functions that extract low-level information from the images of a video sequence. This information is passed to the embedded RISC processor, where software further abstracts and interprets the image content. One focal point of this paper is the derivation of efficient architectural concepts for smart imaging coprocessors, which act as a system toolbox for accelerating the required smart imaging core functions.
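To make the partitioning described above concrete, the following is a minimal software sketch of two of the named core functions, a morphological erosion and connected component labeling, producing the kind of low-level object information that would be handed to the RISC processor. It is an illustration only, not the architecture from the paper: the function names `erode` and `label_components`, the 3x3 structuring element, 4-connectivity, and the bounding-box output format are all assumptions made here.

```python
from collections import deque


def erode(img):
    """Binary erosion with a 3x3 square structuring element (assumed here).

    Border pixels are set to background because their neighborhood is incomplete.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(img[y + j][x + i]
                                for j in (-1, 0, 1) for i in (-1, 0, 1)))
    return out


def label_components(img):
    """4-connected component labeling on a binary image.

    Returns one dict per object with its bounding box
    (x_min, y_min, x_max, y_max) and pixel count, in raster-scan order.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Flood-fill one object, tracking its extent as we go.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            objects.append({"bbox": (x0, y0, x1, y1), "area": area})
    return objects
```

In a system of the kind the abstract describes, functions like these would presumably run in the accelerator cores, while the list of objects would be interpreted in software.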
KEYWORDS: Video, Signal processing, Video coding, Video processing, Very large scale integration, Silicon, Clocks, Image compression, Video compression, Standards development
The paper presents an overview of architectures for VLSI implementations of video compression schemes as specified by the standardization committees of the ITU and ISO, focusing on programmable architectures. Programmable video signal processors are classified as homogeneous or heterogeneous processor architectures, and architectures of design examples reported in the literature are presented. Heterogeneous processors outperform homogeneous processors because dedicated modules adapt them to the requirements of specific subtasks. The majority of heterogeneous processors incorporate dedicated modules for high-performance subtasks of high regularity, such as DCT and block matching. After normalization to a hypothetical 1.0 micron CMOS process, typical linear relationships between silicon area and throughput rate have been determined for the different architectural styles. This relationship indicates a figure of merit for silicon efficiency.
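The abstract does not state the scaling rule behind the normalization. As a rough sketch, the snippet below assumes common first-order CMOS scaling: silicon area grows with the square of the feature size and throughput falls in proportion to it. Both the rule and the names `normalize` and `silicon_efficiency` are assumptions introduced for illustration, and the numbers in the usage note are hypothetical, not taken from the paper.

```python
def normalize(area_mm2, throughput, feature_um, reference_um=1.0):
    """Scale a design to a reference process (assumed first-order scaling).

    Area is assumed to scale with (feature size)^2 and throughput
    with 1 / (feature size).
    """
    scale = reference_um / feature_um
    return area_mm2 * scale ** 2, throughput / scale


def silicon_efficiency(area_mm2, throughput):
    """Throughput per unit of normalized area, used as a figure of merit."""
    return throughput / area_mm2
```

For example, under these assumptions a hypothetical 100 mm² design in a 0.5 micron process with a throughput of 10 units would normalize to 400 mm² and 5 units.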
KEYWORDS: Image processing, Video coding, Image segmentation, Quantization, Data processing, Video, Process control, Motion estimation, Signal processing, Silicon
A multiprocessor architecture for compact realizations of video coding applications is presented. Current video coding standards, e.g. H.261 and MPEG, are based on a hybrid coding scheme, which allows parallelization at both data level and task level. Parallelization at data level is achieved by distributing the image data among the processors, each of which works on locally stored image segments. Parallelization at task level is realized inside the processors by functional modules that are adapted to classes of algorithms. The functionality of the modules and the number of their data paths are determined by efficiency calculations, resulting in a module for motion estimation and a block-level coprocessor for transform and quantization. Control and synchronization are handled by a programmable module. A hierarchical control concept reduces the on-chip control overhead. A chip size of 70 mm² is estimated for one processor in 0.6 micrometer CMOS technology. At an operating frequency of 65 MHz, one chip performs the computations for a full CIF H.261 codec with a frame rate of 30 Hz and motion estimation based on a ±15 pel full-search block matching algorithm.
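To give a sense of the workload behind the motion estimation module, here is a minimal sketch of full-search block matching together with an operation count for the CIF case. The abstract does not name the matching criterion; the sum of absolute differences (SAD) is assumed here, as are the function names. The CIF frame size of 352x288 and the 16x16 H.261 macroblock are standard values, not figures from the abstract.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, r=15):
    """Test every displacement in [-r, r]^2 and return (cost, dx, dy)
    of the best match. Candidates outside the reference frame are skipped."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if not (0 <= bx + dx and bx + dx + n <= width
                    and 0 <= by + dy and by + dy + n <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def sad_ops_per_second(width=352, height=288, n=16, r=15, fps=30):
    """Upper bound on absolute-difference operations per second,
    ignoring the reduced search area at the frame borders."""
    blocks = (width // n) * (height // n)
    candidates = (2 * r + 1) ** 2
    return blocks * candidates * n * n * fps
```

With the default arguments, `sad_ops_per_second()` evaluates to 2,922,670,080, i.e. about 2.9 billion absolute differences per second, or roughly 45 per clock cycle at 65 MHz. This is only an upper-bound estimate under the SAD assumption, but it suggests why a dedicated, highly parallel module is used for this task.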