KEYWORDS: Image segmentation, Video, Motion estimation, Video coding, Analytical research, Semantic video, Automatic tracking, Video processing, Video compression, Head
We develop a method for automatic segmentation of natural video sequences. The method is based on low-level spatial and temporal analyses. It has three design features that facilitate good region segmentation while keeping the computational complexity at a reasonable level. First, a preliminary seed-area identification and a final re-segmentation process are performed on each video frame to help region tracking. Second, a simple measure of texture homogeneity within a region is devised, and the segmentation tries to place object boundaries where the texture changes significantly. Third, a reduced-complexity motion estimation technique is used, so that dense motion fields can be computed at a reasonable complexity. The overall method is organized into four tasks: seed-area identification (for each frame), initial segmentation (only for the first frame in the sequence), motion-based segmentation (for all later frames), and region tracking and updating (also for all later frames). Some examples are provided to illustrate the performance of this method.
Many methods for video transmission error control have been proposed recently, especially in relation to transmission over bursty-error channels. However, a thorough, structured taxonomic framework for the analysis and design of such methods seems to be lacking. Such a framework would help clarify thinking and inspire new error control methods or combinations thereof. We present a framework for classifying the various transmission error control techniques. We then consider error control for H.263. Several techniques are presented from the viewpoint of the proposed analysis framework, and they illustrate how different techniques can be integrated coherently to achieve enhanced error resilience in the overall system. In particular, we employ slotted multiplexing at the multiplex level to reduce synchronization errors in variable-length coded data and to randomize the locations of the remaining error-corrupted image areas. At the source level, we introduce two standard-compatible schemes, called length-based intra refresh and motion vector pairing, which further limit spatial-temporal error propagation.
We consider optimal encoding of a sequence of video units under a given set of rate constraints, which may arise from finite codec delay, finite channel capacity, and finite codec buffer sizes. A Lagrange-multiplier approach is employed, and some useful properties of the optimal Lagrange-multiplier solution are obtained under the assumption that the allowed video data rates are continuous. Based on these properties, we derive two solution algorithms for discrete allocation. The algorithms are more efficient than those presented to date. The solution is optimal when the distortion-rate relations of the video units are convex and the selectable rates of the video units are uniformly spaced with the same granularity. When these conditions do not hold, the Lagrange-multiplier solution may be suboptimal, but it can be improved or optimized by a search around the solution.
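As an illustration of the Lagrange-multiplier idea only, the following minimal sketch allocates discrete rates under a single total-rate budget by bisecting on the multiplier. It is not the pair of algorithms from the abstract: the single budget (in place of the delay, channel, and buffer constraints), the function name `allocate`, and the sample rate-distortion tables are all assumptions made for this example.

```python
def allocate(units, budget, iters=50):
    """Pick one (rate, distortion) option per video unit.

    units  : list of option lists, each option a (rate, distortion) pair,
             listed in order of increasing rate (hypothetical data layout)
    budget : maximum total rate (a single constraint, assumed for simplicity)
    Returns the chosen option index for each unit.
    """
    def pick(lam):
        # For a fixed multiplier, each unit independently minimizes D + lam * R.
        choice = [min(range(len(u)), key=lambda i: u[i][1] + lam * u[i][0])
                  for u in units]
        rate = sum(u[i][0] for u, i in zip(units, choice))
        return choice, rate

    lo, hi = 0.0, 1.0
    # Grow the upper bracket until the budget is met.
    while pick(hi)[1] > budget:
        hi *= 2.0
        if hi > 1e12:
            raise ValueError("budget cannot be met by any combination")
    # Bisect: 'hi' always satisfies the budget, 'lo' approaches it from below.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pick(mid)[1] > budget:
            lo = mid
        else:
            hi = mid
    return pick(hi)[0]


# Two units with convex distortion-rate tables and uniformly spaced rates.
units = [
    [(1, 10.0), (2, 4.0), (3, 2.0)],
    [(1, 8.0), (2, 5.0), (3, 4.0)],
]
print(allocate(units, budget=4))
```

With convex tables and uniformly spaced rates, as in this toy input, the multiplier search lands on the best allocation for the budget. With non-convex tables it can miss options, which is the case where a local search around the result helps.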
Block-based transform coding employing the discrete cosine transform (DCT) is a popular technique in image and video compression. We consider Wiener-filter-based restoration for transform-coded images and motion video. The scheme operates at the decoder end. It capitalizes on the residual correlation among quantized DCT coefficients and the quantization errors. The scheme, termed error pattern compensation, or EPC for short, is derived by simplifying and extending related work of other researchers. When applied to motion video, it is activated for intraframe-coded macroblocks only. It first classifies an encoded image block according to some visually meaningful features and vector quantization. Different Wiener filters can be designed for different classes of input blocks. Experimental results show that the scheme yields a performance improvement in the range of a few percent in MSE or PSNR. It is more effective in restoring image blocks with greater quantizing distortion. Results also show that intraframe-coded pictures usually benefit the most from EPC, followed by predictive-coded pictures and then by bidirectional-predictive-coded pictures. At times, the last two types of pictures show performance degradation rather than gain. Possible reasons are discussed.
In typical image and video coding techniques, the choice of quantizer scales at the encoder plays a key role in determining the generated bit rate and the coding performance. The distortion-rate (D-R) curve of a video unit fully characterizes the relation between quantizing distortion and encoder output rate and can thus be used in choosing good quantizer scales. However, obtaining the D-R curves is computationally expensive. We develop a piecewise-linear/exponential model to approximate the true curves for macroblocks. Based on this D-R model, we devise a quantizer control method that is a modification of the one in Test Model 5 (TM5) for MPEG-2. A reference quantizer scale is first calculated from past bit usage and the buffer fullness, as in TM5. Then the slope of each approximated macroblock D-R curve at this quantizer scale is calculated. In line with human vision characteristics, an adjusted slope value is computed according to a macroblock activity measure. The quantizer scale yielding the adjusted slope on the D-R curve is found and used. Simulation results show that this method of quantizer choice can attain not only higher PSNR values but also higher visual quality in coded video.
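For orientation, the TM5 step that the method starts from can be sketched as follows. This covers only the standard TM5 reference quantizer and activity normalization as commonly described, not the D-R slope adjustment proposed in the abstract. The function and parameter names are assumptions for this example, and the result is left as a clipped float rather than rounded to an integer scale.

```python
def tm5_reference_q(d0, bits_so_far, target_bits, mb_index, mb_count,
                    bit_rate, picture_rate):
    """Reference quantizer scale Q_j for one macroblock, TM5 style.

    d0          : virtual buffer fullness at the start of the picture
    bits_so_far : bits generated by macroblocks already coded in this picture
    target_bits : bit target for the whole picture
    mb_index    : zero-based index of the current macroblock (j - 1)
    """
    reaction = 2.0 * bit_rate / picture_rate          # r = 2 * bit_rate / picture_rate
    fullness = d0 + bits_so_far - target_bits * mb_index / mb_count
    return fullness * 31.0 / reaction                 # Q_j = d_j * 31 / r


def tm5_normalized_activity(act, avg_act):
    """Normalized activity N_act in [0.5, 2] from a macroblock activity measure."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def tm5_mquant(q_ref, n_act):
    """Final macroblock quantizer scale, clipped to the range 1..31."""
    return min(31.0, max(1.0, q_ref * n_act))


# Example: 4 Mbit/s at 25 pictures/s, first macroblock of a picture.
q = tm5_reference_q(d0=160000, bits_so_far=0, target_bits=200000,
                    mb_index=0, mb_count=1350,
                    bit_rate=4_000_000, picture_rate=25)
print(tm5_mquant(q, tm5_normalized_activity(act=50.0, avg_act=50.0)))
```

A macroblock with average activity leaves the reference scale unchanged, while flat macroblocks are quantized more finely and busy ones more coarsely.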
We propose a new technique, called the motion restoration method, for estimating the global motion due to camera zoom and pan. It is composed of three steps: (1) block-matching motion estimation, (2) object assignment, and (3) global motion restoration. In this method, each image is first divided into a number of blocks. Step (1) may employ any suitable block-matching motion estimation algorithm to produce a set of motion vectors that capture the compound effect of zoom, pan, and object movement. Step (2) groups the blocks that share common global motion characteristics into one object. Step (3) then extracts the global motion parameters (zoom and pan) corresponding to each object from the compound motion vectors of its constituent blocks. The extraction of global motion parameters is accomplished via singular value decomposition. Experimental results show that this new technique is efficient in reducing the entropy of the block motion vectors for both zooming and panning motions and may also be used for image segmentation.
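Step (3) can be pictured with a small least-squares fit. The sketch below assumes a simple motion model that the abstract does not spell out: a block centered at (x, y), measured from the image center, moves by (z - 1)(x, y) + (p_x, p_y) for zoom factor z and pan (p_x, p_y). It solves the normal equations in closed form instead of using singular value decomposition, and the function name `fit_zoom_pan` is hypothetical.

```python
def fit_zoom_pan(positions, vectors):
    """Least-squares zoom factor and pan vector for one object.

    positions : list of (x, y) block centers relative to the image center
    vectors   : list of (vx, vy) block motion vectors, same order
    Assumed model: v = (zoom - 1) * position + pan.
    """
    n = len(positions)
    if n == 0 or n != len(vectors):
        raise ValueError("need equally many positions and vectors")

    sx = sum(x for x, _ in positions)
    sy = sum(y for _, y in positions)
    svx = sum(vx for vx, _ in vectors)
    svy = sum(vy for _, vy in vectors)
    spp = sum(x * x + y * y for x, y in positions)
    spv = sum(x * vx + y * vy
              for (x, y), (vx, vy) in zip(positions, vectors))

    # Eliminate the pan terms from the normal equations, then solve for a = zoom - 1.
    denom = spp - (sx * sx + sy * sy) / n
    if abs(denom) < 1e-12:
        raise ValueError("blocks must not all share one position")
    a = (spv - (sx * svx + sy * svy) / n) / denom

    pan = ((svx - a * sx) / n, (svy - a * sy) / n)
    return 1.0 + a, pan


# Synthetic check: 10% zoom-in plus a pan of (2, -1) pixels.
positions = [(x, y) for x in (-16, 0, 16) for y in (-16, 0, 16)]
vectors = [(0.1 * x + 2.0, 0.1 * y - 1.0) for x, y in positions]
print(fit_zoom_pan(positions, vectors))
```

Subtracting the fitted global motion from each block vector leaves only the object movement, which is one way to see why the residual vectors would have lower entropy.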
A video encoder has the task of producing the lowest-distortion coded video subject to some constraints on delay, rate, and buffer conditions. We present a general optimization approach to this problem in a framework of delayed coding, and we motivate a certain formulation of the optimization objective. Two forms of distortion measure are considered, namely the maximum distortion and the total distortion, each defined over a segment of the video to be coded. These distortion measures are chosen for their mathematical tractability and practical importance. A solution (computational algorithm) for each case is described. Subject to some conditions, the solutions may be suboptimal. Simulation results show improved performance with this approach compared to a simple, typical approach that varies the quantization scale linearly with the encoder buffer level.
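The baseline used for comparison, a quantization scale that grows linearly with encoder buffer level, might look like the sketch below. The scale range of 1 to 31, the rounding, and the function name are assumptions for illustration, since the abstract gives no details of the baseline.

```python
def linear_buffer_qscale(buffer_bits, buffer_size, q_min=1, q_max=31):
    """Quantization scale that rises linearly with encoder buffer fullness.

    buffer_bits : bits currently waiting in the encoder buffer
    buffer_size : buffer capacity in bits
    q_min/q_max : scale range (1..31 is an MPEG-style assumption)
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    # Clamp fullness to [0, 1] so over- or underflow cannot leave the range.
    fullness = min(1.0, max(0.0, buffer_bits / buffer_size))
    return round(q_min + (q_max - q_min) * fullness)


for level in (0, 500_000, 1_000_000):
    print(level, linear_buffer_qscale(level, 1_000_000))
```

Such a rule reacts only to the current buffer state and ignores the content of upcoming pictures, which is the kind of limitation a delayed-coding optimization can address.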