The MPEG-5 Essential Video Coding (EVC) standard is currently being prepared as a video coding standard of the ISO/IEC Moving Picture Experts Group. The main goal of the EVC standard development is to provide significantly improved compression capability over existing video coding standards together with timely publication of licensing terms. This paper provides an overview of the features and characteristics of the MPEG-5 EVC standard.
KEYWORDS: Video coding, Standards development, Digital image processing, Image processing, Raster graphics, Video
The state-of-the-art video coding standard, H.265/HEVC, was finalized in 2013 and achieves roughly 50% bit rate saving compared to its predecessor, H.264/MPEG-4 AVC. In this paper, two additional merge candidates, the advanced temporal motion vector predictor and the spatial-temporal motion vector predictor, are developed to improve the motion information prediction scheme within the HEVC structure. The proposed method allows each Prediction Unit (PU) to fetch multiple sets of motion information from multiple blocks smaller than the current PU. By splitting a large PU into sub-PUs and filling in motion information for all the sub-PUs of the large PU, the signaling cost of motion information can be reduced. This paper describes the above-mentioned techniques in detail and evaluates their coding performance benefits based on the common test conditions used during HEVC development. Simulation results show that a 2.4% performance improvement over HEVC can be achieved.
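The sub-PU mechanism described above can be sketched in a few lines. This is an illustrative Python sketch, not the HM reference software; the sub-PU size and the `collocated_mv_field` lookup are assumptions for the example.

```python
# Illustrative sketch: each sub-PU of a large PU inherits the motion vector
# stored at its collocated position in a temporal reference picture, so no
# per-sub-PU motion needs to be signaled.

SUB_PU = 4  # assumed sub-PU size in samples

def fetch_sub_pu_motion(pu_x, pu_y, pu_w, pu_h, collocated_mv_field):
    """Map each sub-PU's top-left corner to the motion vector found at the
    collocated position in the reference picture."""
    motion = {}
    for y in range(pu_y, pu_y + pu_h, SUB_PU):
        for x in range(pu_x, pu_x + pu_w, SUB_PU):
            # collocated_mv_field is a hypothetical lookup: (x, y) -> (mvx, mvy)
            motion[(x, y)] = collocated_mv_field(x, y)
    return motion

# Example: a 16x8 PU split into 4x4 sub-PUs yields 8 inherited vectors.
mvs = fetch_sub_pu_motion(0, 0, 16, 8, lambda x, y: (x // 4, y // 4))
print(len(mvs))  # 8 sub-PUs
```

Because each sub-PU carries its own inherited vector, a single merge candidate can represent non-uniform motion inside a large PU.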
The state-of-the-art video coding standard, H.265/HEVC, was finalized in 2013 and achieves roughly 50% bit rate saving compared to its predecessor, H.264/MPEG-4 AVC. This paper provides evidence that there is still potential for further coding efficiency improvements. A brief overview of HEVC is first given, and our improvements to each main module of HEVC are then presented. For instance, the recursive quadtree block structure is extended to support larger coding units and transform units. The motion information prediction scheme is improved by advanced temporal motion vector prediction, which inherits the motion information of each small block within a large block from a temporal reference picture. Cross-component prediction with a linear prediction model improves intra prediction, and overlapped block motion compensation improves the efficiency of inter prediction. Furthermore, coding of both intra and inter prediction residuals is improved by an adaptive multiple transform technique. Finally, in addition to the deblocking filter and SAO, an adaptive loop filter is applied to further enhance the reconstructed picture quality. This paper describes the above-mentioned techniques in detail and evaluates their coding performance benefits based on the common test conditions used during HEVC development. The simulation results show that significant performance improvement over the HEVC standard can be achieved, especially for high-resolution video material.
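The cross-component prediction mentioned above fits a linear model chroma ≈ α·luma + β from the causal neighboring samples. The following is a floating-point least-squares sketch of that fit; the standardized tools use integer arithmetic and specific neighbor subsampling, which are omitted here.

```python
def cclm_params(luma_neighbors, chroma_neighbors):
    """Least-squares fit chroma ~= alpha * luma + beta over the causal
    neighboring samples (floating-point sketch of cross-component
    linear-model prediction)."""
    n = len(luma_neighbors)
    sx = sum(luma_neighbors)
    sy = sum(chroma_neighbors)
    sxx = sum(l * l for l in luma_neighbors)
    sxy = sum(l * c for l, c in zip(luma_neighbors, chroma_neighbors))
    denom = n * sxx - sx * sx
    alpha = (n * sxy - sx * sy) / denom if denom else 0.0
    beta = (sy - alpha * sx) / n
    return alpha, beta

# Example: neighbors that follow chroma = 2 * luma + 5 exactly.
print(cclm_params([10, 20, 30, 40], [25, 45, 65, 85]))  # (2.0, 5.0)
```

The chroma block is then predicted by applying (α, β) to the co-located reconstructed luma samples, so only the residual needs to be coded.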
The screen content coding extension of HEVC (SCC) is being developed by the Joint Collaborative Team on Video Coding (JCT-VC) of ISO/IEC MPEG and ITU-T VCEG. Screen content usually features a mix of camera-captured content and a significant proportion of rendered graphics, text, or animation. These two types of content exhibit distinct characteristics and require different compression schemes to achieve better coding efficiency. This paper presents efficient block matching schemes for coding screen content that better capture its spatial and temporal characteristics. The proposed schemes fall into three categories: a) hash-based global-region block matching for intra block copy, b) selective-search-based local-region block matching for inter frame prediction, and c) hash-based global-region block matching for inter frame prediction. In the first part, a hash-based full-frame block matching algorithm is designed for intra block copy to handle repeating patterns and large motions when the reference picture consists of already decoded samples of the current picture. In the second part, a selective local-area block matching algorithm is designed for inter motion estimation to handle sharp edges, high spatial frequencies, and a non-monotonic error surface. In the third part, a hash-based full-frame block matching algorithm is designed for inter motion estimation to handle repeating patterns and large motions across the temporal reference picture. The proposed schemes are compared against HM-13.0+RExt-6.0, the state-of-the-art screen content codec. The first part provides luma BD-rate gains of -26.6%, -15.6%, and -11.4% for the AI, RA, and LD TGM configurations. The second part provides luma BD-rate gains of -10.1% and -12.3% for the RA and LD TGM configurations. The third part provides luma BD-rate gains of -12.2% and -11.5% for the RA and LD TGM configurations.
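The hash-based full-frame matching idea can be sketched as follows: every block position in the reference area is hashed once into a table, and a current block then retrieves all exactly matching candidates in O(1) instead of searching the whole frame. This is an illustrative sketch, assuming an 8x8 block size and an MD5 hash; the actual proposals use cheaper, purpose-built hashes.

```python
import hashlib

BLOCK = 8  # assumed block size for the sketch

def block_hash(frame, x, y):
    """Hash the BLOCK x BLOCK sample values at (x, y); frame is a 2-D list
    of 8-bit sample values."""
    data = bytes(frame[y + j][x + i] for j in range(BLOCK) for i in range(BLOCK))
    return hashlib.md5(data).digest()

def build_hash_table(frame, w, h):
    """Index every block position in the frame by its content hash."""
    table = {}
    for y in range(h - BLOCK + 1):
        for x in range(w - BLOCK + 1):
            table.setdefault(block_hash(frame, x, y), []).append((x, y))
    return table

# Example: a frame whose rows repeat with period 4, so distant positions
# with identical content share one hash bucket.
frame = [[x % 4 for x in range(16)] for _ in range(16)]
table = build_hash_table(frame, 16, 16)
candidates = table[block_hash(frame, 0, 0)]
```

Repeating patterns, which defeat a local gradient-descent search, collapse into a single bucket here, which is why the scheme helps on text and graphics regions.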
3D-AVC, being developed under the Joint Collaborative Team on 3D Video Coding (JCT-3V), significantly
outperforms Multiview Video Coding plus Depth (MVC+D), which simultaneously encodes texture views
and depth views with the multiview extension of H.264/AVC (MVC). However, when 3D-AVC is
configured to support multiview compatibility, in which texture views are decoded without depth information,
the coding performance degrades significantly. The reason is that the advanced coding tools incorporated
into 3D-AVC do not perform well without a disparity vector converted from the depth information.
In this paper, we propose a disparity vector derivation method utilizing only the information of texture views.
Motion information of neighboring blocks is used to determine a disparity vector for a macroblock, so that the
derived disparity vector can be used efficiently by the coding tools in 3D-AVC. The proposed method significantly
improves the coding gain of 3D-AVC in the multiview compatible mode, achieving about 20% BD-rate saving in the
coded views and 26% BD-rate saving in the synthesized views on average.
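The neighbor-based derivation can be sketched as a fixed-order scan: the first neighboring block whose motion vector points to an inter-view reference donates its vector as the disparity vector. This is a simplified illustration of the idea, not the exact 3D-AVC checking order or syntax.

```python
def derive_disparity_vector(neighbors):
    """Scan neighboring blocks in a fixed order and return the first
    disparity motion vector, i.e. one whose reference is an inter-view
    picture. Each neighbor is a (mv, is_inter_view) pair; the fallback
    to the zero vector is an assumption of this sketch."""
    for mv, is_inter_view in neighbors:
        if is_inter_view:
            return mv
    return (0, 0)  # no inter-view neighbor found: fall back to zero disparity

# Example: the second neighbor holds an inter-view vector, so it wins.
dv = derive_disparity_vector([((3, 0), False), ((7, 1), True), ((9, 0), True)])
print(dv)  # (7, 1)
```

Because the scan uses only texture-view motion fields, it works even when depth views are discarded, which is exactly the multiview compatible decoding scenario.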
This article introduces the adaptive loop filtering (ALF) techniques being considered for the HEVC standard. The key idea
of ALF is to minimize the mean square error between original pixels and decoded pixels using Wiener-based adaptive
filter coefficients. ALF is located at the last processing stage of each picture and can be regarded as a tool that tries to catch
and fix artifacts from the previous stages. The suitable filter coefficients are determined by the encoder and explicitly
signaled to the decoder. To achieve better coding efficiency, especially for high-resolution videos, local
adaptation is used for luma signals by applying different filters to different regions of a picture. In addition to filter
adaptation, filter on/off control at the largest coding unit (LCU) level also helps improve coding efficiency.
Syntax-wise, filter coefficients are sent in a picture-level header called the adaptation parameter set (APS), and filter on/off
flags of LCUs are interleaved at the LCU level in the slice data. Besides supporting picture-based optimization of ALF, the
syntax design can support low-delay applications as well. When the filter coefficients in the APS are trained on a
previous picture, filter on/off decisions can be made on the fly during encoding of LCUs, so the encoding latency is only
one LCU. Simulation results show that ALF achieves on average 5% bit rate reduction, and up to 27%, for 25 HD
sequences. The run time increases are 1% for encoders and 10% for decoders with un-optimized C++ code.
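The Wiener derivation behind ALF amounts to solving the normal equations R·c = p, where R is the autocorrelation of the decoded samples and p their cross-correlation with the originals. A minimal two-tap, 1-D, floating-point sketch (the real filter is 2-D with many taps and integer coefficients) looks like this:

```python
def wiener_2tap(decoded, original):
    """Solve the 2x2 normal equations R c = p for a two-tap filter that
    predicts original[i] from decoded[i] and decoded[i+1]. Floating-point
    sketch of the Wiener derivation; ALF itself uses larger 2-D filters."""
    r00 = r01 = r11 = p0 = p1 = 0.0
    for i in range(len(original)):
        d0, d1 = decoded[i], decoded[i + 1]
        r00 += d0 * d0
        r01 += d0 * d1
        r11 += d1 * d1
        p0 += d0 * original[i]
        p1 += d1 * original[i]
    det = r00 * r11 - r01 * r01
    c0 = (p0 * r11 - p1 * r01) / det
    c1 = (r00 * p1 - r01 * p0) / det
    return c0, c1

# Example: originals generated by known taps (0.5, 0.25) are recovered exactly.
decoded = [1, 2, 3, 4, 5, 6]
original = [0.5 * decoded[i] + 0.25 * decoded[i + 1] for i in range(5)]
print(wiener_2tap(decoded, original))  # (0.5, 0.25)
```

The encoder solves this system per region, quantizes the coefficients, and signals them in the APS; the decoder just applies them.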
Transform coefficient coding in HEVC encompasses the scanning patterns and the coding methods for the last significant coefficient, significance map, coefficient levels and sign data. Unlike H.264/AVC, HEVC has a single entropy coding mode based on the context adaptive binary arithmetic coding (CABAC) engine. Due to this, achieving high throughput for transform coefficient coding was an important design consideration. This paper analyzes the throughput of different components of transform coefficient coding with special emphasis on the explicit coding of the last significant coefficient position and high throughput binarization. A comparison with H.264/AVC transform coefficient coding is also presented, demonstrating that HEVC transform coefficient coding achieves higher average and worst case throughput.
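The explicit last-position coding discussed above splits each coordinate into a context-coded prefix (a group index) and a bypass-coded fixed-length suffix, so the suffix bins can be processed at high throughput. The sketch below follows the HEVC prefix/suffix arithmetic for positions of at least 4; it is an illustration of the binarization, not the full CABAC process.

```python
def binarize_last_pos(pos):
    """Split a last significant coefficient coordinate into a context-coded
    prefix (group index) and a bypass fixed-length suffix, HEVC-style.
    Returns (prefix, suffix, suffix length in bits)."""
    if pos < 4:
        return pos, 0, 0  # small positions: prefix only, no suffix
    msb = pos.bit_length() - 1
    prefix = 2 * msb + ((pos >> (msb - 1)) & 1)  # group of size 2**(msb-1)
    suffix_len = msb - 1
    suffix = pos & ((1 << suffix_len) - 1)
    return prefix, suffix, suffix_len

def debinarize_last_pos(prefix, suffix):
    """Inverse mapping: recover the position from prefix and suffix."""
    if prefix < 4:
        return prefix
    return ((2 + (prefix & 1)) << ((prefix >> 1) - 1)) + suffix

# Example: position 5 codes as prefix 4 with a 1-bit suffix of 1.
print(binarize_last_pos(5))  # (4, 1, 1)
```

Since the suffix bins are bypass-coded, a decoder can consume them in bulk, which is one source of the throughput advantage the paper analyzes.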
3D film and 3D TV are becoming reality, and more facilities and devices are now 3D capable. Compared to capturing 3D
video content directly, 2D-to-3D video conversion is a low-cost, backward-compatible alternative. There also exists a
tremendous amount of monoscopic 2D video content that is of high interest to be displayed on 3D devices with
noticeable immersiveness. 2D-to-3D video conversion has therefore drawn much attention recently. In this paper, a low-complexity
2D-to-3D conversion algorithm is presented. The conversion generates stereo video pairs by 3D warping
based on per-pixel depth maps estimated jointly from motion and color cues. Subjective
tests show that the proposed algorithm achieves 3D perception with acceptable artifacts.
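The pipeline above (depth estimation, then warping) can be sketched with a common motion-cue heuristic: pixels that move more are assumed closer, and the synthesized view shifts each pixel by a disparity proportional to its depth. This is an illustrative sketch, not the paper's exact cue fusion; the depth scale and disparity gain are assumptions.

```python
def depth_from_motion(mv_magnitudes, max_depth=255):
    """Heuristic depth cue: larger motion magnitude -> larger depth value
    (closer to the viewer). Integer sketch over one row of pixels."""
    peak = max(mv_magnitudes) or 1  # avoid division by zero on static rows
    return [max_depth * v // peak for v in mv_magnitudes]

def warp_row(row, depth, gain=0.05):
    """3D-warp one luma row: shift each sample left by a disparity
    proportional to its depth to synthesize the second view. Samples that
    receive no warped value keep the original (a crude hole-filling)."""
    out = list(row)
    for x, (v, d) in enumerate(zip(row, depth)):
        nx = x - int(gain * d)
        if 0 <= nx < len(row):
            out[nx] = v
    return out

# Example: normalized depth for a row with motion magnitudes 0, 5, 10.
print(depth_from_motion([0, 5, 10]))  # [0, 127, 255]
```

A real converter would also fuse the color cue, smooth the depth map temporally, and fill disocclusion holes by interpolation rather than copying.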
This paper describes the video coding technology proposal submitted by Qualcomm Inc. in response to a joint call for
proposals (CfP) issued by ITU-T SG16 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) in January 2010. The proposed
video codec follows a hybrid coding approach based on temporal prediction, followed by transform, quantization, and
entropy coding of the residual. Some of its key features are extended block sizes (up to 64x64), recursive integer
transforms, single-pass switched interpolation filters with offsets (single-pass SIFO), mode-dependent directional
transforms (MDDT) for intra coding, luma and chroma high-precision filtering, geometry motion partitioning, and adaptive
motion vector resolution. It also incorporates internal bit-depth increase (IBDI) and modified quadtree-based adaptive
loop filtering (QALF). Simulation results are presented for a variety of bit rates, resolutions, and coding configurations to
demonstrate the high compression efficiency achieved by the proposed video codec at a moderate level of encoding and
decoding complexity. For the random access hierarchical B configuration (HierB), the proposed video codec achieves an
average BD-rate reduction of 30.88% compared to the H.264/AVC alpha anchor. For the low delay hierarchical P (HierP)
configuration, the proposed video codec achieves average BD-rate reductions of 32.96% and 48.57% compared to the
H.264/AVC beta and gamma anchors, respectively.
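The BD-rate figures quoted throughout these abstracts come from the standard Bjøntegaard measurement: fit log-rate as a cubic polynomial in PSNR for both codecs, integrate over the overlapping PSNR range, and convert the average log-rate gap to a percentage. The sketch below assumes four rate points per curve, as in the common test conditions.

```python
import numpy as np

def bd_rate(r_anchor, psnr_anchor, r_test, psnr_test):
    """Bjoentegaard-delta bit rate in percent (negative = bit rate savings
    of the test codec over the anchor at equal PSNR)."""
    la, lt = np.log(r_anchor), np.log(r_test)
    pa = np.polyfit(psnr_anchor, la, 3)   # cubic fit: log-rate vs PSNR
    pt = np.polyfit(psnr_test, lt, 3)
    lo = max(min(psnr_anchor), min(psnr_test))  # overlapping PSNR interval
    hi = min(max(psnr_anchor), max(psnr_test))
    ia = np.polyval(np.polyint(pa), hi) - np.polyval(np.polyint(pa), lo)
    it = np.polyval(np.polyint(pt), hi) - np.polyval(np.polyint(pt), lo)
    return (np.exp((it - ia) / (hi - lo)) - 1) * 100

# Example: a codec spending exactly half the rate at every PSNR point
# measures as a 50% BD-rate reduction.
rates = [1000, 2000, 4000, 8000]
psnrs = [30, 33, 36, 39]
print(round(bd_rate(rates, psnrs, [r / 2 for r in rates], psnrs)))  # -50
```

Reductions such as the 30.88% above therefore mean the proposal needs roughly 31% less bit rate than the anchor for the same objective quality.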
This paper describes the design of transforms for extended block sizes for video coding. The proposed transforms are
orthogonal integer transforms, based on a simple recursive factorization structure, that allow very compact and efficient
implementations. We discuss the techniques used for finding integer and scale factors in these transforms and describe our
final design. We evaluate the efficiency of the proposed transforms in VCEG's H.265/JMKTA framework and show that
they achieve nearly identical performance compared to much more complex transforms in the current test model.
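The structural idea behind such recursive factorizations is the butterfly: an N-point transform is built from two N/2-point transforms applied to sums and differences of the inputs, so the implementation cost grows as N·log N rather than N². The Walsh-Hadamard transform below is the simplest integer transform with this structure and stands in for the (more elaborate) DCT-like factorizations of the paper.

```python
def hadamard(x):
    """Recursive butterfly factorization, shown for the Walsh-Hadamard
    transform: an N-point transform from two N/2-point transforms of the
    element-wise sums and differences. N must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    sums = [x[i] + x[i + half] for i in range(half)]
    diffs = [x[i] - x[i + half] for i in range(half)]
    return hadamard(sums) + hadamard(diffs)

# Example: a flat input compacts all energy into the DC coefficient.
print(hadamard([1, 1, 1, 1]))  # [4, 0, 0, 0]
```

Because every stage uses only integer additions and subtractions, the transform is exactly invertible in integer arithmetic, which is the property the proposed extended-size transforms also preserve.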
KEYWORDS: Quantization, Computer programming, Distortion, Video coding, Binary data, Motion estimation, Video compression, Digital image processing
In this paper, a rate-distortion optimized quantization scheme is described with application to H.264 video encoding. An efficient implementation of H.264 macroblock level adaptive quantization parameter selection is also described. Together these two encoder-only changes can achieve on average over 6% bit rate reduction under common testing conditions that are used in the H.264 standardization community. The described techniques provide this improvement in compression capability while retaining conformance of the encoded data to the H.264 standard. Thus, full compatibility with standard decoders can be achieved when applying these techniques.
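Rate-distortion optimized quantization replaces the fixed rounding rule of a scalar quantizer with an explicit cost comparison: for each coefficient, the encoder evaluates a few candidate levels and keeps the one minimizing D + λ·R. The sketch below assumes a hypothetical `bits(level)` rate estimate; in a real encoder this comes from the entropy coder state.

```python
def rdoq_level(coeff, qstep, lam, bits):
    """Pick the quantized level minimizing distortion + lambda * rate
    among zero and the two nearest candidates. `bits(level)` is a
    hypothetical rate-estimate callback for this sketch."""
    base = int(abs(coeff) / qstep)  # plain quantization, rounded down
    best, best_cost = 0, None
    for level in sorted({0, base, base + 1}):
        dist = (abs(coeff) - level * qstep) ** 2
        cost = dist + lam * bits(level)
        if best_cost is None or cost < best_cost:
            best, best_cost = level, cost
    return best

# Example with a toy rate model: larger levels cost more bits, so a large
# lambda pushes the decision toward zero (coefficient "thresholding").
bits = lambda level: 1 + 2 * level
print(rdoq_level(9, 4, 0.0, bits), rdoq_level(9, 4, 100.0, bits))  # 2 0
```

This is an encoder-only decision: whatever level is chosen, the bitstream stays standard-conformant, which is why the technique keeps full decoder compatibility.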
New criteria for shape preservation are presented. These criteria are applied in optimizing soft morphological filters. The filters are optimized by simulated annealing and genetic algorithms which are briefly reviewed. A situation where the given criteria give better results compared to the traditional MAE and MSE criteria is illustrated.
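The simulated annealing used for filter optimization follows the generic accept/reject loop sketched below: worse candidates are accepted with probability exp(-Δ/T) while the temperature T cools. The cost and neighbor functions are placeholders; in the paper they would encode the shape-preservation criteria and perturbations of the soft morphological filter parameters.

```python
import math
import random

def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.95, steps=500):
    """Generic simulated annealing: accept improving moves always, and
    worsening moves with probability exp(-delta / T) as T cools. Returns
    the best state seen and its cost."""
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    x, c, t = x0, cost(x0), t0
    best, best_c = x, c
    for _ in range(steps):
        y = neighbor(x, rng)
        cy = cost(y)
        if cy < c or rng.random() < math.exp(-(cy - c) / t):
            x, c = y, cy
            if c < best_c:
                best, best_c = x, c
        t *= cooling
    return best, best_c

# Toy usage: minimize (x - 3)^2 over the integers by unit steps.
best, best_cost = simulated_annealing(lambda x: (x - 3) ** 2,
                                      lambda x, r: x + r.choice([-1, 1]),
                                      10)
```

Genetic algorithms attack the same search problem with a population of candidate filters instead of a single cooling trajectory; both are derivative-free, which suits the non-differentiable MAE/MSE-style criteria discussed above.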