KEYWORDS: Feature extraction, Distance measurement, Volume rendering, Visualization, Image retrieval, Visual process modeling, Visual system, Databases
While it is recognized that images are described through the color, texture, and shapes of objects in the scene, general image understanding remains very difficult. Thus, to perform image retrieval in a human-like manner, one has to choose a specific domain, understand how users judge similarity within that domain, and then build a system that duplicates human performance. Since color and texture are fundamental aspects of human perception, we propose a set of techniques for the retrieval of color patterns. To determine how humans judge the similarity of color patterns, we performed a subjective study. Based on the results of the study, the five visual categories most relevant to the perception of pattern similarity were identified. We also determined the hierarchy of rules governing the use of these categories. Based on these results, we designed a system that accepts one or more texture images as input and, depending on the query, produces a set of choices that follow human behavior in pattern matching. The processing steps in our model follow those of the human visual system, resulting in perceptually based features and distance measures. As expected, search results correlate closely with human choices.
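The abstract above describes perceptually weighted matching of color patterns without giving the actual features. As a minimal illustrative sketch (the category names, feature values, and weights below are hypothetical, standing in for the perceptual categories and the hierarchy of rules found in the study), a weighted distance over per-category feature vectors might look like:

```python
import numpy as np

def pattern_distance(feat_a, feat_b, weights):
    """Weighted L1 distance between perceptual feature vectors.

    feat_a, feat_b: dicts mapping a perceptual category (hypothetical names
    here, e.g. 'dominant_color', 'directionality') to a 1-D feature array.
    weights: relative importance of each category, a crude stand-in for the
    hierarchy of rules governing how the categories are used.
    """
    d = 0.0
    for cat, w in weights.items():
        d += w * np.abs(feat_a[cat] - feat_b[cat]).sum()
    return d

# Toy feature vectors for two patterns (invented values for illustration).
a = {"dominant_color": np.array([0.7, 0.2, 0.1]), "directionality": np.array([0.9])}
b = {"dominant_color": np.array([0.6, 0.3, 0.1]), "directionality": np.array([0.1])}
weights = {"dominant_color": 2.0, "directionality": 1.0}
print(pattern_distance(a, b, weights))  # → 1.2
```

A query image would be ranked against the database by this distance, with the smallest values returned first; the higher-weighted categories dominate the ranking, mimicking the rule hierarchy.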
Multimedia information systems are experiencing tremendous growth as a direct consequence of the popularity and pervasive use of the World Wide Web. As a result, it is becoming increasingly important to provide efficient and flexible solutions for accessing and retrieving multimedia data. Images and video are emerging as significant data types in multimedia systems. And yet, most commercial systems are still text- and keyword-based and do not fully exploit the image content they hold. We believe that there is an opportunity to build a novel interactive multimedia system for specific applications in electronic commerce. In this paper, we present an overview of our approach, the rationale behind it, and the problems inherent in building such a system. We address some of the technical issues in representing and analyzing primitive image features. These are the building blocks of any such system, and they generalize to a much broader range of applications as well.
Over the past several years there have been many attempts to incorporate perceptual masking models into image compression systems. Unfortunately, there is little or no information on how these models perform in comparison to each other. The purpose of this paper is to examine how two different perceptual models perform when utilized in the same coding system. The models investigated are the Johnston-Safranek model and the Watson model; both derive a contrast-masking threshold for DCT-based coders. The coder used for comparison is the Baseline Sequential mode of JPEG. Each model was implemented and used to generate image-dependent masking thresholds for each 8x8 pixel block in the image. These thresholds were used to zero out perceptually irrelevant coefficients, while the remaining coefficients were quantized using a perceptually optimal quantization matrix. Both objective and subjective performance data were gathered. Bit-rate savings versus standard JPEG were computed, and a subjective comparison of images encoded with both models and with nonperceptual JPEG was run. The perceptually based coders gave greater compression with no loss in subjective image quality.
The international JPEG (Joint Photographic Experts Group) standard for image compression deals with the compression of still images. It specifies the information contained in the compressed bit stream, and a decoder architecture that can reconstruct an image from the data in the bit stream. However, the exact implementation of the encoder is not standardized; the only requirement on the encoder is that it generate a compliant bit stream. This provides an opportunity to introduce new research results. The challenge in improving these standards-based codecs is to generate a compliant bit stream that produces an image perceptually equivalent to that of the baseline system, but at a higher compression ratio. This results in a lower encoded bit rate with no perceptual loss in quality. The proposed encoder uses the perceptual model developed by Johnston and Safranek to determine, based on the input data, which coefficients are perceptually irrelevant. This information is used to remove (zero out) some coefficients before they are input to the quantizer block. This results in a larger percentage of zero codewords at the output of the quantizer, which reduces the entropy of the resulting codewords.
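The zero-out step described above can be sketched in a few lines. This is not the Johnston-Safranek model itself (deriving the actual thresholds from the image data is the substance of that model); the threshold and quantization values below are invented, and only the mechanism of discarding sub-threshold coefficients before quantization is illustrated:

```python
import numpy as np

def perceptual_zero_and_quantize(dct_block, threshold_block, q_matrix):
    """Zero DCT coefficients the masking model deems irrelevant, then quantize.

    dct_block: 8x8 DCT coefficients of one image block.
    threshold_block: 8x8 image-dependent masking thresholds (hypothetical
    values here; a real perceptual model derives them from the input data).
    q_matrix: 8x8 quantization matrix.
    """
    kept = np.where(np.abs(dct_block) < threshold_block, 0.0, dct_block)
    return np.round(kept / q_matrix).astype(int)

rng = np.random.default_rng(0)
dct = rng.normal(0, 20, (8, 8))   # stand-in DCT coefficients
thr = np.full((8, 8), 15.0)       # hypothetical uniform masking threshold
q = np.full((8, 8), 16.0)         # hypothetical flat quantization matrix
codes = perceptual_zero_and_quantize(dct, thr, q)
plain = np.round(dct / q).astype(int)
print((codes == 0).sum(), (plain == 0).sum())
```

Because every coefficient the perceptual path zeroes would otherwise survive or round away on its own, the perceptual output always has at least as many zero codewords as plain quantization, which is exactly what lowers the entropy at the quantizer output.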
KEYWORDS: Image compression, Distortion, Quantization, Visual process modeling, Digital signal processing, Signal processing, Video, Image quality, Visual system, Visualization
The problem of image compression is to achieve a low bit rate in the digital representation of an input image or video signal with minimum perceived loss of picture quality. Since the ultimate criterion of quality is that judged or measured by the human receiver, it is important that the compression (or coding) algorithm minimizes perceptually meaningful measures of signal distortion, rather than more traditional and tractable criteria such as the mean squared difference between the waveform at the input and output of the coding system. This paper develops the notion of perceptual coding based on the concept of distortion-masking by the signal being compressed, and describes how the field has progressed as a result of advances in classical coding theory, modelling of human vision, and digital signal processing. We propose that fundamental limits in the science can be expressed by the semi-quantitative concepts of perceptual entropy and the perceptual distortion-rate function, and we examine current compression technology with respect to that framework. We conclude with a summary of future challenges and research directions.
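The contrast between mean squared error and a perceptually meaningful measure can be made concrete with a toy example. The visibility weighting below is a deliberately crude, hypothetical stand-in for distortion-masking: the same error energy counts less where the signal masks it (e.g. in busy texture) than where it is plainly visible (e.g. in a flat region):

```python
import numpy as np

def mse(x, y):
    """Traditional, tractable criterion: mean squared difference."""
    return np.mean((x - y) ** 2)

def weighted_perceptual_distortion(x, y, visibility):
    """Squared error weighted by per-sample visibility (1 = fully visible,
    0 = fully masked). A crude illustration of distortion-masking, not a
    calibrated vision model."""
    return np.mean(visibility * (x - y) ** 2)

orig = np.zeros(4)
noisy = orig + 1.0                        # uniform unit error everywhere
vis_flat = np.ones(4)                     # flat region: error fully visible
vis_busy = np.full(4, 0.1)                # textured region: error mostly masked
print(mse(orig, noisy))                                       # → 1.0
print(weighted_perceptual_distortion(orig, noisy, vis_flat))  # → 1.0
print(weighted_perceptual_distortion(orig, noisy, vis_busy))  # → 0.1
```

Both signals have identical MSE, yet the perceptual measure ranks the masked case an order of magnitude less objectionable; a coder minimizing the weighted measure would spend its bits where errors are visible.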
KEYWORDS: Image quality, Image filtering, Image compression, Quantization, Visual process modeling, Human vision and color perception, Mirrors, Electronic imaging, RGB color model, Computer programming
In this paper we present a sub-band coder for true color images that uses an empirically derived perceptual masking model to set the allowable quantization noise level not only for each sub-band but also for each pixel in a given sub-band. The input image is converted into YIQ space and each channel is passed through a separable Generalized Quadrature Mirror Filterbank (GQMF). This separates the image's frequency content into 4 equal-width bands in both the horizontal and vertical dimensions, resulting in a representation consisting of 16 sub-bands for each channel. Using this representation, a perceptual masking model is derived for each channel. The model incorporates spatial-frequency sensitivity, contrast sensitivity, and texture masking. Based on the image-dependent information in each sub-band and the perceptual masking model, noise-level targets are computed for each point in a sub-band. These noise-level targets are used to set the quantization levels in a DPCM quantizer. The output from the DPCM quantizer is then encoded, using an entropy-based coding scheme, in either 1x1, 1x2, or 2x2 pixel parts, based on the statistics in each 4x4 sub-block of a particular sub-band. One set of codebooks, consisting of 100,000 entries, is used for all images. A block elimination algorithm takes advantage of the peaky spatial energy distribution of sub-bands to avoid spending bits on quiescent parts of a given sub-band. The resultant bit rate depends on the complexity of the input image. For the images we used, high-quality output requires bit rates from 0.25 to 1.25 bits/pixel, while nearly transparent quality requires 0.5 to 2.5 bits/pixel.
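The coupling between the noise-level targets and the DPCM quantizer can be sketched as follows. This is a simplified one-dimensional illustration, not the coder from the paper: the noise targets are invented, the predictor is a trivial previous-sample predictor, and the only point shown is that choosing a uniform quantizer step of twice the target keeps each sample's reconstruction error within that target:

```python
import numpy as np

def dpcm_encode(samples, noise_targets):
    """DPCM with a per-sample uniform quantizer whose step size is set from a
    noise-level target (step = 2 * target, so error <= step/2 = target).
    In the actual coder the targets come from the perceptual masking model
    for each point in a sub-band; here they are hypothetical values.
    """
    codes, recon = [], []
    prev = 0.0
    for x, t in zip(samples, noise_targets):
        step = 2.0 * t
        c = int(round((x - prev) / step))  # quantize the prediction error
        prev = prev + c * step             # track the decoder's reconstruction
        codes.append(c)
        recon.append(prev)
    return codes, np.array(recon)

samples = np.array([0.0, 3.1, 5.0, 4.2])
targets = np.array([0.5, 0.5, 1.0, 1.0])   # larger target = coarser quantizer
codes, recon = dpcm_encode(samples, targets)
print(codes)                               # → [0, 3, 1, 0]
print(np.max(np.abs(recon - samples)))     # stays within the per-sample targets
```

Where the masking model tolerates more noise, the step grows, the codes cluster around zero, and the entropy coder spends fewer bits, which is how per-point targets translate into the image-dependent bit rates quoted above.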