The MPEG-21 standard defines a framework for the interoperable delivery and consumption of multimedia content.
Within this framework the adaptation of content plays a vital role in order to support a variety of terminals and to
overcome the limitations of the heterogeneous access networks. In most cases the multimedia content can be adapted by
applying different adaptation operations that result in certain characteristics of the content. Therefore, an instance within
the framework has to decide which adaptation operations have to be performed to achieve a satisfactory result. This
process is known as adaptation decision-taking and makes extensive use of metadata describing the possible adaptation
operations, the usage environment of the consumer, and constraints concerning the adaptation. Based on this metadata a
mathematical optimization problem can be formulated and its solution yields the optimal parameters for the adaptation
operations. However, the metadata is represented in XML, resulting in a verbose and inefficient encoding. In this paper,
an architecture for an Adaptation Decision-Taking Engine (ADTE) is introduced. The ADTE operates both on XML
metadata and on metadata encoded with MPEG's Binary Format for Metadata (BiM), enabling efficient metadata
processing by separating the problem extraction from the actual optimization step. Furthermore, several optimization
algorithms which are suitable for scalable multimedia formats are reviewed and extended where appropriate.
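The decision-taking step described above can be illustrated with a minimal sketch, assuming the problem extraction has already produced a small discrete set of candidate parameter combinations. The names `Candidate` and `decide`, the fields, the utility values, and the limits are hypothetical and are not taken from the paper or from MPEG-21; the sketch simply selects the feasible candidate with the highest utility by exhaustive search.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One combination of adaptation parameters (hypothetical fields)."""
    width: int          # horizontal resolution in pixels
    frame_rate: float   # frames per second
    bitrate_kbps: int   # bitrate resulting from these parameters
    utility: float      # quality score, higher is better


def decide(candidates: Iterable[Candidate],
           max_bitrate_kbps: int,
           max_width: int) -> Optional[Candidate]:
    """Return the feasible candidate with the highest utility, or None."""
    feasible = [c for c in candidates
                if c.bitrate_kbps <= max_bitrate_kbps and c.width <= max_width]
    return max(feasible, key=lambda c: c.utility, default=None)


# Illustrative values only; real values would come from the metadata.
candidates = [
    Candidate(1280, 30.0, 2000, 0.95),
    Candidate(640, 30.0, 800, 0.80),
    Candidate(640, 15.0, 500, 0.65),
    Candidate(320, 15.0, 200, 0.40),
]

best = decide(candidates, max_bitrate_kbps=600, max_width=640)
print(best)
```

Because `decide` only sees already-extracted candidates and limits, it does not depend on whether the metadata arrived as XML or in a binary encoding, which mirrors the separation of extraction and optimization that the abstract describes.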
This paper focuses on an approach for real-time extraction of metallic objects from x-ray images taken by modern x-ray machines such as C-arms. Such machines are used for vessel diagnostics, surgical interventions, as well as cardiology, neurology and orthopedic examinations. They are very fast in taking images from different angles. For this reason, manual adjustment of contrast is infeasible and automatic adjustment algorithms have been applied to try to select the optimal radiation dose for contrast adjustment. Problems occur when metallic objects, e.g., a prosthesis or a screw, are in the absorption area of interest. In this case, the automatic adjustment mostly fails because the dark, metallic objects lead the algorithm to overdose the x-ray tube. This outshining effect results in overexposed images and bad contrast. To overcome this limitation, metallic objects have to be detected and removed from the images that are taken as input for the adjustment algorithm. In this paper, we present a real-time solution for extracting metallic objects from x-ray images. We explore the characteristic features of metallic objects in x-ray images and how they differ from bone fragments; these features form the basis for object segmentation and classification. Subsequently, we present our edge-based real-time approach for fast automatic segmentation and classification of metallic objects. Finally, experimental results on the effectiveness and performance of our approach, based on a large number of input image data sets, are presented.
KEYWORDS: Video, Multimedia, Visualization, Control systems, Standards development, Video coding, Spatial resolution, Video compression, Computer programming, Digital filtering
An adaptive multimedia proxy is presented which provides (1) caching, (2) filtering, and (3) media gateway functionalities. The proxy can perform media adaptation on its own, either relying on layered coding or using transcoding in the decompressed domain. A cost model is presented which incorporates user requirements, terminal capabilities, and video variations in one formula. Based on this model, the proxy acts as a general broker of different user requirements and of different video variations. This is a first step towards What You Need is What You Get (WYNIWYG) video services, which deliver videos to users in exactly the quality they need and are willing to pay for. The MPEG-7 and MPEG-21 standards enable this in an interoperable way. A detailed evaluation based on a series of simulation runs is provided.
XML-based metadata is widely adopted across different communities, and plenty of commercial and open source tools for processing and transforming it are available. However, all of these tools have one thing in common: they operate on plain-text encoded metadata, which may become a burden in constrained and streaming environments, i.e., when metadata needs to be processed together with multimedia content on the fly. In this paper we present an efficient approach for transforming metadata encoded with MPEG's Binary Format for Metadata (BiM) without additional en-/decoding overhead, i.e., within the binary domain. To this end, we have developed an event-based push parser for BiM encoded metadata which transforms the metadata by a limited set of processing instructions, based on traditional XML transformation techniques, operating on bit patterns instead of cost-intensive string comparisons.
This paper introduces the principal approach and describes the basic architecture and current implementation of the knowledge-based multimedia adaptation framework we are currently developing. The framework can be used in Universal Multimedia Access scenarios, where multimedia content has to be adapted to specific usage environment parameters (network and client device capabilities, user preferences). Using knowledge-based techniques (state-space planning), the framework automatically computes an adaptation plan, i.e., a sequence of media conversion operations, to transform the multimedia resources to meet the client's requirements or constraints. The system takes as input standards-compliant descriptions of the content (using MPEG-7 metadata) and of the target usage environment (using MPEG-21 Digital Item Adaptation metadata) to derive start and goal states for the planning process, respectively. Furthermore, declarative descriptions of the conversion operations (such as available via software library functions) enable existing adaptation algorithms to be invoked without requiring programming effort. A running example in the paper illustrates the descriptors and techniques employed by the knowledge-based media adaptation system.
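The planning idea in the abstract above can be sketched as a forward search over states, assuming each conversion operation is described declaratively by preconditions and effects. The `Operation` type, the fact strings such as `format:mpeg4`, and the four sample operations are hypothetical illustrations, not the descriptors used by the framework; breadth-first search is used here only because it is the simplest state-space planner.

```python
from collections import deque
from typing import FrozenSet, List, NamedTuple, Optional


class Operation(NamedTuple):
    """Declarative description of a conversion operation (hypothetical)."""
    name: str
    preconditions: FrozenSet[str]  # facts that must hold before applying
    add: FrozenSet[str]            # facts that hold afterwards
    delete: FrozenSet[str]         # facts that no longer hold afterwards


def plan(start: FrozenSet[str], goal: FrozenSet[str],
         operations: List[Operation]) -> Optional[List[str]]:
    """Breadth-first forward search; returns a shortest operation sequence."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:                      # all goal facts are satisfied
            return steps
        for op in operations:
            if op.preconditions <= state:      # operation is applicable
                nxt = (state - op.delete) | op.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [op.name]))
    return None                                # no plan exists


ops = [
    Operation("decode", frozenset({"format:mpeg4"}),
              frozenset({"format:raw"}), frozenset({"format:mpeg4"})),
    Operation("scale_to_qcif", frozenset({"format:raw", "size:cif"}),
              frozenset({"size:qcif"}), frozenset({"size:cif"})),
    Operation("greyscale", frozenset({"format:raw", "color:yes"}),
              frozenset({"color:no"}), frozenset({"color:yes"})),
    Operation("encode", frozenset({"format:raw"}),
              frozenset({"format:mpeg4"}), frozenset({"format:raw"})),
]

# Start state would be derived from the content description,
# goal state from the usage environment description.
start = frozenset({"format:mpeg4", "size:cif", "color:yes"})
goal = frozenset({"format:mpeg4", "size:qcif", "color:no"})
adaptation_plan = plan(start, goal, ops)
print(adaptation_plan)
```

Adding a new conversion tool in this sketch means adding one more `Operation` entry; the search itself stays unchanged, which is the benefit of declarative operation descriptions that the abstract points to.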
Multimedia streaming is becoming more and more popular. Seamless video streaming in heterogeneous networks like the Internet turns out to be almost impossible due to varying network conditions, so streams must be adapted to the current network QoS. Temporal scalability is one of the most reasonable adaptation techniques because it is fast and easy to perform. Today's approaches simply drop frames from a video without spending much effort on finding an intelligent dropping behavior. This usually leads to good adaptation results in terms of bandwidth consumption but also to suboptimal video quality within the given bounds. Our approach, QCTVA, analyzes video streams to achieve the qualitatively best temporal scalability. To this end, we introduce a data structure called the modification lattice, which represents all frame dropping combinations within a sequence of frames. On the basis of the modification lattice, quality estimations on frame sequences can be performed. Moreover, a heuristic for fast and efficient quality computation in a modification lattice is presented. Experimental results illustrate that temporal video adaptation based on QCTVA information leads to better video quality compared to "usual" frame dropping approaches. Furthermore, QCTVA offers frame priority lists for videos. Based on these priorities, numerous adaptation techniques can increase their overall performance when using QCTVA.
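A minimal sketch of a lattice over frame dropping combinations follows, assuming nodes are sets of dropped frame indices and edges lead to nodes with one more dropped frame. The functions `build_lattice`, `longest_gap`, and `best_drop` are hypothetical, the quality proxy (longest run of consecutive dropped frames) is a stand-in for the paper's quality estimation, and coding dependencies between frames are ignored.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def build_lattice(num_frames: int) -> Dict[FrozenSet[int], List[FrozenSet[int]]]:
    """Map every set of dropped frames to its successors (one more frame dropped)."""
    frames = range(num_frames)
    lattice = {}
    for k in range(num_frames + 1):
        for dropped in combinations(frames, k):
            node = frozenset(dropped)
            lattice[node] = [node | {f} for f in frames if f not in node]
    return lattice


def longest_gap(dropped: FrozenSet[int], num_frames: int) -> int:
    """Toy quality proxy: longest run of consecutive dropped frames (lower is better)."""
    longest = run = 0
    for f in range(num_frames):
        run = run + 1 if f in dropped else 0
        longest = max(longest, run)
    return longest


def best_drop(lattice, num_frames: int, num_to_drop: int) -> FrozenSet[int]:
    """Among nodes dropping exactly num_to_drop frames, pick the smallest gap."""
    level = [n for n in lattice if len(n) == num_to_drop]
    # Ties are broken by the sorted frame indices to keep the result deterministic.
    return min(level, key=lambda n: (longest_gap(n, num_frames), sorted(n)))


NUM_FRAMES = 4
lattice = build_lattice(NUM_FRAMES)
choice = best_drop(lattice, NUM_FRAMES, num_to_drop=2)
print(sorted(choice))
```

The full lattice has 2^n nodes for n frames, so exhaustive evaluation as done here is only practical for short frame sequences; this is one reason a heuristic for quality computation is attractive.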
Due to the heterogeneity of the current terminal and network infrastructures, multimedia content needs to be adapted to specific capabilities of these terminals and network devices. Furthermore, user preferences and user environment characteristics must also be taken into consideration. The problem is further complicated by the diversity of multimedia content types and encoding formats. In order to meet this heterogeneity and to be applicable to different coding formats, the adaptation must be performed in a generic and interoperable way. As a response to this problem and in the context of MPEG-21, we present an approach which uses XML to describe the high-level structure of a multimedia resource in a generic way, i.e., how the multimedia content is organized, for instance in layers, frames, or scenes. For this purpose, a schema for XML-based bitstream syntax descriptions (generic Bitstream Syntax Descriptions or gBSDs) has been developed. A gBSD can describe the high-level structure of a multimedia resource in a coding format independent way. Adaptation of the resource is based on elementary transformation instructions formulated with respect to the gBSDs. These instructions have been separated from the gBSDs in order to use the same descriptions for different adaptations, e.g., temporal scaling, SNR scaling, or semantic adaptations. In the MPEG-21 framework, those adaptations can be steered, for instance, by the network characteristics and the user preferences. As a result, it becomes possible for coding format agnostic adaptation engines to transform media bitstreams and associated descriptions to meet the requirements imposed by the network conditions, device capabilities, and user preferences.
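The description-driven adaptation outlined above can be sketched as follows, assuming a heavily simplified description in which each unit names a byte range and a marker. The element and attribute names (`description`, `unit`, `start`, `length`, `marker`), the marker values, and the toy bitstream are hypothetical and do not reproduce the actual gBSD schema; the transformation instruction is reduced to a set of markers to drop.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical description: each unit covers a byte range of the
# bitstream and carries a marker naming its temporal layer.
DESCRIPTION = """
<description>
  <unit start="0" length="4" marker="header"/>
  <unit start="4" length="3" marker="T0"/>
  <unit start="7" length="2" marker="T1"/>
  <unit start="9" length="3" marker="T0"/>
  <unit start="12" length="2" marker="T1"/>
</description>
"""

# Toy bitstream whose layout matches the description above.
BITSTREAM = b"HDR0" + b"III" + b"bb" + b"PPP" + b"bb"


def adapt(description_xml: str, bitstream: bytes, drop_markers: set):
    """Drop all units whose marker is in drop_markers.

    Returns the adapted bitstream and an updated description whose start
    offsets match the adapted bitstream.
    """
    root = ET.fromstring(description_xml)
    out = bytearray()
    for unit in list(root):                    # copy: we remove while iterating
        start, length = int(unit.get("start")), int(unit.get("length"))
        if unit.get("marker") in drop_markers:
            root.remove(unit)                  # transform the description
            continue
        unit.set("start", str(len(out)))       # new offset in the adapted stream
        out += bitstream[start:start + length]
    return bytes(out), root


adapted, new_description = adapt(DESCRIPTION, BITSTREAM, {"T1"})
print(adapted)
print(ET.tostring(new_description, encoding="unicode"))
```

Note that `adapt` never parses the media payload itself; it only copies or skips byte ranges named in the description, which is what makes such an engine independent of the coding format. Passing a different marker set reuses the same description for a different adaptation.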