Video Compression - Video Indexing & Segmentation
English-language materials |
Authors | Paper title | Description | Rating |
Edouard François and Bertrand Chupeau | Depth-Based Segmentation |
Abstract: The tool presented in this paper performs an automatic segmentation of stereoscopic image sequences, based on the modeling of distance maps obtained by image processing. Two three-dimensional (3-D) image analysis algorithms are combined: i) estimation of dense depth maps from stereoscopic image sequences and ii) depth-based object segmentation. A Markovian statistical approach for such a segmentation of a dense depth map into arbitrarily shaped and oriented planar surfaces is described in detail. The simulation results on sequences “Fun fair” and “Tunnel,” provided on video tape to the MPEG-4 tests of November 1995, are discussed. RAR, 89 KB |
|
Demin Wang | Unsupervised Video Segmentation Based on Watersheds and Temporal Tracking |
Abstract: This paper presents a technique for unsupervised video segmentation. This technique consists of two phases: initial segmentation and temporal tracking, similar to a number of existing techniques. However, new algorithms for spatial segmentation, marker extraction, and modified watershed transformation are proposed for the present technique. The new algorithms make this technique differ from existing techniques by the following features: 1) it can effectively track fast moving objects, 2) it can detect the appearance of new objects as well as the disappearance of existing objects, and 3) it is computationally efficient because of the use of watershed transformations and a fast motion estimation algorithm. Simulation results demonstrate that the proposed technique can efficiently segment video sequences with fast moving, newly appearing, or disappearing objects in the scene. RAR, 1535 KB |
|
A. Aydın Alatan, Levent Onural, Michael Wollborn, Roland Mech, Ertem Tuncel, and Thomas Sikora | Image Sequence Analysis for Emerging Interactive Multimedia Services—The European COST 211 Framework |
Abstract: Flexibility and efficiency of coding, content extraction, and content-based search are key research topics in the field of interactive multimedia. Ongoing ISO MPEG-4 and MPEG-7 activities are targeting standardization to facilitate such services. European COST Telecommunications activities provide a framework for research collaboration. COST 211bis and COST 211ter activities have been instrumental in the definition and development of the ITU-T H.261 and H.263 standards for videoconferencing over ISDN and videophony over regular phone lines, respectively. The group has also contributed significantly to the ISO MPEG-4 activities. At present a significant effort of the COST 211ter group activities is dedicated toward image and video sequence analysis and segmentation, an important technological aspect for the success of emerging object-based MPEG-4 and MPEG-7 multimedia applications. The current work of COST 211 is centered around the test model, called the Analysis Model (AM). The essential feature of the AM is its ability to fuse information from different sources to achieve a high-quality object segmentation. The current information sources are the intermediate results from frame-based (still) color segmentation, motion-vector-based segmentation, and change-detection-based segmentation. Motion vectors, which form the basis for the motion-vector-based intermediate segmentation, are estimated from consecutive frames. A recursive shortest spanning tree (RSST) algorithm is used to obtain intermediate color and motion-vector-based segmentation results. A rule-based region processor fuses the intermediate results; a postprocessor further refines the final segmentation output. The results of the current AM are satisfactory; it is expected that there will be further improvements of the AM within the COST 211 project. RAR, 495 KB |
|
Takashi Ida and Yoko Sambonsugi | Image Segmentation and Contour Detection Using Fractal Coding |
Abstract: Fractal coding was applied to image segmentation and contour detection. The encoding method was the same as in conventional fractal coding, and the compressed code, which we call fractal code, was used for image segmentation and contour detection instead of image reconstruction. An image can be segmented by calculating the basin of attraction on a mapping that is a set of local maps from the domain block to the range block. The local maps are parameterized using the fractal code, and contours of the objects in the image are detected by the inverse mapping from the range block to the domain block. Some objects in the test image Lena were segmented, and the contours were detected well. The proposed methods are expected to enable compressed codes to be used directly for image processing. RAR, 796 KB |
|
Shih-Fu Chang, William Chen, Horace J. Meng, Hari Sundaram, and Di Zhong | A Fully Automated Content-Based Video Search Engine Supporting Spatiotemporal Queries |
Abstract: The rapidity with which digital information, particularly video, is being generated has necessitated the development of tools for efficient search of these media. Content-based visual queries have been primarily focused on still image retrieval. In this paper, we propose a novel, interactive system on the Web, based on the visual paradigm, with spatiotemporal attributes playing a key role in video retrieval. We have developed innovative algorithms for automated video object segmentation and tracking, and use real-time video editing techniques while responding to user queries. The resulting system, called VideoQ (demo available at http://www.ctr.columbia.edu/VideoQ/), is the first on-line video search engine supporting automatic object-based indexing and spatiotemporal queries. The system performs well, with the user being able to retrieve complex video clips such as those of skiers and baseball players with ease. RAR, 475 KB |
|
Chee Sun Won | A Block-Based MAP Segmentation for Image Compressions |
Abstract: In this paper, a novel block-based image segmentation algorithm using the maximum a posteriori (MAP) criterion is proposed. The conditional probability in the MAP criterion, which is formulated by the Bayesian framework, is in charge of classifying image blocks into edge, monotone, and textured blocks. On the other hand, the a priori probability is responsible for edge connectivity and homogeneous region continuity. After a few iterations to achieve a deterministic MAP optimization, we can obtain a block-based segmented image in terms of edge, monotone, or textured blocks. Then, using a connected block-labeling algorithm, we can assign a number to all connected homogeneous blocks to define an interior of a region. Finally, uncertainty blocks, which are not given any region number yet, are assigned to one of the neighboring homogeneous regions by a block-based region-growing method. During this process, we can also check the balance between the accuracy and the cost of the contour coding by adjusting the size of the uncertainty blocks. Experimental results show that the proposed algorithm yields larger homogeneous regions which are suitable for object-based image compression. RAR, 284 KB |
|
Ioannis Kompatsiaris, Dimitrios Tzovaras, and Michael G. Strintzis | 3-D Model-Based Segmentation of Videoconference Image Sequences |
Abstract: This paper describes a three-dimensional (3-D) model-based unsupervised procedure for the segmentation of multiview image sequences using multiple sources of information. The 3-D model is initialized by accurate adaptation of a two-dimensional wireframe model to the foreground object of one of the views. The articulation procedure is based on the homogeneity of parameters, such as rigid 3-D motion, color, and depth, estimated for each subobject, which consists of a number of interconnected triangles of the 3-D model. The rigid 3-D motion of each subobject for subsequent frames is estimated using a Kalman filtering algorithm, taking into account the temporal correlation between consecutive frames. Information from all cameras is combined during the formation of the equations for the rigid 3-D motion parameters. The threshold used in the object segmentation procedure is updated at each iteration using the histogram of the subobject parameters. The parameter estimation for each subobject and the 3-D model segmentation procedures are interleaved and repeated iteratively until a satisfactory object segmentation emerges. The performance of the resulting segmentation method is evaluated experimentally. RAR, 555 KB |
|
Jae Gark Choi, Si-Woong Lee, and Seong-Dae Kim | Spatio-Temporal Video Segmentation Using a Joint Similarity Measure |
Abstract: This paper presents a new morphological spatiotemporal segmentation algorithm. The algorithm incorporates luminance and motion information simultaneously and uses morphological tools such as morphological filters and the watershed algorithm. The procedure toward complete segmentation consists of three steps: joint marker extraction, boundary decision, and motion-based region fusion. First, the joint marker extraction identifies the presence of homogeneous regions in both motion and luminance, where a simple joint marker extraction technique is proposed. Second, the spatio-temporal boundaries are decided by the watershed algorithm. For this purpose, a new joint similarity measure is proposed. Finally, an elimination of redundant regions is done using motion-based region fusion. By incorporating spatial and temporal information simultaneously, we can obtain visually meaningful segmentation results. Simulation results demonstrate the efficiency of the proposed method. RAR, 356 KB |
|
Joo-Hee Moon, Gwang-Hoon Park, Sung-Moon Chun, and Seok-Rim Choi | Shape-Adaptive Region Partitioning Method for Shape-Assisted Block-Based Texture Coding |
Abstract: In the content-based image coding scheme, segmentation information of the arbitrarily shaped regions may be available for both encoder and decoder. The shape-assisted block-based texture coding methodologies, such as shape-adaptive discrete cosine transform (SADCT), can use this segmentation information to improve coding efficiency. In this paper, we introduce the shape-adaptive region partitioning (SARP) methods which can reduce the number of coding blocks that partition the arbitrarily shaped region by modifying the block positions. By simply adding the SARP method to aid the SADCT, the coded texture bits can be reduced by 5–10%, in comparison with the SADCT using the common block-based coding infrastructure which is usually used in MPEG-1/2 and H.263. RAR, 207 KB |
|
Ullas Gargi, Rangachar Kasturi, and Susan H. Strayer | Performance Characterization of Video-Shot-Change Detection Methods |
Abstract: A number of automated shot-change detection methods for indexing a video sequence to facilitate browsing and retrieval have been proposed in recent years. Many of these methods use color histograms or features computed from block motion or compression parameters to compute frame differences. It is important to evaluate and characterize their performance so as to deliver a single set of algorithms that may be used by other researchers for indexing video databases. We present the results of a performance evaluation and characterization of a number of shot-change detection methods that use color histograms, block motion matching, or MPEG compressed data. RAR, 285 KB |
|
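The histogram-based frame differencing mentioned in the Gargi, Kasturi, and Strayer abstract above can be illustrated with a minimal sketch. It is not one of the methods evaluated in that paper: it uses a single gray-level histogram instead of color histograms, and the function names, the bin count, and the threshold of 0.5 are all assumptions chosen for illustration.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # 2-D grid of 8-bit gray levels (assumed input format)


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized histogram of the pixel values in one frame."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for value in row:
            # Map 0..255 onto 0..bins-1.
            counts[min(value * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def histogram_difference(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_shot_changes(frames: Sequence[Frame],
                        threshold: float = 0.5,
                        bins: int = 16) -> List[int]:
    """Return indices i where frame i differs strongly from frame i-1."""
    hists = [gray_histogram(f, bins) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_difference(hists[i - 1], hists[i]) > threshold]


# Three dark frames followed by two bright ones: one cut, at index 3.
dark = [[10] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
cuts = detect_shot_changes([dark, dark, dark, bright, bright])
```

A fixed global threshold like this reacts to abrupt cuts only; gradual transitions would need a different decision rule.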
Qian Huang, Atul Puri, and Zhu Liu | Multimedia Search and Retrieval: New Concepts, System Implementation, and Application |
Abstract: We first present new concepts applicable to the design of multimedia search and retrieval schemes in general, and to MPEG-7 in particular, the multimedia description standard in progress. Raw multimedia data is assumed to exist in the form of programs that typically consist of a combination of media types such as visual, audio, and text. We partition each such media stream into smaller units based on actual physical events. These physical events within each media stream can then be effectively indexed for retrieval. The concept of logical events is introduced next; we define logical events as those that can provide different “views” of the content as may be desired by a user. Such events usually result from either the correlation of events that cross different media types, or by merging recursively chosen events from a lower level within each media type. We then address the related issue of how to develop a practical multimedia information retrieval system that exploits the aforementioned concepts of physical and logical events as well as other aspects such as storage, representation and indexing to enable efficient search, retrieval, and browsing. Finally, we implement the proposed concepts and solutions within a multimedia system that addresses a real application, effective browsing of broadcast news, and evaluate its performance. RAR, 1127 KB |
|
Jungwoo Lee and Bradley W. Dickinson | Hierarchical Video Indexing and Retrieval for Subband-Coded Video |
Abstract: In this paper, we present a multiresolution approach for video indexing and feature matching of subband-coded video databases. Four different scene-change detectors were tested; scene-change detection is applied only on the lowest subband for computational efficiency. Two kinds of scene changes, abrupt and smoothly accumulated, mark the beginning of new scene segments. The index for each scene segment is the pair of the histograms of two representative frames, the first and the last frame of the scene. Using the approach of query by example, the index-matching algorithm takes a multiresolution approach by hierarchically comparing histograms at different resolutions. The search algorithm for the match between example query and its target scene segment starts from the coarsest resolution and moves to the next finer resolution until the finest resolution is reached. Experimental results are presented, and the proposed indexing technique appears to be promising for its speed and its inherent hierarchical search procedure. RAR, 2215 KB |
|
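The coarse-to-fine histogram search described in the Lee and Dickinson abstract above can be sketched roughly as follows. This is a simplified stand-in, not the paper's method: "resolution" is approximated here by the number of histogram bins rather than by subband image resolution, each segment is indexed by one histogram instead of a first-frame/last-frame pair, and all names and the threshold value are hypothetical.

```python
from typing import Dict, List, Sequence


def coarsen(hist: Sequence[float]) -> List[float]:
    """Halve the histogram resolution by summing adjacent pairs of bins."""
    return [hist[i] + hist[i + 1] for i in range(0, len(hist), 2)]


def pyramid(hist: Sequence[float], levels: int) -> List[List[float]]:
    """Return histograms ordered from coarsest to finest.

    The histogram length must be divisible by 2 ** (levels - 1).
    """
    out = [list(hist)]
    for _ in range(levels - 1):
        out.append(coarsen(out[-1]))
    return out[::-1]


def l1(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def coarse_to_fine_search(query: Sequence[float],
                          database: Dict[str, Sequence[float]],
                          levels: int = 3,
                          threshold: float = 0.4) -> List[str]:
    """Return names of entries whose finest-level distance is <= threshold.

    Merging bins can only shrink the L1 distance, so an entry rejected at
    a coarse level could not have passed at a finer one.
    """
    q_pyr = pyramid(query, levels)
    candidates = {name: pyramid(h, levels) for name, h in database.items()}
    for level in range(levels):  # coarsest first
        candidates = {name: pyr for name, pyr in candidates.items()
                      if l1(q_pyr[level], pyr[level]) <= threshold}
    return sorted(candidates)


query = [0.5, 0.5, 0, 0, 0, 0, 0, 0]
database = {
    "a": [0.4, 0.6, 0, 0, 0, 0, 0, 0],  # close to the query at every level
    "b": [0, 0, 0, 0, 0, 0, 0.5, 0.5],  # rejected at the coarsest level
    "c": [0, 0, 0.5, 0.5, 0, 0, 0, 0],  # passes 2 bins, rejected at 4 bins
}
matches = coarse_to_fine_search(query, database)
```

The speed benefit comes from discarding most candidates while the histograms are still short, so the full-resolution comparison runs only on the survivors.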
Niels Haering, Richard J. Qian, and M. Ibrahim Sezan | A Semantic Event-Detection Approach and Its Application to Detecting Hunts in Wildlife Video |
Abstract: We propose a three-level video-event detection methodology and apply it to animal-hunt detection in wildlife documentaries. The first level extracts color, texture, and motion features, and detects shot boundaries and moving object blobs. The mid-level employs a neural network to determine the object class of the moving object blobs. This level also generates shot descriptors that combine features from the first level and inferences from the mid-level. The shot descriptors are then used by the domain-specific inference process at the third level to detect video segments that match the user-defined event model. The proposed approach has been applied to the detection of hunts in wildlife documentaries. Our method can be applied to different events by adapting the classifier at the intermediate level and by specifying a new event model at the highest level. Event-based video indexing, summarization, and browsing are among the applications of the proposed approach. RAR, 610 KB |
|
Thomas Meier and King N. Ngan | Video Segmentation for Content-Based Coding |
Abstract: The extensive use of discrete transforms in image and video coding suggests the investigation of filtering before downsampling (FBDS) and filtering after upsampling (FAUS) methods directly acting on the transform domain. In this paper, we describe the “transform-domain resolution translation” technique that gives flexibility to resize windows of each video conferencing session for server compositing without explicit decompression, spatial domain processing, and compression. We generalize transform-domain filtering (TDF) to include nonuniform and multirate cases to implement the transform-domain resolution translator. The former is defined as a TDF problem in which the original transform domain is of different size from the target one, while the latter considers the implementation of sampling rate conversion in the transform domain. The implementation architecture is based on a pipeline that involves matrix–vector product blocks and vector addition, but is not limited to particular hardware. Such techniques are particularly useful for fast algorithms for processing compressed images and video where transform coding is extensively used (e.g., in JPEG, H.261, MPEG-1, MPEG-2, and H.263). RAR, 697 KB |
|
P. Salembier and F. Marqués | Region-Based Representations of Image and Video: Segmentation Tools for Multimedia Services |
Abstract: This paper discusses region-based representations of image and video that are useful for multimedia services such as those supported by the MPEG-4 and MPEG-7 standards. Classical tools related to the generation of the region-based representations are discussed. After a description of the main processing steps and the corresponding choices in terms of feature spaces, decision spaces, and decision algorithms, the state of the art in segmentation is reviewed. Mainly tools useful in the context of the MPEG-4 and MPEG-7 standards are discussed. The review is structured around the strategies used by the algorithms (transition based or homogeneity based) and the decision spaces (spatial, spatio-temporal, and temporal). The second part of this paper proposes a partition tree representation of images and introduces a processing strategy that involves a similarity estimation step followed by a partition creation step. This strategy tries to find a compromise between what can be done in a systematic and universal way and what has to be application dependent. It is shown in particular how a single partition tree created with an extremely simple similarity feature can support a large number of segmentation applications: spatial segmentation, motion estimation, region-based coding, semantic object extraction, and region-based retrieval. RAR, 1016 KB |
|
Gene K. Wu and Todd R. Reed | Image Sequence Processing Using Spatiotemporal Segmentation |
Abstract: We investigate the improvements that can be obtained in several conventional video-processing algorithms through the incorporation of three-dimensional (3-D) (spatiotemporal) segmentation information. Four classes of image sequence processing techniques are considered: low-pass filtering, high-pass filtering, high-frequency emphasis, and 3-D Sobel filtering. It is demonstrated that segmentation information can improve the performance of these techniques substantially so that this approach may be promising for other applications (e.g., deinterlacing and resolution conversion) as well. RAR, 1300 KB |
|
Peter van Beek, A. Murat Tekalp, Ning Zhuang, Işıl Celasun, and Minghui Xia | Hierarchical 2-D Mesh Representation, Tracking, and Compression for Object-Based Video |
Abstract: This paper proposes methods for designing, tracking and coding hierarchical two-dimensional (2-D) content-based mesh representations. The design procedure consists of constructing a fine-to-coarse hierarchy of Delaunay meshes, using image- and shape-based criteria for mesh geometry simplification. Hierarchical tracking employs a coarse-to-fine strategy with mesh-based motion vector optimization. We introduce new techniques to maintain the initial mesh hierarchy and topology during tracking by imposing certain constraints at each stage of the procedure. The hierarchical compression technique is based on a nearest neighbor ordering of mesh node points. This ordering serves to identify the mesh boundary nodes as well as establish spatial predictors for differential coding of node coordinates and motion vectors. The proposed hierarchical mesh representation, which has applications in object-based video manipulation, indexing, and compression, provides improved tracking performance (compared to a nonhierarchical representation) and allows progressive (scalable) transmission of the object geometry (including shape) and motion information, as well as variable level-of-detail rendering. Experimental results are presented to compare the tracking and compression performance of hierarchical versus nonhierarchical mesh representations and to demonstrate the tradeoff between image quality and mesh bit rate for 2-D mesh-based video object rendering. RAR, 686 KB |
|
Douglas Chai and King N. Ngan | Face Segmentation Using Skin-Color Map in Videophone Applications |
Abstract: This paper addresses our proposed method to automatically segment out a person’s face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, reliable, and effective algorithm that exploits the spatial distribution characteristics of human skin color. A universal skin-color map is derived and used on the chrominance component of the input image to detect pixels with skin-color appearance. Then, based on the spatial distribution of the detected skin-color pixels and their corresponding luminance values, the algorithm employs a set of novel regularization processes to reinforce regions of skin-color pixels that are more likely to belong to the facial regions and eliminate those that are not. The performance of the face-segmentation algorithm is illustrated by some simulation results carried out on various head-and-shoulders test images. The use of face segmentation for video coding in applications such as videotelephony is then presented. We explain how the face-segmentation results can be used to improve the perceptual quality of a videophone sequence encoded by the H.261-compliant coder. RAR, 707 KB |
|
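The first stage described in the Chai and Ngan abstract above, applying a skin-color map to the chrominance components, might look roughly like the sketch below. The Cb and Cr ranges are illustrative assumptions and should not be read as the map derived in the paper; the RGB-to-chrominance conversion is the common full-range BT.601 (JPEG-style) formula, and the later regularization stages from the abstract are not covered.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0..255

# Assumed chrominance ranges for skin tones (illustrative values only).
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)


def rgb_to_cbcr(r: int, g: int, b: int) -> Tuple[float, float]:
    """Convert RGB to the Cb and Cr chrominance components (full-range BT.601)."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def is_skin(r: int, g: int, b: int) -> bool:
    """True if the pixel's chrominance falls inside both assumed ranges."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


def skin_mask(image: Sequence[Sequence[Pixel]]) -> List[List[bool]]:
    """Per-pixel skin map; luminance is ignored at this stage."""
    return [[is_skin(*px) for px in row] for row in image]


image = [
    [(220, 170, 140), (0, 0, 255)],   # skin-like tone, pure blue
    [(0, 255, 0), (225, 175, 150)],   # pure green, skin-like tone
]
mask = skin_mask(image)
```

Because the test uses chrominance only, the mask is largely insensitive to brightness, which is why the abstract brings luminance back in only during the later regularization steps.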
Soo-Chang Pei and Ching-Min Cheng | Extracting Color Features and Dynamic Matching for Image Data-Base Retrieval |
Abstract: Color-based indexing is an important tool in image data-base retrieval. Compared with other features of the image, color features are less sensitive to noise and background complication. Based on the human visual system’s perception of color information, this paper presents a dependent scalar quantization approach to extract the characteristic colors of an image as color features. The characteristic colors are suitably arranged in order to obtain a sequence of feature vectors. Using this sequence of feature vectors, a dynamic matching method is then employed to match the query image with data-base images for a nonstationary identification environment. The empirical results show that the characteristic colors are reliable color features for image data-base retrieval. In addition, the proposed matching method has acceptable accuracy of image retrieval compared with existing methods. RAR, 582 KB |
|
Hyun Sung Chang, Sanghoon Sull, and Sang Uk Lee | Efficient Video Indexing Scheme for Content-Based Retrieval |
Abstract: Extracting a small number of key frames that can abstract the content of video is very important for efficient browsing and retrieval in video databases. In this paper, the key frame extraction problem is considered from a set-theoretic point of view, and systematic algorithms are derived to find a compact set of key frames that can represent a video segment for a given degree of fidelity. The proposed extraction algorithms can be hierarchically applied to obtain a tree-structured key frame hierarchy that is a multilevel abstract of the video. The key frame hierarchy enables an efficient content-based retrieval by using the depth-first search scheme with pruning. Intensive experiments on a variety of video sequences are presented to demonstrate the improved performance of the proposed algorithms over the existing approaches. RAR, 688 KB |
|
Ebroul Izquierdo M. | Disparity/Segmentation Analysis: Matching with an Adaptive Window and Depth-Driven Segmentation |
Abstract: Most of the emerging content-based multimedia technologies are based on efficient methods to solve machine early vision tasks. Among other tasks, object segmentation is perhaps the most important problem in single image processing, whereas pixel-correspondence estimation is the crucial task in multiview image analysis. The solution of these two problems is the key technology for the development of the majority of leading-edge interactive video-communication technologies and telepresence systems. In this paper, we present a robust framework comprised of joined pixel-correspondence estimation and image segmentation in video sequences taken simultaneously from different perspectives. An improved concept for stereo-image analysis based on block matching with a local adaptive window is introduced. The size and shape of the reference window is calculated adaptively according to the degree of reliability of disparities estimated previously. Considerable improvements are obtained just within object borders or image areas that become occluded by applying the proposed block-matching model. An initial object segmentation is obtained by merging neighboring sampling positions with disparity vectors of similar size and direction. Starting from this initial segmentation, true object borders are detected using a contour-matching algorithm. In this process, the contour of the initial segmentation is taken as a reference pattern, and the edges extracted from the original images, by applying a multiscale algorithm, are the candidates for the true object contour. The performance of the introduced methods has been verified by computer simulations using synthetic data and several natural stereo sequences. RAR, 1927 KB |
|
Alan Hanjalic, Reginald L. Lagendijk, and Jan Biemond | Automated High-Level Movie Segmentation for Advanced Video-Retrieval Systems |
Abstract: We present a newly developed strategy for automatically segmenting movies into logical story units. A logical story unit can be understood as an approximation of a movie episode, which is a high-level temporal movie segment, characterized either by a single event (dialog, action scene, etc.) or by several events taking place in parallel. Since we consider a whole event and not a single shot to be the most natural retrieval unit for the movie category of video programs, the proposed segmentation is the crucial first step toward a concise and comprehensive content-based movie representation for browsing and retrieval purposes. The automation aspect is becoming increasingly important with the rising amount of information to be processed in video archives of the future. The segmentation process is designed to work on MPEG-DC sequences, where we have taken into account that at least a partial decoding is required for performing content-based operations on MPEG compressed video streams. The proposed technique allows for carrying out the segmentation procedure in a single pass through a video sequence. RAR, 191 KB |
|
Stephan Herrmann, Hubert Mooshofer, Harald Dietrich, and Walter Stechele | A Video Segmentation Algorithm for Hierarchical Object Representations and Its Implementation |
Abstract: This paper describes a segmentation algorithm for generating hierarchical object representations of images and image sequences. Starting from an object model, we describe the structure of the corresponding segmentation algorithm including all analysis methods applied. Besides the well-known color and motion analysis, we also show how to utilize shape information. Furthermore, we discuss the tradeoff between reducing the computational complexity and the quality of the segmentation results. Last, we present the implementation concept for our analysis model, which uses a special toolbox model. The toolbox provides a set of addressing schemes that are needed by low-level video processing tools. The low-level tools are functions that apply a single operation to all pixels in one frame. Using these addressing functions makes it easy to implement new video processing tools, which, when combined, form new analysis methods. The toolbox exists in C-code and is partially transferred into VHDL. RAR, 653 KB |
|
Roberto Castagno, Touradj Ebrahimi, and Murat Kunt | Video Segmentation Based on Multiple Features for Interactive Multimedia Applications |
Abstract: In this paper, we present a scheme for interactive video segmentation. A key feature of the system is the distinction between two levels of segmentation, namely, regions and object segmentation. Regions are homogeneous areas of the images, which are extracted automatically by the computer. Semantically meaningful objects are obtained through user interaction by grouping of regions according to the specific application. This splitting relieves the computer of ill-posed semantic problems, and allows a higher level of flexibility of the method. The extraction of regions is based on the multidimensional analysis of several image features by a spatially constrained fuzzy C-means algorithm. The local level of reliability of the different features is taken into account in order to adaptively weight the contribution of each feature to the segmentation process. Results on the extraction of regions as well as on the tracking of spatiotemporal objects are presented. RAR, 287 KB |
|
Fabio Dell’Acqua and Paolo Gamba | Simplified Modal Analysis and Search for Reliable Shape Retrieval |
Abstract: In this work, we present the application of a simplified shape analysis technique based on a modal representation of the object shape and useful for improving the efficiency and effectiveness of shape-driven searches in image databases. The proposed method computes the representation of an object by means of modes very similar to the deformation modes of a mechanical system, but in a numerically more stable way than the usual finite-element method approach. Moreover, to make the technique for the visual search more effective, many different definitions of similarity indexes are introduced and discussed. The problems related to the comparison between objects represented by a very different number of feature points are also discussed. Finally, to prove the effectiveness of the approach, the indexes are studied in a simple case study (a small database of character shapes). However, their performance on a larger image database is also addressed, as well as the ability of the method to efficiently assess the problem of retrieving images similar to a user-defined sketch. RAR, 311 KB |
|
Ru-Shang Wang and Yao Wang | Multiview Video Sequence Analysis, Compression, and Virtual Viewpoint Synthesis |
Abstract: This paper considers the problem of structure and motion estimation in multiview teleconferencing-type sequences and its application for video-sequence compression and intermediate-view generation. First, we introduce a new approach for structure estimation from a stereo pair acquired by two parallel cameras. It is based on a 2-D mesh representation of both views of the imaged scene and a parameterization of the structure information by the disparity between corresponding nodes in the image pair. Next, we describe a novel image alignment approach which can convert images captured using nonparallel cameras to coplanar-like images. This approach greatly eases the computational burden incurred by the nonparallel camera geometry, where one must consider both horizontal and vertical disparities. Finally, we present a coder for multiview sequences, which exploits the proposed alignment and structure estimation algorithm. By extracting the foreground objects and estimating the disparity field between a selected view and a reference view, the coder can compress the image pair very efficiently. In the meantime, by using the coded structure information, the decoder can generate virtual viewpoints between decoded views, which can be very helpful for telepresence applications. RAR, 1323 KB |
|
Thomas Meier, King N. Ngan, and Gregory Crebbin | Reduction of Blocking Artifacts in Image and Video Coding |
Abstract: The discrete cosine transform (DCT) is the most popular transform for image and video compression. Many international standards such as JPEG, MPEG, and H.261 are based on a block-DCT scheme. High compression ratios are obtained by discarding information about DCT coefficients that is considered to be less important. The major drawback is visible discontinuities along block boundaries, commonly referred to as blocking artifacts. These often limit the maximum compression ratios that can be achieved. Various postprocessing techniques have been published that reduce these blocking effects, but most of them introduce unnecessary blurring, ringing, or other artifacts. In this paper, a novel postprocessing algorithm based on Markov random fields (MRF’s) is proposed. It efficiently removes blocking effects while retaining the sharpness of the image and without introducing new artifacts. The degraded image is first segmented into regions, and then each region is enhanced separately to prevent blurring of dominant edges. A novel texture detector allows the segmentation of images containing both texture and monotone areas. It finds all texture regions in the image before the remaining monotone areas are segmented by an MRF segmentation algorithm that has a new edge component incorporated to detect dominant edges more reliably. The proposed enhancement stage then finds the maximum a posteriori estimate of the unknown original image, which is modeled by an MRF and is therefore Gibbs distributed. A very efficient implementation is presented. Experiments demonstrate that our proposed postprocessor gives excellent results compared to other approaches, from both a subjective and an objective viewpoint. Furthermore, it will be shown that our technique also works for wavelet encoded images, which typically contain ringing artifacts. RAR, 861 KB |
|
Emmanuel Reusens, Touradj Ebrahimi, Corinne Le Buhan, Roberto Castagno, Vincent Vaerman, Laurent Piron, Carmen de Solà Fàbregas, Sushil Bhattacharjee, Frank Bossen, and Murat Kunt | Dynamic Approach to Visual Data Compression |
Abstract: This paper presents the Swiss Federal Institute of Technology (EPFL) proposal to the MPEG-4 video coding standardization activity [1]. The proposed technique is based on a novel approach to audio-visual data compression entitled dynamic coding. The newly born multimedia environment supports a plethora of applications which cannot be covered adequately by a single compression technique. Dynamic coding offers the opportunity to combine several compression techniques and segmentation strategies. Given a particular application, these two degrees of freedom can be constrained and assembled in order to produce a particular profile which meets the set of specifications dictated by the application. The basic principles of this approach are presented together with the data representation system. The major characteristics of dynamic coding are reviewed, along with simulation results showing the performance of such an approach in a very low bit-rate video coding environment. RAR, 872 KB |
|
See also the materials on:
- Color spaces
- JPEG
- JPEG-2000
Prepared by Sergey Grishin and Dmitry Vatolin