BACKGROUND OF THE SITUATION UNDER STUDY
To gather background information on the situation under study, the design of a distributed video encoder proposal, specialized databases were consulted. The compiled study background is presented below.
Study and simulation of distributed video encoders
The objective of this project was, first, to introduce the different existing architectures and, second, to explore and propose new coding schemes in the area of distributed video coding, introducing advances in coding efficiency and, more specifically, improving the generation of side information for the multi-view (multi-camera) case. To this end, a study of current technology was carried out and the most appropriate option was selected based on its degree of optimization.
Technological trends in video compression
This research ranged from the explanation of the definition of a CODEC, through its characteristics and attributes, to the history of commercial video standards, and then presented the new paradigms of Distributed Video Coding such as PRISM, DISCOVER and Stanford, which are based on the Slepian-Wolf and Wyner-Ziv theorems. Once the current video encoder (H.264) and the new video paradigms were characterized, they were compared in terms of performance and complexity.
Contribution to the transmission of video in IP networks with quality of service
This work addresses the problem of efficiently transmitting variable-rate MPEG-compressed video traffic over IP networks with quality of service. First, a study of the compression algorithms most commonly used today is carried out. Next, a detailed analysis of the different alternatives for video transmission over IP networks is performed. The problem of video transport over packet-switched networks without service guarantees is described first, discussing the need for rate control and traffic smoothing techniques. This document will serve as a guide for the development of the theoretical framework.
Digital video compression is necessary to encode efficiently, whether for storing or transmitting the video signal. The goal is to maintain application-dependent reconstruction quality while minimizing the amount of data (bits) stored or transmitted. In this sense, video compression is the reduction of the data rate needed to encode a sequence of frames. Compression methods can be classified as lossy or lossless. Lossless compression means compressing data without rejecting or altering any information present in it. Lossy compression attempts to remove redundant and irrelevant information from video signals. Depending on the desired compression ratio, the amount of information discarded increases or decreases, changing the quality of the signal. Redundancy can be spatial or temporal.
Video compression generally reduces spatial redundancy by applying image compression to each frame; this is known as intraframe compression. Temporal redundancy, on the other hand, is normally reduced by motion compensation techniques, known as interframe compression.
Currently, most video compression standards are designed for streaming applications, where there is a single powerful encoder and several low-complexity decoders. Some standards are designed for specific situations, such as low bit-rate coding for video conferencing.
However, new requirements have emerged in digital video encoding. Requirements such as bandwidth fluctuation, power restriction, and quality of service (QoS) can be as important as compression rate. The need to satisfy different decoding requirements makes transcoding necessary in many situations. Because transcoding requires high computational capacity, the need for an adaptive video codec (encoder and decoder) has increased.
There is an adaptive extension in signal-to-noise ratio (SNR) for the H.264/AVC standard. This extension is known as Scalable Video Coding (SVC), and it may meet some of the new encoding requirements. Other SNR-adaptive encoders have been presented previously, and there are also studies evaluating the theoretical rate-distortion (RD) limits of adaptive bit-rate video compression algorithms.
In this way, and given these problems of compression and signal-to-noise ratio, a distributed video encoder will be designed that offers a new DVC architecture, new side information (SI) generation methods, and mechanisms to omit the return channel.
3.1 General objective
To design a proposal for a distributed video encoder.
3.2 Specific objectives
If a distributed video encoder offering a new DVC architecture, new SI generation methods, and mechanisms to omit the return channel is designed, what benefits will it bring?
The scope of this project is to design a distributed video encoder that offers a new DVC architecture, new SI generation methods, and mechanisms to omit the return channel, and to determine the benefits this will bring.
6.1 Basic concepts of information theory
Information theory answers the two fundamental questions of data compression and transmission: what is the maximum possible compression, and what is the best achievable transmission rate in communications. It is built on the treatment of random variables and their grouping into stochastic processes. A random variable is a measurable function that assigns unique numerical values to all possible outcomes of a random experiment under certain conditions. Stochastic processes allow the relationships between their random variables to be expressed mathematically. An introduction to random variables and stochastic processes can be found in other works.
A basic concept in information theory is the concept of source. The term source is used to indicate a process that generates successive information messages from a given set of possible messages. A source can be modeled as a random variable X that emits symbols of an alphabet χ with probability mass function p(x). Associated with each source is its entropy H. Entropy is a measure of the uncertainty of a random variable. In terms of information theory, entropy indicates the average amount of information a source carries, in bits per symbol.
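As a minimal sketch, the entropy of a discrete source can be computed directly from its probability mass function. The alphabet size and probabilities below are illustrative assumptions, not values taken from this text:

```python
from math import log2

def entropy(pmf):
    """Shannon entropy H(X) in bits per symbol of a probability mass function."""
    return -sum(p * log2(p) for p in pmf if p > 0)

# Example: a 4-symbol source with a skewed distribution (illustrative values).
pmf = [0.5, 0.25, 0.125, 0.125]
print(entropy(pmf))  # 1.75 bits/symbol, less than log2(4) = 2 for a uniform source
```

A uniform distribution maximizes entropy; any skew lets a lossless code spend fewer bits per symbol on average.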
6.2 Source encoding
In Shannon’s work there are three main results: the source coding theorem, the rate-distortion theorem, and the channel coding theorem. The first concerns lossless compression of a discrete source; continuous sources cannot be reproduced without loss. This theorem showed that a discrete source X can be perfectly reconstructed if, and only if, it is transmitted at a rate RX not less than the entropy H(X).
6.3 Channel coding
A channel code converts a binary input of k bits into a codeword of n bits. The code rate is defined as RC = k/n ≤ 1, which specifies that an input of size k generates a code of size n. Error-correcting codes help recover the original information even if the codeword is corrupted. The channel coding theorem states that for any real value ε > 0 and coding rate RC < C, where C is the channel capacity, there exists a code such that the probability of error after decoding is less than ε. The definition of channel capacity is described below.
Channel capacity indicates how much information a channel can transmit with a probability of error close to zero. An extension of the source and channel coding theorems is the source-channel coding theorem. It states that there is a source-channel codec that allows a source with entropy H(X) to be reliably encoded on a given channel if and only if H(X) < C. In the case of lossy coding, where D is the allowed distortion, it is easy to verify that a code with rate R(D) < C can be obtained, since H(X) = R(0) ≥ R(D).
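To make the notion of capacity concrete, here is a small sketch for the binary symmetric channel, whose capacity is the classical closed form C = 1 − H(p), with H the binary entropy function. The channel model and crossover probability are illustrative choices, not specified in this text:

```python
from math import log2

def binary_entropy(p):
    """Binary entropy H(p) in bits; 0 at the endpoints by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

print(bsc_capacity(0.11))  # roughly 0.5 bits per channel use
```

At p = 0 the channel is noiseless (C = 1 bit per use) and at p = 0.5 the output is independent of the input (C = 0), matching the intuition that capacity measures the usable fraction of each transmitted bit.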
6.4 Slepian-Wolf coding
Shannon’s source coding theorem can easily be extended to the joint coding of two sources (X, Y) with joint entropy H(X, Y) by treating them as a single source Z with entropy H(Z) = H(X, Y). Therefore, to obtain a lossless reconstruction, it suffices to use a rate RZ ≥ H(Z). However, the problem can also be viewed as follows: source X can be transmitted at a rate RX ≥ H(X), and Y transmitted using RY ≥ H(Y|X) bits, assuming perfect prior knowledge of X.
In 1973, the result of Slepian and Wolf extended Shannon’s theory to the separate encoding of two correlated sources. According to the Slepian-Wolf theorem, two sources can be encoded separately and reconstructed without loss if their joint statistics are known and RX ≥ H(X|Y), RY ≥ H(Y|X), and RX + RY ≥ H(X, Y).
This theorem extends the coding region with a possible lossless reconstruction for two correlated sources. Lossless source coding with complementary information is a particular case of Slepian-Wolf coding. According to the Slepian-Wolf theory, if source Y exists only at the decoder, or was transmitted at a rate not less than H(Y), it is possible to encode the correlated source X at a rate not less than H(X|Y) and obtain a perfect reconstruction of X. The source Y is called the side information.
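The three Slepian-Wolf inequalities can be checked numerically for any joint distribution. The joint probability mass function below is a hypothetical example of two correlated binary sources, not data from this text; the conditional entropies come from the chain rule H(X|Y) = H(X,Y) − H(Y):

```python
from math import log2

def H(probs):
    """Shannon entropy of a list of probabilities, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) over two correlated binary sources.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

Hxy = H(joint.values())
px = [sum(p for (x, _), p in joint.items() if x == v) for v in (0, 1)]
py = [sum(p for (_, y), p in joint.items() if y == v) for v in (0, 1)]
# Conditional entropies via the chain rule: H(X|Y) = H(X,Y) - H(Y).
Hx_given_y = Hxy - H(py)
Hy_given_x = Hxy - H(px)

def in_slepian_wolf_region(Rx, Ry):
    """True if the rate pair (Rx, Ry) permits lossless separate encoding."""
    return Rx >= Hx_given_y and Ry >= Hy_given_x and Rx + Ry >= Hxy

print(in_slepian_wolf_region(1.0, 0.8))  # True: both rates clear the bounds
print(in_slepian_wolf_region(0.5, 0.5))  # False: below H(X|Y) and H(X,Y)
```

For this example H(X) = H(Y) = 1 bit but H(X,Y) ≈ 1.72 bits, so the correlation lets the pair be sent with roughly 0.28 bits less than two independent encoders would need.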
6.5 Wyner-Ziv encoding
Three years after the publication of Slepian and Wolf, Wyner and Ziv extended the results to lossy coding with side information. Just as the source coding theorem can be seen as the special case D = 0 of the rate-distortion theorem, source coding with Slepian-Wolf side information can be seen as the zero-distortion case of Wyner-Ziv coding.
Let RWZ(D) be the Wyner-Ziv rate-distortion function, RX(D) the rate-distortion function for encoding and decoding source X with an expected distortion D, and RX|Y(D) the function associated with encoding X given perfect knowledge of Y, with acceptable distortion D. Wyner and Ziv demonstrated the relationships RWZ(D) ≥ RX|Y(D) and RWZ(D) ≤ RX(D).
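The two bounds can be written compactly as a sandwich relation; this is only a restatement of the inequalities above in standard notation:

```latex
R_{X|Y}(D) \;\le\; R_{WZ}(D) \;\le\; R_X(D)
```

That is, knowing Y only at the decoder can never beat knowing it at both ends, and can never be worse than ignoring it altogether.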
6.6 Hybrid video encoding
A digital video signal is a three-dimensional signal, made up of a discrete sequence of images (frames), which are represented by discrete matrices of luminance and chrominance values. Most video coding standards are based on the same generic video codec model. This codec incorporates motion estimation and compensation functions, a transform and quantization stage, and an entropy encoder. The model is commonly known as hybrid coding, since it combines transform coding, generally the discrete cosine transform (DCT), with motion-compensated predictive coding, known as the DPCM (Differential Pulse Code Modulation) model.
The temporal redundancy of a digital video signal is reduced by predictive inter-frame coding. In DPCM encoding, each sample is predicted from one or more previously transmitted samples. Using the same principle, a DPCM video codec generates a pattern of the frame that will be encoded based on previously transmitted (or stored) frames.
The frame to be encoded is called the source frame, while the frames used to make the prediction are known as reference frames. Initially, the source frame is divided into blocks. Temporal prediction, or motion estimation, is performed independently for each of these blocks. The objective is to find, in the reference frames, the region most similar to the block being encoded. Reference frames are usually adjacent frames, for example the frame immediately before the source frame. The process of generating the resulting frame, by replacing each block in the source frame with its prediction, is called motion compensation. This resulting frame is subtracted from the original source frame to generate the residual frame. The entropy of the residual frame is less than that of the source frame, so it can be encoded more efficiently.
Therefore, only the information necessary to generate the same compensated frame in the decoder (motion vectors) and the residual frame is encoded. Motion vectors indicate the position of the chosen region in the reference frame relative to the block in the source frame.
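The block-matching procedure described above can be sketched as a full search minimizing the sum of absolute differences (SAD). The frame data, block size, and search range are illustrative assumptions; real codecs use larger blocks, sub-pixel accuracy, and fast search strategies:

```python
def sad(ref, src, ry, rx, sy, sx, bs):
    """SAD between the bs x bs block at (sy, sx) in src and (ry, rx) in ref."""
    return sum(abs(ref[ry + i][rx + j] - src[sy + i][sx + j])
               for i in range(bs) for j in range(bs))

def motion_vector(ref, src, sy, sx, bs=2, search=1):
    """Full-search motion estimation: best (dy, dx) within +/- search pixels."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = sy + dy, sx + dx
            if 0 <= ry <= h - bs and 0 <= rx <= w - bs:  # stay inside the frame
                cost = sad(ref, src, ry, rx, sy, sx, bs)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]

# Toy luma frames: the bright 2x2 block shifts one pixel to the right.
ref = [[0, 9, 9, 0],
       [0, 9, 9, 0],
       [0, 0, 0, 0]]
src = [[0, 0, 9, 9],
       [0, 0, 9, 9],
       [0, 0, 0, 0]]
print(motion_vector(ref, src, 0, 2))  # (0, -1): best match lies one pixel left in ref
```

The returned vector is exactly what the encoder would transmit for this block, together with the (here all-zero) residual after motion compensation.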
6.7 Distributed video encoding
Distributed video coding, DVC, is based on source coding with side information present only at the decoder. In the hybrid coding paradigm, the encoder requires more computational effort than the decoder mainly due to the motion estimation function.
If RD optimization stages are used, the complexity of the encoder increases even more. Therefore, the idea of the DVC paradigm is to transfer part of the complexity to the decoder, yielding a low-complexity encoder. Basically, operations related to the motion estimation function are avoided or minimized at the encoder, transferring this work to the decoder.
Although distributed coding has its foundations in information theory studies carried out in the 1970s, research into practical video codec implementations is recent. The first Wyner-Ziv coding proposal was implemented in 1999, considering the asymmetric case of source coding with side information for binary and Gaussian sources.
Later work considered symmetric encoding, where the sources are encoded at the same rate. Years later, the first DVC architectures were proposed: (i) the Stanford model, and (ii) PRISM. Stanford’s architecture has been widely explored in the literature, generating different works derived from the initial proposal that present continuous performance improvements. The PRISM architecture, developed at Berkeley, has not been explored as much as Stanford’s because it poses greater implementation difficulties. However, it offers a different approach and is a reference for any work in DVC.
This project will follow a quantitative methodology, proposing a coding mode based on key frames at full spatial resolution and intermediate frames coded at reduced resolution using a Wyner-Ziv encoder.
Therefore, good rate-distortion performance will be sought through better generation of side information in the decoder and an automatic rate-allocation mechanism in the encoder. This mode will reduce the coding complexity of the intermediate frames, followed by Wyner-Ziv coding of the residual. The quantized coefficients of the residual frame are mapped into cosets without using a return channel. For this, a study of the optimal encoding parameters for creating memoryless cosets will be carried out. In addition, a mechanism will be developed to estimate the statistical correlation between the signals. This mechanism guides the choice of encoding parameters and rate allocation during the coset creation process.
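As a toy illustration of this kind of coset mapping, a quantized coefficient can be signalled by its coset index alone and recovered at the decoder by picking the coset member closest to the side information. The modulus M, coefficient values, and search window are assumptions for the sketch; a real Wyner-Ziv codec derives them from the estimated correlation:

```python
M = 4  # number of cosets (assumed; in practice chosen from the correlation model)

def coset_index(q):
    """Encoder transmits only q mod M instead of the full quantized value q."""
    return q % M

def decode(idx, side_info, search=32):
    """Decoder picks the coset member nearest the side-information value."""
    candidates = [q for q in range(-search, search + 1) if q % M == idx]
    return min(candidates, key=lambda q: abs(q - side_info))

q = 10                 # original quantized coefficient (illustrative)
idx = coset_index(q)   # only log2(M) = 2 bits are transmitted
print(decode(idx, 9))  # 10: side information 9 is close enough to resolve the coset
```

Decoding succeeds without any return channel as long as the side information lies within half a coset spacing of the true value; otherwise a decoding error occurs, which is why estimating the correlation correctly matters.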
The generation of side information exploits the information obtained from the low-resolution base layer. In the decoder, channel decoding of the cosets is performed using the side information to obtain a high-quality version of the decoded intermediate frame. Results for encoding complexity and performance, in terms of rate-distortion, are presented using the H.264/AVC standard. It will be shown that the proposed Wyner-Ziv encoding mode is competitive with conventional encoding. The proposed Wyner-Ziv mode will also be adaptive, to reduce complexity and support a low-complexity decoding mode.