WO1993021734A1 - A coding technique for high definition television signals - Google Patents

A coding technique for high definition television signals

Publication number
WO1993021734A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
sub
coding
bands
codebook
Prior art date
Application number
PCT/US1993/003117
Other languages
French (fr)
Inventor
Lin-Nan Lee
Ashok Kolar Rao
Sanjai Bhargava
Original Assignee
Communications Satellite Corporation
Application filed by Communications Satellite Corporation
Publication of WO1993021734A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/008 Vector quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/94 Vector quantisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability

Definitions

  • the present invention relates to the broadcasting of image signals, such as high definition television (TV) signals.
  • the invention involves a coding technique for transforming analog television signals into a digital bit stream made up of codewords for transmission.
  • the received digital bit stream is decoded back into TV signals.
  • HDTV high definition television
  • Waveform coding techniques such as pulse code modulation (PCM) typically use a scalar quantizer to quantize analog samples. Additional steps may be employed to reduce the information that must be transmitted, such as sending only the difference of PCM samples; such a technique is known as differential PCM, or DPCM.
  • Waveform coding is generally simple to implement, but not very efficient as far as bandwidth compression is concerned.
  • An example of waveform coding is the KDD/Canon 140 Mbit/s HDTV Codec.
  • the transform coding techniques transform the image samples to the transform domain, achieving energy compaction.
  • Scalar quantizers are designed for each individual coefficient depending on its energy and the sensitivity of the human visual system (HVS) to that coefficient.
  • Most popular video coding systems at this time are based on variations of the Discrete Cosine Transform (DCT) .
  • DCT Discrete Cosine Transform
  • Well designed transform algorithms achieve effective bandwidth compression while preserving good image quality. Examples of transform technique based codecs are Telettra's 68 Mbit/s HDTV codec and General Instrument's 15 Mbit/s Digi-Cipher HDTV codec.
  • Sub-band coding is a coding technique which divides the HDTV signal into many small bands. As most of the signal energy is concentrated in the low-frequency bands, more information bits are allocated to the samples in the low-frequency bands. Also, different bands have different signal characteristics, and so different techniques can be employed to encode each individual sub-band. Examples of sub-band coding include the 140 Mbit/s codecs by Bellcore and NTT.
  • Waveform coding, transform coding, and to some extent, sub-band coding are "symmetrical" algorithms in the sense that the complexity of the encoder and decoder is about equal. For most of the video transmission and storage applications, there are far more decoders required than encoders, because each receiver must have a decoder but only the broadcaster need have an encoder. Therefore, "symmetrical" algorithms may not offer the best solution from an overall system cost standpoint.
  • VQ Vector quantization
  • An object of the present invention is to achieve an image signal coding technique which involves high compression ratios with very little quality degradation, and further has simple hardware construction.
  • a further object of the invention is to achieve considerable bandwidth compression with a coding technique even when the original signal is noisy.
  • a television signal coding method comprising the steps of: (a) dividing a video signal into its luminance and chrominance components;
  • Figure 1 represents a codebook of basic vectors for the SVQ coding technique of the present invention
  • Figure 2 represents the encoder of the present invention
  • Figure 3 represents the decoder of the present invention
  • Figure 4 represents an interframe encoder of the present invention
  • Figure 5 represents an interframe decoder of the present invention.
  • the present invention involves an improved coding technique for use, for example, with analog television signals which are to be broadcast to a plurality of receivers.
  • the analog TV signals are transformed into a digital bit stream made up of codewords by the coder.
  • the coder involves a vector quantization scheme which is different from the conventional vector quantization schemes discussed above, in that in the new vector quantization scheme, a very high quality picture can be encoded with very simple processing at relatively low bit rates.
  • the encoding process is considerably simpler than the traditional vector quantization which requires a larger codebook and thus intensive computation.
  • the quality is also substantially better than what normally can be achieved by traditional vector quantization techniques with reasonably sized codebooks.
  • the codebook is greatly reduced in size.
  • the codebook only includes codewords which relate to image patterns to which the human visual system is highly sensitive and which occur often in typical images.
  • the sample codebook shown in Figure 1 covers the most common edge patterns to be used. Each of these patterns is a separate basic vector used in the quantization. This type of quantization will be called hereinafter “simple vector quantization” (SVQ) .
  • SVQ simple vector quantization
  • a single frame of a video signal can be modeled as a raster of pixels.
  • the pixels are arranged in a two-dimensional array. If the raster is partitioned into a large number of small square blocks, each containing, for example, 4 x 4 pixels, monochrome images quantized to 8 bits per pixel would normally require 128 bits (16 pixels x 8 bits) to represent the block without compression. However, the information of user interest in such a small block can be classified into only a few two-dimensional image patterns.
  • a straightforward compression technique involves the selection of the best approximation of the input block out of the codebook of candidate patterns. The technique involves the following steps:
  • the codeword that is transmitted is as follows. First, a small number of bits, for example, 4 bits,
  • the parameter values for the DC class would be the mean of the block, that is, the mean of the intensity of the block.
  • each block has only two areas of intensity difference.
  • the invention is not limited to this embodiment, however. Specifically, larger codebooks are also contemplated.
  • Each block of an incoming analog TV signal frame is encoded by matching it up with the closest one of the basic vectors in the simplified codebook shown in Figure 1.
  • Classes 1-15 involve two-mean patterns (i.e., the edge is modeled as a step change in intensity) .
  • Classes 1-3 involve horizontal edge patterns in which the two sections of the block having different intensities are laid out horizontally with respect to each other. That is, if a vertical line is drawn through the block, the vertical line will cross two areas, each area having a different intensity.
  • Classes 4-6 of Figure 1 show basic vectors having vertical edges.
  • the change of intensity is the opposite as that described above with respect to the horizontal edges. That is, if a horizontal line is drawn across the block, the horizontal line will intersect two areas, each area having a different intensity.
  • Classes 7-15 involve diagonal edges, and a diagonal line separates areas of different intensities. That is, in order to stay within a single intensity area of the block, a line must follow a diagonal direction.
  • the classes 1-15 require two scalar parameters, each one representing the mean of one of the two disjoint regions of the block. These parameters must be transmitted along with the identifier of the pattern, as discussed above.
  • the first few bits of the codeword involve an identifier to identify which of the 16 classes of the simplified codebook most nearly represents the corresponding block of the input analog video signal frame.
  • the next group of bits is used to represent the mean of one of the areas of the block (assuming that the DC class is not involved) and the second group of bits represents the mean of the other disjoint area of the block.
  • the mean is calculated by adding up all of the intensity levels of the pixels of each disjoint area and dividing this sum by the number of pixels in the disjoint area. If the DC class is involved, only one group of bits will follow the identifier, and this group of bits will represent the estimated DC value of intensity.
  • If an edge pattern (classes 1-15) is involved, an additional number describing the step size of the edge will be required to be added to the end of the codeword. Since typical pictures contain a large number of DC blocks, the average number of bits needed to describe a block can be reduced to less than 16 bits with this scheme alone, corresponding to a compression greater than 8 to 1 relative to the 128 bits of an uncompressed 4 x 4 block.
  • a standard entropy coding technique such as Huffman coding
  • a 12-16:1 compression can easily be achieved.
  • the block size can be as small as 2 x 2 and can be enlarged to greater than 4 x 4.
  • the smaller the block size the better the quality.
  • the bigger the block size, the higher the coding efficiency.
  • the quality can be improved by adding other visually perceptible, but less significant patterns.
  • With entropy coding, a much larger codebook consisting of 128 or more patterns can be used without increasing the information rate significantly.
  • the encoding process involves a pattern matching operation and computations of a DC value and a step size (if an edge pattern is selected) .
  • the computation requirement of this encoding process is still less than that required by two-dimensional DCT for the same size block.
  • the decoding process involves only a table look-up of the pattern and simply filling the patterns with the appropriate pixel value(s) .
  • the SVQ technique is a spatial coding technique which utilizes the properties of the human vision system to compress the information bandwidth with a fairly simple encoder and an extremely simple decoder.
  • the technique by itself can achieve a coding efficiency of 1 bit per pixel for a typical video picture with negligible degradation in image quality. It can also be used in conjunction with other coding schemes such as conditional replenishment, motion compensation, or sub-band coding to achieve further bandwidth compression, as will be described later below.
  • the baseline codebook of the SVQ technique consists of edge patterns which makes it eminently suitable for interframe coding and intraframe coding of edge-like patterns. More complex features like textures and gradients can also be coded with high fidelity using an expanded codebook.
  • the codewords are parameterized by adding additional scalars, thereby greatly increasing the codebook size and improving the quality while minimizing encoder complexity.
  • Because of the simplicity of the VQ decoder, vector quantization is an appealing coding technique for the broadcast environment. Since VQ encoder complexity is usually quite high, conventional VQ techniques use a codebook with a limited number of codewords to reduce the complexity of the search for the best codeword. However, codebooks are typically designed using mean square error as a criterion, which causes visually important features such as edges to be degraded.
  • the SVQ technique was developed to reduce transmitter complexity without sacrificing quality.
  • the codeword patterns chosen are those to which the human visual system is highly sensitive and which occur often in typical images. Techniques like the DCT technique have been observed to be ineffective in coding motion-compensated frame difference signals, which are usually impulsive, edge-like signals. Since SVQ can code edges with high fidelity, it appears to be particularly appropriate for these signals.
  • the codebook is parametric, its size should not be compared to that of a conventional VQ codebook.
  • One of the advantages of SVQ is that it can have a large codebook, as computed above, without the correspondingly large search complexity.
  • the nearest codevector can be found (using an exhaustive search) by computing only 16 distortions (the number of classes shown in Figure 1) .
  • If an input 2 x 2 block, which is a portion of the video signal frame of the analog TV signals being input to the coder, has values of 2, 4, 6 and 8 as it enters the coder, and it corresponds most closely to the DC class of Figure 1, the SVQ coder will find the mean of this block, which is 5, and place a 5 in each of the four areas of the block and transmit the block with all 5's in it.
  • codebook described above is only a sample codebook described for illustrative purposes.
  • the HDTV codec actually uses a codebook with 128 basic two-mean edges. More complex codebooks (i.e., with codevectors parameterized by more than two values) can also be designed.
  • the fast search uses a look-up table which is created from the pattern codebook in an initial training phase.
  • the look-up table is stored in a memory in the encoder. It is assumed that the pattern codebook is ordered according to the frequency of occurrence of the patterns in the training sequence. The generation of the look-up table is first described.
  • the codebook now consists of 16 bit codewords.
  • the look-up table consists of $2^{16}$ input value entries corresponding to the $2^{16}$ possible 16 bit codewords. For each of the $2^{16}$ 16 bit words, the codewords from the codebook which are closest in Hamming distance to it are found. In case there are multiple codewords at the minimum Hamming distance, the codeword which occurs first in the codebook is chosen. This codeword becomes the output entry in the look-up table corresponding to the 16 bit word. Therefore, the input entries of the look-up table are mutually exclusive and there are $2^{16}$ of them and they represent all possible bit combinations of a 16 bit word.
  • the output entries of the look-up table can assume 16 different values, the values of the 16 bit words generated from the codebook.
  • the fast search is performed as follows.
  • the fast search may fail in finding the closest single edge pattern. If the input block has an ill-defined edge, then the thresholding operation performed by the fast search may yield a word which is very different from the best codeword for the block. This is not the primary reason for the sub-optimality of the algorithm because, in any case, ill-defined edges cannot be coded adequately by a single edge pattern. A more important reason is that the word obtained after thresholding may be at the minimum Hamming distance to several codewords and since the first codeword in the codebook is the one chosen (i.e. statistically more probable codewords are given preference), it may not be the best codeword. There are two ways to alleviate this problem.
  • the dimensionality of the codebook can be increased so that every 16 bit word is associated with up to, say, 4 codewords which are at the minimum Hamming distance from it.
  • the encoder would then evaluate the distortion incurred by coding the block with each one of the 4 codeword patterns and would select the pattern which yields the minimum distortion.
  • the other approach which does not increase the size of the codebook or the complexity of the encoder, is to choose the best codeword (out of all those at the minimum Hamming distance) at the look-up table creation stage using additional criteria.
  • the 4x4 block nature of the codewords may help in finding the best codeword.
  • SVQ can code the low-frequency bands more efficiently once the signal is divided into various sub-bands. Also, specific SVQ patterns can be designed to better fit the characteristics of the individual sub-bands.
  • Sub-band coding involves partitioning the input video signal into a number of disjoint sub-bands, each of which can be coded separately to optimally exploit the properties of the human visual system. Good results were obtained by Applicants by partitioning the image into 7 sub-bands and using differential pulse code modulation (DPCM) for the lowest frequency band and PCM with a coarse quantizer for the other bands.
  • DPCM differential pulse code modulation
  • HDTV sequences have significant noise power at high frequencies and coding schemes based on sub-band coding can be made robust to source noise fairly easily. Robustness can be achieved simply by using coarse quantization in the higher sub-bands.
  • the HDTV encoder of this embodiment will now be described with respect to Figure 2.
  • the input HDTV RGB video signal is input to a matrix 10.
  • the analog HDTV signal enters the HDTV encoder in the form of red, green, blue (RGB) components.
  • RGB red, green, blue
  • the input signal is separated into a luminance signal Y and chrominance signals U, V.
  • the signals are also band-limited. In a specific design example, these components were limited to 24 MHz and 6 MHz, respectively, then sampled at 54 MHz and 13.5 MHz, respectively, and quantized into 8 bits each.
  • the luminance signal Y is then divided into four sub-bands, the low-horizontal low-vertical (LL), low-horizontal high-vertical (LH), high-horizontal low-vertical (HL) and high-horizontal high-vertical (HH) bands.
  • Each of the sub-bands is then coded using SVQ.
  • a sub-band encoder 11 is used to separate the luminance signal Y into the four sub-bands described above.
  • Each of the sub-bands is fed into a separate SVQ encoder 12.
  • the outputs of the respective SVQ encoders 12 are sent to a multiplexer (MUX) 13.
  • MUX multiplexer
  • the LL band is coded with patterns in 2 x 2 blocks, whereas all other bands are coded with 4 x 4 blocks. Use of the small 2 x 2 blocks ensures that the most critical LL band is coded with minimum distortion.
  • the LL band is normally coded with 4 x 4 blocks. When coding with 4 x 4 blocks yields unacceptable distortion, 2 x 2 blocks are used. Because of significant noise presence in the HL and HH band, a noise coring will be applied to the signal before coding. Noise coring is equivalent to the use of a dead zone around 0. Since signals in those bands have negligible DC values, noise coring introduces little distortion while reducing the noise level significantly. Also, because the signal characteristics of each individual sub-band are different, SVQ entropy encoders are individually optimized to take advantage of the unique characteristics of the sub-bands.
  • the chrominance signals are not divided into sub-bands. These signals are vertically filtered and decimated, and then coded by the SVQ technique using the chrominance encoder 14. The output of the chrominance encoder is also fed to the multiplexer 13. A digital audio signal and other data, to be transmitted simultaneously with the HDTV signal, are also input to the multiplexer 13. The output of the multiplexer 13 is subjected to forward error correction by unit 15 and then transmitted to the modem for broadcasting.
  • the received data will first be put into a buffer and then subjected to forward error correcting decoding by decoder 30 in Figure 3.
  • The output of decoder 30 is sent to the multiplexer 31 which separates the received signal into the four separate luminance sub-bands, a chrominance signal, and digital audio and data.
  • the chrominance signal is sent to chrominance decoder 32 and the four luminance sub-bands are sent to four respective SVQ decoders 33 where they are decoded back into the sub-bands HH, HL,
  • the signals are entropy decoded and SVQ decoded into individual 4 x 4 (or 2 x
  • the HDTV signal will then be reconstructed by interpolating the individual sub-band signals and superimposing them together.
  • Sub-band decoder 34 superimposes the signals back into the luminance signal Y which is sent to matrix 35.
  • the chrominance signal output from the chrominance decoder 32 is also input to the matrix 35 and the HDTV RGB output signal is reconstructed at the output of matrix 35.
  • This general scheme can be used with more than four sub-bands.
  • an important advantage of using four sub-bands is that an NTSC-equivalent signal is automatically available in the form of the LL band luminance signal. Therefore, it is easy to provide an NTSC signal directly from the received signal. This is useful in situations when it is desired to transmit both an HDTV signal and an NTSC signal simultaneously. For example, some users may not be able to view the HDTV signal but can only view the NTSC signal.
  • DPCM in most cases, it is particularly susceptible to high-frequency noise.
  • Combining sub-band coding and SVQ coding results in a robust and efficient coding scheme.
  • the SVQ codebooks, fast search look-up tables, and entropy coders are individually optimized to exploit the unique characteristics of each sub-band.
  • the first and most straight-forward method is to perform the motion estimation and compensation prior to sub-band filtering.
  • the sub-band SVQ coder will then have as its inputs either a block of the motion-compensated frame difference or a block of the current frame if the motion estimation has failed for any reason (scene change, image area occlusion or uncovering, nonuniform motion, etc.).
  • the disadvantage of this approach is that the motion estimator must operate at an input rate equal to the HDTV sampling rate of 54 MHz or greater.
  • Currently available chips cannot operate at this rate and several chips will have to be time multiplexed to perform motion estimation. In these chips, the search area is limited to about ±8 pels in both directions; this displacement may be insufficient for HDTV frames.
  • the second approach is to perform the motion estimation and compensation on each of the bands separately.
  • Single-chip motion estimators can be used in each band since the sampling rate is one-fourth of the HDTV sampling rate.
  • the search area will be increased by a factor of 2 in the horizontal and vertical directions.
  • each frequency band will require a one-fourth frame memory to store the previous frame. The memory requirement will therefore be the same as in the first case.
  • a more appealing approach may be to perform motion estimation only in the LL band. This will require only one motion estimation chip to be used. Another advantage is that a significant portion of the source noise will have been filtered out, thereby increasing the accuracy of the motion estimates.
  • For the other bands there are two alternatives. The first is to perform only intraframe coding on these bands. This will reduce the memory required and will simplify the implementation at the expense of a slight increase in the bit rate. Since intraframe coding is performed in the higher bands, the coding distortion in these bands in static image areas will not decrease with time. The second alternative is to use the motion vectors from the LL band to compensate the frames in the other bands, thus performing interframe coding in all the bands. The assumption is that the motion vectors obtained from the LL band are usually valid for the other bands. Although this approach has the potential of reducing bit rate and improving quality, it is more complex than the first alternative.
  • the U and V color difference frames will use the motion estimates obtained in the LL band.
  • Each 8 x 8 block in the LL band corresponds spatially to one 4 x 8 chrominance block because the chrominance signal is subsampled horizontally by a factor of two as compared to the LL signal. Consequently, a horizontal displacement of 1 pixel in the LL band corresponds to 0.5 pixel displacement in the chrominance bands.
  • the horizontal displacement estimates from the LL band will be halved, truncated to the nearest integer, and used to compensate the U and V frames.
  • the information bit rate for digital television signals can be reduced.
  • Applicants' experiments indicate that the motion estimation for motion compensation suffers from inaccuracy in the presence of source noise.
  • the simple technique discussed above in which motion detection only in the LL sub-band is used solves this problem.
  • the resultant motion vector will be used to compensate all sub-bands, if needed. This also allows for the use of existing VLSI circuits designed for normal television signals. Although the motion vector detected will have ±1 pixel uncertainty, the results are generally better than those obtained using the original HDTV signals.
  • each sub-band will have its own frame memory in the encoder to derive the frame difference signals for the individual bands after motion compensation.
  • the motion compensation is derived from the motion estimate of the LL band.
  • each component will need frame memory to reconstruct the current frame in its own band from the output of the look-up table.
  • a high-definition television coding scheme based on a newly developed simplified vector quantization (SVQ) technique used in conjunction with sub-band coding has been described above.
  • the coding algorithm is robust to source noise and capable of providing very high quality at a range of transmission data rates practical for satellite transmission.
  • an NTSC compatible channel is automatically available as one of the sub-bands.
  • the decoding algorithm is extremely simple and well-suited for point-to-multipoint applications.
  • each sub-band can be subjected to a different type of SVQ coding to take advantage of the qualities of each sub-band. Specifically, since the higher frequency sub-bands consist primarily of edge-like signals, the SVQ coding scheme is particularly effective. The all-important low-frequency band is coded with high fidelity using the baseline codebook and additional codewords when required.
  • the sub-band SVQ coding algorithm has been simulated on HDTV motion sequences up to 100 fields long using an HP-700 workstation coupled to an in-house-developed HDTV motion sequence capture and display facility.
  • the source is a SONY analog HDTV tape recorder which produces 1,125-line interlaced HDTV frames at a frame rate of 30 Hz.
  • the analog HDTV luminance and chrominance signals are sampled at rates of 54 and 13.5 MHz, respectively.
  • the revolving toy sequence has many intricate details and rapid motion.
  • the source S/N is only about 36 dB, and the results indicate the robustness of the coding algorithm.
  • the coded sequences are free from motion artifacts and the edges are coded with fidelity.
  • the active sampling rate is about 41 MHz.
  • the rate contribution due to the chrominance signals is less than 0.1 bit/pel.
  • the MIT sequence is an artificially generated zoom and pan of several highly detailed objects. Very good quality images were obtained at less than 0.5 bit/pel for luminance with no coding or motion artifacts.
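The fast-search look-up table described in the definitions above can be sketched as follows. This is only an illustrative sketch, not the applicants' implementation: the four 16-bit codewords are hypothetical stand-ins for the patterns of Figure 1, thresholding about the block mean is an assumption (the text names a thresholding operation but not its threshold), and the polarity normalization step is also an assumption. Ties in Hamming distance go to the codeword that occurs first in the codebook, as the text specifies.

```python
# Hypothetical codebook of 16-bit words (one bit per pixel of a 4 x 4 block),
# assumed to be ordered by frequency of occurrence as the text requires.
CODEWORDS = [
    0x0000,  # class 0: DC (no edge)
    0x00FF,  # class 1: horizontal edge (top half / bottom half)
    0x3333,  # class 2: vertical edge (left half / right half)
    0x137F,  # class 3: diagonal edge
]


def build_lookup_table(codewords):
    """Map every possible 16-bit word to the index of the nearest codeword.

    Distance is Hamming distance; min() returns the first minimum, so ties
    are resolved in favor of the codeword that occurs first in the codebook.
    """
    table = []
    for word in range(1 << 16):
        nearest = min(
            range(len(codewords)),
            key=lambda i: bin(word ^ codewords[i]).count("1"),
        )
        table.append(nearest)
    return table


def threshold_block(block):
    """Reduce a 4 x 4 block of intensities to a 16-bit word.

    Assumption: each pixel is compared against the block mean. The word is
    complemented when its first bit is set, so that a bright-left edge and
    a dark-left edge map to the same pattern (also an assumption).
    """
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    word = 0
    for p in pixels:
        word = (word << 1) | (1 if p > mean else 0)
    if word & 0x8000:
        word ^= 0xFFFF
    return word


def fast_search(block, table):
    """Find the pattern class with one thresholding pass and one table read."""
    return table[threshold_block(block)]


# The table is built once (the "training phase") and then only read.
LOOKUP_TABLE = build_lookup_table(CODEWORDS)
```

With this arrangement the per-block search cost does not grow with the number of patterns; only the one-time table construction does.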

Abstract

A coding technique for high-definition television signals involves splitting the luminance signal into four sub-bands using a sub-band encoder (11) and separately encoding (12) each sub-band using a simplified vector quantization technique. The characteristics of each respective simplified vector quantization encoder can be matched to the characteristics of the respective sub-bands.

Description

A CODING TECHNIQUE FOR HIGH DEFINITION TELEVISION SIGNALS
FIELD OF THE INVENTION
The present invention relates to the broadcasting of image signals, such as high definition television (TV) signals. Specifically, the invention involves a coding technique for transforming analog television signals into a digital bit stream made up of codewords for transmission. At the receiver end, the received digital bit stream is decoded back into TV signals.
BACKGROUND OF THE INVENTION
There are many ways to encode high definition television (HDTV) signals. These techniques can generally be classified as waveform coding, transform coding and vector quantization techniques.
Waveform coding techniques such as pulse code modulation (PCM) typically use a scalar quantizer to quantize analog samples. Additional steps may be employed to reduce the information that must be transmitted, such as sending only the difference of PCM samples; such a technique is known as differential PCM, or DPCM. Waveform coding is generally simple to implement, but not very efficient as far as bandwidth compression is concerned. An example of waveform coding is the KDD/Canon 140 Mbit/s HDTV Codec.
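As an illustration of the DPCM idea described above, the following minimal sketch quantizes the difference between each sample and the decoder's reconstruction of the previous sample. The function names, the uniform quantizer, and the step size of 4 are assumptions chosen for the example; they are not taken from any of the codecs named in the text.

```python
def dpcm_encode(samples, step=4):
    """Return one quantized difference per input sample.

    The prediction tracks what the decoder will reconstruct, so that
    quantization error does not accumulate from sample to sample.
    """
    codes = []
    prediction = 0
    for sample in samples:
        quantized = int(round((sample - prediction) / step))
        codes.append(quantized)
        prediction += quantized * step
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the samples by accumulating the dequantized differences."""
    samples = []
    prediction = 0
    for quantized in codes:
        prediction += quantized * step
        samples.append(prediction)
    return samples


original = [100, 102, 105, 105, 104, 90]
codes = dpcm_encode(original)
rebuilt = dpcm_decode(codes)
```

After the first sample the transmitted values are small differences, which need fewer bits than the full-range PCM samples; the reconstruction error per sample stays within half a quantizer step.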
The transform coding techniques transform the image samples to the transform domain, achieving energy compaction. Scalar quantizers are designed for each individual coefficient depending on its energy and the sensitivity of the human visual system (HVS) to that coefficient. Most popular video coding systems at this time are based on variations of the Discrete Cosine Transform (DCT). Well designed transform algorithms achieve effective bandwidth compression while preserving good image quality. Examples of transform technique based codecs are Telettra's 68 Mbit/s HDTV codec and General Instrument's 15 Mbit/s Digi-Cipher HDTV codec.
Sub-band coding is a coding technique which divides the HDTV signal into many small bands. As most of the signal energy is concentrated in the low-frequency bands, more information bits are allocated to the samples in the low-frequency bands. Also, different bands have different signal characteristics, and so different techniques can be employed to encode each individual sub-band. Examples of sub-band coding include the 140 Mbit/s codecs by Bellcore and NTT.
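The band-splitting idea can be illustrated with a minimal sketch that divides a frame into four quarter-size bands. The two-tap averaging and differencing (Haar) filters used here are an assumption made for brevity; the text does not state which analysis filters the cited codecs or the present technique use. The band names follow the low/high horizontal and vertical convention (LL, LH, HL, HH).

```python
def split_subbands(frame):
    """Split a frame (list of rows, even width and height) into four bands."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(frame), 2):
        rows = ([], [], [], [])
        for c in range(0, len(frame[0]), 2):
            a, b = frame[r][c], frame[r][c + 1]
            d, e = frame[r + 1][c], frame[r + 1][c + 1]
            rows[0].append((a + b + d + e) / 4)  # LL: low horiz., low vert.
            rows[1].append((a + b - d - e) / 4)  # LH: low horiz., high vert.
            rows[2].append((a - b + d - e) / 4)  # HL: high horiz., low vert.
            rows[3].append((a - b - d + e) / 4)  # HH: high horiz., high vert.
        for band, row in zip((ll, lh, hl, hh), rows):
            band.append(row)
    return ll, lh, hl, hh


def merge_subbands(ll, lh, hl, hh):
    """Invert split_subbands and rebuild the full-size frame."""
    frame = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, h, x = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [s + v + h + x, s + v - h - x]
            bottom += [s - v + h - x, s - v - h + x]
        frame += [top, bottom]
    return frame


def energy(band):
    """Sum of squared sample values in a band."""
    return sum(value * value for row in band for value in row)


frame = [
    [10, 12, 14, 16],
    [11, 13, 15, 17],
    [12, 14, 16, 18],
    [13, 15, 17, 19],
]
ll, lh, hl, hh = split_subbands(frame)
```

For a smooth frame such as this one, nearly all of the energy lands in the LL band, which is why more bits are allocated there.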
Waveform coding, transform coding, and to some extent, sub-band coding are "symmetrical" algorithms in the sense that the complexity of the encoder and decoder is about equal. For most of the video transmission and storage applications, there are far more decoders required than encoders, because each receiver must have a decoder but only the broadcaster need have an encoder. Therefore, "symmetrical" algorithms may not offer the best solution from an overall system cost standpoint.
Vector quantization (VQ) basically quantizes a group of samples at one time. VQ has the advantage that only a table look-up operation is needed to decode the signal, leading to an extremely simple decoder. However, it generally suffers from relatively poor quality. In addition, the encoder, which must search through a codebook to find the best vector to represent a group of samples, is generally very computationally intensive. For high-quality television signals, HDTV in particular, real-time encoding requires extremely complicated hardware. Examples of vector quantization are disclosed in copending, commonly assigned Applications Serial Nos. 07/732,024 and 07/759,361, which are herein incorporated by reference.
SUMMARY OF THE INVENTION
An object of the present invention is to achieve an image signal coding technique which involves high compression ratios with very little quality degradation, and further has simple hardware construction.
A further object of the invention is to achieve considerable bandwidth compression with a coding technique even when the original signal is noisy.
The above objects are attained by creating a coding technique having the following characteristics.
A television signal coding method comprising the steps of:
(a) dividing a video signal into its luminance and chrominance components;
(b) dividing the luminance component into a plurality of sub-bands;
(c) simultaneously and independently performing the following coding on each sub-band:
(i) dividing an image frame into a plurality of two-dimensional blocks, each block being composed of a predetermined number of pixels;
(ii) forming a codebook containing a plurality of basic vectors corresponding to the most common edge patterns to which the human visual system is highly sensitive;
(iii) comparing a block with said codebook;
(iv) identifying the edge pattern having the closest correlation to the block;
(v) generating a digital signal including a codeword corresponding to the results of said step (iv);
(vi) repeating said steps (iii) through (v) for each of said plurality of blocks;
(d) simultaneously coding said chrominance component; and
(e) combining the thus-coded luminance and chrominance components into a single bit stream for transmission.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 represents a codebook of basic vectors for the SVQ coding technique of the present invention;
Figure 2 represents the encoder of the present invention;
Figure 3 represents the decoder of the present invention;
Figure 4 represents an interframe encoder of the present invention; and
Figure 5 represents an interframe decoder of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The preferred embodiments of the present invention will now be described with reference to the above-mentioned figures.
Simplified Vector Quantization (SVQ):
The present invention involves an improved coding technique for use, for example, with analog television signals which are to be broadcast to a plurality of receivers. The analog TV signals are transformed into a digital bit stream made up of codewords by the coder. The coder involves a vector quantization scheme which is different from the conventional vector quantization schemes discussed above, in that in the new vector quantization scheme, a very high quality picture can be encoded with very simple processing at relatively low bit rates. The encoding process is considerably simpler than the traditional vector quantization which requires a larger codebook and thus intensive computation. The quality is also substantially better than what normally can be achieved by traditional vector quantization techniques with reasonably sized codebooks.
According to the present invention, the codebook is greatly reduced in size. Specifically, the codebook only includes codewords which relate to image patterns to which the human visual system is highly sensitive and which occur often in typical images. The sample codebook shown in Figure 1 covers the most common edge patterns to be used. Each of these patterns is a separate basic vector used in the quantization. This type of quantization will be called hereinafter "simplified vector quantization" (SVQ).
The basic approach of the SVQ technique will now be described. A single frame of a video signal can be modeled as a raster of pixels. The pixels are arranged in a two-dimensional array. If the raster is partitioned into a large number of small square blocks, each containing, for example, 4 x 4 pixels, monochrome images quantized to 8 bits per pixel would normally require 128 bits to represent the block without compression. However, the information of user interest in such a small block can be classified into only a few two-dimensional image patterns.
In a typical television signal, Applicants have found that the information of perceptual significance to humans in such a block can essentially be represented by 16 patterns and up to 2 gray scale values (see Figure 1). These patterns include a simple square labelled 0 in Figure 1 which is all of the same color. This represents a DC value and is a class 0 pattern of constant intensity. This is also referred to as the DC class. This pattern represents low-frequency areas in the image.
A straightforward compression technique involves the selection of the best approximation of the input block out of the codebook of candidate patterns. The technique involves the following steps:
1. Start with the first pattern in the codebook.
2. In accordance with the pattern orientation of this first pattern, compute two means from the 16 sample intensities (for all classes except the DC class; only one mean is calculated for the DC class).
3. Create a 4x4 block in which each of the 16 pixels has an intensity equal to the appropriate mean. For example, for class 1, the top four pixels will be filled with one mean value and the bottom 12 pixels will be filled with the other mean value.
4. Compute the distortion between the created block and the actual input block.
5. Repeat steps 2-4 for all the patterns in the codebook.
6. Choose the pattern which yields the minimum distortion.
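By way of illustration only, steps 1-6 above may be sketched in Python as follows. The three region masks are hypothetical stand-ins for the patterns of Figure 1 (only the DC class and the class 1 layout are taken from the text), and squared error is assumed as the distortion measure, which the text does not specify.

```python
# Region masks for 4x4 blocks in row-major order; 0 and 1 label the two areas.
# Only class 0 (DC) and class 1 (top row vs. bottom three rows) follow the
# text; the third mask is a hypothetical vertical-edge pattern.
PATTERNS = [
    [0] * 16,
    [0] * 4 + [1] * 12,
    [0, 1, 1, 1] * 4,
]


def region_means(block, mask):
    """Mean intensity of each region of `block` defined by `mask` (step 2)."""
    means = {}
    for label in set(mask):
        pixels = [p for p, m in zip(block, mask) if m == label]
        means[label] = sum(pixels) / len(pixels)
    return means


def encode_block(block, patterns=PATTERNS):
    """Exhaustive search: return (class index, region means, distortion)."""
    best = None
    for index, mask in enumerate(patterns):
        means = region_means(block, mask)
        approx = [means[m] for m in mask]              # step 3: filled block
        distortion = sum((p - a) ** 2                  # step 4 (assumed metric)
                         for p, a in zip(block, approx))
        if best is None or distortion < best[2]:       # step 6: keep minimum
            best = (index, means, distortion)
    return best
```

Because the comparison is strict, a perfectly flat block is assigned to the DC class, which comes first in this toy codebook.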
The codeword that is transmitted is as follows. First, a small number of bits, for example, 4 bits, would be used to designate which of the 16 patterns of the codebook is closest to the block of the input analog TV signal. For example, for the DC class pattern, a particular 4 bit sequence would be chosen for the beginning of the codeword make-up. Next, parameter values would be added to the codeword to complete the make-up. The parameter values for the DC class would be the mean of the block, that is, the mean of the intensity of the block. For classes 1 to 15 two means will be calculated, a mean for each of the two areas of the block. The different areas of the blocks are designated by different colors in Figure 1. As shown in Figure 1, for the simplified codebook involved with the present SVQ method, each block has only two areas of intensity difference. The invention is not limited to this embodiment, however. Specifically, larger codebooks are also contemplated.
In a situation in which there is an edge in the image, the mean value of each area separated by the edge will be calculated and added as parameters to the end of the first 4 bit segment making up the codeword.
Each block of an incoming analog TV signal frame is encoded by matching it up with the closest one of the basic vectors in the simplified codebook shown in Figure 1.
The non-DC classes of basic vectors of Figure 1 will now be described. These are classes 1-15 and involve two-mean patterns (i.e., the edge is modeled as a step change in intensity). Classes 1-3 involve horizontal edge patterns in which the two sections of the block having different intensities are laid out horizontally with respect to each other. That is, if a vertical line is drawn through the block, the vertical line will cross two areas, each area having a different intensity.
Classes 4-6 of Figure 1 show basic vectors having vertical edges. For these classes, the change of intensity is the opposite of that described above with respect to the horizontal edges. That is, if a horizontal line is drawn across the block, the horizontal line will intersect two areas, each area having a different intensity. Classes 7-15 involve diagonal edges, and a diagonal line separates areas of different intensities. That is, in order to stay within a single intensity area of the block, a line must follow a diagonal direction.
Classes 1-15 require two scalar parameters, each one representing the mean of one of the two disjoint regions of the block. These parameters must be transmitted along with the identifier of the pattern, as discussed above. To summarize, the first few bits of the codeword form an identifier which indicates which of the 16 classes of the simplified codebook most nearly represents the corresponding block of the input analog video signal frame. The next group of bits is used to represent the mean of one of the areas of the block (assuming that the DC class is not involved) and the second group of bits represents the mean of the other disjoint area of the block. The mean is calculated by adding up all of the intensity levels of the pixels of each disjoint area and dividing this sum by the number of pixels in the disjoint area. If the DC class is involved, only one group of bits will follow the identifier, and this group of bits will represent the estimated DC value of intensity.
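The codeword layout summarized above may be sketched as follows. The 4 bit identifier is taken from the text; the use of 8 bits per mean is an assumption based on the 8 bit pixel quantization, and the function names are hypothetical.

```python
def pack_codeword(class_id, means):
    """Pack a 4-bit class identifier followed by one 8-bit mean (DC class)
    or two 8-bit means (classes 1-15). Returns (value, bit_length)."""
    if not 0 <= class_id < 16:
        raise ValueError("class identifier must fit in 4 bits")
    expected = 1 if class_id == 0 else 2
    if len(means) != expected:
        raise ValueError("wrong number of means for this class")
    value, length = class_id, 4
    for mean in means:
        level = max(0, min(255, round(mean)))   # assumed 8-bit quantization
        value = (value << 8) | level
        length += 8
    return value, length


def unpack_codeword(value, length):
    """Inverse of pack_codeword for a codeword of known bit length."""
    count = (length - 4) // 8
    means = [(value >> (8 * (count - 1 - i))) & 0xFF for i in range(count)]
    return value >> (8 * count), means
```

In a serial bit stream the decoder would read the 4 bit identifier first and infer from it whether one or two means follow; the explicit length argument here merely keeps the sketch short.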
Further, if an edge pattern (classes 1-15) is involved, an additional number describing the step size of the edge will be required to be added to the end of the codeword. Since typical pictures contain a large number of DC blocks, the average number of bits needed to describe a block can be reduced to less than 16 bits with this scheme alone, corresponding to a compression greater than 8 to 1. In conjunction with a standard entropy coding technique such as Huffman coding, a 12-16:1 compression can easily be achieved. The block size can be as small as 2 x 2 and can be enlarged to greater than 4 x 4. The smaller the block size, the better the quality. The bigger the block size, the higher the coding efficiency. Also, the quality can be improved by adding other visually perceptible, but less significant patterns. With entropy coding, a much larger codebook consisting of 128 or more patterns can be used without increasing the information rate significantly.
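The "less than 16 bits" figure can be checked with a short calculation under stated assumptions: a 4 bit identifier, 8 bits per transmitted parameter, one parameter for a DC block and two for an edge block, and a fraction p of DC blocks. None of these bit widths beyond the 4 bit identifier is fixed by the text.

```latex
\[
\bar{B} = 12\,p + 20\,(1-p) = 20 - 8p,
\qquad
\bar{B} < 16 \iff p > \tfrac{1}{2},
\qquad
\frac{128}{\bar{B}} > 8 \ \text{whenever}\ \bar{B} < 16 .
\]
```

Under these assumptions the stated compression of more than 8 to 1 holds as soon as more than half of the 4 x 4 blocks fall in the DC class.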
The high resolution information on the edges is preserved by the SVQ technique. Also, the encoding process involves a pattern matching operation and computations of a DC value and a step size (if an edge pattern is selected). The computation requirement of this encoding process is still less than that required by two-dimensional DCT for the same size block. The decoding process involves only a table look-up of the pattern and simply filling the patterns with the appropriate pixel value(s).
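The decoding step described above (look up the pattern, then fill it) may be sketched as follows. The masks are hypothetical stand-ins for the Figure 1 patterns, as only the DC class and the class 1 layout are described in the text.

```python
# Hypothetical pattern table: 4x4 region masks in row-major order.
PATTERN_TABLE = [
    [0] * 16,              # class 0: DC
    [0] * 4 + [1] * 12,    # class 1: top row vs. bottom three rows
    [0, 1, 1, 1] * 4,      # assumed vertical-edge pattern
]


def decode_block(class_id, means, table=PATTERN_TABLE):
    """Table look-up of the pattern, then fill each area with its mean.
    `means[k]` is the value for the area labelled k in the mask."""
    mask = table[class_id]
    pixels = [means[label] for label in mask]
    return [pixels[row * 4:(row + 1) * 4] for row in range(4)]
```

No arithmetic is performed on the pixel values, which is why the decoder is so much simpler than the encoder.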
The SVQ technique is a spatial coding technique which utilizes the properties of the human vision system to compress the information bandwidth with a fairly simple encoder and an extremely simple decoder. The technique by itself can achieve a coding efficiency of 1 bit per pixel for a typical video picture with negligible degradation in image quality. It can also be used in conjunction with other coding schemes such as conditional replenishment, motion compensation, or sub-band coding to achieve further bandwidth compression, as will be described later below. The baseline codebook of the SVQ technique consists of edge patterns which makes it eminently suitable for interframe coding and intraframe coding of edge-like patterns. More complex features like textures and gradients can also be coded with high fidelity using an expanded codebook. The codewords are parameterized by adding additional scalars, thereby greatly increasing the codebook size and improving the quality while minimizing encoder complexity.
Because of the simplicity of the VQ decoder, vector quantization is an appealing coding technique for the broadcast environment. Since VQ encoder complexity is usually quite high, conventional VQ techniques use a codebook with a limited number of codewords to reduce the complexity of the search for the best codeword. However, codebooks are typically designed using mean square error as a criterion, which causes visually important features such as edges to be degraded.
The SVQ technique was developed to reduce transmitter complexity without sacrificing quality. The codeword patterns chosen are those to which the human visual system is highly sensitive and which occur often in typical images. Techniques like the DCT technique have been observed to be ineffective in coding motion-compensated frame difference signals, which are usually impulsive, edge-like signals. Since SVQ can code edges with high fidelity, it appears to be particularly appropriate for these signals.
The size of the SVQ codebook is 15 x (256)^2 + 256 = 983,296, or approximately 2^20. However, since the codebook is parametric, its size should not be compared to that of a conventional VQ codebook. One of the advantages of SVQ is that it can have a large codebook, as computed above, without the correspondingly large search complexity. For the codebook shown in Figure 1, the nearest codevector can be found (using an exhaustive search) by computing only 16 distortions (the number of classes shown in Figure 1).
An example will now be given of the SVQ coding. If an input 2 x 2 block, which is a portion of the video signal frame of the analog TV signals being input to the coder, has values of 2, 4, 6 and 8 as it enters the coder, and it corresponds most closely to the DC class of Figure 1, the SVQ coder will find the mean of this block, which is 5, and place a 5 in each of the four areas of the block and transmit the block with all 5's in it.
Note that the codebook described above is only a sample codebook described for illustrative purposes. The HDTV codec actually uses a codebook with 128 basic two-mean edges. More complex codebooks (i.e., with codevectors parameterized by more than two values) can also be designed.
A fast search technique will now be described for quickly determining which of the 16 basic vectors in Figure 1 is most similar to an input block of a video signal frame of the analog TV signal. This search technique is much quicker and more efficient than the exhaustive search technique described above.
As mentioned above, 128 classes are used in Applicants' experiments, which would require computing 128 distortions if an exhaustive search is used to find the best codevector and parameters. While the complexity of an exhaustive search is much lower for SVQ than for conventional VQ, it still is high. A fast search method has been developed which requires the computation of only a few distortions per 4 x 4 block.
The advantages of the fast search algorithm are that it is extremely simple and the computational complexity is independent of the codebook size. In general, its performance is almost as good as that of an exhaustive search. The fast search algorithm finds a two-mean edge pattern which can match the 4 x 4 input block; however, it does not guarantee that the closest pattern will be found. In those infrequent cases when the closest pattern is not found, distortion is usually quite close to the global minimum.
The fast search uses a look-up table which is created from the pattern codebook in an initial training phase. The look-up table is stored in a memory in the encoder. It is assumed that the pattern codebook is ordered according to the frequency of occurrence of the patterns in the training sequence. The generation of the look-up table is first described.
1. Take a 4x4 pattern in the codebook. Replace all the pixels which have one value with a '0' and all the other pixels with a '1'.
2. Create a 16 bit word by scanning the block from left to right and top to bottom.
3. Repeat steps 1 and 2 for all the patterns in the codebook.
The codebook now consists of 16 bit codewords. The look-up table consists of 2^16 input value entries corresponding to the 2^16 possible 16 bit words. For each of the 2^16 16 bit words, the codewords from the codebook which are closest in Hamming distance to it are found. In case there are multiple codewords at the minimum Hamming distance, the codeword which occurs first in the codebook is chosen. This codeword becomes the output entry in the look-up table corresponding to the 16 bit word. Therefore, the input entries of the look-up table are mutually exclusive, there are 2^16 of them, and they represent all possible bit combinations of a 16 bit word. The output entries of the look-up table can assume 16 different values, the values of the 16 bit words generated from the codebook.
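The table generation just described may be sketched as follows. As a simplification, the table stores the index of the chosen pattern rather than its 16 bit codeword; the function names are hypothetical.

```python
def pattern_to_word(mask):
    """Scan a 4x4 pattern of 0/1 values (row-major, i.e. left to right and
    top to bottom) into a 16-bit integer."""
    word = 0
    for bit in mask:
        word = (word << 1) | bit
    return word


def build_lookup_table(pattern_words):
    """For each of the 2**16 possible words, store the index of the pattern
    word nearest in Hamming distance; ties go to the earliest pattern."""
    table = []
    for word in range(1 << 16):
        best_index, best_distance = 0, 17       # 17 exceeds any 16-bit distance
        for index, pattern in enumerate(pattern_words):
            distance = bin(word ^ pattern).count("1")
            if distance < best_distance:        # strict: first pattern wins ties
                best_index, best_distance = index, distance
        table.append(best_index)
    return table
```

Because the codebook is assumed to be ordered by frequency of occurrence, resolving ties in favor of the earliest entry gives preference to the statistically more probable pattern.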
Once the look-up table is formed, the fast search is performed as follows.
1. For a given input block, subtract the block mean from every sample of the block.
2. Replace the positive samples by 1's and the negative samples by 0's and construct a 16 bit word by scanning the input block from left to right and top to bottom.
3. Find the output entry in the look-up table which corresponds to this input 16 bit word. This codeword will be the pattern used to code the block.
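The three search steps above may be sketched as follows. The look-up table is passed in as a list indexed by the 16 bit word; samples exactly equal to the block mean are mapped to 0, which is an assumption since the text only mentions positive and negative samples.

```python
def threshold_word(block):
    """Steps 1-2: subtract the block mean, threshold at zero and scan the
    16 samples (row-major) into a 16-bit word."""
    mean = sum(block) / len(block)
    word = 0
    for sample in block:
        word = (word << 1) | (1 if sample - mean > 0 else 0)
    return word


def fast_search(block, lookup_table):
    """Step 3: a single table access replaces the distortion computations."""
    return lookup_table[threshold_word(block)]


# Toy table for three hypothetical pattern words (DC, class 1, vertical edge);
# min() returns the first index on ties, matching the codebook ordering rule.
pattern_words = [0x0000, 0x0FFF, 0x7777]
lookup_table = [
    min(range(len(pattern_words)),
        key=lambda i: bin(w ^ pattern_words[i]).count("1"))
    for w in range(1 << 16)
]
noisy_edge = [12, 9, 11, 10] + [201, 198, 200, 199] * 3
selected = fast_search(noisy_edge, lookup_table)
```

Once the pattern is selected, the region means are computed for that single pattern only, so the per-block cost does not grow with the number of patterns in the codebook.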
There are two reasons why the fast search may fail in finding the closest single edge pattern. If the input block has an ill defined edge, then the thresholding operation performed by the fast search may yield a word which is very different from the best codeword for the block. This is not the primary reason for the sub-optimality of the algorithm because, in any case, ill defined edges cannot be coded adequately by a single edge pattern. A more important reason is that the word obtained after thresholding may be at the minimum Hamming distance to several codewords and since the first codeword in the codebook is the one chosen (i.e., statistically more probable codewords are given preference), it may not be the best codeword. There are two ways to alleviate this problem. The dimensionality of the codebook can be increased so that every 16 bit word is associated with up to, say, 4 codewords which are at the minimum Hamming distance from it. The encoder would then evaluate the distortion incurred by coding the block with each one of the 4 codeword patterns and would select the pattern which yields the minimum distortion. The other approach, which does not increase the size of the codebook or the complexity of the encoder, is to choose the best codeword (out of all those at the minimum Hamming distance) at the look-up table creation stage using additional criteria. The 4x4 block nature of the codewords may help in finding the best codeword.
Now that the SVQ technique has been explained, a second embodiment of the invention will be described in which the SVQ technique is combined with sub-band coding in order to provide a composite coding scheme which provides further advantages as will be described below.
SVQ combined with sub-band coding:
Applying the SVQ technique directly to an HDTV signal does not yield the most satisfactory results due to the high noise level present in most HDTV source signals. Thus, Applicants have found that by combining the sub-band coding scheme and the SVQ technique for HDTV coding, a higher efficiency and higher quality coding can be achieved.
As most of the noise in the HDTV signal is in the higher frequency bands, SVQ can code the low-frequency bands more efficiently once the signal is divided into various sub-bands. Also, specific SVQ patterns can be designed to better fit the characteristics of the individual sub-bands.
Sub-band coding involves partitioning the input video signal into a number of disjoint sub-bands, each of which can be coded separately to optimally exploit the properties of the human visual system. Good results were obtained by Applicants by partitioning the image into 7 sub-bands and using differential pulse code modulation (DPCM) for the lowest frequency band and PCM with a coarse quantizer for the other bands. HDTV sequences have significant noise power at high frequencies and coding schemes based on sub-band coding can be made robust to source noise fairly easily. Robustness can be achieved simply by using coarse quantization in the higher sub-bands.
The HDTV encoder of this embodiment will now be described with respect to Figure 2. In Figure 2, the input HDTV RGB video signal is input to a matrix 10. The analog HDTV signal enters the HDTV encoder in the form of red, green, blue (RGB) components. At the matrix 10, the input signal is separated into a luminance signal Y and chrominance signals U, V. The signals are also band-limited. In a specific design example, these components were limited to 24 MHz and 6 MHz, respectively, then sampled at 54 MHz and 13.5 MHz, respectively, and quantized into 8 bits each.
The luminance signal Y is then divided into four sub-bands, the low-horizontal low-vertical (LL), low-horizontal high-vertical (LH), high-horizontal low-vertical (HL) and high-horizontal high-vertical (HH) bands. Each of the sub-bands is then coded using SVQ. A sub-band encoder 11 is used to separate the luminance signal Y into the four sub-bands described above. Each of the sub-bands is fed into a separate SVQ encoder 12. The outputs of the respective SVQ encoders 12 are sent to a multiplexer (MUX) 13.
In one embodiment, the LL band is coded with patterns in 2 x 2 blocks, whereas all other bands are coded with 4 x 4 blocks. Use of the small 2 x 2 blocks ensures that the most critical LL band is coded with minimum distortion. In a second embodiment, the LL band is normally coded with 4 x 4 blocks. When coding with 4 x 4 blocks yields unacceptable distortion, 2 x 2 blocks are used. Because of the significant noise present in the HL and HH bands, noise coring is applied to the signal in these bands before coding. Noise coring is equivalent to the use of a dead zone around 0. Since signals in those bands have negligible DC values, noise coring introduces little distortion while reducing the noise level significantly. Also, because the signal characteristics of each individual sub-band are different, SVQ entropy encoders are individually optimized to take advantage of the unique characteristics of the sub-bands.
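Noise coring as a dead zone around 0 may be sketched as follows. The threshold value and the choice of a strict comparison are assumptions; the text does not give the width of the dead zone.

```python
def core(samples, threshold):
    """Dead zone around 0: samples whose magnitude is below `threshold`
    are set to zero, all other samples pass through unchanged."""
    return [0 if abs(sample) < threshold else sample for sample in samples]
```

Applied to a high band whose samples hover around zero, this removes low-amplitude noise while leaving the larger, edge-like samples untouched.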
The chrominance signals are not divided into sub-bands. These signals are vertically filtered and decimated, and then coded by the SVQ technique using the chrominance encoder 14. The output of the chrominance encoder is also fed to the multiplexer 13. A digital audio signal and other data, to be transmitted simultaneously with the HDTV signal, are also input to the multiplexer 13. The output of the multiplexer 13 is subjected to forward error correction by unit 15 and then transmitted to the modem for broadcasting.
At the receive side, the receive data will first be put into a buffer and then subjected to forward error correcting decoding by decoder 30 in Figure 3.
The output of decoder 30 is sent to the multiplexer 31 which separates the received signal into the four separate luminance sub-bands, a chrominance signal, and digital audio and data. The chrominance signal is sent to chrominance decoder 32 and the four luminance sub-bands are sent to four respective SVQ decoders 33 where they are decoded back into the sub-bands HH, HL, LH and LL described above. The signals are entropy decoded and SVQ decoded into individual 4 x 4 (or 2 x 2) blocks using look-up tables. The HDTV signal will then be reconstructed by interpolating the individual sub-band signals and superimposing them together. Sub-band decoder 34 superimposes the signals back into the luminance signal Y which is sent to matrix 35. The chrominance signal output from the chrominance decoder 32 is also input to the matrix 35 and the HDTV RGB output signal is reconstructed at the output of matrix 35.
This general scheme can be used with more than four sub-bands. However, an important advantage of using four sub-bands is that an NTSC-equivalent signal is automatically available in the form of the LL band luminance signal. Therefore, it is easy to provide an NTSC signal directly from the received signal. This is useful in situations when it is desired to transmit both an HDTV signal and an NTSC signal simultaneously. For example, some users may not be able to view the HDTV signal but can only view the NTSC signal.
A variety of sub-band filter banks (quadrature mirror filters (QMFs) and perfect reconstruction filter banks) were used by Applicants in experiments, and are described in COMSAT Laboratories, "Phase 1 Final Report for the Flexible Rate HDTV Codec", NASA Contract NASW 4512, May 23, 1991. Both 2- and 16-tap QMF filters were found to be suitable, the 2-tap because of its simplicity and reasonable performance and the 16-tap for its performance. The higher tap filter would be preferable for a noisy source, since the sub-bands would be more isolated (less aliasing). A higher tap filter would also be preferable if the low band were to be used for extracting an NTSC compatible channel, again because of less aliasing.
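A four-band split and reconstruction with a 2-tap filter pair may be sketched as follows. The filter coefficients are not given in the text; the averaging and differencing pair used here is an assumed 2-tap choice, and the function names are hypothetical. Band names follow the horizontal-then-vertical convention used above.

```python
def split_pairs(seq):
    """2-tap analysis with decimation by 2: pairwise average and difference."""
    low = [(seq[i] + seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    high = [(seq[i] - seq[i + 1]) / 2 for i in range(0, len(seq), 2)]
    return low, high


def merge_pairs(low, high):
    """Inverse of split_pairs: interpolate and superimpose the two bands."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out


def transpose(rows):
    return [list(column) for column in zip(*rows)]


def analyze(image):
    """Split an image (list of rows, even dimensions) into LL, LH, HL, HH."""
    halves = [split_pairs(row) for row in image]          # horizontal filter
    row_low = [h[0] for h in halves]
    row_high = [h[1] for h in halves]

    def vertical(half):
        columns = [split_pairs(col) for col in transpose(half)]
        return (transpose([c[0] for c in columns]),
                transpose([c[1] for c in columns]))

    ll, lh = vertical(row_low)     # low-horizontal: low- and high-vertical
    hl, hh = vertical(row_high)    # high-horizontal: low- and high-vertical
    return ll, lh, hl, hh


def synthesize(ll, lh, hl, hh):
    """Rebuild the image from its four sub-bands."""
    def vertical(low, high):
        columns = [merge_pairs(l, h)
                   for l, h in zip(transpose(low), transpose(high))]
        return transpose(columns)

    row_low, row_high = vertical(ll, lh), vertical(hl, hh)
    return [merge_pairs(l, h) for l, h in zip(row_low, row_high)]
```

Each band has one quarter of the samples of the input, and the LL band is a half-resolution version of the picture in each direction.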
Because of their complexity, perfect reconstruction filter banks were not chosen. The perfect reconstruction property is not very important since the coding distortion is significantly higher than the QMF filter bank distortion. Moreover, the QMF filter banks can be implemented much more efficiently than the perfect reconstruction filter banks.
While SVQ is more efficient than either PCM or DPCM in most cases, it is particularly susceptible to high-frequency noise. Combining sub-band coding and SVQ coding results in a robust and efficient coding scheme.
Because the signal characteristics of each individual sub-band (HH, HL, LH and LL) are different, the SVQ codebooks, fast search look-up tables, and entropy coders are individually optimized to exploit the unique characteristics of each sub-band.
Motion estimation and compensation will now be described. With image signals, there is a certain amount of repetition of data. For example, if an image does not move for a long time, the data will remain constant. Also, there are certain instances when certain parts of the data of an image remain constant while other parts move. In these situations, it is efficient to send only information relating to the moving parts; there is no need to continuously send information as to parts that remain the same.
Because of sub-band coding there are several methods of motion estimation and compensation, as discussed below. The first and most straightforward method is to perform the motion estimation and compensation prior to sub-band filtering. The sub-band SVQ coder will then have as its inputs either a block of the motion-compensated frame difference or a block of the current frame if the motion estimation has failed for any reason (scene change, image area occlusion or uncovering, nonuniform motion, etc.). The disadvantage of this approach is that the motion estimator must operate at an input rate equal to the HDTV sampling rate of 54 MHz or greater. Currently available chips cannot operate at this rate and several chips will have to be time multiplexed to perform motion estimation. In these chips, the search area is limited to about ±8 pels in both directions; this displacement may be insufficient for HDTV frames.
The second approach is to perform the motion estimation and compensation on each of the bands separately. Single-chip motion estimators can be used in each band since the sampling rate is one-fourth of the HDTV sampling rate. The search area will be increased by a factor of 2 in the horizontal and vertical directions. At the encoder and decoder, each frequency band will require a one-fourth frame memory to store the previous frame. The memory requirement will therefore be the same as in the first case.
A more appealing approach may be to perform motion estimation only in the LL band. This will require only one motion estimation chip to be used. Another advantage is that a significant portion of the source noise will have been filtered out, thereby increasing the accuracy of the motion estimates. For the other bands, there are two alternatives. The first is to perform only intraframe coding on these bands. This will reduce the memory required and will simplify the implementation at the expense of a slight increase in the bit rate. Since intraframe coding is performed in the higher bands, the coding distortion in these bands in static image areas will not decrease with time. The second alternative is to use the motion vectors from the LL band to compensate the frames in the other bands, thus performing interframe coding in all the bands. The assumption is that the motion vectors obtained from the LL band are usually valid for the other bands. Although this approach has the potential of reducing bit rate and improving quality, it is more complex than the first alternative.
In order to realize a simpler motion estimation module, motion estimation was performed only in the LL band, as shown in Figures 4 and 5, which are interframe encoder and decoder block diagrams, respectively. A block size of 8 x 8 was chosen as a reasonable compromise between reducing the overhead for transmitting the motion vectors and obtaining more accurate motion estimates. Simulations were performed to evaluate the two alternatives, intraframe coding or interframe coding in the higher bands. It was found that interframe coding in the high bands did not yield a significant reduction in bit rate for the original HDTV sequences tested. This was due to the high-frequency noise in the image sequence which was accentuated by the frame differencing operation. The observation that the interframe coding approach could reduce the bit rate if the input images had higher signal-to-noise ratios was confirmed for the MIT sequence, where coding the frame difference signal was instrumental in decreasing the bit rate by a factor of two or more in all the bands. The U and V color difference frames will use the motion estimates obtained in the LL band. Each 8 x 8 block in the LL band corresponds spatially to one 4 x 8 chrominance block because the chrominance signal is subsampled horizontally by a factor of two as compared to the LL signal. Consequently, a horizontal displacement of 1 pixel in the LL band corresponds to 0.5 pixel displacement in the chrominance bands. To avoid the complication of interpolation, the horizontal displacement estimates from the LL band will be halved, truncated to the nearest integer, and used to compensate the U and V frames.
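A sketch of LL-band motion estimation and of the chrominance vector derivation follows. The 8 x 8 block size and the halving of the horizontal displacement come from the text; full-search block matching with a sum-of-absolute-differences criterion, the ±8 search range, truncation toward zero, and all function names are assumptions made for illustration.

```python
def sad(current, previous, top, left, dy, dx, size):
    """Sum of absolute differences between a block of `current` and the
    block of `previous` displaced by (dy, dx)."""
    return sum(
        abs(current[top + r][left + c] - previous[top + dy + r][left + dx + c])
        for r in range(size)
        for c in range(size)
    )


def estimate_motion(current, previous, top, left, size=8, search=8):
    """Full search over displacements within +/- `search` pels that keep
    the displaced block inside the previous frame."""
    height, width = len(previous), len(previous[0])
    best_vector = (0, 0)
    best_cost = sad(current, previous, top, left, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= top + dy <= height - size
                      and 0 <= left + dx <= width - size)
            if inside:
                cost = sad(current, previous, top, left, dy, dx, size)
                if cost < best_cost:
                    best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def chroma_vector(dy, dx):
    """Vector for the U and V frames: the horizontal component is halved
    and truncated toward zero; the vertical component is unchanged."""
    return dy, int(dx / 2)
```

Truncating rather than interpolating keeps the chrominance compensation on integer pel positions, at the cost of up to half a pel of horizontal error.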
By combining the coding approach with motion compensation and conditional replenishment, and other frequently used techniques, the information bit rate for digital television signals can be reduced. Applicants' experiments indicate that the motion estimation for motion compensation suffers from inaccuracy in the presence of source noise. The simple technique discussed above, in which motion detection is performed only in the LL sub-band, solves this problem. The resultant motion vector will be used to compensate all sub-bands, if needed. This also allows for the use of existing VLSI circuits designed for normal television signals. Although the motion vector detected will have ±1 pixel uncertainty, the results are generally better than those obtained using the original HDTV signals. Since most of the information is in the LL band, applying motion compensation and conditional replenishment to only the LL band will have the advantage of hardware savings at the expense of a possibly slight increase in the information bit rate. Also, the information in the high-frequency bands is generally less correlated temporally. When applying motion compensation and conditional replenishment to a combination of sub-bands, each sub-band will have its own frame memory in the encoder to derive the frame difference signals for the individual bands after motion compensation. In the preferred embodiment, the motion compensation is derived from the motion estimate of the LL band. Similarly, at the receiver, each component will need frame memory to reconstruct the current frame in its own band from the output of the look-up table.
A high-definition television coding scheme based on a newly developed simplified vector quantization (SVQ) technique used in conjunction with sub-band coding has been described above. The coding algorithm is robust to source noise and capable of providing very high quality at a range of transmission data rates practical for satellite transmission. In addition, an NTSC compatible channel is automatically available as one of the sub-bands. The decoding algorithm is extremely simple and well-suited for point-to-multipoint applications.
Recent advances in digital image coding and in very large scale integration (VLSI) technology have enabled the compression of HDTV signals to very low bit rates. The low-complexity vector quantization based HDTV coding scheme discussed above yields very high-quality HDTV transmission at bit rates of 20 Mbit/s and above.
By combining SVQ coding and sub-band coding, each sub-band can be subjected to a different type of SVQ coding to take advantage of the qualities of each sub-band. Specifically, since the higher frequency sub-bands consist primarily of edge-like signals, the SVQ coding scheme is particularly effective. The all-important low-frequency band is coded with high fidelity using the baseline codebook and additional codewords when required.
Simulation results will now be described. The sub-band SVQ coding algorithm has been simulated on HDTV motion sequences up to 100 fields long using an HP-700 workstation coupled to an in-house-developed HDTV motion sequence capture and display facility. The source is a SONY analog HDTV tape recorder which produces 1,125-line interlaced HDTV frames at a frame rate of 30 Hz. The analog HDTV luminance and chrominance signals are sampled at rates of 54 and 13.5 MHz, respectively. The revolving toy sequence has many intricate details and rapid motion. The source S/N is only about 36 dB, and the results indicate the robustness of the coding algorithm. The coded sequences are free from motion artifacts and the edges are coded with fidelity. Since the sampling rate is 54 MHz and the active-to-total pel ratio is approximately 0.76, the active sampling rate is about 41 MHz. The rate contribution due to the chrominance signals is less than 0.1 bit/pel. The MIT sequence is an artificially generated zoom and pan of several highly detailed objects. Very good quality images were obtained at less than 0.5 bit/pel for luminance with no coding or motion artifacts.
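The relationship between these figures can be checked with a short calculation. The sampling rate and active ratio come from the text; applying 0.5 bit/pel as an illustrative luminance rate at this sampling rate is an assumption, and the function name is hypothetical.

```python
SAMPLING_RATE_HZ = 54e6   # luminance sampling rate quoted in the text
ACTIVE_RATIO = 0.76       # approximate active-to-total pel ratio quoted in the text


def bit_rate_mbps(bits_per_pel, sampling_rate_hz=SAMPLING_RATE_HZ,
                  active_ratio=ACTIVE_RATIO):
    """Information bit rate in Mbit/s for a given rate in bits per active pel."""
    active_rate_hz = sampling_rate_hz * active_ratio
    return bits_per_pel * active_rate_hz / 1e6


active_mhz = SAMPLING_RATE_HZ * ACTIVE_RATIO / 1e6   # about 41 MHz
luma = bit_rate_mbps(0.5)                            # illustrative luminance rate
print(round(active_mhz, 2), round(luma, 2))
```

Under these assumptions, 0.5 bit/pel corresponds to roughly 20.5 Mbit/s, which is of the same order as the bit rates of 20 Mbit/s and above mentioned for this scheme.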
The results for intraframe coding and interframe coding using 2- and 16-tap filters are shown in Table 1.

Table 1. Results for the Sub-Band SVQ Scheme
[Table 1 is an image in the original publication: imgf000028_0001]
The invention is not to be limited by the above-described embodiments, but only by the spirit and scope of the appended claims.

Claims

WHAT IS CLAIMED IS:

1. A television signal coding method comprising the steps of:
(i) dividing an image frame into a plurality of two-dimensional blocks, each block being composed of a predetermined number of pixels;
(ii) forming a codebook containing a plurality of basic vectors corresponding to the most common patterns to which the human visual system is highly sensitive;
(iii) comparing a block with said codebook;
(iv) identifying the pattern having the closest correlation to the block;
(v) generating a digital signal including a codeword corresponding to the results of said step (iv);
(vi) repeating said steps (iii) through (v) for each of said plurality of blocks.
2. A television signal coding method comprising the steps of:
(a) dividing a video signal into its luminance and chrominance components;
(b) dividing the luminance component into a plurality of sub-bands;
(c) simultaneously and independently performing the following coding on each sub-band:
(i) dividing an image frame into a plurality of two-dimensional blocks, each block being composed of a predetermined number of pixels;
(ii) forming a codebook containing a plurality of basic vectors corresponding to the most common patterns to which the human visual system is highly sensitive;
(iii) comparing a block with said codebook;
(iv) identifying the pattern having the closest correlation to the block;
(v) generating a digital signal including a codeword corresponding to the results of said step (iv);
(vi) repeating said steps (iii) through (v) for each of said plurality of blocks;
(d) simultaneously coding said chrominance component;
(e) combining the thus-coded luminance and chrominance components into a single bit stream for transmission.
3. A method according to claim 2 wherein said plurality in step (b) is equal to four.
4. A method according to claim 2 wherein said plurality in step (c) (ii) is less than or equal to 128.
5. A method according to claim 3 wherein the sub-band having the lowest frequency range of the four sub-bands contains the majority of the information to be transmitted.
6. A method according to claim 1 wherein said predetermined number of pixels in step (c) (i) is made to be one value for some sub-bands and another value for other sub-bands.
7. A method according to claim 2 wherein said codebook in step (c) (ii) involves edge patterns having horizontal edges, vertical edges and diagonal edges.
8. A method according to claim 2 wherein said digital signal generated in step (c) (v) includes at least one segment representative of a mean value of the intensity of a portion of the pattern identified in step (c) (iv).
9. A method according to claim 2 wherein said patterns involve edges related to step changes in intensity.
10. A method according to claim 1 wherein said step (iii) involves
(a) calculating the block mean and subtracting the block mean from every pixel sample of the block;
(b) replacing the positive samples by one logic level and the negative samples by another logic level;
(c) constructing a digital word by scanning the block from left to right and top to bottom; and
(d) finding an output entry in a look-up table which corresponds to the digital word constructed in step (c).
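The four steps recited in claim 10 can be sketched in code as follows. This is an illustrative sketch, not the applicants' implementation: it reads step (a) as removing the block mean from each sample, the function names are hypothetical, the look-up table is modeled as a dictionary from digital word to codebook index, and treating zero-valued residuals as the "negative" logic level is an assumption.

```python
def block_to_word(block):
    """Steps (a) to (c): remove the mean, binarize by sign, pack bits in raster order."""
    # Raster scan: left to right within a row, rows from top to bottom.
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    word = 0
    for p in pixels:
        # Positive residual -> logic 1, otherwise logic 0 (zero handling is an assumption).
        word = (word << 1) | (1 if p - mean > 0 else 0)
    return mean, word


def encode_block(block, lookup_table, default_index=0):
    """Step (d): map the digital word to an output entry of the look-up table."""
    mean, word = block_to_word(block)
    # Hypothetical fallback: unlisted words map to default_index.
    return mean, lookup_table.get(word, default_index)


# Example: a 4 x 4 block with a vertical edge (dark left half, bright right half).
edge_block = [[0, 0, 9, 9] for _ in range(4)]
mean, word = block_to_word(edge_block)
table = {0x3333: 7}   # hypothetical entry: word 0x3333 -> codebook index 7
print(mean, hex(word), encode_block(edge_block, table))
```

For a 4 x 4 block the digital word is 16 bits long, so a full table would need 65,536 entries; the dictionary here simply stands in for such a table.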
PCT/US1993/003117 1992-04-10 1993-04-09 A coding technique for high definition television signals WO1993021734A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/866,851 US5337085A (en) 1992-04-10 1992-04-10 Coding technique for high definition television signals
US866,851 1992-04-10

Publications (1)

Publication Number Publication Date
WO1993021734A1 (en) 1993-10-28

Family

ID=25348567

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1993/003117 WO1993021734A1 (en) 1992-04-10 1993-04-09 A coding technique for high definition television signals

Country Status (4)

Country Link
US (1) US5337085A (en)
CN (1) CN1081052A (en)
AU (1) AU4277793A (en)
WO (1) WO1993021734A1 (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0502545B1 (en) * 1991-03-07 1996-09-11 Mitsubishi Denki Kabushiki Kaisha Encoding apparatus for encoding a digital image signal
EP0576763A1 (en) * 1992-06-30 1994-01-05 International Business Machines Corporation Improved method for sub-band coding video signals and device for implementing said method
US5459585A (en) * 1992-09-09 1995-10-17 Hitachi, Ltd. Apparatus and method of storing image signals
JPH06284392A (en) * 1993-03-30 1994-10-07 Toshiba Corp Video signal transmitter and receiver
JP3614448B2 (en) * 1993-07-22 2005-01-26 日本放送協会 Image signal encoding and multiplexing method and apparatus
US5929913A (en) * 1993-10-28 1999-07-27 Matsushita Electrical Industrial Co., Ltd Motion vector detector and video coder
JP3385077B2 (en) * 1993-10-28 2003-03-10 松下電器産業株式会社 Motion vector detection device
GB2287602B (en) * 1994-03-16 1998-06-03 Hyundai Electronics Ind Motion vector decoding apparatus and method
US5477110A (en) * 1994-06-30 1995-12-19 Motorola Method of controlling a field emission device
EP0704836B1 (en) * 1994-09-30 2002-03-27 Kabushiki Kaisha Toshiba Vector quantization apparatus
JP3036392B2 (en) * 1995-02-28 2000-04-24 日本電気株式会社 Motion compensator on subband signal
JP3249729B2 (en) * 1995-10-24 2002-01-21 シャープ株式会社 Image encoding device and image decoding device
US5686963A (en) * 1995-12-26 1997-11-11 C-Cube Microsystems Method for performing rate control in a video encoder which provides a bit budget for each frame while employing virtual buffers and virtual buffer verifiers
US6157746A (en) 1997-02-12 2000-12-05 Sarnoff Corporation Apparatus and method for encoding wavelet trees generated by a wavelet-based coding method
US6584226B1 (en) * 1997-03-14 2003-06-24 Microsoft Corporation Method and apparatus for implementing motion estimation in video compression
US6154216A (en) * 1997-04-30 2000-11-28 Ati Technologies Method and apparatus for decompression of a two dimensional video texture map
US6208692B1 (en) 1997-12-31 2001-03-27 Sarnoff Corporation Apparatus and method for performing scalable hierarchical motion estimation
US7158681B2 (en) * 1998-10-01 2007-01-02 Cirrus Logic, Inc. Feedback scheme for video compression system
US6625216B1 (en) 1999-01-27 2003-09-23 Matsushita Electic Industrial Co., Ltd. Motion estimation using orthogonal transform-domain block matching
FR2805429B1 (en) * 2000-02-21 2002-08-16 Telediffusion Fse DISTRIBUTED DIGITAL QUALITY CONTROL METHOD BY DETECTING FALSE CONTOURS
JP3802422B2 (en) 2000-04-24 2006-07-26 ロッキード・マーティン・コーポレイション Passive coherent positioning system and method of using the system
GB0016838D0 (en) * 2000-07-07 2000-08-30 Forbidden Technologies Plc Improvements relating to representations of compressed video
IL155513A0 (en) * 2000-10-20 2003-11-23 Lockheed Corp Civil aviation passive coherent location system and method
US7936814B2 (en) * 2002-03-28 2011-05-03 International Business Machines Corporation Cascaded output for an encoder system using multiple encoders
US8385670B2 (en) * 2008-08-20 2013-02-26 Microsoft Corporation Image restoration by vector quantization utilizing visual patterns
CN111899746B (en) * 2016-03-21 2022-10-18 华为技术有限公司 Adaptive quantization of weighting matrix coefficients

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4558350A (en) * 1982-06-11 1985-12-10 Mitsubishi Denki Kabushiki Kaisha Vector quantizer
US5136374A (en) * 1990-04-03 1992-08-04 At&T Bell Laboratories Geometric vector quantization
US5172228A (en) * 1991-11-19 1992-12-15 Utah State University Foundation Image compression method and apparatus employing distortion adaptive tree search vector quantization

Family Cites Families (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4394774A (en) * 1978-12-15 1983-07-19 Compression Labs, Inc. Digital video compression system and methods utilizing scene adaptive coding with rate buffer feedback
US4477829A (en) * 1981-08-21 1984-10-16 Institut Kosmicheskikh Issledovany Akademii Nauk Sssr Method of obtaining multizone images of objects and multizone system therefor
US4493105A (en) * 1982-03-31 1985-01-08 General Electric Company Method and apparatus for visual image processing
DE3239273A1 (en) * 1982-10-23 1984-04-26 Bayer Ag, 5090 Leverkusen TETRAHYDROPYRIDINE, METHOD FOR THE PRODUCTION THEREOF AND THEIR USE IN MEDICINAL PRODUCTS
DE3317115A1 (en) * 1983-05-10 1984-11-15 Siemens AG, 1000 Berlin und 8000 München METHOD FOR TRANSMITTING DIGITAL LUMINANCE AND CHROMINANCE SIGNALS FROM TELEVISION
US4670851A (en) * 1984-01-09 1987-06-02 Mitsubishi Denki Kabushiki Kaisha Vector quantizer
JPS60186179A (en) * 1984-03-06 1985-09-21 Nec Corp System and device for predictive coding of picture signal
JPH0772769B2 (en) * 1984-09-26 1995-08-02 ソニー株式会社 Solid-state imaging device
EP0193185B1 (en) * 1985-02-28 1992-05-13 Mitsubishi Denki Kabushiki Kaisha Interframe adaptive vector quantization encoding apparatus
CA1251276A (en) * 1985-03-20 1989-03-14 Toshio Koga Method and arrangement of coding digital image signals utilizing interframe correlation
US4665436A (en) * 1985-12-20 1987-05-12 Osborne Joseph A Narrow bandwidth signal transmission
JP2506332B2 (en) * 1986-03-04 1996-06-12 国際電信電話株式会社 High-efficiency coding method for moving image signals
US4698689A (en) * 1986-03-28 1987-10-06 Gte Laboratories Incorporated Progressive image transmission
DE3774314D1 (en) * 1986-04-04 1991-12-12 Siemens Ag METHOD FOR REDUCING DATA FROM DIGITAL IMAGE SIGNALS BY VECTOR QUANTIZATION OF COEFFICIENTS GIVEN BY ORTHONORMAL TRANSFORMATION BY MEANS OF A SYMMETRICAL FAST-CYCLE HADAMARD MATRIX.
US4947447A (en) * 1986-04-24 1990-08-07 Hitachi, Ltd. Method for data coding
KR910000707B1 (en) * 1986-05-26 1991-01-31 미쓰비시덴기 가부시기가이샤 Method and apparatus for encoding transmitting
US4887151A (en) * 1986-06-30 1989-12-12 Canon Kabushiki Kaisha Encoding apparatus for color image data with block-by-block individual quantizing or individual encoding of luminosity, structure, and color information
US4704628A (en) * 1986-07-16 1987-11-03 Compression Labs, Inc. Combined intraframe and interframe transform coding system
WO1988002975A1 (en) * 1986-10-16 1988-04-21 Mitsubishi Denki Kabushiki Kaisha Amplitude-adapted vector quantizer
US4920426A (en) * 1986-11-10 1990-04-24 Kokusai Denshin Denwa Co., Ltd. Image coding system coding digital image signals by forming a histogram of a coefficient signal sequence to estimate an amount of information
GB8627787D0 (en) * 1986-11-20 1986-12-17 British Telecomm Pattern processing
CA1261069A (en) * 1986-12-08 1989-09-26 Mohamed S. Sabri Two-channel coding of digital signals
DE3856461T2 (en) * 1987-04-28 2001-10-31 Mitsubishi Electric Corp Image coding and decoding system
DE3855114D1 (en) * 1987-05-06 1996-04-25 Philips Patentverwaltung System for the transmission of video images
US4791654A (en) * 1987-06-05 1988-12-13 American Telephone And Telegraph Company, At&T Bell Laboratories Resisting the effects of channel noise in digital transmission of information
US4969039A (en) * 1987-07-01 1990-11-06 Nec Corporation Image processing system operable in cooperation with a recording medium
DE3877105D1 (en) * 1987-09-30 1993-02-11 Siemens Ag, 8000 Muenchen, De
US4868653A (en) * 1987-10-05 1989-09-19 Intel Corporation Adaptive digital video compression system
GB8724789D0 (en) * 1987-10-19 1987-11-25 British Telecomm Signal coding
EP0330455A3 (en) * 1988-02-22 1990-07-04 Kabushiki Kaisha Toshiba Image encoding apparatus
US4821119A (en) * 1988-05-04 1989-04-11 Bell Communications Research, Inc. Method and apparatus for low bit-rate interframe video coding
US5010401A (en) * 1988-08-11 1991-04-23 Mitsubishi Denki Kabushiki Kaisha Picture coding and decoding apparatus using vector quantization
US4953023A (en) * 1988-09-29 1990-08-28 Sony Corporation Coding apparatus for encoding and compressing video data
DE68922578T2 (en) * 1988-11-08 1996-01-11 Philips Electronics Nv Encoding, decoding and transmission system for television pictures.
FR2639739B1 (en) * 1988-11-25 1991-03-15 Labo Electronique Physique METHOD AND DEVICE FOR COMPRESSING IMAGE DATA USING A NEURON NETWORK
US4910608A (en) * 1988-12-09 1990-03-20 Harris Corporation Imagery data compression mechanism employing adaptive vector quantizer
US5003377A (en) * 1989-01-12 1991-03-26 Massachusetts Institute Of Technology Extended definition television systems
JPH0797753B2 (en) * 1989-01-24 1995-10-18 日本ビクター株式会社 Encoding output data amount control method
US4979039A (en) * 1989-01-30 1990-12-18 Information Technologies Research Inc. Method and apparatus for vector quantization by hashing
JPH02301280A (en) * 1989-05-15 1990-12-13 Nec Corp Coding/decoding system for moving picture signal
US4987480A (en) * 1989-07-11 1991-01-22 Massachusetts Institute Of Technology Multiscale coding of images
US4963030A (en) * 1989-11-29 1990-10-16 California Institute Of Technology Distributed-block vector quantization coder
US5021891A (en) * 1990-02-27 1991-06-04 Qualcomm, Inc. Adaptive block size image compression method and system
US5030953A (en) * 1990-07-11 1991-07-09 Massachusetts Institute Of Technology Charge domain block matching processor
US5134475A (en) * 1990-12-11 1992-07-28 At&T Bell Laboratories Adaptive leak hdtv encoder

Also Published As

Publication number Publication date
US5337085A (en) 1994-08-09
AU4277793A (en) 1993-11-18
CN1081052A (en) 1994-01-19

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP KR

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA