Publication number: US 20030156646 A1
Publication type: Application
Application number: US 10/322,351
Publication date: Aug 21, 2003
Filing date: Dec 17, 2002
Priority date: Dec 17, 2001
Also published as: CN101448162A, CN101448162B, US7116830, US7120197, US7263232, US7266149, US7577305, US8743949, US8817868, US8908768, US9258570, US9432686, US9456216, US20030138150, US20030152146, US20030156648, US20060126955, US20060215919, US20080049834, US20130301704, US20130301732, US20140307776, US20140334534, US20150063459, US20160227215, US20160366443, US20160373780
Inventors: Pohsiang Hsu, Chih-Lung Lin, Ming-Chieh Lee
Original assignee: Microsoft Corporation
Multi-resolution motion estimation and compensation
US 20030156646 A1
Abstract
Techniques and tools for motion estimation and compensation are described. For example, a video encoder adaptively switches between different motion resolutions, which allows the encoder to select a suitable resolution for a particular video source or coding circumstances.
Claims (37)
We claim:
1. In a computer system, a computer-implemented method of exploiting temporal redundancy between plural video frames, the method comprising:
selecting a fractional pixel motion resolution from among plural different fractional pixel motion resolutions, wherein each of the plural different fractional pixel motion resolutions is less than single integer-pixel motion resolution, and wherein each of the plural different fractional pixel motion resolutions is associated with a different reference frame sub-pixel interpolation technique; and
applying one or more motion vectors at the selected fractional pixel motion resolution to predict one or more pixels in a current frame of the plural video frames relative to one or more corresponding pixels in a reference frame of the plural video frames.
2. The method of claim 1 wherein the plural different fractional pixel motion resolutions include a quarter-pixel motion resolution.
3. The method of claim 1 wherein each of the one or more motion vectors is for a macroblock.
4. The method of claim 1 wherein the selecting occurs on a per frame basis.
5. The method of claim 1 wherein the selecting occurs on a per sequence basis.
6. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 1 during video encoding.
7. The method of claim 1 wherein a video encoder performs the selecting based upon evaluation of the plural different fractional pixel motion resolutions.
8. The method of claim 1 wherein the selecting depends at least in part on a quantization factor.
9. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 1 during video decoding.
10. The method of claim 1 wherein a video decoder performs the selecting based upon information received from the encoder.
11. A computer-readable medium storing computer-executable instructions for causing a computer system programmed thereby to perform a method of exploiting temporal redundancy between video frames from a video source during video encoding, the method comprising:
selecting a motion resolution from among plural different motion resolutions, wherein the selecting depends at least in part upon evaluation of the plural different motion resolutions, wherein the evaluation includes a first motion estimation for a first motion resolution of the plural different motion resolutions and a second motion estimation for a second motion resolution of the plural different motion resolutions, and wherein the selecting further depends at least in part upon noise level of the video frames from the video source; and
applying motion information at the selected motion resolution to predict one or more pixels in a current frame of the plural video frames relative to one or more corresponding pixels in a reference frame of the plural video frames.
12. The computer-readable medium of claim 11 wherein the evaluation is a closed loop evaluation of the plural different motion resolutions.
13. The computer-readable medium of claim 11 wherein the evaluation is an open loop evaluation of the plural different motion resolutions.
14. The computer-readable medium of claim 11 wherein the selecting further depends at least in part upon a quantization factor.
15. A computer-readable medium storing computer-executable instructions for causing a computer system programmed thereby to perform a method of exploiting temporal redundancy between video frames during video encoding, the method comprising:
selecting a motion resolution from among plural different motion resolutions after different motion estimation for each of the plural different motion resolutions, wherein the selecting depends at least in part upon a quantization factor; and
applying motion information at the selected motion resolution to predict one or more pixels in a current frame of the plural video frames relative to one or more corresponding pixels in a reference frame of the plural video frames.
16. A computer-readable medium storing computer-executable instructions for causing a computer system programmed thereby to perform a method of exploiting temporal redundancy between video frames, the method comprising:
in a first evaluation for a first motion resolution, computing intermediate motion evaluation results for the first motion resolution; and
in a second evaluation for a second motion resolution, using the intermediate motion evaluation results in computation of final motion evaluation results for the second motion resolution.
17. The computer-readable medium of claim 16 wherein the first motion resolution is a half-pixel resolution, and wherein the second motion resolution is a quarter-pixel resolution.
18. The computer-readable medium of claim 16 wherein the intermediate motion evaluation results include an integer-pixel motion vector used as an intermediate motion vector result for the first motion resolution and for the second motion resolution.
19. The computer-readable medium of claim 18 wherein an encoder uses the integer-pixel motion vector when the integer-pixel motion vector falls within a search range for the second evaluation.
20. The computer-readable medium of claim 16 wherein the intermediate motion evaluation results include interpolated pixel values at half-pixel locations.
21. In a computer system, a computer-implemented method of exploiting temporal redundancy between plural video frames, the method comprising:
computing a pixel value at each of plural half-pixel sample positions in a reference frame of the plural video frames, the reference frame including plural pixel values at integer-pixel sample positions organized by row and column, wherein:
for each of the plural half-pixel sample positions in either an integer-pixel row or an integer-pixel column, the computed pixel value is a function of the pixel values at at least four integer-pixel sample positions, and
for each of the plural half-pixel sample positions in both a half-pixel row and a half-pixel column, the computed pixel value is a function of the pixel values at at least four half-pixel sample positions; and
computing a pixel value at each of plural quarter-pixel sample positions in the reference frame, wherein:
for each of the plural quarter-pixel sample positions, the computed pixel value is a function of the pixel values at two sample positions adjacent the quarter-pixel sample position and on opposite sides of the quarter-pixel sample position, wherein each of the two sample positions is either an integer-pixel sample position or a half-pixel sample position.
22. The method of claim 21 wherein the pixel values are luminance values.
23. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 21 during video encoding.
24. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 21 during video decoding.
25. In a computer system, a computer-implemented method of exploiting temporal redundancy between plural video frames, the method comprising:
defining a motion prediction range in a reference frame of the plural video frames, wherein the defined motion prediction range has a horizontal motion resolution and a vertical motion resolution, and wherein the horizontal motion resolution is different than the vertical motion resolution; and
applying motion information for one or more pixels of a current frame of the plural video frames relative to one or more corresponding pixels in the defined motion prediction range in the reference frame.
26. The method of claim 25 wherein the horizontal motion resolution is finer than the vertical motion resolution.
27. The method of claim 26 wherein the horizontal motion resolution is quarter pixel and the vertical motion resolution is half pixel.
28. The method of claim 25 further comprising computing the motion information in a video encoder, wherein the video encoder interpolates pixel values at different numbers of sub-pixel locations horizontally and vertically.
29. The method of claim 25 wherein the motion information comprises motion vector information with different horizontal and vertical component resolutions.
30. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 25 during video encoding.
31. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 25 during video decoding.
32. In a computer system, a computer-implemented method of exploiting temporal redundancy between plural video frames, the method comprising:
defining a motion prediction range in a reference frame of the plural video frames, including using plural different sub-pixel interpolation filters in the reference frame; and
applying motion information for one or more pixels of a current frame of the plural video frames relative to one or more corresponding pixels in the defined motion prediction range in the reference frame, wherein the motion information includes a horizontal motion component and a vertical motion component, and wherein motion resolution of the vertical motion component is different than motion resolution of the horizontal motion component.
33. The method of claim 32 wherein an extra bit associated with the motion information in a bitstream increases the motion resolution of the horizontal motion component by a factor of 2.
34. The method of claim 32 further comprising checking whether differential motion resolution is enabled for the current frame and if so performing the applying.
35. The method of claim 32 further comprising checking whether differential motion resolution applies for the motion information and if so performing the applying.
36. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 32 during video encoding.
37. A computer-readable medium storing computer-executable instructions for causing the computer system to perform the method of claim 32 during video decoding.
Description
    RELATED APPLICATION INFORMATION
  • [0001]
    The present application claims the benefit of U.S. Provisional Patent Application Serial No. 60/341,674, entitled “Techniques and Tools for Video Encoding and Decoding,” filed Dec. 17, 2001, the disclosure of which is incorporated by reference. The following concurrently filed U.S. patent applications relate to the present application: 1) U.S. patent application Ser. No. aa/bbb,ccc, entitled, “Motion Compensation Loop With Filtering,” filed concurrently herewith; 2) U.S. patent application Ser. No. aa/bbb,ccc, entitled, “Spatial Extrapolation of Pixel Values in Intraframe Video Coding and Decoding,” filed concurrently herewith; and 3) U.S. patent application Ser. No. aa/bbb,ccc, entitled, “Sub-Block Transform Coding of Prediction Residuals,” filed concurrently herewith.
  • TECHNICAL FIELD
  • [0002]
    Techniques and tools for motion estimation and compensation are described. For example, a video encoder adaptively switches between different motion resolutions, which allows the encoder to select a suitable resolution for a particular video source or coding circumstances.
  • BACKGROUND
  • [0003]
    Digital video consumes large amounts of storage and transmission capacity. A typical raw digital video sequence includes 15 or 30 frames per second. Each frame can include tens or hundreds of thousands of pixels (also called pels). Each pixel represents a tiny element of the picture. In raw form, a computer commonly represents a pixel with 24 bits. Thus, the number of bits per second, or bitrate, of a typical raw digital video sequence can be 5 million bits/second or more.
  • [0004]
    Most computers and computer networks lack the resources to process raw digital video. For this reason, engineers use compression (also called coding or encoding) to reduce the bitrate of digital video. Compression can be lossless, in which quality of the video does not suffer but decreases in bitrate are limited by the complexity of the video. Or, compression can be lossy, in which quality of the video suffers but decreases in bitrate are more dramatic. Decompression reverses compression.
  • [0005]
In general, video compression techniques include intraframe compression and interframe compression. Intraframe compression techniques compress individual frames, typically called I-frames or key frames. Interframe compression techniques compress frames with reference to preceding and/or following frames; such frames are typically called predicted frames, P-frames, or B-frames.
  • [0006]
    Microsoft Corporation's Windows Media Video, Version 7 [“WMV7”] includes a video encoder and a video decoder. The WMV7 encoder uses intraframe and interframe compression, and the WMV7 decoder uses intraframe and interframe decompression.
  • [0007]
    A. Intraframe Compression in WMV7
  • [0008]
    FIG. 1 illustrates block-based intraframe compression (100) of a block (105) of pixels in a key frame in the WMV7 encoder. A block is a set of pixels, for example, an 8×8 arrangement of pixels. The WMV7 encoder splits a key video frame into 8×8 blocks of pixels and applies an 8×8 Discrete Cosine Transform [“DCT”] (110) to individual blocks such as the block (105). A DCT is a type of frequency transform that converts the 8×8 block of pixels (spatial information) into an 8×8 block of DCT coefficients (115), which are frequency information. The DCT operation itself is lossless or nearly lossless. Compared to the original pixel values, however, the DCT coefficients are more efficient for the encoder to compress since most of the significant information is concentrated in low frequency coefficients (conventionally, the upper left of the block (115)) and many of the high frequency coefficients (conventionally, the lower right of the block (115)) have values of zero or close to zero.
  • [0009]
    The encoder then quantizes (120) the DCT coefficients, resulting in an 8×8 block of quantized DCT coefficients (125). For example, the encoder applies a uniform, scalar quantization step size to each coefficient, which is analogous to dividing each coefficient by the same value and rounding. For example, if a DCT coefficient value is 163 and the step size is 10, the quantized DCT coefficient value is 16. Quantization is lossy. The reconstructed DCT coefficient value will be 160, not 163. Since low frequency DCT coefficients tend to have higher values, quantization results in loss of precision but not complete loss of the information for the coefficients. On the other hand, since high frequency DCT coefficients tend to have values of zero or close to zero, quantization of the high frequency coefficients typically results in contiguous regions of zero values. In addition, in some cases high frequency DCT coefficients are quantized more coarsely than low frequency DCT coefficients, resulting in greater loss of precision/information for the high frequency DCT coefficients.
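    For example, a minimal sketch of the uniform scalar quantization described above (assuming truncation toward zero; actual rounding rules vary by implementation):

        def quantize(coeff: int, step: int) -> int:
            # Analogous to dividing each coefficient by the step size.
            return int(coeff / step)

        def dequantize(level: int, step: int) -> int:
            return level * step

        level = quantize(163, 10)      # 16
        recon = dequantize(level, 10)  # 160, not 163: quantization is lossy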
  • [0010]
    The encoder then prepares the 8×8 block of quantized DCT coefficients (125) for entropy encoding, which is a form of lossless compression. The exact type of entropy encoding can vary depending on whether a coefficient is a DC coefficient (lowest frequency), an AC coefficient (other frequencies) in the top row or left column, or another AC coefficient.
  • [0011]
    The encoder encodes the DC coefficient (126) as a differential from the DC coefficient (136) of a neighboring 8×8 block, which is a previously encoded neighbor (e.g., top or left) of the block being encoded. (FIG. 1 shows a neighbor block (135) that is situated to the left of the block being encoded in the frame.) The encoder entropy encodes (140) the differential.
  • [0012]
    The entropy encoder can encode the left column or top row of AC coefficients as a differential from a corresponding column or row of the neighboring 8×8 block. FIG. 1 shows the left column (127) of AC coefficients encoded as a differential (147) from the left column (137) of the neighboring (to the left) block (135). The differential coding increases the chance that the differential coefficients have zero values. The remaining AC coefficients are from the block (125) of quantized DCT coefficients.
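    A minimal sketch of this coefficient prediction, assuming prediction from the left neighbor only (handling of top neighbors and predictor selection is omitted):

        import numpy as np

        def predict_from_left(block: np.ndarray, left: np.ndarray) -> np.ndarray:
            # block, left: 8x8 arrays of quantized DCT coefficients.
            pred = block.copy()
            pred[0, 0] = block[0, 0] - left[0, 0]      # DC differential
            pred[1:, 0] = block[1:, 0] - left[1:, 0]   # left-column AC differentials
            return pred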
  • [0013]
    The encoder scans (150) the 8×8 block (145) of predicted, quantized AC DCT coefficients into a one-dimensional array (155) and then entropy encodes the scanned AC coefficients using a variation of run length coding (160). The encoder selects an entropy code from one or more run/level/last tables (165) and outputs the entropy code.
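    A hedged sketch of the scan and run/level/last stage; the zigzag order below is a common pattern and not necessarily the exact WMV7 scan, and the entropy code table lookup is omitted:

        def zigzag_order(n: int = 8):
            # Walk the anti-diagonals of an n x n block, alternating direction.
            order = []
            for s in range(2 * n - 1):
                diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
                order.extend(diag if s % 2 else diag[::-1])
            return order

        def run_level_last(block):
            # block: 8x8 list of quantized coefficients -> (run, level, last) triples.
            scanned = [block[i][j] for i, j in zigzag_order()]
            nonzero = [k for k, v in enumerate(scanned) if v]
            triples, run = [], 0
            for k, v in enumerate(scanned):
                if v == 0:
                    run += 1
                else:
                    triples.append((run, v, k == nonzero[-1]))
                    run = 0
            return triples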
  • [0014]
    A key frame contributes much more to bitrate than a predicted frame. In low or mid-bitrate applications, key frames are often bottlenecks for performance, so efficient compression of key frames is critical.
  • [0015]
    FIG. 2 illustrates a disadvantage of intraframe compression such as shown in FIG. 1. In particular, exploitation of redundancy between blocks of the key frame is limited to prediction of a subset of frequency coefficients (e.g., the DC coefficient and the left column (or top row) of AC coefficients) from the left (220) or top (230) neighboring block of a block (210). The DC coefficient represents the average of the block, the left column of AC coefficients represents the averages of the rows of a block, and the top row represents the averages of the columns. In effect, prediction of DC and AC coefficients as in WMV7 limits extrapolation to the row-wise (or column-wise) average signals of the left (or top) neighboring block. For a particular row (221) in the left block (220), the AC coefficients in the left DCT coefficient column for the left block (220) are used to predict the entire corresponding row (211) of the block (210). The disadvantages of this prediction include:
  • [0016]
    1) Since the prediction is based on averages, the far edge of the neighboring block has the same influence on the predictor as the adjacent edge of the neighboring block, whereas intuitively the far edge should have a smaller influence.
  • [0017]
    2) Only the average pixel value across the row (or column) is extrapolated.
  • [0018]
    3) Diagonally oriented edges or lines that propagate from either predicting block (top or left) to the current block are not predicted adequately.
  • [0019]
    4) When the predicting block is to the left, there is no enforcement of continuity between the last row of the top block and the first row of the extrapolated block.
  • [0020]
    B. Interframe Compression in WMV7
  • [0021]
    Interframe compression in the WMV7 encoder uses block-based motion compensated prediction coding followed by transform coding of the residual error.
  • [0022]
    FIGS. 3 and 4 illustrate the block-based interframe compression for a predicted frame in the WMV7 encoder. In particular, FIG. 3 illustrates motion estimation for a predicted frame (310) and FIG. 4 illustrates compression of a prediction residual for a motion-estimated block of a predicted frame.
  • [0023]
    The WMV7 encoder splits a predicted frame into 8×8 blocks of pixels. Groups of four 8×8 blocks form macroblocks. For each macroblock, a motion estimation process is performed. The motion estimation approximates the motion of the macroblock of pixels relative to a reference frame, for example, a previously coded, preceding frame. In FIG. 3, the WMV7 encoder computes a motion vector for a macroblock (315) in the predicted frame (310). To compute the motion vector, the encoder searches in a search area (335) of a reference frame (330). Within the search area (335), the encoder compares the macroblock (315) from the predicted frame (310) to various candidate macroblocks in order to find a candidate macroblock that is a good match. The encoder can check candidate macroblocks every pixel or every ½ pixel in the search area (335), depending on the desired motion estimation resolution for the encoder. Other video encoders check at other increments, for example, every ¼ pixel. For a candidate macroblock, the encoder checks the difference between the macroblock (315) of the predicted frame (310) and the candidate macroblock and the cost of encoding the motion vector for that macroblock. After the encoder finds a good matching macroblock, the block matching process ends. The encoder outputs the motion vector (entropy coded) for the matching macroblock so the decoder can find the matching macroblock during decoding. When decoding the predicted frame (310), a decoder uses the motion vector to compute a prediction macroblock for the macroblock (315) using information from the reference frame (330). The prediction for the macroblock (315) is rarely perfect, so the encoder usually encodes 8×8 blocks of pixel differences (also called the error or residual blocks) between the prediction macroblock and the macroblock (315) itself.
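    An illustrative integer-pixel block match by sum of absolute differences (SAD); a real encoder also weighs the cost of coding the motion vector and may refine the match to half- or quarter-pixel positions, as described above:

        import numpy as np

        def motion_search(cur_mb, ref, top, left, search=16):
            # cur_mb: macroblock from the predicted frame; ref: reference frame.
            h, w = cur_mb.shape
            best_sad, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = top + dy, left + dx
                    if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                        continue
                    sad = int(np.abs(cur_mb.astype(int) - ref[y:y+h, x:x+w].astype(int)).sum())
                    if best_sad is None or sad < best_sad:
                        best_sad, best_mv = sad, (dx, dy)
            return best_mv, best_sad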
  • [0024]
    Motion estimation and compensation are effective compression techniques, but various previous motion estimation/compensation techniques (as in WMV7 and elsewhere) have several disadvantages, including:
  • [0025]
    1) The resolution of the motion estimation (i.e., pixel, ½ pixel, ¼ pixel increments) does not adapt to the video source. For example, for different qualities of video source (clean vs. noisy), the video encoder uses the same resolution of motion estimation, which can hurt compression efficiency.
  • [0026]
    2) For ¼ pixel motion estimation, the search strategy fails to adequately exploit previously completed computations to speed up searching.
  • [0027]
    3) For ¼ pixel motion estimation, the search range is too large and inefficient. In particular, the horizontal resolution is the same as the vertical resolution in the search range, which does not match the motion characteristics of many video signals.
  • [0028]
    4) For ¼ pixel motion estimation, the representation of motion vectors is inefficient to the extent that bit allocation for horizontal movement is the same as bit allocation for vertical movement.
  • [0029]
    FIG. 4 illustrates the computation and encoding of an error block (435) for a motion-estimated block in the WMV7 encoder. The error block (435) is the difference between the predicted block (415) and the original current block (425). The encoder applies a DCT (440) to the error block (435), resulting in an 8×8 block (445) of coefficients. Even more than was the case with DCT coefficients for pixel values, the significant information for the error block (435) is concentrated in low frequency coefficients (conventionally, the upper left of the block (445)) and many of the high frequency coefficients have values of zero or close to zero (conventionally, the lower right of the block (445)).
  • [0030]
    The encoder then quantizes (450) the DCT coefficients, resulting in an 8×8 block of quantized DCT coefficients (455). The quantization step size is adjustable. Again, since low frequency DCT coefficients tend to have higher values, quantization results in loss of precision, but not complete loss of the information for the coefficients. On the other hand, since high frequency DCT coefficients tend to have values of zero or close to zero, quantization of the high frequency coefficients results in contiguous regions of zero values. In addition, in some cases high frequency DCT coefficients are quantized more coarsely than low frequency DCT coefficients, resulting in greater loss of precision/information for the high frequency DCT coefficients.
  • [0031]
    The encoder then prepares the 8×8 block (455) of quantized DCT coefficients for entropy encoding. The encoder scans (460) the 8×8 block (455) into a one-dimensional array (465) with 64 elements, such that coefficients are generally ordered from lowest frequency to highest frequency, which typically creates long runs of zero values.
  • [0032]
    The encoder entropy encodes the scanned coefficients using a variation of run length coding (470). The encoder selects an entropy code from one or more run/level/last tables (475) and outputs the entropy code.
  • [0033]
    FIG. 5 shows the decoding process (500) for an inter-coded block. Due to the quantization of the DCT coefficients, the reconstructed block (575) is not identical to the corresponding original block. The compression is lossy.
  • [0034]
    In summary of FIG. 5, a decoder decodes (510, 520) entropy-coded information representing a prediction residual using variable length decoding and one or more run/level/last tables (515). The decoder inverse scans (530) a one-dimensional array (525) storing the entropy-decoded information into a two-dimensional block (535). The decoder inverse quantizes and inverse discrete cosine transforms (together, 540) the data, resulting in a reconstructed error block (545). In a separate path, the decoder computes a predicted block (565) using motion vector information (555) for displacement from a reference frame. The decoder combines (570) the predicted block (565) with the reconstructed error block (545) to form the reconstructed block (575).
  • [0035]
    The amount of change between the original and reconstructed frame is termed the distortion, and the number of bits required to code the frame is termed the rate. The amount of distortion is roughly inversely proportional to the rate. In other words, coding a frame with fewer bits (greater compression) will result in greater distortion, and vice versa. One of the goals of a video compression scheme is to try to improve the rate-distortion performance, in other words, to achieve the same distortion using fewer bits (or lower distortion using the same number of bits).
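    One common way to operationalize this trade-off, offered here as background rather than as part of the disclosure, is a Lagrangian cost J = D + λR, where the option with the lowest combined cost wins:

        def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
            # Lagrangian rate-distortion cost: J = D + lambda * R.
            return distortion + lam * rate_bits

        # Illustrative (distortion, bits) pairs for two coding options.
        options = [(1200.0, 340.0), (900.0, 520.0)]
        best = min(options, key=lambda o: rd_cost(o[0], o[1], lam=0.85))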
  • [0036]
    Compression of prediction residuals as in WMV7 can dramatically reduce bitrate while slightly or moderately affecting quality, but the compression technique is less than optimal in some circumstances. The size of the frequency transform is the size of the prediction residual block (e.g., an 8×8 DCT for an 8×8 prediction residual). In some circumstances, this fails to exploit localization of error within the prediction residual block.
  • [0037]
    C. Post-processing with a Deblocking Filter in WMV7
  • [0038]
    For block-based video compression and decompression, quantization and other lossy processing stages introduce distortion that commonly shows up as blocky artifacts—perceptible discontinuities between blocks.
  • [0039]
    To reduce the perceptibility of blocky artifacts, the WMV7 decoder can process reconstructed frames with a deblocking filter. The deblocking filter smoothes the boundaries between blocks.
  • [0040]
    While the deblocking filter in WMV7 improves perceived video quality, it has several disadvantages. For example, the smoothing occurs only on reconstructed output in the decoder. Therefore, prediction processes such as motion estimation cannot take advantage of the smoothing. Moreover, the smoothing by the post-processing filter can be too extreme.
  • [0041]
    D. Standards for Video Compression and Decompression
  • [0042]
    Aside from WMV7, several international standards relate to video compression and decompression. These standards include the Moving Picture Experts Group [“MPEG”] 1, 2, and 4 standards and the H.261, H.262, and H.263 standards from the International Telecommunication Union [“ITU”]. Like WMV7, these standards use a combination of intraframe and interframe compression, although the standards typically differ from WMV7 in the details of the compression techniques used. For additional detail about the standards, see the standards' specifications themselves.
  • [0043]
    Given the critical importance of video compression and decompression to digital video, it is not surprising that video compression and decompression are richly developed fields. Whatever the benefits of previous video compression and decompression techniques, however, they do not have the advantages of the following techniques and tools.
  • SUMMARY
  • [0044]
    In summary, the detailed description is directed to various techniques and tools for motion estimation and compensation. These techniques and tools address several of the disadvantages of motion estimation and compensation according to the prior art. The various techniques and tools can be used in combination or independently.
  • [0045]
    According to a first set of techniques and tools, a video encoder adaptively switches between multiple different motion resolutions, which allows the encoder to select a suitable resolution for a particular video source or coding circumstances. For example, the encoder adaptively switches between pixel, half-pixel, and quarter-pixel resolutions. The encoder can switch based upon a closed-loop decision involving actual coding with the different options, or based upon an open-loop estimation. The encoder switches resolutions on a frame-by-frame basis or other basis.
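    A hedged sketch of the closed-loop variant: encode the frame at each candidate resolution and keep the cheaper result. encode_frame() is a hypothetical helper returning (bits, distortion), and the Lagrangian criterion is an assumption:

        def select_resolution(frame, ref, encode_frame, lam=0.85):
            best_res, best_cost = None, float("inf")
            for res in ("half", "quarter"):
                bits, dist = encode_frame(frame, ref, resolution=res)
                cost = dist + lam * bits      # closed loop: actual coding results
                if cost < best_cost:
                    best_res, best_cost = res, cost
            return best_res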
  • [0046]
    According to a second set of techniques and tools, a video encoder uses previously computed results from a first resolution motion estimation to speed up another resolution motion estimation. For example, in some circumstances, the encoder searches for a quarter-pixel motion vector around an integer-pixel motion vector that was also used in half-pixel motion estimation. Or, the encoder uses previously computed half-pixel location values in computation of quarter-pixel location values.
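    A minimal sketch of that reuse: the integer-pixel vector found in the first evaluation seeds the sub-pixel refinement, so the full-range integer search is not repeated. sad_at() is a hypothetical cost function that evaluates a (possibly fractional) displacement, reusing cached half-pixel interpolations where available:

        def refine_subpel(int_mv, sad_at):
            best_mv = int_mv                      # reuse the integer-pixel result
            best = sad_at(best_mv)
            for step in (0.5, 0.25):              # half-, then quarter-pixel ring
                cx, cy = best_mv
                for dx in (-step, 0.0, step):
                    for dy in (-step, 0.0, step):
                        cost = sad_at((cx + dx, cy + dy))
                        if cost < best:
                            best, best_mv = cost, (cx + dx, cy + dy)
            return best_mv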
  • [0047]
    According to a third set of techniques and tools, a video encoder uses a search range with different directional resolutions. This allows the encoder and decoder to place greater emphasis on directions likely to have more motion, and to eliminate the calculation of numerous sub-pixel values in the search range. For example, the encoder uses a search range with quarter-pixel increments and resolution horizontally, and half-pixel increments and resolution vertically. The search range is effectively one quarter the size of a full quarter-by-quarter-pixel search range, and the encoder eliminates calculation of many of the quarter-pixel location points.
  • [0048]
    According to a fourth set of techniques and tools, a video encoder uses a motion vector representation with different bit allocation for horizontal and vertical motion. This allows the encoder to reduce bitrate by eliminating resolution that is less essential to quality. For example, the encoder represents a quarter-pixel motion vector by adding 1 bit to a half-pixel motion vector code to indicate a corresponding quarter-pixel location.
  • [0049]
    Additional features and advantages will be made apparent from the following detailed description of different embodiments that proceeds with reference to the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • [0050]
    FIG. 1 is a diagram showing block-based intraframe compression of an 8×8 block of pixels according to the prior art.
  • [0051]
    FIG. 2 is a diagram showing prediction of frequency coefficients according to the prior art.
  • [0052]
    FIG. 3 is a diagram showing motion estimation in a video encoder according to the prior art.
  • [0053]
    FIG. 4 is a diagram showing block-based interframe compression for an 8×8 block of prediction residuals in a video encoder according to the prior art.
  • [0054]
    FIG. 5 is a diagram showing block-based interframe decompression for an 8×8 block of prediction residuals according to the prior art.
  • [0055]
    FIG. 6 is a block diagram of a suitable computing environment in which several described embodiments may be implemented.
  • [0056]
    FIG. 7 is a block diagram of a generalized video encoder system used in several described embodiments.
  • [0057]
    FIG. 8 is a block diagram of a generalized video decoder system used in several described embodiments.
  • [0058]
    FIG. 9 is a flowchart showing a technique for selecting a motion estimation resolution for a predicted frame in a video encoder.
  • [0059]
    FIGS. 10a and 10b are flowcharts showing techniques for computing and evaluating motion vectors of a predicted frame in a video encoder.
  • [0060]
    FIG. 11 is a chart showing search locations for sub-pixel motion estimation.
  • [0061]
    FIG. 12 is a chart showing sub-pixel locations with values computed by interpolation in sub-pixel motion estimation.
  • [0062]
    FIG. 13 is a flowchart showing a technique for entropy decoding motion vectors of different resolutions in a video decoder.
  • DETAILED DESCRIPTION
  • [0063]
    The present application relates to techniques and tools for video encoding and decoding. In various described embodiments, a video encoder incorporates techniques that improve the efficiency of interframe coding, a video decoder incorporates techniques that improve the efficiency of interframe decoding, and a bitstream format includes flags and other codes to incorporate the techniques.
  • [0064]
    The various techniques and tools can be used in combination or independently. Different embodiments implement one or more of the described techniques and tools.
  • I. Computing Environment
  • [0065]
    FIG. 6 illustrates a generalized example of a suitable computing environment (600) in which several of the described embodiments may be implemented. The computing environment (600) is not intended to suggest any limitation as to scope of use or functionality, as the techniques and tools may be implemented in diverse general-purpose or special-purpose computing environments.
  • [0066]
    With reference to FIG. 6, the computing environment (600) includes at least one processing unit (610) and memory (620). In FIG. 6, this most basic configuration (630) is included within a dashed line. The processing unit (610) executes computer-executable instructions and may be a real or a virtual processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. The memory (620) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two. The memory (620) stores software (680) implementing a video encoder or decoder.
  • [0067]
    A computing environment may have additional features. For example, the computing environment (600) includes storage (640), one or more input devices (650), one or more output devices (660), and one or more communication connections (670). An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing environment (600). Typically, operating system software (not shown) provides an operating environment for other software executing in the computing environment (600), and coordinates activities of the components of the computing environment (600).
  • [0068]
    The storage (640) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium which can be used to store information and which can be accessed within the computing environment (600). The storage (640) stores instructions for the software (680) implementing the video encoder or decoder.
  • [0069]
    The input device(s) (650) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing environment (600). For audio or video encoding, the input device(s) (650) may be a sound card, video card, TV tuner card, or similar device that accepts audio or video input in analog or digital form, or a CD-ROM or CD-RW that reads audio or video samples into the computing environment (600). The output device(s) (660) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing environment (600).
  • [0070]
    The communication connection(s) (670) enable communication over a communication medium to another computing entity. The communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal. A modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired or wireless techniques implemented with an electrical, optical, RF, infrared, acoustic, or other carrier.
  • [0071]
    The techniques and tools can be described in the general context of computer-readable media. Computer-readable media are any available media that can be accessed within a computing environment. By way of example, and not limitation, with the computing environment (600), computer-readable media include memory (620), storage (640), communication media, and combinations of any of the above.
  • [0072]
    The techniques and tools can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing environment on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various embodiments. Computer-executable instructions for program modules may be executed within a local or distributed computing environment.
  • [0073]
    For the sake of presentation, the detailed description uses terms like “determine,” “select,” “adjust,” and “apply” to describe computer operations in a computing environment. These terms are high-level abstractions for operations performed by a computer, and should not be confused with acts performed by a human being. The actual computer operations corresponding to these terms vary depending on implementation.
  • II. Generalized Video Encoder and Decoder
  • [0074]
    FIG. 7 is a block diagram of a generalized video encoder (700) and FIG. 8 is a block diagram of a generalized video decoder (800).
  • [0075]
    The relationships shown between modules within the encoder and decoder indicate the main flow of information in the encoder and decoder; other relationships are not shown for the sake of simplicity. In particular, FIGS. 7 and 8 usually do not show side information indicating the encoder settings, modes, tables, etc. used for a video sequence, frame, macroblock, block, etc. Such side information is sent in the output bitstream, typically after entropy encoding of the side information. The format of the output bitstream can be Windows Media Video version 8 format or another format.
  • [0076]
    The encoder (700) and decoder (800) are block-based and use a 4:2:0 macroblock format, with each macroblock including four 8×8 luminance blocks (at times treated as one 16×16 block) and two 8×8 chrominance blocks. Alternatively, the encoder (700) and decoder (800) are object-based, use a different macroblock or block format, or perform operations on sets of pixels of different size or configuration than 8×8 blocks and 16×16 macroblocks.
  • [0077]
    Depending on implementation and the type of compression desired, modules of the encoder or decoder can be added, omitted, split into multiple modules, combined with other modules, and/or replaced with like modules. In alternative embodiments, encoders or decoders with different modules and/or other configurations of modules perform one or more of the described techniques.
  • [0078]
    A. Video Encoder
  • [0079]
    FIG. 7 is a block diagram of a general video encoder system (700). The encoder system (700) receives a sequence of video frames including a current frame (705), and produces compressed video information (795) as output. Particular embodiments of video encoders typically use a variation or supplemented version of the generalized encoder (700).
  • [0080]
    The encoder system (700) compresses predicted frames and key frames. For the sake of presentation, FIG. 7 shows a path for key frames through the encoder system (700) and a path for forward-predicted frames. Many of the components of the encoder system (700) are used for compressing both key frames and predicted frames. The exact operations performed by those components can vary depending on the type of information being compressed.
  • [0081]
    A predicted frame [also called p-frame, b-frame for bi-directional prediction, or inter-coded frame] is represented in terms of prediction (or difference) from one or more other frames. A prediction residual is the difference between what was predicted and the original frame. In contrast, a key frame [also called i-frame, intra-coded frame] is compressed without reference to other frames.
  • [0082]
    If the current frame (705) is a forward-predicted frame, a motion estimator (710) estimates motion of macroblocks or other sets of pixels of the current frame (705) with respect to a reference frame, which is the reconstructed previous frame (725) buffered in the frame store (720). In alternative embodiments, the reference frame is a later frame or the current frame is bi-directionally predicted. The motion estimator (710) can estimate motion by pixel, ½ pixel, ¼ pixel, or other increments, and can switch the resolution of the motion estimation on a frame-by-frame basis or other basis. The resolution of the motion estimation can be the same or different horizontally and vertically. The motion estimator (710) outputs as side information motion information (715) such as motion vectors. A motion compensator (730) applies the motion information (715) to the reconstructed previous frame (725) to form a motion-compensated current frame (735). The prediction is rarely perfect, however, and the difference between the motion-compensated current frame (735) and the original current frame (705) is the prediction residual (745). Alternatively, a motion estimator and motion compensator apply another type of motion estimation/compensation.
  • [0083]
    A frequency transformer (760) converts the spatial domain video information into frequency domain (i.e., spectral) data. For block-based video frames, the frequency transformer (760) applies a discrete cosine transform [“DCT”] or variant of DCT to blocks of the pixel data or prediction residual data, producing blocks of DCT coefficients. Alternatively, the frequency transformer (760) applies another conventional frequency transform such as a Fourier transform or uses wavelet or subband analysis. In embodiments in which the encoder uses spatial extrapolation (not shown in FIG. 7) to encode blocks of key frames, the frequency transformer (760) can apply a re-oriented frequency transform such as a skewed DCT to blocks of prediction residuals for the key frame. In other embodiments, the frequency transformer (760) applies 8×8, 8×4, 4×8, or other-size frequency transforms (e.g., DCT) to prediction residuals for predicted frames.
  • [0084]
    A quantizer (770) then quantizes the blocks of spectral data coefficients. The quantizer applies uniform, scalar quantization to the spectral data with a step-size that varies on a frame-by-frame basis or other basis. Alternatively, the quantizer applies another type of quantization to the spectral data coefficients, for example, a non-uniform, vector, or non-adaptive quantization, or directly quantizes spatial domain data in an encoder system that does not use frequency transformations. In addition to adaptive quantization, the encoder (700) can use frame dropping, adaptive filtering, or other techniques for rate control.
  • [0085]
    When a reconstructed current frame is needed for subsequent motion estimation/compensation, an inverse quantizer (776) performs inverse quantization on the quantized spectral data coefficients. An inverse frequency transformer (766) then performs the inverse of the operations of the frequency transformer (760), producing a reconstructed prediction residual (for a predicted frame) or a reconstructed key frame. If the current frame (705) was a key frame, the reconstructed key frame is taken as the reconstructed current frame (not shown). If the current frame (705) was a predicted frame, the reconstructed prediction residual is added to the motion-compensated current frame (735) to form the reconstructed current frame. The frame store (720) buffers the reconstructed current frame for use in predicting the next frame. In some embodiments, the encoder applies a deblocking filter to the reconstructed frame to adaptively smooth discontinuities in the blocks of the frame.
  • [0086]
    The entropy coder (780) compresses the output of the quantizer (770) as well as certain side information (e.g., motion information (715), spatial extrapolation modes, quantization step size). Typical entropy coding techniques include arithmetic coding, differential coding, Huffman coding, run length coding, LZ coding, dictionary coding, and combinations of the above. The entropy coder (780) typically uses different coding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, different kinds of side information), and can choose from among multiple code tables within a particular coding technique.
  • [0087]
    The entropy coder (780) puts compressed video information (795) in the buffer (790). A buffer level indicator is fed back to bitrate adaptive modules.
  • [0088]
    The compressed video information (795) is depleted from the buffer (790) at a constant or relatively constant bitrate and stored for subsequent streaming at that bitrate. Therefore, the level of the buffer (790) is primarily a function of the entropy of the filtered, quantized video information, which affects the efficiency of the entropy coding. Alternatively, the encoder system (700) streams compressed video information immediately following compression, and the level of the buffer (790) also depends on the rate at which information is depleted from the buffer (790) for transmission.
  • [0089]
    Before or after the buffer (790), the compressed video information (795) can be channel coded for transmission over the network. The channel coding can apply error detection and correction data to the compressed video information (795).
  • [0090]
    B. Video Decoder
  • [0091]
    FIG. 8 is a block diagram of a general video decoder system (800). The decoder system (800) receives information (895) for a compressed sequence of video frames and produces output including a reconstructed frame (805). Particular embodiments of video decoders typically use a variation or supplemented version of the generalized decoder (800).
  • [0092]
    The decoder system (800) decompresses predicted frames and key frames. For the sake of presentation, FIG. 8 shows a path for key frames through the decoder system (800) and a path for forward-predicted frames. Many of the components of the decoder system (800) are used for decompressing both key frames and predicted frames. The exact operations performed by those components can vary depending on the type of information being decompressed.
  • [0093]
    A buffer (890) receives the information (895) for the compressed video sequence and makes the received information available to the entropy decoder (880). The buffer (890) typically receives the information at a rate that is fairly constant over time, and includes a jitter buffer to smooth short-term variations in bandwidth or transmission. The buffer (890) can include a playback buffer and other buffers as well. Alternatively, the buffer (890) receives information at a varying rate. Before or after the buffer (890), the compressed video information can be channel decoded and processed for error detection and correction.
  • [0094]
    The entropy decoder (880) entropy decodes entropy-coded quantized data as well as entropy-coded side information (e.g., motion information (815), spatial extrapolation modes, quantization step size), typically applying the inverse of the entropy encoding performed in the encoder. Entropy decoding techniques include arithmetic decoding, differential decoding, Huffman decoding, run length decoding, LZ decoding, dictionary decoding, and combinations of the above. The entropy decoder (880) frequently uses different decoding techniques for different kinds of information (e.g., DC coefficients, AC coefficients, different kinds of side information), and can choose from among multiple code tables within a particular decoding technique.
  • [0095]
    If the frame (805) to be reconstructed is a forward-predicted frame, a motion compensator (830) applies motion information (815) to a reference frame (825) to form a prediction (835) of the frame (805) being reconstructed. For example, the motion compensator (830) uses a macroblock motion vector to find a macroblock in the reference frame (825). A frame buffer (820) stores previous reconstructed frames for use as reference frames. The motion compensator (830) can compensate for motion at pixel, ½ pixel, ¼ pixel, or other increments, and can switch the resolution of the motion compensation on a frame-by-frame basis or other basis. The resolution of the motion compensation can be the same or different horizontally and vertically. Alternatively, a motion compensator applies another type of motion compensation. The prediction by the motion compensator is rarely perfect, so the decoder (800) also reconstructs prediction residuals.
  • [0096]
    When the decoder needs a reconstructed frame for subsequent motion compensation, the frame store (820) buffers the reconstructed frame for use in predicting the next frame. In some embodiments, the decoder applies a deblocking filter to the reconstructed frame to adaptively smooth discontinuities in the blocks of the frame.
  • [0097]
    An inverse quantizer (870) inverse quantizes entropy-decoded data. In general, the inverse quantizer applies uniform, scalar inverse quantization to the entropy-decoded data with a step-size that varies on a frame-by-frame basis or other basis. Alternatively, the inverse quantizer applies another type of inverse quantization to the data, for example, a non-uniform, vector, or non-adaptive quantization, or directly inverse quantizes spatial domain data in a decoder system that does not use inverse frequency transformations.
  • [0098]
    An inverse frequency transformer (860) converts the quantized, frequency domain data into spatial domain video information. For block-based video frames, the inverse frequency transformer (860) applies an inverse DCT [“IDCT”] or variant of IDCT to blocks of the DCT coefficients, producing pixel data or prediction residual data for key frames or predicted frames, respectively. Alternatively, the inverse frequency transformer (860) applies another conventional inverse frequency transform such as an inverse Fourier transform or uses wavelet or subband synthesis. In embodiments in which the decoder uses spatial extrapolation (not shown in FIG. 8) to decode blocks of key frames, the inverse frequency transformer (860) can apply a re-oriented inverse frequency transform such as a skewed IDCT to blocks of prediction residuals for the key frame. In other embodiments, the inverse frequency transformer (860) applies 8×8, 8×4, 4×8, or other-size inverse frequency transforms (e.g., IDCT) to prediction residuals for predicted frames.
  • III. Intraframe Encoding and Decoding
  • [0099]
    In one or more embodiments, a video encoder exploits redundancies in typical still images in order to code the I-frame information using a smaller number of bits. For additional detail about intraframe encoding and decoding in some embodiments, see U.S. patent application Ser. No. aa/bbb,ccc, entitled “Spatial Extrapolation of Pixel Values in Intraframe Video Coding and Decoding,” filed concurrently herewith.
  • IV. Interframe Encoding and Decoding
  • [0100]
    Inter-frame coding exploits temporal redundancy between frames to achieve compression. Temporal redundancy reduction uses previously coded frames as predictors when coding the current frame.
  • [0101]
    A. Motion Estimation
  • [0102]
    In one or more embodiments, a video encoder exploits temporal redundancies in typical video sequences in order to code the information using a smaller number of bits. The video encoder uses motion estimation/compensation of a macroblock or other set of pixels of a current frame with respect to a reference frame. A video decoder uses corresponding motion compensation. Various features of the motion estimation/compensation can be used in combination or independently. These features include, but are not limited to:
  • [0103]
    1a) Adaptive switching of the resolution of motion estimation/compensation. For example, the resolution switches between quarter-pixel and half-pixel resolutions.
  • [0104]
    1b) Adaptive switching of the resolution of motion estimation/compensation depending on a video source with a closed loop or open loop decision.
  • [0105]
    1c) Adaptive switching of the resolution of motion estimation/compensation on a frame-by-frame basis or other basis.
  • [0106]
    2a) Using previously computed results of a first motion resolution evaluation to speed up a second motion resolution evaluation.
  • [0107]
    2b) Selectively using integer-pixel motion information from a first motion resolution evaluation to speed up a second motion resolution evaluation.
  • [0108]
    2c) Using previously computed sub-pixel values from a first motion resolution evaluation to speed up a second motion resolution evaluation.
  • [0109]
    3) Using a search range with different directional resolution for motion estimation. For example, the horizontal resolution of the search range is quarter pixel and the vertical resolution is half pixel. This speeds up motion estimation by skipping certain quarter-pixel locations.
  • [0110]
    4) Using a motion information representation with different bit allocation for horizontal and vertical motion. For example, a video encoder uses an additional bit for motion information in the horizontal direction, compared to the vertical direction.
  • [0111]
    5a) Using a resolution bit with a motion information representation for additional resolution of motion estimation/compensation. For example, a video encoder adds a bit to half-pixel motion information to differentiate between a half-pixel increment and a quarter-pixel increment. A video decoder receives the resolution bit.
  • [0112]
    5b) Selectively using a resolution bit with a motion information representation for additional resolution of motion estimation/compensation. For example, a video encoder adds a bit to half-pixel motion information to differentiate between a half-pixel increment and a quarter-pixel increment only for half-pixel motion information, not integer-pixel motion information. A video decoder selectively receives the resolution bit.
  • [0113]
    For motion estimation, the video encoder establishes a search range within the reference frame. The video encoder can center the search range around a predicted location that is set based upon the motion information for neighboring sets of pixels. In some embodiments, the encoder uses a reduced coverage range for the higher resolution motion estimation (e.g., quarter-pixel motion estimation) to balance the bits used to signal the higher resolution motion information against the distortion reduction it provides. Most motion observed in TV and movie content tends to be dominated by finer horizontal motion than vertical motion, probably because most camera movements tend to be more horizontal; rapid vertical motion seems to make viewers dizzy. Taking advantage of this characteristic, the encoder uses higher resolution motion estimation/compensation that covers more horizontal locations than vertical locations. This strikes a balance between rate and distortion, and also lowers the computational complexity of the motion information search. In alternative embodiments, the search range has the same resolution horizontally and vertically.
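    For illustration, a minimal C sketch of centering and clamping a search window around a predicted motion vector (the names and bounds are illustrative, not taken from the described embodiments):

```c
/* Sketch only: center a motion search window on a predicted location
 * and clamp it to the legal motion vector bounds [mv_min, mv_max]. */
typedef struct { int x, y; } MV;

static void make_search_window(MV pred, int range, int mv_min, int mv_max,
                               int *x0, int *x1, int *y0, int *y1)
{
    *x0 = pred.x - range; if (*x0 < mv_min) *x0 = mv_min;
    *x1 = pred.x + range; if (*x1 > mv_max) *x1 = mv_max;
    *y0 = pred.y - range; if (*y0 < mv_min) *y0 = mv_min;
    *y1 = pred.y + range; if (*y1 > mv_max) *y1 = mv_max;
}
```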
  • [0114]
    Within the search range, the encoder finds a motion vector that parameterizes the motion of a macroblock or other set of pixels in the predicted frame. In some embodiments, the encoder computes motion vectors at, and switches between, higher and lower sub-pixel accuracies using an efficient, low-complexity method. In alternative embodiments, the encoder does not switch between resolutions for motion estimation/compensation. Instead of motion vectors (translations), the encoder can compute other types of motion information to parameterize the motion of a set of pixels between frames.
  • [0115]
    In one implementation, the encoder switches between quarter-pixel accuracy, using a combination of four-tap and two-tap filters, and half-pixel accuracy, using a two-tap filter. The encoder switches the resolution of motion estimation/compensation on a per frame basis, per sequence basis, or other basis. The rationale is that quarter-pixel motion compensation works well for very clean video sources (i.e., no noise), while half-pixel motion compensation handles noisy video sources (e.g., video from a cable feed) much better. This is because the two-tap filter of the half-pixel motion compensation acts as a lowpass filter and tends to attenuate the noise. In contrast, the four-tap filter of the quarter-pixel motion compensation has some highpass effects, so it can preserve edges, but it unfortunately also tends to accentuate the noise. Other implementations use different filters.
  • [0116]
    After the encoder finds a motion vector or other motion information, the encoder outputs the information. For example, the encoder outputs entropy-coded data for the motion vector, motion vector differentials, or other motion information. In some embodiments, the encoder uses a motion vector with different bit allocation for horizontal and vertical motion. An extra bit adds quarter-pixel resolution horizontally to a half-pixel motion vector. The encoder saves bits by coding the vertical motion vector at half-pixel accuracy. The encoder can add the bit only for half-pixel motion vectors, not for integer-pixel motion vectors, which further reduces the overall bitrate. In alternative embodiments, the encoder uses the same bit allocation for horizontal and vertical motion.
  • [0117]
    1. Resolution Switching
  • [0118]
    In some embodiments, a video encoder switches resolution of motion estimation/compensation. FIG. 9 shows a technique for selecting a motion estimation resolution for a predicted video frame. The encoder selects between half-pixel resolution and quarter-pixel resolution for motion vectors on a per frame basis. For the sake of simplicity, FIG. 9 does not show the various ways in which the technique (900) can be used in conjunction with other techniques. In alternative embodiments, the encoder switches between resolutions other than quarter and half-pixel and/or switches at a frequency other than per frame.
  • [0119]
    The encoder gets (910) a macroblock for a predicted frame and computes (920) a half-pixel motion vector for the macroblock. The encoder also computes (930) a quarter-pixel motion vector for the macroblock. The encoder evaluates (940) the motion vectors. For example, for each of the motion vectors, the encoder computes an error measure such as sum of absolute differences [“SAD”], mean square error [“MSE”], a perceptual distortion measure, or another measure for the prediction residual.
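    For illustration, a minimal C sketch of one such error measure, the sum of absolute differences, computed here over a 16×16 macroblock (the function name and the 16×16 block size are assumptions for the example):

```c
#include <stdlib.h>

/* Sketch only: SAD between a 16x16 macroblock in the current frame and
 * a candidate block in the reference frame.  'stride' is the width in
 * samples of one frame row. */
static int sad_16x16(const unsigned char *cur, const unsigned char *ref,
                     int stride)
{
    int sad = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sad += abs(cur[x] - ref[x]);
        cur += stride;
        ref += stride;
    }
    return sad;
}
```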
  • [0120]
    In one implementation, the encoder computes and evaluates motion vectors as shown in FIG. 10a. For a macroblock, the encoder computes (1010) a half-pixel motion vector MVh in integer-pixel accuracy. For example, the encoder finds a motion vector by searching at integer increments within the search range. The encoder then computes (1020) MVh to half-pixel accuracy in a region around the first computed MVh.
  • [0121]
    In a separate path, the encoder computes (1050) a quarter-pixel motion vector MVq in integer-pixel accuracy and then computes (1070) MVq to quarter-pixel accuracy in a region around the first computed MVq. The encoder then evaluates (1090) the final MVh and MVq. Alternatively, the encoder evaluates the motion vectors later.
  • [0122]
    In another implementation, the encoder eliminates a computation of a motion vector at integer-pixel accuracy in many cases by computing motion vectors as shown in FIG. 10b. The encoder computes (1010) MVh to integer-pixel accuracy.
  • [0123]
    Most of the time the integer-pixel portion of the MVq is the same as the integer-pixel portion of MVh. Thus, instead of computing the MVq to integer-pixel accuracy every time as in FIG. 10a, the encoder checks (1030) whether the integer-pixel accurate MVh can be used for MVq. Specifically, the encoder checks whether the integer-pixel accurate MVh lies within the motion vector search range for the set of quarter-pixel motion vectors. The motion vector search range for a given macroblock is set to be within ±16 (R in FIG. 10) of a motion vector predictor for the quarter-pixel motion vector. The motion vector predictor for a macroblock is the component-wise median of the macroblock's left, top, and top-right neighboring macroblocks' motion vectors, and can be different for MVh and MVq. Alternatively, the range, motion vector predictor, or conditional bypass is computed differently.
  • [0124]
    If the integer-pixel MVh lies within the range, then the encoder skips the computation of the integer-pixel MVq and simply sets (1040) MVq to MVh. Otherwise, the encoder computes (1050) MVq to integer-pixel accuracy. The encoder computes (1020) MVh to half-pixel accuracy, computes (1070) MVq to quarter-pixel accuracy, and evaluates (1090) the motion vectors. Alternatively, the encoder computes the quarter-pixel motion vector at integer-pixel accuracy first, and selectively bypasses the computation of the half-pixel motion vector at integer-pixel accuracy.
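    For illustration, a minimal C sketch of the conditional bypass and the component-wise median predictor described above (the names are illustrative; R corresponds to the ±16 range):

```c
#include <stdlib.h>

/* Sketch only: component-wise median predictor and the reuse check for
 * the integer-pixel MVh in the quarter-pixel search. */
typedef struct { int x, y; } MV;

static int median3(int a, int b, int c)
{
    if (a > b) { int t = a; a = b; b = t; }   /* sort so a <= b */
    if (b > c) b = c;                         /* b = min(b, c) */
    return (a > b) ? a : b;                   /* max(a, min(b, c)) */
}

static MV mv_predictor(MV left, MV top, MV topright)
{
    MV p;
    p.x = median3(left.x, top.x, topright.x);
    p.y = median3(left.y, top.y, topright.y);
    return p;
}

/* Returns nonzero if the integer-pixel accurate MVh falls inside the
 * +/-R window around the quarter-pixel predictor, so the integer-pixel
 * search for MVq can be skipped. */
static int can_reuse_integer_mv(MV mvh_int, MV pred_q, int R)
{
    return abs(mvh_int.x - pred_q.x) <= R &&
           abs(mvh_int.y - pred_q.y) <= R;
}
```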
  • [0125]
    Returning to FIG. 9, the encoder determines (950) whether there are any more macroblocks in the frame. If so, the encoder gets (960) the next macroblock and computes motion vectors for it.
  • [0126]
    Otherwise, the encoder selects (970) the motion vector resolution for the predicted frame. In one implementation, the encoder uses a rate-distortion criterion to select the set of MVh's or the set of MVq's. The encoder compares the cost of choosing half-pixel resolution versus quarter-pixel resolution and picks the minimum of the two.
  • [0127]
    The cost functions are defined as follows:
  • Jq = SADq + QP * iMvBitOverhead
  • Jh = SADh
  • [0128]
    where Jh and Jq are the cost of choosing half-pixel resolution and quarter-pixel resolution, respectively. SADh and SADq are the sums of the residual error from prediction using the half-pixel and quarter-pixel motion vectors, respectively. QP is a quantization parameter. The effect of QP is to bias the selection in favor of half-pixel resolution in cases where QP is high and distortion in residuals would offset gains in quality from the higher resolution motion estimation. iMvBitOverhead is the extra bits for coding quarter-pixel motion vectors compared to the half-pixel motion vectors. In an implementation in which half-pixel motion vectors (but not integer-pixel motion vectors) have an extra resolution bit, iMvBitOverhead is the number of non-integer-pixel motion vectors in the set of MVqs. Alternatively, the encoder uses other cost functions, for example, cost functions that directly compare the bits spent for different resolutions of motion vectors.
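    For illustration, a minimal C sketch of the rate-distortion selection defined by these cost functions (the enum and function names are illustrative):

```c
/* Sketch only: per-frame choice between the half-pixel and quarter-pixel
 * motion vector sets.  SADh/SADq are the summed prediction errors over
 * the frame, QP the quantization parameter, and iMvBitOverhead the count
 * of extra resolution bits needed for the quarter-pixel set. */
enum MvResolution { MV_HALF_PEL, MV_QUARTER_PEL };

static enum MvResolution select_mv_resolution(long SADh, long SADq,
                                              int QP, long iMvBitOverhead)
{
    long Jh = SADh;
    long Jq = SADq + (long)QP * iMvBitOverhead;
    /* Pick the smaller rate-distortion cost; ties go to half-pixel,
     * the cheaper mode to signal. */
    return (Jq < Jh) ? MV_QUARTER_PEL : MV_HALF_PEL;
}
```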
  • [0129]
    2. Different Horizontal and Vertical Resolutions
  • [0130]
    In some embodiments, a video encoder uses a search range with different horizontal and vertical resolutions. For example, the horizontal resolution of the search range is quarter pixel and the vertical resolution of the search range is half pixel.
  • [0131]
    The encoder finds an integer-pixel accurate motion vector in a search range, for example, by searching at integer increments within the search range. In a region around the integer-pixel accurate motion vector, the encoder computes a sub-pixel accurate motion vector by evaluating motion vectors at sub-pixel locations in the region.
  • [0132]
    FIG. 11 shows a location I that is pointed to by an integer-pixel accurate motion vector. The encoder computes a half-pixel motion vector by searching for the best match among all eight half-pixel locations H0 to H7 surrounding the integer position I. On the other hand, the encoder computes the quarter-pixel motion vector by searching for the best match among the eight half-pixel locations H0 to H7 and eight quarter-pixel locations Q0 to Q7. The searched quarter-pixel locations are placed horizontally between adjacent half-pixel locations. The searched quarter-pixel locations are not placed vertically between adjacent half-pixel locations. Thus, the search density increases on horizontal quarter-pixel locations, but not vertical quarter-pixel locations. This feature improves performance by speeding up the motion estimation process compared to a search in each direction by quarter-pixel increments, which would also require the computation of values for additional quarter-pixel locations.
  • [0133]
    In an implementation in which quarter-pixel resolution is indicated by adding an extra bit to half-pixel motion vectors, the quarter-pixel location to the right of the integer-pixel location is not searched as a valid location for a quarter-pixel motion vector, although a sub-pixel value is computed there for matching purposes. In other implementations, that quarter-pixel location is also searched and a different scheme is used to represent quarter-pixel motion vectors. In alternative embodiments, the encoder uses a different search pattern for quarter-pixel motion vectors.
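    For illustration, a minimal C sketch of a candidate-selection predicate for this search pattern, with offsets measured in quarter-pixel units around the integer location I. The exact eight quarter-pixel points follow FIG. 11; this predicate only approximates that pattern:

```c
#include <stdlib.h>

/* Sketch only: which sub-pixel offsets (quarter-pixel units) around the
 * integer location I the search considers.  Vertical components stay on
 * half-pixel steps (even dy); horizontal components may take
 * quarter-pixel steps (odd dx) when quarter-pel mode is on. */
static int is_searched(int dx, int dy, int quarter_pel_mode)
{
    if (abs(dx) > 2 || abs(dy) > 2) return 0;       /* outside the window */
    if (dx == 0 && dy == 0) return 0;               /* the integer location */
    if (dy % 2 != 0) return 0;                      /* no vertical quarter pel */
    if (dx % 2 != 0 && !quarter_pel_mode) return 0; /* half-pel mode only */
    /* With the extra-bit representation, the quarter-pel point just right
     * of the integer location cannot be signaled, so it is not searched
     * (its value is still interpolated for matching, as noted above). */
    if (quarter_pel_mode && dx == 1 && dy == 0) return 0;
    return 1;
}
```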
  • [0134]
    The encoder generates values for sub-pixel locations by interpolation. In one implementation, for each searched location, the interpolation filter differs depending on the resolution chosen. For half-pixel resolution, the encoder uses a two-tap bilinear filter to generate the match, while for quarter-pixel resolution, the encoder uses a combination of four-tap and two-tap filters to generate the match. FIG. 12 shows sub-pixel locations H0, H1, H2 with values computed by interpolation of integer-pixel values a, b, c, . . . , p.
  • [0135]
    For half-pixel resolution, the interpolation used in the three distinct half-pixel locations H0, H1, H2 is:
  • H0 = (f + g + 1 − iRndCtrl) >> 1
  • H1 = (f + j + 1 − iRndCtrl) >> 1
  • H2 = (f + g + j + k + 2 − iRndCtrl) >> 2
  • [0136]
    where iRndCtrl indicates rounding control and varies between 0 and 1 from frame to frame.
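    For illustration, a minimal C sketch of these three half-pixel interpolations (the function signature is illustrative; f, g, j, k are the integer pixels of FIG. 12 surrounding the interpolated positions):

```c
/* Sketch only: the three distinct half-pixel values around pixel f,
 * using the two-tap bilinear filter with per-frame rounding control
 * iRndCtrl (0 or 1). */
static void interp_half_bilinear(int f, int g, int j, int k, int iRndCtrl,
                                 int *H0, int *H1, int *H2)
{
    *H0 = (f + g + 1 - iRndCtrl) >> 1;          /* horizontal half pel */
    *H1 = (f + j + 1 - iRndCtrl) >> 1;          /* vertical half pel */
    *H2 = (f + g + j + k + 2 - iRndCtrl) >> 2;  /* diagonal half pel */
}
```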
  • [0137]
    For quarter-pixel resolution, the interpolation used for the three distinct half-pixel locations H0, H1, H2 is:
  • H0 = (−e + 9f + 9g − h + 8) >> 4
  • H1 = (−b + 9f + 9j − n + 8) >> 4
  • H2 = (−t0 + 9t1 + 9t2 − t3 + 8) >> 4
  • [0138]
    where t0, t1, t2, t3 are computed as follows:
  • t0 = (−a + 9b + 9c − d + 8) >> 4
  • t1 = (−e + 9f + 9g − h + 8) >> 4
  • t2 = (−i + 9j + 9k − l + 8) >> 4
  • t3 = (−m + 9n + 9o − p + 8) >> 4
  • [0139]
    For the quarter-pixel resolution, the encoder also searches some of the quarter-pixel locations, as indicated by Q0 to Q7 in FIG. 11. These quarter-pixel locations are situated horizontally in between either two half-pixel locations or an integer-pixel location and a half-pixel location. For these quarter-pixel locations, the encoder uses bilinear interpolation (i.e., (x+y+1)>>1) using the two horizontally neighboring half-pixel/integer-pixel locations, without rounding control. Using bicubic interpolation followed by bilinear interpolation balances information preservation against computational complexity, giving good results at reasonable cost.
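    For illustration, a minimal C sketch of the quarter-pixel-mode interpolation built from the formulas above (the 4×4 array indexing and function names are assumptions for the example; clipping of filtered values to the valid sample range is omitted):

```c
/* Sketch only: quarter-pixel-mode interpolation around pixel f.  px is
 * the 4x4 block of integer pixels a..p of FIG. 12, indexed px[row][col]
 * with px[1][1] == f. */
static int tap4(int p0, int p1, int p2, int p3)
{
    return (-p0 + 9 * p1 + 9 * p2 - p3 + 8) >> 4;
}

static void interp_quarter_mode(const int px[4][4],
                                int *H0, int *H1, int *H2, int *Qmid)
{
    /* Half-pixel positions via the four-tap filter. */
    *H0 = tap4(px[1][0], px[1][1], px[1][2], px[1][3]);  /* -e+9f+9g-h */
    *H1 = tap4(px[0][1], px[1][1], px[2][1], px[3][1]);  /* -b+9f+9j-n */

    /* Intermediate row values t0..t3, then the diagonal half pel H2. */
    int t[4];
    for (int r = 0; r < 4; r++)
        t[r] = tap4(px[r][0], px[r][1], px[r][2], px[r][3]);
    *H2 = tap4(t[0], t[1], t[2], t[3]);

    /* Quarter-pixel positions: bilinear average of the two horizontally
     * neighboring half-pel/integer-pel values, no rounding control.
     * Shown here for the point between f and H0. */
    *Qmid = (px[1][1] + *H0 + 1) >> 1;
}
```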
  • [0140]
    Alternatively, the encoder uses filters with different numbers or magnitudes of taps. In general, bilinear interpolation smoothes the values, attenuating high frequency information, whereas bicubic interpolation preserves more high frequency information but can accentuate noise. Using two bilinear steps (one for half-pixel locations, the second for quarter-pixel locations) is simple, but can smooth the pixels too much for efficient motion estimation.
  • [0141]
    3. Encoding and Decoding Motion Vector Information
  • [0142]
    In some embodiments, a video encoder uses different bit allocation for horizontal and vertical motion vectors. For example, the video encoder uses one or more extra bits to represent motion in one direction with finer resolution than motion in another direction. This allows the encoder to reduce bitrate for vertical resolution information that is less useful for compression, compared to systems that code motion information at quarter-pixel resolution both horizontally and vertically.
  • [0143]
    In one implementation, a video encoder uses an extra bit for quarter-pixel resolution of horizontal component motion vectors for macroblocks. For vertical component motion vectors, the video encoder uses half-pixel vertical component motion vectors. The video encoder can also use integer-pixel motion vectors. For example, the encoder outputs one or more entropy codes or another representation for a horizontal component motion vector and a vertical component motion vector. The encoder also outputs an additional bit that indicates a quarter-pixel horizontal increment. A value of 0 indicates no quarter-pixel increment and a value of 1 indicates a quarter-pixel increment, or vice versa. In this implementation, the use of the extra bit avoids the use of separate entropy code tables for quarter-pixel MVs/DMVs and half-pixel MVs/DMVs, and also adds little to bitrate.
  • [0144]
    In another implementation, a video encoder selectively uses the extra bit for quarter-pixel resolution of horizontal component motion vectors for macroblocks. The encoder adds the extra bit only if 1) quarter-pixel resolution is used for the frame and 2) at least one of the horizontal or vertical component motion vectors for a macroblock has half-pixel resolution. Thus, the extra bit is not used when quarter-pixel resolution is not used for a frame or when the motion vector for the macroblock is integer-pixel resolution, which reduces overall bitrate. Alternatively, the encoder adds the extra bit based upon other criteria.
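    For illustration, a minimal C sketch of this signaling rule on the encoder side (put_mv_codes and put_bit are hypothetical stand-ins for the entropy coder; motion vector components are in half-pixel units):

```c
/* Hypothetical bitstream hooks standing in for the entropy coder. */
void put_mv_codes(int mvx_half, int mvy_half);
void put_bit(int b);

/* Sketch only: emit one macroblock's motion vector with the optional
 * horizontal quarter-pixel resolution bit. */
static void encode_mv(int mvx_half, int mvy_half, int quarter_increment,
                      int frame_quarter_pel)
{
    put_mv_codes(mvx_half, mvy_half);   /* entropy codes, half-pel grid */

    /* The extra bit is sent only if quarter-pel resolution is on for the
     * frame and the vector is not integer-pixel (at least one component
     * is odd in half-pel units). */
    if (frame_quarter_pel && ((mvx_half & 1) || (mvy_half & 1)))
        put_bit(quarter_increment);     /* 1 = +1/4 pel horizontally */
}
```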
  • [0145]
    FIG. 13 shows a technique for decoding information for motion vectors at selective resolution. For the sake of simplicity, FIG. 13 does not show the various ways in which the technique (1300) can be used in conjunction with other techniques.
  • [0146]
    A decoder gets (1310) motion vector information for a macroblock, for example, receiving one or more entropy codes or other information for a motion vector, component motion vectors, differential motion vectors (“DMVs”), or differential component motion vectors.
  • [0147]
    The decoder determines (1330) whether it has received all of the motion vector information for the macroblock. For example, the decoder determines whether additional resolution is enabled for the macroblock (e.g., at a frame level). Or, the decoder determines from decoding of the already received motion vector information whether to expect additional information. Or, the decoder considers both whether the additional resolution is enabled and whether to expect it based upon previously decoded information.
  • [0148]
    If the decoder expects additional motion vector resolution information, the decoder gets (1340) the additional information. For example, the decoder gets one or more additional resolution bits for the motion vector information for the macroblock.
  • [0149]
    The decoder then reconstructs (1350) the macroblock using the motion vector information and determines (1360) whether there are other macroblocks in the frame. If not, the technique ends. Otherwise, the decoder gets (1370) the motion vector information for the next macroblock and continues.
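    For illustration, a minimal C sketch of this decoding flow (get_mv_codes, get_bit, and reconstruct_mb are hypothetical stand-ins for the bitstream reader and reconstruction stages):

```c
/* Sketch only: the decoding flow of FIG. 13. */
typedef struct { int x, y; } MV;

MV   get_mv_codes(void);                 /* half-pel MV info for one MB */
int  get_bit(void);                      /* one raw resolution bit */
void reconstruct_mb(int mb, MV mv_qpel);

static void decode_frame_mvs(int num_macroblocks, int frame_quarter_pel)
{
    for (int mb = 0; mb < num_macroblocks; mb++) {
        MV mv = get_mv_codes();           /* components in half-pel units */
        MV mvq = { mv.x * 2, mv.y * 2 };  /* rescale to quarter-pel units */

        /* Expect the extra resolution bit only when it is enabled for
         * the frame and the decoded vector is not integer-pixel. */
        if (frame_quarter_pel && ((mv.x & 1) || (mv.y & 1)))
            mvq.x += get_bit();           /* optional +1/4 pel horizontally */

        reconstruct_mb(mb, mvq);
    }
}
```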
  • [0150]
    B. Coding of Prediction Residuals
  • [0151]
    Motion estimation is rarely perfect, and the video encoder uses prediction residuals to represent the differences between the original video information and the video information predicted using motion estimation. In one or more embodiments, a video encoder exploits redundancies in prediction residuals in order to code the information using a smaller number of bits. For additional detail about coding of prediction residuals in some embodiments, see U.S. patent application Ser. No. aa/bbb,ccc, entitled “Sub-Block Transform Coding of Prediction Residuals,” filed concurrently herewith.
  • [0152]
    C. Loop Filtering
  • [0153]
    Quantization and other lossy processing of prediction residuals can cause blocky artifacts in reference frames that are used for motion estimation/compensation for subsequent predicted frames. In one or more embodiments, a video encoder processes a reconstructed frame to reduce blocky artifacts prior to motion estimation using the reference frame. A video decoder processes the reconstructed frame to reduce blocky artifacts prior to motion compensation using the reference frame. With deblocking, a reference frame becomes a better reference candidate to encode the following frame. Thus, using the deblocking filter improves the quality of motion estimation/compensation, resulting in better prediction and lower bitrate for prediction residuals. For additional detail about using a deblocking filter in motion estimation/compensation in some embodiments, see U.S. patent application Ser. No. aa/bbb,ccc, entitled “Motion Compensation Loop With Filtering,” filed concurrently herewith.
  • [0154]
    Having described and illustrated the principles of our invention with reference to various embodiments, it will be recognized that the various embodiments can be modified in arrangement and detail without departing from such principles. It should be understood that the programs, processes, or methods described herein are not related or limited to any particular type of computing environment, unless indicated otherwise. Various types of general purpose or specialized computing environments may be used with or perform operations in accordance with the teachings described herein. Elements of embodiments shown in software may be implemented in hardware and vice versa.
  • [0155]
    In view of the many possible embodiments to which the principles of our invention may be applied, we claim as our invention all such embodiments as may come within the scope and spirit of the following claims and equivalents thereto.