US20060227871A1 - Video thumbnail method - Google Patents
- Publication number
- US20060227871A1 US11/095,286
- Authority
- US
- United States
- Prior art keywords
- frames
- video
- resolution
- intra
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/59—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/12—Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
- H04N19/122—Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/18—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/33—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/37—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability with arrangements for assigning different transmission priorities to video input data or to video coded data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
- H04N19/64—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets characterised by ordering of coefficients or of bits for transmission
- H04N19/647—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets characterised by ordering of coefficients or of bits for transmission using significance based coding, e.g. Embedded Zerotrees of Wavelets [EZW] or Set Partitioning in Hierarchical Trees [SPIHT]
Definitions
- the present invention relates to digital video storage, and more particularly to methods and structures for browsing stored video.
- Video thumbnails of video clips stored in an archive can be created to aid in the process of visual browsing of the archive. That is, video thumbnails are the extension of the popular concept of image thumbnails in that video thumbnails are lower spatial and/or temporal resolution versions of the original video clip which can be easily decoded and viewed to assess the contents of the corresponding full video clip.
- Known methods for video thumbnails include Macromedia Flash MX 2004, which provides a scheme to create video thumbnails.
- a user selects a set of reference frames from a given video clip, and then the user encodes the selected set of frames at a lower resolution as a separate file.
- the video thumbnail could be a 2-seconds-long excerpt with resolution of 84×68 pixels/frame but the same frame rate of 15 frames/sec.
- the video thumbnail would have a file size of roughly 3% of the original video clip file size.
- both the original video clip file and the associated video thumbnail file can be compressed using a standard video coding method.
- various international standards for video coding have been and are continuing to be developed.
- Current standards such as H.263, MPEG-2, MPEG-4, and H.264, use a hybrid of block motion compensation and transform coding for compression.
- Block motion compensation decomposes a picture into blocks for prediction by blocks of preceding pictures; this relies upon removal of temporal redundancies.
- Transform of blocks to a spatial frequency domain (and quantization) for coding relies upon removal of spatial redundancies.
- I frames Intra-coded pictures
- P and B frames Inter-coded pictures
- An I frame has all Intra-coded blocks, each of which is coded by transforming the block to the frequency domain and (quantizing and) encoding; for example, a 16×16 macroblock may have its 8×8 blocks (4 luma blocks and 2 chroma blocks) transformed with a discrete cosine transformation (DCT) or may have its 4×4 blocks (16 luma blocks and 8 chroma blocks) transformed with an integer transform which approximates a DCT.
- a P or B frame has at least one Inter-coded block which proceeds by finding the best prediction block in prior pictures (thereby defining its motion vector) and then transforming the residual block (i.e., the difference block between the current block and its prediction block) to the frequency domain for (quantizing and) encoding.
- a non-block transform such as a discrete wavelet transform (DWT) could be used in place of the block transform; MPEG-4 and JPEG2000 provide for DWT.
- DWT discrete wavelet transform
- FIG. 2 depicts the functions in typical block motion compensation video encoding using DCT and variable length coding (VLC) of the quantized transform coefficients (Q).
- VLC variable length coding
- IQ inverse-quantization
- IDCT inverse DCT
- the rate-control unit in FIG. 2 is responsible for producing the quantizer scale (quantizer parameter, QP) according to the target bit-rate and buffer-fullness to control the DCT-coefficients quantization unit.
- QP quantizer parameter
- the present invention provides video thumbnails which include the use of intra-coded reference frames to create video thumbnails which may be embedded in the original video clips.
- FIGS. 1 a - 1 b are flow diagrams.
- FIG. 2 illustrates the functions of hybrid block-based motion compensation plus DCT transform video encoding.
- FIGS. 3-4 show decoding.
- video thumbnail methods use I frames (I-vops) of an encoded video clip to extract (at least in part) a video thumbnail; and when the I frames have scalable encoding, the methods can use it for zoom.
- the video thumbnail may have an initial resolution determined by the low resolution of the extracted scalable encoded I frames, and the thumbnail zoom simply uses higher resolution decodings of the scalable encoded I frame.
- the thumbnail frame rate depends upon whether fast-forward or normal-speed motion is desired; for normal-speed motion, the video thumbnail frame rate will just be the rate of available I frames. In contrast, for fast forward with a high fraction of I frames (e.g., all I frames or IPIP . . . type clips), the methods may skip I frames to approximate the target frame rate.
- FIG. 1 a illustrates the flow (where “next” frame may be the same frame if paused and zoomed), and FIG. 3 shows decoding for either the full video clip or for its preferred embodiment video thumbnail.
- FIG. 1 b illustrates a preferred embodiment hybrid of extraction of thumbnail frames from intra-coded frames of a video clip for zoom together with a separate low-resolution thumbnail file. This hybrid is useful when the video clip does not have scalable intra-encoding.
- Preferred embodiment systems perform preferred embodiment methods with digital signal processors (DSPs) or general-purpose programmable processors or application specific circuitry or systems on a chip (SoC) such as both a DSP and RISC processor on the same chip with the RISC processor controlling.
- DSPs digital signal processors
- SoC system on a chip
- Programs could be stored in memory in an onboard ROM or external flash EEPROM for a DSP or programmable processor to perform the signal processing of the preferred embodiment methods.
- First preferred embodiment methods extract video thumbnails from video clips encoded with a zero-tree wavelet transform for I frames; this would include encoding methods such as a motion JPEG2000 (sequence of I frames) and MPEG-4 with the encoding of still texture video objects by wavelet transform plus zero-tree coding.
- each 16×16 macroblock of samples roughly yields three 8×8 blocks (the HL1, LH1, and HH1 subbands), three 4×4 blocks (the HL2, LH2, and HH2 subbands), three 2×2 blocks (the HL3, LH3, and HH3 subbands), and four 1×1 blocks (the LH4, HL4, HH4, and LL4 subbands) of wavelet coefficients; that is, the 256 pixels are transformed into 256 coefficients.
- the LL4 coefficient is termed the DC coefficient because it is the result of four repeated lowpass filterings plus decimations.
- a frame of N×M macroblocks yields an N×M array of DC coefficients which is a low-resolution version of the frame plus 3 N×M zero trees.
- the DC subband of a 640×480-pixel (VGA) I frame is a 40×30-pixel low-resolution version of the I frame. Note that this array of DC coefficients is predictively encoded in MPEG-4; see FIG. 4 illustrating decoding with the top branch for the DC coefficients.
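The subband bookkeeping above can be checked with a short sketch. It assumes a dyadic four-level decomposition of a square block; the function names `dwt_subbands` and `dc_subband_size` are illustrative and do not come from the patent or from any codec library.

```python
def dwt_subbands(size, levels):
    """List (name, side_length) for every subband of a square dyadic DWT."""
    bands = []
    side = size
    for level in range(1, levels + 1):
        side //= 2  # each level halves the side of the lowpass band
        for name in ("HL", "LH", "HH"):
            bands.append((f"{name}{level}", side))
    bands.append((f"LL{levels}", side))  # the final lowpass (DC) band
    return bands


def dc_subband_size(width, height, levels=4):
    """Size of the DC (LL) subband after `levels` decimations by two."""
    return width >> levels, height >> levels


bands = dwt_subbands(16, 4)
total_coefficients = sum(side * side for _, side in bands)
print(total_coefficients)            # coefficients per 16x16 macroblock
print(dc_subband_size(640, 480))     # DC subband of a VGA frame
```

The count confirms that a 16×16 macroblock maps to 256 coefficients, and that a VGA frame has a 40×30 DC subband.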
- the first preferred embodiment methods parse the video clip to select the required I frames and initially decode just the DC coefficients to give a sequence of low-resolution frames.
- the length and frame rate of the video thumbnail are selectable; the length is determined by where the extraction of I frames begins and ends in the video clip and the playback rate (normal-speed or fast forward).
- the video thumbnail frame rate would be 5 frames/sec for a somewhat discontinuous-appearing playback at normal speed. However, if these same I frames are used in a 25 frames/sec playback, then this would appear as a 5× fast forward.
- if the I frame rate is greater than the desired playback rate, such as for fast forward with a video clip of IPIP . . . frames, then skip I frames to achieve the desired video thumbnail frame rate.
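A minimal sketch of the I-frame skipping rule follows. The frame-type strings, the function name `pick_thumbnail_frames`, and the greedy spacing rule are assumptions made for illustration; the text only states that I frames are skipped to approximate the target rate.

```python
def pick_thumbnail_frames(frame_types, clip_fps, playback_fps, speedup):
    """Choose I-frame indices for a thumbnail played at `playback_fps`.

    Each displayed frame should advance the clip by about
    speedup * clip_fps / playback_fps source frames, so I frames that
    arrive sooner than that are skipped (greedy choice, an assumption).
    """
    stride = speedup * clip_fps / playback_fps
    picked = []
    next_position = 0.0
    for index, frame_type in enumerate(frame_types):
        if frame_type == "I" and index >= next_position:
            picked.append(index)
            next_position = index + stride
    return picked


# Hypothetical clip with one I frame in every five frames: all I frames are used.
sparse = pick_thumbnail_frames("IPPPP" * 4, clip_fps=25, playback_fps=25, speedup=5)
# Hypothetical IPIP... clip: two of every three I frames are skipped.
dense = pick_thumbnail_frames("IP" * 10, clip_fps=25, playback_fps=25, speedup=5)
print(sparse, dense)
```

With the IPIP . . . clip the selected frames are six source frames apart rather than five, which illustrates that the target rate is only approximated when I frames do not fall on the ideal positions.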
- the first preferred embodiment video thumbnail methods provide zoom by decoding higher subbands in addition to the DC subband and combining. For example, if just the zero-tree roots are decoded, then this recovers the LH4, HL4, and HH4 subbands which, when combined with the previously-decoded DC (LL4) subband, reconstructs the LL3 subband; this is a 2N×2M array version of the original I frame. Similarly, decoding further into the zero-trees successively reconstructs 4N×4M, 8N×8M, and 16N×16M arrays with the 16N×16M being the reconstructed original I frame. In short, the preferred embodiment methods simply additionally decode sufficient higher layers of an I frame with scalable encoding to create increasing resolution zooms. Of course, upsampling plus interpolation can adjust array size and provide intermediate zoom factors.
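The zoom progression can be illustrated with a toy wavelet. The sketch below substitutes an averaging Haar filter for whatever wavelet a real codec uses (an assumption made for simplicity), omits zero-tree entropy coding and quantization, and uses a synthetic 16×16 block in place of decoded data.

```python
def haar_analyze(img):
    """One 2-D averaging-Haar level: returns (LL, HL, LH, HH), each half size."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll = [[0.0] * w for _ in range(h)]
    hl = [[0.0] * w for _ in range(h)]
    lh = [[0.0] * w for _ in range(h)]
    hh = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            ll[y][x] = (a + b + c + d) / 4   # lowpass: local average
            hl[y][x] = (a - b + c - d) / 4   # horizontal detail
            lh[y][x] = (a + b - c - d) / 4   # vertical detail
            hh[y][x] = (a - b - c + d) / 4   # diagonal detail
    return ll, hl, lh, hh


def haar_synthesize(ll, hl, lh, hh):
    """Invert haar_analyze: combine four subbands into the next finer LL band."""
    h, w = len(ll), len(ll[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            s, p, q, r = ll[y][x], hl[y][x], lh[y][x], hh[y][x]
            out[2 * y][2 * x] = s + p + q + r
            out[2 * y][2 * x + 1] = s - p + q - r
            out[2 * y + 1][2 * x] = s + p - q - r
            out[2 * y + 1][2 * x + 1] = s - p - q + r
    return out


# Synthetic 16x16 "macroblock"; a real decoder would read coefficients instead.
frame = [[(7 * x + 13 * y) % 256 for x in range(16)] for y in range(16)]
details = []
ll = frame
for _ in range(4):
    ll, hl, lh, hh = haar_analyze(ll)
    details.append((hl, lh, hh))

# Start from the DC (LL4) band and add one detail level per zoom step.
zoom_levels = [ll]
for hl, lh, hh in reversed(details):
    zoom_levels.append(haar_synthesize(zoom_levels[-1], hl, lh, hh))
sizes = [len(level) for level in zoom_levels]
print(sizes)
```

Each zoom step doubles the side of the reconstructed array, and the last step returns the original block, mirroring the N×M to 16N×16M progression described above.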
- Second preferred embodiment methods of video thumbnail creation are similar to the first preferred embodiments but extract from a video clip with DCT-encoded I frames.
- Video coding methods such as MPEG-1/2/4 and MJPEG (motion JPEG) use DCT transforms for the I frames.
- MJPEG motion JPEG
- Each 8×8 DCT block has one DC coefficient and 63 AC coefficients.
- the DC coefficients form a 2N×2M luminance plus two N×M chrominance arrays, and thus provide a low resolution version of the encoded I frame.
- a 640×480-pixel encoded I frame (40×30 macroblocks) has DC coefficients forming an 80×60-pixel low resolution version of the frame.
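As a rough illustration of why the DC coefficients form a low-resolution picture: for an orthonormal 8×8 DCT-II, the DC coefficient equals the block sum divided by 8, that is, 8 times the block mean. The sketch below computes that value directly from pixels; it assumes an orthonormal transform and ignores quantization and level shifts, and `dc_thumbnail` is an illustrative name.

```python
def dc_thumbnail(luma, block=8):
    """One sample per block: the orthonormal-DCT DC value rescaled to pixel range."""
    rows, cols = len(luma) // block, len(luma[0]) // block
    thumb = []
    for by in range(rows):
        row = []
        for bx in range(cols):
            total = sum(
                luma[by * block + y][bx * block + x]
                for y in range(block)
                for x in range(block)
            )
            dc = total / block        # orthonormal 2-D DCT-II: DC = sum / N
            row.append(dc / block)    # DC / N is the block mean
        thumb.append(row)
    return thumb


# Synthetic 640x480 luma plane (480 rows of 640 samples).
luma = [[(x + y) % 256 for x in range(640)] for y in range(480)]
thumb = dc_thumbnail(luma)
print(len(thumb[0]), len(thumb))   # thumbnail width and height
```

A decoder working from the bitstream would obtain the same values by decoding only the DC coefficients, without touching the pixel data.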
- the second preferred embodiment methods parse a video clip to select the required I frames and initially decode just the DC coefficients to give a sequence of low-resolution frames as a video thumbnail.
- the DC coefficients may be separated from the AC coefficients by a dc marker, so the file parsing can be quite simple.
- the DC coefficients may be encoded using predictions from earlier-in-the-scan DC coefficients, so the decoding includes inverse prediction.
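Inverse prediction can be sketched as a running sum over decoded differences. The version below assumes the simplest scheme, in which each DC value is predicted from the previous one in scan order; real codecs may select the predictor adaptively, and `decode_dc` is an illustrative name.

```python
def decode_dc(differences, initial_predictor=0):
    """Rebuild DC values from differences against the previous DC in scan order."""
    values = []
    previous = initial_predictor
    for diff in differences:
        previous += diff       # prediction (previous DC) plus decoded residual
        values.append(previous)
    return values


dc_values = decode_dc([100, 4, -2, 0, 7])
print(dc_values)
```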
- the length of the video thumbnail is selected by the start and stop locations in the video clip, and the frame rate of the video thumbnail is determined by the frequency of I frames in the video clip.
- the second preferred embodiment video thumbnail methods provide zoom by decoding some of the AC coefficients in addition to the DC coefficients. For example, combining the three lowest frequency AC coefficients with the DC coefficient and then applying a 2×2 inverse DCT (and inverse quantization) gives a 4N×4M array version of the I frame which enhances the resolution of the DC-coefficient-only version.
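The 2×2 inverse DCT step can be sketched as follows. The code assumes orthonormal DCT-II transforms, rescales the retained coefficients by 2/8 to account for the change in block size, and omits inverse quantization; the helper names are illustrative.

```python
import math


def _c(k, n):
    """Orthonormal DCT-II scale factor for frequency index k and block size n."""
    return math.sqrt((1 if k == 0 else 2) / n)


def dct2(block):
    """Forward orthonormal 2-D DCT-II; result is indexed [vertical][horizontal]."""
    n = len(block)
    return [[_c(u, n) * _c(v, n) * sum(
                block[y][x]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for y in range(n) for x in range(n))
             for v in range(n)] for u in range(n)]


def idct2(coeffs):
    """Inverse orthonormal 2-D DCT-II for a square coefficient array."""
    n = len(coeffs)
    return [[sum(
                _c(u, n) * _c(v, n) * coeffs[u][v]
                * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                for u in range(n) for v in range(n))
             for x in range(n)] for y in range(n)]


def zoom_2x2(coeffs8):
    """Keep the DC and three lowest AC coefficients and invert them as a 2x2 block."""
    low = [[coeffs8[u][v] * 2 / 8 for v in range(2)] for u in range(2)]
    return idct2(low)


flat = [[50] * 8 for _ in range(8)]                 # uniform block
edge = [[0] * 4 + [100] * 4 for _ in range(8)]      # dark left half, bright right half
flat_zoom = zoom_2x2(dct2(flat))
edge_zoom = zoom_2x2(dct2(edge))
print(flat_zoom, edge_zoom)
```

The uniform block reproduces its value in all four output samples, while the edge block yields darker left samples and brighter right samples, detail that the DC-only version (a single value of 50) cannot show.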
- Third preferred embodiment methods are similar to the first and second preferred embodiments but extract video thumbnails from an H.264/AVC encoded video clip.
- the clip includes one or more coded video sequences with each coded video sequence consisting of a series of access units that are sequential in a network abstraction layer (NAL) unit stream and use only one sequence parameter set.
- NAL network abstraction layer
- Each access unit decodes to one picture.
- Each coded video sequence begins with an instantaneous decoding refresh (IDR) access unit which contains an Intra-coded picture, and the coded video sequence can be decoded without reference to any other coded video sequence.
- IDR instantaneous decoding refresh
- An access unit generally contains a set of video coding layer (VCL) NAL units which encode a picture plus various optional other NAL units such as delimiters, end of sequence, and supplemental enhancement information (SEI) non-VCL NAL units together with redundant VCL NAL units.
- VCL video coding layer
- SEI supplemental enhancement information
- the third preferred embodiment methods parse a video clip to select the access units with Intra-coded pictures (including the IDR access units), and extract frames for a video thumbnail.
- VCL NAL units consist of (start codes), headers, and payloads of slices or slice data partitions that represent the samples of the video picture encoded in the access unit.
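A minimal sketch of the parsing step, assuming the clip is stored as an H.264 Annex B byte stream: NAL units are located by their start codes, and the low five bits of the first header byte give `nal_unit_type`, where type 5 is a coded slice of an IDR picture. Intra pictures that are not IDR pictures use type 1, and identifying them requires reading the slice type from the slice header, which this sketch does not attempt. The sample bytes are fabricated placeholders, not a decodable stream.

```python
IDR_SLICE = 5  # nal_unit_type for a coded slice of an IDR picture


def iter_nal_units(stream):
    """Yield (nal_unit_type, header_offset) for each NAL unit in the byte stream."""
    position = 0
    while True:
        position = stream.find(b"\x00\x00\x01", position)
        if position < 0 or position + 3 >= len(stream):
            return
        header = stream[position + 3]
        yield header & 0x1F, position + 3
        position += 3


# Placeholder stream: SPS (7), PPS (8), IDR slice (5), non-IDR slice (1).
stream = (
    b"\x00\x00\x00\x01\x67\x42\xe0"
    b"\x00\x00\x01\x68\xce"
    b"\x00\x00\x01\x65\x88\x84"
    b"\x00\x00\x01\x41\x9a"
)
nal_types = [nal_type for nal_type, _ in iter_nal_units(stream)]
idr_offsets = [offset for nal_type, offset in iter_nal_units(stream) if nal_type == IDR_SLICE]
print(nal_types, idr_offsets)
```

Searching for the three-byte pattern also finds four-byte start codes, since those end with the same three bytes.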
- a picture may be partitioned into one or more slices, where a slice is a group of macroblocks (16×16 luma and 8×8 chroma) which are coded using only within-slice data plus any reference pictures.
- each slice can be coded using one of five coding types: (1) an “I slice” has all macroblocks encoded using intra prediction; (2) a “P slice” has at least some macroblocks coded using inter prediction with one motion-compensation prediction per block and the remaining macroblocks have I slice coding; (3) a “B slice” has at least some macroblocks coded using inter prediction with two motion-compensation predictions per block and the remaining macroblocks have P slice coding; (4) an “SP slice” is a “switching” P slice coded for efficient switching between pre-coded pictures; and (5) an “SI slice” is a switching I slice coded to provide an exact match of a macroblock in an SP slice which is useful for random access or error recovery.
- All luma and chroma samples of a macroblock are either spatially (intra) predicted or temporally (inter) predicted, and the prediction residual (error) is transform coded.
- the spatial prediction for a luma block can be one of Intra_4×4, Intra_16×16, or I_PCM (which skips the prediction).
- the spatial prediction uses already encoded blocks (above or to the left of the current block) and is performed in the spatial domain, not the transform domain. For an Intra coded picture, the predictions are all spatial and confined to the I slice containing the current macroblock.
- the transform coding utilizes 4×4 blocks and a 4×4 integer transform; the resulting coefficients consist of one DC coefficient and 15 AC coefficients.
- the resulting sixteen 4×4 transformed blocks yield 16 luma DC coefficients which form a 4×4 array, which is subjected to a second 4×4 integer transform (plus 2×2 transforms for each of the two corresponding chroma DC coefficient 2×2 arrays). After transformation, the coefficients are scaled and quantized.
- the third preferred embodiment decodes (inverse quantization, inverse scaling, inverse second integer transforms if Intra_16×16) the DC coefficients to reconstruct a 4N×4M-pixel array from an Intra-encoded N×M-macroblock picture. This forms a low resolution version of the Intra encoded picture. Again with the 640×480-pixel example, the DC coefficients define a 160×120-pixel low-resolution version.
- For zooming (4 to 1) include the AC coefficients of the Intra-encoded picture, and decode.
- Alternative video clip thumbnail preferred embodiments do not rely upon scalable encoded I frames for all of the zoom capability; rather, they are hybrids with one or more separate thumbnail files of differing resolution created and stored along with the video clip to provide further zoom levels; see FIG. 1 b.
- a fourth preferred embodiment uses the third preferred embodiment to provide two resolutions (one-sixteenth resolution and full resolution) plus has a third resolution provided by a separate file made up of one quarter resolution versions of the same Intra-coded pictures used by the third preferred embodiment.
- the thumbnail provides a 160×120 resolution sequence by decoding the DC coefficients of Intra pictures, a 640×480 resolution sequence by decoding entire Intra pictures, and a 320×240 resolution sequence from a separate (encoded) file of downsampled versions of the Intra pictures.
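The three-level arrangement can be written as a small lookup that maps a requested zoom factor to the source that provides it. The table contents follow the 640×480 example above; the names `ZOOM_LADDER` and `source_for_zoom` and the clamping behaviour for larger zoom factors are assumptions.

```python
# (width, height, where the frames come from), ordered from lowest resolution.
ZOOM_LADDER = [
    (160, 120, "DC coefficients of the Intra pictures in the clip"),
    (320, 240, "separate file of downsampled Intra pictures"),
    (640, 480, "full decode of the Intra pictures in the clip"),
]


def source_for_zoom(zoom_factor):
    """Return ((width, height), source) for a zoom relative to the lowest level."""
    wanted_width = ZOOM_LADDER[0][0] * zoom_factor
    for width, height, source in ZOOM_LADDER:
        if width >= wanted_width:
            return (width, height), source
    width, height, source = ZOOM_LADDER[-1]   # clamp to the highest resolution
    return (width, height), source


for factor in (1, 2, 4):
    print(factor, source_for_zoom(factor))
```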
- This middle resolution file may have any convenient encoding to compress its size.
- thumbnails may be used to provide further resolutions in addition to those available from the I frames.
- a preferred embodiment thumbnail for a video clip with I frames not having scalable encoding decodes the I frames only for zoom to the highest resolution; separate thumbnail file(s) would have the lowest (initial) resolution and any intermediate resolution(s) for intermediate zoom.
- a first separate thumbnail file provides a 160×120 resolution sequence created by downsampling the reference intra-coded pictures
- a second separate thumbnail file provides a 320×240 resolution sequence for 2 to 1 zoom, again created by downsampling reference pictures; and lastly a 640×480 resolution sequence for 4 to 1 zoom is obtained by decoding the reference pictures of the video clip.
- these separate thumbnail file(s) could be compressed.
- the preferred embodiments can be varied while retaining one or more of the features of extracting a video thumbnail from a video clip by decoding the base (low resolution) layer of Intra-coded pictures which have scalable encoding plus providing zoom by higher layer decoding and of multiple thumbnail files of varying resolution for zoom.
- the encoding method of the video clip could differ from the preferred embodiment examples of MPEG, H.264, . . . ; the picture/frame sizes of the video clip could vary; and so forth.
Abstract
Video thumbnails for browsing video clips compressed with methods including scalable intra-frame coding are created by extraction and decoding of the base layer Intra-coded pictures of a video clip. Zoom is available by additional decoding of higher layer(s) of the intra-coded pictures. No separate video thumbnail file is required.
Description
- The present invention relates to digital video storage, and more particularly to methods and structures for browsing stored video.
- Recent years have seen an explosion in the number of video clips that are being produced and archived. This is mainly due to the increasing popularity of streaming video, camera phones, and digital camcorders. As a result, methods for easy visual browsing of stored video archives are becoming important. Video thumbnails of video clips stored in an archive can be created to aid in the process of visual browsing of the archive. That is, video thumbnails are the extension of the popular concept of image thumbnails in that video thumbnails are lower spatial and/or temporal resolution versions of the original video clip which can be easily decoded and viewed to assess the contents of the corresponding full video clip.
- Known methods for video thumbnails include Macromedia Flash MX 2004, which provides a scheme to create video thumbnails. First, a user selects a set of reference frames from a given video clip, and then the user encodes the selected set of frames at a lower resolution as a separate file. Indeed, for a 10-seconds-long video clip with resolution of 240×180 pixels/frame and frame rate of 15 frames/sec, the video thumbnail could be a 2-seconds-long excerpt with resolution of 84×68 pixels/frame but the same frame rate of 15 frames/sec. Thus the video thumbnail would have a file size of roughly 3% of the original video clip file size.
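The "roughly 3%" figure follows from the duration and frame-area ratios, under the assumption (not stated in the text) that file size scales with the number of samples when the frame rate and coding method are the same. A short check, with an illustrative function name:

```python
def thumbnail_size_ratio(clip_seconds, clip_w, clip_h, thumb_seconds, thumb_w, thumb_h):
    """Ratio of sample counts (thumbnail / clip) at equal frame rate."""
    duration_ratio = thumb_seconds / clip_seconds
    area_ratio = (thumb_w * thumb_h) / (clip_w * clip_h)
    return duration_ratio * area_ratio


ratio = thumbnail_size_ratio(10, 240, 180, 2, 84, 68)
print(f"{ratio:.1%}")
```

The duration ratio is 0.2 and the area ratio is about 0.13, giving a product of about 0.026, consistent with the rounded figure in the text.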
- Of course, both the original video clip file and the associated video thumbnail file can be compressed using a standard video coding method. Indeed, various international standards for video coding have been and are continuing to be developed. Current standards, such as H.263, MPEG-2, MPEG-4, and H.264, use a hybrid of block motion compensation and transform coding for compression. Block motion compensation decomposes a picture into blocks for prediction by blocks of preceding pictures; this relies upon removal of temporal redundancies. Transform of blocks to a spatial frequency domain (and quantization) for coding relies upon removal of spatial redundancies. In this approach there are Intra-coded pictures (I frames) and Inter-coded pictures (P and B frames). An I frame has all Intra-coded blocks, each of which is coded by transforming the block to the frequency domain and (quantizing and) encoding; for example, a 16×16 macroblock may have its 8×8 blocks (4 luma blocks and 2 chroma blocks) transformed with a discrete cosine transformation (DCT) or may have its 4×4 blocks (16 luma blocks and 8 chroma blocks) transformed with an integer transform which approximates a DCT. In contrast, a P or B frame has at least one Inter-coded block, which is coded by finding the best prediction block in prior pictures (thereby defining its motion vector) and then transforming the residual block (i.e., the difference block between the current block and its prediction block) to the frequency domain for (quantizing and) encoding. Note that for an I frame, a non-block transform, such as a discrete wavelet transform (DWT), could be used in place of the block transform; MPEG-4 and JPEG2000 provide for DWT.
- FIG. 2 depicts the functions in typical block motion compensation video encoding using DCT and variable length coding (VLC) of the quantized transform coefficients (Q). For motion compensation (MC), inverse-quantization (IQ) and inverse DCT (IDCT) are needed for the feedback loop. Except for MC, all the functions in FIG. 2 operate on an 8×8 block basis. The rate-control unit in FIG. 2 is responsible for producing the quantizer scale (quantizer parameter, QP) according to the target bit-rate and buffer-fullness to control the DCT-coefficients quantization unit. Indeed, a larger quantizer scale implies more vanishing and/or smaller quantized coefficients which means fewer and/or shorter codewords.
- However, compressing both a video clip and its associated video thumbnail has problems of maintaining two separate files and the known video thumbnails have problems including a lack of zoom capability due to the low resolution.
- The present invention provides video thumbnails which include the use of intra-coded reference frames to create video thumbnails which may be embedded in the original video clips.
- FIGS. 1 a-1 b are flow diagrams.
- FIG. 2 illustrates the functions of hybrid block-based motion compensation plus DCT transform video encoding.
- FIGS. 3-4 show decoding.
- 1. Overview
- Preferred embodiment video thumbnail methods use I frames (I-vops) of an encoded video clip to extract (at least in part) a video thumbnail; and when the I frames have scalable encoding, the methods can use it for zoom. The video thumbnail may have an initial resolution determined by the low resolution of the extracted scalable encoded I frames, and the thumbnail zoom simply uses higher resolution decodings of the scalable encoded I frame. The thumbnail frame rate depends upon whether fast-forward or normal-speed motion is desired; for normal-speed motion, the video thumbnail frame rate will just be the rate of available I frames. In contrast, for fast forward with a high fraction of I frames (e.g., all I frames or IPIP . . . type clips), the methods may skip I frames to approximate the target frame rate. The same approach applies to pictures encoded either as progressive frames or as interlaced fields. The preferred embodiment video thumbnail is not a separate file, but rather a method to extract a video thumbnail from the original video clip. This permits full zoom capability in the video thumbnail and requires no additional files or storage for the video thumbnail. FIG. 1 a illustrates the flow (where the “next” frame may be the same frame if paused and zoomed), and FIG. 3 shows decoding for either the full video clip or for its preferred embodiment video thumbnail. FIG. 1 b illustrates a preferred embodiment hybrid of extraction of thumbnail frames from intra-coded frames of a video clip for zoom together with a separate low-resolution thumbnail file. This hybrid is useful when the video clip does not have scalable intra-encoding.
- Preferred embodiment systems (e.g., video clip archive with browser) perform preferred embodiment methods with digital signal processors (DSPs) or general-purpose programmable processors or application specific circuitry or systems on a chip (SoC) such as both a DSP and RISC processor on the same chip with the RISC processor controlling. Programs could be stored in memory in an onboard ROM or external flash EEPROM for a DSP or programmable processor to perform the signal processing of the preferred embodiment methods.
- 2. Wavelet I Frames
- First preferred embodiment methods extract video thumbnails from video clips encoded with a zero-tree wavelet transform for I frames; this would include encoding methods such as Motion JPEG2000 (sequence of I frames) and MPEG-4 with the encoding of still texture video objects by wavelet transform plus zero-tree coding. Thus presume an encoded video clip with a frame rate of 30 frames/sec, an I frame every n frames, and I frames with zero-tree encoding of discrete wavelet transform (DWT) coefficients. In particular, presume a four-level hierarchy wavelet decomposition into subbands; that is, four repetitions of both horizontal and vertical filtering with a highpass (wavelet kernel) and the complementary lowpass (scaling function kernel) plus decimation by 2. (This is repeated half-band filtering followed by critical decimation.) Thus each 16×16 macroblock of samples roughly yields 3 8×8 blocks (the HL1, LH1, and HH1 subbands), 3 4×4 blocks (the HL2, LH2, and HH2 subbands), 3 2×2 blocks (the HL3, LH3, and HH3 subbands), and 4 1×1 blocks (the LH4, HL4, HH4, and LL4 subbands) of wavelet coefficients; that is, the 256 pixels are transformed into 256 coefficients. The LL4 coefficient is termed the DC coefficient because it is the result of four repeated lowpass filterings plus decimations. These 256 coefficients are quantized and encoded as three zero trees with the tree roots as the LH4, HL4, and HH4 coefficients plus the DC coefficient. Thus a frame of N×M macroblocks yields an N×M array of DC coefficients which is a low-resolution version of the frame plus 3 N×M zero trees. For example, the DC subband of a 640×480-pixel (VGA) I frame is a 40×30-pixel low-resolution version of the I frame. Note that this array of DC coefficients is predictively encoded in MPEG-4; see FIG. 4 illustrating decoding with the top branch for the DC coefficients.
- The first preferred embodiment methods parse the video clip to select the required I frames and initially decode just the DC coefficients to give a sequence of low-resolution frames. The length and frame rate of the video thumbnail are selectable; the length is determined by where the extraction of I frames begins and ends in the video clip and the playback rate (normal-speed or fast forward).
- For example, if every sixth frame is an I frame, then the video thumbnail frame rate would be 5 frames/sec for a somewhat discontinuous-appearing playback at normal speed. However, if these same I frames are used in a 25 frames/sec playback, then this would appear as a 5× fast forward.
- In contrast, if the I frame rate is greater than the desired playback rate, such as for fast forward with a video clip of IPIP . . . frames, then skip I frames to achieve the desirable video thumbnail frame rate.
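The frame-rate arithmetic of the two preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the function names `i_frame_rate`, `playback_speedup`, and `select_i_frames` are hypothetical, and rounding the skip step to the nearest integer is an assumed rule that the text does not specify.

```python
def i_frame_rate(clip_fps, i_frame_interval):
    """Rate of I frames (frames/sec) when every i_frame_interval-th frame is an I frame."""
    return clip_fps / i_frame_interval


def playback_speedup(clip_fps, i_frame_interval, playback_fps):
    """Apparent speed-up when the extracted I frames are displayed at playback_fps."""
    return playback_fps / i_frame_rate(clip_fps, i_frame_interval)


def select_i_frames(i_frame_indices, available_rate, target_rate):
    """Keep every k-th I frame so that the kept rate approximates target_rate."""
    step = max(1, round(available_rate / target_rate))  # assumed rounding rule
    return i_frame_indices[::step]


# Example from the text: 30 frames/sec clip, every sixth frame is an I frame.
print(i_frame_rate(30, 6))          # 5.0 frames/sec at normal speed
print(playback_speedup(30, 6, 25))  # 5.0, i.e. a 5x fast forward
```

For an IPIP . . . clip at 30 frames/sec the I frame rate is 15 frames/sec, so a 5 frames/sec thumbnail would keep every third I frame under this rounding rule.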
- The first preferred embodiment video thumbnail methods provide zoom by decoding higher subbands in addition to the DC subband and combining. For example, if just the zero-tree roots are decoded, then this recovers the LH4, HL4, and HH4 subbands which, when combined with the previously-decoded DC (LL4) subband, reconstructs the LL3 subband; this is a 2N×2M array version of the original I frame. Similarly, decoding further into the zero-trees successively reconstructs 4N×4M, 8N×8M, and 16N×16M arrays with the 16N×16M being the reconstructed original I frame. In short, the preferred embodiment methods simply additionally decode sufficient higher layers of an I frame with scalable encoding to create increasing resolution zooms. Of course, upsampling plus interpolation can adjust array size and provide intermediate zoom factors.
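The subband-based zoom described above can be illustrated with a small pure-Python sketch. It assumes a Haar kernel for simplicity (actual codecs use other wavelet filters), omits quantization and zero-tree coding entirely, and uses hypothetical helper names (`haar_analyze`, `haar_synthesize`, `encode`, `decode`); the HL/LH labels follow one common convention.

```python
def haar_analyze(img):
    """One level of 2-D Haar analysis; returns (LL, HL, LH, HH), each half-size."""
    h, w = len(img) // 2, len(img[0]) // 2
    ll, hl, lh, hh = ([[0.0] * w for _ in range(h)] for _ in range(4))
    for y in range(h):
        for x in range(w):
            a, b = img[2 * y][2 * x], img[2 * y][2 * x + 1]
            c, d = img[2 * y + 1][2 * x], img[2 * y + 1][2 * x + 1]
            ll[y][x] = (a + b + c + d) / 4  # lowpass in both directions
            hl[y][x] = (a - b + c - d) / 4  # horizontal detail
            lh[y][x] = (a + b - c - d) / 4  # vertical detail
            hh[y][x] = (a - b - c + d) / 4  # diagonal detail
    return ll, hl, lh, hh


def haar_synthesize(ll, hl, lh, hh):
    """Inverse of haar_analyze: rebuild the next-finer LL band (twice the size)."""
    h, w = len(ll), len(ll[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for y in range(h):
        for x in range(w):
            s, p, q, r = ll[y][x], hl[y][x], lh[y][x], hh[y][x]
            out[2 * y][2 * x] = s + p + q + r
            out[2 * y][2 * x + 1] = s - p + q - r
            out[2 * y + 1][2 * x] = s + p - q - r
            out[2 * y + 1][2 * x + 1] = s - p - q + r
    return out


def encode(img, levels=4):
    """Return (DC subband, details); details[0] is the coarsest (HL, LH, HH) triple."""
    details, ll = [], img
    for _ in range(levels):
        ll, hl, lh, hh = haar_analyze(ll)
        details.insert(0, (hl, lh, hh))
    return ll, details


def decode(dc, details, zoom_steps=0):
    """Start from the DC subband and add zoom_steps further detail levels."""
    ll = dc
    for hl, lh, hh in details[:zoom_steps]:
        ll = haar_synthesize(ll, hl, lh, hh)
    return ll
```

With `zoom_steps=0` the result is the N×M DC thumbnail; each additional step doubles both dimensions, and four steps reproduce the original frame in this lossless sketch.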
- 3. DCT I Frames
- Second preferred embodiment methods of video thumbnail creation are similar to the first preferred embodiments but extract from a video clip with DCT-encoded I frames. Video coding methods such as MPEG-1/2/4 and MJPEG (motion JPEG) use DCT transforms for the I frames. Thus presume encoded I frames as N×M arrays of macroblocks with each macroblock a set of four 8×8 quantized luminance DCT blocks and two quantized 8×8 chrominance DCT blocks. Each 8×8 DCT block has one DC coefficient and 63 AC coefficients. The DC coefficients form a 2N×2M luminance plus two N×M chrominance arrays, and thus provide a low resolution version of the encoded I frame. For example, a 640×480-pixel encoded I frame (40×30 macroblocks) has DC coefficients forming an 80×60-pixel low resolution version of the frame.
- The second preferred embodiment methods parse a video clip to select the required I frames and initially decode just the DC coefficients to give a sequence of low-resolution frames as a video thumbnail. Note that the DC coefficients may be separated from the AC coefficients by a dc marker, so the file parsing can be quite simple. Also, the DC coefficients may be encoded using predictions from earlier-in-the-scan DC coefficients, so the decoding includes inverse prediction. The length of the video thumbnail is selected by the start and stop locations in the video clip, and the frame rate of the video thumbnail is determined by the frequency of I frames in the video clip.
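The inverse prediction mentioned above can be sketched as a running sum over decoded differences. This is a minimal sketch assuming the simplest possible predictor, namely the previous DC value in scan order with an initial predictor of 0; actual standards define their own predictor selection and reset rules, and the function names here are hypothetical.

```python
def encode_dc_differences(dc_values, initial_predictor=0):
    """Replace each DC value by its difference from the previous one (assumed predictor)."""
    diffs, pred = [], initial_predictor
    for dc in dc_values:
        diffs.append(dc - pred)
        pred = dc
    return diffs


def decode_dc_differences(diffs, initial_predictor=0):
    """Inverse prediction: accumulate the differences to recover the DC values."""
    dcs, pred = [], initial_predictor
    for diff in diffs:
        pred += diff
        dcs.append(pred)
    return dcs
```

Because each DC value depends on its predecessors, a thumbnail decoder must process the DC differences in scan order even though it ignores the AC coefficients.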
- The second preferred embodiment video thumbnail methods provide zoom by decoding some of the AC coefficients in addition to the DC coefficients. For example, combining the three lowest frequency AC coefficients with the DC coefficient and then applying inverse quantization and a 2×2 inverse DCT gives a 4N×4M array version of the I frame which enhances the resolution of the DC-coefficient-only version.
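The 2×2 inverse DCT step can be illustrated as follows. The sketch assumes an orthonormal DCT-II, already inverse-quantized coefficients, and a rescaling of the retained coefficients by 2/8 so that block means are preserved; the helper names (`dct2`, `idct2`, `dc_only_pixel`, `zoom_2x2`) are hypothetical and not taken from any codec API.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k holds frequency k)."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Forward 2-D DCT: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def dc_only_pixel(coeffs8):
    """One thumbnail pixel per 8x8 block: the block mean recovered from the DC term."""
    return coeffs8[0][0] / 8


def zoom_2x2(coeffs8):
    """Four pixels per 8x8 block from the DC and three lowest-frequency AC terms."""
    low = [[coeffs8[r][c] * 2 / 8 for c in range(2)] for r in range(2)]  # assumed rescale
    return idct2(low)
```

For a block whose left half is dark and right half is bright, `dc_only_pixel` returns only the average, while `zoom_2x2` already separates a darker left column from a brighter right column.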
- 4. H.264/AVC
- Third preferred embodiment methods are similar to the first and second preferred embodiments but extract video thumbnails from an H.264/AVC encoded video clip. In particular, presume the clip includes one or more coded video sequences with each coded video sequence consisting of a series of access units that are sequential in a network abstraction layer (NAL) unit stream and use only one sequence parameter set. Each access unit decodes to one picture. Each coded video sequence begins with an instantaneous decoding refresh (IDR) access unit which contains an Intra-coded picture, and the coded video sequence can be decoded without reference to any other coded video sequence. An access unit generally contains a set of video coding layer (VCL) NAL units which encode a picture plus various optional other NAL units such as delimiters, end of sequence, and supplemental enhancement information (SEI) non-VCL NAL units together with redundant VCL NAL units.
- The third preferred embodiment methods parse a video clip to select the access units with Intra-coded pictures (including the IDR access units), and extract frames for a video thumbnail.
- VCL NAL units consist of (start codes), headers, and payloads of slices or slice data partitions that represent the samples of the video picture encoded in the access unit. A picture may be partitioned into one or more slices, where a slice is a group of macroblocks (16×16 luma and 8×8 chroma) which are coded using only within-slice data plus any reference pictures. In particular, each slice can be coded using one of five coding types: (1) an “I slice” has all macroblocks encoded using intra prediction; (2) a “P slice” has at least some macroblocks coded using inter prediction with one motion-compensation prediction per block and the remaining macroblocks have I slice coding; (3) a “B slice” has at least some macroblocks coded using inter prediction with two motion-compensation predictions per block and the remaining macroblocks have P slice coding; (4) an “SP slice” is a “switching” P slice coded for efficient switching between pre-coded pictures; and (5) an “SI slice” is a switching I slice coded to provide an exact match of a macroblock in an SP slice which is useful for random access or error recovery.
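Locating IDR access units in a stream can be sketched as a scan for start codes followed by a check of the NAL unit type. The sketch assumes an Annex B style byte stream with 0x000001 start codes and relies on the H.264 convention that the low five bits of the first NAL byte give `nal_unit_type`, with 5 marking a slice of an IDR picture. Finding non-IDR pictures that are entirely intra-coded would additionally require parsing slice headers, which is not shown; the function names are hypothetical.

```python
NAL_SLICE_NON_IDR = 1  # coded slice of a non-IDR picture
NAL_SLICE_IDR = 5      # coded slice of an IDR picture


def iter_nal_units(stream):
    """Yield (offset of NAL header byte, nal_unit_type) for each start code found."""
    pos = 0
    while True:
        pos = stream.find(b"\x00\x00\x01", pos)
        if pos < 0 or pos + 3 >= len(stream):
            return
        header = stream[pos + 3]
        yield pos + 3, header & 0x1F  # low 5 bits carry nal_unit_type
        pos += 3


def idr_offsets(stream):
    """Offsets of NAL units that carry slices of IDR pictures."""
    return [off for off, kind in iter_nal_units(stream) if kind == NAL_SLICE_IDR]
```

A thumbnail extractor built along these lines would hand only the selected NAL units to the decoder and skip everything else in the stream.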
- All luma and chroma samples of a macroblock are either spatially (intra) predicted or temporally (inter) predicted, and the prediction residual (error) is transform coded. The spatial prediction for a luma block can be one of Intra_4×4, Intra_16×16, or I_PCM (which skips the prediction). The spatial prediction uses already encoded blocks (above or to the left of the current block) and is performed in the spatial domain, not the transform domain. For an Intra coded picture, the predictions are all spatial and confined to the I slice containing the current macroblock.
- The transform coding utilizes 4×4 blocks and a 4×4 integer transform; the resulting coefficients consist of one DC coefficient and 15 AC coefficients. For macroblocks which had been Intra_16×16 predicted in the spatial domain, the resulting 16 4×4 transformed blocks yield 16 luma DC coefficients which form a 4×4 array that is subjected to a second 4×4 integer transform (plus 2×2 transforms for each of the corresponding two chroma DC coefficient 2×2 arrays). After transformation, the coefficients are scaled and quantized.
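The second transform on the 4×4 array of luma DC coefficients can be sketched with a Hadamard-type matrix of the kind commonly given for this step. Treat the matrix as an assumption to check against the standard; the sketch omits the standard's normalization, scaling, and rounding, and the names `H4` and `hadamard_4x4` are hypothetical.

```python
H4 = [[1, 1, 1, 1],
      [1, 1, -1, -1],
      [1, -1, -1, 1],
      [1, -1, 1, -1]]


def matmul4(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hadamard_4x4(block):
    """Compute H4 * block * H4 (H4 is symmetric, so no transpose is needed)."""
    return matmul4(matmul4(H4, block), H4)
```

Since `H4` multiplied by itself is four times the identity, applying `hadamard_4x4` twice returns the input scaled by 16; a decoder that needs only the DC coefficients therefore inverts this step with one more small integer transform plus rescaling.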
- The third preferred embodiment decodes (inverse quantization, inverse scaling, inverse second integer transforms if Intra_16×16) the DC coefficients to reconstruct a 4N×4M-pixel array from an Intra-encoded N×M-macroblock picture. This forms a low resolution version of the Intra encoded picture. Again with the 640×480-pixel example, the DC coefficients define a 160×120-pixel low-resolution version.
- For zooming (4 to 1), include the AC coefficients of the Intra-encoded picture, and decode.
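The DC-only thumbnail sizes quoted for the three embodiments follow from the number of DC samples each 16×16 macroblock contributes per side. A small sketch of that arithmetic, with hypothetical scheme labels chosen only for this illustration:

```python
# DC samples contributed per macroblock side (labels are illustrative only).
DC_SAMPLES_PER_MACROBLOCK_SIDE = {
    "wavelet_4_level": 1,  # one LL4 coefficient per 16x16 macroblock
    "dct_8x8": 2,          # 2x2 luminance DC coefficients per macroblock
    "h264_4x4": 4,         # 4x4 luma DC coefficients per macroblock
}


def dc_thumbnail_size(width, height, scheme):
    """Luma size of the DC-only thumbnail for a frame of width x height pixels."""
    per_side = DC_SAMPLES_PER_MACROBLOCK_SIDE[scheme]
    return (width // 16) * per_side, (height // 16) * per_side


for scheme in DC_SAMPLES_PER_MACROBLOCK_SIDE:
    print(scheme, dc_thumbnail_size(640, 480, scheme))
```

For the 640×480 example this gives 40×30, 80×60, and 160×120 respectively, matching the figures stated in sections 2 through 4.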
- 5. Hybrid Thumbnails
- Alternative video clip thumbnail preferred embodiments do not rely upon scalable encoded I frames for all of the zoom capability; rather, they are hybrids with one or more separate thumbnail files of differing resolution created and stored along with the video clip to provide further zoom levels; see FIG. 1 b.
- In particular, a fourth preferred embodiment uses the third preferred embodiment to provide two resolutions (one-sixteenth resolution and full resolution) plus has a third resolution provided by a separate file made up of one quarter resolution versions of the same Intra-coded pictures used by the third preferred embodiment. Thus for the 640×480-pixel example, the thumbnail provides a 160×120 resolution sequence by decoding the DC coefficients of Intra pictures, a 640×480 resolution sequence by decoding entire Intra pictures, and a 320×240 resolution sequence from a separate (encoded) file of downsampled versions of the Intra pictures. This middle resolution file may have any convenient encoding to compress its size.
- Further, more than one separate thumbnail file may be used to provide further resolutions in addition to those available from the I frames. For example, a preferred embodiment thumbnail for a video clip with I frames not having scalable encoding, decodes the I frames only for zoom to the highest resolution; separate thumbnail file(s) would have the lowest (initial) resolution and any intermediate resolution(s) for intermediate zoom. Again, for the 640×480-pixel example, a first separate thumbnail file provides a 160×120 resolution sequence created by downsampling the reference intra-coded pictures, (optionally) a second separate thumbnail file provides a 320×240 resolution sequence for 2 to 1 zoom and created again by downsampling reference pictures; and lastly a 640×480 resolution sequence for 4 to 1 zoom by decoding the reference pictures of the video clip. Of course, these separate thumbnail file(s) could be compressed.
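The two hybrid arrangements described above can be summarized as a lookup from zoom level to the source of the frames. This is only a sketch: the function name `hybrid_zoom_ladder`, the `dc_from_clip` flag, and the source labels are hypothetical.

```python
def hybrid_zoom_ladder(width, height, dc_from_clip=True):
    """List (resolution, source) pairs from lowest to highest zoom level.

    dc_from_clip=True models the fourth embodiment (lowest level decoded from DC
    coefficients); False models a clip without scalable intra encoding, where the
    lowest level also comes from a separate thumbnail file.
    """
    low_source = ("DC coefficients of Intra pictures in the clip"
                  if dc_from_clip else "separate thumbnail file")
    return [
        ((width // 4, height // 4), low_source),
        ((width // 2, height // 2), "separate thumbnail file"),
        ((width, height), "full decode of Intra pictures in the clip"),
    ]


for resolution, source in hybrid_zoom_ladder(640, 480):
    print(resolution, source)
```

In both arrangements the highest zoom level is always obtained from the video clip itself, so the separate files only need to cover the lower resolutions.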
- 6. Modifications
- The preferred embodiments can be varied while retaining one or more of the features of extracting a video thumbnail from a video clip by decoding the base (low resolution) layer of Intra-coded pictures which have scalable encoding plus providing zoom by higher layer decoding and of multiple thumbnail files of varying resolution for zoom.
- For example, the encoding method of the video clip could differ from the preferred embodiment examples of MPEG, H.264, . . . ; the picture/frame sizes of the video clip could vary; and so forth.
Claims (7)
1. A method of extracting a video thumbnail, comprising:
(a) providing a video clip including a sequence of intra-coded frames and inter-coded frames; and
(b) decoding a first plurality of said intra-coded frames at a first resolution without decoding inter-coded frames.
2. The method of claim 1, further comprising:
(a) decoding a second frame of said intra-coded frames at a second resolution, wherein said second resolution differs from said first resolution and wherein said second frame may be included in said first plurality of said intra-coded frames.
3. The method of claim 1, wherein:
(a) said intra-coded frames are H.264 encoded frames; and
(b) said first resolution is a resolution of DC coefficients of intra-coded frames.
4. The method of claim 1, further comprising:
(a) providing a second sequence of frames at a second resolution, said frames of said second sequence of frames corresponding to frames of said sequence of intra-coded frames; and
(b) decoding a second frame of said second sequence of frames, wherein said second resolution differs from said first resolution.
5. A video thumbnail with zoom, comprising:
(a) a first sequence of frames of a first resolution, said first sequence of frames corresponding to frames of a video clip; and
(b) a second sequence of frames of a second resolution, said second sequence of frames corresponding to frames of said first sequence, and said second resolution differing from said first resolution.
6. The video thumbnail of claim 5, wherein:
(a) said first sequence of frames correspond to a third sequence of intra-coded frames of said video clip.
7. A method of video thumbnail with zoom, comprising:
(a) providing a video clip and a first sequence of frames wherein said first sequence of frames corresponds to a second sequence of intra-coded frames of said video clip;
(b) decoding a first plurality of said first sequence of frames at a first resolution; and
(c) when zoom is desired, decoding a second frame of said second sequence at a second resolution wherein said second resolution is greater than said first resolution.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/095,286 US20060227871A1 (en) | 2005-03-31 | 2005-03-31 | Video thumbnail method |
JP2006094260A JP2006304287A (en) | 2005-03-31 | 2006-03-30 | Video thumbnail method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/095,286 US20060227871A1 (en) | 2005-03-31 | 2005-03-31 | Video thumbnail method |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060227871A1 (en) | 2006-10-12 |
Family
ID=37083140
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/095,286 Abandoned US20060227871A1 (en) | 2005-03-31 | 2005-03-31 | Video thumbnail method |
Country Status (2)
Country | Link |
---|---|
US (1) | US20060227871A1 (en) |
JP (1) | JP2006304287A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070206673A1 (en) * | 2005-12-08 | 2007-09-06 | Stephen Cipolli | Systems and methods for error resilience and random access in video communication systems |
US20070230566A1 (en) * | 2006-03-03 | 2007-10-04 | Alexandros Eleftheriadis | System and method for providing error resilience, random access and rate control in scalable video communications |
EP1940178A2 (en) * | 2006-12-28 | 2008-07-02 | Samsung Electronics Co., Ltd. | Generating a reduced image from a compressed original image comprising blocks encoded by intra prediction |
EP1944979A2 (en) * | 2007-01-10 | 2008-07-16 | Samsung Electronics Co., Ltd. | Method for generating reduced image of original image comprising adaptively encoded macroblocks, and image apparatus thereof |
US20080225155A1 (en) * | 2007-03-15 | 2008-09-18 | Sony Corporation | Information processing apparatus, imaging apparatus, image display control method and computer program |
US20090074377A1 (en) * | 2007-09-19 | 2009-03-19 | Herz William S | Video navigation system and method |
US20090160933A1 (en) * | 2007-12-19 | 2009-06-25 | Herz William S | Video perspective navigation system and method |
US20090161762A1 (en) * | 2005-11-15 | 2009-06-25 | Dong-San Jun | Method of scalable video coding for varying spatial scalability of bitstream in real time and a codec using the same |
US20090225870A1 (en) * | 2008-03-06 | 2009-09-10 | General Instrument Corporation | Method and apparatus for decoding an enhanced video stream |
US20100142614A1 (en) * | 2007-04-25 | 2010-06-10 | Purvin Bibhas Pandit | Inter-view prediction |
US20110142359A1 (en) * | 2008-08-13 | 2011-06-16 | University-Industry Cooperation Group Of Kyung-Hee University | Method for generating thumbnail image in image frame of the h.264 standard |
US20120224640A1 (en) * | 2011-03-04 | 2012-09-06 | Qualcomm Incorporated | Quantized pulse code modulation in video coding |
US9167246B2 (en) | 2008-03-06 | 2015-10-20 | Arris Technology, Inc. | Method and apparatus for decoding an enhanced video stream |
US20160100011A1 (en) * | 2014-10-07 | 2016-04-07 | Samsung Electronics Co., Ltd. | Content processing apparatus and content processing method thereof |
US11323724B2 (en) * | 2011-07-21 | 2022-05-03 | Texas Instruments Incorporated | Methods and systems for chroma residual data prediction |
WO2022149156A1 (en) * | 2021-01-05 | 2022-07-14 | Sling Media Pvt Ltd. | Method and apparatus for thumbnail generation for a video device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5635985A (en) * | 1994-10-11 | 1997-06-03 | Hitachi America, Ltd. | Low cost joint HD/SD television decoder methods and apparatus |
US6400768B1 (en) * | 1998-06-19 | 2002-06-04 | Sony Corporation | Picture encoding apparatus, picture encoding method, picture decoding apparatus, picture decoding method and presentation medium |
US7471834B2 (en) * | 2000-07-24 | 2008-12-30 | Vmark, Inc. | Rapid production of reduced-size images from compressed video streams |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090161762A1 (en) * | 2005-11-15 | 2009-06-25 | Dong-San Jun | Method of scalable video coding for varying spatial scalability of bitstream in real time and a codec using the same |
US20120069135A1 (en) * | 2005-12-08 | 2012-03-22 | Stephen Cipolli | Systems and methods for error resilience and random access in video communication systems |
US20070206673A1 (en) * | 2005-12-08 | 2007-09-06 | Stephen Cipolli | Systems and methods for error resilience and random access in video communication systems |
US9179160B2 (en) | 2005-12-08 | 2015-11-03 | Vidyo, Inc. | Systems and methods for error resilience and random access in video communication systems |
US8804848B2 (en) * | 2005-12-08 | 2014-08-12 | Vidyo, Inc. | Systems and methods for error resilience and random access in video communication systems |
US9077964B2 (en) | 2005-12-08 | 2015-07-07 | Layered Media | Systems and methods for error resilience and random access in video communication systems |
US9307199B2 (en) | 2006-03-03 | 2016-04-05 | Vidyo, Inc. | System and method for providing error resilience, random access and rate control in scalable video communications |
US9270939B2 (en) | 2006-03-03 | 2016-02-23 | Vidyo, Inc. | System and method for providing error resilience, random access and rate control in scalable video communications |
US20070230566A1 (en) * | 2006-03-03 | 2007-10-04 | Alexandros Eleftheriadis | System and method for providing error resilience, random access and rate control in scalable video communications |
US8693538B2 (en) | 2006-03-03 | 2014-04-08 | Vidyo, Inc. | System and method for providing error resilience, random access and rate control in scalable video communications |
EP1940178A3 (en) * | 2006-12-28 | 2014-06-11 | Samsung Electronics Co., Ltd. | Generating a reduced image from a compressed original image comprising blocks encoded by intra prediction |
EP1940178A2 (en) * | 2006-12-28 | 2008-07-02 | Samsung Electronics Co., Ltd. | Generating a reduced image from a compressed original image comprising blocks encoded by intra prediction |
EP1944979A2 (en) * | 2007-01-10 | 2008-07-16 | Samsung Electronics Co., Ltd. | Method for generating reduced image of original image comprising adaptively encoded macroblocks, and image apparatus thereof |
US8265147B2 (en) | 2007-01-10 | 2012-09-11 | Samsung Electronics Co., Ltd. | Method for generating reduced image of original image comprising adaptively encoded macroblocks, and image apparatus thereof |
KR101498044B1 (en) * | 2007-01-10 | 2015-03-05 | 삼성전자주식회사 | Method for generating small image of compressed image adaptivly encoded macro block and apparatus thereof |
EP1944979A3 (en) * | 2007-01-10 | 2008-12-03 | Samsung Electronics Co., Ltd. | Method for generating reduced image of original image comprising adaptively encoded macroblocks, and image apparatus thereof |
US20080225155A1 (en) * | 2007-03-15 | 2008-09-18 | Sony Corporation | Information processing apparatus, imaging apparatus, image display control method and computer program |
US8760554B2 (en) * | 2007-03-15 | 2014-06-24 | Sony Corporation | Information processing apparatus, imaging apparatus, image display control method and computer program |
US9143691B2 (en) | 2007-03-15 | 2015-09-22 | Sony Corporation | Apparatus, method, and computer-readable storage medium for displaying a first image and a second image corresponding to the first image |
US20100142614A1 (en) * | 2007-04-25 | 2010-06-10 | Purvin Bibhas Pandit | Inter-view prediction |
US10313702B2 (en) | 2007-04-25 | 2019-06-04 | Interdigital Madison Patent Holdings | Inter-view prediction |
US20090074377A1 (en) * | 2007-09-19 | 2009-03-19 | Herz William S | Video navigation system and method |
US8942536B2 (en) * | 2007-09-19 | 2015-01-27 | Nvidia Corporation | Video navigation system and method |
US8683067B2 (en) | 2007-12-19 | 2014-03-25 | Nvidia Corporation | Video perspective navigation system and method |
US20090160933A1 (en) * | 2007-12-19 | 2009-06-25 | Herz William S | Video perspective navigation system and method |
US8369415B2 (en) * | 2008-03-06 | 2013-02-05 | General Instrument Corporation | Method and apparatus for decoding an enhanced video stream |
US9167246B2 (en) | 2008-03-06 | 2015-10-20 | Arris Technology, Inc. | Method and apparatus for decoding an enhanced video stream |
US20090225870A1 (en) * | 2008-03-06 | 2009-09-10 | General Instrument Corporation | Method and apparatus for decoding an enhanced video stream |
US20210409782A1 (en) * | 2008-03-06 | 2021-12-30 | Arris Enterprises Llc | Method and apparatus for decoding an enhanced video stream |
US11722702B2 (en) * | 2008-03-06 | 2023-08-08 | Bison Patent Licensing LLC | Method and apparatus for decoding an enhanced video stream |
US11146822B2 (en) * | 2008-03-06 | 2021-10-12 | Arris Enterprises Llc | Method and apparatus for decoding an enhanced video stream |
US9854272B2 (en) | 2008-03-06 | 2017-12-26 | Arris Enterprises, Inc. | Method and apparatus for decoding an enhanced video stream |
US20180103272A1 (en) * | 2008-03-06 | 2018-04-12 | Arris Enterprises Llc | Method and apparatus for decoding an enhanced video stream |
US10616606B2 (en) * | 2008-03-06 | 2020-04-07 | Arris Enterprises Llc | Method and apparatus for decoding an enhanced video stream |
US20110142359A1 (en) * | 2008-08-13 | 2011-06-16 | University-Industry Cooperation Group Of Kyung-Hee University | Method for generating thumbnail image in image frame of the h.264 standard |
US8666182B2 (en) * | 2008-08-13 | 2014-03-04 | University-Industry Cooperation Group Of Kyung-Hee University | Method for generating thumbnail image in image frame of the H.264 standard |
US10200689B2 (en) * | 2011-03-04 | 2019-02-05 | Qualcomm Incorporated | Quantized pulse code modulation in video coding |
US20120224640A1 (en) * | 2011-03-04 | 2012-09-06 | Qualcomm Incorporated | Quantized pulse code modulation in video coding |
US11323724B2 (en) * | 2011-07-21 | 2022-05-03 | Texas Instruments Incorporated | Methods and systems for chroma residual data prediction |
US20160100011A1 (en) * | 2014-10-07 | 2016-04-07 | Samsung Electronics Co., Ltd. | Content processing apparatus and content processing method thereof |
WO2022149156A1 (en) * | 2021-01-05 | 2022-07-14 | Sling Media Pvt Ltd. | Method and apparatus for thumbnail generation for a video device |
Also Published As
Publication number | Publication date |
---|---|
JP2006304287A (en) | 2006-11-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: TEXAS INSTRUMENTS INCORPORATED, TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BUDAGAVI, MADHUKAR;REEL/FRAME:016024/0466 Effective date: 20050513 |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |