US20100161692A1 - Scalable video coding (svc) file format - Google Patents

Scalable video coding (svc) file format

Info

Publication number
US20100161692A1
Authority
US
United States
Prior art keywords
data stream
sub
file format
scalable
svc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/721,383
Inventor
Mohammed Zubair Visharam
Ali Tabatabai
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Corp
Sony Electronics Inc
Original Assignee
Sony Corp
Sony Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corp, Sony Electronics Inc
Priority to US12/721,383
Assigned to SONY CORPORATION, SONY ELECTRONICS INC. (assignment of assignors interest; see document for details). Assignors: TABATABAI, ALI; VISHARAM, MOHAMMED ZUBAIR
Publication of US20100161692A1
Legal status: Abandoned

Classifications

    • G - PHYSICS
    • G11 - INFORMATION STORAGE
    • G11B - INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 - Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10 - Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/19 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier
    • G11B27/28 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording
    • G11B27/30 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording
    • G11B27/3027 - Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information detectable on the record carrier by using information signals recorded by the same method as the main recording on the same track as the main recording used signal is digitally coded
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 - Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/2343 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234327 - Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 - Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 - Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25808 - Management of client data
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 - Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 - Structuring of content, e.g. decomposing content into time segments
    • H04N21/8451 - Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 - Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 - Assembly of content; Generation of multimedia applications
    • H04N21/854 - Content authoring
    • H04N21/85406 - Content authoring involving a specific file format, e.g. MP4 format
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 - TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S - TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 - Data processing: database and file management or data structures
    • Y10S707/912 - Applications of a database
    • Y10S707/913 - Multimedia
    • Y10S707/914 - Video
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 - TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S - TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 - Data processing: database and file management or data structures
    • Y10S707/99951 - File or database maintenance
    • Y10S707/99952 - Coherency, e.g. same view to multiple users

Definitions

  • the present invention relates to the field of video encoding. More particularly, the present invention relates to the field of SVC encoding and extending the current AVC file format to support the storage of video coded data using scalable video coding.
  • a file format is a particular way to encode information for storage in a computer file.
  • the conventional manner of storing the format of a file is to explicitly store information about the format in the file system. This approach keeps the metadata separate from both the main data and the file name.
  • the ISO Base Media File Format is designed to contain timed media information or media data streams, such as a movie.
  • the stored media information can be transmitted locally or via a network or other stream delivery mechanism.
  • the files have a logical structure, a time structure, and a physical structure.
  • the logical structure of the file includes a set of time-parallel tracks.
  • the time structure of the file provides the tracks with sequences of data samples in time, and those sequences are mapped into a timeline of the overall media data stream by optional edit lists.
  • the physical structure of the file separates the data needed for logic, time, and structural de-composition, from the media data samples themselves. This structural information is concentrated in a metadata box, possibly extended in time by metadata fragment boxes.
  • the metadata box documents the logical and timing relationships of the data samples, and also includes pointers to where the data samples are stored.
  • Each media data stream is included in a track specialized for that media type (audio, video etc.), and is further parameterized by a sample entry.
  • the sample entry includes the ‘name’ of the exact media type, for example the type of the decoder needed to decode the media data stream, and other parameters needed for decoding.
  • Metadata takes two forms. First, timed metadata is stored in an appropriate track and synchronized with the media data it is describing. Second, there is general support for non-timed metadata attached to the media data stream or to an individual track. These generalized metadata structures can also be used at the file level in the form of a metadata box. In this case, the metadata box is the primary access means to the stored media data streams.
  • sample groups permit the documentation of arbitrary characteristics that are shared by some of the data samples in a track.
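As a rough illustration of the physical structure described above, the sketch below walks the top-level boxes of an ISO Base Media file. It assumes the generic box header layout of the ISO Base Media File Format (a 32-bit big-endian size followed by a four-character type, where size 1 signals a following 64-bit size and size 0 means the box runs to the end of the file). The names `iter_boxes` and `make_box` and the sample bytes are hypothetical and are not taken from this document.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, payload_offset, payload_size) for each top-level box."""
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif size == 0:  # box extends to the end of the file
            size = len(data) - pos
        if size < header or pos + size > len(data):
            break  # malformed or truncated box: stop rather than guess
        yield box_type.decode("ascii", "replace"), pos + header, size - header
        pos += size


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a plain 32-bit size header (illustrative helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# A toy file: file type box, an empty metadata box, and a media data box.
sample_file = (
    make_box(b"ftyp", b"isom\x00\x00\x00\x00")
    + make_box(b"moov", b"")
    + make_box(b"mdat", b"\x01\x02\x03")
)
boxes = list(iter_boxes(sample_file))
```

The walker only reads headers, which mirrors the separation the text describes: structural information can be located without touching the media data samples themselves.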
  • AVC Advanced Video Coding
  • the AVC file format defines a storage format for video streams encoded according to the AVC standard.
  • the AVC file format extends the ISO Base Media File Format.
  • the AVC file format enables AVC video streams to be used in conjunction with other media streams, such as audio, to be formatted for delivery by a streaming server, using hint tracks, and to inherit all the use cases and features of the ISO Base Media File Format.
  • FIG. 1 illustrates an exemplary configuration of an AVC file format 10 including a media data section 20 and a metadata section 30.
  • Each data stream is stored in the media data section 20.
  • Multiple data streams can be stored in one file format.
  • four data streams 22, 24, 26, and 28 are stored in the media data section 20.
  • a track 32 corresponds to the data stream 22, a track 33 corresponds to the data stream 24, a track 36 corresponds to the data stream 26, and a track 38 corresponds to the data stream 28.
  • there are N tracks stored in the metadata section for N data streams stored in the data section.
  • the H.264, or MPEG-4 Part 10 specification is a high compression digital video codec standard written by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) in a collective effort partnership often known as the Joint Video Team (JVT).
  • VCEG Video Coding Experts Group
  • MPEG Moving Picture Experts Group
  • JVT Joint Video Team
  • the ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are technically identical, and the technology is also known as AVC, for Advanced Video Coding.
  • H.264 is a name related to the ITU-T line of H.26x video standards, while AVC relates to the ISO/IEC MPEG side of the partnership project that completed the work on the standard, after earlier development done in the ITU-T as a project called H.26L.
  • the standard is commonly referred to as H.264/AVC, AVC/H.264, H.264/MPEG-4 AVC, or MPEG-4/H.264 AVC, and occasionally as the JVT codec, in reference to the JVT organization that developed it.
  • Scalable Video Codec (SVC)
  • SVC Scalable Video Codec
  • the existing file formats do not provide an easy and clear mechanism to extract the different variations of the spatial, temporal and SNR (quality) layers from the stored media data in the file format. Therefore, this information must be extracted by parsing the coded media stream, which is very inefficient and slow.
  • These new extensions define a structuring and grouping mechanism for the dependencies that exist in a group of pictures and within each sample to obtain a flexible stream structure that provides for spatial, temporal, and quality flexibility.
  • the SVC standard proposes to encode the entire scalable media data as one single scalable bitstream, from which variants of temporal, spatial and quality layers can be extracted.
  • each video stream is encoded, and subsequently decoded, as an independent stream according to a particular frame rate, resolution, and quality.
  • from the single encoded video stream, referred to as a SVC elementary stream, multiple different types of video can be extracted, for example a low resolution video stream, a standard resolution video stream, or a high resolution video stream.
  • the file formats need to be modified.
  • the SVC standard is currently under development and as the SVC standard defines a new design for a video codec, an appropriately defined new file format standard is also required to enable the storage and extraction of the new SVC video streams.
  • a new SVC file format is under development which extends the AVC file format to support the storage of the SVC video streams.
  • specific extensions that define access to the stored scalable video have yet to be developed.
  • a system and method are described for extending the current ISO/MP4/AVC File Format to store video content, such as that coded using the MPEG-4: Part 10/Amd-1 Scalable Video Codec (SVC) standard, whose development is currently in progress in MPEG/ITU-T.
  • extensions to the AVC file format are made to provide a new SVC file format that enables the storage and access of scalable video data.
  • the scalable video data can be stored as a single track within a media data section of the SVC file format.
  • New extensions are defined for description entries and boxes within a metadata section of the SVC file format. These extensions provide means for extracting sub-streams or layers from the single track of scalable video data stored in the media data section.
  • a modified file format includes a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream.
  • the scalable data stream can comprise a scalable video stream.
  • the scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream.
  • the modified file format can comprise a modified Scalable Video Coding (SVC) file format.
  • the scalable data stream can comprise a single encoded track.
  • the scalable data stream can comprise a series of access units.
  • the one or more metadata boxes can be configured to define the sub-layer data stream according to one or more device requirements received from an end user device capable of processing the sub-layer data stream.
  • the one or more metadata boxes can be further configured to define one of a plurality of description entries according to the one or more device requirements.
  • the one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
  • the one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry.
  • the one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units.
  • the one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) standards.
  • the scalable data stream can comprise a plurality of sub-layer data streams.
  • the one or more metadata boxes can be configured to define a hint track associated with the sub-layer data stream.
  • the sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • a file server configured to utilize a modified file format.
  • the file server includes a memory configured to store and extract data according to the modified file format, wherein the modified file format includes a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream.
  • the file server also includes a processing module configured to provide control instruction to the memory and to extract the sub-layer data stream from the scalable data stream.
  • the scalable data stream can comprise a series of access units.
  • the file server can also include a network interface module configured to receive one or more device requirements from an end user device and to transmit the defined sub-layer data stream.
  • the one or more metadata boxes can be configured to define one of a plurality of description entries according to the one or more device requirements.
  • the one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
  • the one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry.
  • the one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units.
  • the scalable data stream can comprise a scalable video stream.
  • the scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream.
  • the modified file format can comprise a modified Scalable Video Coding (SVC) file format.
  • the scalable data stream can comprise a single encoded track.
  • the scalable data stream can comprise a plurality of sub-layer data streams.
  • a system configured to utilize a modified file format.
  • the system includes an end user device to transmit one or more device requirements and a file server configured to receive the one or more device requirements and to utilize the modified file format.
  • the file server includes a memory and a processing module.
  • the memory is configured to store and extract data according to the modified file format, wherein the modified file format comprises a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream according to the one or more device requirements.
  • the processing module is configured to provide control instruction to the memory and to extract the sub-layer data stream from the media data section.
  • the file server can also include a network interface module configured to receive the one or more device requirements from the end user device and to transmit the defined sub-layer data stream.
  • the scalable data stream can comprise a series of access units.
  • the one or more metadata boxes can be configured to define one of a plurality of description entries according to the one or more device requirements.
  • the one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
  • the one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry.
  • the one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units.
  • the scalable data stream can comprise a scalable video stream.
  • the scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream.
  • the modified file format can comprise a modified Scalable Video Coding (SVC) file format.
  • the one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) standards.
  • the scalable data stream can comprise a single encoded track.
  • the scalable data stream can comprise a plurality of sub-layer data streams.
  • the one or more metadata boxes can be configured to define a hint track associated with the sub-layer data stream.
  • the sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • a method of extracting a data stream from a modified file format includes receiving a request for a specific data stream, wherein the request includes one or more device requirements, associating the request with a specific scalable data stream stored in a media data section of the modified file format, determining one or more tracks corresponding to the specific scalable data stream, wherein the one or more tracks are stored in a metadata section of the modified file format, further wherein each track comprises one or more metadata boxes, determining a sub-layer data stream of the specific scalable data stream according to the one or more device requirements, wherein the sub-layer data stream is determined using the one or more metadata boxes, and extracting the sub-layer data stream from the stored scalable data stream.
  • the method can also include transmitting the extracted sub-layer data stream.
  • the method can also include decoding the determined one or more tracks prior to determining the sub-layer data stream.
  • the one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) file format standards.
  • the scalable data stream can comprise a scalable video stream.
  • the scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream.
  • the modified file format can comprise a modified Scalable Video Coding (SVC) file format.
  • the scalable data stream can comprise a single encoded track.
  • the scalable data stream can comprise a series of access units.
  • the method can also include configuring the one or more metadata boxes to define one of a plurality of description entries according to the one or more device requirements.
  • the method can also include configuring the one or more metadata boxes to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
  • the one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry.
  • the one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units.
  • the one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) standards.
  • the scalable data stream can comprise a plurality of sub-layer data streams.
  • the method can also include storing an encoded version of the scalable data stream in the media data section.
  • the scalable data stream can be encoded according to a Scalable Video Coding (SVC) standard.
  • the method can also include configuring the one or more metadata boxes to define a hint track associated with the sub-layer data stream.
  • the sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • FIG. 1 illustrates an exemplary configuration of an AVC file format.
  • FIG. 2 illustrates a block diagram of an exemplary network including a file server configured to implement a modified file format.
  • FIG. 3 illustrates an exemplary block diagram of the internal components of the file server of FIG. 2.
  • FIG. 4 illustrates an exemplary configuration of a SVC elementary stream.
  • FIG. 5 illustrates an exemplary configuration of the modified SVC file format.
  • FIG. 6 illustrates an exemplary structure of a SVC access unit according to the SVC standard.
  • FIG. 7 illustrates an exemplary method of implementing the modified SVC file format.
  • FIG. 2 illustrates a block diagram of an exemplary network including a file server configured to implement a modified file format.
  • a file server 50 is coupled to a playback device 60 via a network 70.
  • the network is any conventional network, wired or wireless, capable of transmitting data.
  • the playback device 60 is any conventional device capable of receiving and processing transmitted data.
  • FIG. 3 illustrates an exemplary block diagram of the internal components of the file server 50 of FIG. 2.
  • the file server 50 is any conventional computing device configurable to implement the modified file format.
  • the file server 50 includes a processing module 82, a host memory 84, a video memory 86, a mass storage device 88, and an interface circuit 90, all coupled together by a conventional bidirectional system bus 92.
  • the interface circuit 90 includes a physical interface circuit for sending and receiving communications over the network 70 (FIG. 2).
  • the interface circuit 90 is implemented on a network interface card within the file server 50. However, it should be apparent to those skilled in the art that the interface circuit 90 can be implemented within the file server 50 in any other appropriate manner, including building the interface circuit onto the motherboard itself.
  • the mass storage device 88 may include both fixed and removable media using any one or more of magnetic, optical or magneto-optical storage technology or any other available mass storage technology.
  • the system bus 92 enables access to any portion of the host memory 84 and the mass storage device 88, and data transfer between and among the processing module 82, the host memory 84, the video memory 86, and the mass storage device 88.
  • Host memory 84 functions as system main memory, which is used by processing module 82 .
  • the file server 50 is also coupled to a number of peripheral input and output devices including a keyboard 94, a mouse 96, and an associated display 98.
  • the keyboard 94 is coupled to the processing module 82 for allowing a user to input data and control commands into the file server 50.
  • the mouse 96 is coupled to the keyboard 94, or coupled to the processing module 82, for manipulating graphic images on the display 98 as a cursor control device.
  • the file server 50 includes graphics circuitry 100 to convert data into signals appropriate for display. It is understood that the configuration of file server 50 shown in FIG. 3 is for exemplary purposes only and that file server 50 can be configured in any other conventional manner.
  • the SVC file format is an extension of the AVC file format.
  • the SVC file format also includes a metadata section and a media data section.
  • the media data section stores unaltered, encoded SVC elementary streams, such as video data streams, where the SVC elementary streams consist of a series of access units.
  • FIG. 4 illustrates an exemplary configuration of a SVC elementary stream 40.
  • the SVC elementary stream 40 includes a series of successive units, referred to as access units (AU).
  • a decoder receiving the SVC elementary stream decodes each access unit into a picture, thereby producing a video sequence.
  • the metadata section stores information about each SVC elementary stream stored in the media data section. Such information includes, but is not limited to, the type of the video stream, the resolution(s) of the video stream, the frame rate(s) of the video stream, the storage address of each access unit in the video stream, random access points within the video stream, and timing as to when each access unit is to be decoded.
  • Scalable video streams enable adaptation to various networks. For example, if a scalable video stream is encoded at 60 frames per second, but a playback device only supports 30 frames per second, then only a portion of the scalable video stream is transmitted to the playback device. As another example, if the quality of the encoded scalable video stream is very high, say 10 Mbps, but the network over which the scalable video stream is to be transmitted only supports 1 Mbps, then again only a portion of the scalable video stream is transmitted to match the supported transmission speed of the network. In this manner, all or only portions of the encoded scalable video stream are extracted from the file format, based on the network or playback device requirements.
  • a scalable video stream is fully scalable in the sense that if the entire scalable video stream is decoded, the result is full resolution, full frame rate, and high quality. However, if the network or the playback device does not support the full resolution, the full frame rate, or the high quality of the entire scalable video stream, then only portions of the scalable video stream are extracted from the media data section of the file format and transmitted over the network, thereby conserving bandwidth.
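The 60-to-30 frames per second example above can be sketched as follows. This is a simplified model only: it assumes each access unit carries a temporal level and that dropping the highest level halves the frame rate (a dyadic hierarchy). The `AccessUnit` type, its `temporal_level` field, and both functions are hypothetical names for illustration, not structures defined by the SVC standard or by this file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AccessUnit:
    index: int           # position in the stored stream
    temporal_level: int  # 0 = lowest frame rate; each higher level doubles it


def build_stream(num_units: int, levels: int) -> List[AccessUnit]:
    """Assign dyadic temporal levels, e.g. with 2 levels odd units are level 1."""
    units = []
    for i in range(num_units):
        level = 0
        step = 1 << (levels - 1)
        # Halve the step until it divides i; each halving is one level up.
        while i % step != 0:
            step >>= 1
            level += 1
        units.append(AccessUnit(i, level))
    return units


def extract_for_frame_rate(
    units: List[AccessUnit], full_fps: float, target_fps: float
) -> Tuple[List[AccessUnit], float]:
    """Drop the highest temporal levels until the frame rate fits the target.

    If even level 0 alone exceeds the target, level 0 is still returned,
    since nothing smaller can be extracted in this model.
    """
    keep = max(u.temporal_level for u in units)
    fps = full_fps
    while keep > 0 and fps > target_fps:
        fps /= 2
        keep -= 1
    return [u for u in units if u.temporal_level <= keep], fps


stream = build_stream(num_units=8, levels=2)  # stored at 60 frames per second
sub_stream, sub_fps = extract_for_frame_rate(stream, full_fps=60, target_fps=30)
```

Only the selected access units would be transmitted; the stored stream itself is left unchanged.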
  • the scalable video stream is encoded according to the format defined in the SVC standard.
  • the encoded scalable video stream is stored in the data section of the SVC file format.
  • Tracks are stored in the metadata section of the SVC file format, where the tracks contain metadata information corresponding to the scalable video streams stored in the media data section. These tracks include information used to extract all or portions of the encoded scalable video streams stored in the data section of the SVC file format.
  • new metadata boxes are defined.
  • the new metadata boxes define the parameters used for extracting various types of video streams from the scalable video stream.
  • the parameters used to define an extracted video stream include, but are not limited to, resolution, frame rate, and quality. For example, one extracted video stream corresponds to a low resolution requirement, a second extracted video stream corresponds to a standard resolution requirement, and a third extracted video stream corresponds to a high resolution requirement.
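One way to picture how such parameters could drive the choice of an extracted stream is sketched below. It assumes each candidate stream is summarized by a description entry holding resolution, frame rate, and bitrate, and picks the richest entry that fits the stated device limits. The `DescriptionEntry` and `DeviceRequirements` types, their fields, and the sample values are all hypothetical; the document does not specify the entry layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DescriptionEntry:
    name: str
    width: int
    height: int
    frame_rate: float
    bitrate_kbps: int


@dataclass(frozen=True)
class DeviceRequirements:
    max_width: int
    max_height: int
    max_frame_rate: float
    max_bitrate_kbps: int


def select_entry(
    entries: List[DescriptionEntry], req: DeviceRequirements
) -> Optional[DescriptionEntry]:
    """Return the richest entry within the limits, or None if none fits."""
    fitting = [
        e for e in entries
        if e.width <= req.max_width
        and e.height <= req.max_height
        and e.frame_rate <= req.max_frame_rate
        and e.bitrate_kbps <= req.max_bitrate_kbps
    ]
    if not fitting:
        return None
    # Prefer resolution first, then frame rate, then bitrate (a design choice).
    return max(fitting, key=lambda e: (e.width * e.height, e.frame_rate, e.bitrate_kbps))


entries = [
    DescriptionEntry("low", 320, 240, 15.0, 300),
    DescriptionEntry("standard", 640, 480, 30.0, 1000),
    DescriptionEntry("high", 1920, 1080, 60.0, 10000),
]
# A device limited to 30 frames per second on a 1 Mbps link.
chosen = select_entry(entries, DeviceRequirements(1280, 720, 30.0, 1000))
```

The ordering used to rank fitting entries is arbitrary here; a real server might weigh quality against frame rate differently.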
  • FIG. 5 illustrates an exemplary configuration of a modified SVC file format 110.
  • the modified SVC file format 110 includes a media data section 120 and a metadata section 130.
  • the media data section 120 includes one or more scalable data streams, or SVC elementary streams.
  • the metadata section 130 includes one or more tracks. In one embodiment, there is one track in the metadata section 130 for each SVC elementary stream stored in the media data section 120.
  • the media data section 120 includes two SVC elementary streams 122 and 124.
  • a track 132 in the metadata section 130 corresponds to the SVC elementary stream 122 stored in the media data section 120.
  • a track 134 in the metadata section 130 corresponds to the SVC elementary stream 124 stored in the media data section 120.
  • Each SVC elementary stream includes a series of access units. As shown in FIG. 5, SVC elementary stream 122 is expanded to show a portion of its access units.
  • New extensions are provided within the metadata section of the modified SVC file format to define and extract sub-layers within the stored scalable video streams, where a specific sub-layer data stream is determined by specified device requirements, such as resolution, frame rate, and/or quality.
  • One or more metadata boxes are defined that identify and extract the specific access units that correspond to each sub-layer data stream.
  • Each track is configured to include these one or more metadata boxes.
  • FIG. 5 shows one embodiment in which a metadata box 140 and a metadata box 142 are defined to define and extract the specific access units that form the specific sub-layer data stream.
  • the metadata box 140 is referred to as a SampleGroupDescription Box and metadata box 142 is referred to as a SampleToGroup Box.
  • the playback device 60 sends a request to the file server 50 for a particular video stream, in this case the SVC elementary stream 122 , stored in the media data portion 120 of the file format 110 .
  • the request includes the required specifications of the playback device 60 , for example the specific resolution and frame rate supported.
  • the file server 50 matches the requested video stream, SVC elementary stream 122 , to its corresponding track 132 in the metadata section 130 . Using a file format decoder, only the matching track 132 is decoded.
  • the file format decoder does not decode the other tracks stored in the metadata section 130, nor does the file format decoder decode any of the encoded scalable video streams stored in the media data section 120 , including the SVC elementary stream 122 . Additionally, the media data stored in the media data section 120 is encoded differently than the metadata stored in the metadata section 130 , and therefore requires a different decoder.
  • the metadata box 140 is accessed.
  • the metadata box 140 utilizes the one or more device requirements to determine a matching description entry.
  • the matching description entry defines parameter values and a group_description_index value that correspond to the device requirements.
  • the value of the group_description_index is used by the metadata box 142 to identify and extract specific access units in the corresponding SVC elementary stream 122 .
  • FIG. 5 illustrates the exemplary case where the metadata box 142 determines and extracts access units 1, 4, 7, and so on. The extracted access units remain unaltered and encoded. Once extracted, the access units are transmitted to the end user device as a sub-layer data stream that meets the device requirements originally provided.
  • each scalable video stream is stored as a single video stream in the data section of the file format, and one corresponding track associated with the single video stream is stored in the metadata section of the file format.
  • New extensions are defined that provide means to extract sub-streams or layers of interest from the single video stream.
  • more than one track can be associated with the single video stream.
  • a separate track is created for different end-user devices.
  • Each of these devices can still have its own internal sub-sets of frame rates, spatial resolutions, quality, etc., which it can support.
  • One scalable video stream supports various device requirements, such as spatial resolutions (QCIF, CIF, SD and HD), frame rates (7.5, 15, 30, 60 fps), and various qualities for the above.
  • the stream is still stored in the media data section of the modified file format.
  • In the metadata section there can be three tracks, for example. Each of the three tracks refers to the same scalable video stream. However, each track operates on a sub-set of the entire scalable stream.
  • track 1 is customized for portable players (small screen size, low frame rate, etc)
  • track 2 is customized for Standard TV, computers, etc (SD screen size, medium frame rate, etc)
  • track 3 is customized for HD players (large screen size, high frame rate, etc).
  • Each track still includes the metadata boxes that define the description entries, since each track still supports some amount of variation (scalability).
  • track 1 supports QCIF and CIF, 7.5 and 15 fps.
  • Track 2 supports all of track 1 and SD and 30 fps.
  • Track 3 supports all of track 2 and HD and 60 fps.
  • the specific video stream requested by the end user device does not correspond to a single track.
  • Different methods are contemplated to determine the appropriate track in this case.
  • the file server decodes a part of a track to determine the device requirements that it defines.
  • the file server parses the tracks at a high level to identify which one to use.
  • tracks have track headers that provide more information about the content they describe. In this particular case, the file server decodes some high level information from each track to decide which one to use. In general, any method that matches the device requirements received in the end user device request to the proper track can be used.
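  • The track-matching step described above can be sketched as follows. This is a minimal illustration in Python, assuming each track header exposes the largest resolution and frame rate the track supports. The `TrackInfo` type, its field names, the `select_track` helper, and the pixel dimensions chosen for CIF, SD, and HD are hypothetical and are not defined by the file format.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class TrackInfo:
    """Hypothetical high-level summary read from a track header."""
    track_id: int
    max_width: int
    max_height: int
    max_fps: float


def select_track(tracks: Sequence[TrackInfo], width: int, height: int,
                 fps: float) -> Optional[TrackInfo]:
    """Return the smallest track that covers the device requirements."""
    candidates = [
        t for t in tracks
        if t.max_width >= width and t.max_height >= height and t.max_fps >= fps
    ]
    if not candidates:
        return None
    # Prefer the track with the smallest picture area, then the lowest rate.
    return min(candidates,
               key=lambda t: (t.max_width * t.max_height, t.max_fps))


# Illustrative values only: track 1 up to CIF at 15 fps, track 2 up to SD at
# 30 fps, track 3 up to HD at 60 fps.
TRACKS = [
    TrackInfo(1, 352, 288, 15.0),
    TrackInfo(2, 720, 480, 30.0),
    TrackInfo(3, 1920, 1080, 60.0),
]
```

  • With these assumed values, a QCIF request at 7.5 fps resolves to track 1, while a CIF request at 30 fps resolves to track 2 because track 1 does not reach that frame rate.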
  • extracting portions of a SVC elementary stream is supported by defining two new boxes, a SampleGroupDescription Box and a SVCSampleToGroup Box.
  • The SampleGroupDescription Box defines unique description entries, where each description entry refers to a different type of video stream that can be extracted from the SVC elementary stream. For example, if the SVC elementary stream is encoded to support three different resolutions, low, standard, and high, then there are three description entries, one description entry for each supported resolution. Each description entry defines the description entry fields, as described below, for that particular resolution, low, standard, or high. If the SVC elementary stream is encoded to support three different resolutions, low, standard, and high, and also to support two different frame rates, high and low, then there are six different description entries, one for each combination of resolution and frame rate.
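  • The count of description entries in the example above can be checked with a short Python sketch. The dictionary layout is purely illustrative and is not the stored form of a description entry.

```python
from itertools import product

resolutions = ["low", "standard", "high"]
frame_rates = ["low", "high"]

# One description entry per (resolution, frame rate) combination.
description_entries = [
    {"resolution": resolution, "frame_rate": frame_rate}
    for resolution, frame_rate in product(resolutions, frame_rates)
]
```

  • Three resolutions combined with two frame rates give the six entries stated in the text.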
  • the SVC Dependency Description Entry documents and describes the various possible spatio-temporal combinations present within an SVC elementary stream.
  • Each description entry is defined using a grouping type of ‘svcd’.
  • the description entries are ordered in terms of their dependency.
  • the first description entry documents the base layer description and subsequent description entries document the enhancement layer descriptions.
  • a group_description_index is used to index the ordered description entries.
  • the group_description_index is also used to group the various SVC NAL units. Each SVC NAL unit refers to one of the description entries.
  • Each SVC NAL unit in the SVC elementary stream is grouped using an index, the group_description_index, into the ordered list of description entries.
  • NAL units referring to a particular index may require all or some of the NAL units referring to all the lower indices for proper decoding operation, but do not require any NAL unit referring to a higher index value. In other words, dependencies exist only in the direction of lower layers.
  • the file server determines the sub-set of indices required for proper decoding operation based on the values of the description fields present within the description entries, for example the resolution and/or temporal rate. In another embodiment, the end user device determines the sub-set of indices.
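  • One way the sub-set of indices could be determined is sketched below in Python. The sketch assumes that an entry is needed whenever its width, height, and frame rate all fit within the device limits; this selection rule, the `DescriptionEntry` type, and the `required_indices` helper are assumptions made for illustration. The field names mirror the visualWidth, visualHeight, and temporalFrameRate description fields.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class DescriptionEntry:
    """Subset of the description entry fields (names adapted to Python)."""
    visual_width: int            # visualWidth
    visual_height: int           # visualHeight
    temporal_frame_rate: float   # temporalFrameRate


def required_indices(entries: Sequence[DescriptionEntry], max_width: int,
                     max_height: int, max_fps: float) -> List[int]:
    """Return the 1-based group_description_index values to keep.

    Assumption: an entry is kept whenever it fits within the device limits.
    """
    return [
        index
        for index, entry in enumerate(entries, start=1)
        if entry.visual_width <= max_width
        and entry.visual_height <= max_height
        and entry.temporal_frame_rate <= max_fps
    ]


# Entries in dependency order; the first entry describes the base layer.
ENTRIES = [
    DescriptionEntry(176, 144, 7.5),
    DescriptionEntry(176, 144, 15.0),
    DescriptionEntry(352, 288, 7.5),
    DescriptionEntry(352, 288, 15.0),
]
```

  • For a device limited to CIF at 7.5 fps, this rule keeps indices 1 and 3 and skips the 15 fps entries, which matches the statement that a layer may need only some of the lower indices.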
  • the grouping type used to refer to the ordered list of description entries is ‘svcd’.
  • the grouping type is used to link the entries present in this list to the SVCSampleToGroup Box, which includes the grouping of all the SVC samples and is described in greater detail below.
  • the variable profile_compatibility is a byte defined exactly the same as the byte which occurs between the profile_IDC and level_IDC in a sequence parameter set, as defined in the AVC/SVC video specification.
  • the variable levelIndication includes the level code as defined in AVC/SVC video specification.
  • the profileIndication and levelIndication fields present in each entry provide the profile and level values with which the particular layer is compatible.
  • the variable temporal_level takes the value of the temporal_level syntax element present in the scalable extension NAL unit defined in the AVC/SVC video specification. This non-negative integer indicates the temporal level that the sample provides along time. The lowest temporal level is numbered as zero and the enhancement layers in the temporal direction are numbered as one or higher.
  • the temporal_level field takes a default value of zero, in the case of AVC NAL units.
  • In SVC Scalable Extension NAL units, if the extension_flag is equal to 0, then the parameters that specify the mapping of simple_priority_id to temporal_level are present in the SPS and are mapped accordingly.
  • the variable dependency_id takes the value of the dependency_id syntax element present in the scalable extension NAL unit defined in the AVC/SVC video specification.
  • the dependency_id is a non-negative integer, with the value zero signaling that the NAL units correspond to the lowest spatial resolution, and all higher values signaling that the enhancement layers provide an increase in spatial resolution and/or quality, for example coarse grain scalability.
  • the dependency_id also controls the spatial scalability.
  • the dependency_id field takes a default value of zero in the case of AVC NAL units. In SVC Scalable Extension NAL units, if the extension_flag is equal to 0, then the parameters that specify the mapping of simple_priority_id to dependency_id are present in the SPS and are mapped accordingly.
  • the variable temporalFrameRate indicates the temporal frame rate that is associated with the temporal level field in the entry.
  • the variable visualWidth gives the value of the width of the coded picture in pixels in this layer of the SVC stream.
  • the variable visualHeight gives the value of the height of the coded picture in pixels in this layer of the SVC stream.
  • the variable baseBitRate gives the bitrate in bits/second of the minimum quality that is provided by this layer without any progressive refinements. NAL units in this and lower levels that fall within the dependency hierarchy are taken into account in the calculation.
  • the variable maxBitRate gives the maximum rate in bits/second that is provided by this layer over any window of one second. NAL units in this and lower levels that fall within the dependency hierarchy are taken into account in the calculation.
  • the variable avgBitRate gives the average bit rate in bits/second. NAL units in this and lower levels are taken into account in the calculation.
  • the variable progressiveRefinementLayerFlag when true, indicates that this layer contains progressive refinement NAL units, and is FGS scalable.
  • a SVCSampleToGroup Box is used to extract SVC scalable sub-streams from the SVC Elementary stream stored in the SVC file format, depending on the constraints imposed in terms of temporal, spatial and quality requirements.
  • the SVCSampleToGroup Box provides the grouping information for each NAL unit of an SVC sample.
  • the grouping information is provided by means of a group_description_index, which associates each NAL unit with its description information present in the SVCDependencyDescriptionEntry.
  • the group_description_index refers to a specific description entry.
  • the group_description_index ranges from 1 to the number of sample group entries in the SampleGroupDescription Box.
  • Each NAL unit within the access units is assigned a specific group_description_index value.
  • the requirements specified by the end user device ultimately determine a specific group_description_index value. All NAL units within the access units that are assigned the determined group_description_index value, or those NAL units within the access units with a group_description_index value lower than the determined group_description_index value, are extracted from the stored scalable video stream in the media data section of the file format. Those extracted access units are transmitted to the end user device. In this manner, portions of a single scalable video stream can be extracted and transmitted to an end user device depending on the requirements specified by the end user device.
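  • The extraction rule above can be sketched in Python as follows. Each NAL unit is modelled as a pair of its group_description_index and its encoded bytes; this in-memory model and the `extract_sub_layer` helper are hypothetical. Skipping NAL units with index 0 is an assumption, based on the value 0 meaning that a unit is a member of no group.

```python
from typing import List, Sequence, Tuple

# A NAL unit is modelled as (group_description_index, encoded payload).
NalUnit = Tuple[int, bytes]
AccessUnit = Sequence[NalUnit]


def extract_sub_layer(access_units: Sequence[AccessUnit],
                      target_index: int) -> List[List[bytes]]:
    """Keep NAL units whose index is at or below the determined index.

    Payloads are returned unaltered and still encoded. Access units left
    with no NAL units are dropped from the sub-layer data stream.
    """
    sub_layer = []
    for access_unit in access_units:
        kept = [payload for index, payload in access_unit
                if 1 <= index <= target_index]
        if kept:
            sub_layer.append(kept)
    return sub_layer


STREAM = [
    [(1, b"base0"), (2, b"enh0"), (3, b"hd0")],
    [(3, b"hd1")],
    [(1, b"base2"), (0, b"ungrouped")],
]
```

  • With a determined index of 2, the second access unit contributes nothing and is omitted, so only part of the stored stream is transmitted.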
  • the variable grouping_type is an integer that identifies the type of sample grouping used.
  • the variable sample_count denotes the number of samples that are present in the media track for the SVC Elementary stream, and is an inferred value that is calculated using the Sample to Chunk Box.
  • the variable numberOfNalUnits is an integer that gives the number of NAL units present in a SVC sample.
  • the variable group_description_index is an integer that gives the index of the sample group entry which describes the NAL units in this group. The index ranges from 1 to the number of sample group entries in the particular SampleGroupDescription Box, or takes the value 0 to indicate that the particular SVC Sample is a member of no group of this type.
  • the variable isDiscardableNalUnitFlag is a flag, the semantics of which are specified in the NAL unit semantics of the AVC video specification.
  • the variable isPRNalUnitFlag is a flag, which if equal to 1, indicates that this NAL unit is a progressive refinement NAL unit.
  • the variable quality_level specifies the quality level for the current NAL unit as specified in the NAL unit header or in the quality_level_list[ ]. If absent, its value is inferred to be zero.
  • the sub-sample information box defined in the ISO base syntax cannot be easily used for defining a SampleToGroup box used to extract select portions of the SVC Elementary stream.
  • the sub-sample information box does not have a ‘grouping_type’ indicator to map to the relevant description entries.
  • the sub-sample information box is inefficient for this purpose, since the sub-sample information box requires a 32-bit entry called ‘sample_count’. This was originally intended to indicate a run of samples, but in the case of the SVC File Format, each SVC Sample may have a different subsample_count, making the use of this sample_count mandatory for each SVC sample.
  • the new file format does not need to signal each subsample size, since a count of NAL units is used and the lengths of each NAL unit are already specified before them.
  • FIG. 6 illustrates an exemplary structure of an SVC Sample, or SVC access unit, according to the SVC standard.
  • SVC Samples are externally framed and have a size supplied by that external framing.
  • the SVC access unit is made up of a set of NAL units. Each NAL unit is represented with a length, which indicates the length in bytes of the following NAL Unit. The length field is configured to be of 1, 2, or 4 bytes.
  • the configuration size is specified in the decoder configuration record.
  • Each NAL Unit contains the NAL unit data as specified in ISO/IEC AVC/SVC video specification.
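  • The length-prefixed layout of an SVC Sample can be walked with a short parser. The following Python sketch assumes the length field is an unsigned big-endian integer, as is usual in the ISO Base Media File Format; the `split_nal_units` function name and its error handling are illustrative only.

```python
from typing import List


def split_nal_units(sample: bytes, length_size: int) -> List[bytes]:
    """Split an externally framed sample into its NAL units.

    length_size is the size in bytes of each length field (1, 2, or 4),
    as given by the decoder configuration record.
    """
    if length_size not in (1, 2, 4):
        raise ValueError("length field must be 1, 2, or 4 bytes")
    units = []
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated length field")
        # The length counts only the NAL unit that follows, not itself.
        nal_len = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(sample):
            raise ValueError("NAL unit runs past the end of the sample")
        units.append(sample[pos:pos + nal_len])
        pos += nal_len
    return units
```

  • Because every NAL unit carries its own length, a reader can skip an unwanted unit by advancing past it without inspecting its contents.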
  • the SVC Decoder Configuration Record includes the size of the length field used in each sample to indicate the length of its contained NAL units as well as the initial parameter sets.
  • the SVC Decoder Configuration Record is externally framed, meaning its size is supplied by the structure that contains it.
  • the SVC Decoder Configuration Record also includes a version field. Incompatible changes to the SVC Decoder Configuration Record are indicated by a change of version number.
  • When used to provide the configuration of a parameter set elementary stream or a video elementary stream used in conjunction with a parameter set elementary stream, the configuration record contains no sequence or picture parameter sets; that is, the variables numOfSequenceParameterSets and numOfPictureParameterSets both have the value 0.
  • the level indication indicates a level of capability equal to or greater than the highest level indicated in the included parameter sets.
  • Each profile compatibility flag is set if all the included parameter sets set that flag.
  • the profile indication indicates a profile to which the entire stream conforms. The individual profiles and levels of each layer are documented in the SVCDependencyDescriptionEntry box.
  • the variable profile_compatibility is a byte defined the same as the byte which occurs between the profile_IDC and level_IDC in a sequence parameter set, as defined in the AVC specification.
  • the variable SVCLevelIndication contains the level code as defined in the AVC specification.
  • the variable lengthSizeMinusOne indicates the length in bytes of the NALUnitLength field in an SVC video sample or SVC parameter set sample of the associated stream minus one. For example, a size of one byte is indicated with a value of 0. The value of this field is one of 0, 1, or 3 corresponding to a length encoded with 1, 2, or 4 bytes, respectively.
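  • The mapping from lengthSizeMinusOne to the size of the NALUnitLength field can be expressed as a small validation helper. This Python sketch uses a hypothetical function name and simply rejects the value 2, which the text does not list as permitted.

```python
def nal_length_field_size(length_size_minus_one: int) -> int:
    """Return the NALUnitLength field size in bytes (1, 2, or 4)."""
    if length_size_minus_one not in (0, 1, 3):
        raise ValueError("lengthSizeMinusOne must be 0, 1, or 3")
    return length_size_minus_one + 1
```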
  • the variable numOfSequenceParameterSets indicates the number of sequence parameter sets that are used as the initial set of sequence parameter sets for decoding the SVC elementary stream.
  • the variable sequenceParameterSetLength indicates the length in bytes of the sequence parameter set NAL unit as defined in the AVC specification.
  • the variable sequenceParameterSetNALUnit contains a sequence parameter set NAL Unit, as specified in the AVC specification. Sequence parameter sets occur in order of ascending parameter set identifier with gaps being allowed.
  • the variable numOfPictureParameterSets indicates the number of picture parameter sets that are used as the initial set of picture parameter sets for decoding the SVC elementary stream.
  • the variable pictureParameterSetLength indicates the length in bytes of the picture parameter set NAL unit as defined in the AVC specification.
  • the variable pictureParameterSetNALUnit contains a picture parameter set NAL Unit, as specified in the AVC specification. Picture parameter sets occur in order of ascending parameter set identifier with gaps being allowed.
  • the scalable SVC video stream is stored as a single track. If the scalable SVC video stream has a base layer that is AVC compatible, then those AVC compatible NAL units that are present in each SVC sample are grouped together using the new extensions as previously described. To find which entries are AVC compatible, the Profile and Level Indicators present in the SVCDependencyDescriptionEntries are used to parse through the SVCSampleToGroup Box and extract only those NAL units from each SVC Sample that are AVC compatible.
  • the modified SVC file format is derived from the ISO Base Media File Format. As such, there is a correspondence of terms in the modified SVC file format and the ISO Base Media File Format. For example, the terms stream and access unit used in the modified SVC file format correspond to the terms track and sample, respectively, in the ISO Base Media File Format.
  • SVC tracks are video or visual tracks. They therefore use a handler_type of ‘vide’ in the HandlerBox, a video media header ‘vmhd’, and, as defined below, a derivative of the VisualSampleEntry.
  • the AVC Configuration Box documents the Profile, Level and Parameter Set information pertaining to the AVC compatible base layer as defined by the AVCDecoderConfigurationRecord, in the AVC File Format specification.
  • the SVC Configuration Box documents the Profile, Level and Parameter Set information pertaining to the SVC compatible enhancement layers as defined by the SVCDecoderConfigurationRecord, defined below. If the SVC Elementary stream does not contain an AVC base layer, then an SVC visual sample entry (‘svc1’) is used.
  • the SVC visual sample entry contains an SVC Configuration Box, as defined below. This includes an SVCDecoderConfigurationRecord, also as defined below. Multiple sample descriptions are used, as permitted by the ISO Base Media File Format specification, to indicate sections of video that use different configurations or parameter sets.
  • class AVCConfigurationBox extends Box (‘avcC’) {
        AVCDecoderConfigurationRecord ( ) AVCConfig;
    }
    class SVCConfigurationBox extends Box (‘svcC’) {
        SVCDecoderConfigurationRecord ( ) SVCConfig;
    }
    // Use this if base layer is AVC compatible
    class AVCSampleEntry( ) extends VisualSampleEntry (‘avc1’) {
        AVCConfigurationBox avcconfig;
        SVCConfigurationBox svcconfig;
        MPEG4BitRateBox ( ); // optional
        MPEG4ExtensionDescriptorsBox ( ); // optional
    }
    // Use this if base layer is NOT AVC compatible
    class SVCSampleEntry( ) extends VisualSampleEntry (‘svc1’) {
        SVCConfigurationBox svcconfig;
        MPEG4BitRateBox ( ); // optional
        MPEG4ExtensionDescnip
  • the format of a sample in an SVC video elementary stream is configured via the decoder specific configuration for the SVC elementary stream.
  • the SVC Sample contains all the NAL units pertaining to all the scalable levels that are present for the primary coded picture as shown in FIG. 6 .
  • the length field includes the size of both the one byte NAL header and the EBSP payload but does not include the length field itself.
  • the variable NALUnit contains a single NAL unit.
  • the syntax of an NAL unit is as defined in the ISO/IEC AVC/SVC video specification and includes both the one byte NAL header and the variable length encapsulated byte stream payload.
  • AVC Parameter Set Elementary stream as specified in the AVC File Format also applies in this case for the storage of SVC Parameter Sets as separate elementary streams.
  • the width and height in the VisualSampleEntry document the correct cropped largest spatial resolution in terms of pixels that is obtained by decoding the entire scalable bitstream. To obtain the individual width and height of each layer, the group description entries are evaluated. Unless otherwise specified herein, all other definitions as specified in the AVC File Format Specification apply.
  • each description entry describes the properties of a scalable layer and its possible refinements in the case of fully scalable streams.
  • Each description entry documents the temporal frame rate (temporal scalability), the spatial dimensions (spatial scalability), the range of bit-rates available from this layer, whether this layer is Fine Grain Scalable, the profile and level indications, and dependency information.
  • the dependency hierarchy is easily maintained by the index of the group description entries where each higher index indicates that it depends on all or some of the lower layers described by the entries below it.
  • the SampleToGroup box maps each NAL unit of a SVC sample to its group_description_index. This allows for an efficient method of reading, parsing and skipping any un-needed data. If the entire scalable sample is desired, then the whole SVC sample is read. If only particular scalable layers are desired, then those NAL units (VCL or otherwise) that do not map to the desired layer are skipped while parsing.
  • the modified file format defines a mechanism to access and extract the entire scalable layer, or portions thereof, stored in the file format.
  • either the scalability information in the SEI messages is used, or alternative rules to drop un-desired NAL units over the network are used.
  • One possibility is to define rules as part of the RTP mapping process to enable such alternative functionality.
  • the modified file format is backwards compatible with the existing AVC File Format specification to the fullest extent. There is no change in the DecoderConfiguration, Sample syntax structure and elementary stream structure when storing AVC compatible streams.
  • the File Type indication signals the presence of an AVC compatible base layer stream by using the brand ‘avc1’ in its compatible_brands list. The presence of AVC compatible streams is detected by reading the Profile/Level Indicators present in each group_description_entry. Alternatively, a separate ‘hint’ track is also created for the AVC compatible base layer stream.
  • the extracted access units are transmitted over the network using Real-time Transport Protocol (RTP).
  • RTP has its own headers which are added to the payload, in this case the extracted access units.
  • the hint tracks include pre-generated RTP headers and pointers to the scalable data.
  • the proper hint track is accessed to retrieve the pre-generated RTP headers and pointers, thereby eliminating the additional overhead of generating the RTP headers.
  • Each sample in an AVC hint track stores information referring to the AVC Base Layer compatible NAL units in the scalable video stream. All NAL units within this sample have the same timestamp.
  • the variable sample_index indicates the sample number of the SVC sample that contains AVC base layer NAL units.
  • the variable nalunitcount indicates the number of consecutive NAL units from the beginning of the SVC sample that are AVC compatible.
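  • The way a hint track sample points at base layer data can be sketched in Python as follows. The sketch assumes that sample numbers start at 1, as is conventional in the ISO Base Media File Format, and models each SVC sample as a list of already separated NAL units; the `base_layer_nal_units` helper is hypothetical.

```python
from typing import List, Sequence


def base_layer_nal_units(samples: Sequence[Sequence[bytes]],
                         sample_index: int,
                         nalunitcount: int) -> List[bytes]:
    """Return the leading AVC compatible NAL units of one SVC sample.

    sample_index is assumed to be 1-based; nalunitcount is the number of
    consecutive NAL units, counted from the start of the sample.
    """
    if not 1 <= sample_index <= len(samples):
        raise IndexError("sample_index is out of range")
    sample = samples[sample_index - 1]
    if not 0 <= nalunitcount <= len(sample):
        raise ValueError("nalunitcount exceeds the NAL units in the sample")
    return list(sample[:nalunitcount])


SAMPLES = [
    [b"avc0", b"svc0a", b"svc0b"],
    [b"avc1a", b"avc1b", b"svc1"],
]
```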
  • FIG. 7 illustrates an exemplary method of implementing the modified SVC file format.
  • a file server configured to implement the modified SVC file format receives a request from an end user device.
  • the request identifies the name of a specific data stream to be transmitted.
  • the specific data stream corresponds to a specific scalable data stream stored in the modified file format.
  • the request also includes device requirements of the end user device, such as a supported resolution and frame rate.
  • the file server determines a track associated with the specified data stream.
  • the file server decodes the track determined in the step 210 .
  • one or more metadata boxes within the decoded track are used to determine a description entry associated with the device requirements.
  • the decoded track includes a SampleGroupDescription Box which is used to determine the associated description entry.
  • the description entry defines parameter values corresponding to the device requirements.
  • the one or more metadata boxes are used to determine the access units, and the specific NAL units within each access unit, within the specific scalable data stream. The specific NAL units within the access units are determined according to the description entry determined in the step 230 .
  • the decoded track includes a SampleToGroup Box which is used to determine the specific access units.
  • the specific access units determined in the step 240 are extracted from the specific scalable data stream.
  • the extracted access units are a sub-layer data stream of the specific scalable data stream.
  • the sub-layer data stream matches the device requirements received in the step 200 and is therefore supported by the end user device.
  • the sub-layer data stream is transmitted to the end user device.

Abstract

The currently existing ISO/AVC file format is modified by providing extensions to store and access video content currently being defined by the SVC standard. Specifically, extensions to the AVC file format are made to provide a new SVC file format that enables the storage and access of scalable video data. The scalable video data is stored as a single track within a media data section of the SVC file format. New extensions are defined for description entries and boxes within a metadata section of the SVC file format. These extensions provide means for extracting sub-streams or layers from the single track of scalable video data stored in the media data section.

Description

    RELATED APPLICATIONS
  • This application claims priority of U.S. provisional application Ser. No. 60/699,535, filed Jul. 15, 2005, and entitled “Scalable Video Coding (SVC) file format”, by the same inventors. This application incorporates U.S. provisional application Ser. No. 60/699,535, filed Jul. 15, 2005, and entitled “Scalable Video Coding (SVC) file format” in its entirety by reference.
  • FIELD OF THE INVENTION
  • The present invention relates to the field of video encoding. More particularly, the present invention relates to the field of SVC encoding and extending the current AVC file format to support the storage of video coded data using scalable video coding.
  • BACKGROUND OF THE INVENTION
  • A file format is a particular way to encode information for storage in a computer file. The conventional manner of storing the format of a file is to explicitly store information about the format in the file system. This approach keeps the metadata separate from both the main data and the file name.
  • The ISO Base Media File Format is designed to contain timed media information or media data streams, such as a movie. The stored media information can be transmitted locally or via a network or other stream delivery mechanism. The files have a logical structure, a time structure, and a physical structure. The logical structure of the file includes a set of time-parallel tracks. The time structure of the file provides the tracks with sequences of data samples in time, and those sequences are mapped into a timeline of the overall media data stream by optional edit lists. The physical structure of the file separates the data needed for logic, time, and structural de-composition, from the media data samples themselves. This structural information is concentrated in a metadata box, possibly extended in time by metadata fragment boxes. The metadata box documents the logical and timing relationships of the data samples, and also includes pointers to where the data samples are stored.
  • Each media data stream is included in a track specialized for that media type (audio, video etc.), and is further parameterized by a sample entry. The sample entry includes the ‘name’ of the exact media type, for example the type of the decoder needed to decode the media data stream, and other parameters needed for decoding. There are defined sample entry formats for a variety of media types.
  • Support for metadata takes two forms. First, timed metadata is stored in an appropriate track and synchronized with the media data it is describing. Second, there is general support for non-timed metadata attached to the media data stream or to an individual track. These generalized metadata structures are also used at the file level in the form of a metadata box. In this case, the metadata box is the primary access means to the stored media data streams.
  • In some cases, the data samples within a track have different characteristics or need to be specially identified. One such characteristic is the synchronization point, often a video I-frame. These points are identified by a special table in each track. More generally, the nature of dependencies between track samples is documented in this manner. There is also the concept of sample groups. Sample groups permit the documentation of arbitrary characteristics that are shared by some of the data samples in a track. In the Advanced Video Coding (AVC) file format, sample groups are used to support the concept of layering and sub-sequences.
  • The AVC file format defines a storage format for video streams encoded according to the AVC standard. The AVC file format extends the ISO Base Media File Format. The AVC file format enables AVC video streams to be used in conjunction with other media streams, such as audio, to be formatted for delivery by a streaming server, using hint tracks, and to inherit all the use cases and features of the ISO Base Media File Format.
  • FIG. 1 illustrates an exemplary configuration of an AVC file format 10 including a media data section 20 and a metadata section 30. Each data stream is stored in the media data section 20. Multiple data streams can be stored in one file format. As shown in FIG. 1, four data streams 22, 24, 26, and 28 are stored in the media data section 20. For each data stream stored in the media data section of the AVC file format there is a corresponding track stored in the metadata section. In FIG. 1, a track 32 corresponds to the data stream 22, a track 34 corresponds to the data stream 24, a track 36 corresponds to the data stream 26, and a track 38 corresponds to the data stream 28. In general, there are N tracks stored in the metadata section for N data streams stored in the media data section.
  • The H.264, or MPEG-4 Part 10, specification is a high compression digital video codec standard written by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) in a collective partnership known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are technically identical, and the technology is also known as AVC, for Advanced Video Coding. It should be noted that H.264 is a name related to the ITU-T line of H.26x video standards, while AVC relates to the ISO/IEC MPEG side of the partnership project that completed the work on the standard, after earlier development done in the ITU-T as a project called H.26L. The standard is usually referred to as H.264/AVC (or AVC/H.264 or H.264/MPEG-4 AVC or MPEG-4/H.264 AVC) to emphasize the common heritage. Occasionally, it has also been referred to as “the JVT codec”, in reference to the JVT organization that developed it.
  • Currently JVT is working on a new codec known as the Scalable Video Codec (SVC), which would be an extension to the existing AVC codec. Work on the SVC started independently in the MPEG domain initially as a part of the MPEG-21 standard in 2003. But during its development in 2004, it was merged with the activities of the JVT group with a focus towards developing coding technology that would be backwards compatible with the existing AVC codec. As such it currently is jointly developed by the JVT group in MPEG and ITU-T. The goal of the Scalable Video Codec (SVC) activity is to address the need and provide for scalability in the Spatial, Temporal and Quality (SNR) levels.
  • The existing file formats (ISO/MP4 and AVC) do not provide an easy and clear mechanism to extract the different variations of the spatial, temporal and SNR (quality) layers from the stored media data in the file format. Therefore, this information must be extracted by parsing the coded media stream, which is very inefficient and slow. Thus, there is a need to enhance and define new extensions to support the storage of emerging video coding standards such as SVC and to address the existing limitations of current file format storage methods. These new extensions define a structuring and grouping mechanism for the dependencies that exist in a group of pictures and within each sample to obtain a flexible stream structure that provides for spatial, temporal, and quality flexibility. The SVC standard proposes to encode the entire scalable media data as one single scalable bitstream, from which variants of temporal, spatial and quality layers can be extracted.
  • In the AVC standard, each video stream is encoded, and subsequently decoded, as an independent stream according to a particular frame rate, resolution, and quality. According to the SVC standard, from the single encoded video stream, referred to as a SVC elementary stream, multiple different types of video can be extracted, for example a low resolution video stream, a standard resolution video stream, or a high resolution video stream. To support the storage and extraction of such scalable video streams in the file format, the file formats need to be modified.
  • The SVC standard is currently under development and as the SVC standard defines a new design for a video codec, an appropriately defined new file format standard is also required to enable the storage and extraction of the new SVC video streams. To support the new SVC video streams, a new SVC file format is under development which extends the AVC file format to support the storage of the SVC video streams. However, specific extensions that define access to the stored scalable video have yet to be developed.
  • SUMMARY OF THE INVENTION
  • A system and method are described for extending the current ISO/MP4/AVC File Format to store video content, such as that coded using the MPEG-4: Part 10/Amd-1 Scalable Video Codec (SVC) standard, which is currently under development in MPEG/ITU-T. Specifically, extensions to the AVC file format are made to provide a new SVC file format that enables the storage and access of scalable video data. The scalable video data can be stored as a single track within a media data section of the SVC file format. New extensions are defined for description entries and boxes within a metadata section of the SVC file format. These extensions provide means for extracting sub-streams or layers from the single track of scalable video data stored in the media data section.
  • In one aspect, a modified file format includes a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream. The scalable data stream can comprise a scalable video stream. The scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream. The modified file format can comprise a modified Scalable Video Coding (SVC) file format. The scalable data stream can comprise a single encoded track. The scalable data stream can comprise a series of access units. The one or more metadata boxes can be configured to define the sub-layer data stream according to one or more device requirements received from an end user device capable of processing the sub-layer data stream. The one or more metadata boxes can be further configured to define one of a plurality of description entries according to the one or more device requirements. The one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream. The one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry. The one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units. The one or more metadata boxes can comprise extensions to the Scaling Video Coding (SVC) standards. The scalable data stream can comprise a plurality of sub-layer data streams. The one or more metadata boxes can be configured to define a hint track associated with the sub-layer data stream. The sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • In another aspect, a file server configured to utilize a modified file format is described. The file server includes a memory configured to store and extract data according to the modified file format, wherein the modified file format includes a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream. The file server also includes a processing module configured to provide control instruction to the memory and to extract the sub-layer data stream from the scalable data stream. The scalable data stream can comprise a series of access units. The file server can also include a network interface module configured to receive one or more device requirements from an end user device and to transmit the defined sub-layer data stream. The one or more metadata boxes can be configured to define one of a plurality of description entries according to the one or more device requirements. The one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream. The one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry. The one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units. The scalable data stream can comprise a scalable video stream. The scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream. The modified file format can comprise a modified Scalable Video Coding (SVC) file format. The scalable data stream can comprise a single encoded track. The scalable data stream can comprise a plurality of sub-layer data streams.
  • In yet another aspect, a system configured to utilize a modified file format is described. The system includes an end user device to transmit one or more device requirements and a file server configured to receive the one or more device requirements and to utilize the modified file system. The file server includes a memory and a processing module. The memory is configured to store and extract data according to the modified file format, wherein the modified file format comprises a media data section to store a scalable data stream, and a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream according to the one or more device requirements. The processing module is configured to provide control instruction to the memory and to extract the sub-layer data stream from the media data section. The file server can also include a network interface module configured to receive the one or more device requirements from the end user device and to transmit the defined sub-layer data stream. The scalable data stream can comprise a series of access units. The one or more metadata boxes can be configured to define one of a plurality of description entries according to the one or more device requirements. The one or more metadata boxes can be further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream. The one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry. The one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units. The scalable data stream can comprise a scalable video stream. 
The scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream. The modified file format can comprise a modified Scalable Video Coding (SVC) file format. The one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) standards. The scalable data stream can comprise a single encoded track. The scalable data stream can comprise a plurality of sub-layer data streams. The one or more metadata boxes can be configured to define a hint track associated with the sub-layer data stream. The sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • In another aspect, a method of extracting a data stream from a modified file format is described. The method includes receiving a request for a specific data stream, wherein the request includes one or more device requirements, associating the request with a specific scalable data stream stored in a media data section of the modified file format, determining one or more tracks corresponding to the specific scalable data stream, wherein the one or more tracks are stored in a metadata section of the modified file format, further wherein each track comprises one or more metadata boxes, determining a sub-layer data stream of the specific scalable data stream according to the one or more device requirements, wherein the sub-layer data stream is determined using the one or more metadata boxes, and extracting the sub-layer data stream from the stored scalable data stream. The method can also include transmitting the extracted sub-layer data stream. The method can also include decoding the determined one or more tracks prior to determining the sub-layer data stream. The one or more metadata boxes can comprise extensions to the Scaling Video Coding (SVC) file format standards. The scalable data stream can comprise a scalable video stream. The scalable video stream can comprise a Scalable Video Coding (SVC) elementary stream. The modified file format can comprise a modified Scalable Video Coding (SVC) file format. The scalable data stream can comprise a single encoded track. The scalable data stream can comprise a series of access units. The method can also include configuring the one or more metadata boxes to define one of a plurality of description entries according to the one or more device requirements. The method can also include configuring the one or more metadata boxes to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream. 
The one or more metadata boxes can comprise a SVC Sample Group Description Box configured to define the one description entry. The one or more metadata boxes can comprise a SVC Sample To Group Box to define and group the sub-set of access units. The one or more metadata boxes can comprise extensions to the Scalable Video Coding (SVC) standards. The scalable data stream can comprise a plurality of sub-layer data streams. The method can also include storing an encoded version of the scalable data stream in the media data section. The scalable data stream can be encoded according to a Scalable Video Coding (SVC) standard. The method can also include configuring the one or more metadata boxes to define a hint track associated with the sub-layer data stream. The sub-layer data stream can comprise an Advanced Video Coding (AVC) compatible base layer stream.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an exemplary configuration of an AVC file format.
  • FIG. 2 illustrates a block diagram of an exemplary network including a file server configured to implement a modified file format.
  • FIG. 3 illustrates an exemplary block diagram of the internal components of the file server of FIG. 2.
  • FIG. 4 illustrates an exemplary configuration of a SVC elementary stream.
  • FIG. 5 illustrates an exemplary configuration of the modified SVC file format.
  • FIG. 6 illustrates an exemplary structure of a SVC access unit according to the SVC standard.
  • FIG. 7 illustrates an exemplary method of implementing the modified SVC file format.
  • Embodiments of the modified file format are described relative to the several views of the drawings. Where appropriate and only where identical elements are disclosed and shown in more than one drawing, the same reference numeral will be used to represent such identical elements.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • FIG. 2 illustrates a block diagram of an exemplary network including a file server configured to implement a modified file format. A file server 50 is coupled to a playback device 60 via a network 70. The network is any conventional network, wired or wireless, capable of transmitting data. The playback device 60 is any conventional device capable of receiving and processing transmitted data.
  • FIG. 3 illustrates an exemplary block diagram of the internal components of the file server 50 of FIG. 2. The file server 50 is any conventional computing device configurable to implement the modified file format. The file server 50 includes a processing module 82, a host memory 84, a video memory 86, a mass storage device 88, and an interface circuit 90, all coupled together by a conventional bidirectional system bus 92. The interface circuit 90 includes a physical interface circuit for sending and receiving communications over the network 70 (FIG. 2). The interface circuit 90 is implemented on a network interface card within the file server 50. However, it should be apparent to those skilled in the art that the interface circuit 90 can be implemented within the file server 50 in any other appropriate manner, including building the interface circuit onto the motherboard itself. The mass storage device 88 may include both fixed and removable media using any one or more of magnetic, optical or magneto-optical storage technology or any other available mass storage technology. The system bus 92 enables access to any portion of the host memory 84 and the mass storage device 88, and data transfer between and among the processing module 82, the host memory 84, the video memory 86, and the mass storage device 88. The host memory 84 functions as system main memory, which is used by the processing module 82.
  • The file server 50 is also coupled to a number of peripheral input and output devices including a keyboard 94, a mouse 96, and an associated display 98. The keyboard 94 is coupled to the processing module 82 for allowing a user to input data and control commands into the file server 50. The mouse 96 is coupled to the keyboard 94, or coupled to the processing module 82, for manipulating graphic images on the display 98 as a cursor control device. The file server 50 includes graphics circuitry 100 to convert data into signals appropriate for display. It is understood that the configuration of file server 50 shown in FIG. 3 is for exemplary purposes only and that file server 50 can be configured in any other conventional manner.
  • Since the SVC file format is an extension of the AVC file format, the SVC file format also includes a metadata section and a media data section. The media data section stores unaltered, encoded SVC elementary streams, such as video data streams, where the SVC elementary streams consist of a series of access units. FIG. 4 illustrates an exemplary configuration of a SVC elementary stream 40. The SVC elementary stream 40 includes a series of successive units, referred to as access units (AU). A decoder receiving the SVC elementary stream decodes each access unit into a picture, thereby producing a video sequence.
  • The metadata section stores information about each SVC elementary stream stored in the media data section. Such information includes, but is not limited to, the type of the video stream, the resolution(s) of the video stream, the frame rate(s) of the video stream, the storage address of each access unit in the video stream, random access points within the video stream, and timing as to when each access unit is to be decoded.
  • Scalable video streams enable adaptation to various networks. For example, if a scalable video stream is encoded at 60 frames per second, but a playback device only supports 30 frames per second, then only a portion of the scalable video stream is transmitted to the playback device. As another example, if the quality of the encoded scalable video stream is very high, say 10 Mbps, but the network over which the scalable video stream is to be transmitted only supports 1 Mbps, then again only a portion of the scalable video stream is transmitted to match the supported transmission speed of the network. In this manner, all or only portions of the encoded scalable video stream are extracted from the file format, based on the network or playback device requirements.
  • A scalable video stream is fully scalable in the sense that if the entire scalable video stream is decoded, the result is full resolution, full frame rate, and high quality. However, if the network or the playback device does not support the full resolution, the full frame rate, or the high quality of the entire scalable video stream, then only portions of the scalable video stream are extracted from the media data section of the file format and transmitted over the network, thereby conserving bandwidth.
  • For storage in the SVC file format, the scalable video stream is encoded according to the format defined in the SVC standard. The encoded scalable video stream is stored in the media data section of the SVC file format. Tracks are stored in the metadata section of the SVC file format, where the tracks contain metadata information corresponding to the scalable video streams stored in the media data section. These tracks include information used to extract all or portions of the encoded scalable video streams stored in the media data section of the SVC file format. Within each track, new metadata boxes are defined. The new metadata boxes define the parameters used for extracting various types of video streams from the scalable video stream. The parameters used to define an extracted video stream include, but are not limited to, resolution, frame rate, and quality. For example, one extracted video stream corresponds to a low resolution requirement, a second extracted video stream corresponds to a standard resolution requirement, and a third extracted video stream corresponds to a high resolution requirement.
  • FIG. 5 illustrates an exemplary configuration of a modified SVC file format 110. The modified SVC file format 110 includes a media data section 120 and a metadata section 130. The media data section 120 includes one or more scalable data streams, or SVC elementary streams. The metadata section 130 includes one or more tracks. In one embodiment, there is one track in the metadata section 130 for each SVC elementary stream stored in the media data section 120. As shown in FIG. 5, the media data section 120 includes two SVC elementary streams 122 and 124. A track 132 in the metadata section 130 corresponds to the SVC elementary stream 122 stored in the media data section 120. A track 134 in the metadata section 130 corresponds to the SVC elementary stream 124 stored in the media data section 120. Each SVC elementary stream includes a series of access units. As shown in FIG. 5, SVC elementary stream 122 is expanded to show a portion of its access units.
  • New extensions are provided within the metadata section of the modified SVC file format to define and extract sub-layers within the stored scalable video streams, where a specific sub-layer data stream is determined by specified device requirements, such as resolution, frame rate, and/or quality. One or more metadata boxes are defined that identify and extract the specific access units that correspond to each sub-layer data stream. Each track is configured to include these one or more metadata boxes. FIG. 5 shows one embodiment in which a metadata box 140 and a metadata box 142 are defined to define and extract the specific access units that form the specific sub-layer data stream. In this embodiment, the metadata box 140 is referred to as a SampleGroupDescription Box and metadata box 142 is referred to as a SampleToGroup Box.
  • Referring to FIGS. 2 and 5, the playback device 60 sends a request to the file server 50 for a particular video stream, in this case the SVC elementary stream 122, stored in the media data section 120 of the file format 110. The request includes the required specifications of the playback device 60, for example the specific resolution and frame rate supported. Upon receiving this request, the file server 50 matches the requested video stream, SVC elementary stream 122, to its corresponding track 132 in the metadata section 130. Using a file format decoder, only the matching track 132 is decoded. The file format decoder does not decode the other tracks stored in the metadata section 130, nor does the file format decoder decode any of the encoded scalable video streams stored in the media data section 120, including the SVC elementary stream 122. Additionally, the media data stored in the media data section 120 is encoded differently than the metadata stored in the metadata section 130, and therefore requires a different decoder.
  • Once the track 132 is determined and decoded, the metadata box 140 is accessed. The metadata box 140 utilizes the one or more device requirements to determine a matching description entry. The matching description entry defines parameter values and a group_description_index value that correspond to the device requirements. The value of the group_description_index is used by the metadata box 142 to identify and extract specific access units in the corresponding SVC elementary stream 122. FIG. 5 illustrates the exemplary case where the metadata box 142 determines and extracts access units 1, 4, 7, and so on. The extracted access units remain unaltered and encoded. Once extracted, the access units are transmitted to the end user device as a sub-layer data stream that meets the device requirements originally provided.
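  • The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the class `DescriptionEntry`, the functions `match_entry` and `select_access_units`, the example resolutions and frame rates, and the per-access-unit index list are all hypothetical. The sketch further assumes that the highest-indexed entry fitting the device requirements is chosen and that access units with a lower or equal index are kept.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DescriptionEntry:
    # Hypothetical subset of the fields carried by a description entry.
    group_description_index: int
    width: int
    height: int
    frame_rate: float


def match_entry(entries: List[DescriptionEntry], max_width: int,
                max_height: int, max_frame_rate: float) -> Optional[DescriptionEntry]:
    """Return the highest-indexed entry that fits the device requirements, if any."""
    fitting = [e for e in entries
               if e.width <= max_width
               and e.height <= max_height
               and e.frame_rate <= max_frame_rate]
    return max(fitting, key=lambda e: e.group_description_index, default=None)


def select_access_units(sample_to_group: List[int], target_index: int) -> List[int]:
    """Return 1-based access unit numbers whose index does not exceed target_index."""
    return [number for number, index in enumerate(sample_to_group, start=1)
            if index <= target_index]


# Hypothetical stream in which every third access unit belongs to the base layer.
entries = [DescriptionEntry(1, 176, 144, 15.0),
           DescriptionEntry(2, 352, 288, 30.0),
           DescriptionEntry(3, 704, 576, 60.0)]
chosen = match_entry(entries, max_width=176, max_height=144, max_frame_rate=15.0)
if chosen is not None:
    units = select_access_units([1, 2, 3, 1, 2, 3, 1, 2, 3],
                                chosen.group_description_index)
```

  • With these hypothetical inputs the selected access units are 1, 4, and 7, which mirrors the exemplary case of FIG. 5. The selected access units would be copied out unaltered and still encoded.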
  • In the embodiment described above in relation to FIG. 5, each scalable video stream is stored as a single video stream in the media data section of the file format, and one corresponding track associated with the single video stream is stored in the metadata section of the file format. New extensions are defined that provide means to extract sub-streams or layers of interest from the single video stream. Alternatively, more than one track can be associated with the single video stream.
  • As an example of this alternative embodiment, suppose a separate track is created for different end-user devices. Each of these devices can still have its own internal sub-set of frame rates, spatial resolutions, quality, etc., which it can support. One scalable video stream supports various device requirements, such as spatial resolutions (QCIF, CIF, SD and HD), frame rates (7.5, 15, 30, 60 fps), and various qualities for the above. The stream is still stored in the media data section of the modified file format. In the metadata section, there can be three tracks, for example. Each of the three tracks refers to the same scalable video stream. However, each track operates on a sub-set of the entire scalable stream. For example, track 1 is customized for portable players (small screen size, low frame rate, etc), track 2 is customized for Standard TV, computers, etc (SD screen size, medium frame rate, etc), and track 3 is customized for HD players (large screen size, high frame rate, etc). Each track still includes the metadata boxes that define the description entries, since each track still supports some amount of variation (scalability). In this example, track 1 supports QCIF and CIF, 7.5 and 15 fps. Track 2 supports all of track 1 and SD and 30 fps. Track 3 supports all of track 2 and HD and 60 fps.
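  • The three-track example above can be sketched as a simple capability lookup. The table `TRACKS` restates the resolutions and frame rates given in the example; the function `pick_track` and the policy of preferring the lowest-numbered track that supports the request are assumptions made for illustration only.

```python
from typing import Optional

# Capabilities of the three example tracks, restated from the text.
TRACKS = {
    1: {"resolutions": {"QCIF", "CIF"},
        "frame_rates": {7.5, 15.0}},
    2: {"resolutions": {"QCIF", "CIF", "SD"},
        "frame_rates": {7.5, 15.0, 30.0}},
    3: {"resolutions": {"QCIF", "CIF", "SD", "HD"},
        "frame_rates": {7.5, 15.0, 30.0, 60.0}},
}


def pick_track(resolution: str, frame_rate: float) -> Optional[int]:
    """Return the lowest-numbered track supporting the request, or None."""
    for track_id in sorted(TRACKS):
        capabilities = TRACKS[track_id]
        if (resolution in capabilities["resolutions"]
                and frame_rate in capabilities["frame_rates"]):
            return track_id
    return None
```

  • Under this assumed policy, a request for CIF at 15 fps resolves to track 1, while a request for CIF at 30 fps resolves to track 2, because track 1 does not cover 30 fps.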
  • In this alternative embodiment, the specific video stream requested by the end user device does not correspond to a single track. Different methods are contemplated to determine the appropriate track in this case. First, the file server decodes a part of a track to determine the device requirements that it defines. Second, the file server parses the tracks at a high level to identify which one to use. Third, tracks have track headers that provide more information about the content they describe. In this particular case, the file server decodes some high level information from each track to decide which one to use. In general, any method that matches the device requirements received in the end user device request to the proper track can be used.
  • In one embodiment, extracting portions of a SVC elementary stream is supported by defining two new boxes, a SampleGroupDescription Box and a SVCSampleToGroup Box.
  • These new boxes define features and fields that are used to identify which access units within the scalable video stream to extract. The SampleGroupDescription Box defines unique description entries, where each description entry refers to a different type of video stream that can be extracted from the SVC elementary stream. For example, if the SVC elementary stream is encoded to support three different resolutions, low, standard, and high, then there are three description entries, one description entry for each supported resolution. Each description entry defines the description entry fields, as described below, for that particular resolution, low, standard, or high. If the SVC elementary stream is encoded to support three different resolutions, low, standard, and high, and also to support two different frame rates, high and low, then there are six different description entries, one for each combination of resolution and frame rate.
  • The SVC Dependency Description Entry, or simply ‘description entry’, documents and describes the various possible spatio-temporal combinations present within an SVC elementary stream. Each description entry is defined using a grouping type of ‘svcd’. The description entries are ordered in terms of their dependency. The first description entry documents the base layer description and subsequent description entries document the enhancement layer descriptions. A group_description_index is used to index the ordered description entries. The group_description_index is also used to group the various SVC NAL units. Each SVC NAL unit refers to one of the description entries.
  • Each SVC NAL unit in the SVC elementary stream is grouped using an index, the group_description_index, into the ordered list of description entries. NAL units referring to a particular index may require all or some of the NAL units referring to all the lower indices for proper decoding operation, but do not require any NAL unit referring to a higher index value. In other words, dependencies exist only in the direction of lower layers.
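  • The counting in the example above (three resolutions and two frame rates giving six description entries) can be illustrated with a short sketch. The function `enumerate_description_entries` and the dictionary layout are hypothetical, and the index assignment here simply follows enumeration order rather than any dependency ordering.

```python
from itertools import product
from typing import Dict, List, Sequence


def enumerate_description_entries(resolutions: Sequence[str],
                                  frame_rates: Sequence[str]) -> List[Dict[str, object]]:
    """Build one illustrative entry per (resolution, frame rate) combination."""
    return [
        {"index": index, "resolution": resolution, "frame_rate": frame_rate}
        for index, (resolution, frame_rate)
        in enumerate(product(resolutions, frame_rates), start=1)
    ]


# Three resolutions combined with two frame rates.
entries = enumerate_description_entries(["low", "standard", "high"], ["low", "high"])
```

  • Passing a single frame rate yields one entry per resolution, which corresponds to the three-entry case in the text.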
  • In one embodiment, the file server determines the sub-set of indices required for proper decoding operation based on the values of the description fields present within the description entries, for example the resolution and/or temporal rate. In another embodiment, the end user device determines the sub-set of indices.
  • The grouping type used to refer to the ordered list of description entries is ‘svcd’. The grouping type is used to link the entries present in this list to the SVCSampleToGroup Box, which includes the grouping of all the SVC samples and is described in greater detail below.
  • The following is an exemplary SVC Dependency Description Entry syntax:
  • class SVCDependencyDescriptionEntry ( ) extends
    VisualSampleGroupEntry (‘svcd’)
    {
      unsigned int (8) ProfileIndication;
      unsigned int (8) profile_compatibility;
      unsigned int (8) LevelIndication;
      unsigned int (8) temporal_level;
      unsigned int (8) dependency_id;
      unsigned int (8) temporalFrameRate;
      unsigned int (16) visualWidth;
      unsigned int (16) visualHeight;
      unsigned int (16) baseBitRate;
      unsigned int (16) maxBitRate;
      unsigned int (16) avgBitRate;
      unsigned int (8) progressiveRefinementLayerFlag;
      unsigned int (32) reserved = 0;
      //optional boxes or fields may follow when defined later
      ScalabilityInformationSEIBox( );
      //optional
    }
    class ScalabilityInformationSEIBox extends Box( ‘seib’)
    {
    //   contains all the fields as defined in Section 4.1 Scalability
    //   Information SEI message syntax in JSVM 2.0 Reference encoding
    //   description (N7084)
    }

    The variable profileIndication includes the profile code as defined in the AVC/SVC video specification. The variable profile_compatibility is a byte defined exactly the same as the byte which occurs between the profile_DC and level_DC in a sequence parameter set, as defined in the AVC/SVC video specification. The variable levelIndication includes the level code as defined in AVC/SVC video specification. The profileIndication and levelIndication fields indication fields present in each entry provide the profile and level values to which the particular layer is compatible. The variable temporal_level takes the value of the temporal_level syntax element present in the scalable extension NAL unit defined in the AVC/SVC video specification. This non-negative integer indicates the temporal level that the sample provides along time. The lowest temporal level is numbered as zero and the enhancement layers in the temporal direction are numbered as one or higher. The temporal_level field takes a default value of zero, in the case of AVC NAL units. In SVC Scalable Extension NAL units, if the extension_flag is equal to 0, then the parameters that specify the mapping of simple_priority_id to temporal_level are present in the SPS and are mapped accordingly.
  • The variable dependency_id takes the value of the dependency_id syntax element present in the scalable extension NAL unit defined in the AVC/SVC video specification. The dependency_id is a non-negative integer, with the value zero signaling that the NAL unit corresponds to the lowest spatial resolution, and all higher values signaling that the enhancement layers provide an increase in spatial resolution, in quality, or in both, for example coarse grain scalability. The dependency_id also controls the spatial scalability. The dependency_id field takes a default value of zero in the case of AVC NAL units. In SVC Scalable Extension NAL units, if the extension_flag is equal to 0, then the parameters that specify the mapping of simple_priority_id to dependency_id are present in the SPS and are mapped accordingly. The variable temporalFrameRate indicates the temporal frame rate that is associated with the temporal_level field in the entry. The variable visualWidth gives the value of the width of the coded picture in pixels in this layer of the SVC stream. The variable visualHeight gives the value of the height of the coded picture in pixels in this layer of the SVC stream. The variable baseBitRate gives the bitrate in bits/second of the minimum quality that is provided by this layer without any progressive refinements. NAL units in this and lower levels that fall within the dependency hierarchy are taken into account in the calculation. The variable maxBitRate gives the maximum rate in bits/second that is provided by this layer over any window of one second. NAL units in this and lower levels that fall within the dependency hierarchy are taken into account in the calculation. The variable avgBitRate gives the average bit rate in bits/second. NAL units in this and lower levels are taken into account in the calculation. The variable progressiveRefinementLayerFlag, when true, indicates that this layer contains progressive refinement NAL units, and is FGS scalable.
  • A SVCSampleToGroup Box is used to extract SVC scalable sub-streams from the SVC Elementary stream stored in the SVC file format, depending on the constraints imposed in terms of temporal, spatial and quality requirements. The SVCSampleToGroup Box provides the grouping information for each NAL unit of an SVC sample. The grouping information is provided by means of a group_description_index, which associates each NAL unit with its description information present in the SVCDependencyDescriptionEntry. The group_description_index refers to a specific description entry. The group_description_index ranges from 1 to the number of sample group entries in the SampleGroupDescription Box. Each NAL unit within the access units is assigned a specific group_description_index value. The requirements specified by the end user device ultimately determine a specific group_description_index value. All NAL units within the access units that are assigned the determined group_description_index value, or those NAL units within the access units with a group_description_index value lower than the determined group_description_index value, are extracted from the stored scalable video stream in the media data section of the file format. Those extracted access units are transmitted to the end user device. In this manner, portions of a single scalable video stream can be extracted and transmitted to an end user device depending on the requirements specified by the end user device.
  • The following is an exemplary SVCSampleToGroup Box syntax:
  • aligned (8) class SVCSampleToGroupBox extends FullBox (‘svcg’,
    version = 0, 0)
    {
      unsigned int (32) grouping_type; // Grouping Type ‘svcd’
      unsigned int sample_count; // calculated from the Sample to Chunk Box
      for (int i=1; i <= sample_count; i++)
      {
        unsigned int (8) numberOfNalUnits;
        for (int j=1; j <= numberOfNalUnits; j++)
        {
        unsigned int (8)   group_description_index;
        unsigned int (1)   isDiscardableNalUnitFlag;
        unsigned int (1)   isPRNalUnitFlag;
        unsigned int (2)   quality_level;
        unsigned int (4)   reserved = 0;
        }
      }
    }

    The variable grouping_type is an integer that identifies the type of sample grouping used. It links the SVCSampleToGroup Box to its associated sample group description table having the value ‘svcd’ for the grouping type. The variable sample_count denotes the number of samples that are present in the media track for the SVC Elementary stream, and is an inferred value that is calculated using the Sample to Chunk Box. The variable numberOfNalUnits is an integer that gives the number of NAL units present in an SVC sample. The variable group_description_index is an integer that gives the index of the sample group entry which describes the NAL units in this group. The index ranges from 1 to the number of sample group entries in the particular SampleGroupDescription Box, or takes the value 0 to indicate that the particular SVC Sample is a member of no group of this type. The variable isDiscardableNalUnitFlag is a flag, the semantics of which are specified in the NAL unit semantics of the AVC video specification. The variable isPRNalUnitFlag is a flag which, if equal to 1, indicates that this NAL unit is a progressive refinement NAL unit. The variable quality_level specifies the quality level for the current NAL unit as specified in the NAL unit header or in the quality_level_list[ ]. If absent, its value is inferred to be zero.
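As a minimal sketch of how a reader might decode the per-sample grouping entries described above, the following Python assumes the fields are packed most significant bit first, so that each NAL unit entry occupies two bytes (one byte for group_description_index, one byte holding the two flags, quality_level, and the reserved bits). The names NalGroupEntry and parse_nal_group_entries are illustrative only and are not part of the file format.

```python
from dataclasses import dataclass


@dataclass
class NalGroupEntry:
    group_description_index: int
    is_discardable: bool
    is_progressive_refinement: bool
    quality_level: int


def parse_nal_group_entries(buf: bytes, offset: int = 0):
    """Parse the entries of one SVC sample.

    Assumed layout: numberOfNalUnits (8 bits), then per NAL unit
    group_description_index (8 bits) and one byte packed as
    flag(1) flag(1) quality_level(2) reserved(4), MSB first.
    Returns the entries and the offset just past them.
    """
    count = buf[offset]
    offset += 1
    entries = []
    for _ in range(count):
        index, packed = buf[offset], buf[offset + 1]
        entries.append(
            NalGroupEntry(
                group_description_index=index,
                is_discardable=bool((packed >> 7) & 0x1),
                is_progressive_refinement=bool((packed >> 6) & 0x1),
                quality_level=(packed >> 4) & 0x3,
            )
        )
        offset += 2
    return entries, offset
```

Calling the function once per sample, and passing the returned offset back in, walks the whole table for sample_count samples.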
  • The sub-sample information box defined in the ISO base syntax cannot be easily used for defining a SampleToGroup box used to extract select portions of the SVC Elementary stream. First, the sub-sample information box does not have a ‘grouping_type’ indicator to map to the relevant description entries. Second, the sub-sample information box is inefficient for this purpose, since it requires a 32-bit entry called ‘sample_count’. This was originally intended to indicate a run of samples, but in the case of the SVC File Format, each SVC Sample may have a different number of sub-samples, making the use of this sample_count mandatory for each SVC sample. Third, the new file format does not need to signal each sub-sample size, since a count of NAL units is used and the length of each NAL unit is already specified before it.
  • FIG. 6 illustrates an exemplary structure of an SVC Sample, or SVC access unit, according to the SVC standard. SVC Samples are externally framed and have a size supplied by that external framing. The SVC access unit is made up of a set of NAL units. Each NAL unit is represented with a length, which indicates the length in bytes of the following NAL Unit. The length field is configured to be of 1, 2, or 4 bytes. The configuration size is specified in the decoder configuration record. Each NAL Unit contains the NAL unit data as specified in the ISO/IEC AVC/SVC video specification.
  • The SVC Decoder Configuration Record includes the size of the length field used in each sample to indicate the length of its contained NAL units as well as the initial parameter sets. The SVC Decoder Configuration Record is externally framed, meaning its size is supplied by the structure that contains it. The SVC Decoder Configuration Record also includes a version field. Incompatible changes to the SVC Decoder Configuration Record are indicated by a change of version number.
  • When used to provide the configuration of a parameter set elementary stream or a video elementary stream used in conjunction with a parameter set elementary stream, the configuration record contains no sequence or picture parameter sets; that is, the variables numOfSequenceParameterSets and numOfPictureParameterSets both have the value zero.
  • The values for SVCProfileIndication, SVCLevelIndication, and the flags which indicate profile compatibility are valid for all parameter sets of the scalable video stream. The level indication indicates a level of capability equal to or greater than the highest level indicated in the included parameter sets. Each profile compatibility flag is set if all the included parameter sets set that flag. The profile indication indicates a profile to which the entire stream conforms. The individual profiles and levels of each layer are documented in the SVCDependencyDescriptionEntry box.
  • The following is an exemplary SVC Decoder Configuration Record syntax:
  • aligned (8) class SVCDecoderConfigurationRecord {
      unsigned int (8) configurationVersion = 1;
      unsigned int (8) SVCProfileIndication;
      unsigned int (8) profile_compatibility;
      unsigned int (8) SVCLevelIndication;
      bit (6) reserved = ‘111111’b;
      unsigned int (2) lengthSizeMinusOne;
      bit (3) reserved = ‘111’b;
      unsigned int (5) numOfSequenceParameterSets;
      for (i=0; i< numOfSequenceParameterSets; i++) {
        unsigned int (16) sequenceParameterSetLength;
        bit (8*sequenceParameterSetLength) sequenceParameterSetNALUnit;
      }
      unsigned int (8) numOfPictureParameterSets;
      for (i=0; i< numOfPictureParameterSets; i++) {
        unsigned int (16) pictureParameterSetLength;
        bit (8*pictureParameterSetLength) pictureParameterSetNALUnit;
      }
    }

    The variable SVCProfileIndication contains the profile code as defined in the SVC specification. The variable profile_compatibility is a byte defined the same as the byte which occurs between the profile_idc and level_idc in a sequence parameter set, as defined in the AVC specification. The variable SVCLevelIndication contains the level code as defined in the AVC specification. The variable lengthSizeMinusOne indicates the length in bytes of the NALUnitLength field in an SVC video sample or SVC parameter set sample of the associated stream minus one. For example, a size of one byte is indicated with a value of 0. The value of this field is one of 0, 1, or 3, corresponding to a length encoded with 1, 2, or 4 bytes, respectively. The variable numOfSequenceParameterSets indicates the number of sequence parameter sets that are used as the initial set of sequence parameter sets for decoding the SVC elementary stream. The variable sequenceParameterSetLength indicates the length in bytes of the sequence parameter set NAL unit as defined in the AVC specification. The variable sequenceParameterSetNALUnit contains a sequence parameter set NAL Unit, as specified in the AVC specification. Sequence parameter sets occur in order of ascending parameter set identifier with gaps being allowed. The variable numOfPictureParameterSets indicates the number of picture parameter sets that are used as the initial set of picture parameter sets for decoding the SVC elementary stream. The variable pictureParameterSetLength indicates the length in bytes of the picture parameter set NAL unit as defined in the AVC specification. The variable pictureParameterSetNALUnit contains a picture parameter set NAL Unit, as specified in the AVC specification. Picture parameter sets occur in order of ascending parameter set identifier with gaps being allowed.
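The record layout above can be read with a few lines of Python. This is only a sketch under the assumption that the fields are stored big-endian and MSB first, as is usual for this syntax notation; the class and function names are hypothetical, and no validation of the reserved bits or of truncated input is attempted.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class SVCDecoderConfig:
    configuration_version: int
    profile_indication: int
    profile_compatibility: int
    level_indication: int
    length_size_minus_one: int
    sequence_parameter_sets: List[bytes]
    picture_parameter_sets: List[bytes]


def _read_parameter_sets(buf: bytes, pos: int, count: int):
    """Read `count` entries, each a 16-bit length followed by that many bytes."""
    sets = []
    for _ in range(count):
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        sets.append(buf[pos:pos + length])
        pos += length
    return sets, pos


def parse_svc_decoder_config(buf: bytes) -> SVCDecoderConfig:
    version, profile, compat, level, b4, b5 = struct.unpack_from(">6B", buf, 0)
    length_size_minus_one = b4 & 0x03  # low 2 bits; upper 6 bits are reserved
    num_sps = b5 & 0x1F                # low 5 bits; upper 3 bits are reserved
    sps, pos = _read_parameter_sets(buf, 6, num_sps)
    num_pps = buf[pos]
    pps, pos = _read_parameter_sets(buf, pos + 1, num_pps)
    return SVCDecoderConfig(version, profile, compat, level,
                            length_size_minus_one, sps, pps)
```

The length_size_minus_one value recovered here is what a reader would later use to interpret the NALUnitLength prefix of every NAL unit in a sample.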
  • As described herein, the scalable SVC video stream is stored as a single track. If the scalable SVC video stream has a base layer that is AVC compatible, then those AVC compatible NAL units that are present in each SVC sample are grouped together using the new extensions as previously described. To find which entries are AVC compatible, the Profile and Level Indicators present in the SVCDependencyDescriptionEntries are used to parse through the SVCSampleToGroup Box and extract only those NAL units from each SVC Sample that are AVC compatible.
  • The modified SVC file format is derived from the ISO Base Media File Format. As such, there is a correspondence of terms in the modified SVC file format and the ISO Base Media File Format. For example, the terms stream and access unit used in the modified SVC file format correspond to the terms track and sample, respectively, in the ISO Base Media File Format.
  • In the terminology of the ISO Base Media File Format specification, SVC tracks (both video and parameter set tracks) are video or visual tracks. They therefore use a handler_type of ‘vide’ in the HandlerBox, a video media header ‘vmhd’, and, as defined below, a derivative of the VisualSampleEntry.
  • The sample entry and sample format for SVC video elementary streams are defined below. Definitions include:
  • Box Types: ‘avc1’, ‘avcC’, ‘svc1’, ‘svcC’
    Container: Sample Table Box (‘stbl’)
    Mandatory: Either the avc1 (if base layer is AVC) or svc1 box is mandatory.
    Quantity: One or more sample entries may be present

    To retain backwards compatibility with AVC, two types of visual sample entries are defined. First, if an SVC Elementary stream contains an AVC compatible base layer, then an AVC visual sample entry (‘avc1’) is used. Here, the entry contains initially an AVC Configuration Box, followed by an SVC Configuration Box as defined below. The AVC Configuration Box documents the Profile, Level and Parameter Set information pertaining to the AVC compatible base layer as defined by the AVCDecoderConfigurationRecord, in the AVC File Format specification. The SVC Configuration Box documents the Profile, Level and Parameter Set information pertaining to the SVC compatible enhancement layers as defined by the SVCDecoderConfigurationRecord, defined below. If the SVC Elementary stream does not contain an AVC base layer, then an SVC visual sample entry (‘svc1’) is used. The SVC visual sample entry contains an SVC Configuration Box, as defined below. This includes an SVCDecoderConfigurationRecord, also as defined below. Multiple sample descriptions are used, as permitted by the ISO Base Media File Format specification, to indicate sections of video that use different configurations or parameter sets.
  • The following is an exemplary AVC Configuration Box and the SVC Configuration Box syntax:
  • // Visual Sequences
    class AVCConfigurationBox extends Box (‘avcC’) {
      AVCDecoderConfigurationRecord ( ) AVCConfig;
    }
    class SVCConfigurationBox extends Box (‘svcC’) {
      SVCDecoderConfigurationRecord ( ) SVCConfig;
    }
    //   Use this if base layer is AVC compatible
    class AVCSampleEntry( ) extends VisualSampleEntry (‘avc1’) {
      AVCConfigurationBox avcconfig;
      SVCConfigurationBox svcconfig;
      MPEG4BitRateBox ( ); // optional
      MPEG4ExtensionDescriptorsBox ( ); // optional
    }
    //   Use this if base layer is NOT AVC compatible
    class SVCSampleEntry( ) extends VisualSampleEntry (‘svc1’) {
      SVCConfigurationBox svcconfig;
      MPEG4BitRateBox ( ); // optional
      MPEG4ExtensionDescriptorsBox ( ); // optional
    }

    The variable Compressorname in the base class VisualSampleEntry indicates the name of the compressor used with the value “\012AVC Coding” or “\012SVC Coding” being recommended (\012 is 10, the length of the string as a byte). If a separate parameter set stream is used, the variables numOfSequenceParameterSets and numOfPictureParameterSets must both be zero.
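The length-prefixed form of the recommended compressor name can be illustrated as follows. This sketch assumes the usual VisualSampleEntry convention of a fixed 32-byte field whose first byte holds the number of displayable bytes, with zero padding after the name; the helper name encode_compressorname is hypothetical.

```python
def encode_compressorname(name: str) -> bytes:
    """Build a 32-byte compressorname field: length byte, name, zero padding."""
    raw = name.encode("utf-8")
    if len(raw) > 31:
        raise ValueError("compressor name must fit in 31 bytes")
    return bytes([len(raw)]) + raw + bytes(31 - len(raw))


# "AVC Coding" is 10 bytes long, so the field starts with the byte 10 (octal 012).
field = encode_compressorname("AVC Coding")
```

The same helper applies to "SVC Coding", which is also 10 bytes long and therefore carries the same leading length byte.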
  • The format of a sample in an SVC video elementary stream is configured via the decoder specific configuration for the SVC elementary stream. The SVC Sample contains all the NAL units pertaining to all the scalable levels that are present for the primary coded picture as shown in FIG. 7.
  • The following is an exemplary SVC Sample syntax:
  • aligned (8) class SVCSample
    {
     unsigned int PictureLength = sample_size; // SVCSample size from SampleSizeBox
     for (i=0; i<PictureLength; ) // till the end of the picture
     {
      unsigned int ((SVCDecoderConfigurationRecord.lengthSizeMinusOne+1) * 8) NALUnitLength;
      bit (NALUnitLength * 8) NALUnit;
      i += (SVCDecoderConfigurationRecord.lengthSizeMinusOne+1) + NALUnitLength;
     }
    }

    The variable NALUnitLength indicates the size of a NAL unit measured in bytes. The length field includes the size of both the one byte NAL header and the EBSP payload but does not include the length field itself. The variable NALUnit contains a single NAL unit. The syntax of an NAL unit is as defined in the ISO/IEC AVC/SVC video specification and includes both the one byte NAL header and the variable length encapsulated byte stream payload.
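The sample syntax above amounts to a loop over length-prefixed NAL units. A minimal Python sketch of that loop follows, assuming big-endian length fields; split_nal_units is an illustrative name, and the error handling is one possible choice rather than anything the format prescribes.

```python
from typing import List


def split_nal_units(sample: bytes, length_size_minus_one: int) -> List[bytes]:
    """Split one SVC sample into its NAL units.

    Each NAL unit is preceded by a length field of 1, 2, or 4 bytes
    (lengthSizeMinusOne + 1) that counts the NAL header and payload
    but not the length field itself.
    """
    length_size = length_size_minus_one + 1
    if length_size not in (1, 2, 4):
        raise ValueError("length field must be 1, 2, or 4 bytes")
    nal_units = []
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError("truncated NALUnitLength field")
        nal_length = int.from_bytes(sample[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_length > len(sample):
            raise ValueError("NAL unit runs past the end of the sample")
        nal_units.append(sample[pos:pos + nal_length])
        pos += nal_length
    return nal_units
```

The sample bytes themselves would be located through the sample size and chunk offset tables of the track, which are outside the scope of this sketch.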
  • The definition for the AVC Parameter Set Elementary stream as specified in the AVC File Format also applies in this case for the storage of SVC Parameter Sets as separate elementary streams.
  • The width and height in the VisualSampleEntry document the correct cropped largest spatial resolution in terms of pixels that is obtained by decoding the entire scalable bitstream. To obtain the individual width and height of each layer, the group description entries are evaluated. Unless otherwise specified herein, all other definitions as specified in the AVC File Format Specification apply.
  • In general, the new SVC file format extensions provide for access and adaptation of fully scalable and layered scalable streams. The grouping methodology enables the creation of multiple Group Description Entries, wherein each description entry describes the properties of a scalable layer and its possible refinements in the case of fully scalable streams. Each description entry documents the temporal frame rate (temporal scalability), the spatial dimensions (spatial scalability), the range of bit-rates available from this layer, indicates if this layer is Fine Grain Scalable, the profile and level indications, and dependency information. The dependency hierarchy is easily maintained by the index of the group description entries where each higher index indicates that it depends on all or some of the lower layers described by the entries below it. The SampleToGroup box maps each NAL unit of a SVC sample to its group_description_index. This allows for an efficient method of reading, parsing and skipping any un-needed data. If the entire scalable sample is desired, then the whole SVC sample is read. If only particular scalable layers are desired, then those NAL units (VCL or otherwise) that do not map to the desired layer are skipped while parsing.
  • The modified file format defines a mechanism to access and extract the entire scalable layer, or portions thereof, stored in the file format. For transmission over a network and for possible adaptation of the coded data over the network, either the scalability information in the SEI messages is used, or alternative rules to drop un-desired NAL units over the network are used. One possibility is to define rules as part of the RTP mapping process to enable such alternative functionality.
  • The modified file format is backwards compatible with the existing AVC File Format specification to the fullest extent. There is no change in the DecoderConfiguration, Sample syntax structure and elementary stream structure when storing AVC compatible streams. The File Type indication signals the presence of an AVC compatible base layer stream by using the brand ‘avc1’ in its compatible_brands list. The presence of AVC compatible streams is detected by reading the Profile/Level Indicators present in each group_description_entry. Alternatively, a separate ‘hint’ track is also created for the AVC compatible base layer stream.
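A reader could check for the ‘avc1’ brand along the following lines. The sketch assumes the standard File Type Box layout of the ISO Base Media File Format (32-bit size, the type ‘ftyp’, a major brand, a minor version, then a list of four-character compatible brands) and does not handle the extended size forms; the function name is hypothetical.

```python
import struct
from typing import List, Tuple


def read_file_type_brands(ftyp_box: bytes) -> Tuple[bytes, List[bytes]]:
    """Return (major_brand, compatible_brands) from a complete 'ftyp' box."""
    size, box_type = struct.unpack_from(">I4s", ftyp_box, 0)
    if box_type != b"ftyp":
        raise ValueError("not a File Type Box")
    if size < 16 or size > len(ftyp_box):
        raise ValueError("unsupported or truncated box size")
    major_brand = ftyp_box[8:12]
    # Bytes 12..16 hold minor_version; the brand list fills the rest of the box.
    brands = [ftyp_box[i:i + 4] for i in range(16, size - 3, 4)]
    return major_brand, brands


def has_avc_compatible_base_layer(ftyp_box: bytes) -> bool:
    _, brands = read_file_type_brands(ftyp_box)
    return b"avc1" in brands
```

A positive result only signals that an AVC compatible base layer is declared; the profile and level indicators in the group description entries still identify which NAL units belong to it.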
  • In one embodiment, the extracted access units are transmitted over the network using Real-time Transport Protocol (RTP). RTP has its own headers which are added to the payload, in this case the extracted access units. The hint tracks include pre-generated RTP headers and pointers to the scalable data. When the file server transmits the extracted access units, the proper hint track is accessed to retrieve the pre-generated RTP headers and pointers, thereby eliminating the additional overhead of generating the RTP headers.
  • Each sample in an AVC hint track stores information referring to the AVC Base Layer compatible NAL units in the scalable video stream. All NAL units within this sample have the same timestamp.
  • Following is a sample syntax for an AVC hint track:
  • aligned (8) class AVCHintsample {
       unsigned int (32) sample_index;
       unsigned int (8) nalunitcount;
       unsigned int (8) reserved;
    }

    The variable sample_index indicates the sample number of the SVC sample that contains AVC base layer NAL units. The variable nalunitcount indicates the number of consecutive NAL units from the beginning of the SVC sample that are AVC compatible.
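To illustrate how such a hint sample might be consumed, the following Python sketch unpacks the six-byte structure (assumed big-endian) and uses nalunitcount to take the leading AVC compatible NAL units of an already split SVC sample. Both function names are hypothetical.

```python
import struct
from typing import List, Tuple


def parse_avc_hint_sample(buf: bytes) -> Tuple[int, int]:
    """Return (sample_index, nalunitcount); the reserved byte is ignored."""
    sample_index, nal_unit_count, _reserved = struct.unpack_from(">IBB", buf, 0)
    return sample_index, nal_unit_count


def avc_base_layer_nal_units(nal_units: List[bytes], nal_unit_count: int) -> List[bytes]:
    """Keep the first nal_unit_count NAL units, which the hint marks as AVC compatible."""
    return nal_units[:nal_unit_count]
```

Here nal_units stands for the NAL units of the SVC sample numbered sample_index, obtained by whatever sample reader the implementation already has.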
  • FIG. 7 illustrates an exemplary method of implementing the modified SVC file format. At the step 200, a file server configured to implement the modified SVC file format receives a request from an end user device. The request identifies the name of a specific data stream to be transmitted. The specific data stream corresponds to a specific scalable data stream stored in the modified file format. The request also includes device requirements of the end user device, such as a supported resolution and frame rate. At the step 210, the file server determines a track associated with the specified data stream. At the step 220, the file server decodes the track determined in the step 210. At the step 230, one or more metadata boxes within the decoded track are used to determine a description entry associated with the device requirements. In one embodiment, the decoded track includes a SampleGroupDescription Box which is used to determine the associated description entry. In this embodiment, the description entry defines parameter values corresponding to the device requirements. At the step 240, the one or more metadata boxes are used to determine the access units, and the specific NAL units within each access unit, within the specific scalable data stream. The specific NAL units within the access units are determined according to the description entry determined in the step 230. In one embodiment, the decoded track includes a SampleToGroup Box which is used to determine the specific access units. At the step 250, the specific access units determined in the step 240 are extracted from the specific scalable data stream. The extracted access units are a sub-layer data stream of the specific scalable data stream. The sub-layer data stream matches the device requirements received in the step 200 and is therefore supported by the end user device. At the step 260, the sub-layer data stream is transmitted to the end user device.
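Steps 230 through 250 can be sketched in Python as follows. This is an illustration under stated assumptions, not the server implementation: the device requirements are reduced to a maximum width, height, and frame rate; the highest fitting description index is chosen; and NAL units with a group_description_index of 0 (no group) are dropped. The names LayerDescription, select_description_index, and extract_sub_layer are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LayerDescription:
    index: int        # 1-based group_description_index of this entry
    frame_rate: int   # temporalFrameRate
    width: int        # visualWidth
    height: int       # visualHeight


def select_description_index(entries: List[LayerDescription], max_width: int,
                             max_height: int, max_frame_rate: int) -> Optional[int]:
    """Step 230: pick the highest entry index that the device can support."""
    fitting = [e.index for e in entries
               if e.width <= max_width
               and e.height <= max_height
               and e.frame_rate <= max_frame_rate]
    return max(fitting) if fitting else None


def extract_sub_layer(access_units: List[List[Tuple[int, bytes]]],
                      target_index: int) -> List[List[bytes]]:
    """Steps 240 and 250: keep NAL units whose index is at most the target.

    Each access unit is a list of (group_description_index, nal_unit) pairs.
    """
    return [[nal for index, nal in unit if 1 <= index <= target_index]
            for unit in access_units]
```

Keeping every NAL unit at or below the selected index follows the extraction rule described for the SVCSampleToGroup Box; a server with finer dependency information could skip lower-index entries that the selected layer does not depend on.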
  • The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of the principles of construction and operation of the invention. Such references, herein, to specific embodiments and details thereof are not intended to limit the scope of the claims appended hereto. It will be apparent to those skilled in the art that modifications can be made in the embodiments chosen for illustration without departing from the spirit and scope of the invention.

Claims (45)

1. A modified file format stored in a memory, the modified file format comprising:
a. a media data section to store a scalable data stream; and
b. a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream.
2. The modified file format of claim 1 wherein the scalable data stream comprises a scalable video stream.
3. The modified file format of claim 2 wherein the scalable video stream comprises a Scalable Video Coding (SVC) elementary stream.
4. The modified file format of claim 1 wherein the modified file format comprises a modified Scalable Video Coding (SVC) file format.
5. The modified file format of claim 1 wherein the scalable data stream comprises a single encoded track.
6. The modified file format of claim 1 wherein the scalable data stream comprises a series of access units.
7. The modified file format of claim 6 wherein the one or more metadata boxes are configured to define the sub-layer data stream according to one or more device requirements received from an end user device capable of processing the sub-layer data stream.
8. The modified file format of claim 7 wherein the one or more metadata boxes are further configured to define one of a plurality of description entries according to the one or more device requirements.
9. The modified file format of claim 8 wherein the one or more metadata boxes are further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
10. The modified file format of claim 9 wherein the one or more metadata boxes comprise a SVC Sample Group Description Box configured to define the one description entry.
11. The modified file format of claim 10 wherein the one or more metadata boxes comprise a SVC Sample To Group Box to define and group the sub-set of access units.
12. The modified file format of claim 1 wherein the one or more metadata boxes comprise extensions to the Scaling Video Coding (SVC) standards.
13. The modified file format of claim 1 wherein the scalable data stream comprises a plurality of sub-layer data streams.
14. The modified file format of claim 1 wherein the one or more metadata boxes are configured to define a hint track associated with the sub-layer data stream.
15. The modified file format of claim 14 wherein the sub-layer data stream comprises an Advanced Video Coding (AVC) compatible base layer stream.
16. A file server configured to utilize a modified file format, the file server comprising:
a. a memory configured to store and extract data according to the modified file format, wherein the modified file format comprises:
i. a media data section to store a scalable data stream; and
ii. a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream; and
b. a processing module configured to provide control instruction to the memory and to extract the sub-layer data stream from the scalable data stream.
17. The file server of claim 16 wherein the scalable data stream comprises a series of access units.
18. The file server of claim 17 further comprising a network interface module configured to receive one or more device requirements from an end user device and to transmit the defined sub-layer data stream.
19. The file server of claim 18 wherein the one or more metadata boxes are configured to define one of a plurality of description entries according to the one or more device requirements.
20. The file server of claim 19 wherein the one or more metadata boxes are further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
21. The file server of claim 20 wherein the one or more metadata boxes comprise a SVC Sample Group Description Box configured to define the one description entry.
22. The file server of claim 21 wherein the one or more metadata boxes comprise a SVC Sample To Group Box to define and group the sub-set of access units.
23. The file server of claim 16 wherein the scalable data stream comprises a scalable video stream.
24. The file server of claim 23 wherein the scalable video stream comprises a Scalable Video Coding (SVC) elementary stream.
25. The file server of claim 16 wherein the modified file format comprises a modified Scalable Video Coding (SVC) file format.
26. The file server of claim 16 wherein the scalable data stream comprises a single encoded track.
27. The file server of claim 16 wherein the scalable data stream comprises a plurality of sub-layer data streams.
28. A system configured to utilize a modified file format, the system comprising:
a. an end user device to transmit one or more device requirements; and
b. a file server configured to receive the one or more device requirements and to utilize the modified file system, the file server comprising:
i. a memory configured to store and extract data according to the modified file format, wherein the modified file format comprises:
A. a media data section to store a scalable data stream; and
B. a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream according to the one or more device requirements; and
ii. a processing module configured to provide control instruction to the memory and to extract the sub-layer data stream from the media data section.
29. The system of claim 28 wherein the file server further comprises a network interface module configured to receive the one or more device requirements from the end user device and to transmit the defined sub-layer data stream.
30. The system of claim 28 wherein the scalable data stream comprises a series of access units.
31. The system of claim 30 wherein the one or more metadata boxes are configured to define one of a plurality of description entries according to the one or more device requirements.
32. The system of claim 31 wherein the one or more metadata boxes are further configured to define a sub-set of access units according to the selected description entry, wherein the defined sub-set of access units comprises the sub-layer data stream.
33. The system of claim 32 wherein the one or more metadata boxes comprise a SVC Sample Group Description Box configured to define the one description entry.
34. The system of claim 33 wherein the one or more metadata boxes comprise a SVC Sample To Group Box to define and group the sub-set of access units.
35. The system of claim 28 wherein the scalable data stream comprises a scalable video stream.
36. The system of claim 35 wherein the scalable video stream comprises a Scalable Video Coding (SVC) elementary stream.
37. The system of claim 28 wherein the modified file format comprises a modified Scalable Video Coding (SVC) file format.
38. The system of claim 28 wherein the one or more metadata boxes comprise extensions to the Scaling Video Coding (SVC) standards.
39. The system of claim 28 wherein the one or more metadata boxes comprise extensions to the Scaling Video Coding (SVC) standards.
40. The system of claim 28 wherein the scalable data stream comprises a single encoded track.
41. The system of claim 28 wherein the scalable data stream comprises a plurality of sub-layer data streams.
42. The system of claim 28 wherein the one or more metadata boxes are configured to define a hint track associated with the sub-layer data stream.
43. The system of claim 42 wherein the sub-layer data stream comprises an Advanced Video Coding (AVC) compatible base layer stream.
44-62. (canceled)
63. A modified file format comprising a modified Scalable Video Coding (SVC) file format stored in a memory, the modified file format comprising:
a. a media data section to store a scalable data stream, wherein the scalable data stream comprises a series of access units and a plurality of sub-layer data streams; and
b. a metadata section including at least one track associated with the scalable data stream stored in the media data section, wherein each track comprises one or more metadata boxes to define and group a sub-layer data stream of the scalable data stream, wherein the one or more metadata boxes define the sub-layer data stream according to one or more device requirements received from an end user device capable of processing the sub-layer data stream and further wherein the one or more metadata boxes define one of a plurality of description entries according to the one or more device requirements, wherein the one or more metadata boxes define a sub-set of access units according to the selected description entry, the defined sub-set of access units comprising the sub-layer data stream, the one or more metadata boxes comprising a SVC Sample Group Description Box defining the one description entry and the one or more metadata boxes define a hint track associated with the sub-layer data stream, the sub-layer data stream comprising an Advanced Video Coding (AVC) compatible base layer stream.
US12/721,383 2005-07-15 2010-03-10 Scalable video coding (svc) file format Abandoned US20100161692A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/721,383 US20100161692A1 (en) 2005-07-15 2010-03-10 Scalable video coding (svc) file format

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US69953505P 2005-07-15 2005-07-15
US11/398,195 US7725593B2 (en) 2005-07-15 2006-04-04 Scalable video coding (SVC) file format
US12/721,383 US20100161692A1 (en) 2005-07-15 2010-03-10 Scalable video coding (svc) file format

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/398,195 Division US7725593B2 (en) 2005-07-15 2006-04-04 Scalable video coding (SVC) file format

Publications (1)

Publication Number Publication Date
US20100161692A1 true US20100161692A1 (en) 2010-06-24

Family

ID=37662860

Family Applications (3)

Application Number Title Priority Date Filing Date
US11/398,195 Expired - Fee Related US7725593B2 (en) 2005-07-15 2006-04-04 Scalable video coding (SVC) file format
US12/721,383 Abandoned US20100161692A1 (en) 2005-07-15 2010-03-10 Scalable video coding (svc) file format
US12/755,674 Expired - Fee Related US8291104B2 (en) 2005-07-15 2010-04-07 Scalable video coding (SVC) file format

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/398,195 Expired - Fee Related US7725593B2 (en) 2005-07-15 2006-04-04 Scalable video coding (SVC) file format

Family Applications After (1)

Application Number Title Priority Date Filing Date
US12/755,674 Expired - Fee Related US8291104B2 (en) 2005-07-15 2010-04-07 Scalable video coding (SVC) file format

Country Status (6)

Country Link
US (3) US7725593B2 (en)
EP (1) EP1920322A4 (en)
JP (1) JP2009502055A (en)
CN (1) CN101595475B (en)
TW (1) TW200721844A (en)
WO (1) WO2007011836A2 (en)

Families Citing this family (100)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6307487B1 (en) * 1998-09-23 2001-10-23 Digital Fountain, Inc. Information additive code generator and decoder for communication systems
US7068729B2 (en) * 2001-12-21 2006-06-27 Digital Fountain, Inc. Multi-stage code generator and decoder for communication systems
US9240810B2 (en) * 2002-06-11 2016-01-19 Digital Fountain, Inc. Systems and processes for decoding chain reaction codes through inactivation
EP2348640B1 (en) 2002-10-05 2020-07-15 QUALCOMM Incorporated Systematic encoding of chain reaction codes
US7370212B2 (en) * 2003-02-25 2008-05-06 Microsoft Corporation Issuing a publisher use license off-line in a digital rights management (DRM) system
US7483532B2 (en) * 2003-07-03 2009-01-27 Microsoft Corporation RTP payload format
US7139960B2 (en) * 2003-10-06 2006-11-21 Digital Fountain, Inc. Error-correcting multi-stage code generator and decoder for communication systems having single transmitters or multiple transmitters
WO2005112250A2 (en) 2004-05-07 2005-11-24 Digital Fountain, Inc. File download and streaming system
US7721184B2 (en) * 2004-08-11 2010-05-18 Digital Fountain, Inc. Method and apparatus for fast encoding of data symbols according to half-weight codes
WO2006061838A2 (en) * 2004-12-08 2006-06-15 Imagine Communications Ltd. Distributed statistical multiplexing of multi-media
US8438645B2 (en) 2005-04-27 2013-05-07 Microsoft Corporation Secure clock with grace periods
US8725646B2 (en) * 2005-04-15 2014-05-13 Microsoft Corporation Output protection levels
US20060265758A1 (en) * 2005-05-20 2006-11-23 Microsoft Corporation Extensible media rights
JP2008543142A (en) * 2005-05-24 2008-11-27 ノキア コーポレイション Method and apparatus for hierarchical transmission and reception in digital broadcasting
US7684566B2 (en) * 2005-05-27 2010-03-23 Microsoft Corporation Encryption scheme for streamed multimedia content protected by rights management system
US7769880B2 (en) * 2005-07-07 2010-08-03 Microsoft Corporation Carrying protected content using a control protocol for streaming and a transport protocol
US7561696B2 (en) * 2005-07-12 2009-07-14 Microsoft Corporation Delivering policy updates for protected content
US20070022215A1 (en) * 2005-07-19 2007-01-25 Singer David W Method and apparatus for media data transmission
DE102005033981A1 (en) * 2005-07-20 2007-02-01 Siemens Ag Method for storing individual data elements of a scalable data stream in a file and associated device
US7634816B2 (en) 2005-08-11 2009-12-15 Microsoft Corporation Revocation information management
US8321690B2 (en) * 2005-08-11 2012-11-27 Microsoft Corporation Protecting digital media of various content types
US7720096B2 (en) * 2005-10-13 2010-05-18 Microsoft Corporation RTP payload format for VC-1
KR100772868B1 (en) * 2005-11-29 2007-11-02 삼성전자주식회사 Scalable video coding based on multiple layers and apparatus thereof
KR20070108434A (en) * 2006-01-09 2007-11-12 한국전자통신연구원 Proposals for improving data sharing in the svc(scalable video coding) file format
KR20070108433A (en) * 2006-01-09 2007-11-12 한국전자통신연구원 Share of video data by using chunk descriptors in svc file format
JP4874343B2 (en) * 2006-01-11 2012-02-15 ノキア コーポレイション Aggregation of backward-compatible pictures in scalable video coding
US9136983B2 (en) * 2006-02-13 2015-09-15 Digital Fountain, Inc. Streaming and buffering using variable FEC overhead and protection periods
US9270414B2 (en) * 2006-02-21 2016-02-23 Digital Fountain, Inc. Multiple-field based code generator and decoder for communications systems
WO2007134196A2 (en) 2006-05-10 2007-11-22 Digital Fountain, Inc. Code generator and decoder using hybrid codes
US9419749B2 (en) 2009-08-19 2016-08-16 Qualcomm Incorporated Methods and apparatus employing FEC codes with permanent inactivation of symbols for encoding and decoding processes
US9209934B2 (en) 2006-06-09 2015-12-08 Qualcomm Incorporated Enhanced block-request streaming using cooperative parallel HTTP and forward error correction
US9380096B2 (en) 2006-06-09 2016-06-28 Qualcomm Incorporated Enhanced block-request streaming system for handling low-latency streaming
US9178535B2 (en) 2006-06-09 2015-11-03 Digital Fountain, Inc. Dynamic stream interleaving and sub-stream based delivery
US9432433B2 (en) 2006-06-09 2016-08-30 Qualcomm Incorporated Enhanced block-request streaming system using signaling or block creation
US9386064B2 (en) * 2006-06-09 2016-07-05 Qualcomm Incorporated Enhanced block-request streaming using URL templates and construction rules
US8699583B2 (en) * 2006-07-11 2014-04-15 Nokia Corporation Scalable video coding and decoding
KR101046749B1 (en) * 2006-10-19 2011-07-06 엘지전자 주식회사 Encoding method and apparatus and decoding method and apparatus
KR100776680B1 (en) * 2006-11-09 2007-11-19 한국전자통신연구원 Method for packet type classification to svc coded video bitstream, and rtp packetization apparatus and method
EP3041195A1 (en) * 2007-01-12 2016-07-06 University-Industry Cooperation Group Of Kyung Hee University Packet format of network abstraction layer unit, and algorithm and apparatus for video encoding and decoding using the format
EP2191402A4 (en) * 2007-08-20 2014-05-21 Nokia Corp Segmented metadata and indexes for streamed multimedia data
CN101802797B (en) 2007-09-12 2013-07-17 数字方敦股份有限公司 Generating and communicating source identification information to enable reliable communications
WO2009093647A1 (en) 2008-01-24 2009-07-30 Nec Corporation Dynamic image stream processing method and device, and dynamic image reproduction device and dynamic image distribution device using the same
US8681856B2 (en) * 2008-04-24 2014-03-25 Sk Planet Co., Ltd. Scalable video providing and reproducing system and methods thereof
JP5462259B2 (en) * 2008-07-16 2014-04-02 シズベル インターナショナル エス.アー. Method and apparatus for track and track subset grouping
KR101547557B1 (en) * 2008-11-14 2015-08-26 삼성전자주식회사 Method and apparatus to select content playing device
KR20100071688A (en) * 2008-12-19 2010-06-29 한국전자통신연구원 A streaming service system and method for universal video access based on scalable video coding
US9281847B2 (en) 2009-02-27 2016-03-08 Qualcomm Incorporated Mobile reception of digital video broadcasting—terrestrial services
US20100250763A1 (en) * 2009-03-31 2010-09-30 Nokia Corporation Method and Apparatus for Transmitting Information on Operation Points
US20100250764A1 (en) * 2009-03-31 2010-09-30 Nokia Corporation Method and Apparatus for Signaling Layer Information of Scalable Media Data
JP5542912B2 (en) * 2009-04-09 2014-07-09 テレフオンアクチーボラゲット エル エム エリクソン(パブル) Media container file management
CN102165776B (en) * 2009-07-06 2012-11-21 华为技术有限公司 Transmission method, receiving method and device for scalable video coding files
US8566393B2 (en) * 2009-08-10 2013-10-22 Seawell Networks Inc. Methods and systems for scalable video chunking
US9288010B2 (en) 2009-08-19 2016-03-15 Qualcomm Incorporated Universal file delivery methods for providing unequal error protection and bundled file delivery services
US9917874B2 (en) * 2009-09-22 2018-03-13 Qualcomm Incorporated Enhanced block-request streaming using block partitioning or request controls for improved client-side handling
US20110096828A1 (en) * 2009-09-22 2011-04-28 Qualcomm Incorporated Enhanced block-request streaming using scalable encoding
TWI419568B (en) * 2010-05-27 2013-12-11 Univ Nat Sun Yat Sen Three dimensional image dividing method
US20130091154A1 (en) * 2010-06-14 2013-04-11 Thomson Licensing Method And Apparatus For Encapsulating Coded Multi-Component Video
JP2013534101A (en) * 2010-06-14 2013-08-29 トムソン ライセンシング Method and apparatus for encapsulating encoded multi-component video
EP2580920A1 (en) * 2010-06-14 2013-04-17 Thomson Licensing Method and apparatus for encapsulating coded multi-component video
US9049497B2 (en) 2010-06-29 2015-06-02 Qualcomm Incorporated Signaling random access points for streaming video data
US8918533B2 (en) 2010-07-13 2014-12-23 Qualcomm Incorporated Video switching for streaming video data
US9185439B2 (en) 2010-07-15 2015-11-10 Qualcomm Incorporated Signaling data for multiplexing video components
US9131033B2 (en) 2010-07-20 2015-09-08 Qualcomm Incoporated Providing sequence data sets for streaming video data
US9596447B2 (en) 2010-07-21 2017-03-14 Qualcomm Incorporated Providing frame packing type information for video coding
US8799405B2 (en) * 2010-08-02 2014-08-05 Ncomputing, Inc. System and method for efficiently streaming digital video
US9456015B2 (en) 2010-08-10 2016-09-27 Qualcomm Incorporated Representation groups for network streaming of coded multimedia data
CN102547273B (en) * 2010-12-08 2014-05-07 中国科学院声学研究所 Multi-media file structure supporting scalable video coding based on MKV
US9270299B2 (en) 2011-02-11 2016-02-23 Qualcomm Incorporated Encoding and decoding using elastic codes with flexible source block mapping
US8958375B2 (en) 2011-02-11 2015-02-17 Qualcomm Incorporated Framing for an improved radio link protocol including FEC
US9253233B2 (en) 2011-08-31 2016-02-02 Qualcomm Incorporated Switch signaling methods providing improved switching between representations for adaptive HTTP streaming
US9843844B2 (en) 2011-10-05 2017-12-12 Qualcomm Incorporated Network streaming of media data
KR102047492B1 (en) * 2012-03-12 2019-11-22 삼성전자주식회사 Method and apparatus for scalable video encoding, method and apparatus for scalable video decoding
KR102115323B1 (en) 2012-03-16 2020-05-26 엘지전자 주식회사 Method for storing image information, method for parsing image information and apparatus using same
US9294226B2 (en) 2012-03-26 2016-03-22 Qualcomm Incorporated Universal object delivery and template-based file delivery
US9351016B2 (en) 2012-04-13 2016-05-24 Sharp Kabushiki Kaisha Devices for identifying a leading picture
US9161004B2 (en) * 2012-04-25 2015-10-13 Qualcomm Incorporated Identifying parameter sets in video files
US10097841B2 (en) * 2012-05-04 2018-10-09 Lg Electronics Inc. Method for storing image data, method for parsing image data, and an apparatus for using the same
US20140092953A1 (en) 2012-10-02 2014-04-03 Sharp Laboratories Of America, Inc. Method for signaling a step-wise temporal sub-layer access sample
US20140098868A1 (en) * 2012-10-04 2014-04-10 Qualcomm Incorporated File format for video data
KR20150092120A (en) * 2012-11-30 2015-08-12 소니 주식회사 Image processing device and method
US9357199B2 (en) * 2013-01-04 2016-05-31 Qualcomm Incorporated Separate track storage of texture and depth views for multiview coding plus depth
CN109587573B (en) * 2013-01-18 2022-03-18 佳能株式会社 Generation apparatus and method, display apparatus and method, and storage medium
EP2962479B1 (en) 2013-02-28 2016-11-30 Robert Bosch GmbH Mobile electronic device integration with in-vehicle information systems
KR20140123914A (en) * 2013-04-12 2014-10-23 삼성전자주식회사 Method and apparatus for multi-layer video encoding for random access, method and apparatus for multi-layer video decoding for random access
US10356459B2 (en) * 2013-07-22 2019-07-16 Sony Corporation Information processing apparatus and method
US9756363B2 (en) * 2013-08-20 2017-09-05 Lg Electronics Inc. Apparatus for transmitting media data via streaming service, apparatus for receiving media data via streaming service, method for transmitting media data via streaming service and method for receiving media data via streaming service
JP5774652B2 (en) 2013-08-27 2015-09-09 ソニー株式会社 Transmitting apparatus, transmitting method, receiving apparatus, and receiving method
US9621919B2 (en) * 2013-10-23 2017-04-11 Qualcomm Incorporated Multi-layer video file format designs
CN110636238B (en) * 2014-06-30 2022-01-28 索尼公司 Information processing apparatus and method
EP3216219A1 (en) * 2014-11-05 2017-09-13 Colin, Jean-Claude Method for producing animated images
US9928297B2 (en) * 2015-02-11 2018-03-27 Qualcomm Incorporated Sample grouping signaling in file formats
US10349067B2 (en) * 2016-02-17 2019-07-09 Qualcomm Incorporated Handling of end of bitstream NAL units in L-HEVC file format and improvements to HEVC and L-HEVC tile tracks
CN105635188B (en) * 2016-03-31 2019-07-09 深圳市矽伟智科技有限公司 A kind of visual content distribution method and system
US10187443B2 (en) * 2017-06-12 2019-01-22 C-Hear, Inc. System and method for encoding image data and other data types into one data format and decoding of same
US11588872B2 (en) 2017-06-12 2023-02-21 C-Hear, Inc. System and method for codec for combining disparate content
KR102495915B1 (en) 2018-04-30 2023-02-03 삼성전자 주식회사 Storage device and server including the storage device
EP3857876A1 (en) 2018-09-25 2021-08-04 Telefonaktiebolaget LM Ericsson (publ) Media bistream having backwards compatibility
JP6648811B2 (en) * 2018-12-13 2020-02-14 ソニー株式会社 Transmitting device, transmitting method, receiving device and receiving method
GB2585052B (en) * 2019-06-26 2023-07-26 Canon Kk Method and apparatus for encapsulating panorama images in a file
JP6773205B2 (en) * 2019-12-19 2020-10-21 ソニー株式会社 Transmitter, transmitter, receiver and receiver

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030012279A1 (en) * 1997-03-17 2003-01-16 Navin Chaddha Multimedia compression system with additive temporal layers
US6757450B1 (en) * 2000-03-30 2004-06-29 Microsoft Corporation Negotiated image data push processing
US20050084166A1 (en) * 2002-06-25 2005-04-21 Ran Boneh Image processing using probabilistic local behavior assumptions
US6937273B1 (en) * 1997-05-28 2005-08-30 Eastman Kodak Company Integrated motion-still capture system with indexing capability
US6937723B2 (en) * 2002-10-25 2005-08-30 Avaya Technology Corp. Echo detection and monitoring
US20050235047A1 (en) * 2004-04-16 2005-10-20 Qiang Li Method and apparatus for a large scale distributed multimedia streaming system and its media content distribution
US20050254575A1 (en) * 2004-05-12 2005-11-17 Nokia Corporation Multiple interoperability points for scalable media coding and transmission
US20050275752A1 (en) * 2002-10-15 2005-12-15 Koninklijke Philips Electronics N.V. System and method for transmitting scalable coded video over an ip network
US7043059B2 (en) * 2001-02-10 2006-05-09 Hewlett-Packard Development Company, L.P. Method of selectively storing digital images
US7047241B1 (en) * 1995-10-13 2006-05-16 Digimarc Corporation System and methods for managing digital creative works
US20060120450A1 (en) * 2004-12-03 2006-06-08 Samsung Electronics Co., Ltd. Method and apparatus for multi-layered video encoding and decoding
US20060233247A1 (en) * 2005-04-13 2006-10-19 Visharam Mohammed Z Storing SVC streams in the AVC file format
US20060268991A1 (en) * 2005-04-11 2006-11-30 Segall Christopher A Method and apparatus for adaptive up-scaling for spatially scalable coding
US7284041B2 (en) * 2003-03-13 2007-10-16 Hitachi, Ltd. Method for accessing distributed file system
US20080082482A1 (en) * 2005-01-11 2008-04-03 Peter Amon Method and Device for Processing Scalable Data
US7383288B2 (en) * 2001-01-11 2008-06-03 Attune Systems, Inc. Metadata based file switch and switched file system
US20100146082A1 (en) * 2008-12-10 2010-06-10 Hitachi, Ltd. Data distribution communication apparatus and data distribution system
US7965722B2 (en) * 2002-09-17 2011-06-21 Futch Richard J Communication of active data flows between a transport modem termination system and cable transport modems

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001333389A (en) * 2000-05-17 2001-11-30 Mitsubishi Electric Research Laboratories Inc Video reproduction system and method for processing video signal
US20040167925A1 (en) * 2003-02-21 2004-08-26 Visharam Mohammed Zubair Method and apparatus for supporting advanced coding formats in media files
WO2003073770A1 (en) * 2002-02-25 2003-09-04 Sony Electronics, Inc. Method and apparatus for supporting avc in mp4

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110150217A1 (en) * 2009-12-21 2011-06-23 Samsung Electronics Co., Ltd. Method and apparatus for providing video content, and method and apparatus reproducing video content
US20130097334A1 (en) * 2010-06-14 2013-04-18 Thomson Licensing Method and apparatus for encapsulating coded multi-component video
US9118939B2 (en) 2010-12-20 2015-08-25 Arris Technology, Inc. SVC-to-AVC rewriter with open-loop statistical multiplexer
US9674561B2 (en) 2010-12-20 2017-06-06 Arris Enterprises, Inc. SVC-to-AVC rewriter with open-loop statistal multplexer
US9426462B2 (en) 2012-09-21 2016-08-23 Qualcomm Incorporated Indication and activation of parameter sets for video coding
US9554146B2 (en) 2012-09-21 2017-01-24 Qualcomm Incorporated Indication and activation of parameter sets for video coding
US10764615B2 (en) 2016-02-09 2020-09-01 Sony Corporation Transmission device, transmission method, reception device and reception method
US11223859B2 (en) 2016-02-09 2022-01-11 Sony Corporation Transmission device, transmission method, reception device and reception method
US11792452B2 (en) 2016-02-09 2023-10-17 Sony Group Corporation Transmission device, transmission method, reception device and reception method

Also Published As

Publication number Publication date
TW200721844A (en) 2007-06-01
US7725593B2 (en) 2010-05-25
US8291104B2 (en) 2012-10-16
WO2007011836A3 (en) 2009-04-30
EP1920322A2 (en) 2008-05-14
US20070016594A1 (en) 2007-01-18
WO2007011836A2 (en) 2007-01-25
CN101595475B (en) 2012-12-12
CN101595475A (en) 2009-12-02
JP2009502055A (en) 2009-01-22
US20100198887A1 (en) 2010-08-05
EP1920322A4 (en) 2015-07-01

Similar Documents

Publication Publication Date Title
US8291104B2 (en) Scalable video coding (SVC) file format
US20220038793A1 (en) Method, device, and computer program for encapsulating partitioned timed media data
US10110654B2 (en) Client, a content creator entity and methods thereof for media streaming
US10645428B2 (en) Method, device, and computer program for encapsulating partitioned timed media data using a generic signaling for coding dependencies
US10212491B2 (en) Method, device, and computer program for encapsulating partitioned timed media data using sub-track feature
AU2006269848B2 (en) Method and apparatus for media data transmission
US8635356B2 (en) Method for supporting scalable progressive downloading of video signal
US20060233247A1 (en) Storing SVC streams in the AVC file format
US20050193138A1 (en) Storage medium storing multimedia data, and method and apparatus for reproducing the multimedia data
CN103210642B (en) Occur during expression switching, to transmit the method for the scalable HTTP streams for reproducing naturally during HTTP streamings
GB2593897A (en) Method, device, and computer program for improving random picture access in video streaming
US20230370659A1 (en) Method, device, and computer program for optimizing indexing of portions of encapsulated media content data
US11575951B2 (en) Method, device, and computer program for signaling available portions of encapsulated media content
CN103430558A (en) A method for optimizing a video stream
JP2022546894A (en) Method, apparatus, and computer program for encapsulating media data into media files

Legal Events

Date Code Title Description
AS Assignment

Owner name: SONY CORPORATION,JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VISHARAM, MOHAMMED ZUBAIR;TABATABAI, ALI;REEL/FRAME:024061/0099

Effective date: 20060404

Owner name: SONY ELECTRONICS INC.,NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VISHARAM, MOHAMMED ZUBAIR;TABATABAI, ALI;REEL/FRAME:024061/0099

Effective date: 20060404

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION