CA2717584C - Method and apparatus for processing an audio signal - Google Patents

Method and apparatus for processing an audio signal

Info

Publication number
CA2717584C
Authority
CA
Canada
Prior art keywords
signal
audio signal
coding type
coding
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CA2717584A
Other languages
French (fr)
Other versions
CA2717584A1 (en
Inventor
Hyun Kook Lee
Sung Yong Yoon
Dong Soo Kim
Jae Hyun Lim
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LG Electronics Inc
Original Assignee
LG Electronics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by LG Electronics Inc
Publication of CA2717584A1
Application granted
Publication of CA2717584C
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00007 Time or data compression or expansion
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B20/00 Signal processing not specific to the method of recording or reproducing; Circuits therefor
    • G11B20/00007 Time or data compression or expansion
    • G11B2020/00014 Time or data compression or expansion the compressed signal being an audio signal

Abstract

An apparatus for processing an encoded signal and method thereof are disclosed, by which an audio signal can be compressed and reconstructed in higher efficiency. An audio signal processing method includes the steps of identifying whether a coding type of the audio signal is a music signal coding type using first type information, if the coding type of the audio signal is not the music signal coding type, identifying whether the coding type of the audio signal is a speech signal coding type or a mixed signal coding type using second type information, if the coding type of the audio signal is the mixed signal coding type, extracting spectral data and a linear predictive coefficient from the audio signal, generating a residual signal for linear prediction by performing inverse frequency conversion on the spectral data, reconstructing the audio signal by performing linear prediction coding on the linear predictive coefficient and the residual signal, and reconstructing a high frequency region signal using an extension base signal corresponding to a partial region of the reconstructed audio signal and band extension information. Accordingly, various kinds of audio signals can be encoded/decoded in higher efficiency.

Description

Method and Apparatus for Processing an Audio Signal
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention relates to an audio signal processing apparatus for encoding and decoding various kinds of audio signals effectively and method thereof.
Discussion of the Related Art
Generally, audio coding technologies are classified into two types: perceptual audio coders and linear prediction based coders. For instance, the perceptual audio coder optimized for music adopts a scheme of reducing an information size in a coding process using the masking principle, which belongs to human psychoacoustic theory, on a frequency axis. On the contrary, the linear prediction based coder optimized for speech adopts a scheme of reducing an information size by modeling speech vocalization on a time axis.
However, each of the above-described technologies performs well on the audio signal for which it is optimized (e.g., a speech signal or a music signal) but fails to provide consistent performance on an audio signal in which different types of audio signals, or speech and music signals, are mixed together in a complex manner.
SUMMARY OF THE INVENTION
Accordingly, the present invention is directed to an apparatus for processing an audio signal and method thereof that, in some embodiments, may substantially obviate one or more of the problems due to limitations and disadvantages of the related art.
Some embodiments may provide an apparatus for processing an audio signal and method thereof, by which different types of audio signals can be compressed and/or reconstructed in higher efficiency.
Some embodiments may provide an audio coding scheme suitable for characteristics of an audio signal.
Additional features and advantages of some embodiments of the invention will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the invention. The objectives and other advantages of some embodiments of the invention will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
According to an aspect of the present invention, there is provided a method of processing an audio signal, comprising the steps of identifying whether a coding type of the audio signal is a music signal coding type using first type information, if the coding type of the audio signal is not the music signal coding type, identifying whether the coding type of the audio signal is a speech signal coding type or a mixed signal coding type using second type information, if the coding type of the audio signal is the mixed signal coding type, extracting spectral data and a linear predictive coefficient from the audio signal, generating a residual signal for linear prediction by performing inverse frequency conversion on the spectral data, reconstructing the audio signal by performing linear prediction coding on the linear predictive coefficient and the residual signal, and reconstructing a high frequency region signal using an extension base signal corresponding to a partial region of the reconstructed audio signal and band extension information.
According to another aspect of the present invention, there is provided an apparatus for processing an audio signal, comprising: a demultiplexer configured to extract first type information and second type information from a bitstream; a decoder determining unit configured to identify whether a coding type of the audio signal is a music signal coding type using first type information, the decoder, if the coding type of the audio signal is not the music signal coding type, identifying whether the coding type of the audio signal is a speech signal coding type or a mixed signal coding type using second type information, the decoder then determining a decoding scheme; an information extracting unit configured to extract spectral data and a linear predictive coefficient from the audio signal if the coding type of the audio signal is the mixed signal coding type; a frequency transforming unit configured to generate a residual signal for linear prediction by performing inverse frequency conversion on the spectral data; a linear prediction unit configured to reconstruct the audio signal by performing linear prediction coding on the linear predictive coefficient and the residual signal; and a bandwidth extension decoding unit configured to reconstruct a high frequency region signal using an extension base signal corresponding to a partial region of the reconstructed audio signal and band extension information.
In some embodiments, preferably, the audio signal includes a plurality of subframes, and the second type information exists by a unit of the subframe.
In some embodiments, preferably, a bandwidth of the high frequency region signal is not equal to that of the extension base signal. Preferably, the band extension information includes at least one of a filter range applied to the reconstructed audio signal, a start frequency of the extension base signal and an end frequency of the extension base signal.
In some embodiments, preferably, if the coding type of the audio signal is the music signal coding type, the audio signal comprises a frequency-domain signal, wherein if the coding type of the audio signal is the speech signal coding type, the audio signal comprises a time-domain signal, and wherein if the coding type of the audio signal is the mixed signal coding type, the audio signal comprises an MDCT-domain signal.
In some embodiments, preferably, the linear predictive coefficient extracting includes extracting a linear predictive coefficient mode and extracting the linear predictive coefficient having a variable bit size corresponding to the extracted linear predictive coefficient mode.
According to another aspect of the present invention, there is provided in an audio signal processing apparatus including an audio coder for processing an audio signal, a method of processing the audio signal, comprising the steps of:
removing a high frequency band signal of the audio signal and generating band extension information for reconstructing the high frequency band signal; determining a coding type of the audio signal; if the audio signal is a music signal, generating first type information indicating that the audio signal is coded into a music signal coding type; if the audio signal is not the music signal, generating second type information indicating that the audio signal is coded into either a speech signal coding type or a mixed signal coding type; if the coding type of the audio signal is the mixed signal coding type, generating a linear predictive coefficient by performing linear prediction coding on the audio signal; generating a residual signal for the linear prediction coding; generating a spectral coefficient by frequency-transforming the residual signal; and generating an audio bitstream including the first type information, the second type information, the linear predictive coefficient and the residual signal.
According to another aspect of the present invention, there is provided an apparatus for processing an audio signal, comprising: a bandwidth preprocessing unit for removing a high frequency band signal of the audio signal, the bandwidth preprocessing unit being configured to generate band extension information for reconstructing the high frequency band signal;
a signal classifying unit configured to determine a coding type of the audio signal, the signal classifying unit, if the audio signal is a music signal, generating first type information indicating that the audio signal is coded into a music signal coding type, the signal classifying unit, if the audio signal is not the music signal, generating second type information indicating that the audio signal is coded into either a speech signal coding type or a mixed signal coding type; a linear prediction modeling unit configured to generate a linear predictive coefficient by performing linear prediction coding on the audio signal if the coding type of the audio signal is the mixed signal coding type; a residual signal extracting unit configured to generate a residual signal for the linear prediction coding; and a frequency transforming unit configured to generate a spectral coefficient by frequency-transforming the residual signal.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention.
In the drawings:
FIG. 1 is a block diagram of an audio coding apparatus according to one embodiment of the present invention;
FIG. 2 is a block diagram of an audio coding apparatus according to another embodiment of the present invention;
FIG. 3 is a detailed block diagram of a bandwidth preprocessing unit 150 according to an embodiment of the present invention;
FIG. 4 is a flowchart for a method of coding an audio signal using audio type information according to one embodiment of the present invention;

FIG. 5 is a diagram for an example of an audio bitstream structure coded according to an embodiment of the present invention;
FIG. 6 is a block diagram of an audio decoding apparatus according to one embodiment of the present invention;
FIG. 7 is a block diagram of an audio decoding apparatus according to another embodiment of the present invention;
FIG. 8 is a detailed block diagram of a bandwidth extending unit 250 according to an embodiment of the present invention;
FIG. 9 is a diagram for a configuration of a product implemented with an audio decoding apparatus according to an embodiment of the present invention;
FIG. 10 is a diagram for an example of relations between products implemented with an audio decoding apparatus according to an embodiment of the present invention; and
FIG. 11 is a flowchart for an audio decoding method according to one embodiment of the present invention.
DETAILED DESCRIPTION OF EMBODIMENTS
Reference will now be made in detail to the preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings.
In the present invention, terminologies can be construed as follows. First of all, 'coding' can be construed as either encoding or decoding, as the case may be. 'Information' is a term that includes values, parameters, coefficients, elements and the like.
Regarding the present invention, an 'audio signal' is conceptually distinguished from a video signal and indicates any signal that can be aurally identified in reproduction. Therefore, audio signals can be classified into a speech signal mainly relevant to human vocalization or a signal similar to the speech signal (hereinafter named 'speech signal'), a music signal mainly relevant to a mechanical noise and sound or a signal similar to the music signal (hereinafter named 'music signal'), and a 'mixed signal' generated from mixing the speech signal and the music signal together. The present invention intends to provide an apparatus for encoding/decoding the above three types of audio signals and method thereof in order to encode/decode the audio signals in a manner suitable for characteristics of the audio signals. Yet, the audio signals are classified in this way for the description of the present invention only, and it is apparent that the technical idea of the present invention is identically applicable to a case of classifying the audio signal according to a different method.

FIG. 1 is a block diagram of an audio coding apparatus according to one preferred embodiment of the present invention. In particular, FIG. 1 shows a process of classifying an inputted audio signal according to a preset reference and then coding the classified audio signal by selecting an audio coding scheme suitable for the corresponding audio signal.
Referring to FIG. 1, an audio coding apparatus according to one preferred embodiment of the present invention includes a signal classifying unit (sound activity detector) 100 classifying an inputted audio signal into a type of a speech signal, a music signal or a mixed signal of speech and music by analyzing a characteristic of the inputted audio signal, a linear prediction modeling unit 110 coding the speech signal of the signal type determined by the signal classifying unit 100, a psychoacoustic model unit 120 coding the music signal, and a mixed signal modeling unit 130 coding the mixed signal of speech and music. And, the audio coding apparatus can further include a switching unit 101 configured to select a coding scheme suitable for the audio signal classified by the signal classifying unit 100. The switching unit 101 is operated using audio signal coding type information (e.g., first type information and second type information, which will be explained in detail with reference to FIG. 2 and FIG. 3) generated by the signal classifying unit 100 as a control signal. Moreover, the mixed signal modeling unit 130 can include a linear prediction unit 131, a residual signal extracting unit 132 and a frequency transforming unit 133. In the following description, the respective elements shown in FIG. 1 are explained in detail.
First of all, the signal classifying unit 100 classifies a type of an inputted audio signal and then generates a control signal to select an audio coding scheme suitable for the classified type. For instance, the signal classifying unit 100 classifies whether an inputted audio signal is a music signal, a speech signal or a mixed signal of speech and music. Thus, the type of the inputted audio signal is classified to select an optimal coding scheme per audio signal type from audio coding schemes which will be explained later. Therefore, the signal classifying unit 100 performs a process of analyzing an inputted audio signal and then selecting an audio coding scheme optimal for the inputted audio signal. For instance, the signal classifying unit 100 generates audio coding type information 1000 by analyzing an inputted audio signal. The generated audio coding type information 1000 is utilized as a reference for selecting a coding scheme. The generated audio coding type information 1000 is included as a bitstream in a finally-coded audio signal and is then transferred to a decoding or receiving device. Besides, a decoding method and apparatus using the audio coding type information 1000 will be explained in detail with reference to FIGs. 6 to 8 and FIG. 11. Moreover, the audio coding type information 1000 generated by the signal classifying unit 100 can include first type information and second type information for example. This will be described with reference to FIG. 4 and FIG. 5.
The signal classifying unit 100 determines an audio signal type according to a characteristic of an inputted audio signal. For instance, if the inputted audio signal is a signal better for modeling with a specific coefficient and a residual signal, the signal classifying unit 100 determines the inputted audio signal as a speech signal. If the inputted audio signal is a signal poor for modeling with a specific coefficient and a residual signal, the signal classifying unit 100 determines the inputted audio signal as a music signal. If it is difficult to determine the inputted audio signal as a speech signal or a music signal, the signal classifying unit 100 determines the inputted audio signal as a mixed signal. Regarding a detailed determination reference, for example, when the signal is modeled with a specific coefficient and a residual signal, if an energy level ratio of the residual signal to the signal is smaller than a preset reference value, the signal can be determined as a signal good for modeling. Therefore, the signal can be determined as a speech signal. If the signal has high redundancy on a time axis, the signal can be determined as a signal good for modeling by linear prediction for predicting a current signal from a past signal. Therefore, the signal can be determined as a music signal.
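The energy-ratio reference described above can be illustrated with a minimal Python sketch. The function names, the two threshold values and the handling of silent frames are assumptions introduced for illustration only; the description states only that an energy level ratio of the residual signal to the signal is compared with a preset reference value.

```python
def residual_energy_ratio(signal, residual):
    """Ratio of residual energy to signal energy for one frame."""
    signal_energy = sum(x * x for x in signal)
    residual_energy = sum(r * r for r in residual)
    if signal_energy == 0.0:
        # Assumption: a silent frame is treated as perfectly predictable.
        return 0.0
    return residual_energy / signal_energy


def classify_frame(signal, residual, speech_max=0.1, music_min=0.5):
    """Return 'speech', 'music' or 'mixed' for one frame.

    speech_max and music_min are hypothetical reference values.
    """
    ratio = residual_energy_ratio(signal, residual)
    if ratio < speech_max:
        return "speech"  # small residual: signal is good for modeling
    if ratio > music_min:
        return "music"   # large residual: signal is poor for modeling
    return "mixed"       # neither decision is clear
```

A frame whose ratio falls between the two hypothetical thresholds is reported as a mixed signal, mirroring the case in which it is difficult to decide between speech and music.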
If a signal inputted according to this reference is determined as a speech signal, the input signal can be coded using a speech coder optimized for the speech signal.
According to the present embodiment, the linear prediction modeling unit 110 is used for a coding scheme suitable for a speech signal. The linear prediction modeling unit 110 is provided with various schemes. For instance, ACELP (algebraic code excited linear prediction) coding scheme, AMR (adaptive multi-rate) coding scheme or AMR-WB (adaptive multi-rate wideband) coding scheme is applicable to the linear prediction modeling unit 110.
The linear prediction modeling unit 110 is able to perform linear prediction coding on an inputted audio signal by frame unit. The linear prediction modeling unit 110 extracts a predictive coefficient per frame and then quantizes the extracted predictive coefficient. For instance, a scheme of extracting a predictive coefficient using 'Levinson-Durbin algorithm' is widely used in general.
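As an illustration of per-frame predictive coefficient extraction, the following is a minimal Python sketch of the Levinson-Durbin recursion applied to the autocorrelation of one frame. The function names and the sign convention (prediction error filter A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p) are assumptions for this sketch; windowing and quantization of the coefficients are omitted.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the given autocorrelation.

    Returns (a, err): a[0] == 1.0 and a[1..order] are the coefficients of
    the prediction error filter; err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g., silence): stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= (1.0 - k * k)
    return a, err
```

For example, the autocorrelation sequence [1.0, 0.5, 0.25] of a first-order process yields a[1] = -0.5 and a[2] = 0, i.e., each sample is predicted as half of the previous one.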
In particular, if an inputted audio signal is constructed with a plurality of frames or there exist a plurality of super frames, each of which has a unit of a plurality of frames, for example, it is able to determine whether to apply a linear prediction modeling scheme per frame. It is possible to apply a different linear prediction modeling scheme per unit frame existing within one super frame or per subframe of a unit frame. This can raise coding efficiency of an audio signal.
Meanwhile, if an inputted audio signal is classified into a music signal by the signal classifying unit 100, it is able to code an input signal using a music coder optimized for the music signal. The psychoacoustic modeling unit 120 is configured based on a perceptual audio coder.
Meanwhile, if an inputted audio signal is classified into a mixed signal, in which speech and music are mixed together, by the signal classifying unit 100, it is able to code an input signal using a coder optimized for the mixed signal. According to the present embodiment, the mixed signal modeling unit 130 is used for a coding scheme suitable for a mixed signal.

The mixed signal modeling unit 130 is able to perform coding by a mixed scheme resulting from mixing the aforesaid linear prediction modeling scheme and the psychoacoustic modeling scheme together. In particular, the mixed signal modeling unit 130 performs linear prediction coding on an input signal, obtains a residual signal amounting to a difference between a linear prediction result signal and an original signal, and then codes the residual signal by a frequency transform coding scheme.
For instance, FIG. 1 shows an example that the mixed signal modeling unit 130 includes the linear prediction unit 131, the residual signal extracting unit 132 and the frequency transforming unit 133.
The linear prediction unit 131 performs linear predictive analysis on an inputted signal and then extracts a linear predictive coefficient indicating a characteristic of the signal. The residual signal extracting unit 132 extracts a residual signal, from which a redundancy component is removed, from the inputted signal using the extracted linear predictive coefficient. Since the redundancy is removed from the residual signal, the corresponding residual signal can have a type of a white noise. The linear prediction unit 131 is able to perform linear prediction coding on an inputted audio signal by frame unit. The linear prediction unit 131 extracts a predictive coefficient per frame and then quantizes the extracted predictive coefficient. For instance, in particular, if an inputted audio signal is constructed with a plurality of frames or there exist a plurality of super frames, each of which has a unit of a plurality of frames, it is able to determine whether to apply a linear prediction modeling scheme per frame. It is possible to apply a different linear prediction modeling scheme per unit frame existing within one super frame or per subframe of a unit frame. This can raise coding efficiency of an audio signal.
The residual signal extracting unit 132 receives an input of a remaining signal coded by the linear prediction unit 131 and an input of an original audio signal having passed through the signal classifying unit 100 and then extracts a residual signal that is a difference signal between the two inputted signals.
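The relation between the input signal, the linear prediction result and the residual signal can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the coefficients follow the convention a[0] = 1 with prediction -(a[1]x[n-1] + ... + a[p]x[n-p]), and samples before the start of the frame are taken as zero.

```python
def lpc_residual(signal, a):
    """Residual = original sample minus its linear prediction."""
    order = len(a) - 1
    residual = []
    for n, sample in enumerate(signal):
        prediction = -sum(a[k] * signal[n - k]
                          for k in range(1, order + 1) if n - k >= 0)
        residual.append(sample - prediction)
    return residual


def lpc_synthesis(residual, a):
    """Inverse operation: rebuild the signal from residual and coefficients."""
    order = len(a) - 1
    signal = []
    for n, e in enumerate(residual):
        prediction = -sum(a[k] * signal[n - k]
                          for k in range(1, order + 1) if n - k >= 0)
        signal.append(e + prediction)
    return signal
```

With the signal [1.0, 0.5, 0.25, 0.125] and a = [1.0, -0.5], the residual carries energy only in its first sample, and synthesis with the same coefficients restores the original samples.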
The frequency transforming unit 133 calculates a masking threshold or a signal-to-mask ratio (SMR) by performing frequency domain transform on an inputted residual signal by MDCT or the like and then codes the residual signal. The frequency transforming unit 133 is able to code a signal of a residual audio tendency using TCX as well as the psychoacoustic modeling.
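The frequency domain transform of a residual block can be sketched with a direct-form MDCT in Python. This is a simplified illustration only: the function name is hypothetical, no analysis window is applied, and the masking threshold or SMR calculation and the subsequent quantization are not shown.

```python
import math


def mdct(block):
    """Direct-form MDCT: 2N input samples produce N spectral coefficients."""
    if len(block) % 2 != 0:
        raise ValueError("block length must be even")
    half = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / half
                                * (n + 0.5 + half / 2.0) * (k + 0.5))
            for n in range(len(block)))
        for k in range(half)
    ]
```

A practical coder would apply a window to overlapping blocks before the transform so that the inverse transform with overlap-add reconstructs the residual signal; that step is omitted here for brevity.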
As the linear prediction modeling unit 110 and the linear prediction unit 131 extract a linear predictive coefficient (LPC) reflecting an audio characteristic by performing linear prediction and analysis on an inputted audio signal, a scheme of using variable bits for transferring the LPC data can be considered.
For instance, an LPC data mode is determined by considering a coding scheme per frame. A linear predictive coefficient having a variable bit number can then be assigned per the determined LPC data mode. Through this, an overall audio bit number is reduced. Therefore, audio coding and decoding can be performed more efficiently.
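The idea of a mode-dependent bit number can be sketched as follows. The mode table, the two-bit mode field and the function names are purely hypothetical values chosen for illustration; the description does not specify the actual modes or bit counts.

```python
# Hypothetical table: LPC data mode -> bits used for the quantized LPC data.
LPC_MODE_BITS = {0: 16, 1: 24, 2: 32}
MODE_FIELD_BITS = 2  # assumption: the mode itself is sent in two bits


def pack_lpc(mode, quantized_index):
    """Return a bit string: mode field followed by a mode-dependent payload."""
    width = LPC_MODE_BITS[mode]
    if not 0 <= quantized_index < (1 << width):
        raise ValueError("index does not fit the bit width of this mode")
    return (format(mode, "0{}b".format(MODE_FIELD_BITS))
            + format(quantized_index, "0{}b".format(width)))


def unpack_lpc(bits):
    """Read the mode first, then the number of bits that mode prescribes."""
    mode = int(bits[:MODE_FIELD_BITS], 2)
    width = LPC_MODE_BITS[mode]
    index = int(bits[MODE_FIELD_BITS:MODE_FIELD_BITS + width], 2)
    return mode, index, MODE_FIELD_BITS + width
```

Because the decoder reads the mode before the payload, frames coded in a mode with fewer bits consume fewer bits of the bitstream.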
Meanwhile, as mentioned in the foregoing description, the signal classifying unit 100 generates coding type information of an audio signal by classifying the audio signal into one of two types of the coding type information, enables the coding type information to be included in a bitstream, and then transfers the bitstream to a decoding apparatus. In the following description, audio coding type information 1000 according to the present invention is explained in detail with reference to FIG. 4 and FIG. 5.
FIG. 4 is a flowchart for a method of coding an audio signal using audio type information according to one preferred embodiment of the present invention.
First of all, the present invention proposes a method of representing a type of an audio signal in a manner of using first type information and second type information for classification. For instance, if an inputted audio signal is determined as a music signal [S100], the signal classifying unit 100 controls the switching unit 101 to select a coding scheme (e.g., psychoacoustic modeling scheme shown in FIG. 2) suitable for the music signal and then enables coding to be performed according to the selected coding scheme [S110]. Thereafter, the corresponding control information is configured as first type information and is then transferred by being included in a coded audio bitstream. Therefore, the first type information plays a role as coding identification information indicating that a coding type of an audio signal is a music signal coding type. The first type information is utilized in decoding an audio signal according to a decoding method and apparatus.
Moreover, if the inputted signal is determined as a speech signal [S120], the signal classifying unit 100 controls the switching unit 101 to select a coding scheme (e.g., linear prediction modeling shown in FIG. 1) suitable for the speech signal and then enables coding to be performed according to the selected coding scheme [S130].
If the inputted signal is determined as a mixed signal [S120], the signal classifying unit 100 controls the switching unit 101 to select a coding scheme (e.g., mixed signal modeling shown in FIG. 2) suitable for the mixed signal and then enables coding to be performed according to the selected coding scheme [S140]. Subsequently, control information indicating either the speech signal coding type or the mixed signal coding type is configured into second type information. The second type information is then transferred by being included in a coded audio bitstream together with the first type information. Therefore, the second type information plays a role as coding identification information indicating that a coding type of an audio signal is either a speech signal coding type or a mixed signal coding type. The second type information is utilized together with the aforesaid first type information in decoding an audio signal according to a decoding method and apparatus.
Regarding the first type information and the second type information, there are two cases according to characteristics of inputted audio signals. Namely, either only the first type information needs to be transferred, or both the first type information and the second type information need to be transferred. For instance, if a type of an inputted audio signal is a music signal coding type, only the first type information is transferred by being included in a bitstream, and the second type information may not be included in the bitstream [(a) of FIG. 5]. Namely, the second type information is included in a bitstream only if an inputted audio signal coding type is a speech signal coding type or a mixed signal coding type. Therefore, unnecessary bits for representing a coding type of an audio signal can be avoided.
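The two-level signalling described above can be sketched in Python as follows. The bit values (1 for the music signal coding type in the first type information, 0/1 for speech/mixed in the second type information), the one-bit field widths and the function names are assumptions for illustration; the description specifies only which information is present in each case.

```python
MUSIC, SPEECH, MIXED = "music", "speech", "mixed"


def write_type_info(coding_type):
    """Return the type-information bits for one frame."""
    if coding_type == MUSIC:
        return [1]      # first type information only
    if coding_type == SPEECH:
        return [0, 0]   # first type information, then second type information
    if coding_type == MIXED:
        return [0, 1]
    raise ValueError("unknown coding type: {}".format(coding_type))


def read_type_info(bits):
    """Return (coding_type, number_of_bits_consumed)."""
    if bits[0] == 1:
        return MUSIC, 1  # second type information is absent
    return (MIXED if bits[1] == 1 else SPEECH), 2
```

In this sketch a music frame spends one bit on type information, while speech and mixed frames spend two, which matches the two cases distinguished above.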
Although the example of the present invention teaches that the first type information indicates a presence or non-presence of a music signal type, it is just exemplary. And, it is apparent that the first type information is usable as information indicating a speech signal coding type or a mixed signal coding type. Thus, by utilizing an audio coding type having probability of high occurrence frequency according to a coding environment to which the present invention is applied, it is able to reduce an overall bit number of a bitstream.
FIG. 5 is a diagram for an example of an audio bitstream structure coded according to the present invention.
Referring to (a) of FIG. 5, an inputted audio signal corresponds to a music signal. Only first type information 301 is included in a bitstream, and second type information is not included therein. Within the bitstream, audio data coded by a coding type corresponding to the first type information 301 is included (e.g., AAC bitstream 302).
Referring to (b) of FIG. 5, an inputted audio signal corresponds to a speech signal. Both first type information 311 and second type information 312 are included in a bitstream. Within the bitstream, audio data coded by a coding type corresponding to the second type information 312 is included (e.g., AMR bitstream 313).
Referring to (c) of FIG. 5, an inputted audio signal corresponds to a mixed signal. Both first type information 321 and second type information 322 are included in a bitstream. Within the bitstream, audio data coded by a coding type corresponding to the second type information 322 is included (e.g., TCX applied AAC bitstream 323).
Regarding this description, the information included in an audio bitstream coded by the present invention is exemplarily shown in (a) to (c) of FIG. 5. And, it is apparent that various applications are possible within the range of the present invention. For instance, in the present invention, examples of AMR and AAC are taken as examples of coding schemes by adding information for identifying the corresponding coding schemes. Further, various coding schemes are applicable and coding identification information for identifying the various coding schemes are variously available as well. Besides, the present invention shown in (a) to (c) of FIG. 5 is applicable to one super frame, unit frame and subframe.
Namely, the present invention is able to provide audio signal coding type information per preset frame unit.
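The three layouts in (a) to (c) of FIG. 5 can be sketched as follows. This is only a minimal illustration: the one-bit field widths, the bit values and the function names `write_type_info` and `read_type_info` are assumptions, since the description does not fix a concrete bitstream syntax.

```python
# Hypothetical 1-bit fields; the description does not specify widths or values.
MUSIC, SPEECH, MIXED = "music", "speech", "mixed"


def write_type_info(coding_type):
    """Return the type-information bits that precede the coded audio data."""
    if coding_type == MUSIC:
        return [0]        # first type information only, as in (a) of FIG. 5
    if coding_type == SPEECH:
        return [1, 0]     # first and second type information, as in (b)
    if coding_type == MIXED:
        return [1, 1]     # first and second type information, as in (c)
    raise ValueError(f"unknown coding type: {coding_type}")


def read_type_info(bits):
    """Return (coding_type, number_of_bits_consumed) for a frame header."""
    if bits[0] == 0:
        return MUSIC, 1   # second type information is absent
    return (SPEECH, 2) if bits[1] == 0 else (MIXED, 2)
```

Under these assumptions, a frame of the most frequent coding type carries a single type bit, which is the bit saving described above.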
In the following description, an audio signal coding method and apparatus, in which a coding processing process is included, according to another embodiment of the present invention are explained with reference to FIG. 2 and FIG. 3.
First of all, as a preprocessing process of an input signal using the linear prediction modeling unit 110, the psychoacoustic modeling unit 120 and the mixed signal modeling unit 130, a frequency bandwidth extending process and a channel number changing process can be performed.
For instance, as one embodiment of the frequency band extending process, a bandwidth preprocessing unit ('150' in FIG. 2) is able to generate a high frequency component using a low frequency component. As an example of the bandwidth preprocessing unit, it is able to use modified and enhanced versions of SBR (spectral band replication) and HBE (high band extension).

Moreover, the channel number changing process reduces a bit allocation size by coding channel information of an audio signal into side information. As one embodiment of the channel number changing process, it is able to use a downmix channel generating unit ('140' in FIG. 2). The downmix channel generating unit 140 is able to adopt a PS (parametric stereo) system. In this case, PS is a scheme for coding a stereo signal and downmixes a stereo signal into a mono signal. The downmix channel generating unit 140 generates a downmix signal and spatial information relevant to reconstruction of the downmixed signal.
According to one embodiment, if a 48 kHz stereo signal is transferred using SBR and PS (parametric stereo), a mono 24 kHz signal remains after the SBR/PS processing. This mono signal can be encoded by an encoder. The input signal of the encoder has a sampling frequency of 24 kHz because the high frequency component is coded by SBR and the signal is downsampled to half of the previous frequency. The input signal becomes a mono signal because the stereo audio is extracted as a parameter through the PS (parametric stereo) and is thereby changed into a sum of the mono signal and an additional audio.
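The arithmetic of this embodiment can be sketched as follows. The function name `core_coder_input` and its flags are hypothetical; the sketch only tracks the sampling frequency and channel count and performs no signal processing.

```python
def core_coder_input(sample_rate_hz, channels, use_sbr=True, use_ps=True):
    """Return (sample_rate_hz, channels) of the signal fed to the core encoder."""
    if use_sbr:
        # The high band is carried as SBR parameters, so the remaining
        # low band is downsampled to half of the previous frequency.
        sample_rate_hz //= 2
    if use_ps and channels == 2:
        # The stereo image is carried as PS parameters, leaving a mono downmix.
        channels = 1
    return sample_rate_hz, channels


print(core_coder_input(48000, 2))  # (24000, 1)
```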
FIG. 2 relates to a coding pre-processing process and shows a coding apparatus including the above-described downmix channel generating unit 140 and the above-described bandwidth preprocessing unit 150.
Operations of the linear prediction modeling unit 110, the psychoacoustic modeling unit 120, the mixed signal modeling unit 130 and the switching unit 101, which are described with reference to FIG. 1, are identically applied to operations of the corresponding elements shown in FIG. 2.
Moreover, the signal classifying unit 100 generates a control signal for controlling activation of the downmix channel generating unit 140 and the bandwidth preprocessing unit 150.
In other words, the signal classifying unit 100 further generates a control signal 100a for controlling whether the downmix channel generating unit 140 is activated and an operative range of the downmix channel generating unit 140, and a control signal 100b for controlling whether the bandwidth preprocessing unit 150 is activated and an operative range of the bandwidth preprocessing unit 150.
FIG. 3 is a detailed block diagram of a bandwidth preprocessing unit 150 according to an embodiment of the present invention.
Referring to FIG. 3, a bandwidth preprocessing unit 150 for band extension includes a high frequency region removing unit 151, an extension information generating unit 152 and a spatial information inserting unit 153. The high frequency region removing unit 151 receives a downmix signal and spatial information from the downmix channel generating unit 140. The high frequency region removing unit 151 generates a low frequency downmix signal, which results from removing a high frequency signal corresponding to a high frequency region from a frequency signal of the downmix signal, and reconstruction information including a start frequency and end frequency of an extension base signal (described later).
In this case, it is able to determine the reconstruction information based on a characteristic of an input signal. Generally, a start frequency of a high frequency signal is a frequency amounting to a half of a whole bandwidth. On the contrary, according to a characteristic of an input signal, the reconstruction information can determine a start frequency as a frequency above or below a half of a whole frequency band. For instance, if using a whole bandwidth signal of the downmix signal is more efficient than encoding the downmix signal by removing a high frequency region using a bandwidth extension technique, the reconstruction information is able to represent a start frequency as a frequency located at an end of a bandwidth. It is able to determine the reconstruction information using at least one of a signal size, a length of segment used for coding and a type of a source, by which the present invention is non-limited.
The extension information generating unit 152 generates extension information for determining an extension base signal, which will be used for decoding, using the downmix signal and the spatial information generated by the downmix channel generating unit 140. The extension base signal is a frequency signal of a downmix signal, which is used to reconstruct the high frequency signal of the downmix signal removed by the high frequency region removing unit 151 in decoding. And, the extension base signal may be a low frequency signal or a partial signal of the low frequency signal. For instance, it is able to divide a low frequency signal into a low frequency band region and a middle frequency band region again by performing band-pass filtering on the downmix signal. In doing so, it is able to generate extension information using the low frequency band region only. A boundary frequency for discriminating the low frequency band region and the middle frequency band region can be set to an arbitrary fixed value. Alternatively, the boundary frequency can be variably set per frame according to information for analyzing a ratio of speech and music for a mixed signal.

The extension information may match information on a downmix signal not removed by the high frequency region removing unit 151, by which the present invention is non-limited. And, the extension information may be the information on a partial signal of the downmix signal. If the extension information is the information on a partial signal of the downmix signal, it can include a start frequency and an end frequency of the extension base signal and can further include a range of a filter applied to the frequency signal of the downmix signal.
The spatial information inserting unit 153 generates new spatial information resulting from inserting the reconstruction information generated by the high frequency region removing unit 151 and the extension information generated by the extension information generating unit 152 into the spatial information generated by the downmix channel generating unit 140.
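The roles of the units 151, 152 and 153 can be sketched on a spectrum held as a list of frequency bins. This is a minimal sketch under assumptions: the dictionary keys, the bin-index representation of frequencies and the function name `preprocess_bandwidth` are hypothetical, and the description does not prescribe this data layout.

```python
def preprocess_bandwidth(spectrum, spatial_info, start_bin=None, base_range=None):
    """Remove the high band and attach side information for the decoder.

    spectrum     : frequency bins of the downmix signal
    spatial_info : dict produced by the downmix stage (copied, not modified)
    start_bin    : first removed bin; defaults to half of the whole band
    base_range   : (start, end) bins of the extension base signal
    """
    n = len(spectrum)
    if start_bin is None:
        start_bin = n // 2                 # usual case: half of the whole bandwidth
    low_band = spectrum[:start_bin]        # role of removing unit 151

    if base_range is None:
        base_range = (0, start_bin)        # whole low band as extension base signal
    base_start, base_end = base_range
    if not 0 <= base_start < base_end <= start_bin:
        raise ValueError("extension base signal must lie inside the kept band")

    # Role of inserting unit 153: merge both kinds of side information
    # into a new copy of the spatial information.
    new_spatial_info = dict(spatial_info)
    new_spatial_info["reconstruction"] = {"hf_start_bin": start_bin, "hf_end_bin": n}
    new_spatial_info["extension"] = {"base_start_bin": base_start,
                                     "base_end_bin": base_end}
    return low_band, new_spatial_info
```

Setting `start_bin` to `len(spectrum)` corresponds to the case in which the start frequency is located at the end of the bandwidth, so that no high frequency region is removed.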
FIG. 6 is a block diagram of an audio decoding apparatus according to one embodiment of the present invention.
Referring to FIG. 6, a decoding apparatus is able to reconstruct a signal from an inputted bitstream by performing a process reverse to the coding process performed by the coding apparatus described with reference to FIG. 1. In particular, the decoding apparatus can include a demultiplexer 210, a decoder determining unit 220, a decoding unit 230 and a synthesizing unit 240. The decoding unit 230 can include a plurality of decoding units 231, 232 and 233 to perform decoding by different schemes, respectively. And, they are operated under the control of the decoder determining unit 220. More particularly, the decoding unit 230 can include a linear prediction decoding unit 231, a psychoacoustic decoding unit 232 and a mixed signal decoding unit 233. Moreover, the mixed signal decoding unit 233 can include an information extracting unit 234, a frequency transforming unit 235 and a linear prediction unit 236.
The demultiplexer 210 extracts a plurality of coded signals and side information from an inputted bitstream. In this case, the side information is extracted to reconstruct the signals. The demultiplexer 210 extracts the side information, which is included in the bitstream, e.g., first type information and second type information (just included if necessary) and then transfers the extracted side information to the decoder determining unit 220.
The decoder determining unit 220 determines one of decoding schemes within the decoding units 231, 232 and 233 from the received first type information and the received second type information (just included if necessary).
Although the decoder determining unit 220 is able to determine the decoding scheme using the side information extracted from the bitstream, if the side information does not exist within the bitstream, the decoder determining unit 220 is able to determine the scheme by an independent determining method. This determining method can be performed in a manner of utilizing the features of the aforesaid signal classifying unit (cf. '100' in FIG. 1).
The linear prediction decoder 231 within the decoding unit 230 is able to decode a speech signal type of an audio signal. The psychoacoustic decoder 232 decodes a music signal type of an audio signal. And, the mixed signal decoder 233 decodes a speech & music mixed type of an audio signal. In particular, the mixed signal decoder 233 includes an information extracting unit 234 extracting spectral data and a linear predictive coefficient from an audio signal, a frequency transforming unit 235 generating a residual signal for linear prediction by inverse-transforming the spectral data, and a linear prediction unit 236 generating an output signal by performing linear predictive coding on the linear predictive coefficient and the residual signal. The decoded signals are reconstructed into an audio signal before coding by being synthesized together by the synthesizing unit 240.
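The two steps performed by the frequency transforming unit 235 and the linear prediction unit 236 can be sketched as follows. This is a minimal sketch under stated assumptions: an orthonormal inverse DCT stands in for the inverse frequency transform (the claims mention an MDCT-domain signal, which would additionally require windowing and overlap-add), the predictor sign convention is chosen for illustration, and all function names are hypothetical.

```python
import math


def inverse_dct(spectral_data):
    """Orthonormal inverse DCT, used here as a stand-in inverse transform."""
    n = len(spectral_data)
    out = []
    for t in range(n):
        acc = 0.0
        for k, coeff in enumerate(spectral_data):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            acc += scale * coeff * math.cos(math.pi / n * (t + 0.5) * k)
        out.append(acc)
    return out


def lpc_synthesis(residual, lpc):
    """All-pole synthesis: y[t] = e[t] + sum_k lpc[k] * y[t - 1 - k]."""
    out = []
    for t, e in enumerate(residual):
        pred = sum(a * out[t - 1 - k] for k, a in enumerate(lpc) if t - 1 - k >= 0)
        out.append(e + pred)
    return out


def decode_mixed_frame(spectral_data, lpc):
    """Spectral data -> residual signal -> reconstructed output signal."""
    residual = inverse_dct(spectral_data)   # role of unit 235
    return lpc_synthesis(residual, lpc)     # role of unit 236
```

For example, `lpc_synthesis([1.0, 0.0, 0.0], [0.5])` yields a decaying impulse response, since each output sample feeds half of its value into the next one.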
FIG. 7 shows a decoding apparatus according to one embodiment of the present invention, which relates to a post-processing process of a coded audio signal. The post-processing process means a process for performing bandwidth extension and channel number change for a decoded audio signal using one of the linear prediction decoding unit 231, the psychoacoustic decoding unit 232 and the mixed signal decoding unit 233. The post-processing process can include a bandwidth extension decoding unit 250 and a multi-channel generating unit 260 to correspond to the aforesaid downmix channel generating unit 140 and the aforesaid bandwidth preprocessing unit 150 shown in FIG. 2.
FIG. 8 shows a detailed configuration of the bandwidth extension decoding unit 250.
In a frequency band extending process, the demultiplexer 210 extracts the extension information generated by the bandwidth preprocessing unit 150 from the bitstream and the extracted extension information is utilized. And, spectral data of a different band (e.g., a high frequency band) is generated from a portion of the spectral data or the whole spectral data using the extension information included in the audio signal bitstream. In this case, units having similar characteristics can be grouped into a block in extending the frequency band. This is the same as the method of generating an envelope region by grouping time slots (or samples) having a common envelope (or an envelope characteristic).
Referring to FIG. 8, a bandwidth extension decoding unit 250 includes an extension base region determining unit 251, a high frequency region reconstructing unit 252 and a bandwidth extending unit 253.
The extension base region determining unit 251 determines an extension base region in a received downmix signal based on the received extension information and then generates an extension base signal as a result of the determination. The downmix signal may be a signal in a frequency domain and the extension base signal means a partial frequency region in the downmix signal of the frequency domain. Therefore, the extension information is used to determine the extension base signal and may include start and end frequencies of the extension base signal or a range of a filter for filtering a portion of the downmix signal.
The high frequency region reconstructing unit 252 receives a downmix signal and extension information and also receives the extension base signal. The high frequency region reconstructing unit 252 is then able to reconstruct a high frequency region signal of the downmix signal, which was removed by the coding side, using the extension base signal and the extension information. The high frequency region signal may not be included in the downmix signal but may be included in an original signal. The high frequency region signal may not be an integer multiple of the downmix signal and a bandwidth of the high frequency region signal may not be equal to that of the extension base signal.
In a bandwidth extending apparatus and method according to one embodiment of the present invention, even if a reconstructed high frequency region is not an integer multiple of the downmix signal, it is able to use the bandwidth extending technique in a manner of using a signal corresponding to a partial frequency region in the downmix signal as the extension base signal instead of using the whole downmix signal of which high frequency region was removed by the coding side.
The high frequency region reconstructing unit 252 can further include a time extension downmix signal generating unit (not shown in the drawing) and a frequency signal extending unit (not shown in the drawing). The time extension downmix signal generating unit is able to extend the downmix signal into a time domain by applying the extension information to the extension base signal. The frequency signal extending unit is able to extend a signal in a frequency region of the downmix signal by reducing the sample number of the time extension downmix signal (decimation). If the high frequency region reconstructing unit 252 includes a reconstructed high frequency region signal only but does not include a low frequency region signal, the bandwidth extending unit 253 generates an extension downmix signal, of which bandwidth is extended, by combining the downmix signal and the high frequency region signal together. The high frequency region signal may not be an integer multiple of the downmix signal. Therefore, the bandwidth extending technique according to one embodiment of the present invention is usable for upsampling into a signal not in a multiple relation.
The extension downmix signal, which is finally generated by the bandwidth extending unit 253, is inputted to the multi-channel generating unit 260 to be converted to a multi-channel signal.
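The cooperation of the units 251, 252 and 253 can be sketched on a list of frequency bins. Note that this sketch swaps in a generic copy-up (tiling) of the extension base signal, which is not the time extension and decimation procedure described above; it is used only to show that the reconstructed high band need not be an integer multiple of the base region. The function name, parameters and the single `gain` value are hypothetical.

```python
def extend_bandwidth(downmix, base_start, base_end, hf_width, gain=1.0):
    """Return the downmix spectrum with a reconstructed high band appended.

    downmix              : low band frequency bins received by the decoder
    base_start, base_end : bins of the extension base signal (from extension info)
    hf_width             : number of high band bins to reconstruct
    gain                 : hypothetical single envelope gain for the high band
    """
    # Role of unit 251: pick the extension base signal out of the downmix.
    base = downmix[base_start:base_end]
    if not base:
        raise ValueError("empty extension base region")

    # Role of unit 252 (simplified): tile the base signal and truncate, so
    # hf_width does not have to be a multiple of len(base).
    high = [gain * base[i % len(base)] for i in range(hf_width)]

    # Role of unit 253: combine the downmix signal and the high band.
    return downmix + high
```

For instance, with a two-bin base region and `hf_width=5`, the base signal is repeated two and a half times.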
In the following description, a decoding method according to the present invention is explained in detail with reference to a flowchart shown in FIG. 11.
First of all, the demultiplexer 210 extracts first type information and second type information (if necessary) from an inputted bitstream. Moreover, the demultiplexer 210 extracts information (e.g., band extension information, reconstruction information, etc.) for a post-processing process. The decoder determining unit 220 first determines a coding type of a received audio signal using the first type information of the extracted information [S1000]. If a coding type of the received audio signal is a music signal coding type, the psychoacoustic decoding unit 232 within the decoding unit 230 is utilized. A coding scheme applied per frame or subframe is determined according to the first type information. Decoding is then performed by applying a suitable coding scheme [S1100].
If it is determined that the coding type of the received audio signal is not the music signal coding type using the first type information, the decoder determining unit 220 determines whether the coding type of the received audio signal is a speech signal coding type or a mixed signal coding type using the second type information [S1200].
If the second type information means the speech signal coding type, the coding scheme applied per frame or subframe is determined by utilizing coding identification information extracted from the bitstream in a manner of utilizing the linear prediction decoding unit 231 within the decoding unit 230. Decoding is then performed by applying a suitable coding scheme [S1300].
If the second type information means the mixed signal coding type, the coding scheme applied per frame or subframe is determined by utilizing coding identification information extracted from the bitstream in a manner of utilizing the mixed signal decoding unit 233 within the decoding unit 230. Decoding is then performed by applying a suitable coding scheme [S1400].
Besides, as a post-processing of the audio signal decoding process using the linear prediction decoding unit 231, the psychoacoustic decoding unit 232 and the mixed signal decoding unit 233, a bandwidth extension decoding unit 250 can perform a frequency band extending process [S1500]. The frequency band extending process is performed in a manner that the bandwidth extension decoding unit 250 generates spectral data of a different band (e.g., a high frequency band) from a portion of the spectral data or the whole spectral data by decoding bandwidth extension information extracted from an audio signal bitstream.
Subsequently, the multi-channel generating unit 260 can perform a process for generating a multi-channel for the bandwidth-extended audio signal generated after the band extending process [S1600].
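The flow of steps S1000 to S1600 can be sketched as a simple dispatch. This is a minimal sketch under assumptions: the boolean flags standing for the first and second type information, the dictionary keys and the callables passed in are all hypothetical placeholders for the decoding units 231 to 233, the bandwidth extension decoding unit 250 and the multi-channel generating unit 260.

```python
def decode_frame(first_type_is_music, second_type_is_speech, decoders, payload,
                 extend_band=None, make_multichannel=None):
    """Select a decoder from the type information, then post-process."""
    if first_type_is_music:                               # S1000
        signal = decoders["psychoacoustic"](payload)      # S1100
    elif second_type_is_speech:                           # S1200
        signal = decoders["linear_prediction"](payload)   # S1300
    else:
        signal = decoders["mixed"](payload)               # S1400

    if extend_band is not None:
        signal = extend_band(signal)                      # S1500
    if make_multichannel is not None:
        signal = make_multichannel(signal)                # S1600
    return signal
```

The second type information is consulted only when the first type information does not indicate the music signal coding type, which mirrors the order of the flowchart.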
FIG. 9 is a diagram for a configuration of a product implemented with an audio decoding apparatus 900 according to an embodiment of the present invention. And, FIG. 10 is a diagram for an example of relations between products implemented with an audio decoding apparatus according to an embodiment of the present invention.
Referring to FIG. 9, a wire/wireless communication unit 910 receives a bitstream through a wire/wireless communication system. In particular, the wire/wireless communication unit 910 can include at least one of a wire communication unit 910A, an IR (infrared) communication unit 910B, a Bluetooth® unit 910C and a wireless LAN communication unit 910D.
A user authenticating unit 920 receives an input of user information and then performs user authentication. The user authenticating unit 920 can include at least one of a fingerprint recognizing unit 920A, an iris recognizing unit 920B, a face recognizing unit 920C and a speech recognizing unit 920D. The user authenticating unit 920 is able to perform the user authentication in a manner of inputting fingerprint/iris/face contour/speech information to the corresponding recognizing unit 920A/920B/920C/920D, converting the inputted information to user information and then determining whether the user information matches previously-registered user data.

An input unit 930 is an input device for enabling a user to input various kinds of commands. The input unit 930 is able to include at least one of a keypad unit 930A, a touchpad unit 930B and a remote controller unit 930C, by which the present invention is non-limited. A signal decoding unit 940 analyzes signal characteristics using a received bitstream and frame type information.
The signal decoding unit 940 may include an audio decoding apparatus 945, which may be the audio decoding apparatus described with reference to FIG. 6. The audio decoding apparatus 945 decides at least one of different schemes and performs decoding using at least one of a linear prediction decoding unit, a psychoacoustic decoding unit and a mixed signal decoding unit. The signal decoding unit 940 outputs an output signal by decoding a signal using a decoding unit corresponding to the signal characteristic.
A control unit 950 receives input signals from input devices and controls all processes of the signal decoding unit 940 and an output unit 960. And, the output unit 960 is an element for outputting the output signal generated by the signal decoding unit 940 or the like. The output unit 960 is able to include a speaker unit 960A and a display unit 960B. If an output signal is an audio signal, it is outputted to a speaker. If an output signal is a video signal, it is outputted via a display.
FIG. 10 shows relations between a terminal and a server corresponding to the products shown in FIG. 9.
Referring to (A) of FIG. 10, it can be observed that a first terminal 1001 and a second terminal 1002 are able to bi-directionally communicate with each other via a wire/wireless communication unit to exchange data and/or bitstreams. Referring to (B) of FIG. 10, it can be observed that a server 1003 and a first terminal 1001 are able to perform wire/wireless communications.
An audio signal processing method according to the present invention can be implemented into a program to be run in a computer and can be stored in a computer-readable recording medium. And, multimedia data having a data structure according to the present invention can be stored in a computer-readable recording medium as well. The computer-readable media include all kinds of recording devices in which data readable by a computer system are stored. The computer-readable media include ROM, RAM, CD-ROM, magnetic tapes, floppy discs, optical data storage devices, and the like for example and also include carrier-wave type implementations (e.g., transmission via Internet).
Moreover, a bitstream generated by the encoding method is stored in a computer-readable recording medium or can be transmitted via wire/wireless communication network.
Accordingly, the present invention provides the following effects or advantages.
First of all, the present invention classifies audio signals into different types and provides an audio coding scheme suitable for characteristics of the classified audio signals, thereby enabling more efficient compression and reconstruction of an audio signal.
While the present invention has been described and illustrated herein with reference to the preferred embodiments thereof, it will be apparent to those skilled in the art that various modifications and variations can be made therein without departing from the scope of the invention. Thus, it is intended that the present invention covers the modifications and variations of this invention that come within the scope of the appended claims and their equivalents.

Claims (15)

CLAIMS:
1. In an audio signal processing apparatus including an audio decoder, a method of processing an audio signal, comprising the steps of:
identifying whether a coding type of the audio signal is a music signal coding type using first type information;
if the coding type of the audio signal is not the music signal coding type, identifying whether the coding type of the audio signal is a speech signal coding type or a mixed signal coding type using second type information;
if the coding type of the audio signal is the mixed signal coding type, extracting spectral data and a linear predictive coefficient from the audio signal;
generating a residual signal for linear prediction by performing inverse frequency conversion on the spectral data;
reconstructing the audio signal by performing linear prediction coding on the linear predictive coefficient and the residual signal; and reconstructing a high frequency region signal using an extension base signal corresponding to a partial region of the reconstructed audio signal and band extension information.
2. The method of claim 1, wherein the audio signal includes a plurality of subframes and wherein the second type information exists by a unit of the subframe.
3. The method of claim 1, wherein a bandwidth of the high frequency region signal is not equal to that of the extension base signal.
4. The method of claim 1, wherein the band extension information includes at least one of a filter range applied to the reconstructed audio signal, a start frequency of the extension base signal and an end frequency of the extension base signal.
5. The method of claim 1, wherein if the coding type of the audio signal is the music signal coding type, the audio signal comprises a frequency-domain signal, wherein if the coding type of the audio signal is the speech signal coding type, the audio signal comprises a time-domain signal, and wherein if the coding type of the audio signal is the mixed signal coding type, the audio signal comprises an MDCT-domain signal.
6. The method of claim 1, the linear predictive coefficient extracting step comprises the steps of:
extracting a linear predictive coefficient mode; and extracting the linear predictive coefficient having a variable bit size corresponding to the extracted linear predictive coefficient mode.
7. An apparatus for processing an audio signal, comprising:
a demultiplexer configured to extract first type information and second type information from a bitstream;
a decoder determining unit configured to identify whether a coding type of the audio signal is a music signal coding type using first type information, the decoder, if the coding type of the audio signal is not the music signal coding type, identifying whether the coding type of the audio signal is a speech signal coding type or a mixed signal coding type using second type information, the decoder then determining a decoding scheme;
an information extracting unit configured to extract spectral data and a linear predictive coefficient from the audio signal if the coding type of the audio signal is the mixed signal coding type;
a frequency transforming unit configured to generate a residual signal for linear prediction by performing inverse frequency conversion on the spectral data;
a linear prediction unit configured to reconstruct the audio signal by performing linear prediction coding on the linear predictive coefficient and the residual signal; and a bandwidth extension decoding unit configured to reconstruct a high frequency region signal using an extension base signal corresponding to a partial region of the reconstructed audio signal and band extension information.
8. The apparatus of claim 7, wherein the audio signal includes a plurality of subframes and wherein the second type information exists by a unit of the subframe.
9. The apparatus of claim 7, wherein a bandwidth of the high frequency region signal is not equal to that of the extension base signal.
10. The apparatus of claim 7, wherein the band extension information includes at least one of a filter range applied to the reconstructed audio signal, a start frequency of the extension base signal and an end frequency of the extension base signal.
11. The apparatus of claim 7, wherein if the coding type of the audio signal is the music signal coding type, the audio signal comprises a frequency-domain signal, wherein if the coding type of the audio signal is the speech signal coding type, the audio signal comprises a time-domain signal, and wherein if the coding type of the audio signal is the mixed signal coding type, the audio signal comprises an MDCT-domain signal.
12. The apparatus of claim 7, the linear predictive coefficient extracting comprising:
extracting a linear predictive coefficient mode; and extracting the linear predictive coefficient having a variable bit size corresponding to the extracted linear predictive coefficient mode.
13. In an audio signal processing apparatus including an audio coder for processing an audio signal, a method of processing the audio signal, comprising the steps of:
removing a high frequency band signal of the audio signal and generating band extension information for reconstructing the high frequency band signal;
determining a coding type of the audio signal;
if the audio signal is a music signal, generating first type information indicating that the audio signal is coded into a music signal coding type;
if the audio signal is not the music signal, generating second type information indicating that the audio signal is coded into either a speech signal coding type or a mixed signal coding type;

if the coding type of the audio signal is the mixed signal coding type, generating a linear predictive coefficient by performing linear prediction coding on the audio signal;
generating a residual signal for the linear prediction coding;
generating a spectral coefficient by frequency-transforming the residual signal; and generating an audio bitstream including the first type information, the second type information, the linear predictive coefficient and the residual signal.
14. An apparatus for processing an audio signal, comprising:
a bandwidth preprocessing unit for removing a high frequency band signal of the audio signal, the bandwidth preprocessing unit being configured to generate band extension information for reconstructing the high frequency band signal;
a signal classifying unit configured to determine a coding type of the audio signal, the signal classifying unit, if the audio signal is a music signal, generating first type information indicating that the audio signal is coded into a music signal coding type, the signal classifying unit, if the audio signal is not the music signal, generating second type information indicating that the audio signal is coded into either a speech signal coding type or a mixed signal coding type;
a linear prediction modeling unit configured to generate a linear predictive coefficient by performing linear prediction coding on the audio signal if the coding type of the audio signal is the mixed signal coding type;
a residual signal extracting unit configured to generate a residual signal for the linear prediction coding;
and a frequency transforming unit configured to generate a spectral coefficient by frequency-transforming the residual signal.
15. The apparatus of claim 14, wherein the audio signal includes a plurality of subframes and wherein the second type information is generated per the subframe.
CA2717584A 2008-03-04 2009-03-04 Method and apparatus for processing an audio signal Active CA2717584C (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US3371508P 2008-03-04 2008-03-04
US61/033,715 2008-03-04
US7876208P 2008-07-07 2008-07-07
US61/078,762 2008-07-07
PCT/KR2009/001081 WO2009110751A2 (en) 2008-03-04 2009-03-04 Method and apparatus for processing an audio signal

Publications (2)

Publication Number Publication Date
CA2717584A1 CA2717584A1 (en) 2009-09-11
CA2717584C true CA2717584C (en) 2015-05-12

Family

ID=41056476

Family Applications (1)

Application Number Title Priority Date Filing Date
CA2717584A Active CA2717584C (en) 2008-03-04 2009-03-04 Method and apparatus for processing an audio signal

Country Status (10)

Country Link
US (1) US8135585B2 (en)
EP (1) EP2259254B1 (en)
JP (1) JP5108960B2 (en)
KR (1) KR20100134623A (en)
CN (1) CN102007534B (en)
AU (1) AU2009220341B2 (en)
CA (1) CA2717584C (en)
ES (1) ES2464722T3 (en)
RU (1) RU2452042C1 (en)
WO (1) WO2009110751A2 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101889306A (en) * 2007-10-15 2010-11-17 Lg电子株式会社 The method and apparatus that is used for processing signals
MX2011000370A (en) * 2008-07-11 2011-03-15 Fraunhofer Ges Forschung An apparatus and a method for decoding an encoded audio signal.
JP5232121B2 (en) * 2009-10-02 2013-07-10 株式会社東芝 Signal processing device
US8447617B2 (en) * 2009-12-21 2013-05-21 Mindspeed Technologies, Inc. Method and system for speech bandwidth extension
KR101826331B1 (en) 2010-09-15 2018-03-22 삼성전자주식회사 Apparatus and method for encoding and decoding for high frequency bandwidth extension
BR122021007425B1 (en) 2010-12-29 2022-12-20 Samsung Electronics Co., Ltd DECODING APPARATUS AND METHOD OF CODING A UPPER BAND SIGNAL
CN102610231B (en) * 2011-01-24 2013-10-09 华为技术有限公司 Method and device for expanding bandwidth
CN103918247B (en) 2011-09-23 2016-08-24 数字标记公司 Intelligent mobile phone sensor logic based on background environment
CN103035248B (en) 2011-10-08 2015-01-21 华为技术有限公司 Encoding method and device for audio signals
ES2805308T3 (en) * 2011-11-03 2021-02-11 Voiceage Evs Llc Improving non-speech content for low rate CELP decoder
CN102446509B (en) * 2011-11-22 2014-04-09 中兴通讯股份有限公司 Audio coding and decoding method for enhancing anti-packet loss capability and system thereof
CN104221082B (en) * 2012-03-29 2017-03-08 瑞典爱立信有限公司 Bandwidth extension of harmonic audio signal
KR101775084B1 (en) * 2013-01-29 2017-09-05 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에.베. Decoder for generating a frequency enhanced audio signal, method of decoding, encoder for generating an encoded signal and method of encoding using compact selection side information
EP2830052A1 (en) * 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder, audio encoder, method for providing at least four audio channel signals on the basis of an encoded representation, method for providing an encoded representation on the basis of at least four audio channel signals and computer program using a bandwidth extension
CN103413553B (en) * 2013-08-20 2016-03-09 腾讯科技(深圳)有限公司 Audio encoding method, audio decoding method, encoding end, decoding end and system
CN103500580B (en) * 2013-09-23 2017-04-12 广东威创视讯科技股份有限公司 Audio mixing processing method and system
EP2863386A1 (en) 2013-10-18 2015-04-22 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Audio decoder, apparatus for generating encoded audio output data and methods permitting initializing a decoder
US9311639B2 (en) 2014-02-11 2016-04-12 Digimarc Corporation Methods, apparatus and arrangements for device to device communication
ES2702455T3 (en) 2014-02-24 2019-03-01 Samsung Electronics Co Ltd Signal classification method and device, and audio coding method and device using the same
CN107424621B (en) * 2014-06-24 2021-10-26 华为技术有限公司 Audio encoding method and apparatus
CN104269173B (en) * 2014-09-30 2018-03-13 武汉大学深圳研究院 Switched-mode audio bandwidth extension apparatus and method
CN107077849B (en) * 2014-11-07 2020-09-08 三星电子株式会社 Method and apparatus for restoring audio signal
CN106075728B (en) * 2016-08-22 2018-09-28 卢超 Method for acquiring music-modulated pulses applied to an electronic acupuncture apparatus
US10074378B2 (en) * 2016-12-09 2018-09-11 Cirrus Logic, Inc. Data encoding detection
CN115334349B (en) * 2022-07-15 2024-01-02 北京达佳互联信息技术有限公司 Audio processing method, device, electronic equipment and storage medium

Family Cites Families (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742735A (en) * 1987-10-06 1998-04-21 Fraunhofer Gesellschaft Zur Forderung Der Angewanten Forschung E.V. Digital adaptive transformation coding method
NL9000338A (en) * 1989-06-02 1991-01-02 Koninkl Philips Electronics Nv Digital transmission system, transmitter and receiver for use in the transmission system, and record carrier obtained by means of the transmitter in the form of a recording device
JPH04150522A (en) * 1990-10-15 1992-05-25 Sony Corp Digital signal processor
US5680508A (en) * 1991-05-03 1997-10-21 Itt Corporation Enhancement of speech coding in background noise for low-rate speech coder
US5371853A (en) * 1991-10-28 1994-12-06 University Of Maryland At College Park Method and system for CELP speech coding and codebook for use therewith
DE4202140A1 (en) * 1992-01-27 1993-07-29 Thomson Brandt Gmbh Digital audio signal transmission using sub-band coding - inserting extra fault protection signal, or fault protection bit into data frame
US5285498A (en) * 1992-03-02 1994-02-08 At&T Bell Laboratories Method and apparatus for coding audio signals based on perceptual model
IT1257065B (en) * 1992-07-31 1996-01-05 Sip Low-delay coder for audio signals, using analysis-by-synthesis techniques
US5579404A (en) * 1993-02-16 1996-11-26 Dolby Laboratories Licensing Corporation Digital audio limiter
DE4405659C1 (en) * 1994-02-22 1995-04-06 Fraunhofer Ges Forschung Method for the cascaded coding and decoding of audio data
US5537510A (en) * 1994-12-30 1996-07-16 Daewoo Electronics Co., Ltd. Adaptive digital audio encoding apparatus and a bit allocation method thereof
IT1281001B1 (en) * 1995-10-27 1998-02-11 Cselt Centro Studi Lab Telecom Method and apparatus for coding, manipulating and decoding audio signals
US5778335A (en) * 1996-02-26 1998-07-07 The Regents Of The University Of California Method and apparatus for efficient multiband celp wideband speech and music coding and decoding
US6061793A (en) * 1996-08-30 2000-05-09 Regents Of The University Of Minnesota Method and apparatus for embedding data, including watermarks, in human perceptible sounds
KR100261254B1 (en) * 1997-04-02 2000-07-01 윤종용 Scalable audio data encoding/decoding method and apparatus
JP3185748B2 (en) * 1997-04-09 2001-07-11 日本電気株式会社 Signal encoding device
US6208962B1 (en) * 1997-04-09 2001-03-27 Nec Corporation Signal coding system
DE69926821T2 (en) * 1998-01-22 2007-12-06 Deutsche Telekom Ag Method for signal-controlled switching between different audio coding systems
JP3199020B2 (en) * 1998-02-27 2001-08-13 日本電気株式会社 Audio music signal encoding device and decoding device
US6424938B1 (en) * 1998-11-23 2002-07-23 Telefonaktiebolaget L M Ericsson Complex signal activity detection for improved speech/noise classification of an audio signal
SG98418A1 (en) * 2000-07-10 2003-09-19 Cyberinc Pte Ltd A method, a device and a system for compressing a musical and voice signal
US6658383B2 (en) * 2001-06-26 2003-12-02 Microsoft Corporation Method for coding speech and music signals
SE521600C2 (en) * 2001-12-04 2003-11-18 Global Ip Sound Ab Low bit rate codec
JP2003257125A (en) * 2002-03-05 2003-09-12 Seiko Epson Corp Sound reproducing method and sound reproducing device
WO2003089892A1 (en) * 2002-04-22 2003-10-30 Nokia Corporation Generating lsf vectors
US8359197B2 (en) * 2003-04-01 2013-01-22 Digital Voice Systems, Inc. Half-rate vocoder
KR20060131793A (en) * 2003-12-26 2006-12-20 마츠시타 덴끼 산교 가부시키가이샤 Voice/musical sound encoding device and voice/musical sound encoding method
US7596486B2 (en) * 2004-05-19 2009-09-29 Nokia Corporation Encoding an audio signal using different audio coder modes
KR100854534B1 (en) 2004-05-19 2008-08-26 노키아 코포레이션 Supporting a switch between audio coder modes
KR101171098B1 (en) 2005-07-22 2012-08-20 삼성전자주식회사 Scalable speech coding/decoding methods and apparatus using mixed structure
CN101512639B (en) * 2006-09-13 2012-03-14 艾利森电话股份有限公司 Method and equipment for voice/audio transmitter and receiver
JP5266341B2 (en) * 2008-03-03 2013-08-21 エルジー エレクトロニクス インコーポレイティド Audio signal processing method and apparatus

Also Published As

Publication number Publication date
CA2717584A1 (en) 2009-09-11
WO2009110751A3 (en) 2009-10-29
KR20100134623A (en) 2010-12-23
CN102007534A (en) 2011-04-06
EP2259254A2 (en) 2010-12-08
ES2464722T3 (en) 2014-06-03
RU2010140365A (en) 2012-04-10
EP2259254B1 (en) 2014-04-30
RU2452042C1 (en) 2012-05-27
JP2011514558A (en) 2011-05-06
CN102007534B (en) 2012-11-21
EP2259254A4 (en) 2013-02-20
WO2009110751A2 (en) 2009-09-11
AU2009220341B2 (en) 2011-09-22
US20100070272A1 (en) 2010-03-18
US8135585B2 (en) 2012-03-13
JP5108960B2 (en) 2012-12-26
AU2009220341A1 (en) 2009-09-11

Similar Documents

Publication Publication Date Title
CA2717584C (en) Method and apparatus for processing an audio signal
EP2259253B1 (en) Method and apparatus for processing audio signal
US8504377B2 (en) Method and an apparatus for processing a signal using length-adjusted window
EP2169670B1 (en) An apparatus for processing an audio signal and method thereof
EP2124224A1 (en) A method and an apparatus for processing an audio signal
WO2010058931A2 (en) A method and an apparatus for processing a signal

Legal Events

Date Code Title Description
EEER Examination request