CN100559465C - Fidelity-optimized variable frame length encoding - Google Patents

Fidelity-optimized variable frame length encoding

Info

Publication number
CN100559465C
Authority
CN
China
Prior art keywords
signal
subframe
mono
coding
coding parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2004800186630A
Other languages
Chinese (zh)
Other versions
CN1816847A (en)
Inventor
S·布鲁恩
I·约翰松
A·塔莱布
D·恩斯特伦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Telefonaktiebolaget LM Ericsson AB
Original Assignee
Telefonaktiebolaget LM Ericsson AB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from SE0303501A (SE0303501D0)
Application filed by Telefonaktiebolaget LM Ericsson AB
Publication of CN1816847A
Application granted
Publication of CN100559465C
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008: Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, using predictive techniques
    • G10L19/16: Vocoder architecture
    • G10L19/18: Vocoders using multiple modes

Abstract

A polyphonic signal is used to create a main signal, typically a mono signal, and a side signal (x_side). A number of encoding schemes (81) are provided for the side signal (x_side). Each encoding scheme (81) is characterized by a set of sub-frames (90) of different lengths. The total length of the sub-frames (90) equals the length of the encoding frame (80) of the encoding scheme (81). The encoding scheme (81) to be used on the side signal (x_side) is selected according to the present signal content of the polyphonic signal. In a preferred embodiment, a side residual signal is created as the difference between the side signal and the main signal scaled by a balance factor. The balance factor is selected to minimize the side residual signal. The optimized side residual signal and the balance factor are encoded and provided as encoding parameters representing the side signal.

Description

Fidelity-optimized variable frame length encoding
Technical field
The present invention relates in general to the encoding of audio signals, and in particular to the encoding of multi-channel audio signals.
Background
There is a high market need to transmit and store audio signals at low bit rates while maintaining high audio quality. Particularly in cases where transmission resources or storage are limited, low bit-rate operation is an essential cost factor. This is typically the case, for example, in streaming and messaging applications in mobile communication systems such as GSM, UMTS, or CDMA.
Today, no standardized codecs are available that provide high stereophonic audio quality at bit rates that are economically interesting for use in mobile communication systems. What is possible with available codecs is monophonic transmission of the audio signals. To some extent, stereophonic transmission is also available. However, bit-rate limitations usually require limiting the stereo representation quite drastically.
The simplest way of stereophonic or multi-channel coding of audio signals is to encode the signals of the different channels separately, as individual and independent signals. Another basic way, used in stereo FM radio transmission and ensuring compatibility with legacy mono radio receivers, is to transmit a sum signal and a difference signal of the two channels involved.
State-of-the-art audio codecs, such as MPEG-1/2 Layer III and MPEG-2/4 AAC, make use of so-called joint stereo coding. According to this technique, the signals of the different channels are processed jointly rather than separately and individually. The two most commonly used joint stereo coding techniques are known as mid/side (M/S) stereo coding and intensity stereo coding, which are usually applied to sub-bands of the stereo or multi-channel signals to be encoded.
M/S stereo coding is similar to the procedure described for stereo FM radio in the sense that it encodes and transmits the sum and difference signals of the channel sub-bands and thereby exploits redundancy between the channel sub-bands. The structure and operation of an encoder based on M/S stereo coding are described, for example, in U.S. Patent 5,285,498 by J. D. Johnston.
Intensity stereo, on the other hand, is able to make use of stereo irrelevancy. It transmits the joint intensity of the channels (of the different sub-bands) along with some location information indicating how the intensity is distributed among the channels. Intensity stereo provides only spectral magnitude information of the channels; phase information is not conveyed. For this reason, and since temporal inter-channel information (more specifically, the inter-channel time difference) is of major psychoacoustic relevance particularly at lower frequencies, intensity stereo can only be used at high frequencies, for example above 2 kHz. An intensity stereo coding method is described, for example, in European Patent 0497413 by R. Veldhuis et al.
A recently developed stereo coding method is described in the conference paper "Binaural cue coding applied to stereo and multi-channel audio compression" by C. Faller et al., presented at the 112th AES Convention in Munich, Germany, in May 2002. This method is a parametric multi-channel audio coding method. The basic principle is that, at the encoding side, the input signals from N channels C1, C2, ..., CN are combined into one mono signal m. The mono signal is audio-encoded using any conventional monophonic audio codec. In parallel, parameters that describe the multi-channel image are derived from the channel signals. The parameters are encoded and transmitted to the decoder along with the audio bit stream. The decoder first decodes the mono signal m' and then regenerates the channel signals C1', C2', ..., CN' based on the parametric description of the multi-channel image.
The principle of the binaural cue coding (BCC) method is that it transmits the encoded mono signal and so-called BCC parameters. The BCC parameters comprise coded inter-channel level differences and inter-channel time differences for sub-bands of the original multi-channel input signal. The decoder regenerates the different channel signals by applying sub-band-wise level and phase adjustments to the mono signal based on the BCC parameters. The advantage over, for example, M/S or intensity stereo is that stereo information comprising temporal inter-channel information is transmitted at much lower bit rates. However, this technique requires computationally demanding time-frequency transforms on each of the channels, at both the encoder and the decoder.
Moreover, BCC does not handle the fact that much of the stereo information, especially at low frequencies, is diffuse, i.e. it does not come from any specific direction. Diffuse sound fields exist in both channels of a stereo recording, but they are to a great extent out of phase with respect to each other. If an algorithm such as BCC is applied to a recording with a large amount of diffuse sound fields, the reproduced stereo image becomes confused and jumps from left to right, since the BCC algorithm can only pan the signal at specific frequency bands to the left or to the right.
A possible means of encoding the stereo signal while ensuring good reproduction of diffuse sound fields is to use an encoding scheme very similar to the technique used in FM stereo radio, namely to encode the mono (Left+Right) and the difference (Left-Right) signals separately.
U.S. Patent 5,434,948 by C. E. Holt et al. describes a technique similar to BCC for encoding the mono signal and side information. In this case, the side information consists of predictor filters and, optionally, a residual signal. The predictor filters, estimated by a least-mean-square algorithm, allow the multi-channel audio signals to be predicted when applied to the mono signal. With this technique, multi-channel audio sources can be encoded at very low bit rates, however at the expense of a drop in quality, as discussed further below.
Finally, for completeness, a technique used in 3D audio should be mentioned. This technique derives the right and left channel signals by filtering sound source signals with so-called head-related filters. However, this technique requires the different sound source signals to be separated and can therefore not generally be applied to stereo or multi-channel coding.
Summary of the invention
A problem with existing encoding schemes based on frame-wise encoding of signals, in particular of a main signal and one or more side signals, is that dividing the audio information into frames may introduce unpleasant perceptible artifacts. Dividing the information into frames of relatively long duration generally reduces the average required bit rate. This may be beneficial, for example, for music containing a large amount of diffuse sound. However, for transient-rich music or speech, the fast temporal variations are smeared out over the frame duration, giving rise to ghost-like sounds or even pre-echo problems. Conversely, encoding short frames gives a more accurate representation of the sound, minimizing the energy, but requires higher transmission bit rates and more computational resources. Coding efficiency as such may also decrease with very short frame lengths. Introducing more frame boundaries may also introduce discontinuities in the encoding parameters, which may appear as perceptible artifacts.
A further problem with schemes based on encoding a main signal and one or several side signals is that they generally require relatively large computational resources. In particular, when short frames are used, handling discontinuities in the parameters from one frame to the next is a complex task. When long frames are used, estimation errors for transient sounds may cause very large side signals, which in turn increases the required transmission rate.
An object of the present invention is therefore to provide an encoding method and device that improve the perceived quality of multi-channel audio signals, in particular by avoiding artifacts such as pre-echoing, ghost-like sounds, or frame-discontinuity artifacts. A further object of the present invention is to provide an encoding method and device that require less processing power and have more constant transmission bit-rate requirements.
The above objects are achieved by methods and devices according to the enclosed patent claims. In general terms, polyphonic signals are used to create a main signal, typically a mono signal, and a side signal. The main signal is encoded according to prior-art encoding principles. A number of encoding schemes are provided for the side signal. Each encoding scheme is characterized by a set of sub-frames of different lengths. The total length of the sub-frames corresponds to the length of the encoding frame of the encoding scheme. Each set of sub-frames comprises at least one sub-frame. The encoding scheme to be used on the side signal is selected at least partly according to the present signal content of the polyphonic signals.
In one embodiment, the selection is made before the encoding, based on an analysis of signal characteristics. In another embodiment, the side signal is encoded by each of the encoding schemes, and the best encoding scheme is selected based on a measurement of the encoding quality.
In a preferred embodiment, a side residual signal is created as the difference between the side signal and the main signal scaled by a balance factor. The balance factor is selected to minimize the side residual signal. The optimized side residual signal and the balance factor are encoded and provided as parameters representing the side signal. At the decoder side, the side residual signal and the main signal are used to recover the side signal.
In another preferred embodiment, the encoding of the side signal comprises an energy contour scaling in order to avoid pre-echo effects. Furthermore, different encoding schemes may comprise different encoding procedures in the separate sub-frames.
The main advantage of the present invention is that the preservation of the perception of the audio signals is improved. Furthermore, the present invention still allows multi-channel signal transmission at very low bit rates.
Brief description of the drawings
The invention, together with further objects and advantages thereof, may best be understood by reference to the following description taken together with the accompanying drawings, in which:
Fig. 1 is a block diagram of a system for transmitting polyphonic signals;
Fig. 2a is a block diagram of an encoder in a transmitter;
Fig. 2b is a block diagram of a decoder in a receiver;
Fig. 3a is a diagram illustrating encoding frames of different lengths;
Figs. 3b and 3c are block diagrams of embodiments of side signal encoder units according to the present invention;
Fig. 4 is a block diagram of an embodiment of an encoder using balance-factor encoding of the side signal;
Fig. 5 is a block diagram of an embodiment of an encoder for multi-signal systems;
Fig. 6 is a block diagram of an embodiment of a decoder suitable for decoding signals from the device of Fig. 5;
Figs. 7a and 7b are diagrams illustrating a pre-echo artifact;
Fig. 8 is a block diagram of an embodiment of a side signal encoder unit according to the present invention that employs different encoding principles in different sub-frames;
Fig. 9 illustrates the use of different encoding principles in different frequency sub-bands;
Fig. 10 is a flow diagram of the basic steps of an embodiment of an encoding method according to the present invention; and
Fig. 11 is a flow diagram of the basic steps of an embodiment of a decoding method according to the present invention.
Detailed description
Fig. 1 illustrates a typical system 1 in which the present invention can advantageously be used. A transmitter 10 comprises an antenna 12 with associated hardware and software for transmitting radio signals 5 to a receiver 20. The transmitter 10 comprises, among other parts, a multi-channel encoder 14, which transforms the signals of a number of input channels 16 into output signals suitable for radio transmission. Examples of suitable multi-channel encoders 14 are described in detail further below. The signals of the input channels 16 can be provided from, for example, an audio signal storage 18, such as data files holding digital representations of audio recordings, magnetic tapes, or vinyl discs. The signals of the input channels 16 can also be provided "live", e.g. from a set of microphones 19. If the audio signals are not already in digital form, they are digitized before entering the multi-channel encoder 14.
At the receiver 20 side, an antenna 22 with associated hardware and software handles the reception of the radio signals 5 representing polyphonic audio signals. Typical functions, such as error correction, are performed here. A decoder 24 decodes the received radio signals 5 and transforms the audio data carried thereby into signals of a number of output channels 26. The output signals can be provided, for example, to loudspeakers 29 for immediate presentation, or can be stored in an audio signal storage 28 of any kind.
The system 1 may, for example, be a phone conference system, a system for supplying audio services, or another audio application. In some systems, such as phone conference systems, the communication has to be of a duplex type, whereas the distribution of music from a service provider to a subscriber can be essentially of a one-way type. The transmission of signals from the transmitter 10 to the receiver 20 can also be performed by any other means, e.g. by different kinds of electromagnetic waves, cables, or optical fibers, as well as combinations thereof.
Fig. 2a illustrates an embodiment of an encoder according to the present invention. In this embodiment, the polyphonic signal is a stereo signal comprising two channels a and b, received at inputs 16A and 16B, respectively. The signals of channels a and b are provided to a pre-processing unit 32, where different signal conditioning procedures may be performed. The (possibly modified) signals from the output of the pre-processing unit 32 are summed in an addition unit 34. The addition unit 34 also divides the sum by a factor of two. The signal x_mono produced in this way is the main signal of the stereo signal, since it basically comprises all data from both channels. In this embodiment, the main signal thus represents a pure "mono" signal. The main signal x_mono is provided to a main signal encoder unit 38, which encodes the main signal according to any suitable encoding principles. Such principles are available in the prior art and are not discussed further here. The main signal encoder unit 38 gives an output signal p_mono, being encoding parameters representing the main signal.
In a subtraction unit 36, the difference (divided by a factor of two) of the channel signals is provided as a side signal x_side. In this embodiment, the side signal represents the difference between the two channels of the stereo signal. The side signal x_side is provided to a side signal encoder unit 30. Preferred embodiments of the side signal encoder unit 30 are discussed further below. According to a side signal encoding procedure, discussed in more detail further below, the side signal x_side is transformed into encoding parameters p_side representing the side signal x_side. In certain embodiments, this encoding also makes use of information about the main signal x_mono. Arrow 42 indicates such an arrangement, in which the original, unencoded main signal x_mono is used. In further embodiments, the main signal information used in the side signal encoder unit 30 can be deduced from the encoding parameters p_mono representing the main signal, as indicated by the broken line 44.
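The sum and difference operations of units 34 and 36 can be illustrated with a minimal sketch. The function name `to_main_and_side` and the use of plain Python lists are assumptions made for illustration only; pre-processing (unit 32) and the actual encoders (units 38 and 30) are omitted.

```python
from typing import List, Tuple


def to_main_and_side(a: List[float], b: List[float]) -> Tuple[List[float], List[float]]:
    """Form the main (mono) and side signals from two channel signals.

    x_mono = (a + b) / 2   (addition unit 34, including the division by two)
    x_side = (a - b) / 2   (subtraction unit 36, including the division by two)
    """
    if len(a) != len(b):
        raise ValueError("both channels must contain the same number of samples")
    x_mono = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
    x_side = [(ai - bi) / 2.0 for ai, bi in zip(a, b)]
    return x_mono, x_side


# Illustrative sample values (not taken from the patent).
a = [1.0, 0.5, -0.25, 0.0]
b = [1.0, -0.5, 0.25, 0.5]
x_mono, x_side = to_main_and_side(a, b)
```

Where the two channels are identical, the side signal is zero and all information is carried by the main signal.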
The encoding parameters p_mono representing the main signal x_mono are a first output signal, and the encoding parameters p_side representing the side signal x_side are a second output signal. In the typical case, these two output signals p_mono and p_side, which together represent the full stereo sound, are multiplexed into one transmission signal 52 in a multiplexor unit 40. In other embodiments, however, the first and second output signals p_mono and p_side may be transmitted separately.
Fig. 2b illustrates, as a block diagram, an embodiment of a decoder 24 according to the present invention. The received signal 54, comprising encoding parameters representing the main and side signal information, is provided to a demultiplexor unit 56, which separates a first and a second input signal. The first input signal, corresponding to the encoding parameters p_mono of the main signal, is provided to a main signal decoder unit 64. In a conventional manner, the encoding parameters p_mono representing the main signal are used to generate a decoded main signal x"_mono that is as similar as possible to the main signal x_mono of the encoder 14 (Fig. 2a).
Similarly, the second input signal, corresponding to the side signal, is provided to a side signal decoder unit 60. Here, the encoding parameters p_side representing the side signal are used to recover a decoded side signal x"_side. In some embodiments, the decoding procedure makes use of information about the main signal x"_mono, as indicated by an arrow.
The decoded main and side signals x"_mono and x"_side are provided to an addition unit 70, which provides an output signal representing the original signal of channel a. Similarly, the difference provided by a subtraction unit 68 gives an output signal representing the original signal of channel b. These channel signals may be post-processed in a post-processor unit 74 according to prior-art signal processing procedures. Finally, the channel signals a and b are provided at the outputs 26A and 26B of the decoder.
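The role of the addition unit 70 and the subtraction unit 68 can be sketched as follows. The function name `from_main_and_side` is an assumption for illustration, and the decoded signals are taken to be error-free here, which a real codec at low bit rate would not achieve.

```python
from typing import List, Tuple


def from_main_and_side(x_mono: List[float], x_side: List[float]) -> Tuple[List[float], List[float]]:
    """Recover the two channel signals from decoded main and side signals.

    a = x_mono + x_side   (addition unit 70)
    b = x_mono - x_side   (subtraction unit 68)
    """
    if len(x_mono) != len(x_side):
        raise ValueError("main and side signals must have the same length")
    a = [m + s for m, s in zip(x_mono, x_side)]
    b = [m - s for m, s in zip(x_mono, x_side)]
    return a, b


# Illustrative decoded signals (assumed identical to the encoder-side signals).
decoded_mono = [1.0, 0.0, 0.0, 0.25]
decoded_side = [0.0, 0.5, -0.25, -0.25]
channel_a, channel_b = from_main_and_side(decoded_mono, decoded_side)
```

Because the encoder halves both the sum and the difference, no further scaling is needed at the decoder.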
As described in the summary of the invention, encoding is typically performed one frame at a time. A frame comprises the audio samples within a predefined time period. In the bottom part of Fig. 3a, a frame SF2 of duration L is illustrated. The audio samples within the unhatched portion are to be encoded together. The preceding and subsequent samples are encoded in other frames. Dividing the samples into frames will in any case introduce some discontinuities at the frame boundaries. Shifting sounds give shifting encoding parameters, which change essentially at each frame boundary. This gives rise to perceptible errors. One way to compensate somewhat for this is to base the encoding not only on the samples that are to be encoded but also on samples in the immediate vicinity of the frame, as indicated by the hatched portions. In this way, the transitions between different frames become softer. As an alternative or complement, interpolation techniques are sometimes used to reduce perception artifacts caused by frame boundaries. However, all such procedures require large additional computational resources, and for certain specific encoding techniques they may be difficult to provide with any amount of resources.
It is therefore beneficial to use frames that are as long as possible, since the number of frame boundaries is then small. Coding efficiency also typically becomes high, and the necessary transmission bit rate is typically minimized. However, long frames give rise to problems with pre-echo artifacts and ghost-like sounds.
If shorter frames are used instead, such as SF1 or even SF0, with durations of L/2 and L/4 respectively, anyone skilled in the art realizes that coding efficiency may decrease, the transmission bit rate may have to be higher, and the problems with frame-boundary artifacts will increase. However, shorter frames suffer less from other perception artifacts, such as ghost-like sounds and pre-echoing. To minimize the coding error as far as possible, the frame length should be as short as possible.
According to the present invention, audio perception is improved by encoding the side signal with a frame length that depends on the present signal content. Since the influence of different frame lengths on audio perception differs with the nature of the sound to be encoded, an improvement can be obtained by letting the nature of the signal itself affect the frame length used. The encoding of the main signal is not the object of the present invention and is therefore not described in detail. The frame lengths used for the main signal may or may not be equal to the frame lengths used for the side signal.
Due to small temporal variations, it may in some cases be beneficial to encode the side signal using relatively long frames. This may be the case for recordings with a great amount of diffuse sound field, such as concert recordings. In other cases, such as stereo speech conversation, short frames are probably preferable. The decision on which frame length to use can be made in two basic ways.
One embodiment of a side signal encoder unit 30 according to the present invention, in which a closed-loop decision is used, is illustrated in Fig. 3b. A basic encoding frame of length L is used here. A number of encoding schemes 81 are created, each characterized by a separate set 80 of sub-frames. Each set 80 of sub-frames comprises one or more sub-frames of equal or differing lengths. The total length of a set 80 of sub-frames is, however, always equal to the basic encoding frame length L. With reference to Fig. 3b, the top encoding scheme is characterized by a set of sub-frames comprising only one sub-frame of length L. The next set comprises two sub-frames of length L/2. The third set comprises two sub-frames of length L/4 followed by one sub-frame of length L/2.
The signal x_side provided to the side signal encoder unit 30 is encoded by all the encoding schemes 81. In the top encoding scheme, the entire basic encoding frame is encoded in one piece. In the other encoding schemes, the signal x_side is encoded in each sub-frame separately. The result from each encoding scheme is provided to a selector 85. A fidelity measurement means 83 determines a fidelity measure for each encoded signal. The fidelity measure is an objective quality value, preferably a signal-to-noise measure or a weighted signal-to-noise ratio. The fidelity measures associated with the encoding schemes are compared, and the result controls a switching means 87 that selects the encoding parameters representing the side signal from the encoding scheme giving the best fidelity measure, as the output signal p_side from the side signal encoder unit 30.
Preferably, all possible combinations of frame lengths are tested, and the set of sub-frames that gives the best objective quality, e.g. signal-to-noise ratio, is selected.
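The closed-loop decision can be sketched as follows. This is only an illustration under stated assumptions: `toy_subframe_codec` is a hypothetical stand-in for the real sub-frame encoder and decoder (one peak gain per sub-frame plus a coarse uniform quantizer), the fidelity measure is a plain, unweighted signal-to-noise ratio, and ties are resolved in favor of the scheme listed first. None of these choices is prescribed by the text.

```python
import math
from typing import List, Sequence, Tuple


def toy_subframe_codec(x: Sequence[float]) -> List[float]:
    """Hypothetical coder: encode and decode one sub-frame in a single step.

    Uses one gain (the peak magnitude) and 9 uniform quantization levels.
    """
    peak = max(abs(v) for v in x)
    if peak == 0.0:
        return [0.0] * len(x)
    step = peak / 4.0
    return [round(v / step) * step for v in x]


def snr_db(original: Sequence[float], decoded: Sequence[float]) -> float:
    """Unweighted signal-to-noise ratio in dB, used as the fidelity measure."""
    noise = sum((o - d) ** 2 for o, d in zip(original, decoded))
    if noise == 0.0:
        return math.inf
    signal = sum(o * o for o in original)
    if signal == 0.0:
        return -math.inf
    return 10.0 * math.log10(signal / noise)


def encode_with_scheme(x: Sequence[float], scheme: Tuple[int, ...]) -> List[float]:
    """Code each sub-frame of the scheme separately and concatenate the results."""
    if sum(scheme) != len(x):
        raise ValueError("sub-frame lengths must add up to the frame length")
    decoded: List[float] = []
    start = 0
    for length in scheme:
        decoded.extend(toy_subframe_codec(x[start:start + length]))
        start += length
    return decoded


def select_scheme_closed_loop(x: Sequence[float],
                              schemes: Sequence[Tuple[int, ...]]) -> Tuple[Tuple[int, ...], float]:
    """Run every scheme and keep the one with the best fidelity measure."""
    scored = [(snr_db(x, encode_with_scheme(x, s)), s) for s in schemes]
    best_snr, best_scheme = max(scored, key=lambda item: item[0])
    return best_scheme, best_snr


# The three sets of Fig. 3b for a frame of L = 8 samples: (L), (L/2, L/2), (L/4, L/4, L/2).
SCHEMES = [(8,), (4, 4), (2, 2, 4)]
transient_frame = [0.1, 0.1, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
stationary_frame = [0.5] * 8
chosen_transient, snr_transient = select_scheme_closed_loop(transient_frame, SCHEMES)
chosen_stationary, _ = select_scheme_closed_loop(stationary_frame, SCHEMES)
```

In this toy setting, the frame with an early level jump is best served by the set that starts with two short sub-frames, because each short sub-frame gets its own gain, while the stationary frame is coded equally well by all sets and the single long sub-frame is kept.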
In the present embodiment, the lengths of the sub-frames used are selected according to:
$$l_{sf} = \frac{l_f}{2^n}$$
where $l_{sf}$ is the length of the sub-frame, $l_f$ is the length of the encoding frame, and $n$ is an integer. In the present embodiment, $n$ is selected between 0 and 3. However, any frame lengths may be used, as long as the total length of the set is kept constant.
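To give a feel for how many candidate sets such a rule produces, the following sketch enumerates all ordered sets of sub-frames whose lengths are the frame length divided by a power of two and whose total length equals the frame length. The function name `subframe_sets` is hypothetical, and treating every ordering as a distinct set, without any alignment constraint on where a sub-frame may start, is an assumption; the text does not state which combinations are actually admitted.

```python
from typing import List, Tuple


def subframe_sets(frame_len: int, max_n: int = 3) -> List[Tuple[int, ...]]:
    """List all ordered sets of sub-frame lengths frame_len / 2**n, 0 <= n <= max_n,
    whose lengths add up to frame_len."""
    if frame_len <= 0 or frame_len % (2 ** max_n) != 0:
        raise ValueError("frame_len must be a positive multiple of 2**max_n")
    allowed = [frame_len // (2 ** n) for n in range(max_n + 1)]
    results: List[Tuple[int, ...]] = []

    def extend(remaining: int, prefix: List[int]) -> None:
        # A set is complete once the sub-frames fill the whole frame.
        if remaining == 0:
            results.append(tuple(prefix))
            return
        for length in allowed:
            if length <= remaining:
                extend(remaining - length, prefix + [length])

    extend(frame_len, [])
    return results


# Frame of 8 samples with n between 0 and 3, i.e. sub-frames of 8, 4, 2, or 1 samples.
all_sets = subframe_sets(8)
coarse_sets = subframe_sets(8, max_n=1)
```

The count grows quickly with the largest allowed n, which is relevant to the computational cost of testing every combination in a closed loop.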
Another embodiment of a side signal encoder unit 30 according to the present invention is illustrated in Fig. 3c. Here, the frame length decision is an open-loop decision based on the statistics of the signal. In other words, the spectral characteristics of the side signal are used as a basis for deciding which encoding scheme to use. As before, different encoding schemes, characterized by different sets of sub-frames, are available. In this embodiment, however, the selector 85 is placed before the actual encoding. The input side signal x_side enters the selector 85 and a signal analysis unit 84. The result of the analysis becomes the input to a switch 86, in which only one of the encoding schemes 81 is put into use. The output from that encoding scheme is also the output signal p_side from the side signal encoder unit 30.
The advantage of an open-loop decision is that only one actual encoding has to be performed. The disadvantage, however, is that the analysis of the signal characteristics may be very complicated, and it may be difficult to predict the possible behaviors in advance so that an appropriate choice can be made in the switch 86. Extensive statistical analysis of sound has to be performed and incorporated in the signal analysis unit 84. Any small change in the encoding schemes may turn the statistical behavior upside down.
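An open-loop decision can be sketched as a rule that inspects the frame before any encoding takes place. The sketch below deliberately swaps in a much simpler analysis than the spectral and statistical analysis described for unit 84: it compares short-term energies between quarters and halves of the frame. The function name, the threshold value, and the mapping to the three sub-frame sets of Fig. 3b are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def select_scheme_open_loop(x: Sequence[float], threshold: float = 4.0) -> Tuple[int, ...]:
    """Pick a set of sub-frame lengths from simple energy statistics of the frame.

    Hypothetical rule:
      * strong energy change between the first two quarters -> (L/4, L/4, L/2)
      * strong energy change between the two halves         -> (L/2, L/2)
      * otherwise                                           -> (L,)
    """
    n = len(x)
    if n == 0 or n % 4 != 0:
        raise ValueError("frame length must be a positive multiple of 4")
    q = n // 4
    energies = [sum(v * v for v in x[i * q:(i + 1) * q]) for i in range(4)]
    floor = 1e-12  # avoids division by zero for silent segments

    def ratio(e1: float, e2: float) -> float:
        return (max(e1, e2) + floor) / (min(e1, e2) + floor)

    if ratio(energies[0], energies[1]) > threshold:
        return (q, q, 2 * q)
    if ratio(energies[0] + energies[1], energies[2] + energies[3]) > threshold:
        return (2 * q, 2 * q)
    return (n,)


early_transient = select_scheme_open_loop([0.1, 0.1, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
mid_frame_change = select_scheme_open_loop([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
stationary = select_scheme_open_loop([0.5] * 8)
```

Only the selected scheme is then run, which is the computational advantage of the open-loop approach; the difficulty noted in the text lies in designing an analysis rule that reliably predicts which scheme would give the best fidelity.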
With closed-loop selection (Fig. 3b), encoding schemes can be exchanged without any changes to the rest of the unit. On the other hand, if many encoding schemes are to be investigated, the computational requirements will be high.
The benefit of such variable frame length encoding of the side signal is that one can choose between fine temporal resolution with coarse frequency resolution on the one hand, and coarse temporal resolution with fine frequency resolution on the other. The above embodiments preserve the stereo image in the best possible manner.
There are also some requirements on the actual encoding used in the different encoding schemes. In particular, when closed-loop selection is used, the computational resources for performing a number of more or less simultaneous encodings must be large. The more complicated the encoding process, the more computational power is needed. Furthermore, a low transmission bit rate is also preferable.
At US 5,434, the method that provides in 948 has been used the filtered version of monophony (master) signal compare side signal or difference signal.The parameter of wave filter is optimised, and allows to change in time.The filter parameter of representing the coding of side signal then is sent out.In one embodiment, also send a residual side signal.Under many situations, this method may be as side coding method within the scope of the present invention.Yet this method has some defectives.Because filter order must very highly provide accurate side signal to estimate, so the quantification of filter coefficient and any residual side signal needs relative higher transmission bit rate usually.The estimation of wave filter self also has problem, particularly in instantaneous abundant music.Evaluated error will provide the side signal of a modification, its sometimes aspect amplitude the signal than unmodified big.This will cause higher bit rate needs.And, if one group of new filter coefficient is calculated in every N sampling, then need these filter coefficients of interpolation to produce level and smooth conversion, as discussed above from one group of filter coefficient to another group.The interpolation of filter coefficient is the task of a complexity, and the error in interpolation will show as big side error signal, thereby causes the required higher bit rate of difference error signal coder.
One way to avoid the need for interpolation is to update the filter coefficients on a sample-by-sample basis and to rely on backward-adaptive analysis. For this to work well, the bit rate of the residual encoder has to be fairly high. It is therefore not a good alternative for low-rate stereo coding.
There are cases, quite common in music for example, where the mono signal and the difference signal are almost uncorrelated. The filter estimation then becomes very difficult, with the added risk of just making things worse for the difference error signal encoder.
The solution according to US 5,434,948 can work well in cases where the filter coefficients vary very slowly in time, for example in conference telephony systems. For music signals the approach does not work well, since the filters need to change very fast to track the stereo image. This means that sub-frame lengths of very different magnitudes have to be used, which in turn means that the number of combinations to be tested increases rapidly. The requirements for computing all possible encoding schemes thereby become impractically high.
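How quickly the number of candidate encoding schemes grows can be illustrated with a small Python sketch. It assumes, for illustration only, that sub-frame lengths are restricted to l_f/2^n with n up to a limit n_max and that any ordering of such sub-frames that exactly fills the frame counts as one scheme. The function name enumerate_schemes and the parameter n_max are hypothetical and do not appear in the original text.

```python
def enumerate_schemes(n_max):
    """List every ordered split of one frame into sub-frames of length l_f / 2**n, n <= n_max.

    Lengths are expressed in units of the shortest sub-frame, so the whole
    frame has length 2**n_max. Allowing any ordering is an assumption.
    """
    frame_len = 2 ** n_max
    allowed = [2 ** k for k in range(n_max + 1)]  # 1, 2, 4, ... in shortest-sub-frame units

    def extend(prefix, remaining):
        # Recursively append allowed sub-frame lengths until the frame is filled.
        if remaining == 0:
            yield tuple(prefix)
            return
        for length in allowed:
            if length <= remaining:
                yield from extend(prefix + [length], remaining - length)

    return list(extend([], frame_len))


if __name__ == "__main__":
    for n_max in (1, 2, 3):
        print(n_max, len(enumerate_schemes(n_max)))
```

Under these assumptions the sketch yields 2, 6 and 56 schemes for n_max = 1, 2 and 3, which indicates why a closed-loop search needs a computationally cheap encoding in each sub-frame.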
In a preferred embodiment, the side signal is therefore encoded based on the idea of reducing the redundancy between the mono signal and the side signal by using a simple balance factor instead of a complex, bit-rate-consuming predictor filter. The residual of this operation is then encoded. The magnitude of this residual is relatively low and does not call for a very high bit rate for transmission. This idea is very well suited to being combined with the variable frame set approach described earlier, since the computational complexity is low.
Using a balance factor combined with the variable frame length approach removes the need for complex interpolation and the associated problems that interpolation may cause. Moreover, using a simple balance factor instead of a complex filter gives fewer estimation problems, since possible estimation errors for the balance factor have less impact. The preferred solution is able to reproduce both panned signals and diffuse sound fields with good quality and with limited requirements on bit rate and computational resources.
Fig. 4 illustrates a preferred embodiment of a stereo encoder according to the present invention. This embodiment is very similar to the one shown in Fig. 2a; however, the details of the side signal encoder unit 30 are revealed. The encoder 14 of this embodiment does not have any pre-processing unit, and the input signals are provided directly to the addition and subtraction units 34, 36. The mono signal x_mono is multiplied by a certain balance factor g_sm in a multiplier 33. In a subtraction unit 35, the multiplied mono signal is subtracted from the side signal x_side (which is essentially the difference between the two channels) to produce a side residual signal. The balance factor g_sm is determined by an optimizer 37 based on the content of the mono and side signals, so as to minimize the side residual signal according to a quality criterion. The quality criterion is preferably a least mean square criterion. The side residual signal is encoded in a side residual encoder 39 according to any encoder procedure. Preferably, the side residual encoder 39 is a low bit rate transform encoder or a Codebook Excited Linear Prediction (CELP) encoder. The encoding parameters p_side representing the side signal then comprise the encoding parameters p_sideresidual representing the side residual signal and the optimized balance factor 49.
In the embodiment of Fig. 4, the mono signal 42 used for synthesizing the side signal is the target signal x_mono of the mono encoder 38. As mentioned above (in connection with Fig. 2a), the local synthesis signal of the mono encoder 38 can also be used. In the latter case, the total encoder delay may increase, and the computational complexity for the side signal may increase. On the other hand, the quality may be better, since it then becomes possible to repair coding errors made in the mono encoder.
The basic encoding scheme is described in a more mathematical way below. Denote the two channel signals as a and b; they may be the left and right channels of a stereo pair. The channel signals are combined into a mono signal by addition and into a side signal by subtraction. In equation form, the operations are:
$$x_{mono}(n) = 0.5\,\bigl(a(n) + b(n)\bigr)$$
$$x_{side}(n) = 0.5\,\bigl(a(n) - b(n)\bigr)$$
It is beneficial to scale the x_mono and x_side signals down by a factor of two. Here it is implied that other ways of creating x_mono and x_side exist. One can, for instance, use:
$$x_{mono}(n) = \gamma\,a(n) + (1-\gamma)\,b(n)$$
$$x_{side}(n) = \gamma\,a(n) - (1-\gamma)\,b(n)$$
$$0 \le \gamma \le 1.0$$
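The down-mix equations above can be sketched in Python as follows. This is a minimal illustration, not the encoder of the embodiment: the function names downmix and upmix are assumptions, and the inverse operation is only defined here for 0 < γ < 1, because γ = 0 or γ = 1 discards one channel entirely.

```python
def downmix(a, b, gamma=0.5):
    """Form the mono and side signals from channels a and b (equal-length lists)."""
    mono = [gamma * x + (1.0 - gamma) * y for x, y in zip(a, b)]
    side = [gamma * x - (1.0 - gamma) * y for x, y in zip(a, b)]
    return mono, side


def upmix(mono, side, gamma=0.5):
    """Invert downmix: mono + side = 2*gamma*a and mono - side = 2*(1-gamma)*b."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1 to recover both channels")
    a = [(m + s) / (2.0 * gamma) for m, s in zip(mono, side)]
    b = [(m - s) / (2.0 * (1.0 - gamma)) for m, s in zip(mono, side)]
    return a, b


if __name__ == "__main__":
    mono, side = downmix([1.0, 2.0, -3.0], [0.5, -1.0, 4.0])
    print(mono, side)
    print(upmix(mono, side))
```

With γ = 0.5 the sketch reduces to the first pair of equations, x_mono = 0.5(a + b) and x_side = 0.5(a - b).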
A modified or residual side signal is computed on blocks of the input signal according to:
$$x_{sideresidual}(n) = x_{side}(n) - f(x_{mono}, x_{side})\,x_{mono}(n),$$
where f(x_mono, x_side) is a balance factor function that strives to remove as much as possible from the side signal, based on a block (i.e. sub-frame) of N samples of the side and mono signals. In other words, the balance factor is used to minimize the residual side signal. In the special case where the minimization is made in a mean square sense, this is equivalent to minimizing the energy of the residual side signal x_sideresidual.
In the above-mentioned special case, f(x_mono, x_side) is given by:
$$f(x_{mono}, x_{side}) = \frac{R_{sm}}{R_{mm}}$$
$$R_{mm} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{mono}(n)\,x_{mono}(n)$$
$$R_{sm} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{side}(n)\,x_{mono}(n),$$
where x_side is the side signal and x_mono is the mono signal. Note that the function is based on a block starting at "frame start" and ending at "frame end".
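The least mean square balance factor and the resulting residual can be sketched as below. The function names and the handling of a silent mono block (returning a balance factor of zero) are assumptions made for this illustration; the original text does not specify that case.

```python
def balance_factor(x_side, x_mono):
    """Return R_sm / R_mm for one block (sub-frame) of side and mono samples."""
    r_sm = sum(s * m for s, m in zip(x_side, x_mono))
    r_mm = sum(m * m for m in x_mono)
    # Assumption: a block with no mono energy gets a zero balance factor.
    return r_sm / r_mm if r_mm > 0.0 else 0.0


def side_residual(x_side, x_mono, g):
    """Return x_side(n) - g * x_mono(n) for every sample of the block."""
    return [s - g * m for s, m in zip(x_side, x_mono)]


if __name__ == "__main__":
    source = [1.0, -2.0, 3.0, 0.5]
    # A source panned so that a = 0.8*source and b = 0.2*source.
    x_mono = [0.5 * v for v in source]
    x_side = [0.3 * v for v in source]
    g = balance_factor(x_side, x_mono)
    print(g, side_residual(x_side, x_mono, g))
```

For a purely panned source such as the one in the usage example, the side signal is a scaled copy of the mono signal, so the balance factor alone describes it and the residual vanishes up to rounding.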
It is possible to add weighting in the frequency domain to the computation of the balance factor. This is done by convolving the x_side and x_mono signals with the impulse response of a weighting filter. It is then possible to move the estimation error to a frequency range where it is less easily heard. This is referred to as perceptual weighting.
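A sketch of such a weighted computation is given below. It assumes a short FIR weighting filter given by its impulse response h, and it truncates the convolution to the block length, ignoring filter memory from the preceding block. Both choices, and the function names, are assumptions for illustration; the original text does not specify the weighting filter.

```python
def convolve(x, h):
    """Full linear convolution of sample list x with impulse response h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


def weighted_balance_factor(x_side, x_mono, h):
    """Balance factor R_sm / R_mm computed on the weighted (filtered) signals."""
    w_side = convolve(x_side, h)[:len(x_side)]  # truncate to the block length
    w_mono = convolve(x_mono, h)[:len(x_mono)]
    r_sm = sum(s * m for s, m in zip(w_side, w_mono))
    r_mm = sum(m * m for m in w_mono)
    return r_sm / r_mm if r_mm > 0.0 else 0.0


if __name__ == "__main__":
    x_mono = [1.0, 2.0, 0.0, -1.0]
    x_side = [0.5, 0.0, 1.0, 1.0]
    print(weighted_balance_factor(x_side, x_mono, [1.0]))        # unweighted case
    print(weighted_balance_factor(x_side, x_mono, [1.0, -0.9]))  # hypothetical weighting
```

With h = [1.0] the weighting is the identity and the result equals the unweighted balance factor.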
A quantized version of the balance factor value given by the function f(x_mono, x_side) is transmitted to the decoder. It is preferable to account for the quantization already when the modified side signal is generated. The following expressions are then obtained:
$$x_{sideresidual}(n) = x_{side}(n) - g_Q\,x_{mono}(n)$$
$$g_Q = Q_g^{-1}\!\left(Q_g\!\left(\frac{R_{sm}}{R_{mm}}\right)\right).$$
Q_g(...) is a quantization function that is applied to the balance factor given by the function f(x_mono, x_side). The balance factor is transmitted on the transmission channel. For normally left-right panned signals, the balance factor is limited to the interval [-1.0, 1.0]. If, on the other hand, the channels are out of phase with each other, the balance factor may extend beyond these limits.
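The role of Q_g and its inverse can be illustrated with a uniform scalar quantizer. The quantizer itself, its step size of 0.125 and its range of [-2.0, 2.0] are purely hypothetical choices for this sketch; the original text does not specify how the balance factor is quantized. The point shown is that the encoder should form the residual with the dequantized value g_Q, since that is the only value the decoder knows.

```python
def quantize_gain(g, step=0.125, g_min=-2.0, g_max=2.0):
    """Q_g: map a balance factor to an integer index (hypothetical uniform quantizer)."""
    g = min(max(g, g_min), g_max)  # clamp to the representable range
    return round((g - g_min) / step)


def dequantize_gain(index, step=0.125, g_min=-2.0):
    """Inverse of Q_g: map a transmitted index back to the balance factor g_Q."""
    return g_min + index * step


def quantized_residual(x_side, x_mono):
    """Return (index, g_Q, residual) with the residual formed from g_Q, not from g."""
    r_sm = sum(s * m for s, m in zip(x_side, x_mono))
    r_mm = sum(m * m for m in x_mono)
    g = r_sm / r_mm if r_mm > 0.0 else 0.0
    index = quantize_gain(g)
    g_q = dequantize_gain(index)
    residual = [s - g_q * m for s, m in zip(x_side, x_mono)]
    return index, g_q, residual


if __name__ == "__main__":
    x_mono = [0.5, -1.0, 1.5, 0.25]
    x_side = [0.3, -0.6, 0.9, 0.15]
    print(quantized_residual(x_side, x_mono))
```

The range is deliberately wider than [-1.0, 1.0] in this sketch so that out-of-phase channels can still be represented.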
As an optional means of stabilizing the stereo image, the balance factor can be limited if the normalized cross-correlation between the mono signal and the side signal is poor, as given by:
$$g_Q = Q_g^{-1}\!\left(Q_g\!\left(\left|\bar{R}_{sm}\right|\,\frac{R_{sm}}{R_{mm}}\right)\right),$$
where
$$\bar{R}_{sm} = \frac{R_{sm}}{\sqrt{R_{ss}\cdot R_{mm}}}$$
$$R_{sm} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{side}(n)\,x_{mono}(n),$$
and R_ss and R_mm are the energies of the side signal and the mono signal over the same block.
These situations occur quite frequently in classical music or studio music with a large amount of diffuse sound, where in some cases the a and b channels may almost cancel each other out when the mono signal is created. The effect on the balance factor is that it can jump rapidly, causing a confused stereo image. The adjustment above alleviates this problem.
The filter-based approach in US 5,434,948 has similar problems, but in that case the solution is not so simple.
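The limiting of the balance factor by the normalized cross-correlation can be sketched as follows, before quantization. The sketch assumes that the normalization uses the square root of the product of the side and mono energies, which is the usual definition of a normalized cross-correlation; the function name and the zero-energy handling are likewise assumptions.

```python
import math


def limited_balance_factor(x_side, x_mono):
    """Return |R_sm_normalized| * R_sm / R_mm for one block of samples."""
    r_sm = sum(s * m for s, m in zip(x_side, x_mono))
    r_mm = sum(m * m for m in x_mono)
    r_ss = sum(s * s for s in x_side)
    if r_mm == 0.0 or r_ss == 0.0:
        return 0.0  # assumption: silent blocks get a zero balance factor
    r_norm = r_sm / math.sqrt(r_ss * r_mm)  # lies in [-1, 1]
    return abs(r_norm) * r_sm / r_mm


if __name__ == "__main__":
    x_mono = [1.0, 0.0, 1.0, 0.0]
    print(limited_balance_factor([0.6, 0.0, 0.6, 0.0], x_mono))   # fully correlated
    print(limited_balance_factor([0.0, 1.0, 0.0, -1.0], x_mono))  # uncorrelated
```

When the two signals are fully correlated the factor is left unchanged, and as the correlation drops the factor is pulled towards zero, which damps the rapid jumps described above.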
If E_s is the encoding function (e.g. a transform encoder) of the residual side signal and E_m is the encoding function of the mono signal, then the decoded signals a″ and b″ at the decoder end can be described as (assuming γ = 0.5 here):
$$a''(n) = (1 + g_Q)\,x''_{mono}(n) + x''_{side}(n)$$
$$b''(n) = (1 - g_Q)\,x''_{mono}(n) - x''_{side}(n)$$
$$x''_{side} = E_s^{-1}\bigl(E_s(x_{sideresidual})\bigr)$$
$$x''_{mono} = E_m^{-1}\bigl(E_m(x_{mono})\bigr)$$
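The reconstruction equations can be checked with a short Python sketch in which the encoding functions E_s and E_m are replaced by lossless pass-through, an idealization made only for this illustration. Under that assumption the decoded channels should equal the input channels for any value of g_Q. The function names are hypothetical.

```python
def encode_side_residual(a, b, g_q):
    """Encoder side: mono, side (gamma = 0.5) and the residual formed with g_Q."""
    mono = [0.5 * (x + y) for x, y in zip(a, b)]
    side = [0.5 * (x - y) for x, y in zip(a, b)]
    residual = [s - g_q * m for s, m in zip(side, mono)]
    return mono, residual


def reconstruct_stereo(mono_dec, residual_dec, g_q):
    """Decoder side: a'' = (1 + g_Q) mono'' + res'', b'' = (1 - g_Q) mono'' - res''."""
    a_dec = [(1.0 + g_q) * m + r for m, r in zip(mono_dec, residual_dec)]
    b_dec = [(1.0 - g_q) * m - r for m, r in zip(mono_dec, residual_dec)]
    return a_dec, b_dec


if __name__ == "__main__":
    a = [1.0, 2.0, -3.0, 0.25]
    b = [0.5, -1.0, 4.0, 0.75]
    mono, residual = encode_side_residual(a, b, g_q=0.625)
    # Lossless pass-through stands in for E_m and E_s here.
    print(reconstruct_stereo(mono, residual, g_q=0.625))
```

With real, lossy encoders the reconstruction error is determined by the coding errors in the mono and residual signals, not by the choice of balance factor.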
One important benefit of computing the balance factor for each frame is that the use of interpolation is avoided. Instead, the frame processing is normally performed with overlapping frames, as described above.
The encoding principle using balance factors works particularly well for music signals, where fast changes are typically needed to track the stereo image.
Recently, multi-channel coding has become popular. One example is 5.1-channel surround sound in DVD movies. The channels there are arranged as: front left, front centre, front right, rear left, rear right and subwoofer. Fig. 5 shows an embodiment of an encoder that encodes the three front channels of such an arrangement, exploiting inter-channel redundancies according to the present invention.
Three channel signals L, C, R are provided on three inputs 16A-C, and the mono signal x_mono is created as a sum of all three signals. A centre signal encoder unit 130 is added, which receives the centre signal x_centre. The mono signal 42 is in this embodiment the encoded and decoded mono signal x″_mono, and it is multiplied by a certain balance factor g_Q in a multiplier 133. In a subtraction unit 135, the multiplied mono signal is subtracted from the centre signal x_centre to produce a centre residual signal. The balance factor g_Q is determined by an optimizer 137 based on the content of the mono and centre signals, so as to minimize the centre residual signal according to a quality criterion. The centre residual signal is encoded in a centre residual encoder 139 according to any encoder procedure. Preferably, the centre residual encoder 139 is a low bit rate transform encoder or a CELP encoder. The encoding parameters p_centre representing the centre signal then comprise the encoding parameters p_centreresidual representing the centre residual signal and the optimized balance factor 149. In an addition unit 235, the centre residual signal and the scaled mono signal are added, creating a modified centre signal 142 that is compensated for coding errors.
As in the previous embodiments, the side signal x_side, i.e. the difference between the left L and right R channels, is provided to the side signal encoder unit 30. Here, however, the optimizer 37 is also provided with the modified centre signal 142 from the centre signal encoder unit 130. The side residual signal is therefore created in the subtraction unit 35 as an optimum linear combination of the mono signal 42, the modified centre signal 142 and the side signal.
The variable frame length concept described above can be applied to either of the side and centre signals, or to both.
Fig. 6 illustrates a decoder unit suitable for receiving encoded audio signals from the encoder unit of Fig. 5. The received signal 54 is divided into encoding parameters p_mono representing the main signal, encoding parameters p_centre representing the centre signal and encoding parameters p_side representing the side signal. In a decoder 64, the encoding parameters p_mono representing the main signal are used to generate the main signal x″_mono. In a decoder 160, the encoding parameters p_centre representing the centre signal are used to generate the centre signal x″_centre, based on the main signal x″_mono. In a decoder 60, the encoding parameters p_side representing the side signal are decoded based on the main signal x″_mono and the centre signal x″_centre, thereby generating the side signal x″_side.
The procedure can be expressed mathematically as follows.
The input signals x_left, x_right and x_centre are combined into a mono channel according to:
$$x_{mono}(n) = \alpha\,x_{left}(n) + \beta\,x_{right}(n) + \chi\,x_{centre}(n).$$
For simplicity, α, β and χ are set to 1.0 in the remainder of this description, but they can be set to arbitrary values. The values of α, β and χ can be constant, or can depend on the signal content in order to emphasize one or two channels and thereby obtain an optimal quality.
The normalized cross-correlation between the mono signal and the centre signal is computed as:
$$\bar{R}_{cm} = \frac{R_{cm}}{\sqrt{R_{cc}\cdot R_{mm}}},$$
where
$$R_{cc} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{centre}(n)\,x_{centre}(n)$$
$$R_{mm} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{mono}(n)\,x_{mono}(n)$$
$$R_{cm} = \sum_{n=\text{frame start}}^{\text{frame end}} x_{centre}(n)\,x_{mono}(n).$$
x_centre is the centre signal and x_mono is the mono signal. The mono signal comes from the mono target signal, but it is also possible to use the local synthesis of the mono encoder.
The centre residual signal to be encoded is:
$$x_{centreresidual}(n) = x_{centre}(n) - g_Q\,x_{mono}(n)$$
$$g_Q = Q_g^{-1}\!\left(Q_g\!\left(\frac{R_{cm}}{R_{mm}}\right)\right).$$
Q_g(...) is a quantization function that is applied to the balance factor. The balance factor is transmitted on the transmission channel.
If E_c is the encoding function (e.g. a transform encoder) of the centre residual signal and E_m is the encoding function of the mono signal, then the decoded signal x″_centre at the decoder end is described as:
$$x''_{centre}(n) = g_Q\,x''_{mono}(n) + x''_{centreresidual}(n)$$
$$x''_{centreresidual} = E_c^{-1}\bigl(E_c(x_{centreresidual})\bigr)$$
$$x''_{mono} = E_m^{-1}\bigl(E_m(x_{mono})\bigr)$$
The side residual signal to be encoded is:
$$x_{sideresidual}(n) = \bigl(x_{left}(n) - x_{right}(n)\bigr) - g_{Qsm}\,x''_{mono}(n) - g_{Qsc}\,x''_{centre}(n),$$
where g_Qsm and g_Qsc are quantized values of the parameters g_sm and g_sc, which minimize the expression:
$$\sum_{n=\text{frame start}}^{\text{frame end}} \Bigl|\bigl(x_{left}(n) - x_{right}(n)\bigr) - g_{sm}\,x''_{mono}(n) - g_{sc}\,x''_{centre}(n)\Bigr|^{\eta}.$$
η can, for instance, be equal to 2 for a least mean square minimization of the error. The parameters g_sm and g_sc can be quantized jointly or separately.
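For η = 2 the two parameters g_sm and g_sc follow from a 2×2 system of normal equations, which the following Python sketch solves in closed form before quantization. The function name, the relative determinant threshold and the fallback to a mono-only factor when the mono and centre signals are collinear are assumptions made for this illustration.

```python
def joint_balance_factors(x_diff, x_mono, x_centre):
    """Least-squares (g_sm, g_sc) minimizing sum |x_diff - g_sm*x_mono - g_sc*x_centre|^2.

    x_diff is x_left - x_right; x_mono and x_centre are the decoded signals.
    """
    r_mm = sum(m * m for m in x_mono)
    r_cc = sum(c * c for c in x_centre)
    r_mc = sum(m * c for m, c in zip(x_mono, x_centre))
    r_dm = sum(d * m for d, m in zip(x_diff, x_mono))
    r_dc = sum(d * c for d, c in zip(x_diff, x_centre))
    det = r_mm * r_cc - r_mc * r_mc
    if det <= 1e-12 * r_mm * r_cc:
        # Assumption: near-collinear (or silent) signals fall back to mono only.
        return (r_dm / r_mm if r_mm > 0.0 else 0.0), 0.0
    g_sm = (r_dm * r_cc - r_dc * r_mc) / det
    g_sc = (r_dc * r_mm - r_dm * r_mc) / det
    return g_sm, g_sc


if __name__ == "__main__":
    x_mono = [1.0, 2.0, 0.0, -1.0]
    x_centre = [0.0, 1.0, 1.0, 2.0]
    x_diff = [0.5 * m - 0.25 * c for m, c in zip(x_mono, x_centre)]
    print(joint_balance_factors(x_diff, x_mono, x_centre))
```

Because the solution is closed-form, it remains cheap enough to be evaluated separately in every sub-frame of every candidate encoding scheme.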
If E_s is the encoding function of the side residual signal, then the decoded channel signals x″_left and x″_right are given by:
$$x''_{left}(n) = x''_{mono}(n) - x''_{centre}(n) + x''_{side}(n)$$
$$x''_{right}(n) = x''_{mono}(n) - x''_{centre}(n) - x''_{side}(n)$$
$$x''_{side}(n) = x''_{sideresidual}(n) + g_{Qsm}\,x''_{mono}(n) + g_{Qsc}\,x''_{centre}(n)$$
$$x''_{sideresidual} = E_s^{-1}\bigl(E_s(x_{sideresidual})\bigr).$$
One of the most annoying perceptual artifacts is the pre-echo effect. Figs. 7a-b illustrate such an artifact. Assume a signal component has the time development shown by curve 100. At the beginning, starting from t0, no signal component is present in the audio sample. At a time t between t1 and t2, the signal component suddenly appears. When the signal component is encoded using a frame length of t2-t1, the occurrence of the signal component is "smeared out" over the entire frame, as indicated by curve 101. When curve 101 is decoded, the signal component appears a time Δt before the intended appearance of the signal component, and a "pre-echo" is thus perceived.
The pre-echo artifact becomes more accentuated if long encoding frames are used. By using shorter frames, the artifact is suppressed somewhat. Another way to deal with the pre-echo problem is to exploit the fact that the mono signal is available at both the encoder end and the decoder end. This makes it possible to scale the side signal according to the energy profile of the mono signal. At the decoder end, the inverse scaling is performed, and some of the pre-echo problems can thereby be alleviated.
The energy profile of the mono signal is computed over the entire frame as:
$$E_c(m) = \sum_{n=m-L}^{m+L} w(n)\,x_{mono}^2(n), \qquad \text{frame start} \le m \le \text{frame end},$$
where w(n) is a windowing function. The simplest windowing function is a rectangular window, but other window types, such as a Hamming window, may be more desirable.
The side residual signal is then scaled as:
$$\bar{x}_{sideresidual}(n) = \frac{x_{sideresidual}(n)}{\sqrt{E_c(n)}}, \qquad \text{frame start} \le n \le \text{frame end}.$$
The above equation can be written in a more general form as:
$$\bar{x}_{sideresidual}(n) = \frac{x_{sideresidual}(n)}{f\bigl(E_c(n)\bigr)}, \qquad \text{frame start} \le n \le \text{frame end},$$
where f(...) is a monotonic continuous function. In the decoder, the energy profile is computed on the decoded mono signal and is applied to the decoded side signal as:
$$x''_{side}(n) = x''_{side}(n)\,f\bigl(E_c(n)\bigr), \qquad \text{frame start} \le n \le \text{frame end}.$$
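The scaling and its inverse can be sketched in Python as below, using a rectangular window and the square root as the monotonic function f. The small energy floor that avoids division by zero in silent passages, the treatment of samples outside the block as zero, and the function names are assumptions made for this illustration.

```python
import math


def energy_profile(x_mono, half_width, floor=1e-9):
    """E_c(m): sum of x_mono(n)^2 over n = m-L .. m+L with a rectangular window.

    Samples outside the block are treated as zero, and the result is floored
    so that the later division is always defined (both are assumptions).
    """
    total = len(x_mono)
    profile = []
    for m in range(total):
        lo = max(0, m - half_width)
        hi = min(total, m + half_width + 1)
        energy = sum(v * v for v in x_mono[lo:hi])
        profile.append(max(energy, floor))
    return profile


def scale_residual(residual, profile):
    """Encoder: divide each residual sample by sqrt(E_c(n))."""
    return [r / math.sqrt(e) for r, e in zip(residual, profile)]


def unscale_residual(scaled, profile):
    """Decoder: multiply each decoded sample by sqrt(E_c(n))."""
    return [s * math.sqrt(e) for s, e in zip(scaled, profile)]


if __name__ == "__main__":
    x_mono = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.5]
    profile = energy_profile(x_mono, half_width=1)
    scaled = scale_residual([0.1, -0.2, 0.3, 0.4, -0.5, 0.6, 0.7], profile)
    print(profile)
    print(unscale_residual(scaled, profile))
```

In a real codec the decoder computes the profile from the decoded mono signal, so the encoder would have to use the same decoded signal for the two profiles to match exactly.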
Since energy profile scaling is, to some extent, an alternative to using shorter frame lengths, this concept is particularly well suited to being combined with the variable frame length concept described further above. By having some encoding schemes that apply energy profile scaling, some that do not, and some that apply energy profile scaling only during certain sub-frames, a more flexible set of encoding schemes can be provided. Fig. 8 illustrates an embodiment of a side signal encoder unit 30 according to the present invention. Here, the different encoding schemes 81 comprise hatched sub-frames, representing encoding that applies energy profile scaling, and un-hatched sub-frames, representing encoding procedures that do not apply energy profile scaling. In this manner, not only combinations of sub-frames of different lengths are available, but also combinations of sub-frames with different encoding principles. In the present illustrative example, the application of energy profile scaling differs between the encoding schemes. In the more general case, any encoding principle can be combined with the variable length concept in an analogous manner.
The set of encoding schemes of Fig. 8 comprises schemes that handle artifacts such as pre-echo in different ways. In some schemes, longer sub-frames with pre-echo minimization according to the energy profile principle are used. In other schemes, shorter sub-frames without energy profile scaling are utilized. Depending on the signal content, one of the alternatives may be more advantageous. For very severe pre-echo cases, encoding schemes that use short sub-frames with energy profile scaling may be necessary.
The proposed solution can be used in the full frequency band or in one or more distinct sub-bands. The use of sub-bands can be applied to both the main and side signals, or to one of them separately. A preferred embodiment comprises a split of the side signal into several frequency bands. The reason is simply that it is easier to remove the possible redundancy in an isolated frequency band than in the entire frequency band. This is particularly important when coding signals with rich spectral content.
One possible use is to encode the frequency band below a predetermined threshold with the above method. The predetermined threshold is preferably 2 kHz, or even more preferably 1 kHz. For the remaining part of the frequency range of interest, one may either encode a further frequency band with the above method or use a completely different method.
One motivation for preferably using the above method at low frequencies is that diffuse sound fields generally have little energy content at high frequencies. The natural reason is that sound absorption typically increases with frequency. Diffuse sound field components also seem to play a less important role for the human auditory system at higher frequencies. It is therefore beneficial to employ the solution at low frequencies (below 1 or 2 kHz) and to rely on other, more bit-efficient encoding schemes at higher frequencies. Applying the scheme only at low frequencies gives a large saving in bit rate, since the bit rate needed by the proposed method is proportional to the required bandwidth. In most cases, the mono encoder can encode the entire frequency band, while the proposed side signal encoding is suggested to be performed only in the lower part of the frequency band, as schematically illustrated in Fig. 9. Reference number 301 refers to an encoding scheme of the side signal according to the present invention, reference number 302 refers to any other encoding scheme of the side signal, and reference number 303 refers to an encoding scheme of the mono signal.
It is also possible to use the proposed method for several distinct frequency bands.
Fig. 10 illustrates, as a flow diagram, the main steps of an embodiment of an encoding method according to the present invention. The procedure starts in step 200. In step 210, a main signal derived from the polyphonic signals is encoded. In step 212, encoding schemes are provided, which comprise sub-frames of different lengths and/or order. In step 214, a side signal derived from the polyphonic signals is encoded with an encoding scheme selected dependent, at least in part, on the actual signal content of the present polyphonic signals. The procedure ends in step 299.
Fig. 11 illustrates, as a flow diagram, the main steps of an embodiment of a decoding method according to the present invention. The procedure starts in step 200. In step 220, a received encoded main signal is decoded. In step 222, encoding schemes are provided, which comprise sub-frames of different lengths and/or order. In step 224, a received side signal is decoded according to a selected encoding scheme. In step 226, the decoded main and side signals are combined into a polyphonic signal. The procedure ends in step 299.
The embodiments described above are to be understood as a few illustrative examples of the present invention. Those skilled in the art will understand that various modifications, combinations and changes may be made to the embodiments without departing from the scope of the present invention. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible. The scope of the present invention is, however, defined by the appended claims.
References
European patent EP 0497413
US patent 5,285,498
US patent 5,434,948
C. Faller et al., "Binaural cue coding applied to stereo and multi-channel audio compression", 112th AES Convention, Munich, Germany, May 2002.

Claims (26)

1. A method of encoding polyphonic signals, comprising the steps of:
generating a first output signal (p_mono), being encoding parameters representing a main signal, based on signals of at least a first and a second channel (a, b; L, R); and
generating a second output signal (p_side), being encoding parameters representing a side signal, based on signals of said at least first and second channels (a, b; L, R) within an encoding frame (80),
characterized by the further steps of:
providing at least two encoding schemes (81), each of said at least two encoding schemes (81) being characterized by a respective set of sub-frames (90) together constituting the encoding frame (80), whereby the sum of the lengths of the sub-frames (90) in each encoding scheme (81) is equal to the length of said encoding frame (80);
each set of sub-frames (90) comprising at least one sub-frame (90);
whereby the step of generating the second output signal (p_side) comprises the step of selecting an encoding scheme (81) dependent, at least in part, on the signal content of the present side signal (x_side);
said second output signal (p_side) being encoded separately in each sub-frame (90) of the selected set of sub-frames (90).
2. The method according to claim 1, characterized in that the step of generating the second output signal (p_side) in turn comprises the steps of:
generating encoding parameters representing a side signal (x_side), being a first linear combination of signals of said at least first and second channels (a, b; L, R), separately in all sub-frames (90) of each of said at least two sets of sub-frames (90);
calculating a total fidelity measure for each of said at least two encoding schemes (81); and
selecting the encoded signal from the encoding scheme (81) having the best fidelity measure as the encoding parameters (p_side) representing said side signal.
3. The method according to claim 2, characterized in that the fidelity measure is based on a signal-to-noise measure.
4. The method according to claim 1, characterized in that the sub-frames (90) have lengths l_sf according to:
$$l_{sf} = l_f / 2^n,$$
where l_f is the length of the encoding frame (80) and n is an integer.
5. The method according to claim 4, characterized in that n is smaller than a predetermined value.
6. The method according to claim 5, characterized in that said at least two encoding schemes (81) comprise all permutations of sub-frame (90) lengths.
7. The method according to any one of claims 1 to 6, characterized in that the step of generating the encoding parameters (p_mono) representing the main signal in turn comprises the steps of:
creating a main signal (x_mono) as a second linear combination of signals of said at least first and second channels (a, b; L, R); and
encoding said main signal into the encoding parameters (p_mono) representing said main signal,
and in that the step of encoding the side signal in turn comprises the steps of:
creating a side residual signal (x_sideresidual) as the difference between said side signal and said main signal (x_mono) scaled by a balance factor (g_sm);
said balance factor (g_sm) being determined as the factor that minimizes said side residual signal according to a quality criterion; and
encoding said side residual signal and said balance factor (g_sm) into the encoding parameters (p_side) representing said side signal.
8. The method according to claim 7, characterized in that said quality criterion is based on a least mean square measure.
9. The method according to any one of claims 1 to 6, characterized in that the step of encoding the side signal further comprises the step of:
scaling said side signal (x_side) to the energy profile of said main signal (x_mono).
10. The method according to claim 9, characterized in that the scaling of said side signal (x_side) is a division by a factor, said factor being a monotonic continuous function of the energy profile of said main signal (x_mono).
11. The method according to claim 10, characterized in that said monotonic continuous function is a square root function.
12. The method according to claim 10, characterized in that the energy profile E_c of said main signal (x_mono) is computed over a sub-frame according to:
$$E_c(m) = \sum_{n=m-L}^{m+L} w(n)\,x_{mono}^2(n), \qquad \text{frame start} \le m \le \text{frame end},$$
where L is an arbitrary factor, n is a summation index, m is the sample within said sub-frame and w(n) is a windowing function.
13. The method according to claim 12, characterized in that the windowing function is a rectangular window function.
14. The method according to claim 12, characterized in that the windowing function is a Hamming window function.
15. The method according to any one of claims 1 to 6, characterized in that said at least two encoding schemes (81) comprise different encoding principles for said side signal (x_side).
16. The method according to claim 15, characterized in that at least a first encoding scheme of said at least two encoding schemes (81) comprises a first encoding principle for said side signal (x_side) for all sub-frames (90), and at least a second encoding scheme of said at least two encoding schemes (81) comprises a second encoding principle for said side signal (x_side) for all sub-frames (90).
17. The method according to claim 15, characterized in that at least one encoding scheme of said at least two encoding schemes (81) comprises a first encoding principle for said side signal (x_side) for one sub-frame and a second encoding principle for said side signal (x_side) for another sub-frame.
18. The method according to claim 1, characterized in that the step of generating the second output signal (p_side) in turn comprises the steps of:
analyzing the spectral characteristics of a side signal (x_side), being a first linear combination of signals of said at least first and second channels (a, b; L, R);
selecting a set of sub-frames (90) based on the analyzed spectral characteristics; and
encoding the side signal (x_side) separately in all sub-frames (90) of the selected set of sub-frames (90).
19. The method according to any one of claims 1 to 6, characterized in that the step of generating the second output signal (p_side) is applied in a limited frequency band.
20. The method according to claim 19, characterized in that the step of generating the second output signal (p_side) is applied only to frequencies below 2 kHz.
21. The method according to claim 20, characterized in that the step of generating the second output signal (p_side) is applied only to frequencies below 1 kHz.
22. The method according to any one of claims 1 to 6, characterized in that said polyphonic signals represent music signals.
23. A method of decoding polyphonic signals, comprising the steps of:
decoding encoding parameters (p_mono) representing a main signal;
decoding encoding parameters (p_side) representing a side signal within an encoding frame (80); and
combining at least the decoded main signal (x″_mono) and the decoded side signal (x″_side) into signals of at least a first and a second channel (a, b; L, R),
characterized by the steps of:
providing at least two encoding schemes (81), each of said at least two encoding schemes (81) being characterized by a set of sub-frames (90) together constituting the encoding frame (80), whereby the sum of the lengths of the sub-frames (90) in each encoding scheme (81) is equal to the length of said encoding frame (80);
each set of sub-frames (90) comprising at least one sub-frame (90),
whereby the step of decoding the encoding parameters (p_side) representing said side signal in turn comprises the step of decoding the encoding parameters (p_side) representing said side signal separately in the sub-frames (90) of one of said at least two encoding schemes (81).
24. An encoder apparatus (14), comprising:
Input means (16; 16A-C) for polyphonic signals (a, b; L, R, C) comprising at least a first and a second channel (a, b; L, R),
Means (38) for generating a first output signal (p_mono) based on signals of said at least first and second channels (a, b; L, R), wherein said first output signal is coding parameters representing a main signal;
Means (30) for generating a second output signal (p_side) based on signals of said at least first and second channels (a, b; L, R) within an encoding frame (80), wherein said second output signal is coding parameters representing a side signal; and
Output means (52);
Characterized by:
Means for providing at least two encoding schemes (81), each of said at least two encoding schemes (81) being characterized by a respective set of subframes (90) that together constitute the encoding frame (80), whereby the sum of the lengths of the subframes (90) in each encoding scheme (81) equals the length of said encoding frame (80);
Each set of subframes (90) comprising at least one subframe (90);
Whereby the means (30) for generating the second output signal (p_side) in turn comprises means (86; 87) for selecting an encoding scheme depending at least partly on the signal content of the present side signal (x_side); and
Means for encoding said side signal (x_side) separately in each subframe (90) of the selected encoding scheme.
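The scheme selection in claim 24 can be sketched in Python as follows. The claim only requires that the selection depend at least partly on the signal content of the present side signal; the closed-loop search below, the toy one-parameter-per-subframe coder (the subframe mean), and the per-subframe `penalty` term are all assumptions chosen to keep the example small, and the names `encode_with_scheme` and `select_scheme` are hypothetical.

```python
from typing import List, Sequence, Tuple

Scheme = Tuple[int, ...]


def encode_with_scheme(side: Sequence[float], scheme: Scheme) -> Tuple[List[float], float]:
    """Encode each subframe separately with a toy coder (one mean value per subframe).

    Returns the per-subframe parameters and the total squared reconstruction error.
    """
    params: List[float] = []
    error = 0.0
    start = 0
    for length in scheme:
        chunk = side[start:start + length]
        param = sum(chunk) / len(chunk)  # toy "coding parameter" for this subframe
        params.append(param)
        error += sum((sample - param) ** 2 for sample in chunk)
        start += length
    return params, error


def select_scheme(side: Sequence[float], schemes: Sequence[Scheme],
                  penalty: float = 0.1) -> Tuple[int, List[float]]:
    """Try every scheme and keep the one with the lowest cost.

    The cost is the squared error plus an assumed penalty per subframe, which
    stands in for the extra parameters that shorter subframes require.
    """
    best_index, best_params, best_cost = -1, [], float("inf")
    for index, scheme in enumerate(schemes):
        if sum(scheme) != len(side):
            raise ValueError("subframe lengths must sum to the frame length")
        params, error = encode_with_scheme(side, scheme)
        cost = error + penalty * len(scheme)
        if cost < best_cost:
            best_index, best_params, best_cost = index, params, cost
    return best_index, best_params
```

With the hypothetical schemes `((8,), (4, 4), (2, 2, 2, 2))`, a side signal that changes level halfway through the frame selects the two-subframe scheme, while a constant side signal selects the single long subframe.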
25. A decoder apparatus (24), comprising:
Input means (54) for coding parameters (p_mono) representing a main signal and coding parameters (p_side) representing a side signal;
Means (64) for decoding said coding parameters (p_mono) representing the main signal;
Means (60) for decoding the coding parameters (p_side) representing the side signal within an encoding frame (80);
Means (68, 70) for combining at least the decoded main signal (x''_mono) and the decoded side signal (x''_side) into signals of at least a first and a second channel (a, b; L, R); and
Output means (26; 26A-C),
Characterized in that the means (60) for decoding said coding parameters (p_side) representing the side signal in turn comprises:
Means for providing at least two encoding schemes (81), each of said at least two encoding schemes (81) being characterized by a respective set of subframes (90) that together constitute the encoding frame (80), whereby the sum of the lengths of the subframes (90) in each encoding scheme equals the length of said encoding frame (80);
Each set of subframes (90) comprising at least one subframe (90); and
Means for decoding the coding parameters (p_side) representing said side signal separately in the subframes (90) of one of said at least two encoding schemes (81).
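The decoder side of claim 25 can be sketched in the same spirit. In this minimal Python example the side signal is rebuilt subframe by subframe from one parameter per subframe, and the channels are formed as main plus side and main minus side. Both the one-parameter toy coder and the sum/difference combination are assumptions for illustration; the claim itself only states that the decoded main and side signals are combined into at least two channels. The names `decode_side` and `combine` are hypothetical.

```python
from typing import List, Sequence, Tuple


def decode_side(scheme: Sequence[int], params: Sequence[float]) -> List[float]:
    """Rebuild the side signal of one frame, decoding each subframe separately.

    Toy decoder: every subframe is filled with its single decoded parameter.
    """
    if len(scheme) != len(params):
        raise ValueError("expected one parameter per subframe")
    side: List[float] = []
    for length, param in zip(scheme, params):
        side.extend([param] * length)
    return side


def combine(mono: Sequence[float], side: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Combine decoded main and side signals into two channels.

    Assumed mid/side convention: first channel = main + side, second = main - side.
    """
    if len(mono) != len(side):
        raise ValueError("main and side signals must have the same length")
    first = [m + s for m, s in zip(mono, side)]
    second = [m - s for m, s in zip(mono, side)]
    return first, second
```

Because the subframe boundaries follow from the scheme alone, the decoder needs only the identity of the selected scheme and the per-subframe parameters to reconstruct the side signal for the frame.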
26. An audio system (1), comprising at least one of:
An encoder apparatus (14) according to claim 24, and
A decoder apparatus (24) according to claim 25.
CNB2004800186630A 2003-12-19 2004-12-15 Fidelity-optimized variable frame length coding Active CN100559465C (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
SE03035011 2003-12-19
SE0303501A SE0303501D0 (en) 2003-12-19 2003-12-19 Filter-based parametric multi-channel coding
SE04004172 2004-02-20
SE0400417A SE527670C2 (en) 2003-12-19 2004-02-20 Natural fidelity optimized coding with variable frame length
PCT/SE2004/001867 WO2005059899A1 (en) 2003-12-19 2004-12-15 Fidelity-optimised variable frame length encoding

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN200710138487XA Division CN101118747B (en) 2003-12-19 2004-12-15 Fidelity-optimized pre-echo suppression encoding

Publications (2)

Publication Number Publication Date
CN1816847A CN1816847A (en) 2006-08-09
CN100559465C true CN100559465C (en) 2009-11-11

Family

ID=31996354

Family Applications (2)

Application Number Title Priority Date Filing Date
CNB2004800186630A Active CN100559465C (en) 2003-12-19 2004-12-15 Fidelity-optimized variable frame length coding
CN200710138487XA Expired - Fee Related CN101118747B (en) 2003-12-19 2004-12-15 Fidelity-optimized pre-echo suppression encoding

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN200710138487XA Expired - Fee Related CN101118747B (en) 2003-12-19 2004-12-15 Fidelity-optimized pre-echo suppression encoding

Country Status (15)

Country Link
EP (2) EP1623411B1 (en)
JP (2) JP4335917B2 (en)
CN (2) CN100559465C (en)
AT (2) ATE443317T1 (en)
AU (1) AU2004298708B2 (en)
BR (2) BRPI0410856B8 (en)
CA (2) CA2690885C (en)
DE (2) DE602004023240D1 (en)
HK (2) HK1115665A1 (en)
MX (1) MXPA05012230A (en)
PL (1) PL1623411T3 (en)
RU (2) RU2305870C2 (en)
SE (1) SE527670C2 (en)
WO (1) WO2005059899A1 (en)
ZA (1) ZA200508980B (en)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100539742C (en) * 2002-07-12 2009-09-09 皇家飞利浦电子股份有限公司 Multi-channel audio signal decoding method and device
WO2006126858A2 (en) 2005-05-26 2006-11-30 Lg Electronics Inc. Method of encoding and decoding an audio signal
JP4639966B2 (en) * 2005-05-31 2011-02-23 ヤマハ株式会社 Audio data compression method, audio data compression circuit, and audio data expansion circuit
EP1913578B1 (en) 2005-06-30 2012-08-01 LG Electronics Inc. Method and apparatus for decoding an audio signal
JP2009500656A (en) 2005-06-30 2009-01-08 エルジー エレクトロニクス インコーポレイティド Apparatus and method for encoding and decoding audio signals
WO2007004830A1 (en) 2005-06-30 2007-01-11 Lg Electronics Inc. Apparatus for encoding and decoding audio signal and method thereof
US8108219B2 (en) * 2005-07-11 2012-01-31 Lg Electronics Inc. Apparatus and method of encoding and decoding audio signal
JP4859925B2 (en) 2005-08-30 2012-01-25 エルジー エレクトロニクス インコーポレイティド Audio signal decoding method and apparatus
EP1941497B1 (en) 2005-08-30 2019-01-16 LG Electronics Inc. Apparatus for encoding and decoding audio signal and method thereof
JP5173811B2 (en) 2005-08-30 2013-04-03 エルジー エレクトロニクス インコーポレイティド Audio signal decoding method and apparatus
US7788107B2 (en) 2005-08-30 2010-08-31 Lg Electronics Inc. Method for decoding an audio signal
US7751485B2 (en) 2005-10-05 2010-07-06 Lg Electronics Inc. Signal processing using pilot based coding
WO2007040353A1 (en) 2005-10-05 2007-04-12 Lg Electronics Inc. Method and apparatus for signal processing
US8068569B2 (en) 2005-10-05 2011-11-29 Lg Electronics, Inc. Method and apparatus for signal processing and encoding and decoding
US7696907B2 (en) 2005-10-05 2010-04-13 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
KR100857119B1 (en) 2005-10-05 2008-09-05 엘지전자 주식회사 Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US7672379B2 (en) 2005-10-05 2010-03-02 Lg Electronics Inc. Audio signal processing, encoding, and decoding
US7646319B2 (en) 2005-10-05 2010-01-12 Lg Electronics Inc. Method and apparatus for signal processing and encoding and decoding method, and apparatus therefor
US20070092086A1 (en) 2005-10-24 2007-04-26 Pang Hee S Removing time delays in signal paths
WO2007080211A1 (en) * 2006-01-09 2007-07-19 Nokia Corporation Decoding of binaural audio signals
US8204740B2 (en) 2006-02-06 2012-06-19 Telefonaktiebolaget Lm Ericsson (Publ) Variable frame offset coding
US7461106B2 (en) 2006-09-12 2008-12-02 Motorola, Inc. Apparatus and method for low complexity combinatorial coding of signals
US8576096B2 (en) 2007-10-11 2013-11-05 Motorola Mobility Llc Apparatus and method for low complexity combinatorial coding of signals
US8209190B2 (en) 2007-10-25 2012-06-26 Motorola Mobility, Inc. Method and apparatus for generating an enhancement layer within an audio coding system
US7889103B2 (en) 2008-03-13 2011-02-15 Motorola Mobility, Inc. Method and apparatus for low complexity combinatorial coding of signals
US8639519B2 (en) 2008-04-09 2014-01-28 Motorola Mobility Llc Method and apparatus for selective signal coding based on core encoder performance
EP2124486A1 (en) * 2008-05-13 2009-11-25 Clemens Par Angle-dependent operating device or method for generating a pseudo-stereophonic audio signal
JP5122681B2 (en) 2008-05-23 2013-01-16 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Parametric stereo upmix device, parametric stereo decoder, parametric stereo downmix device, and parametric stereo encoder
US20110137661A1 (en) * 2008-08-08 2011-06-09 Panasonic Corporation Quantizing device, encoding device, quantizing method, and encoding method
EP2347411B1 (en) * 2008-09-17 2012-12-05 France Télécom Pre-echo attenuation in a digital audio signal
JP5309944B2 (en) 2008-12-11 2013-10-09 富士通株式会社 Audio decoding apparatus, method, and program
US8175888B2 (en) 2008-12-29 2012-05-08 Motorola Mobility, Inc. Enhanced layered gain factor balancing within a multiple-channel audio coding system
US8219408B2 (en) 2008-12-29 2012-07-10 Motorola Mobility, Inc. Audio signal decoder and method for producing a scaled reconstructed audio signal
US8200496B2 (en) 2008-12-29 2012-06-12 Motorola Mobility, Inc. Audio signal decoder and method for producing a scaled reconstructed audio signal
US8140342B2 (en) 2008-12-29 2012-03-20 Motorola Mobility, Inc. Selective scaling mask computation based on peak detection
JP5793675B2 (en) * 2009-07-31 2015-10-14 パナソニックIpマネジメント株式会社 Encoding device and decoding device
JP5295380B2 (en) * 2009-10-20 2013-09-18 パナソニック株式会社 Encoding device, decoding device and methods thereof
EP2346028A1 (en) * 2009-12-17 2011-07-20 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. An apparatus and a method for converting a first parametric spatial audio signal into a second parametric spatial audio signal
WO2011076285A1 (en) * 2009-12-23 2011-06-30 Nokia Corporation Sparse audio
US8442837B2 (en) 2009-12-31 2013-05-14 Motorola Mobility Llc Embedded speech and audio coding using a switchable model core
US8423355B2 (en) 2010-03-05 2013-04-16 Motorola Mobility Llc Encoder for audio signal including generic audio and speech frames
US8428936B2 (en) 2010-03-05 2013-04-23 Motorola Mobility Llc Decoder for audio signal including generic audio and speech frames
EP2544465A1 (en) 2011-07-05 2013-01-09 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Method and apparatus for decomposing a stereo recording using frequency-domain processing employing a spectral weights generator
US9129600B2 (en) 2012-09-26 2015-09-08 Google Technology Holdings LLC Method and apparatus for encoding an audio signal
MX345692B (en) * 2012-11-15 2017-02-10 Ntt Docomo Inc Audio coding device, audio coding method, audio coding program, audio decoding device, audio decoding method, and audio decoding program.
MX2021005090A (en) 2015-09-25 2023-01-04 Voiceage Corp Method and system for encoding a stereo sound signal using coding parameters of a primary channel to encode a secondary channel.
CN107742521B (en) 2016-08-10 2021-08-13 华为技术有限公司 Coding method and coder for multi-channel signal
CN109215668B (en) * 2017-06-30 2021-01-05 华为技术有限公司 Method and device for encoding inter-channel phase difference parameters
CN115831130A (en) 2018-06-29 2023-03-21 华为技术有限公司 Coding method, decoding method, coding device and decoding device for stereo signal
CN112233682A (en) * 2019-06-29 2021-01-15 华为技术有限公司 Stereo coding method, stereo decoding method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0497413A1 (en) * 1991-02-01 1992-08-05 Koninklijke Philips Electronics N.V. Subband coding system and a transmitter comprising the coding system
US5285498A (en) * 1992-03-02 1994-02-08 At&T Bell Laboratories Method and apparatus for coding audio signals based on perceptual model
US5434948A (en) * 1989-06-15 1995-07-18 British Telecommunications Public Limited Company Polyphonic coding
US5694332A (en) * 1994-12-13 1997-12-02 Lsi Logic Corporation MPEG audio decoding system with subframe input buffering
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
CN1357136A (en) * 1999-06-21 2002-07-03 数字剧场系统股份有限公司 Improving sound quality of established low bit-rate audio coding systems without loss of decoder compatibility
US20030061055A1 (en) * 2001-05-08 2003-03-27 Rakesh Taori Audio coding

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5812971A (en) * 1996-03-22 1998-09-22 Lucent Technologies Inc. Enhanced joint stereo coding method using temporal envelope shaping
US5796842A (en) * 1996-06-07 1998-08-18 That Corporation BTSC encoder
US6463410B1 (en) * 1998-10-13 2002-10-08 Victor Company Of Japan, Ltd. Audio signal processing apparatus
JP3335605B2 (en) * 2000-03-13 2002-10-21 日本電信電話株式会社 Stereo signal encoding method
JP2003084790A (en) * 2001-09-17 2003-03-19 Matsushita Electric Ind Co Ltd Speech component emphasizing device
CN1219415C (en) * 2002-07-23 2005-09-14 华南理工大学 5.1 path surround sound earphone repeat signal processing method

Also Published As

Publication number Publication date
RU2005134365A (en) 2006-05-27
EP1845519B1 (en) 2009-09-16
BRPI0410856A (en) 2006-07-04
ATE443317T1 (en) 2009-10-15
EP1845519A3 (en) 2007-11-07
ZA200508980B (en) 2007-03-28
RU2425340C2 (en) 2011-07-27
CA2527971C (en) 2011-03-15
PL1623411T3 (en) 2008-01-31
JP4589366B2 (en) 2010-12-01
AU2004298708A1 (en) 2005-06-30
BRPI0410856B8 (en) 2019-10-15
BRPI0419281B1 (en) 2018-08-14
MXPA05012230A (en) 2006-02-10
RU2007121143A (en) 2008-12-10
BRPI0410856B1 (en) 2019-10-01
WO2005059899A1 (en) 2005-06-30
CA2527971A1 (en) 2005-06-30
AU2004298708B2 (en) 2008-01-03
SE0400417D0 (en) 2004-02-20
EP1845519A2 (en) 2007-10-17
CN101118747A (en) 2008-02-06
CA2690885C (en) 2014-01-21
JP2008026914A (en) 2008-02-07
SE527670C2 (en) 2006-05-09
DE602004008613T2 (en) 2008-06-12
HK1115665A1 (en) 2008-12-05
CA2690885A1 (en) 2005-06-30
RU2305870C2 (en) 2007-09-10
JP2007529021A (en) 2007-10-18
SE0400417L (en) 2005-06-20
DE602004008613D1 (en) 2007-10-11
HK1091585A1 (en) 2007-01-19
ATE371924T1 (en) 2007-09-15
DE602004023240D1 (en) 2009-10-29
CN1816847A (en) 2006-08-09
EP1623411A1 (en) 2006-02-08
CN101118747B (en) 2011-02-23
EP1623411B1 (en) 2007-08-29
JP4335917B2 (en) 2009-09-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1091585; Country of ref document: HK)

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: GR; Ref document number: 1091585; Country of ref document: HK)