CN102144405B - Interaural time delay restoration system and method - Google Patents
- Publication number
- CN102144405B CN102144405B CN200980134440.3A CN200980134440A CN102144405B CN 102144405 B CN102144405 B CN 102144405B CN 200980134440 A CN200980134440 A CN 200980134440A CN 102144405 B CN102144405 B CN 102144405B
- Authority
- CN
- China
- Prior art keywords
- audio data
- channel audio
- correction coefficient
- interaural
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/07—Synergistic effects of band splitting and sub-band processing
Abstract
An apparatus for processing audio data, comprising an interaural time delay correction factor unit for receiving a plurality of channels of audio data and generating an interaural time delay correction factor, and an interaural time delay correction factor insertion unit for modifying the plurality of channels of audio data as a function of the interaural time delay correction factor.
Description
Technical field
The present invention relates to systems for processing audio data, and more specifically to a system and method for restoring the interaural time delay of stereo or other multichannel audio data.
Background technology
When audio data are processed to generate a composite sound, such audio data are usually combined with a mixer that uses panning potentiometers, or with another system or device that simulates the panning potentiometer function. A panning potentiometer can be used to distribute a single input channel to two or more output channels, such as left and right stereo outputs, so as to simulate a spatial position between the leftmost and rightmost positions relative to the listener. Such a panning potentiometer, however, typically cannot add the interaural time difference that is normally present in a live performance.
Summary of the invention
In accordance with the present invention, a system and method for interaural time delay restoration are provided that add, between two or more channels of audio data, a time delay corresponding to an estimated interaural delay based on the relative amplitudes of those channels.
In accordance with an exemplary embodiment of the present invention, an apparatus for processing audio data is provided. The apparatus includes an interaural time delay correction coefficient unit for receiving a plurality of channels of audio data and generating an interaural time delay correction coefficient, such as when the channels contain panned audio data without an associated interaural time delay. An interaural time delay correction coefficient insertion unit modifies the channels of audio data as a function of the interaural time delay correction coefficient, adding the estimated interaural time delay to improve audio quality.
Those skilled in the art will further appreciate the advantages and superior features of the invention, together with other important aspects, upon reading the detailed description that follows in conjunction with the drawings.
Accompanying drawing explanation
Fig. 1 is a diagram of a system for interaural time adjustment in accordance with an exemplary embodiment of the present invention;
Fig. 2 is a diagram of a system for detecting the peak difference between the left channel audio data and the right channel audio data of a given frequency band in accordance with an exemplary embodiment of the present invention;
Fig. 3 is a diagram of a system for smoothing interaural time and level differences in accordance with an exemplary embodiment of the present invention;
Fig. 4 is a diagram of a method for processing audio data to introduce interaural time or level differences in accordance with an exemplary embodiment of the present invention;
Fig. 5 is a diagram of a system for interaural time delay correction in accordance with an exemplary embodiment of the present invention; and
Fig. 6 is a flow chart of a method for controlling an interaural time delay associated with a pan control setting in accordance with an exemplary embodiment of the present invention.
Embodiment
In the following description, like parts are marked throughout the specification and drawings with the same reference numerals. The drawing figures may not be to scale, and certain components may be shown in generalized or schematic form and identified by commercial designations in the interest of clarity and conciseness.
Fig. 1 is a diagram of a system 100 for interaural time adjustment in accordance with an exemplary embodiment of the present invention. System 100 can be implemented in hardware, software, or a suitable combination of hardware and software, and can be one or more software systems operating on a digital signal processing platform. As used herein, "hardware" can include a combination of discrete components, an integrated circuit, an application-specific integrated circuit, a field programmable gate array, or other suitable hardware. As used herein, "software" can include one or more objects, agents, threads, lines of code, subroutines, separate software applications, two or more lines of code or other suitable software structures operating in two or more software applications or on two or more processors, or other suitable software structures. In one exemplary embodiment, software can include one or more lines of code or other suitable software structures operating in a general purpose software application, such as an operating system, and one or more lines of code or other suitable software structures operating in a specific purpose software application.
System 100 includes low delay filter banks 102 and 104, which receive a left channel audio time signal and a right channel audio time signal, respectively. In one exemplary embodiment, low delay filter banks 102 and 104 can receive a series of samples of audio data at a sampling frequency and can process the sampled audio data based on a predetermined number of samples. Low delay filter banks 102 and 104 are used to determine the time delay between the peak amplitudes of a plurality of frequency bands over a period of time. In one exemplary embodiment, the number of frequency bands can be related to the number of Barks, equivalent rectangular bandwidths (ERBs), or other suitable psychoacoustic bands of the audio data, such that the total number of outputs of low delay filter banks 102 and 104 equals the number of Barks or ERBs for each input sample. Likewise, oversampling can be used to reduce the likelihood of audio artifacts, such as by using a plurality of filters, one for each of a plurality of sub-bands of each frequency band (thus producing a plurality of sub-bands for each associated frequency band), or in other suitable manners.
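The paragraph above ties the filter bank's band count to the number of ERBs. As an illustrative sketch (not the patent's own implementation), the number of ERB-rate bands up to the Nyquist frequency can be estimated with the well-known Glasberg and Moore ERB-number formula:

```python
import math

def erb_band_count(sample_rate_hz: float) -> int:
    """Approximate number of ERB-rate bands from 0 Hz up to Nyquist,
    using the Glasberg & Moore (1990) ERB-number scale:
        ERBS(f) = 21.4 * log10(1 + 0.00437 * f)
    """
    nyquist = sample_rate_hz / 2.0
    return int(round(21.4 * math.log10(1.0 + 0.00437 * nyquist)))
```

At a 48 kHz sampling rate this gives roughly 43 bands; with the three-times oversampling mentioned later in the text, the filter bank output count N would be three times this band count M.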
Channel delay detector 106 receives the outputs of low delay filter banks 102 and 104 and determines a difference correction coefficient for each of the plurality of frequency bands. In one exemplary embodiment, channel delay detector 106 can generate an amount of phase difference to be added to the frequency domain signals so as to create a time difference, such as between the left and right channels, thereby inserting an interaural time delay into a signal that has been panned but does not include the associated time delay. In one exemplary embodiment, the audio data can be mixed using a panning potentiometer so that an input channel has an apparent spatial location between the far-left and far-right channels of stereo data, or other suitable processes can be used, including where more than two channels are present. Although such panning can be used to simulate spatial location, movement, or other effects, it cannot recreate the interaural time delays associated with live audio data. For example, when a sound source is located to the left of a listener, a time delay exists between the time at which the listener's left ear receives the audio signal from the source and the time at which the listener's right ear receives it. Likewise, as the sound source moves from the listener's left to the listener's right, the associated time delay decreases to zero when the source is directly in front of the listener and then increases relative to the right ear. Simulating spatial location or movement with a simple panning potentiometer does not create these associated time delays, but such time delays can be modeled using channel delay detector 106 and inserted into the stereo or other multichannel audio signal.
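The source-position behavior described above — delay shrinking to zero directly ahead and growing toward either side — matches the classic Woodworth spherical-head approximation. The patent does not specify this model; the following is only an illustrative sketch of how such a delay could be computed, with an assumed head radius:

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (not from the patent)
SPEED_OF_SOUND = 343.0   # m/s at roughly room temperature

def woodworth_itd(azimuth_rad: float) -> float:
    """Interaural time delay for a distant source at the given azimuth
    (0 = straight ahead, pi/2 = fully to one side), per the classic
    Woodworth approximation: ITD = (a / c) * (theta + sin(theta))."""
    theta = abs(azimuth_rad)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

For a source fully to one side this yields about 0.66 ms, the familiar upper bound on human interaural delay.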
Likewise, channel delay detector 106 can also be used to correct interaural level differences, such as where a time delay exists between the left and right channels but the associated amplitude difference is absent. For example, audio processing may change the levels associated with a panned audio signal, such that an accurate recording of an audio signal having an associated time delay between the left and right channels may nonetheless have left and right channel levels that do not reflect the live audio signal. Channel delay detector 106 can also or alternatively be used to model the associated level correction coefficients and insert them into the stereo or other multichannel audio signal.
Channel delay detector 106 outputs a number M of correction coefficients, which are used to insert interaural time or level differences into the plurality of channels of audio data. The number of correction coefficients can be less than the number N of outputs of low delay filter banks 102 and 104, which use oversampling to smooth variations within a perceptual band. In one exemplary embodiment, the perceptual bands are sampled at three times their bandwidth, and N equals three times M.
System 100 includes delays 108 and 110, which receive the left and right channel audio signals and delay them by an amount corresponding to the delay through low delay filter banks 102 and 104 and channel delay detector 106, minus the delay produced by zero-padded Hann windows 112 and 114 and fast Fourier transformers 116 and 118.
Zero-padded Hann windows 112 and 114 modify the time-varying left and right channel audio signals to produce Hann-windowed signals. The zero-padded Hann windows can be used to prevent discontinuities in the processed signals, which could otherwise produce phase shift variations that create audio artifacts in the processed audio data. Other types of Hann windows or other suitable processes can also or alternatively be used to prevent discontinuities.
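Hann windowing cooperates with the overlap-add units described later: a periodic Hann window at 50% overlap satisfies the constant-overlap-add (COLA) property, so the overlap-added frames sum back to the original signal. A minimal numeric check of that property, with assumed frame and hop sizes:

```python
import numpy as np

N, hop = 1024, 512                                 # frame length, 50% overlap
n = np.arange(N)
w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))      # periodic Hann window

# Sum shifted copies of the window at the hop interval; for a periodic
# Hann window at 50% overlap the sum is the constant 1.0 (the COLA
# property), so overlap-added frames reconstruct the input exactly.
cover = np.zeros(4 * N)
for start in range(0, len(cover) - N + 1, hop):
    cover[start:start + N] += w

middle = cover[N:-N]   # ignore the partially covered edges
```

Every fully covered sample in `middle` equals 1.0, confirming that windowing plus overlap-add is transparent when no spectral modification is made.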
Fast Fourier transformers 116 and 118 convert the time domain left and right channel audio data into frequency domain data. In one exemplary embodiment, fast Fourier transformers 116 and 118 receive a predetermined number of time samples of the time domain signal (as modified by zero-padded Hann windows 112 and 114 to increase the number of samples) and generate a corresponding number of frequency components of that signal.
Phase shift insertion 120 receives the fast Fourier transform data from fast Fourier transformers 116 and 118 and inserts a phase shift into the signals based on the correction coefficients received from channel delay detector 106, such as by modifying the real and imaginary parts of the Fourier transform data of a single frequency bin or group of bins without modifying the associated magnitude of each bin or group of bins. In one exemplary embodiment, the phase shift can be related to the angular difference between the channels determined by channel delay detector 106, such that the phase of the leading channel is advanced by half of the angular difference and the phase of the lagging channel is retarded by half of the angular difference.
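The key idea above — delaying a signal by rotating bin phases while leaving magnitudes untouched — can be sketched as follows. This is an illustrative, assumed implementation (a simple linear phase ramp over a periodic frame), not the patent's circuit; in system 100 the rotation would be split, half advancing one channel and half retarding the other:

```python
import numpy as np

def delay_via_phase(x: np.ndarray, delay_samples: float) -> np.ndarray:
    """Delay a (periodic) frame by rotating the phase of each FFT bin;
    magnitudes are untouched. An integer delay reproduces a circular
    shift, and fractional delays are interpolated band-limitedly."""
    N = len(x)
    k = np.fft.fftfreq(N) * N                       # signed bin indices
    X = np.fft.fft(x)
    X *= np.exp(-2j * np.pi * k * delay_samples / N)
    return np.fft.ifft(X).real

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
shifted = delay_via_phase(x, 3)                     # equals np.roll(x, 3)
```

Because only phases change, the per-band magnitudes (and hence perceived level) are preserved, which is exactly why the patent inserts ITD this way.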
Inverse fast Fourier transformers 122 and 124 receive the phase-shifted frequency domain signals from phase shift insertion 120 and perform an inverse fast Fourier transform to generate time-varying signals. The separate left and right channel time-varying signals are then provided to overlap-add units 126 and 128, which perform an overlap-add operation on the signals to account for the processing of zero-padded Hann windows 112 and 114. Overlap-add units 126 and 128 output signals to shift-add registers 130 and 132, which output the shifted time signals L_idc(t) and R_idc(t).
In operation, system 100 allows a signal containing panning without an associated interaural time difference to be compensated so that the interaural time difference is inserted. System 100 thereby restores the interaural time differences ordinarily found in audio signals, improving audio quality.
Fig. 2 is a diagram of a system 200 for detecting the peak difference between the left and right channel audio data of a given frequency band in accordance with an exemplary embodiment of the present invention. System 200 can be used to detect the peaks of the left channel data and right channel data for each individual frequency band of the audio data, and to generate a correction coefficient for each band.
System 200 includes Hilbert envelope units 202 and 204, which receive the left and right time domain signals and generate the Hilbert envelopes of predetermined frequency bands of those signals. In one exemplary embodiment, Hilbert envelope units 202 and 204 can operate on a smaller number of time domain samples than are processed by fast Fourier transformers 116 and 118 of system 100, allowing system 200 to generate correction coefficients quickly and avoiding the additional delay that could result from transforming the time domain channel data into the frequency domain in order to generate the associated correction coefficients.
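A Hilbert envelope is the magnitude of the analytic signal, which can be formed by zeroing the negative-frequency half of the spectrum. The sketch below is a standard FFT construction with assumed test-signal parameters, shown only to illustrate how an envelope peak time (as used by peak detectors 206 and 208) is recovered:

```python
import numpy as np

def analytic_envelope(x: np.ndarray) -> np.ndarray:
    """Hilbert envelope via the FFT: keep DC and Nyquist, double the
    positive-frequency bins, zero the negative ones, then take the
    magnitude of the resulting analytic signal (length assumed even)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    h[1:N // 2] = 2.0
    h[N // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

# A 1 kHz tone under a Gaussian amplitude bump peaking at 50 ms: the
# envelope's peak time should recover the modulator's peak time.
fs = 8000
t = np.arange(int(0.1 * fs)) / fs
modulator = np.exp(-0.5 * ((t - 0.05) / 0.01) ** 2)
x = modulator * np.sin(2 * np.pi * 1000.0 * t)
peak_time = t[np.argmax(analytic_envelope(x))]
```

Comparing such per-band peak times between the left and right channels gives the time data that amplitude and time difference detector 210 consumes.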
Peak detectors 206 and 208 receive the left and right channel Hilbert envelopes, respectively, and determine the peak amplitude of each signal and the time associated with that peak amplitude. The peak and time data are then provided to amplitude and time difference detector 210, which determines whether a time difference exists between the corresponding peak amplitudes. If amplitude and time difference detector 210 determines that no corresponding difference exists between the peak amplitude times, interaural time difference correction 214 can be used to determine, by comparing the left and right channel peak amplitude values, a correction coefficient angle T_COR to be inserted into the frequency domain audio data. In one exemplary embodiment, the correction coefficient angle T_COR can be determined as the angle atan2(left channel amplitude, right channel amplitude) minus 45 degrees. Likewise, other suitable processes can be used to determine the correction coefficient angle. A suitable threshold can also be applied, so that the correction coefficient angle is generated when only a small time difference exists between the peak amplitude values.
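The atan2-minus-45-degrees formula stated above can be sketched directly; the function name is illustrative, but the formula is the one given in the text. Equal amplitudes (a center pan) give zero correction, and a hard pan to either side gives plus or minus 45 degrees:

```python
import math

def itd_correction_angle_deg(left_amp: float, right_amp: float) -> float:
    """Correction coefficient angle T_COR as described in the text:
    atan2(left amplitude, right amplitude) minus 45 degrees, so a
    centered image (equal amplitudes) yields zero correction."""
    return math.degrees(math.atan2(left_amp, right_amp)) - 45.0
```

Using atan2 rather than a simple ratio keeps the angle well defined even when one channel's amplitude is zero.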
Where a time difference exists between the peaks of the left and right channel data but the amplitudes are equal, interaural level difference correction 212 can be used. In this exemplary embodiment, the amplitudes can be adjusted by a correction coefficient L_COR, so that the channel with the leading audio peak receives a higher value and the channel with the lagging audio peak receives a lower value, such as by subtracting L_COR from the lagging channel, by adding 0.5*L_COR to the leading channel and subtracting 0.5*L_COR from the lagging channel, or in other suitable manners. Interaural level difference correction 212 can also use a threshold, so that the level correction is applied when the time difference is above the threshold value and is not applied when the time difference is below that value.
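The symmetric option mentioned above — half of L_COR added to the leading channel, half subtracted from the lagging one — can be sketched as follows (function name and linear-level representation are assumptions for illustration):

```python
def apply_level_correction(lead_level: float, lag_level: float,
                           l_cor: float) -> tuple[float, float]:
    """Split the level correction L_COR symmetrically, as one of the
    options in the text: raise the leading channel by half of L_COR and
    lower the lagging channel by half, leaving their sum unchanged."""
    return lead_level + 0.5 * l_cor, lag_level - 0.5 * l_cor
```

Splitting the correction this way preserves the combined level of the pair, so the perceived loudness of the image is not disturbed while the interaural level cue is restored.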
In operation, system 200 can be used to generate time and level difference correction coefficients for the left and right signals, generating an interaural time difference correction coefficient for a signal that has a left or right pan but no associated time difference, and generating a level correction for a signal that has an interaural time difference but lacks the associated pan amplitude.
Fig. 3 is a diagram of a system 300 for smoothing interaural time and level differences in accordance with an exemplary embodiment of the present invention. System 300 includes interaural time and level difference correction units 302 through 306, each of which generates interaural time and/or level difference correction coefficients for a different frequency band. In one exemplary embodiment, the frequency bands can be portions of a Bark, ERB, or other suitable psychoacoustic band, such that system 300 can be used to generate a single correction coefficient for the psychoacoustic band based on its sub-components.
Time smoothing units 308 through 312 are used to time-smooth the outputs of interaural time and level difference correction units 302 through 306, respectively. In one exemplary embodiment, time smoothing units 308 through 312 can receive the sequence of outputs from interaural time and level difference correction units 302 through 306 and can store a predetermined number of samples of that sequence, so that changes between successive samples can be averaged or otherwise smoothed.
Band smoothing unit 314 receives the interaural time or level difference correction coefficients from interaural time and level difference correction units 302 through 306 and smooths those coefficients. In one exemplary embodiment, where a Bark or ERB band has been divided into three, band smoothing unit 314 can average the three frequency correction coefficients associated with that band, can determine a weighted average, can smooth the coefficients over time, or can perform other suitable smoothing. Band smoothing unit 314 generates a single phase correction coefficient for each frequency band.
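The plain-average option described above, collapsing N = 3M sub-band coefficients into M per-band coefficients, can be sketched in a few lines (the function name and array layout are assumptions; a weighted average or time smoothing could be substituted as the text notes):

```python
import numpy as np

def smooth_bands(sub_coeffs: np.ndarray, per_band: int = 3) -> np.ndarray:
    """Collapse N = per_band * M sub-band correction coefficients into
    one coefficient per perceptual band by plain averaging. Assumes the
    coefficients are ordered band by band (sub-bands contiguous)."""
    return sub_coeffs.reshape(-1, per_band).mean(axis=1)
```

For example, six sub-band coefficients [1, 2, 3, 4, 5, 6] collapse to the two per-band values [2, 5].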
In operation, system 300 smooths the interaural time or level difference correction coefficients over time, over frequency, over both, or on other suitable bases, where those coefficients are generated by analyzing the left and right channel audio data to detect pan settings without an associated level or time difference. System 300 thereby helps avoid audio artifacts by ensuring that the interaural time or level difference correction coefficients do not change rapidly.
Fig. 4 is a diagram of a method 400 for processing audio data to introduce interaural time or level differences in accordance with an exemplary embodiment of the present invention. Method 400 begins at 402, where left and right amplitude envelopes are determined. In one exemplary embodiment, a Hilbert envelope detector or other suitable system can be used to determine the amplitude of a peak of a frequency band, the time associated with that peak, and other suitable data. The method then proceeds to 404.
At 404, the peaks of the amplitude envelopes and the times associated with the peaks are detected. In one exemplary embodiment, a simple peak detector, such as a magnitude detector, can be used to detect the associated time interval at which a peak occurs. The method then proceeds to 406.
At 406, it is determined whether a time difference exists between the peaks of the left and right channel data. In one exemplary embodiment, the time difference determination can include an associated buffer, so that if the time between peaks is less than a predetermined amount, it is determined that no time difference exists. If it is determined that a time difference exists, such that interaural time delay restoration is not needed, the method proceeds to 408, where it is determined whether a level difference exists between the amplitudes of the two signals. If it is determined that a level difference exists, the method proceeds to 410. Otherwise, the method proceeds to 412, where the level between the left and right channel audio data is corrected. In one exemplary embodiment, the leading channel amplitude can be left unchanged and the lagging channel amplitude can be reduced by a coefficient related to the difference between the leading and lagging channels, or other suitable processes can be used.
If it is determined that no time difference exists between the left and right channel peak amplitude values, the method proceeds to 414, where the level difference is converted into a phase correction angle. In one exemplary embodiment, the phase correction angle can be determined by subtracting 45 degrees from atan2(left channel amplitude, right channel amplitude), or other suitable relationships can be used. The method then proceeds to 416, where the phase difference is allocated to the left and right channels. In one exemplary embodiment, the allocation can be performed by splitting the phase difference equally, so that the channels are advanced and retarded by the same amount. In addition, a weighted split can be used where appropriate, or other suitable processes can be used. The method then proceeds to 418.
At 418, the difference between the left and right channel phase correction angles is smoothed. In one exemplary embodiment, the difference can be smoothed over time, smoothed based on the phase correction angles of adjacent channels, or smoothed in other suitable manners. The method then proceeds to 420.
At 420, the difference correction coefficients are applied to the audio signal. In one exemplary embodiment, the phase difference corresponding to the time difference can be added in the frequency domain, such as by using known methods for adding a time difference to, or subtracting one from, a time signal by adding or subtracting the associated phase shift in the frequency domain. Likewise, other suitable processes can be used.
In operation, method 400 allows interaural phase or amplitude correction coefficients to be determined and applied to a plurality of channels of audio data. Although two exemplary channels are shown, additional channels of audio data can also be processed where appropriate, such as by adding interaural phase or amplitude correction coefficients to the audio data of a 5.1 sound system, a 7.1 sound system, or other suitable sound systems.
Fig. 5 is a diagram of a system 500 for interaural time delay correction in accordance with an exemplary embodiment of the present invention. System 500 allows interaural time delay to be compensated before mixing, so that the pan control output more accurately reflects the interaural time delay associated with a sound source produced at the corresponding physical location.
System 500 includes left channel variable delay 502, right channel variable delay 504, and pan control 506, each of which can be implemented in hardware, software, or a suitable combination of hardware and software, and which can be one or more software systems operating on a digital signal processing platform. Pan control 506 allows a user to select a pan setting for allocating a time-varying audio data input between a left channel signal and a right channel signal. In one exemplary embodiment, pan control 506 can include an associated time delay value for each of a plurality of pan settings between a virtual far-left position and a virtual far-right position. In this exemplary embodiment, pan control 506 can disable the variable delay controls when the far-left, center, or far-right position is selected, because no delay is needed for those settings. For pan control 506 settings between the far-left, center, and far-right positions, a delay value can be generated that corresponds to the interaural time delay produced by a sound source located at the associated position.
Pan control 506 can also include an active pan adjustment feature that allows a user to select live pan adjustment, such as when the user intends to sweep the pan from left to right or from right to left. In this exemplary embodiment, a time delay can be provided for the far-left or far-right pan control 506 settings, so that the user can adjust the pan of the audio input as the pan control 506 setting is moved away from the far-left or far-right setting without generating the audio artifacts that would otherwise occur if the time delay jumped from the zero delay used for a fixed far-left or far-right setting to the maximum delay value used for the adjacent pan control 506 settings.
Left channel variable delay 502 and right channel variable delay 504 can be implemented using the interaural time delay correction coefficient insertion unit of system 100, or in other suitable manners.
In operation, system 500 allows an interaural time delay to be added when an audio channel is panned between two output channels, such as a left channel and a right channel or other suitable channels. System 500 can disable the configured time delay when no time delay is needed.
Fig. 6 is a flow chart of a method 600 for controlling an interaural time delay associated with a pan control setting in accordance with an exemplary embodiment of the present invention. Method 600 begins at 602, where time domain audio channel data are received, such as for a user-selected channel. The method then proceeds to 604, where the pan control setting is detected. The pan control can be a potentiometer, a virtual pan control, or other suitable controls. The method then proceeds to 606.
At 606, it is determined whether a pan delay setting is needed. In one exemplary embodiment, the pan delay can be disabled for predetermined pan control positions, such as the far-left, far-right, or center positions. In another exemplary embodiment, a pan delay can be generated for the far-left or far-right positions, such as when the user has selected a pan control setting that allows the pan to be actively adjusted between the far-left and far-right positions, so as to avoid generating a time delay discontinuity when the pan control leaves the far-right or far-left position. If it is determined that no pan delay is needed, the method proceeds to 612; otherwise, the method proceeds to 608.
At 608, the delay amount is computed based on the pan control setting. In one exemplary embodiment, a maximum time delay can be generated when the pan control is at the far-left or far-right position, such as when live pan adjustment has been selected. Likewise, when a fixed pan setting has been selected, no time delay is needed for the far-left or far-right settings (because no associated signal is generated for the opposite channel). For pan control settings between the far-right and far-left positions, a time delay corresponding to the delay for the intermediate position is calculated, and this time delay decreases as the pan control position approaches center. The method then proceeds to 610.
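The mapping just described can be sketched as a small function. The linear pan-to-delay mapping and the 0.66 ms maximum are assumptions for illustration (the patent specifies neither); what the sketch preserves is the stated behavior: no delay for fixed extreme or center settings, full delay at the extremes under active adjustment, and a delay that shrinks toward center:

```python
def pan_delay_ms(pan: float, max_delay_ms: float = 0.66,
                 active_adjust: bool = False) -> float:
    """Map a pan position in [-1, +1] (hard left .. hard right) to a
    delay amount. Fixed hard-left/right or center settings need no
    delay; with active pan adjustment the extremes keep the full delay
    so the value does not jump when the pan moves inward."""
    if not active_adjust and abs(pan) in (0.0, 1.0):
        return 0.0                      # fixed extreme/center: delay disabled
    return abs(pan) * max_delay_ms      # shrinks to zero toward center
```

Keeping the extremes at the full delay in active mode is what removes the zero-to-maximum jump the text warns about when the pan first leaves a hard setting.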
At 610, the calculated delay is applied to one or more variable delays. In one exemplary embodiment, the delay can be added to one of the left or right channels, or other suitable delay settings can be used. In another exemplary embodiment, the delay can be added using the interaural time delay correction coefficient insertion unit of system 100, or in other suitable manners. The method then proceeds to 612.
At 612, it is determined whether additional audio channel data need processing, such as by determining whether additional data samples are present in a data buffer, or in other suitable manners. If additional data processing is needed, the method returns to 602; otherwise, the method proceeds to 614 and terminates.
In operation, method 600 allows an interaural time delay to be generated based on a pan control setting. Method 600 thereby allows a pan control to simulate a sound position in a manner closer to the actual sound source position than simple panning between left and right channels without time adjustment.
Although exemplary embodiments of the system and method of the present invention have been described herein, those skilled in the art will also appreciate that various substitutions and modifications can be made to the system and method without departing from the scope and spirit of the appended claims.
Claims (13)
1. An apparatus for processing audio data, comprising:
a pan control unit for distributing audio data to left channel audio data and right channel audio data;
an interaural time delay correction coefficient unit for receiving the left channel audio data and the right channel audio data, and for generating an interaural time delay correction coefficient based on the distribution of the audio data to the left channel audio data and the right channel audio data by the pan control unit, wherein the interaural time delay correction coefficient unit comprises:
a time difference detector for receiving a peak amplitude value and an associated time for each of the left channel audio data and the right channel audio data of a predetermined frequency band, and for generating interaural difference correction data;
an interaural time difference correction unit for receiving the interaural difference correction data and generating a time correction coefficient for an interaural time delay correction coefficient insertion unit; and
an interaural level difference correction unit for generating a level correction coefficient for the interaural time delay correction coefficient insertion unit;
the interaural time delay correction coefficient insertion unit for modifying the left channel audio data and the right channel audio data as a function of the interaural time delay correction coefficient, to generate modified left channel audio data and modified right channel audio data;
a plurality of time smoothing units for time-smoothing the outputs of the interaural time difference correction unit and the interaural level difference correction unit, respectively; and
a band smoothing unit for smoothing the interaural time difference correction coefficients and the interaural level difference correction coefficients.
2. The apparatus of claim 1, wherein the interaural time delay correction coefficient unit comprises a low-latency filter bank for receiving one of the left channel audio data and the right channel audio data, and generating an amplitude envelope as a function of time for a predetermined frequency band.
3. The apparatus of claim 1, wherein the interaural time delay correction coefficient unit comprises a peak detector for receiving one of the left channel audio data and the right channel audio data, and generating a peak amplitude value and an associated time for a predetermined frequency band.
4. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a delay unit for delaying one of the left channel audio data and the right channel audio data by an amount related to the delay of the interaural time delay correction coefficient unit.
5. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a Hann window unit for receiving one of the left channel audio data and the right channel audio data, and applying a Hann window to the received channel audio data.
6. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a phase shift insertion unit for inserting a phase shift into a plurality of frequency-domain channel audio signals.
7. A method for processing audio data, comprising:
distributing audio data to left channel audio data and right channel audio data by a sound image control unit;
determining a peak amplitude of each of the left channel audio data and the right channel audio data;
detecting a delay associated with the peak amplitudes; and
if the detected delay is less than a threshold, inserting a delay between the left channel audio data and the right channel audio data, comprising:
when an interaural time difference exists between the left channel audio data and the right channel audio data but no associated sound image amplitude exists, generating an interaural level difference correction coefficient using an interaural level difference correction unit, and when a left or right sound image exists between the left channel audio data and the right channel audio data but no associated time difference exists, generating an interaural time difference correction coefficient using an interaural time difference correction unit;
time-smoothing the outputs of the interaural time difference correction unit and the interaural level difference correction unit, respectively, using time smoothing units; and
smoothing the interaural time difference correction coefficient and the interaural level difference correction coefficient, respectively, using a frequency band smoothing unit.
8. The method of claim 7, wherein determining the amplitude envelope of each of the left channel audio data and the right channel audio data comprises: determining the amplitude envelope of a predetermined frequency band of each of the left channel audio data and the right channel audio data.
9. The method of claim 7, wherein determining the amplitude envelope of each of the left channel audio data and the right channel audio data comprises: processing a predetermined frequency band of each of the left channel audio data and the right channel audio data with a Hilbert envelope unit.
10. The method of claim 7, wherein detecting the delay associated with the peak of each amplitude envelope comprises: comparing the time associated with the peak amplitude of one channel with the time associated with the peak amplitude of a second channel.
11. The method of claim 7, further comprising generating the inserted delay based on the peak amplitudes.
12. The method of claim 7, further comprising generating the inserted delay based on the peak amplitudes, comprising generating the inserted delay by determining atan2(peak1, peak2) minus 45 degrees, where atan2 is a two-argument arctangent function returning an angle, peak1 is the value of a first peak amplitude, and peak2 is the value of a second peak amplitude.
13. The method of claim 7, wherein, if the detected delay is less than a threshold, inserting a delay between the left channel audio data and the right channel audio data comprises:
transforming the left channel audio data and the right channel audio data from the time domain to the frequency domain;
converting the inserted delay into a phase shift value;
adding a first portion of the phase shift value to the left channel audio data in the frequency domain; and
subtracting a second portion of the phase shift value from the right channel audio data in the frequency domain.
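The time difference detector of claim 1 (and the peak-time comparison of claim 10) can be illustrated with a minimal Python sketch. The function name is hypothetical, and the rectified envelope stands in for the per-band filter bank or Hilbert envelope of claims 2 and 9; this is an illustration of the peak-time idea, not the patented implementation.

```python
import numpy as np

def detect_itd(left, right, fs):
    """Estimate the interaural time difference by comparing the times
    of the peak amplitude in each channel (claim 10's comparison)."""
    # Rectification stands in for the band-limited amplitude envelope
    # (claims 2 and 9); the peak time per channel is the argmax.
    t_left = np.argmax(np.abs(left)) / fs   # time of left-channel peak
    t_right = np.argmax(np.abs(right)) / fs  # time of right-channel peak
    return t_left - t_right  # negative: left peak arrives earlier

# Synthetic test: a tone burst, with the right channel lagging by 0.5 ms.
fs = 48000
t = np.arange(fs // 10) / fs
burst = np.sin(2 * np.pi * 500 * t) * np.exp(-((t - 0.02) ** 2) / 1e-5)
left = burst
right = np.roll(burst, int(0.0005 * fs))
itd = detect_itd(left, right, fs)
print(round(itd * 1000, 2))  # prints -0.5 (left leads right by 0.5 ms)
```

A real detector would, per claim 1, run this per predetermined frequency band and feed the resulting time differences to the correction units.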
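Claim 12's rule maps the two peak amplitudes to a signed pan angle: atan2(peak1, peak2) is 45 degrees when the peaks are equal (a centered image), so subtracting 45 degrees yields zero there. A hedged Python sketch follows; the linear scaling of the angle to milliseconds and the 0.7 ms maximum ITD are assumptions for illustration, not stated in the claim.

```python
import math

MAX_ITD_MS = 0.7  # assumed maximum interaural delay; not from the claim

def inserted_delay_ms(peak1, peak2, max_itd_ms=MAX_ITD_MS):
    """Claim 12: derive the inserted delay from the two peak amplitudes.
    atan2(peak1, peak2) - 45 degrees gives a signed pan angle in
    [-45, +45] degrees for non-negative peak values."""
    angle_deg = math.degrees(math.atan2(peak1, peak2)) - 45.0
    # Linear mapping of pan angle to delay (an assumption).
    return max_itd_ms * angle_deg / 45.0

print(inserted_delay_ms(1.0, 1.0))  # centered image -> ~0 ms
print(inserted_delay_ms(1.0, 0.0))  # hard image one way -> ~+MAX_ITD_MS
```

The sign convention (which channel the positive delay is applied to) is left open here, as it is in the claim.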
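Claim 13's frequency-domain insertion can be sketched with NumPy's real FFT: a delay of d seconds is equivalent to a phase shift of 2*pi*f*d radians at frequency f. Splitting the shift 50/50 between the claim's "first portion" and "second portion" is an assumption for illustration.

```python
import numpy as np

def insert_delay_freq(left, right, delay_samples, fs):
    """Claim 13: transform both channels to the frequency domain,
    convert the inserted delay to a per-bin phase shift, add part of
    it to the left channel, and subtract part from the right."""
    n = len(left)
    L = np.fft.rfft(left)
    R = np.fft.rfft(right)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Phase shift corresponding to the full inserted delay at each bin.
    phase = 2 * np.pi * freqs * (delay_samples / fs)
    L *= np.exp(-1j * phase / 2)  # first portion: delay the left channel
    R *= np.exp(+1j * phase / 2)  # second portion: advance the right channel
    return np.fft.irfft(L, n), np.fft.irfft(R, n)

# Bin-aligned test tone so the circular phase shift is an exact sample shift.
fs, n = 1024, 1024
x = np.sin(2 * np.pi * 32 * np.arange(n) / fs)
l_out, r_out = insert_delay_freq(x, x, 8, fs)
print(np.allclose(l_out, np.roll(x, 4)))  # True: left delayed by 4 samples
```

The net effect is an 8-sample relative delay between the channels while each channel is shifted by only half that amount.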
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/204,471 | 2008-09-04 | ||
US12/204,471 US8233629B2 (en) | 2008-09-04 | 2008-09-04 | Interaural time delay restoration system and method |
PCT/US2009/004673 WO2010027403A1 (en) | 2008-09-04 | 2009-08-14 | Interaural time delay restoration system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102144405A CN102144405A (en) | 2011-08-03 |
CN102144405B true CN102144405B (en) | 2014-12-31 |
Family
ID=41725480
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200980134440.3A Expired - Fee Related CN102144405B (en) | 2008-09-04 | 2009-08-14 | Interaural time delay restoration system and method |
Country Status (8)
Country | Link |
---|---|
US (1) | US8233629B2 (en) |
EP (1) | EP2321977B1 (en) |
JP (1) | JP5662318B2 (en) |
KR (1) | KR101636592B1 (en) |
CN (1) | CN102144405B (en) |
HK (1) | HK1156171A1 (en) |
TW (1) | TWI533718B (en) |
WO (1) | WO2010027403A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8971551B2 (en) * | 2009-09-18 | 2015-03-03 | Dolby International Ab | Virtual bass synthesis using harmonic transposition |
WO2011029984A1 (en) * | 2009-09-11 | 2011-03-17 | Nokia Corporation | Method, apparatus and computer program product for audio coding |
US8571232B2 (en) * | 2009-09-11 | 2013-10-29 | Barry Stephen Goldfarb | Apparatus and method for a complete audio signal |
WO2011129655A2 (en) * | 2010-04-16 | 2011-10-20 | Jeong-Hun Seo | Method, apparatus, and program-containing medium for assessment of audio quality |
FR2966634A1 (en) * | 2010-10-22 | 2012-04-27 | France Telecom | ENHANCED STEREO PARAMETRIC ENCODING / DECODING FOR PHASE OPPOSITION CHANNELS |
CN103796150B (en) * | 2012-10-30 | 2017-02-15 | 华为技术有限公司 | Processing method, device and system of audio signals |
JP6216553B2 (en) * | 2013-06-27 | 2017-10-18 | クラリオン株式会社 | Propagation delay correction apparatus and propagation delay correction method |
WO2015035093A1 (en) | 2013-09-05 | 2015-03-12 | Daly George William | Systems and methods for acoustic processing of recorded sounds |
CN106999710B (en) | 2014-12-03 | 2020-03-20 | Med-El电气医疗器械有限公司 | Bilateral hearing implant matching of ILD based on measured ITD |
CN108877815B (en) | 2017-05-16 | 2021-02-23 | 华为技术有限公司 | Stereo signal processing method and device |
TWI689708B (en) * | 2018-12-24 | 2020-04-01 | 財團法人工業技術研究院 | Vibration sensor with monitoring function and vibration signal monitoring method thereof |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5136650A (en) * | 1991-01-09 | 1992-08-04 | Lexicon, Inc. | Sound reproduction |
US5652770A (en) * | 1992-09-21 | 1997-07-29 | Noise Cancellation Technologies, Inc. | Sampled-data filter with low delay |
US6424939B1 (en) * | 1997-07-14 | 2002-07-23 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Method for coding an audio signal |
CN1647157A (en) * | 2002-04-22 | 2005-07-27 | 皇家飞利浦电子股份有限公司 | Signal synthesizing |
US7027601B1 (en) * | 1999-09-28 | 2006-04-11 | At&T Corp. | Perceptual speaker directivity |
CN1810015A (en) * | 2003-03-10 | 2006-07-26 | 坦德伯格电信公司 | Echo canceller with reduced requirement for processing power |
CN101002505A (en) * | 2004-08-03 | 2007-07-18 | 杜比实验室特许公司 | Combining audio signals using auditory scene analysis |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4890065A (en) | 1987-03-26 | 1989-12-26 | Howe Technologies Corporation | Relative time delay correction system utilizing window of zero correction |
JPH0522798A (en) * | 1991-07-10 | 1993-01-29 | Toshiba Corp | Phase correcting device |
JP2973764B2 (en) * | 1992-04-03 | 1999-11-08 | ヤマハ株式会社 | Sound image localization control device |
JP2893563B2 (en) * | 1992-12-11 | 1999-05-24 | 松下電器産業株式会社 | Sound image localization coefficient calculator |
JP2900985B2 (en) * | 1994-05-31 | 1999-06-02 | 日本ビクター株式会社 | Headphone playback device |
JP3276528B2 (en) * | 1994-08-24 | 2002-04-22 | シャープ株式会社 | Sound image enlargement device |
US5796844A (en) * | 1996-07-19 | 1998-08-18 | Lexicon | Multichannel active matrix sound reproduction with maximum lateral separation |
JPH10126898A (en) * | 1996-10-22 | 1998-05-15 | Kawai Musical Instr Mfg Co Ltd | Device and method for localizing sound image |
JP4463905B2 (en) * | 1999-09-28 | 2010-05-19 | 隆行 荒井 | Voice processing method, apparatus and loudspeaker system |
JP4021124B2 (en) * | 2000-05-30 | 2007-12-12 | 株式会社リコー | Digital acoustic signal encoding apparatus, method and recording medium |
US8498422B2 (en) * | 2002-04-22 | 2013-07-30 | Koninklijke Philips N.V. | Parametric multi-channel audio representation |
CN101093661B (en) * | 2006-06-23 | 2011-04-13 | 凌阳科技股份有限公司 | Pitch tracking and playing method and system |
RU2551797C2 (en) * | 2006-09-29 | 2015-05-27 | ЭлДжи ЭЛЕКТРОНИКС ИНК. | Method and device for encoding and decoding object-oriented audio signals |
2008
- 2008-09-04 US US12/204,471 patent/US8233629B2/en active Active

2009
- 2009-08-14 KR KR1020117007537A patent/KR101636592B1/en active IP Right Grant
- 2009-08-14 CN CN200980134440.3A patent/CN102144405B/en not_active Expired - Fee Related
- 2009-08-14 EP EP09811797.1A patent/EP2321977B1/en not_active Not-in-force
- 2009-08-14 WO PCT/US2009/004673 patent/WO2010027403A1/en active Application Filing
- 2009-08-14 JP JP2011526031A patent/JP5662318B2/en not_active Expired - Fee Related
- 2009-08-20 TW TW098128032A patent/TWI533718B/en not_active IP Right Cessation

2011
- 2011-10-03 HK HK11110410.8A patent/HK1156171A1/en not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
KR20110063807A (en) | 2011-06-14 |
WO2010027403A8 (en) | 2011-01-06 |
KR101636592B1 (en) | 2016-07-05 |
US20100054482A1 (en) | 2010-03-04 |
EP2321977A4 (en) | 2013-10-09 |
JP2012502550A (en) | 2012-01-26 |
EP2321977A1 (en) | 2011-05-18 |
JP5662318B2 (en) | 2015-01-28 |
EP2321977B1 (en) | 2017-10-04 |
HK1156171A1 (en) | 2012-06-01 |
US8233629B2 (en) | 2012-07-31 |
WO2010027403A1 (en) | 2010-03-11 |
CN102144405A (en) | 2011-08-03 |
TW201014372A (en) | 2010-04-01 |
TWI533718B (en) | 2016-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102144405B (en) | Interaural time delay restoration system and method | |
KR101210797B1 (en) | audio spatial environment engine | |
EP1787495B1 (en) | Combining audio signals using auditory scene analysis | |
US7853022B2 (en) | Audio spatial environment engine | |
KR101870058B1 (en) | Generating binaural audio in response to multi-channel audio using at least one feedback delay network | |
ES2399058T3 (en) | Apparatus and procedure for generating a multi-channel synthesizer control signal and apparatus and procedure for synthesizing multiple channels | |
EP2946572B1 (en) | Binaural audio processing | |
US20060106620A1 (en) | Audio spatial environment down-mixer | |
US8363847B2 (en) | Device and method for simulation of WFS systems and compensation of sound-influencing properties | |
US20070223740A1 (en) | Audio spatial environment engine using a single fine structure | |
EP2899997A1 (en) | Sound system calibration | |
US20060093164A1 (en) | Audio spatial environment engine | |
US9913036B2 (en) | Apparatus and method and computer program for generating a stereo output signal for providing additional output channels | |
EP2961088A1 (en) | System and method for blending multi-channel signals | |
EP3386126A1 (en) | Audio processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1156171; Country of ref document: HK ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1156171; Country of ref document: HK ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20141231; Termination date: 20200814 ||
CF01 | Termination of patent right due to non-payment of annual fee |