CN102144405B - Interaural time delay restoration system and method - Google Patents
- Publication number
- CN102144405B CN102144405B CN200980134440.3A CN200980134440A CN102144405B CN 102144405 B CN102144405 B CN 102144405B CN 200980134440 A CN200980134440 A CN 200980134440A CN 102144405 B CN102144405 B CN 102144405B
- Authority
- CN
- China
- Prior art keywords
- audio data
- channel audio
- correction coefficient
- interaural
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/002—Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/07—Synergistic effects of band splitting and sub-band processing
Abstract
An apparatus for processing audio data, comprising an interaural time delay correction factor unit for receiving a plurality of channels of audio data and generating an interaural time delay correction factor, and an interaural time delay correction factor insertion unit for modifying the plurality of channels of audio data as a function of the interaural time delay correction factor.
Description
Technical field
The present invention relates to systems for processing audio data, and more specifically to a system and method for restoring the interaural time delay of stereo or other multichannel audio data.
Background technology
When audio data are processed to generate a composite sound, such audio data are usually combined with a mixer that uses panning potentiometers, or with another system or device that simulates the panning potentiometer function. A panning potentiometer can be used to distribute a single input channel to two or more output channels, such as left and right stereo outputs, so as to simulate a spatial position between the leftmost and rightmost positions relative to the listener. Such a panning potentiometer, however, typically cannot add the interaural time difference that is normally present in a live performance.
Summary of the invention
In accordance with the present invention, a system and method for interaural time delay restoration are provided that add, between two or more channels of audio data, a time delay corresponding to an estimated interaural delay based on the relative amplitudes of those channels.
In accordance with an exemplary embodiment of the present invention, an apparatus for processing audio data is provided. The apparatus includes an interaural time delay correction coefficient unit for receiving a plurality of channels of audio data and generating an interaural time delay correction coefficient, such as when the channels contain panned audio data without an associated interaural time delay. An interaural time delay correction coefficient insertion unit modifies the channels of audio data as a function of the interaural time delay correction coefficient, adding the estimated interaural time delay to improve audio quality.
Those skilled in the art will further appreciate the advantages and superior features of the invention, together with other important aspects, upon reading the detailed description that follows in conjunction with the drawings.
Accompanying drawing explanation
Fig. 1 is a diagram of a system for interaural time adjustment in accordance with an exemplary embodiment of the present invention;
Fig. 2 is a diagram of a system for detecting the peak difference between the left channel audio data and the right channel audio data of a given frequency band in accordance with an exemplary embodiment of the present invention;
Fig. 3 is a diagram of a system for smoothing interaural time and level differences in accordance with an exemplary embodiment of the present invention;
Fig. 4 is a diagram of a method for processing audio data to introduce interaural time or level differences in accordance with an exemplary embodiment of the present invention;
Fig. 5 is a diagram of a system for interaural time delay correction in accordance with an exemplary embodiment of the present invention; and
Fig. 6 is a flow chart of a method for controlling an interaural time delay associated with a pan control setting in accordance with an exemplary embodiment of the present invention.
Embodiment
In the following description, like parts are marked throughout the specification and drawings with the same reference numerals. The drawing figures may not be to scale, and certain components may be shown in generalized or schematic form and identified by commercial designations in the interest of clarity and conciseness.
Fig. 1 is a diagram of a system 100 for interaural time adjustment in accordance with an exemplary embodiment of the present invention. System 100 can be implemented in hardware, software, or a suitable combination of hardware and software, and can be one or more software systems operating on a digital signal processing platform. As used herein, "hardware" can include a combination of discrete components, an integrated circuit, an application-specific integrated circuit, a field programmable gate array, or other suitable hardware. As used herein, "software" can include one or more objects, agents, threads, lines of code, subroutines, separate software applications, two or more lines of code or other suitable software structures operating in two or more software applications or on two or more processors, or other suitable software structures. In one exemplary embodiment, software can include one or more lines of code or other suitable software structures operating in a general purpose software application, such as an operating system, and one or more lines of code or other suitable software structures operating in a specific purpose software application.
System 100 includes low delay filter banks 102 and 104, which receive a left channel audio time signal and a right channel audio time signal, respectively. In one exemplary embodiment, low delay filter banks 102 and 104 can receive a series of samples of audio data at a sampling frequency and can process the sampled audio data based on a predetermined number of samples. Low delay filter banks 102 and 104 are used to determine the time delay between the peak amplitudes of a plurality of frequency bands over a period of time. In one exemplary embodiment, the number of frequency bands can be related to the number of Barks, equivalent rectangular bandwidths (ERBs), or other suitable psychoacoustic bands of the audio data, such that the total number of outputs of low delay filter banks 102 and 104 equals the number of Barks or ERBs for each input sample. Likewise, oversampling can be used to reduce the likelihood of audio artifacts, such as by using a plurality of filters, one for each of a plurality of sub-bands of each frequency band (thus producing a plurality of sub-bands for each associated frequency band), or in other suitable manners.
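The paragraph above ties the filter bank's band count to the number of ERBs. As an illustrative sketch (not the patent's own implementation), the number of ERB-rate bands up to the Nyquist frequency can be estimated with the well-known Glasberg and Moore ERB-number formula:

```python
import math

def erb_band_count(sample_rate_hz: float) -> int:
    """Approximate number of ERB-rate bands from 0 Hz up to Nyquist,
    using the Glasberg & Moore (1990) ERB-number scale:
        ERBS(f) = 21.4 * log10(1 + 0.00437 * f)
    """
    nyquist = sample_rate_hz / 2.0
    return int(round(21.4 * math.log10(1.0 + 0.00437 * nyquist)))
```

At a 48 kHz sampling rate this gives roughly 43 bands; with the three-times oversampling mentioned later in the text, the filter bank output count N would be three times this band count M.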
Channel delay detector 106 receives the outputs of low delay filter banks 102 and 104 and determines a difference correction coefficient for each of the plurality of frequency bands. In one exemplary embodiment, channel delay detector 106 can generate an amount of phase difference to be added to the frequency domain signals so as to create a time difference, such as between the left and right channels, thereby inserting an interaural time delay into a signal that has been panned but does not include the associated time delay. In one exemplary embodiment, the audio data can be mixed using a panning potentiometer so that an input channel has an apparent spatial location between the far-left and far-right channels of stereo data, or other suitable processes can be used, including where more than two channels are present. Although such panning can be used to simulate spatial location, movement, or other effects, it cannot recreate the interaural time delays associated with live audio data. For example, when a sound source is located to the left of a listener, a time delay exists between the time at which the listener's left ear receives the audio signal from the source and the time at which the listener's right ear receives it. Likewise, as the sound source moves from the listener's left to the listener's right, the associated time delay decreases to zero when the source is directly in front of the listener and then increases relative to the right ear. Simulating spatial location or movement with a simple panning potentiometer does not create these associated time delays, but such time delays can be modeled using channel delay detector 106 and inserted into the stereo or other multichannel audio signal.
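The source-position behavior described above — delay shrinking to zero directly ahead and growing toward either side — matches the classic Woodworth spherical-head approximation. The patent does not specify this model; the following is only an illustrative sketch of how such a delay could be computed, with an assumed head radius:

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius (not from the patent)
SPEED_OF_SOUND = 343.0   # m/s at roughly room temperature

def woodworth_itd(azimuth_rad: float) -> float:
    """Interaural time delay for a distant source at the given azimuth
    (0 = straight ahead, pi/2 = fully to one side), per the classic
    Woodworth approximation: ITD = (a / c) * (theta + sin(theta))."""
    theta = abs(azimuth_rad)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))
```

For a source fully to one side this yields about 0.66 ms, the familiar upper bound on human interaural delay.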
Likewise, channel delay detector 106 can also be used to correct interaural level differences, such as where a time delay exists between the left and right channels but the associated amplitude difference is absent. For example, audio processing may change the levels associated with a panned audio signal, such that an accurate recording of an audio signal having an associated time delay between the left and right channels may nonetheless have left and right channel levels that do not reflect the live audio signal. Channel delay detector 106 can also or alternatively be used to model the associated level correction coefficients and insert them into the stereo or other multichannel audio signal.
Channel delay detector 106 outputs a number M of correction coefficients, which are used to insert interaural time or level differences into the plurality of channels of audio data. The number of correction coefficients can be less than the number N of outputs of low delay filter banks 102 and 104, which use oversampling to smooth variations within a perceptual band. In one exemplary embodiment, the perceptual bands are sampled at three times their bandwidth, and N equals three times M.
System 100 includes delays 108 and 110, which receive the left and right channel audio signals and delay them by an amount corresponding to the delay through low delay filter banks 102 and 104 and channel delay detector 106, minus the delay produced by zero-padded Hann windows 112 and 114 and fast Fourier transformers 116 and 118.
Zero-padded Hann windows 112 and 114 modify the time-varying left and right channel audio signals to produce Hann-windowed signals. The zero-padded Hann windows can be used to prevent discontinuities in the processed signals, which could otherwise produce phase shift variations that create audio artifacts in the processed audio data. Other types of Hann windows or other suitable processes can also or alternatively be used to prevent discontinuities.
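Hann windowing cooperates with the overlap-add units described later: a periodic Hann window at 50% overlap satisfies the constant-overlap-add (COLA) property, so the overlap-added frames sum back to the original signal. A minimal numeric check of that property, with assumed frame and hop sizes:

```python
import numpy as np

N, hop = 1024, 512                                 # frame length, 50% overlap
n = np.arange(N)
w = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / N))      # periodic Hann window

# Sum shifted copies of the window at the hop interval; for a periodic
# Hann window at 50% overlap the sum is the constant 1.0 (the COLA
# property), so overlap-added frames reconstruct the input exactly.
cover = np.zeros(4 * N)
for start in range(0, len(cover) - N + 1, hop):
    cover[start:start + N] += w

middle = cover[N:-N]   # ignore the partially covered edges
```

Every fully covered sample in `middle` equals 1.0, confirming that windowing plus overlap-add is transparent when no spectral modification is made.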
Fast Fourier transformers 116 and 118 convert the time domain left and right channel audio data into frequency domain data. In one exemplary embodiment, fast Fourier transformers 116 and 118 receive a predetermined number of time samples of the time domain signal (as modified by zero-padded Hann windows 112 and 114 to increase the number of samples) and generate a corresponding number of frequency components of that signal.
Phase shift insertion 120 receives the fast Fourier transform data from fast Fourier transformers 116 and 118 and inserts a phase shift into the signals based on the correction coefficients received from channel delay detector 106, such as by modifying the real and imaginary parts of the Fourier transform data of a single frequency bin or group of bins without modifying the associated magnitude of each bin or group of bins. In one exemplary embodiment, the phase shift can be related to the angular difference between the channels determined by channel delay detector 106, such that the phase of the leading channel is advanced by half of the angular difference and the phase of the lagging channel is retarded by half of the angular difference.
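The key idea above — delaying a signal by rotating bin phases while leaving magnitudes untouched — can be sketched as follows. This is an illustrative, assumed implementation (a simple linear phase ramp over a periodic frame), not the patent's circuit; in system 100 the rotation would be split, half advancing one channel and half retarding the other:

```python
import numpy as np

def delay_via_phase(x: np.ndarray, delay_samples: float) -> np.ndarray:
    """Delay a (periodic) frame by rotating the phase of each FFT bin;
    magnitudes are untouched. An integer delay reproduces a circular
    shift, and fractional delays are interpolated band-limitedly."""
    N = len(x)
    k = np.fft.fftfreq(N) * N                       # signed bin indices
    X = np.fft.fft(x)
    X *= np.exp(-2j * np.pi * k * delay_samples / N)
    return np.fft.ifft(X).real

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
shifted = delay_via_phase(x, 3)                     # equals np.roll(x, 3)
```

Because only phases change, the per-band magnitudes (and hence perceived level) are preserved, which is exactly why the patent inserts ITD this way.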
Inverse fast Fourier transformers 122 and 124 receive the phase-shifted frequency domain signals from phase shift insertion 120 and perform an inverse fast Fourier transform to generate time-varying signals. The separate left and right channel time-varying signals are then provided to overlap-add units 126 and 128, which perform an overlap-add operation on the signals to account for the processing of zero-padded Hann windows 112 and 114. Overlap-add units 126 and 128 output signals to shift-add registers 130 and 132, which output the shifted time signals L_idc(t) and R_idc(t).
In operation, system 100 allows a signal containing panning without an associated interaural time difference to be compensated so that the interaural time difference is inserted. System 100 thereby restores the interaural time differences ordinarily found in audio signals, improving audio quality.
Fig. 2 is a diagram of a system 200 for detecting the peak difference between the left and right channel audio data of a given frequency band in accordance with an exemplary embodiment of the present invention. System 200 can be used to detect the peaks of the left channel data and right channel data for each individual frequency band of the audio data, and to generate a correction coefficient for each band.
System 200 includes Hilbert envelope units 202 and 204, which receive the left and right time domain signals and generate the Hilbert envelopes of predetermined frequency bands of those signals. In one exemplary embodiment, Hilbert envelope units 202 and 204 can operate on a smaller number of time domain samples than are processed by fast Fourier transformers 116 and 118 of system 100, allowing system 200 to generate correction coefficients quickly and avoiding the additional delay that could result from transforming the time domain channel data into the frequency domain in order to generate the associated correction coefficients.
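A Hilbert envelope is the magnitude of the analytic signal, which can be formed by zeroing the negative-frequency half of the spectrum. The sketch below is a standard FFT construction with assumed test-signal parameters, shown only to illustrate how an envelope peak time (as used by peak detectors 206 and 208) is recovered:

```python
import numpy as np

def analytic_envelope(x: np.ndarray) -> np.ndarray:
    """Hilbert envelope via the FFT: keep DC and Nyquist, double the
    positive-frequency bins, zero the negative ones, then take the
    magnitude of the resulting analytic signal (length assumed even)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0
    h[1:N // 2] = 2.0
    h[N // 2] = 1.0
    return np.abs(np.fft.ifft(X * h))

# A 1 kHz tone under a Gaussian amplitude bump peaking at 50 ms: the
# envelope's peak time should recover the modulator's peak time.
fs = 8000
t = np.arange(int(0.1 * fs)) / fs
modulator = np.exp(-0.5 * ((t - 0.05) / 0.01) ** 2)
x = modulator * np.sin(2 * np.pi * 1000.0 * t)
peak_time = t[np.argmax(analytic_envelope(x))]
```

Comparing such per-band peak times between the left and right channels gives the time data that amplitude and time difference detector 210 consumes.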
Peak detectors 206 and 208 receive the left and right channel Hilbert envelopes, respectively, and determine the peak amplitude of each signal and the time associated with that peak amplitude. The peak and time data are then provided to amplitude and time difference detector 210, which determines whether a time difference exists between the corresponding peak amplitudes. If amplitude and time difference detector 210 determines that no corresponding difference exists between the peak amplitude times, interaural time difference correction 214 can be used to determine, by comparing the left and right channel peak amplitude values, a correction coefficient angle T_COR to be inserted into the frequency domain audio data. In one exemplary embodiment, the correction coefficient angle T_COR can be determined as the angle atan2(left channel amplitude, right channel amplitude) minus 45 degrees. Likewise, other suitable processes can be used to determine the correction coefficient angle. A suitable threshold can also be applied, so that the correction coefficient angle is generated when only a small time difference exists between the peak amplitude values.
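The atan2-minus-45-degrees formula stated above can be sketched directly; the function name is illustrative, but the formula is the one given in the text. Equal amplitudes (a center pan) give zero correction, and a hard pan to either side gives plus or minus 45 degrees:

```python
import math

def itd_correction_angle_deg(left_amp: float, right_amp: float) -> float:
    """Correction coefficient angle T_COR as described in the text:
    atan2(left amplitude, right amplitude) minus 45 degrees, so a
    centered image (equal amplitudes) yields zero correction."""
    return math.degrees(math.atan2(left_amp, right_amp)) - 45.0
```

Using atan2 rather than a simple ratio keeps the angle well defined even when one channel's amplitude is zero.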
Where a time difference exists between the peaks of the left and right channel data but the amplitudes are equal, interaural level difference correction 212 can be used. In this exemplary embodiment, the amplitudes can be adjusted by a correction coefficient L_COR, so that the channel with the leading audio peak receives a higher value and the channel with the lagging audio peak receives a lower value, such as by subtracting L_COR from the lagging channel, by adding 0.5*L_COR to the leading channel and subtracting 0.5*L_COR from the lagging channel, or in other suitable manners. Interaural level difference correction 212 can also use a threshold, so that the level correction is applied when the time difference is above the threshold value and is not applied when the time difference is below that value.
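The symmetric option mentioned above — half of L_COR added to the leading channel, half subtracted from the lagging one — can be sketched as follows (function name and linear-level representation are assumptions for illustration):

```python
def apply_level_correction(lead_level: float, lag_level: float,
                           l_cor: float) -> tuple[float, float]:
    """Split the level correction L_COR symmetrically, as one of the
    options in the text: raise the leading channel by half of L_COR and
    lower the lagging channel by half, leaving their sum unchanged."""
    return lead_level + 0.5 * l_cor, lag_level - 0.5 * l_cor
```

Splitting the correction this way preserves the combined level of the pair, so the perceived loudness of the image is not disturbed while the interaural level cue is restored.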
In operation, system 200 can be used to generate time and level difference correction coefficients for the left and right signals, generating an interaural time difference correction coefficient for a signal that has a left or right pan but no associated time difference, and generating a level correction for a signal that has an interaural time difference but lacks the associated pan amplitude.
Fig. 3 is a diagram of a system 300 for smoothing interaural time and level differences in accordance with an exemplary embodiment of the present invention. System 300 includes interaural time and level difference correction units 302 through 306, each of which generates interaural time and/or level difference correction coefficients for a different frequency band. In one exemplary embodiment, the frequency bands can be portions of a Bark, ERB, or other suitable psychoacoustic band, such that system 300 can be used to generate a single correction coefficient for the psychoacoustic band based on its sub-components.
Time smoothing units 308 through 312 are used to time-smooth the outputs of interaural time and level difference correction units 302 through 306, respectively. In one exemplary embodiment, time smoothing units 308 through 312 can receive the sequence of outputs from interaural time and level difference correction units 302 through 306 and can store a predetermined number of samples of that sequence, so that changes between successive samples can be averaged or otherwise smoothed.
Band smoothing unit 314 receives the interaural time or level difference correction coefficients from interaural time and level difference correction units 302 through 306 and smooths those coefficients. In one exemplary embodiment, where a Bark or ERB band has been divided into three, band smoothing unit 314 can average the three frequency correction coefficients associated with that band, can determine a weighted average, can smooth the coefficients over time, or can perform other suitable smoothing. Band smoothing unit 314 generates a single phase correction coefficient for each frequency band.
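The plain-average option described above, collapsing N = 3M sub-band coefficients into M per-band coefficients, can be sketched in a few lines (the function name and array layout are assumptions; a weighted average or time smoothing could be substituted as the text notes):

```python
import numpy as np

def smooth_bands(sub_coeffs: np.ndarray, per_band: int = 3) -> np.ndarray:
    """Collapse N = per_band * M sub-band correction coefficients into
    one coefficient per perceptual band by plain averaging. Assumes the
    coefficients are ordered band by band (sub-bands contiguous)."""
    return sub_coeffs.reshape(-1, per_band).mean(axis=1)
```

For example, six sub-band coefficients [1, 2, 3, 4, 5, 6] collapse to the two per-band values [2, 5].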
In operation, system 300 smooths the interaural time or level difference correction coefficients over time, over frequency, over both, or on other suitable bases, where those coefficients are generated by analyzing the left and right channel audio data to detect pan settings without an associated level or time difference. System 300 thereby helps avoid audio artifacts by ensuring that the interaural time or level difference correction coefficients do not change rapidly.
Fig. 4 is a diagram of a method 400 for processing audio data to introduce interaural time or level differences in accordance with an exemplary embodiment of the present invention. Method 400 begins at 402, where left and right amplitude envelopes are determined. In one exemplary embodiment, a Hilbert envelope detector or other suitable system can be used to determine the amplitude of a peak of a frequency band, the time associated with that peak, and other suitable data. The method then proceeds to 404.
At 404, the peaks of the amplitude envelopes and the times associated with the peaks are detected. In one exemplary embodiment, a simple peak detector, such as a magnitude detector, can be used to detect the associated time interval at which a peak occurs. The method then proceeds to 406.
At 406, it is determined whether a time difference exists between the peaks of the left and right channel data. In one exemplary embodiment, the time difference determination can include an associated buffer, so that if the time between peaks is less than a predetermined amount, it is determined that no time difference exists. If it is determined that a time difference exists, such that interaural time delay restoration is not needed, the method proceeds to 408, where it is determined whether a level difference exists between the amplitudes of the two signals. If it is determined that a level difference exists, the method proceeds to 410. Otherwise, the method proceeds to 412, where the level between the left and right channel audio data is corrected. In one exemplary embodiment, the leading channel amplitude can be left unchanged and the lagging channel amplitude can be reduced by a coefficient related to the difference between the leading and lagging channels, or other suitable processes can be used.
If it is determined that no time difference exists between the left and right channel peak amplitude values, the method proceeds to 414, where the level difference is converted into a phase correction angle. In one exemplary embodiment, the phase correction angle can be determined by subtracting 45 degrees from atan2(left channel amplitude, right channel amplitude), or other suitable relationships can be used. The method then proceeds to 416, where the phase difference is allocated to the left and right channels. In one exemplary embodiment, the allocation can be performed by splitting the phase difference equally, so that the channels are advanced and retarded by the same amount. In addition, a weighted split can be used where appropriate, or other suitable processes can be used. The method then proceeds to 418.
At 418, the difference between the left and right channel phase correction angles is smoothed. In one exemplary embodiment, the difference can be smoothed over time, smoothed based on the phase correction angles of adjacent channels, or smoothed in other suitable manners. The method then proceeds to 420.
At 420, the difference correction coefficients are applied to the audio signal. In one exemplary embodiment, the phase difference corresponding to the time difference can be added in the frequency domain, such as by using known methods for adding a time difference to, or subtracting one from, a time signal by adding or subtracting the associated phase shift in the frequency domain. Likewise, other suitable processes can be used.
In operation, method 400 allows interaural phase or amplitude correction coefficients to be determined and applied to a plurality of channels of audio data. Although two exemplary channels are shown, additional channels of audio data can also be processed where appropriate, such as by adding interaural phase or amplitude correction coefficients to the audio data of a 5.1 sound system, a 7.1 sound system, or other suitable sound systems.
Fig. 5 is a diagram of a system 500 for interaural time delay correction in accordance with an exemplary embodiment of the present invention. System 500 allows interaural time delay to be compensated before mixing, so that the pan control output more accurately reflects the interaural time delay associated with a sound source produced at the corresponding physical location.
System 500 includes left channel variable delay 502, right channel variable delay 504, and pan control 506, each of which can be implemented in hardware, software, or a suitable combination of hardware and software, and which can be one or more software systems operating on a digital signal processing platform. Pan control 506 allows a user to select a pan setting for allocating a time-varying audio data input between a left channel signal and a right channel signal. In one exemplary embodiment, pan control 506 can include an associated time delay value for each of a plurality of pan settings between a virtual far-left position and a virtual far-right position. In this exemplary embodiment, pan control 506 can disable the variable delay controls when the far-left, center, or far-right position is selected, because no delay is needed for those settings. For pan control 506 settings between the far-left, center, and far-right positions, a delay value can be generated that corresponds to the interaural time delay produced by a sound source located at the associated position.
Pan control 506 can also include an active pan adjustment feature that allows a user to select live pan adjustment, such as when the user intends to sweep the pan from left to right or from right to left. In this exemplary embodiment, a time delay can be provided for the far-left or far-right pan control 506 settings, so that the user can adjust the pan of the audio input as the pan control 506 setting is moved away from the far-left or far-right setting without generating the audio artifacts that would otherwise occur if the time delay jumped from the zero delay used for a fixed far-left or far-right setting to the maximum delay value used for the adjacent pan control 506 settings.
Left channel variable delay 502 and right channel variable delay 504 can be implemented using the interaural time delay correction coefficient insertion unit of system 100, or in other suitable manners.
In operation, system 500 allows an interaural time delay to be added when an audio channel is panned between two output channels, such as a left channel and a right channel or other suitable channels. System 500 can disable the configured time delay when no time delay is needed.
Fig. 6 is a flow chart of a method 600 for controlling an interaural time delay associated with a pan control setting in accordance with an exemplary embodiment of the present invention. Method 600 begins at 602, where time domain audio channel data are received, such as for a user-selected channel. The method then proceeds to 604, where the pan control setting is detected. The pan control can be a potentiometer, a virtual pan control, or other suitable controls. The method then proceeds to 606.
At 606, it is determined whether a pan delay setting is needed. In one exemplary embodiment, the pan delay can be disabled for predetermined pan control positions, such as the far-left, far-right, or center positions. In another exemplary embodiment, a pan delay can be generated for the far-left or far-right positions, such as when the user has selected a pan control setting that allows the pan to be actively adjusted between the far-left and far-right positions, so as to avoid generating a time delay discontinuity when the pan control leaves the far-right or far-left position. If it is determined that no pan delay is needed, the method proceeds to 612; otherwise, the method proceeds to 608.
At 608, the delay amount is computed based on the pan control setting. In one exemplary embodiment, a maximum time delay can be generated when the pan control is at the far-left or far-right position, such as when live pan adjustment has been selected. Likewise, when a fixed pan setting has been selected, no time delay is needed for the far-left or far-right settings (because no associated signal is generated for the opposite channel). For pan control settings between the far-right and far-left positions, a time delay corresponding to the delay for the intermediate position is calculated, and this time delay decreases as the pan control position approaches center. The method then proceeds to 610.
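The mapping just described can be sketched as a small function. The linear pan-to-delay mapping and the 0.66 ms maximum are assumptions for illustration (the patent specifies neither); what the sketch preserves is the stated behavior: no delay for fixed extreme or center settings, full delay at the extremes under active adjustment, and a delay that shrinks toward center:

```python
def pan_delay_ms(pan: float, max_delay_ms: float = 0.66,
                 active_adjust: bool = False) -> float:
    """Map a pan position in [-1, +1] (hard left .. hard right) to a
    delay amount. Fixed hard-left/right or center settings need no
    delay; with active pan adjustment the extremes keep the full delay
    so the value does not jump when the pan moves inward."""
    if not active_adjust and abs(pan) in (0.0, 1.0):
        return 0.0                      # fixed extreme/center: delay disabled
    return abs(pan) * max_delay_ms      # shrinks to zero toward center
```

Keeping the extremes at the full delay in active mode is what removes the zero-to-maximum jump the text warns about when the pan first leaves a hard setting.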
At 610, the calculated delay is applied to one or more variable delays. In one exemplary embodiment, the delay can be added to one of the left or right channels, or other suitable delay settings can be used. In another exemplary embodiment, the delay can be added using the interaural time delay correction coefficient insertion unit of system 100, or in other suitable manners. The method then proceeds to 612.
At 612, it is determined whether additional audio channel data need processing, such as by determining whether additional data samples are present in a data buffer, or in other suitable manners. If additional data processing is needed, the method returns to 602; otherwise, the method proceeds to 614 and terminates.
In operation, method 600 allows an interaural time delay to be generated based on a pan control setting. Method 600 thereby allows a pan control to simulate a sound position in a manner closer to the actual sound source position than simple panning between left and right channels without time adjustment.
Although exemplary embodiments of the system and method of the present invention have been described herein, those skilled in the art will also appreciate that various substitutions and modifications can be made to the system and method without departing from the scope and spirit of the appended claims.
Claims (13)
1. An apparatus for processing audio data, comprising:
a pan control unit for distributing audio data to left channel audio data and right channel audio data;
an interaural time delay correction coefficient unit for receiving the left channel audio data and the right channel audio data, and for generating an interaural time delay correction coefficient based on the distribution of the audio data to the left channel audio data and the right channel audio data by the pan control unit, wherein the interaural time delay correction coefficient unit comprises:
a time difference detector for receiving a peak amplitude value and an associated time for each of the left channel audio data and the right channel audio data of a predetermined frequency band, and for generating interaural difference correction data;
an interaural time difference correction unit for receiving the interaural difference correction data and generating a time correction coefficient for an interaural time delay correction coefficient insertion unit; and
an interaural level difference correction unit for generating a level correction coefficient for the interaural time delay correction coefficient insertion unit;
the interaural time delay correction coefficient insertion unit for modifying the left channel audio data and the right channel audio data as a function of the interaural time delay correction coefficient, to generate modified left channel audio data and modified right channel audio data;
a plurality of time smoothing units for time-smoothing the outputs of the interaural time difference correction unit and the interaural level difference correction unit, respectively; and
a band smoothing unit for smoothing the interaural time difference correction coefficients and the interaural level difference correction coefficients.
2. The apparatus of claim 1, wherein the interaural time delay correction coefficient unit comprises a low-latency filter bank for receiving one of the left channel audio data and the right channel audio data, and generating an amplitude envelope as a function of time for a predetermined frequency band.
3. The apparatus of claim 1, wherein the interaural time delay correction coefficient unit comprises a peak detector for receiving one of the left channel audio data and the right channel audio data, and generating a peak amplitude value and an associated time for a predetermined frequency band.
4. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a delay unit for delaying one of the left channel audio data and the right channel audio data by an amount related to the delay of the interaural time delay correction coefficient unit.
5. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a Hann window unit for receiving one of the left channel audio data and the right channel audio data, and applying a Hann window to the received channel audio data.
6. The apparatus of claim 1, wherein the interaural time delay correction coefficient insertion unit comprises a phase shift insertion unit for inserting a phase shift into a plurality of frequency-domain channel audio signals.
7. A method for processing audio data, comprising:
distributing audio data to left channel audio data and right channel audio data by a sound image control unit;
determining a peak amplitude of each of the left channel audio data and the right channel audio data;
detecting a delay associated with the peak amplitudes; and
if the detected delay is less than a threshold, inserting a delay between the left channel audio data and the right channel audio data, comprising:
when an interaural time difference exists between the left channel audio data and the right channel audio data but no associated sound image amplitude exists, generating an interaural level difference correction coefficient using an interaural level difference correction unit, and when a left or right sound image exists between the left channel audio data and the right channel audio data but no associated time difference exists, generating an interaural time difference correction coefficient using an interaural time difference correction unit;
time-smoothing the outputs of the interaural time difference correction unit and the interaural level difference correction unit, respectively, using time smoothing units; and
smoothing the interaural time difference correction coefficient and the interaural level difference correction coefficient, respectively, using a frequency band smoothing unit.
8. The method of claim 7, wherein determining the amplitude envelope of each of the left channel audio data and the right channel audio data comprises: determining the amplitude envelope of a predetermined frequency band of each of the left channel audio data and the right channel audio data.
9. The method of claim 7, wherein determining the amplitude envelope of each of the left channel audio data and the right channel audio data comprises: processing a predetermined frequency band of each of the left channel audio data and the right channel audio data with a Hilbert envelope unit.
10. The method of claim 7, wherein detecting the delay associated with the peak of each amplitude envelope comprises: comparing the time associated with the peak amplitude of one channel with the time associated with the peak amplitude of a second channel.
11. The method of claim 7, further comprising generating the inserted delay based on the peak amplitudes.
12. The method of claim 7, further comprising generating the inserted delay based on the peak amplitudes, comprising generating the inserted delay by determining atan2(peak1, peak2) minus 45 degrees, where atan2 is a two-argument arctangent function returning an angle, peak1 is the value of a first peak amplitude, and peak2 is the value of a second peak amplitude.
13. The method of claim 7, wherein, if the detected delay is less than a threshold, inserting a delay between the left channel audio data and the right channel audio data comprises:
transforming the left channel audio data and the right channel audio data from the time domain to the frequency domain;
converting the inserted delay into a phase shift value;
adding a first portion of the phase shift value to the left channel audio data in the frequency domain; and
subtracting a second portion of the phase shift value from the right channel audio data in the frequency domain.
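The time difference detector of claim 1 (and the peak-time comparison of claim 10) can be illustrated with a minimal Python sketch. The function name is hypothetical, and the rectified envelope stands in for the per-band filter bank or Hilbert envelope of claims 2 and 9; this is an illustration of the peak-time idea, not the patented implementation.

```python
import numpy as np

def detect_itd(left, right, fs):
    """Estimate the interaural time difference by comparing the times
    of the peak amplitude in each channel (claim 10's comparison)."""
    # Rectification stands in for the band-limited amplitude envelope
    # (claims 2 and 9); the peak time per channel is the argmax.
    t_left = np.argmax(np.abs(left)) / fs   # time of left-channel peak
    t_right = np.argmax(np.abs(right)) / fs  # time of right-channel peak
    return t_left - t_right  # negative: left peak arrives earlier

# Synthetic test: a tone burst, with the right channel lagging by 0.5 ms.
fs = 48000
t = np.arange(fs // 10) / fs
burst = np.sin(2 * np.pi * 500 * t) * np.exp(-((t - 0.02) ** 2) / 1e-5)
left = burst
right = np.roll(burst, int(0.0005 * fs))
itd = detect_itd(left, right, fs)
print(round(itd * 1000, 2))  # prints -0.5 (left leads right by 0.5 ms)
```

A real detector would, per claim 1, run this per predetermined frequency band and feed the resulting time differences to the correction units.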
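Claim 12's rule maps the two peak amplitudes to a signed pan angle: atan2(peak1, peak2) is 45 degrees when the peaks are equal (a centered image), so subtracting 45 degrees yields zero there. A hedged Python sketch follows; the linear scaling of the angle to milliseconds and the 0.7 ms maximum ITD are assumptions for illustration, not stated in the claim.

```python
import math

MAX_ITD_MS = 0.7  # assumed maximum interaural delay; not from the claim

def inserted_delay_ms(peak1, peak2, max_itd_ms=MAX_ITD_MS):
    """Claim 12: derive the inserted delay from the two peak amplitudes.
    atan2(peak1, peak2) - 45 degrees gives a signed pan angle in
    [-45, +45] degrees for non-negative peak values."""
    angle_deg = math.degrees(math.atan2(peak1, peak2)) - 45.0
    # Linear mapping of pan angle to delay (an assumption).
    return max_itd_ms * angle_deg / 45.0

print(inserted_delay_ms(1.0, 1.0))  # centered image -> ~0 ms
print(inserted_delay_ms(1.0, 0.0))  # hard image one way -> ~+MAX_ITD_MS
```

The sign convention (which channel the positive delay is applied to) is left open here, as it is in the claim.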
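Claim 13's frequency-domain insertion can be sketched with NumPy's real FFT: a delay of d seconds is equivalent to a phase shift of 2*pi*f*d radians at frequency f. Splitting the shift 50/50 between the claim's "first portion" and "second portion" is an assumption for illustration.

```python
import numpy as np

def insert_delay_freq(left, right, delay_samples, fs):
    """Claim 13: transform both channels to the frequency domain,
    convert the inserted delay to a per-bin phase shift, add part of
    it to the left channel, and subtract part from the right."""
    n = len(left)
    L = np.fft.rfft(left)
    R = np.fft.rfft(right)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Phase shift corresponding to the full inserted delay at each bin.
    phase = 2 * np.pi * freqs * (delay_samples / fs)
    L *= np.exp(-1j * phase / 2)  # first portion: delay the left channel
    R *= np.exp(+1j * phase / 2)  # second portion: advance the right channel
    return np.fft.irfft(L, n), np.fft.irfft(R, n)

# Bin-aligned test tone so the circular phase shift is an exact sample shift.
fs, n = 1024, 1024
x = np.sin(2 * np.pi * 32 * np.arange(n) / fs)
l_out, r_out = insert_delay_freq(x, x, 8, fs)
print(np.allclose(l_out, np.roll(x, 4)))  # True: left delayed by 4 samples
```

The net effect is an 8-sample relative delay between the channels while each channel is shifted by only half that amount.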
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/204,471 | 2008-09-04 | ||
US12/204,471 US8233629B2 (en) | 2008-09-04 | 2008-09-04 | Interaural time delay restoration system and method |
PCT/US2009/004673 WO2010027403A1 (en) | 2008-09-04 | 2009-08-14 | Interaural time delay restoration system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102144405A CN102144405A (en) | 2011-08-03 |
CN102144405B true CN102144405B (en) | 2014-12-31 |
Family
ID=41725480
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200980134440.3A Expired - Fee Related CN102144405B (en) | 2008-09-04 | 2009-08-14 | Interaural time delay restoration system and method |
Country Status (8)
Country | Link |
---|---|
US (1) | US8233629B2 (en) |
EP (1) | EP2321977B1 (en) |
JP (1) | JP5662318B2 (en) |
KR (1) | KR101636592B1 (en) |
CN (1) | CN102144405B (en) |
HK (1) | HK1156171A1 (en) |
TW (1) | TWI533718B (en) |
WO (1) | WO2010027403A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8971551B2 (en) * | 2009-09-18 | 2015-03-03 | Dolby International Ab | Virtual bass synthesis using harmonic transposition |
WO2011029984A1 (en) * | 2009-09-11 | 2011-03-17 | Nokia Corporation | Method, apparatus and computer program product for audio coding |
US8571232B2 (en) * | 2009-09-11 | 2013-10-29 | Barry Stephen Goldfarb | Apparatus and method for a complete audio signal |
WO2011129655A2 (en) * | 2010-04-16 | 2011-10-20 | Jeong-Hun Seo | Method, apparatus, and program-containing medium for assessment of audio quality |
FR2966634A1 (en) * | 2010-10-22 | 2012-04-27 | France Telecom | ENHANCED STEREO PARAMETRIC ENCODING / DECODING FOR PHASE OPPOSITION CHANNELS |
CN103796150B (en) * | 2012-10-30 | 2017-02-15 | 华为技术有限公司 | Processing method, device and system of audio signals |
JP6216553B2 (en) * | 2013-06-27 | 2017-10-18 | クラリオン株式会社 | Propagation delay correction apparatus and propagation delay correction method |
WO2015035093A1 (en) | 2013-09-05 | 2015-03-12 | Daly George William | Systems and methods for acoustic processing of recorded sounds |
CN106999710B (en) | 2014-12-03 | 2020-03-20 | Med-El电气医疗器械有限公司 | Bilateral hearing implant matching of ILD based on measured ITD |
CN108877815B (en) | 2017-05-16 | 2021-02-23 | 华为技术有限公司 | Stereo signal processing method and device |
TWI689708B (en) * | 2018-12-24 | 2020-04-01 | 財團法人工業技術研究院 | Vibration sensor with monitoring function and vibration signal monitoring method thereof |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5136650A (en) * | 1991-01-09 | 1992-08-04 | Lexicon, Inc. | Sound reproduction |
US5652770A (en) * | 1992-09-21 | 1997-07-29 | Noise Cancellation Technologies, Inc. | Sampled-data filter with low delay |
US6424939B1 (en) * | 1997-07-14 | 2002-07-23 | Fraunhofer-Gesellschaft Zur Forderung Der Angewandten Forschung E.V. | Method for coding an audio signal |
CN1647157A (en) * | 2002-04-22 | 2005-07-27 | 皇家飞利浦电子股份有限公司 | Signal synthesizing |
US7027601B1 (en) * | 1999-09-28 | 2006-04-11 | At&T Corp. | Perceptual speaker directivity |
CN1810015A (en) * | 2003-03-10 | 2006-07-26 | 坦德伯格电信公司 | Echo canceller with reduced requirement for processing power |
CN101002505A (en) * | 2004-08-03 | 2007-07-18 | 杜比实验室特许公司 | Combining audio signals using auditory scene analysis |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4890065A (en) | 1987-03-26 | 1989-12-26 | Howe Technologies Corporation | Relative time delay correction system utilizing window of zero correction |
JPH0522798A (en) * | 1991-07-10 | 1993-01-29 | Toshiba Corp | Phase correcting device |
JP2973764B2 (en) * | 1992-04-03 | 1999-11-08 | ヤマハ株式会社 | Sound image localization control device |
JP2893563B2 (en) * | 1992-12-11 | 1999-05-24 | 松下電器産業株式会社 | Sound image localization coefficient calculator |
JP2900985B2 (en) * | 1994-05-31 | 1999-06-02 | 日本ビクター株式会社 | Headphone playback device |
JP3276528B2 (en) * | 1994-08-24 | 2002-04-22 | シャープ株式会社 | Sound image enlargement device |
US5796844A (en) * | 1996-07-19 | 1998-08-18 | Lexicon | Multichannel active matrix sound reproduction with maximum lateral separation |
JPH10126898A (en) * | 1996-10-22 | 1998-05-15 | Kawai Musical Instr Mfg Co Ltd | Device and method for localizing sound image |
JP4463905B2 (en) * | 1999-09-28 | 2010-05-19 | 隆行 荒井 | Voice processing method, apparatus and loudspeaker system |
JP4021124B2 (en) * | 2000-05-30 | 2007-12-12 | 株式会社リコー | Digital acoustic signal encoding apparatus, method and recording medium |
US8498422B2 (en) * | 2002-04-22 | 2013-07-30 | Koninklijke Philips N.V. | Parametric multi-channel audio representation |
CN101093661B (en) * | 2006-06-23 | 2011-04-13 | 凌阳科技股份有限公司 | Pitch tracking and playing method and system |
RU2551797C2 (en) * | 2006-09-29 | 2015-05-27 | ЭлДжи ЭЛЕКТРОНИКС ИНК. | Method and device for encoding and decoding object-oriented audio signals |
2008
- 2008-09-04 US US12/204,471 patent/US8233629B2/en active Active

2009
- 2009-08-14 KR KR1020117007537A patent/KR101636592B1/en active IP Right Grant
- 2009-08-14 CN CN200980134440.3A patent/CN102144405B/en not_active Expired - Fee Related
- 2009-08-14 EP EP09811797.1A patent/EP2321977B1/en not_active Not-in-force
- 2009-08-14 WO PCT/US2009/004673 patent/WO2010027403A1/en active Application Filing
- 2009-08-14 JP JP2011526031A patent/JP5662318B2/en not_active Expired - Fee Related
- 2009-08-20 TW TW098128032A patent/TWI533718B/en not_active IP Right Cessation

2011
- 2011-10-03 HK HK11110410.8A patent/HK1156171A1/en not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
KR20110063807A (en) | 2011-06-14 |
WO2010027403A8 (en) | 2011-01-06 |
KR101636592B1 (en) | 2016-07-05 |
US20100054482A1 (en) | 2010-03-04 |
EP2321977A4 (en) | 2013-10-09 |
JP2012502550A (en) | 2012-01-26 |
EP2321977A1 (en) | 2011-05-18 |
JP5662318B2 (en) | 2015-01-28 |
EP2321977B1 (en) | 2017-10-04 |
HK1156171A1 (en) | 2012-06-01 |
US8233629B2 (en) | 2012-07-31 |
WO2010027403A1 (en) | 2010-03-11 |
CN102144405A (en) | 2011-08-03 |
TW201014372A (en) | 2010-04-01 |
TWI533718B (en) | 2016-05-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102144405B (en) | Interaural time delay restoration system and method | |
KR101210797B1 (en) | audio spatial environment engine | |
EP1787495B1 (en) | Combining audio signals using auditory scene analysis | |
US7853022B2 (en) | Audio spatial environment engine | |
KR101870058B1 (en) | Generating binaural audio in response to multi-channel audio using at least one feedback delay network | |
ES2399058T3 (en) | Apparatus and procedure for generating a multi-channel synthesizer control signal and apparatus and procedure for synthesizing multiple channels | |
EP2946572B1 (en) | Binaural audio processing | |
US20060106620A1 (en) | Audio spatial environment down-mixer | |
US8363847B2 (en) | Device and method for simulation of WFS systems and compensation of sound-influencing properties | |
US20070223740A1 (en) | Audio spatial environment engine using a single fine structure | |
EP2899997A1 (en) | Sound system calibration | |
US20060093164A1 (en) | Audio spatial environment engine | |
US9913036B2 (en) | Apparatus and method and computer program for generating a stereo output signal for providing additional output channels | |
EP2961088A1 (en) | System and method for blending multi-channel signals | |
EP3386126A1 (en) | Audio processor |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1156171; Country of ref document: HK ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1156171; Country of ref document: HK ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20141231; Termination date: 20200814 ||
CF01 | Termination of patent right due to non-payment of annual fee |