US20090089054A1 - Apparatus and method of noise and echo reduction in multiple microphone audio systems - Google Patents

Apparatus and method of noise and echo reduction in multiple microphone audio systems

Info

Publication number
US20090089054A1
US20090089054A1 US11/864,906 US86490607A US2009089054A1 US 20090089054 A1 US20090089054 A1 US 20090089054A1 US 86490607 A US86490607 A US 86490607A US 2009089054 A1 US2009089054 A1 US 2009089054A1
Authority
US
United States
Prior art keywords
signal
noise
echo
speech
microphone
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US11/864,906
Other versions
US8175871B2 (en)
Inventor
Song Wang
Samir Kumar Gupta
Eddie L. T. Choy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Application filed by Qualcomm Inc
Priority to US11/864,906 (patent US8175871B2)
Assigned to QUALCOMM INCORPORATED: assignment of assignors' interest (see document for details). Assignors: GUPTA, SAMIR KUMAR; CHOY, EDDIE L. T.; WANG, SONG
Publication of US20090089054A1
Application granted
Publication of US8175871B2
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M9/00: Arrangements for interconnection not involving centralised switching
    • H04M9/08: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L2021/02082: Noise filtering the noise being echo, reverberation of the speech
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L21/0216: Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161: Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166: Microphone arrays; Beamforming

Definitions

  • the disclosure relates to the field of audio processing. More particularly, the disclosure relates to acoustic echo cancellation and noise reduction in multiple microphone audio systems.
  • Mobile speech communication can be conducted under various environments.
  • The microphones on the mobile device receive not only the desired speech, but also background noise. In many situations, background noise can be abundant, and it reduces the intelligibility of the desired speech. Acoustic echo is another problem in mobile speech communications. Not only does it reduce the intelligibility of the desired speech, it also distracts the far end talker and is very annoying. To improve intelligibility of desired speech, it is necessary to reduce background noise and acoustic echo without distorting the desired speech. Many echo and noise reduction methods have been developed.
  • noise suppression is achieved using only one microphone.
  • One such noise suppression method uses spectral subtraction to suppress background noise. The method assumes that the background noise is short-term stationary, i.e. the noise statistics do not change over a short period regardless of the activity of the desired speech. Noise statistics are estimated when the desired speech signal is absent, and the noise estimates are used to suppress noise in the signal regardless of the activity of the desired speech. Spectral subtraction estimates noise statistics and suppresses noise in the frequency domain. Each frequency bin is processed independently. This method is successful at reducing stationary noise. However, it is not capable of reducing non-stationary noise.
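  • A minimal sketch of the single-microphone spectral subtraction idea described above, operating on per-frame magnitude spectra that are assumed to be already computed. The function names, the smoothing factor `alpha`, the spectral floor `floor`, and the externally supplied `speech_present` flag are illustrative assumptions, not details taken from the disclosure.

```python
from typing import List


def update_noise_estimate(noise_mag: List[float], frame_mag: List[float],
                          speech_present: bool, alpha: float = 0.9) -> List[float]:
    """Recursively average the per-bin noise magnitude, but only while speech is absent."""
    if speech_present:
        # Freeze the estimate: the noise is assumed short-term stationary.
        return list(noise_mag)
    return [alpha * n + (1.0 - alpha) * x for n, x in zip(noise_mag, frame_mag)]


def spectral_subtract(frame_mag: List[float], noise_mag: List[float],
                      floor: float = 0.05) -> List[float]:
    """Subtract the noise estimate from each frequency bin independently.

    Each bin is clamped to a fraction of its input magnitude (assumed floor)
    so that the result never becomes negative.
    """
    return [max(x - n, floor * x) for x, n in zip(frame_mag, noise_mag)]


# Learn the noise magnitude from speech-absent frames, then clean a speech frame.
noise = [0.0, 0.0, 0.0, 0.0]
noise_only_frame = [1.0, 1.0, 1.0, 1.0]
for _ in range(200):
    noise = update_noise_estimate(noise, noise_only_frame, speech_present=False)

speech_frame = [1.0, 5.0, 1.0, 3.0]
cleaned = spectral_subtract(speech_frame, noise)
```

  • Because the noise estimate is frozen whenever speech is present, a sketch of this kind cannot follow noise that changes during speech, which is the limitation noted above for non-stationary noise.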
  • Another single-microphone noise reduction method uses a directional microphone.
  • uni-directional microphones are more expensive than omni-directional microphones.
  • uni-directional microphones may limit the way the speech communications devices are used since the mobile device may need to be placed properly to ensure its functionality.
  • Echo cancellation is typically achieved by de-correlating the microphone signal from the far end signal using adaptive filtering. Some aggressive echo cancellation algorithms reduce communication to half-duplex mode, in which only one user talks at a time. In mobile speech communication, background noise and acoustic echo reduce the intelligibility of desired speech. Therefore, it is desirable to reduce both background noise and echo without distorting the desired speech.
  • the apparatus and methods implement a variety of noise and echo reduction techniques and apparatus that can be selectively applied to signals received using multiple microphones.
  • the microphone signals received at each of the multiple microphones can be independently processed to cancel acoustic echo that can be generated due to acoustic or mechanical coupling.
  • the echo cancelled signals may be processed by some or all modules within a signal separator that operates to separate or otherwise isolate a speech signal from noise signals.
  • the signal separator can optionally include a pre-processing de-correlator followed by a blind source separator.
  • the output of the blind source separator can be post filtered to provide post separation de-correlation.
  • the separated speech and noise signals can be non-linearly processed for further noise reduction, and additional post processing can be implemented following the non-linear processing.
  • aspects of the invention include a method of noise reduction in multiple microphone communication devices.
  • the method includes receiving multiple microphone signals, de-correlating the multiple microphone signals, separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals, and performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • aspects of the invention include a method of noise reduction in multiple microphone communication devices.
  • the method includes receiving a first microphone signal, receiving a second microphone signal, performing echo cancellation on each of the first microphone signal and the second microphone signal, de-correlating the first microphone signal from the second microphone signal, separating a speech reference signal from a noise reference signal based on the first and second microphone signals, de-correlating a residual noise in the speech reference signal from the noise reference signal, and performing non-linear processing on at least the speech reference signal.
  • the apparatus includes a first echo canceller configured to cancel an echo in a first microphone signal to generate a first echo canceled microphone signal, a second echo canceller configured to cancel an echo in a second microphone signal to generate a second echo canceled microphone signal, a signal separator configured to receive the first and second echo canceled microphone signals and separate a speech signal component from a noise signal component to generate a speech reference signal and a noise reference signal, and a non-linear processing module configured to receive the speech reference signal and noise reference signal and perform non-linear processing on the speech reference signal.
  • aspects of the invention include an apparatus for noise reduction in multiple microphone systems.
  • the apparatus includes means for receiving multiple microphone signals, means for de-correlating the multiple microphone signals, means for separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals, and means for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • aspects of the invention include a processor readable media including instructions that may be utilized by one or more processors.
  • the instructions include instructions for de-correlating multiple received microphone signals, instructions for separating a speech signal component from a noise signal in at least one of the multiple received microphone signals to generate separated microphone signals, and instructions for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • FIG. 1 is a simplified functional block diagram of an environment having background noise and acoustic echo in speech communication and a noise suppressor and a typical echo canceller based on an adaptive filter.
  • FIG. 2 is a simplified functional block diagram of an embodiment of a two-microphone noise and echo reduction system.
  • FIGS. 3A-3B are simplified functional block diagrams of embodiments of non-linear processing modules implementing spectral subtraction.
  • FIG. 4 is a simplified functional block diagram of an embodiment of a speech post-processing module.
  • FIG. 5 is a simplified flowchart of an embodiment of a method of noise and echo reduction.
  • FIG. 6 is a simplified functional block diagram of an embodiment of a two-microphone noise and echo reduction system.
  • a two-microphone noise and echo reduction system uses two microphones to receive acoustic signals, such as speech signals. Each microphone receives a different mixture of desired speech, background noise and acoustic echo.
  • the noise suppression system uses echo cancellers to reduce acoustic echo in each of the microphone signals.
  • the signal after echo cancellation is fed to an enhanced Blind Source Separation (BSS) module, which substantially separates desired speech signal components from background noise and residual acoustic echo.
  • nonlinear noise and echo reduction is used to further reduce background noise and acoustic echo in the desired speech signal.
  • Post-processing is used to further reduce residual noise and echo.
  • FIG. 1 is a simplified functional block diagram of an embodiment of a reverberant noise environment 100 in which a communication device 110 operates.
  • the communication device 110 can be, for example, a mobile device, portable device, or stationary device.
  • the communication device 110 can be a mobile telephone, personal digital assistant, notebook computer, sound recorder, headsets, and the like or some other communication device that can receive and process audio signals and optionally output audio signals.
  • the communication device 110 illustrated in FIG. 1 includes multiple microphones 112 - 1 and 112 - 2 and at least one audio output device 130 .
  • the audio environment can include multiple noise and interference sources, e.g. 162 , and can include one or more near end speech sources 150 .
  • a single near end speech source 150 can be a user of the communication device 110 .
  • the speech source 150 is positioned in the near field of the microphones 112 - 1 and 112 - 2 .
  • a number of noise sources 162 , 164 , and 166 may generate signals incident on the microphones 112 - 1 and 112 - 2 .
  • the noise sources, 162 , 164 , and 166 may be positioned throughout the operating environment, as shown in FIG. 1 , or one or more noise sources may be positioned close together.
  • each of the noise sources 162 , 164 , and 166 is positioned in the far field of the microphones 112 - 1 and 112 - 2 .
  • the noise sources 162 , 164 , and 166 can be independent noise sources or can be related noise sources.
  • the speaker 130 local to the communication device 110 can originate one or more echo signals, 132 , 134 , and 136 .
  • An echo signal 132 may traverse substantially a direct path from the speaker 130 to the microphones 112 - 1 and 112 - 2 .
  • An echo signal may traverse a reflected path 134 , where the audio from the speaker 130 reflects off of a surface 170 .
  • the echo signal may also traverse a multiply reflected path 136 , where the audio from the speaker reflects off of multiple surfaces 170 prior to reaching the microphones 112 - 1 and 112 - 2 .
  • Although the signal path from each of the noise sources 162 , 164 , and 166 is depicted as a single path, the signal from each noise source 162 , 164 , and 166 may traverse multiple paths.
  • the signal incident on the microphones 112 - 1 and 112 - 2 may include multiple signals, including some signals that traverse multiple paths before arriving at the microphones 112 - 1 and 112 - 2 .
  • the position of the speech source 150 in the near field of the microphones 112 - 1 and 112 - 2 may permit its signal to be more prevalent at some of the microphones 112 - 1 or 112 - 2 .
  • the small physical size of typical mobile communication devices 110 may not permit isolation of the speech source 150 signal from a portion of the microphones 112 - 1 and 112 - 2 through physical placement alone in order to establish a noise reference signal.
  • the position of speaker 130 may cause its signal to be a near field signal, although one or more of the reflected signals may appear as far field signals.
  • the noise sources 162 , 164 , and 166 may be in the far field and their noise signal levels may be similar at all microphones 112 - 1 and 112 - 2 .
  • the communication device 110 utilizes a combination of echo cancellation and noise suppression to reduce the noise signals and echo signal from the speech signal.
  • the resultant speech signal can be coupled to one or more far end processors or outputs.
  • the microphones 112 - 1 and 112 - 2 couple the received signals to respective signal combiners 122 - 1 and 122 - 2 that operate as part of, or in conjunction with, adaptive filters 120 - 1 and 120 - 2 to cancel at least a predominant echo signal that originates from the speaker 130 .
  • the adaptive filter receives an input signal that is substantially the same as the signal coupled to the speaker 130 .
  • the output of the adaptive filters 120 - 1 and 120 - 2 may be coupled to a second input of the respective signal combiner 122 - 1 and 122 - 2 .
  • the signal combiners 122 - 1 and 122 - 2 can be configured as a summer or subtracter.
  • the signal combiners 122 - 1 and 122 - 2 sum the filtered signal or a negated filtered signal to the signal from the microphones 112 - 1 and 112 - 2 .
  • the adaptive filters 120 - 1 and 120 - 2 can be configured to converge on a set of tap weights that minimizes the echo signal component in the signal combiner 122 - 1 and 122 - 2 outputs.
  • the outputs from the signal combiners 122 - 1 and 122 - 2 can be fed back to the associated adaptive filter 120 - 1 or 120 - 2 and used to determine an error or metric related to minimizing the echo signals.
  • the outputs of the signal combiners 122 - 1 and 122 - 2 represent the echo canceled input signals.
  • the echo canceled input signals may be coupled to a noise and echo suppressor 140 .
  • the noise and echo suppressor 140 can be configured to reduce noise signals and echo signals from the speech signals and may perform suppression of the noise component in order to optimize or otherwise enhance the speech component. Embodiments illustrating details and operation of the noise and echo suppressor are described in association with FIG. 2 .
  • the speech signal output from the noise and echo suppressor 140 is coupled to one or more far end devices or modules (not shown) for further processing, output, or some combination thereof.
  • FIG. 2 is a simplified functional block diagram of an embodiment of communication device 110 implementing a two-microphone noise and echo reduction system.
  • Although the communication device 110 embodiment illustrates two microphones 112 - 1 and 112 - 2 , the noise suppression methods and apparatus can similarly operate on a greater number of microphones.
  • the communication device 110 includes two microphones 112 - 1 and 112 - 2 coupled to an input of a noise and echo reduction system 200 .
  • the noise and echo reduction system 200 is configured to remove echo signals from the received audio signals, separate the speech from the noise components, and further improve the speech signal by reducing the residual noise and echo.
  • the output of the noise and echo reduction system 200 is typically a speech reference signal, but can include a noise reference signal.
  • the output signals may be coupled to a back end signal processing module 280 , which can be, for example, a baseband signal processor of a wireless communication device.
  • the back end signal processing module 280 can be configured to couple some or all of the speech reference signal to an air interface 290 , which can be configured to process the speech signal to generate a signal in accordance with a media access control standard and a physical layer standard for wireless transmission over a link.
  • the communication device 110 may support duplex communication over the air interface 290 and may be configured to receive one or more communication signals that include speech signals for output by the communication device 110 .
  • the signals received by the air interface 290 may be coupled to the back end signal processing module 280 .
  • the back end signal processing module 280 processes the received signals to extract and condition the speech and audio signals in the received signals.
  • the back end signal processing module 280 couples the speech and audio portions to a volume control module 282 that can be configured, for example, to provide user configurable gain.
  • the volume control module 282 can also be configured to provide filtering.
  • the signal processing modules within the noise and echo reduction system 200 may be implemented as analog signal processing modules, digital signal processing modules, or a combination of analog and digital signal processing. Where a module performs digital signal processing, an Analog to Digital Converter (ADC) is implemented at some signal processing point prior to digital processing. Similarly, where analog signal processing occurs following a digital signal processing module, a Digital to Analog Converter (DAC) is used to convert digital signals to their analog representations.
  • the speaker 130 can include a DAC where the volume control module 282 outputs a digital signal.
  • the volume control module 282 couples the amplified and conditioned output audio signal to the input of the speaker and to at least one input of the noise and echo reduction system 200 .
  • the speaker 130 converts the output audio signal from an electrical signal to an audible signal.
  • the noise and echo reduction system 200 utilizes the output audio as an input to one or more echo cancellers 220 - 1 and 220 - 2 .
  • each of the microphones 112 - 1 and 112 - 2 may receive echo signals that are based on the signal output by the speaker 130 .
  • the acoustic echo reduces speech intelligibility and may also substantially hinder separation of the speech and noise signal components when the echo is strong.
  • the echo is substantially eliminated, canceled, or otherwise reduced before signal separation to prevent the acoustic echo from confusing speech separation portions of the noise and echo reduction system 200 .
  • a first microphone 112 - 1 couples its received signal to a first input of a first signal combiner 222 - 1 .
  • the first echo canceller 220 - 1 couples the echo cancellation signal to a second input of the first signal combiner 222 - 1 .
  • the second microphone 112 - 2 couples its received signal to a first input of a second signal combiner 222 - 2 .
  • the second echo canceller 220 - 2 couples the echo cancellation signal to a second input of the second signal combiner 222 - 2 .
  • One of the first or second echo cancellers 220 - 1 and 220 - 2 can be configured to couple its respective echo cancellation signal to an input of a nonlinear processing module 260 .
  • the first echo canceller 220 - 1 is configured to couple its echo cancellation signal to the nonlinear processing module 260 .
  • Each signal combiner 222 - 1 and 222 - 2 can negate the signal from the respective echo canceller 220 - 1 and 220 - 2 before summing with the corresponding microphone signal.
  • Each signal combiner 222 - 1 and 222 - 2 outputs an echo canceled signal.
  • the first signal combiner 222 - 1 couples the first echo canceled signal to a first input of a signal separator 230 and to a feedback input of the first echo canceller 220 - 1 .
  • the second signal combiner 222 - 2 couples the second echo canceled signal to a second input of the signal separator 230 and to a feedback input of the second echo canceller 220 - 2 .
  • each echo canceller 220 - 1 and 220 - 2 implements linear echo cancellation.
  • each echo canceller 220 - 1 and 220 - 2 can implement an adaptive filter. More particularly, each echo canceller 220 - 1 and 220 - 2 can use a normalized least mean square (NLMS) algorithm to minimize the echo signal component in the echo canceled signal.
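  • A minimal time domain sketch of one NLMS echo canceller of the kind described above, for a single microphone. The function name, the tap count, the step size `mu`, the regularization constant `eps`, and the synthetic three-tap echo path are illustrative assumptions, not values from the disclosure.

```python
import random
from typing import List, Tuple


def nlms_echo_cancel(far_end: List[float], mic: List[float],
                     num_taps: int = 8, mu: float = 0.5,
                     eps: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Return (echo canceled signal, final tap weights) for one microphone."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far end samples, newest first
    out = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Signal combiner: microphone signal minus the estimated echo.
        error = d - echo_estimate
        # Normalized LMS update: step size scaled by the input power.
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        out.append(error)  # fed back as the error and passed downstream
    return out, weights


# Synthetic check: the microphone hears only the far end signal through a
# short echo path (assumed), with no near end speech or noise.
random.seed(0)
echo_path = [0.6, 0.3, -0.2]
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
echo_cancelled, weights = nlms_echo_cancel(far_end, mic)
```

  • In this idealized case the tap weights converge toward the echo path and the echo canceled output decays toward zero. With near end speech present (double talk), the error signal would also contain speech, and a practical canceller would typically slow or freeze adaptation.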
  • The performance of echo cancellers, e.g. 220 - 1 and 220 - 2 , based on adaptive filters is limited by the linearity of the echo path, including the speaker and microphone and their related circuits, and by the reverberant environment. Echo cancellation performance is also limited by the length of the adaptive filter and the algorithm's capability to deal with echo path changes and double talk, in which both near end and far end talkers are speaking.
  • the echo cancellers 220 - 1 and 220 - 2 typically implement echo cancellation based on time domain processing of the microphone and speaker signals.
  • one or more of the echo cancellers 220 - 1 and 220 - 2 can implement frequency domain or subband domain processing for echo cancellation.
  • the signals from a microphone, e.g. 112 - 1 may be transformed to frequency domain or subband domain.
  • the echo canceller, e.g. 220 - 1 can implement an adaptive filter for each frequency bin or subband.
  • the echo canceller, e.g. 220 - 1 can adjust the tap weights of each adaptive filter to minimize the echo signal component in the output of each frequency bin or subband.
  • the signal separator 230 operates to generate a speech reference signal and a noise reference signal.
  • the signal separator 230 embodiment illustrated in FIG. 2 includes a pre-processing module 232 , a source separator 240 , and a post processing module 234 .
  • the signal separator 230 may optionally include a voice activity detection module 250 that operates on the signal at the input, output, or an intermediate point within the signal separator 230 .
  • the voice activity detection module 250 may alternatively be implemented external to and distinct from the signal separator 230 .
  • the signal separator 230 may implement a controller (not shown) that selectively activates or omits each of the signal processing modules within the signal separator 230 , for example, depending on signal conditions, operating modes, external control, and the like.
  • the microphones 112 - 1 and 112 - 2 may be placed very close to each other due to limited space. Often, the differences in the signals from each of the microphones 112 - 1 and 112 - 2 are very small. Therefore, the instantaneous correlation among microphone signals is very high. When instantaneous correlation is significant, a blind source separator may not perform adequately and may tend to cancel the most prominent signal in both microphone signals for two-microphone applications. Sometimes, a blind source separator generates annoying tonal artifacts when operating on signals having high instantaneous correlation.
  • the pre-processing module 232 de-correlates the signals.
  • the pre-processing module 232 is configured as a digital filter having a small number (fewer than about five) of taps. One to three taps may be sufficient, although a different number of taps may be used. If three taps are used, one tap can be designated to be non-causal.
  • the pre-processing module 232 can include an adaptive de-correlator, which can be implemented as an adaptive filter with a small number of taps.
  • the adaptive de-correlator can adjust the tap weights in order to minimize correlation or otherwise maximize de-correlation.
  • the adaptive de-correlator can be configured to select among predetermined tap weights, predetermined sets of tap weights and configurations, or can be configured to adjust each tap weight substantially continuously and independently of other tap weight adjustments.
  • the pre-processing module 232 can also include a calibrator that scales the output of the de-correlator in order to speed up convergence of a subsequent blind source separator.
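  • A minimal sketch of one way a very short de-correlator followed by a calibrator could be arranged. It uses a single tap weight computed block-wise by least squares, which is a simplifying assumption; the disclosure describes adaptive tap adjustment and does not specify the weight computation or the calibration target. All function names and the test signals are hypothetical.

```python
import math
from typing import List, Tuple


def rms(signal: List[float]) -> float:
    """Root mean square level of a block of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def correlation(a: List[float], b: List[float]) -> float:
    """Normalized zero-lag (instantaneous) correlation of two blocks."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def decorrelate_one_tap(x1: List[float], x2: List[float]) -> Tuple[List[float], float]:
    """Remove from x2 the component that is instantaneously correlated with x1."""
    energy = sum(a * a for a in x1)
    w = sum(a * b for a, b in zip(x1, x2)) / energy if energy > 0 else 0.0
    return [b - w * a for a, b in zip(x1, x2)], w


def calibrate(signal: List[float], target_rms: float) -> List[float]:
    """Scale the de-correlated output to an assumed target level."""
    level = rms(signal)
    gain = target_rms / level if level > 0 else 1.0
    return [gain * s for s in signal]


# Two closely spaced microphones: the second signal is nearly a copy of the first.
x1 = [math.sin(0.1 * n) for n in range(500)]
x2 = [0.9 * a + 0.1 * math.cos(0.37 * n) for n, a in enumerate(x1)]
y2, w = decorrelate_one_tap(x1, x2)
y2 = calibrate(y2, target_rms=rms(x1))
```

  • In this sketch the zero-lag correlation between `x1` and `y2` drops to essentially zero, while the scaling step brings the much smaller de-correlated signal back to a level comparable to the first channel.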
  • the pre-processing module 232 couples the de-correlated microphone signals to a source separator 240 that can perform filtering based on, for example, Blind Source Separation (BSS).
  • The mobile communication device 110 may be small in dimension. The small dimension not only limits the distance between microphones, but it also may limit the number of microphones that can be reasonably mounted on the communication device 110 . Usually, two or, at most, three microphones are used. In general, this number of microphones does not meet the requirements for complete signal separation when there are multiple noise sources.
  • the BSS source separator 240 typically operates to separate the most prominent signal from all other signals. After echo cancellation, the desired speech may be expected to be the most prominent component of all signals. After signal separation, two signals are generated by the BSS source separator 240 . One signal typically contains the most prominent signal and somewhat attenuated versions of all other signals. The other signal contains all other signals and a somewhat attenuated version of the most prominent signal.
  • Blind source separation, sometimes referred to as independent component analysis (ICA), is a method to reconstruct unknown signals based on their mixtures. These unknown signals are referred to as source signals.
  • the adjective ‘blind’ has a twofold meaning. First, the source signals are not known or are only partially known. Only measurements of source signal mixtures are available. Second, the mixing process is not known. Signal separation is achieved by exploiting a priori statistics of source signals and/or statistics observed in signal measurements.
  • $P_{S_1,\ldots,S_m}(s_1,\ldots,s_m)$ is the joint probability density function (PDF) of all random variables $S_1,\ldots,S_m$, and $P_{S_j}(s_j)$ is the PDF of the jth random variable $S_j$.
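  • For reference, the statistical independence condition conventionally assumed in ICA can be written with the joint and marginal PDFs defined above as a factorization of the joint PDF into the product of the marginals. It is stated here as the standard condition, not quoted from the disclosure.

```latex
P_{S_1,\ldots,S_m}(s_1,\ldots,s_m) = \prod_{j=1}^{m} P_{S_j}(s_j)
```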
  • the lengths of the filters inside the BSS source separator 240 can range, for example, from 5 taps to 60 taps.
  • the tap length of the BSS source separator is not a limitation, but rather, is selected based on a tradeoff of factors, including convergence time and steady state performance.
  • a post-processing module 234 may be used to further improve the separation performance by de-correlating the separated signals. Because only one signal from the source separator 240 , the signal having the desired speech, is of interest, the post processing module 234 may implement only one post-filter. The post processing module 234 can filter the signal having the speech component and may perform no additional processing of the signal substantially representative of the noise component. The length of the post-filter can be configured, for example, to be longer than that of each of the two filters in the BSS source separator 240 .
  • One signal contains primarily background noise and residual echo, in which the desired speech has been reduced. This signal is referred to as the noise reference signal.
  • the other signal contains the desired speech signal and attenuated or otherwise reduced noise, interference, and echo signal components. This signal is referred to as the speech reference signal.
  • the signal separator 230 can include a voice activity detection module 250 that makes a voice activity detection decision based on the speech reference signal and noise reference signal.
  • Voice activity detection module 250 may be coupled to the signals at the output of the signal separator 230 , because these signals exhibit the greatest differential of speech and noise. However, the voice activity detection module 250 can make the voice activity decision based on the two signals at the output of any of the intermediate modules within the signal separator 230 .
  • the voice activity detection module 250 can be implemented external to the signal separator 230 , and may operate on the signals at the output of the signal separator 230 .
  • the signal separator 230 can provide access to some or all of the intermediate signal outputs, and the voice activity detection module 250 can be coupled to the signal separator 230 output or an intermediate output.
  • the voice activity detection indication can be used by a subsequent signal processing module, as described below, to modify the signal processing performed on the speech or noise signals.
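  • A minimal sketch of a voice activity decision based on the speech reference and noise reference signals. The disclosure does not specify the decision rule; the frame energy ratio, the 6 dB threshold, and the function names used here are assumptions for illustration.

```python
import math
from typing import List


def frame_energy(frame: List[float]) -> float:
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def voice_activity(speech_ref: List[float], noise_ref: List[float],
                   threshold_db: float = 6.0, eps: float = 1e-12) -> bool:
    """Declare voice activity when the speech reference is sufficiently
    stronger than the noise reference in the current frame."""
    ratio = (frame_energy(speech_ref) + eps) / (frame_energy(noise_ref) + eps)
    return 10.0 * math.log10(ratio) > threshold_db


# Speech reference 20 dB above the noise reference: active.
active = voice_activity([0.5, -0.5, 0.5, -0.5], [0.05, -0.05, 0.05, -0.05])
# Both references at the same level: inactive.
inactive = voice_activity([0.05, -0.05, 0.05, -0.05], [0.05, -0.05, 0.05, -0.05])
```

  • A ratio test of this kind relies on the separated signals exhibiting a large differential between speech and noise, which is why the output of the signal separator is a natural place to attach the detector.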
  • the signal separator 230 couples the speech reference signal and noise reference signal to a nonlinear processing module 260 .
  • the first echo canceller 220 - 1 may couple the echo cancellation signal to the nonlinear processing module 260 .
  • the speech reference signal still contains residual background noise and acoustic echo, whose correlation with noise reference signal is typically low due to the post-processing module 234 inside the signal separator 230 . Therefore, it is typically not possible to use linear filtering to remove residual noise and echo from the speech reference signal. However, the residual noise and echo still may have some similarity to the noise reference signal.
  • the spectral amplitude of the residual noise and echo may be similar to that of the noise reference signal. When similar, this similarity can be exploited to further reduce noise in the speech reference signal using nonlinear noise suppression techniques.
  • the nonlinear processing module 260 can implement spectral subtraction to further suppress residual noise and echo.
  • the noise statistics can be estimated based on the noise reference signal and echo cancellation signal.
  • the estimated noise statistics cover non-stationary noise, stationary noise as well as residual acoustic echo.
  • the estimated noise statistics based on the noise reference signal are typically considered more accurate than noise estimates based on one microphone signal. With more accurate noise statistics, spectral subtraction is capable of performing better noise suppression. Dual-microphone spectral subtraction suppresses not only stationary noise but also non-stationary noise and residual acoustic echo.
  • the nonlinear processing module 260 couples at least the speech reference signal to a post processing module 270 for further noise shaping.
  • the residue noise can be further reduced or masked in the post-processing module 270 .
  • the post-processing module 270 can be configured to perform, for example, center clipping, comfort noise injection, and the like.
  • the post-processing methods can be any one or combination of commonly used speech communications processing techniques.
  • the post processing module 270 can implement center clipping to apply different gains to signals at different levels.
  • the gain can be set to be unity when signal level is above a threshold. Otherwise, it is set to be less than unity.
  • the post processing module 270 assumes that the signal level is low when there is no desired speech. However, this assumption may fail in a noisy environment where the background noise level can be higher than the threshold.
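  • The level-based center clipping rule described above can be sketched as follows. This is a minimal illustration only: the function name `center_clip`, the per-sample level measure, and the values of `threshold` and `low_gain` are assumptions chosen for the example and are not taken from the disclosure.

```python
def center_clip(samples, threshold=0.05, low_gain=0.25):
    """Apply unity gain above the threshold and a reduced gain otherwise.

    The signal level is approximated here by the absolute sample value,
    which is an assumption made to keep the sketch short.
    """
    out = []
    for s in samples:
        gain = 1.0 if abs(s) > threshold else low_gain
        out.append(gain * s)
    return out
```

  • Because the rule looks only at signal level, background noise louder than `threshold` passes at unity gain, which is the failure case noted above.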
  • the post processing module 270 applies center clipping based in part on the presence of desired speech.
  • the post processing module 270 receives the voice activity decision from the voice activity detection module 250 .
  • the post processing module 270 can apply center clipping in the presence of voice activity.
  • the post processing module 270 selectively applies center clipping based on the voice activity state.
  • the post processing module 270 may also use the voice activity state to selectively apply comfort noise injection.
  • the post processing module 270 may be configured to selectively quiet the voice channel when there is an absence of voice activity.
  • the post processing module may, for example, decrease the gain applied to the speech reference signal or decouple the speech reference signal from subsequent stages when the voice activity detection module 250 indicates with the voice activity state a lack of voice activity.
  • the lack of any significant signal may be disconcerting to a listener, as the listener may wonder if the communication device 110 has dropped the communication link.
  • the post processing module 270 can insert a low level of noise in the absence of speech, referred to as “comfort noise” to indicate or otherwise reassure a listener of the presence of the communication link.
  • the post processing module 270 output represents the output of the noise and echo reduction system 200 .
  • the processed speech reference signal is coupled to the back end processing module 280 such as a speech encoder or an audio encoder. If desired, the post processing module 270 may also couple the noise reference signal to subsequent stages, although seldom is this necessary.
  • FIG. 3A is a simplified functional block diagram of an embodiment of a non-linear processing module 260 implementing spectral subtraction.
  • the non-linear processing module 260 transforms the speech reference signal to the frequency domain and performs frequency selective gain, where the frequency selectivity is based on the number of frequency bins or subbands in the frequency domain.
  • the embodiment of FIG. 3A can be used, for example, in the noise and echo reduction system 200 of FIG. 2 .
  • the non-linear processing module 260 includes a first frequency transform module 312 configured to receive the speech reference signal and transform it to the frequency domain.
  • the first frequency transform module 312 can be configured, for example, to accept a serial signal input and provide a parallel signal output, where each of the output signals is representative of signals within a particular frequency subband.
  • the outputs of the first frequency transform module 312 may be coupled to frequency selective variable gain modules 340 - 1 to 340 -N that are each configured to selectively apply a gain to corresponding frequency bins.
  • the first variable gain module 340 - 1 receives a first output from the first frequency transform module 312 and applies a controllable gain to the first frequency bin.
  • the output of the variable gain modules 340 - 1 to 340 -N may be coupled to a time transform module 350 configured to transform the frequency domain processed speech reference signal back to a time domain representation.
  • the non-linear processing module 260 also includes a second frequency transform module 314 configured to receive the noise reference signal and transform it to a frequency domain representation.
  • the second frequency transform module 314 is illustrated as generating the same number of frequency bins as produced by the first frequency transform module 312 .
  • the second frequency transform module 314 may couple the frequency domain representation of the noise reference signal to noise estimators 320 - 1 to 320 -N. Each frequency bin output from the second frequency transform module 314 may be coupled to a distinct noise estimator, e.g. 320 - 1 .
  • each of the noise estimators 320 - 1 to 320 -N can be configured to estimate the noise within its associated frequency bin.
  • the noise estimators 320 - 1 to 320 -N couple the noise estimate values to respective spectrum gain controllers 330 - 1 to 330 -N.
  • the spectrum gain controllers 330 - 1 to 330 -N operate to vary the frequency selective gain of the variable gain modules 340 - 1 to 340 -N based at least in part on the noise estimate values.
  • Each of the frequency transform modules 312 and 314 can be configured to perform the frequency transform as a Discrete Fourier Transform, Fast Fourier Transform, or some other transform.
  • the first and second frequency transform modules 312 and 314 are configured to generate the same number of frequency bins, although that is not a limitation.
  • the noise estimators 320 - 1 to 320 -N can be configured to determine a noise magnitude, noise power, noise energy, noise floor, and the like, or some other measure of noise within each frequency bin.
  • the noise estimators 320 - 1 to 320 -N can include memory (not shown) to store one or more previous noise estimates.
  • the noise estimators 320 - 1 to 320 -N can be configured to generate a time moving average or some other weighted average of noise.
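  • A per-bin noise estimator that stores its previous estimate and forms a weighted average over time might look like the sketch below. The class name `NoiseEstimator`, the use of noise power as the measure, and the `smoothing` constant are illustrative assumptions; the disclosure does not prescribe a particular averaging rule.

```python
class NoiseEstimator:
    """Recursive (exponentially weighted) average of noise power in one bin."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing  # weight given to the stored estimate
        self.estimate = None        # memory of the previous noise estimate

    def update(self, noise_power):
        """Blend a new noise power measurement into the stored estimate."""
        if self.estimate is None:
            self.estimate = noise_power
        else:
            self.estimate = (self.smoothing * self.estimate
                             + (1.0 - self.smoothing) * noise_power)
        return self.estimate
```

  • One such estimator would be instantiated for each of the N frequency bins of the noise reference signal.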
  • the spectrum gain controllers 330 - 1 to 330 -N can be configured to apply a gain to each of the frequency bins based on the value of the noise estimate and the corresponding speech reference signal within that frequency bin.
  • each of the spectrum gain controllers 330 - 1 to 330 -N is configured to apply one of a predetermined number of gain values based on the noise estimate value and the corresponding speech reference signal.
  • each of the gain controllers 330 - 1 to 330 -N can generate a substantially continuous gain control value based on the value of the noise estimate and the corresponding speech reference signal within a particular frequency bin. Discussions regarding the general concept of spectral subtraction may be found in S. F. Boll, “Suppression of Acoustic Noise in Speech Using Spectral Subtraction,” IEEE Trans. Acoustics, Speech and Signal Processing, 27(2): 112-120, April 1979.
  • variable gain modules 340 - 1 to 340 -N can be configured to apply an independent gain to each of the frequency bins based on the control value applied by the respective gain controller 330 - 1 to 330 -N.
  • the first variable gain module 340 - 1 can be configured to apply a gain in the range of 0-1 to the corresponding frequency bin based on the gain control value associated with the frequency bin.
  • the time transform module 350 may be configured to perform substantially the complement of the process performed by the first frequency transform module 312 .
  • the time transform module 350 can be configured to perform an Inverse Discrete Fourier Transform or an Inverse Fast Fourier Transform.
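  • The signal flow of FIG. 3A (frequency transform, per-bin gain in the range 0-1, inverse transform) can be illustrated with the following sketch. It uses a plain magnitude-subtraction gain rule and a direct DFT for brevity; the function names, the `floor` parameter, and the use of the instantaneous noise spectrum in place of a smoothed noise estimate are all assumptions made for the example.

```python
import cmath


def dft(x):
    """Discrete Fourier Transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each time sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def subtraction_gain(speech_mag, noise_mag, floor=0.0):
    """Magnitude spectral subtraction gain, limited to [floor, 1]."""
    if speech_mag <= 0.0:
        return floor
    gain = (speech_mag - noise_mag) / speech_mag
    return min(1.0, max(floor, gain))


def suppress_frame(speech_frame, noise_frame, floor=0.05):
    """Apply a frequency selective gain to one frame of the speech reference."""
    speech_bins = dft(speech_frame)
    noise_bins = dft(noise_frame)
    out_bins = [subtraction_gain(abs(s), abs(n), floor) * s
                for s, n in zip(speech_bins, noise_bins)]
    return idft(out_bins)
```

  • A practical implementation would typically use an FFT with windowed, overlapping frames; those details are omitted here.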
  • FIG. 3B is a simplified functional block diagram of another embodiment of a non-linear processing module 260 implementing spectral subtraction.
  • the non-linear processing module 260 transforms the speech reference signal to the frequency domain and performs frequency selective gain.
  • the embodiment of FIG. 3B can be used, for example, in the noise and echo reduction system 200 of FIG. 2 .
  • the non-linear processing module 260 embodiment of FIG. 3B includes a first frequency transform module 312 configured to receive the speech reference signal and transform it to the frequency domain.
  • the first frequency transform module 312 can be configured to generate a parallel output having a predetermined number, N, of outputs, where each output corresponds to a frequency bin or band.
  • the first frequency transform module 312 can be configured as an N-point FFT.
  • the outputs from the first frequency transform module 312 may be coupled to a frequency selective variable gain module 340 that is configured to selectively apply a gain to each of the frequency bins.
  • the outputs of the variable gain module 340 may be coupled to a time transform module 350 configured to transform the frequency domain processed speech reference signal back to a time domain representation.
  • Each of the frequency bin outputs may also be coupled to an input of a corresponding spectral gain controller 330 - 1 through 330 -N.
  • Each of the spectral gain controllers 330 - 1 through 330 -N is configured to generate a gain control signal for its corresponding frequency bin.
  • the gain control signal from each of the spectral gain controllers 330 - 1 through 330 -N may be coupled to a gain control input of the variable gain module 340 associated with the corresponding frequency bin.
  • the non-linear processing module 260 also includes a second frequency transform module 314 configured to receive the noise reference signal and transform it to a frequency domain representation.
  • the second frequency transform module 314 may be configured to output the same number of frequency bins, N, that are output from the first frequency transform module 312 , but this is not an absolute requirement.
  • Each output from the second frequency transform module 314 representing the noise in a corresponding frequency bin, may be coupled to an input of a corresponding spectral gain controller 330 - 1 through 330 -N.
  • a third frequency transform module 316 may be configured to receive the echo estimate signal from an echo canceller, such as the first echo canceller shown in the system of FIG. 1 .
  • the third frequency transform module 316 may be configured to transform the echo estimate signal to a frequency domain representation, and typically transforms the echo estimate signal to the same number of frequency bins determined by the first and second frequency transform modules 312 and 314 .
  • Each output from the third frequency transform module 316 representing the echo estimate spectral component in a corresponding frequency bin, may be coupled to an input of a corresponding spectral gain controller 330 - 1 through 330 -N.
  • Each spectral gain controller 330 - 1 through 330 -N may be configured to process the speech reference spectral component, noise reference spectral component, and echo estimate spectral component for a particular frequency bin.
  • the non-linear processing module 260 embodiment of FIG. 3B utilizes N distinct spectral gain controllers 330 - 1 through 330 -N.
  • the noise and residual echo present in the speech reference signal may be similar to the noise reference signal and echo estimate signal.
  • Each spectral gain controller 330 - 1 through 330 -N can determine the level of similarity on an individual frequency bin basis to determine the level of gain control to apply to the frequency bin.
  • each spectral gain controller 330 - 1 through 330 -N may control the gain that the frequency selective variable gain module 340 applies to the corresponding frequency bin. Therefore, in the embodiment of FIG. 3B , the frequency selective variable gain module 340 can independently control the gain in N distinct frequency bins.
  • the outputs of the frequency selective variable gain module 340 may be coupled to a time transform module 350 for transform back to a time domain signal, as described in the embodiment of FIG. 3A .
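  • One way a spectral gain controller of FIG. 3B could combine the three spectral components for a bin is sketched below. The disclosure does not specify the similarity measure or gain rule, so the weighted-sum rule, the function names, and the parameters `noise_weight`, `echo_weight`, and `floor` are hypothetical.

```python
def spectral_gain(speech_mag, noise_mag, echo_mag,
                  noise_weight=1.0, echo_weight=1.0, floor=0.05):
    """Gain for one frequency bin from speech, noise, and echo magnitudes."""
    if speech_mag <= 0.0:
        return floor
    # Assumed rule: treat weighted noise plus echo as the interference level.
    interference = noise_weight * noise_mag + echo_weight * echo_mag
    gain = 1.0 - interference / speech_mag
    return min(1.0, max(floor, gain))


def apply_gains(speech_bins, noise_bins, echo_bins):
    """Apply an independent gain to each of the N speech reference bins."""
    return [spectral_gain(abs(s), abs(n), abs(e)) * s
            for s, n, e in zip(speech_bins, noise_bins, echo_bins)]
```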
  • FIG. 4 is a simplified functional block diagram of an embodiment of a speech post-processing module 270 .
  • the embodiment of FIG. 4 can be used, for example, in the noise and echo reduction system 200 of FIG. 2 .
  • the speech post-processing module 270 is configured to provide both center clipping and comfort noise injection in the absence of voice activity.
  • the post-processing module 270 includes a variable gain module 410 configured to receive the speech reference signal and apply a gain based at least in part on the voice activity state.
  • the variable gain module 410 may couple the amplified/attenuated output to the first input of a signal combiner 440 , illustrated as a signal summer.
  • the post-processing module 270 also includes a gain controller 420 configured to receive the voice activity state from a voice activity detection module (not shown).
  • the gain controller 420 may control the gain of the variable gain module 410 based in part on the voice activity state.
  • the gain controller 420 can be configured to control the gain of the variable gain module 410 to be unity or some other predetermined value if the voice activity state indicates the presence of voice activity.
  • the gain control module 420 can be configured to control the gain of the variable gain module 410 to be less than unity or less than the predetermined value when the voice activity state indicates the absence of voice activity.
  • the gain control module 420 can be configured to control the gain of the variable gain module 410 to substantially attenuate the speech reference signal in the absence of voice activity.
  • a comfort noise generator 430 may receive the voice activity state as a control input.
  • the comfort noise generator 430 can be configured to generate a noise signal, such as a white noise signal, that can be injected into the audio channel in the absence of voice activity.
  • the gain controller 420 and comfort noise generator 430 may each be active on complementary states of the voice activity decision.
  • when the voice activity state indicates the presence of voice activity, the post-processing module 270 may output substantially the speech reference signal.
  • when the voice activity state indicates the absence of voice activity, the post-processing module 270 may output substantially the comfort noise signal.
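  • The arrangement of FIG. 4 (variable gain, comfort noise generator, and signal summer driven by complementary voice activity states) can be sketched as below. The function name `post_process`, the uniform white noise source, and the values of `idle_gain` and `comfort_level` are assumptions for illustration.

```python
import random


def post_process(frame, voice_active, idle_gain=0.0, comfort_level=0.01,
                 rng=None):
    """Gate the speech reference and inject comfort noise when speech is absent."""
    rng = rng if rng is not None else random
    if voice_active:
        gain = 1.0                    # unity gain while voice is present
        comfort = [0.0] * len(frame)  # comfort noise generator idle
    else:
        gain = idle_gain              # attenuate the speech reference
        comfort = [rng.uniform(-comfort_level, comfort_level) for _ in frame]
    # Signal summer: attenuated speech reference plus comfort noise.
    return [gain * s + c for s, c in zip(frame, comfort)]
```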
  • FIG. 5 is a simplified flowchart of an embodiment of a method 500 of noise and echo reduction.
  • the method 500 can be performed by the communication device of FIG. 1 or FIG. 2 or by the noise and echo reduction system within the communication device of FIG. 2 .
  • the method 500 begins at block 510 where the communication device receives multiple microphone signals, for example, from two distinct microphones.
  • the communication device proceeds to block 520 and cancels the echo in each of the received microphone signals.
  • the echo can be considered to be a signal that originates at the communication device that couples to the received microphone signal path.
  • the coupling can be acoustic, mechanical, or can be electrical, via a coupling path within the communication device.
  • the communication device can be configured to independently cancel the echo in each microphone path, as the coupling of the echo signal to each of the paths is likely independent.
  • the communication device can be configured to cancel the echo using an adaptive filter whose taps are varied to minimize a metric of the echo canceled signal.
  • each echo canceller can utilize a normalized least mean square (NLMS) algorithm to minimize the echo signal component in the echo canceled signal.
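  • A textbook NLMS adaptive filter of the kind mentioned above is sketched here for a single microphone path. The function name, the tap count `num_taps`, the step size `step`, and the regularization constant `eps` are illustrative assumptions and are not values given in the disclosure.

```python
def nlms_echo_cancel(far_end, mic, num_taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of far_end from mic.

    Returns the echo canceled signal and the final tap weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent far-end samples, newest first
    canceled = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate  # echo canceled sample
        norm = sum(h * h for h in history) + eps
        # Normalized LMS update: step scaled by the input energy.
        weights = [w + step * error * h / norm
                   for w, h in zip(weights, history)]
        canceled.append(error)
    return canceled, weights
```

  • In a two-microphone device, one such canceller would run independently on each microphone path, each fed by the same far-end (speaker) signal.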
  • the communication device After canceling or otherwise reducing the echo signal component within the microphone signals, the communication device performs signal separation, where the speech signal component is separated or otherwise isolated from the noise signal component.
  • the communication device proceeds to block 530 and de-correlates the microphone signals, for example, by passing at least one of the microphone signals through a linear filter.
  • the linear filter can be an adaptive filter comprising a number of taps, but typically one to three taps are used. The tap weights can be adjusted to minimize the instantaneous correlation between two microphone signals.
  • the filter can be a fixed filter that is configured to de-correlate the two microphone signals.
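  • A short adaptive de-correlating filter could be sketched as follows. Here an LMS update drives the output toward being uncorrelated with the second microphone signal; the function name, the LMS rule, and the `step` value are assumptions, since the disclosure only states that the tap weights are adjusted to minimize the instantaneous correlation.

```python
def decorrelate(primary, secondary, num_taps=1, step=0.01):
    """Filter secondary and subtract it from primary to reduce correlation.

    Returns the de-correlated primary signal and the final tap weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent secondary samples, newest first
    out = []
    for p, s in zip(primary, secondary):
        history = [s] + history[:-1]
        y = p - sum(w * h for w, h in zip(weights, history))
        # Each update moves the weights so that y * secondary tends to zero.
        weights = [w + step * y * h for w, h in zip(weights, history)]
        out.append(y)
    return out, weights
```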
  • the communication device proceeds to block 540 and separates the speech from the noise by performing Blind Source Separation (BSS) on the two microphone signals.
  • the result of BSS may be two distinct signals, one having substantially the speech signal and the other having substantially the noise signal.
  • the communication device proceeds to block 550 and performs post separation processing by passing one of the speech signal or noise signal through a linear filter to de-correlate any residual noise remaining on the two signals.
  • the communication device proceeds to block 560 and performs non-linear noise suppression.
  • the communication device can be configured to perform spectral subtraction.
  • the communication device can perform spectral subtraction by adjusting a frequency selective gain to the speech reference signal that operates, effectively, to reduce noise and residual echo in the speech reference signal.
  • the communication device proceeds to block 570 and performs any additional post processing of the speech reference signal that may be desired.
  • the communication device can perform center clipping and can perform center clipping based on the voice activity state.
  • the communication device can perform comfort noise injection and can inject the comfort noise signal in the absence of voice activity.
  • the output of the post processing stage or stages represents the processed speech signal.
  • FIG. 6 is a simplified functional block diagram of an embodiment of communication device 110 implementing a two-microphone noise and echo reduction system.
  • the communication device 110 includes two microphones 112 - 1 and 112 - 2 and a speaker 130 as in the embodiment of FIG. 2 .
  • the communication device 110 includes a means for reducing noise and echo 600 configured as a means for receiving the multiple microphone signals.
  • the means for reducing noise and echo 600 includes first and second means for performing echo cancellation 620 - 1 and 620 - 2 on each of the two microphone signals.
  • Each of the means for performing echo cancellation 620 - 1 and 620 - 2 operates in conjunction with a corresponding means for combining signals 622 - 1 and 622 - 2 .
  • the communication device 110 includes means for signal separation 630 that includes means for de-correlating the multiple microphone signals 632 that can be configured as an adaptive filter for de-correlating the first and second echo canceled microphone signals.
  • the means for signal separation 630 further includes means for separating 640 a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals that can be configured as a means for Blind Source Separating the speech signal component from the noise signal component.
  • a means for post processing 634 in the means for signal separation 630 can be configured to de-correlate a residual noise signal in the speech reference signal from the noise reference signal.
  • the communication device 110 also includes means for performing non-linear noise suppression 660 on a speech reference signal of the separated microphone signals.
  • the means for performing non-linear noise suppression 660 can be followed by a means for performing post processing 670 of the speech reference signal.
  • a means for voice activity detecting 650 may operate in conjunction with the means for performing post processing 670 and may determine and provide a voice activity state.
  • the output of the means for reducing noise 600 may be coupled to a means for back end signal processing 680 which operates to process the speech reference signal and couple it to a means for providing an air interface 690 .
  • Speech signals received by the means for providing an air interface 690 are coupled to the means for back end signal processing 680 , which formats the signal for output.
  • the output signal is coupled to a means for volume control and speaker compensation 682 , which adjusts the amplitude of the signal to adjust the speaker volume.
  • the output signal may be coupled to the speaker 130 as well as to each of the means for echo canceling 620 - 1 and 620 - 2 .
  • the BSS algorithm separates multiple mixed signals into multiple separated signals. Among all separated signals, typically only one signal, the speech reference signal, is of interest. All other signals are considered different versions of noise reference signals.
  • the various noise reference signals can be used to further reduce residue noise and echo in the speech reference signal.
  • The term coupled or connected is used to mean an indirect coupling as well as a direct coupling or connection. Where two or more blocks, modules, devices, or apparatus are coupled, there may be one or more intervening blocks between the two coupled blocks.
  • DSP digital signal processor
  • RISC Reduced Instruction Set Computer
  • ASIC application specific integrated circuit
  • FPGA field programmable gate array
  • a general purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • the steps of a method, process, or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two.
  • the various steps or acts in a method or process may be performed in the order shown, or may be performed in another order.
  • a circuit or a number of circuits may be used to implement the various steps or acts in a method or process.
  • the circuits may all be part of an integrated circuit, or some of the circuits may be used outside an integrated circuit, or each circuit may be implemented as an integrated circuit.
  • one or more process or method steps may be omitted or one or more process or method steps may be added to the methods and processes.
  • An additional step, block, or action may be added at the beginning, at the end, or between existing elements of the methods and processes.

Abstract

Multiple microphone noise suppression apparatus and methods are described herein. The apparatus and methods implement a variety of noise suppression techniques and apparatus that can be selectively applied to signals received using multiple microphones. The microphone signals received at each of the multiple microphones can be independently processed to cancel echo signal components that can be generated from a local audio source. The echo cancelled signals may be processed by some or all modules within a signal separator that operates to separate or otherwise isolate a speech signal from noise signals. The signal separator can include a pre-processing de-correlator followed by a blind source separator. The output of the blind source separator can be post filtered to provide post separation de-correlation. The separated speech and noise signals can be non-linearly processed for further noise reduction, and additional post processing can be implemented following the non-linear processing.

Description

    CROSS-RELATED APPLICATIONS
  • This application relates to co-pending application “Enhancement Techniques for Blind Source Separation” (Attorney Docket No. 061193), commonly assigned U.S. patent application Ser. No. 11/551,509, filed Oct. 20, 2006, and co-pending application “Multiple Microphone Voice Activity Detector” (Attorney Docket No. 061497), co-filed with this application.
  • BACKGROUND
  • 1. Field of the Invention
  • The disclosure relates to the field of audio processing. More particularly, the disclosure relates to acoustic echo cancellation and noise reduction in multiple microphone audio systems.
  • 2. Description of Related Art
  • Mobile speech communication can be conducted in various environments. The microphones on the mobile device receive not only the desired speech, but also background noise. In many situations, background noise can be abundant, and it reduces the intelligibility of desired speech. Acoustic echo is another problem in mobile speech communications. Not only does it reduce the intelligibility of desired speech, it also distracts the far-end talker and is very annoying. To improve intelligibility of desired speech, it is necessary to reduce background noise and acoustic echo without distorting the desired speech. Many echo and noise reduction methods have been developed.
  • Traditionally, noise suppression is achieved using only one microphone. One such noise suppression method uses spectral subtraction to suppress background noise. The method assumes that the background noise is short-term stationary, i.e. the noise statistics do not change over a short period regardless of the activity of the desired speech. Noise statistics are estimated when a desired speech signal is absent, and the noise estimates are used to suppress noise in the signal regardless of the activity of desired speech. Spectral subtraction estimates noise statistics and suppresses noise in the frequency domain. Each frequency bin is processed independently. This method finds success in stationary noise reduction. However, it is not capable of reducing non-stationary noise.
  • Another single-microphone noise reduction method uses a directional microphone. Usually, uni-directional microphones are more expensive than omni-directional microphones. Also, uni-directional microphones may limit the way the speech communications devices are used since the mobile device may need to be placed properly to ensure its functionality.
  • Echo cancellation is typically achieved by de-correlating the microphone signal from the far-end signal using adaptive filtering. Some aggressive echo cancellation algorithms reduce communication to half-duplex mode, where only one user talks at a time. In mobile speech communication, background noise and acoustic echo reduce intelligibility of desired speech. Therefore, it is desirable to reduce both background noise and echo without distorting desired speech.
  • BRIEF SUMMARY
  • Multiple microphone noise and echo reduction apparatus and methods are described herein. The apparatus and methods implement a variety of noise and echo reduction techniques and apparatus that can be selectively applied to signals received using multiple microphones. The microphone signals received at each of the multiple microphones can be independently processed to cancel acoustic echo that can be generated due to acoustic or mechanical coupling. The echo cancelled signals may be processed by some or all modules within a signal separator that operates to separate or otherwise isolate a speech signal from noise signals. The signal separator can optionally include a pre-processing de-correlator followed by a blind source separator. The output of the blind source separator can be post filtered to provide post separation de-correlation. The separated speech and noise signals can be non-linearly processed for further noise reduction, and additional post processing can be implemented following the non-linear processing.
  • Aspects of the invention include a method of noise reduction in multiple microphone communication devices. The method includes receiving multiple microphone signals, de-correlating the multiple microphone signals, separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals, and performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • Aspects of the invention include a method of noise reduction in multiple microphone communication devices. The method includes receiving a first microphone signal, receiving a second microphone signal, performing echo cancellation on each of the first microphone signal and the second microphone signal, de-correlating the first microphone signal from the second microphone signal, separating a speech reference signal from a noise reference signal based on the first and second microphone signals, de-correlating a residual noise in the speech reference signal from the noise reference signal, and performing non-linear processing on at least the speech reference signal.
  • Aspects of the invention include an apparatus for noise reduction in multiple microphone systems. The apparatus includes a first echo canceller configured to cancel an echo in a first microphone signal to generate a first echo canceled microphone signal, a second echo canceller configured to cancel an echo in a second microphone signal to generate a second echo canceled microphone signal, a signal separator configured to receive the first and second echo canceled microphone signals and separate a speech signal component from a noise signal component to generate a speech reference signal and a noise reference signal, and a non-linear processing module configured to receive the speech reference signal and noise reference signal and perform non-linear processing on the speech reference signal.
  • Aspects of the invention include an apparatus for noise reduction in multiple microphone systems. The apparatus includes means for receiving multiple microphone signals, means for de-correlating the multiple microphone signals, means for separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals, and means for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • Aspects of the invention include a processor readable media including instructions that may be utilized by one or more processors. The instructions include instructions for de-correlating multiple received microphone signals, instructions for separating a speech signal component from a noise signal in at least one of the multiple received microphone signals to generate separated microphone signals, and instructions for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The features, objects, and advantages of embodiments of the disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like elements bear like reference numerals.
  • FIG. 1 is a simplified functional block diagram of an environment having background noise and acoustic echo in speech communication and a noise suppressor and a typical echo canceller based on an adaptive filter.
  • FIG. 2 is a simplified functional block diagram of an embodiment of a two-microphone noise and echo reduction system.
  • FIGS. 3A-3B are simplified functional block diagrams of embodiments of non-linear processing modules implementing spectral subtraction.
  • FIG. 4 is a simplified functional block diagram of an embodiment of a speech post-processing module.
  • FIG. 5 is a simplified flowchart of an embodiment of a method of noise and echo reduction.
  • FIG. 6 is a simplified functional block diagram of an embodiment of a two-microphone noise and echo reduction system.
  • DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
  • In the present disclosure, a two-microphone noise and echo reduction system is described. It uses two microphones to receive acoustic signals, such as speech signals. Each microphone receives a different mixture of desired speech, background noise and acoustic echo.
  • The noise suppression system uses echo cancellers to reduce acoustic echo in each of the microphone signals. The signal after echo cancellation is fed to an enhanced Blind Source Separation (BSS) module, which substantially separates desired speech signal components from background noise and residual acoustic echo. Then, nonlinear noise and echo reduction is used to further reduce background noise and acoustic echo in the desired speech signal. Post-processing is used to further reduce residue noise and echo.
  • FIG. 1 is a simplified functional block diagram of an embodiment of a reverberant noise environment 100 in which a communication device 110 operates. The communication device 110 can be, for example, a mobile device, portable device, or stationary device. For example, the communication device 110 can be a mobile telephone, personal digital assistant, notebook computer, sound recorder, headsets, and the like or some other communication device that can receive and process audio signals and optionally output audio signals. The communication device 110 illustrated in FIG. 1 includes multiple microphones 112-1 and 112-2 and at least one audio output device 130.
  • The audio environment can include multiple noise and interference sources, e.g. 162, and can include one or more near end speech sources 150. For example, a single near end speech source 150 can be a user of the communication device 110. Typically, the speech source 150 is positioned in the near field of the microphones 112-1 and 112-2.
  • A number of noise sources 162, 164, and 166 may generate signals incident on the microphones 112-1 and 112-2. The noise sources 162, 164, and 166 may be positioned throughout the operating environment, as shown in FIG. 1, or one or more noise sources may be positioned close together. Typically, each of the noise sources 162, 164, and 166 is positioned in the far field of the microphones 112-1 and 112-2. The noise sources 162, 164, and 166 can be independent noise sources or can be related noise sources.
  • The speaker 130 local to the communication device 110 can originate one or more echo signals, 132, 134, and 136. An echo signal 132 may traverse substantially a direct path from the speaker 130 to the microphones 112-1 and 112-2. An echo signal may traverse a reflected path 134, where the audio from the speaker 130 reflects off of a surface 170. The echo signal may also traverse a multiply reflected path 136, where the audio from the speaker reflects off of multiple surfaces 170 prior to reaching the microphones 112-1 and 112-2.
  • Although the signal path from each of the noise sources 162, 164, and 166 is depicted as a single path, the signal from each noise source 162, 164, and 166 may traverse multiple paths. Thus, the signal incident on the microphones 112-1 and 112-2 may include multiple signals, including some signals that traverse multiple paths before arriving at the microphones 112-1 and 112-2.
  • The position of the speech source 150 in the near field of the microphones 112-1 and 112-2 may permit its signal to be more prevalent at some of the microphones 112-1 or 112-2. However, the small physical size of typical mobile communication devices 110 may not permit isolation of the speech source 150 signal from a portion of the microphones 112-1 and 112-2 through physical placement alone in order to establish a noise reference signal.
  • The position of speaker 130 may cause its signal to be a near field signal, although one or more of the reflected signals may appear as far field signals. The noise sources 162, 164, and 166 may be in the far field and their noise signal levels may be similar on all microphones 112-1 and 112-2.
  • The communication device 110 utilizes a combination of echo cancellation and noise suppression to reduce the noise signals and echo signal from the speech signal. The resultant speech signal can be coupled to one or more far end processors or outputs.
  • The microphones 112-1 and 112-2 couple the received signals to respective signal combiners 122-1 and 122-2 that operate as part of, or in conjunction with, adaptive filters 120-1 and 120-2 to cancel at least a predominant echo signal that originates from the speaker 130. Each adaptive filter receives an input signal that is substantially the same as the signal coupled to the speaker 130.
  • The output of the adaptive filters 120-1 and 120-2 may be coupled to a second input of the respective signal combiner 122-1 and 122-2. The signal combiners 122-1 and 122-2 can be configured as a summer or subtracter. The signal combiners 122-1 and 122-2 sum the filtered signal or a negated filtered signal to the signal from the microphones 112-1 and 112-2.
  • The adaptive filters 120-1 and 120-2 can be configured to converge on a set of tap weights that minimizes the echo signal component in the signal combiner 122-1 and 122-2 outputs. The outputs from the signal combiners 122-1 and 122-2 can be fed back to the associated adaptive filter 120-1 or 120-2 and used to determine an error or metric related to minimizing the echo signals.
  • The outputs of the signal combiners 122-1 and 122-2 represent the echo canceled input signals. The echo canceled input signals may be coupled to a noise and echo suppressor 140. The noise and echo suppressor 140 can be configured to reduce noise signals and echo signals from the speech signals and may perform suppression of the noise component in order to optimize or otherwise enhance the speech component. Embodiments illustrating details and operation of the noise and echo suppressor are described in association with FIG. 2. The speech signal output from the noise and echo suppressor 140 is coupled to one or more far end devices or modules (not shown) for further processing, output, or some combination thereof.
  • FIG. 2 is a simplified functional block diagram of an embodiment of communication device 110 implementing a two-microphone noise and echo reduction system. Although the communication device 110 embodiment illustrates two microphones 112-1 and 112-2, the noise suppression methods and apparatus can similarly operate on a greater number of microphones.
  • The communication device 110 includes two microphones 112-1 and 112-2 coupled to an input of a noise and echo reduction system 200. The noise and echo reduction system 200 is configured to remove echo signals from the received audio signals, separate the speech from the noise components, and further improve the speech signal by reducing the residual noise and echo.
  • The output of the noise and echo reduction system 200 is typically a speech reference signal, but can include a noise reference signal. The output signals may be coupled to a back end signal processing module 280, which can be, for example, a baseband signal processor of a wireless communication device. The back end signal processing module 280 can be configured to couple some or all of the speech reference signal to an air interface 290, which can be configured to process the speech signal to generate a signal in accordance with a media access control standard and a physical layer standard for wireless transmission over a link.
  • The communication device 110 may support duplex communication over the air interface 290 and may be configured to receive one or more communication signals that include speech signals for output by the communication device 110. The signals received by the air interface 290 may be coupled to the back end signal processing module 280.
  • The back end signal processing module 280 processes the received signals to extract and condition the speech and audio signals in the received signals. The back end signal processing module 280 couples the speech and audio portions to a volume control module 282 that can be configured, for example, to provide user configurable gain. The volume control module 282 can also be configured to provide filtering. In general, the signal processing modules within the noise and echo reduction system 200 may be implemented as analog signal processing modules, digital signal processing modules, or a combination of analog and digital signal processing. Where a module performs digital signal processing, an Analog to Digital Converter (ADC) is implemented at some signal processing point prior to digital processing. Similarly, where analog signal processing occurs following a digital signal processing module, a Digital to Analog Converter (DAC) is used to convert digital signals to their analog representations. As an example, the speaker 130 can include a DAC where the volume control module 282 outputs a digital signal.
  • The volume control module 282 couples the amplified and conditioned output audio signal to the input of the speaker and to at least one input of the noise and echo reduction system 200. The speaker 130 converts the output audio signal from an electrical signal to an audible signal. The noise and echo reduction system 200 utilizes the output audio as an input to one or more echo cancellers 220-1 and 220-2.
  • As described earlier, each of the microphones 112-1 and 112-2 may receive echo signals that are based on the signal output by the speaker 130. The acoustic echo reduces speech intelligibility and may also substantially hinder separation of the speech and noise signal components when the echo is strong. The echo is substantially eliminated, canceled, or otherwise reduced before signal separation to prevent the acoustic echo from confusing speech separation portions of the noise and echo reduction system 200.
  • One echo canceller is included for each microphone signal. A first microphone 112-1 couples its received signal to a first input of a first signal combiner 222-1. The first echo canceller 220-1 couples the echo cancellation signal to a second input of the first signal combiner 222-1. Similarly, the second microphone 112-2 couples its received signal to a first input of a second signal combiner 222-2. The second echo canceller 220-2 couples the echo cancellation signal to a second input of the second signal combiner 222-2.
  • One of the first or second echo cancellers 220-1 and 220-2 can be configured to couple its respective echo cancellation signal to an input of a nonlinear processing module 260. In the embodiment of FIG. 2, the first echo canceller 220-1 is configured to couple its echo cancellation signal to the nonlinear processing module 260.
  • Each signal combiner 222-1 and 222-2 can negate the signal from the respective echo canceller 220-1 and 220-2 before summing with the corresponding microphone signal. Each signal combiner 222-1 and 222-2 outputs an echo canceled signal. The first signal combiner 222-1 couples the first echo canceled signal to a first input of a signal separator 230 and to a feedback input of the first echo canceller 220-1. The second signal combiner 222-2 couples the second echo canceled signal to a second input of the signal separator 230 and to a feedback input of the second echo canceller 220-2.
  • Because there are linear signal processing modules after echo cancellation, each echo canceller 220-1 and 220-2 implements linear echo cancellation. For example, each echo canceller 220-1 and 220-2 can implement an adaptive filter. More particularly, each echo canceller 220-1 and 220-2 can use a normalized least mean square (NLMS) algorithm to minimize the echo signal component in the echo canceled signal.
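  • The NLMS-based echo cancellation described above can be illustrated with a minimal sketch. The function name `nlms_echo_cancel` and the parameters `num_taps`, `mu`, and `eps` are hypothetical choices made for this illustration and do not come from the disclosure. The sketch assumes time domain processing of one microphone signal and one far end (speaker) signal of equal length.

```python
def nlms_echo_cancel(mic, far_end, num_taps=8, mu=0.5, eps=1e-8):
    """Illustrative NLMS echo canceller (names and defaults are assumptions).

    mic      -- microphone samples (near end signal plus echo)
    far_end  -- samples of the signal coupled to the speaker
    Returns (echo_canceled_samples, final_tap_weights).
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far end samples, newest first
    echo_canceled = []
    for d, x in zip(mic, far_end):
        history = [x] + history[:-1]
        # Adaptive filter output: the echo cancellation signal.
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # Signal combiner: microphone signal minus the echo estimate.
        error = d - echo_estimate
        # Normalize the step size by the energy of the filter input.
        energy = sum(h * h for h in history) + eps
        step = mu * error / energy
        weights = [w + step * h for w, h in zip(weights, history)]
        echo_canceled.append(error)
    return echo_canceled, weights
```

  • In this sketch the combiner output is fed back as the error that drives the tap weight update, which mirrors the feedback path from each signal combiner to its echo canceller. Double talk handling is omitted.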
  • The performance of echo cancellers, e.g. 220-1 and 220-2, based on adaptive filters is limited by linearity of the echo path, including speaker and microphone and their related circuits, and reverberant environment. Echo cancellation performance is also limited by the length of the adaptive filter and the algorithm's capability to deal with echo path change and double talk in which both near end and far end talkers are speaking.
  • Although the echo cancellers 220-1 and 220-2 typically implement echo cancellation based on time domain processing of the microphone and speaker signals, one or more of the echo cancellers 220-1 and 220-2 can implement frequency domain and subband domain processing for echo cancellation. In such cases, the signals from a microphone, e.g. 112-1, may be transformed to frequency domain or subband domain. The echo canceller, e.g. 220-1 can implement an adaptive filter for each frequency bin or subband. The echo canceller, e.g. 220-1, can adjust the tap weights of each adaptive filter to minimize the echo signal component in the output of each frequency bin or subband.
  • After echo cancellation, part of the linear echo has typically been removed. The remaining linear echo and nonlinear echo can be treated as part of the background noise.
  • The signal separator 230 operates to generate a speech reference signal and a noise reference signal. The signal separator 230 embodiment illustrated in FIG. 2 includes a pre-processing module 232, a source separator 240, and a post processing module 234. The signal separator 230 may optionally include a voice activity detection module 250 that operates on the signal at the input, output, or an intermediate point within the signal separator 230. The voice activity detection module 250 may alternatively be implemented external and distinct from the signal separator 230.
  • For particular applications, it may not be necessary to use all the modules in the signal separator 230. In one example, only the BSS source separator 240 is used. In another example, all but the BSS source separator 240 is used. In a third example, the BSS source separator 240 and the post-filter module 234 are used. The signal separator 230 may implement a controller (not shown) that selectively activates or omits each of the signal processing modules within the signal separator 230, for example, depending on signal conditions, operating modes, external control, and the like.
  • On communication device 110, the microphones 112-1 and 112-2 may be placed very close to each other due to limited space. Often, the differences in the signals from each of the microphones 112-1 and 112-2 are very small. Therefore, the instantaneous correlation among microphone signals is very high. When instantaneous correlation is significant, a blind source separator may not perform adequately and may tend to cancel the most prominent signal in both microphone signals for two-microphone applications. Sometimes, a blind source separator generates annoying tonal artifacts when operating on signals having high instantaneous correlation.
  • To prevent high instantaneous correlation among the signals from the microphones 112-1 and 112-2, the pre-processing module 232 de-correlates the signals. In one embodiment, the pre-processing module 232 is configured as a digital filter having a small number (fewer than about five) of taps. One to three taps may be sufficient, although a different number of taps may be used. If three taps are used, one tap can be designated to be non-causal.
  • As an example, the pre-processing module 232 can include an adaptive de-correlator, which can be implemented as an adaptive filter with a small number of taps. The adaptive de-correlator can adjust the tap weights in order to minimize correlation or otherwise maximize de-correlation. The adaptive de-correlator can be configured to select among predetermined tap weights, predetermined sets of tap weights and configurations, or can be configured to adjust each tap weight substantially continuously and independently of other tap weight adjustments. The pre-processing module 232 can also include a calibrator that scales the output of the de-correlator in order to speed up convergence of a subsequent blind source separator.
  • The pre-processing module 232 couples the de-correlated microphone signals to a source separator 240 that can perform filtering based on, for example, Blind Source Separation (BSS). As stated above, mobile communication device 110 may be small in dimension. The small dimension not only limits the distance between microphones, but it also may limit the number of microphones that can be reasonably mounted on the communication device 110. Usually, two or, at most, three microphones are used. In general, this number of microphones does not meet the requirements for complete signal separation when there are multiple noise sources. In two-microphone configurations, as illustrated in FIG. 2, the BSS source separator 240 typically operates to separate the most prominent signal of all from all other signals. After echo cancellation, the desired speech may be expected to be the most prominent component of all signals. After signal separation, two signals are generated by the BSS source separator 240. One signal typically contains the most prominent signal and somewhat attenuated all other signals. Another signal contains all other signals and somewhat attenuated the most prominent signal.
  • Blind source separation (BSS), sometimes referred to as independent component analysis (ICA), is a method to reconstruct unknown signals based on their mixtures. These unknown signals are referred to as source signals. The adjective ‘blind’ has a twofold meaning. First, the source signals are not known or are only partially known. Only measurements of source signal mixtures are available. Second, the mixing process is not known. Signal separation is achieved by exploring a priori statistics of source signals and/or statistics observed in signal measurements.
  • Early work regarding BSS can be found in many papers. For example, S. Choi, “Blind source separation and independent component analysis: A review,” Neural Information Processing—Letters and Review, 6(1):1-57, January 2005, provides a comprehensive paper on BSS.
  • The assumption used to blindly separate signals is that all source signals are considered independent random variables, i.e. the joint distribution of all random variables is the product of that of individual random variables. This assumption can be formulated as:

  • $P_{S_1,\ldots,S_m}(s_1,\ldots,s_m) = P_{S_1}(s_1)\cdots P_{S_m}(s_m),$
  • where $P_{S_1,\ldots,S_m}(s_1,\ldots,s_m)$ is the joint probability density function (PDF) of all random variables $S_1,\ldots,S_m$ and $P_{S_j}(s_j)$ is the PDF of the $j$th random variable $S_j$.
  • Many BSS algorithms have been developed for differing applications. For example, a paper by K. Torkkola, “Blind separation of convolved sources based on information maximization,” IEEE workshop on Neural Networks for Signal Processing, Kyoto, Japan, September 1996, described an algorithm to separate convolutive signals. In this algorithm, the scalar coefficients in the recurrent neural network are replaced by FIR filters. These filters are updated recursively using adaptive filtering algorithms during signal separation. M. Girolami, “Symmetric adaptive maximum likelihood estimation for noise cancellation and signal separation,” Electronics Letters, 33(17):1437-1438, 1997, describes a similar algorithm for blind source separation. The algorithms described in the cited papers do not represent an exhaustive list of the literature describing BSS, but are provided to illustrate typical BSS algorithms that may be implemented by the source separator 240.
  • The lengths of the filters inside the BSS source separator 240 can range, for example, from 5 taps to 60 taps. The tap length of the BSS source separator is not a limitation, but rather, is selected based on a tradeoff of factors, including convergence time and steady state performance.
  • After signal separation, a post-processing module 234 may be used to further improve the separation performance by de-correlating the separated signals. Because only one signal from the source separator 240, the signal having the desired speech, is of interest, the post processing module 234 may implement only one post-filter. The post processing module 234 can filter the signal having the speech component and may perform no additional processing of the signal substantially representative of the noise component. The length of the post-filter can be configured, for example, to be longer than that of each of the two filters in the BSS source separator 240.
  • Two signals remain after signal separation and post processing. One signal contains primarily background noise and residual echo, in which the desired speech has been reduced. This signal is referred to as the noise reference signal. The other signal contains the desired speech signal and attenuated or otherwise reduced noise, interference, and echo signal components. This signal is referred to as the speech reference signal.
  • The signal separator 230 can include a voice activity detection module 250 that makes a voice activity detection decision based on the speech reference signal and noise reference signal. Voice activity detection module 250 may be coupled to the signals at the output of the signal separator 230, because these signals exhibit the greatest differential of speech and noise. However, the voice activity detection module 250 can make the voice activity decision based on the two signals at the output of any of the intermediate modules within the signal separator 230.
  • In other embodiments, the voice activity detection module 250 can be implemented external to the signal separator 230, and may operate on the signals at the output of the signal separator 230. In other embodiments, the signal separator 230 can provide access to some or all of the intermediate signal outputs, and the voice activity detection module 250 can be coupled to the signal separator 230 output or an intermediate output. The voice activity detection indication can be used by a subsequent signal processing module, as described below, to modify the signal processing performed on the speech or noise signals.
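  • The disclosure does not specify how the voice activity decision is computed from the two reference signals. As one possible illustration only, the following sketch assumes a frame energy comparison between the speech reference signal and the noise reference signal. The function name `voice_activity` and the parameters `ratio_threshold` and `eps` are hypothetical.

```python
def voice_activity(speech_frame, noise_frame, ratio_threshold=2.0, eps=1e-12):
    """Illustrative voice activity decision (an assumed energy-ratio rule).

    Declares voice activity when the speech reference frame carries more than
    ratio_threshold times the energy of the noise reference frame.
    """
    speech_energy = sum(s * s for s in speech_frame)
    noise_energy = sum(v * v for v in noise_frame)
    return speech_energy > ratio_threshold * (noise_energy + eps)
```

  • A rule of this kind relies on the separated signals exhibiting a large differential of speech and noise, which is why the sketch takes the separator outputs as its inputs.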
  • The signal separator 230 couples the speech reference signal and noise reference signal to a nonlinear processing module 260. As described earlier, the first echo canceller 220-1 may couple the echo cancellation signal to the nonlinear processing module 260. After signal separation, the speech reference signal still contains residual background noise and acoustic echo, whose correlation with noise reference signal is typically low due to the post-processing module 234 inside the signal separator 230. Therefore, it is typically not possible to use linear filtering to remove residual noise and echo from the speech reference signal. However, the residual noise and echo still may have some similarity to the noise reference signal. The spectral amplitude of the residue noise and echo may be similar to that of the noise reference signal. When similar, this similarity can be exploited to further reduce noise in the speech reference signal using nonlinear noise suppression techniques.
  • As an example, the nonlinear processing module 260 can implement spectral subtraction to further suppress residual noise and echo. In a dual-microphone noise and echo reduction application, such as shown in FIG. 2, the noise statistics can be estimated based on the noise reference signal and echo cancellation signal. The estimated noise statistics cover non-stationary noise, stationary noise as well as residual acoustic echo. The estimated noise statistics based on the noise reference signal are typically considered more accurate than noise estimates based on one microphone signal. With more accurate noise statistics, spectral subtraction is capable of performing better noise suppression. Dual-microphone spectral subtraction suppresses not only stationary noise but also non-stationary noise and residual acoustic echo.
  • After spectral subtraction or some other nonlinear processing, there typically is still residue noise and echo in the speech reference signal. The nonlinear processing module 260 couples at least the speech reference signal to a post processing module 270 for further noise shaping.
  • The residue noise can be further reduced or masked in the post-processing module 270. The post-processing module 270 can be configured to perform, for example, center clipping, comfort noise injection, and the like. The post-processing methods can be any one or combination of commonly used speech communications processing techniques.
  • The post processing module 270 can implement center clipping to apply different gains to signals at different levels. For example, the gain can be set to be unity when the signal level is above a threshold. Otherwise, it is set to be less than unity.
  • In one embodiment, the post processing module 270 assumes that the signal level is low when there is no desired speech. However, this assumption may fail in a noisy environment where the background noise level can be higher than the threshold.
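  • The level-dependent gain described above can be sketched as follows. The sketch assumes that the signal level is measured as the magnitude of each sample. The function name `center_clip` and the values of `threshold` and `low_gain` are hypothetical and chosen only for illustration.

```python
def center_clip(samples, threshold=0.05, low_gain=0.25):
    """Illustrative center clipping (threshold and low_gain are assumptions).

    Applies unity gain to samples whose magnitude exceeds the threshold and a
    gain below unity to all other samples.
    """
    return [s if abs(s) > threshold else s * low_gain for s in samples]
```

  • A practical implementation might measure the level over a frame rather than per sample; the per-sample form is used here only to keep the example short.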
  • In an alternative embodiment, the post processing module 270 applies center clipping based in part on the presence of desired speech. The post processing module 270 receives the voice activity decision from the voice activity detection module 250. The post processing module 270 can apply center clipping in the presence of voice activity. Thus, the post processing module 270 selectively applies center clipping based on the voice activity state.
  • The post processing module 270 may also use the voice activity state to selectively apply comfort noise injection. The post processing module 270 may be configured to selectively quiet the voice channel when there is an absence of voice activity. The post processing module may, for example, decrease the gain applied to the speech reference signal or decouple the speech reference signal from subsequent stages when the voice activity detection module 250 indicates with the voice activity state a lack of voice activity. The lack of any significant signal may be disconcerting to a listener, as the listener may wonder if the communication device 110 has dropped the communication link. The post processing module 270 can insert a low level of noise in the absence of speech, referred to as “comfort noise” to indicate or otherwise reassure a listener of the presence of the communication link.
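  • Comfort noise injection gated by the voice activity state can be sketched as follows. The sketch assumes that the voice channel is fully replaced by low level uniform noise when no voice activity is indicated. The function name `inject_comfort_noise` and the parameters `noise_level` and `rng` are hypothetical.

```python
import random


def inject_comfort_noise(frame, voice_active, noise_level=0.001, rng=None):
    """Illustrative comfort noise injection (names and levels are assumptions).

    Passes the speech reference frame through unchanged during voice activity;
    otherwise replaces it with low level uniform noise of the same length.
    """
    if voice_active:
        return list(frame)
    rng = rng or random.Random()
    return [rng.uniform(-noise_level, noise_level) for _ in frame]
```

  • Other designs, such as attenuating the frame and adding shaped noise, would fit the description equally well.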
  • The post processing module 270 output represents the output of the noise and echo reduction system 200. The processed speech reference signal is coupled to the back end processing module 280 such as a speech encoder or an audio encoder. If desired, the post processing module 270 may also couple the noise reference signal to subsequent stages, although seldom is this necessary.
  • FIG. 3A is a simplified functional block diagram of an embodiment of a non-linear processing module 260 implementing spectral subtraction. In the embodiment of FIG. 3A, the non-linear processing module 260 transforms the speech reference signal to the frequency domain and performs frequency selective gain, where the frequency selectivity is based on the number of frequency bins or subbands in the frequency domain. The embodiment of FIG. 3A can be used, for example, in the noise and echo reduction system 200 of FIG. 2.
  • The non-linear processing module 260 includes a first frequency transform module 312 configured to receive the speech reference signal and transform it to the frequency domain. The first frequency transform module 312 can be configured, for example, to accept a serial signal input and provide a parallel signal output, where each of the output signals is representative of signals within a particular frequency subband. The outputs of the first frequency transform module 312 may be coupled to frequency selective variable gain modules 340-1 to 340-N that are each configured to selectively apply a gain to corresponding frequency bins. For example, the first variable gain module 340-1 receives a first output from the first frequency transform module 312 and applies a controllable gain to the first frequency bin. The output of the variable gain modules 340-1 to 340-N may be coupled to a time transform module 350 configured to transform the frequency domain processed speech reference signal back to a time domain representation.
  • The non-linear processing module 260 also includes a second frequency transform module 314 configured to receive the noise reference signal and transform it to a frequency domain representation. The second frequency transform module 314 is illustrated as generating the same number of frequency bins as produced by the first frequency transform module 312.
  • The second frequency transform module 314 may couple the frequency domain representation of the noise reference signal to noise estimators 320-1 to 320-N. Each frequency bin output from the second frequency transform module 314 may be coupled to a distinct noise estimator, e.g. 320-1. Each of the noise estimators 320-1 to 320-N can be configured to estimate the noise within its associated frequency bin.
  • The noise estimators 320-1 to 320-N couple the noise estimate values to respective spectrum gain controllers 330-1 to 330-N. The spectrum gain controllers 330-1 to 330-N operate to vary the frequency selective gain of the variable gain modules 340-1 to 340-N based at least in part on the noise estimate values.
  • Each of the frequency transform modules 312 and 314 can be configured to perform the frequency transform as a Discrete Fourier Transform, Fast Fourier Transform, or some other transform. Typically, the first and second frequency transform modules 312 and 314 are configured to generate the same number of frequency bins, although that is not a limitation.
  • The noise estimators 320-1 to 320-N can be configured to determine a noise magnitude, noise power, noise energy, noise floor, and the like, or some other measure of noise within each frequency bin. The noise estimators 320-1 to 320-N can include memory (not shown) to store one or more previous noise estimates. The noise estimators 320-1 to 320-N can be configured to generate a time moving average or some other weighted average of noise.
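  • One form of weighted average that a per-bin noise estimator could use is an exponentially weighted moving average of the noise magnitude. The following sketch is an assumption made for illustration; the function name `update_noise_estimate` and the parameter `smoothing` are hypothetical.

```python
def update_noise_estimate(previous, noise_magnitudes, smoothing=0.9):
    """Illustrative per-bin noise estimate update (an assumed smoothing rule).

    previous         -- stored noise estimates, one per frequency bin
    noise_magnitudes -- current noise reference magnitudes, one per bin
    Returns the new list of per-bin estimates.
    """
    return [
        smoothing * p + (1.0 - smoothing) * m
        for p, m in zip(previous, noise_magnitudes)
    ]
```

  • The stored `previous` list plays the role of the memory that holds earlier noise estimates.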
  • The spectrum gain controllers 330-1 to 330-N can be configured to apply a gain to each of the frequency bins based on the value of the noise estimate and the corresponding speech reference signal within that frequency bin. In one embodiment, each of the spectrum gain controllers 330-1 to 330-N is configured to apply one of a predetermined number of gain values based on the noise estimate value and the corresponding speech reference signal. In another embodiment, each of the gain controllers 330-1 to 330-N can generate a substantially continuous gain control value based on the value of the noise estimate and the corresponding speech reference signal within a particular frequency bin. Discussions regarding the general concept of spectral subtraction may be found in S. F. Boll, “Suppression of Acoustic Noise in Speech Using Spectral Subtraction,” IEEE Trans. Acoustics, Speech and Signal Processing, 27(2): 112-120, April 1979.
  • The variable gain modules 340-1 to 340-N can be configured to apply an independent gain to each of the frequency bins based on the control value applied by the respective gain controller 330-1 to 330-N. For example, the first variable gain module 340-1 can be configured to apply a gain in the range of 0-1 to the corresponding frequency bin based on the gain control value associated with the frequency bin.
  • The time transform module 350 may be configured to perform substantially the complement of the process performed by the first frequency transform module 312. For example, the time transform module 350 can be configured to perform an Inverse Discrete Fourier Transform or an Inverse Fast Fourier Transform.
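  • As an illustration only, the per-bin processing described for FIG. 3A can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names, the smoothing factor `alpha`, the gain `floor`, and the power-subtraction gain rule are all hypothetical choices, and a naive DFT stands in for the FFT so that the example needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) Discrete Fourier Transform of one frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(bins):
    """Inverse DFT; the real part is returned as the time-domain frame."""
    n = len(bins)
    return [(sum(bins[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def update_noise_estimate(prev, noise_bins, alpha=0.9):
    """Per-bin moving average of noise power (alpha is an assumed value)."""
    return [alpha * p + (1.0 - alpha) * abs(b) ** 2
            for p, b in zip(prev, noise_bins)]


def spectral_gain(speech_bin, noise_power, floor=0.05):
    """Continuous gain in [floor, 1] from an assumed power-subtraction rule."""
    speech_power = abs(speech_bin) ** 2
    if speech_power == 0.0:
        return floor
    return max(floor, min(1.0, 1.0 - noise_power / speech_power))


def suppress_frame(speech_frame, noise_frame, noise_est):
    """Transform, estimate noise, apply one gain per bin, transform back."""
    speech_bins = dft(speech_frame)
    noise_bins = dft(noise_frame)
    noise_est = update_noise_estimate(noise_est, noise_bins)
    gains = [spectral_gain(s, p) for s, p in zip(speech_bins, noise_est)]
    out = idft([g * s for g, s in zip(gains, speech_bins)])
    return out, noise_est
```

  • In this sketch the returned noise estimate plays the role of the stored previous estimate and would be passed back in with the next frame. A practical system would typically also window and overlap the frames, which is omitted here.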
  • FIG. 3B is a simplified functional block diagram of another embodiment of a non-linear processing module 260 implementing spectral subtraction. In the embodiment of FIG. 3B, the non-linear processing module 260 transforms the speech reference signal to the frequency domain and performs frequency selective gain. The embodiment of FIG. 3B can be used, for example, in the noise and echo reduction system 200 of FIG. 2.
  • Similar to the embodiment shown in FIG. 3A, the non-linear processing module 260 embodiment of FIG. 3B includes a first frequency transform module 312 configured to receive the speech reference signal and transform it to the frequency domain. The first frequency transform module 312 can be configured to generate a parallel output having a predetermined number, N, of outputs, where each output corresponds to a frequency bin or band. For example, the first frequency transform module 312 can be configured as an N-point FFT.
  • The outputs from the first frequency transform module 312 may be coupled to a frequency selective variable gain module 340 that is configured to selectively apply a gain to each of the frequency bins. The outputs of the variable gain module 340 may be coupled to a time transform module 350 configured to transform the frequency domain processed speech reference signal back to a time domain representation.
  • Each of the frequency bin outputs may also be coupled to an input of a corresponding spectral gain controller 330-1 through 330-N. Each of the spectral gain controllers 330-1 through 330-N is configured to generate a gain control signal for its corresponding frequency bin. The gain control signal from each of the spectral gain controllers 330-1 through 330-N may be coupled to a gain control input of the variable gain module 340 associated with the corresponding frequency bin.
  • The non-linear processing module 260 also includes a second frequency transform module 314 configured to receive the noise reference signal and transform it to a frequency domain representation. Typically, the second frequency transform module 314 may be configured to output the same number of frequency bins, N, that are output from the first frequency transform module 312, but this is not an absolute requirement. Each output from the second frequency transform module 314, representing the noise in a corresponding frequency bin, may be coupled to an input of a corresponding spectral gain controller 330-1 through 330-N.
  • A third frequency transform module 316 may be configured to receive the echo estimate signal from an echo canceller, such as the first echo canceller shown in the system of FIG. 1. The third frequency transform module 316 may be configured to transform the echo estimate signal to a frequency domain representation, and typically transforms the echo estimate signal to the same number of frequency bins determined by the first and second frequency transform modules 312 and 314. Each output from the third frequency transform module 316, representing the echo estimate spectral component in a corresponding frequency bin, may be coupled to an input of a corresponding spectral gain controller 330-1 through 330-N.
  • Each spectral gain controller 330-1 through 330-N may be configured to process the speech reference spectral component, noise reference spectral component, and echo estimate spectral component for a particular frequency bin. Thus, the non-linear processing module 260 embodiment of FIG. 3B utilizes N distinct spectral gain controllers 330-1 through 330-N.
  • The noise and residual echo present in the speech reference signal may be similar to the noise reference signal and echo estimate signal. Each spectral gain controller 330-1 through 330-N can determine the level of similarity on an individual frequency bin basis to determine the level of gain control to apply to the frequency bin.
  • The output from each spectral gain controller 330-1 through 330-N may control the gain that the frequency selective variable gain module 340 applies to the corresponding frequency bin. Therefore, in the embodiment of FIG. 3B, the frequency selective variable gain module 340 can independently control the gain in N distinct frequency bins.
  • The outputs of the frequency selective variable gain module 340 may be coupled to a time transform module 350 for transform back to a time domain signal, as described in the embodiment of FIG. 3A.
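  • A single spectral gain controller of the FIG. 3B embodiment, which sees the speech reference, noise reference, and echo estimate components of one frequency bin, might be sketched as follows. This is a hedged illustration only: the text does not specify how similarity is measured, so the power-subtraction rule, the function name `gain_control`, and the discrete gain `levels` are assumptions chosen to show the "predetermined number of gain values" variant.

```python
def gain_control(speech_bin, noise_bin, echo_bin, levels=(0.1, 0.5, 1.0)):
    """Pick one of a fixed set of gains for a single frequency bin.

    The more of the speech-reference power that the noise reference and
    the echo estimate account for, the lower the selected gain.
    The rule and the levels are illustrative assumptions.
    """
    speech_power = abs(speech_bin) ** 2
    if speech_power == 0.0:
        return min(levels)
    unwanted_power = abs(noise_bin) ** 2 + abs(echo_bin) ** 2
    # Continuous gain, clamped to the range 0-1.
    continuous = max(0.0, min(1.0, 1.0 - unwanted_power / speech_power))
    # Quantize to the nearest predetermined gain value.
    return min(levels, key=lambda level: abs(level - continuous))


def apply_gains(speech_bins, noise_bins, echo_bins):
    """Apply an independent gain to each of the N frequency bins."""
    return [gain_control(s, n, e) * s
            for s, n, e in zip(speech_bins, noise_bins, echo_bins)]
```

  • Returning the clamped continuous value instead of the quantized one would correspond to the substantially continuous gain control variant.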
  • FIG. 4 is a simplified functional block diagram of an embodiment of a speech post-processing module 270. The embodiment of FIG. 4 can be used, for example, in the noise and echo reduction system 200 of FIG. 2.
  • The speech post-processing module 270 is configured to provide both center clipping and comfort noise injection in the absence of voice activity. The post-processing module 270 includes a variable gain module 410 configured to receive the speech reference signal and apply a gain based at least in part on the voice activity state. The variable gain module 410 may couple the amplified/attenuated output to the first input of a signal combiner 440, illustrated as a signal summer.
  • The post-processing module 270 also includes a gain controller 420 configured to receive the voice activity state from a voice activity detection module (not shown). The gain controller 420 may control the gain of the variable gain module 410 based in part on the voice activity state.
  • The gain controller 420 can be configured to control the gain of the variable gain module 410 to be unity or some other predetermined value if the voice activity state indicates the presence of voice activity. The gain controller 420 can be configured to control the gain of the variable gain module 410 to be less than unity or less than the predetermined value when the voice activity state indicates the absence of voice activity. In one embodiment, the gain controller 420 can be configured to control the gain of the variable gain module 410 to substantially attenuate the speech reference signal in the absence of voice activity.
  • A comfort noise generator 430 may receive the voice activity state as a control input. The comfort noise generator 430 can be configured to generate a noise signal, such as a white noise signal, that can be injected into the audio channel in the absence of voice activity.
  • Thus, the gain controller 420 and comfort noise generator 430 may each be active on complementary states of the voice activity decision. When the voice activity state indicates presence of voice activity, the post-processing module 270 may output substantially the speech reference signal. When the voice activity state indicates absence of voice activity, the post-processing module 270 may output substantially the comfort noise signal.
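  • The complementary behavior of the gain controller 420 and the comfort noise generator 430 can be illustrated with a short sketch. The function name, the `inactive_gain` and `comfort_level` values, and the use of uniform white noise are assumptions made for the example and are not taken from the disclosure.

```python
import random


def post_process(frame, voice_active, inactive_gain=0.0,
                 comfort_level=0.01, rng=None):
    """Pass speech through when voice is active; otherwise attenuate the
    frame and add low-level white noise (parameter values are assumed)."""
    source = rng if rng is not None else random
    if voice_active:
        # Unity gain and no comfort noise while voice activity is present.
        return list(frame)
    # Attenuated signal path summed with the comfort noise path.
    return [inactive_gain * sample
            + source.uniform(-comfort_level, comfort_level)
            for sample in frame]
```

  • With `inactive_gain` set to zero, the output in the absence of voice activity is substantially the comfort noise alone, which matches the behavior described above.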
  • FIG. 5 is a simplified flowchart of an embodiment of a method 500 of noise and echo reduction. The method 500 can be performed by the communication device of FIG. 1 or FIG. 2 or by the noise and echo reduction system within the communication device of FIG. 2.
  • The method 500 begins at block 510 where the communication device receives multiple microphone signals, for example, from two distinct microphones. The communication device proceeds to block 520 and cancels the echo in each of the received microphone signals. The echo can be considered to be a signal that originates at the communication device that couples to the received microphone signal path. The coupling can be acoustic, mechanical, or can be electrical, via a coupling path within the communication device.
  • The communication device can be configured to independently cancel the echo in each microphone path, as the coupling of the echo signal to each of the paths is likely independent. The communication device can be configured to cancel the echo using an adaptive filter whose taps are varied to minimize a metric of the echo canceled signal. For example, each echo canceller can utilize a normalized least mean square (NLMS) algorithm to minimize the echo signal component in the echo canceled signal.
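  • An adaptive echo canceller of the kind described, using a normalized least mean square (NLMS) update, can be sketched as follows. This is a generic textbook form offered as an illustration; the function name, the tap count, the step size `mu`, and the regularization constant `eps` are assumed values rather than parameters given in the disclosure.

```python
def nlms_echo_cancel(mic, far_end, num_taps=8, mu=0.5, eps=1e-8):
    """Cancel the echo of far_end in mic with an NLMS adaptive filter.

    Returns the echo canceled signal and the final tap weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    out = []
    for mic_sample, far_sample in zip(mic, far_end):
        history = [far_sample] + history[:-1]
        # Filtered echo signal (the echo estimate).
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        # The echo canceled sample doubles as the feedback (error) signal.
        error = mic_sample - echo_estimate
        # Normalize the step by the energy of the samples in the filter.
        energy = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / energy
                   for w, h in zip(weights, history)]
        out.append(error)
    return out, weights
```

  • One such canceller would be run per microphone path, each with its own tap weights, since the echo coupling into each path is treated as independent.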
  • After canceling or otherwise reducing the echo signal component within the microphone signals, the communication device performs signal separation, where the speech signal component is separated or otherwise isolated from the noise signal component. The communication device proceeds to block 530 and de-correlates the microphone signals, for example, by passing at least one of the microphone signals through a linear filter. The linear filter can be an adaptive filter comprising a number of taps, but typically one to three taps are used. The tap weights can be adjusted to minimize the instantaneous correlation between two microphone signals. In other embodiments, the filter can be a fixed filter that is configured to de-correlate the two microphone signals.
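  • The de-correlation step can be illustrated with a short adaptive filter. The sketch below assumes a plain LMS update that drives the correlation between the filtered second signal and the recent samples of the first signal toward zero; the disclosure does not specify the update rule, so the function name, the step size `mu`, and the update itself are hypothetical.

```python
def decorrelate(x1, x2, num_taps=3, mu=0.01):
    """De-correlate x2 from x1 with a short adaptive linear filter.

    Returns the de-correlated second signal and the final tap weights.
    The first signal is left unchanged.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent x1 samples, newest first
    y2 = []
    for a, b in zip(x1, x2):
        history = [a] + history[:-1]
        # Remove the part of x2 that is linearly predictable from x1.
        out = b - sum(w * h for w, h in zip(weights, history))
        # Move each tap in the direction that reduces the instantaneous
        # correlation between the output and the x1 samples.
        weights = [w + mu * out * h for w, h in zip(weights, history)]
        y2.append(out)
    return y2, weights
```

  • A fixed-filter variant would simply skip the weight update and use predetermined tap values.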
  • The communication device proceeds to block 540 and separates the speech from the noise by performing Blind Source Separation (BSS) on the two microphone signals. The result of BSS may be two distinct signals, one having substantially the speech signal and the other having substantially the noise signal.
  • The communication device proceeds to block 550 and performs post separation processing by passing one of the speech signal or noise signal through a linear filter to de-correlate any residual noise remaining on the two signals.
  • The communication device proceeds to block 560 and performs non-linear noise suppression. In one embodiment, the communication device can be configured to perform spectral subtraction. The communication device can perform spectral subtraction by adjusting a frequency selective gain to the speech reference signal that operates, effectively, to reduce noise and residual echo in the speech reference signal.
  • The communication device proceeds to block 570 and performs any additional post processing of the speech reference signal that may be desired. For example, the communication device can perform center clipping, and can do so based on the voice activity state. Similarly, the communication device can perform comfort noise injection and can inject the comfort noise signal in the absence of voice activity. The output of the post processing stage or stages represents the processed speech signal.
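  • The ordering of blocks 520 through 570 of the method 500 can be summarized as a simple composition of stages. The sketch below is only a structural outline: the function name `run_method_500` and the callable parameters are hypothetical, and each stage is supplied by the caller rather than implemented here.

```python
def run_method_500(mic1, mic2, far_end, *, cancel_echo, decorrelate,
                   separate, post_separate, suppress, post_process):
    """Chain the stages of method 500 for two microphone signals.

    Each keyword argument is a caller-supplied callable for one block.
    """
    # Block 520: cancel the echo independently in each microphone path.
    echo_free_1 = cancel_echo(mic1, far_end)
    echo_free_2 = cancel_echo(mic2, far_end)
    # Block 530: de-correlate the echo canceled microphone signals.
    dec_1, dec_2 = decorrelate(echo_free_1, echo_free_2)
    # Block 540: Blind Source Separation into speech and noise references.
    speech_ref, noise_ref = separate(dec_1, dec_2)
    # Block 550: post separation processing of the two references.
    speech_ref, noise_ref = post_separate(speech_ref, noise_ref)
    # Block 560: non-linear noise suppression on the speech reference.
    speech_ref = suppress(speech_ref, noise_ref)
    # Block 570: additional post processing, such as center clipping
    # and comfort noise injection.
    return post_process(speech_ref)
```

  • Passing the stages in as callables keeps the outline independent of any particular echo canceller, separation algorithm, or suppression rule.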
  • FIG. 6 is a simplified functional block diagram of an embodiment of communication device 110 implementing a two-microphone noise and echo reduction system. The communication device 110 includes two microphones 112-1 and 112-2 and a speaker 130 as in the embodiment of FIG. 2.
  • The communication device 110 includes a means for reducing noise and echo 600 configured as a means for receiving the multiple microphone signals. The means for reducing noise and echo 600 includes first and second means for performing echo cancellation 620-1 and 620-2 on each of the two microphone signals. Each of the means for performing echo cancellation 620-1 and 620-2 operates in conjunction with a corresponding means for combining signals 622-1 and 622-2.
  • The communication device 110 includes means for signal separation 630 that includes means for de-correlating the multiple microphone signals 632 that can be configured as an adaptive filter for de-correlating the first and second echo canceled microphone signals. The means for signal separation 630 further includes means for separating 640 a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals that can be configured as a means for Blind Source Separating the speech signal component from the noise signal component. A means for post processing 634 in the means for signal separation 630 can be configured to de-correlate a residual noise signal in the speech reference signal from the noise reference signal.
  • The communication device 110 also includes means for performing non-linear noise suppression 660 on a speech reference signal of the separated microphone signals. The means for performing non-linear noise suppression 660 can be followed by a means for performing post processing 670 of the speech reference signal.
  • A means for voice activity detecting 650 may operate in conjunction with the means for performing post processing 670 and may determine and provide a voice activity state. The output of the means for reducing noise and echo 600 may be coupled to a means for back end signal processing 680 which operates to process the speech reference signal and couple it to a means for providing an air interface 690.
  • Speech signals received by the means for providing an air interface 690 are coupled to the means for back end signal processing 680, which formats the signal for output. The output signal is coupled to a means for volume control and speaker compensation 682, which adjusts the amplitude of the signal to adjust the speaker volume. The output signal may be coupled to the speaker 130 as well as to each of the means for echo canceling 620-1 and 620-2.
  • Multiple microphone noise and echo reduction is presented in the context of a communication device. In the present disclosure, the emphasis is given to two-microphone noise and echo reduction applications. However, the principle can be generalized to multiple-microphone noise and echo reduction applications. In such cases, additional microphones are used and more adaptive echo cancellers may be needed as well. The BSS algorithm separates multiple mixed signals into multiple separated signals. Among all separated signals, typically only one signal, the speech reference signal, is of interest. All other signals are considered different versions of noise reference signals. The various noise reference signals can be used to further reduce residual noise and echo in the speech reference signal.
  • As used herein, the term coupled or connected is used to mean an indirect coupling as well as a direct coupling or connection. Where two or more blocks, modules, devices, or apparatus are coupled, there may be one or more intervening blocks between the two coupled blocks.
  • The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), a Reduced Instruction Set Computer (RISC) processor, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • The steps of a method, process, or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The various steps or acts in a method or process may be performed in the order shown, or may be performed in another order. Specifically, a circuit or a number of circuits may be used to implement the various steps or acts in a method or process. The circuits may all be part of an integrated circuit, or some of the circuits may be used outside an integrated circuit, or each circuit may be implemented as an integrated circuit. Additionally, one or more process or method steps may be omitted or one or more process or method steps may be added to the methods and processes. An additional step, block, or action may be added in the beginning, end, or intervening existing elements of the methods and processes.
  • The above description of the disclosed embodiments is provided to enable any person of ordinary skill in the art to make or use the disclosure. Various modifications to these embodiments will be readily apparent to those of ordinary skill in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (25)

1. A method of noise reduction in multiple microphone communication devices, the method comprising:
receiving multiple microphone signals;
de-correlating the multiple microphone signals;
separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals; and
performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
2. The method of claim 1, further comprising performing center clipping of the speech reference signal based on a voice activity state.
3. The method of claim 1, further comprising inserting comfort noise to the speech reference signal based on a voice activity state.
4. The method of claim 1, further comprising performing echo cancellation on each of the multiple microphone signals.
5. The method of claim 1, wherein de-correlating the multiple microphone signals comprises filtering at least one of the multiple microphone signals.
6. The method of claim 5, wherein filtering comprises at least one of filtering using a multi-tap filter having at least one non-causal tap or adaptive filtering the at least one of the multiple microphone signals.
7. The method of claim 1, wherein separating the speech signal component from the noise signal comprises Blind Source Separating the speech signal component.
8. The method of claim 1, wherein performing non-linear noise suppression comprises performing spectral subtraction on the speech reference signal.
9. The method of claim 8, wherein performing spectral subtraction comprises:
estimating a noise within a frequency bin based on a noise reference signal from the separated microphone signals; and
adjusting a gain applied to a portion of the speech reference signal within the frequency bin based on the noise in the frequency bin.
10. The method of claim 1, further comprising:
performing echo cancellation on each of the multiple microphone signals; and
de-correlating a residual noise in the speech reference signal from the noise signal.
11. The method of claim 10, wherein performing non-linear noise suppression comprises performing spectral subtraction on the speech reference signal based on a noise estimate derived from the noise signal.
12. An apparatus for noise reduction in multiple microphone systems, the apparatus comprising:
a first echo canceller configured to cancel an echo in a first microphone signal to generate a first echo canceled microphone signal;
a second echo canceller configured to cancel an echo in a second microphone signal to generate a second echo canceled microphone signal;
a signal separator configured to receive the first and second echo canceled microphone signals and separate a speech signal component from a noise signal component to generate a speech reference signal and a noise reference signal; and
a non-linear processing module configured to receive the speech reference signal and noise reference signal and perform non-linear processing on the speech reference signal.
13. The apparatus of claim 12, further comprising a post processing module configured to implement center clipping on the speech reference signal output by the non-linear processing module based on a voice activity state.
14. The apparatus of claim 13, further comprising a voice activity detection module configured to determine the voice activity state based on the speech reference signal and noise reference signal.
15. The apparatus of claim 12, further comprising a post processing module configured to implement comfort noise injection on the speech reference signal output by the non-linear processing module based on a voice activity state.
16. The apparatus of claim 12, wherein the first echo canceller comprises:
an adaptive filter configured to receive an echo signal source and provide a filtered echo signal and configured to minimize a metric determined based on a feedback signal; and
a signal summer configured to subtract the filtered echo signal from the first microphone signal, and configured to couple the first echo canceled microphone signal as the feedback signal.
17. The apparatus of claim 12, wherein the signal separator comprises:
a de-correlator configured to de-correlate the first echo canceled microphone signal from the second echo canceled microphone signal; and
a Blind Source Separator configured to separate a speech signal component from a noise signal component based on de-correlated first echo canceled microphone signal and the second echo canceled microphone signal from the de-correlator.
18. The apparatus of claim 17, wherein the signal separator further comprises a post processing module configured to de-correlate a residual noise in the speech reference signal from the noise reference signal output from the Blind Source Separator.
19. An apparatus for noise reduction in multiple microphone systems, the apparatus comprising:
means for receiving multiple microphone signals;
means for de-correlating the multiple microphone signals;
means for separating a speech signal component from a noise signal in at least one of the multiple microphone signals to generate separated microphone signals; and
means for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
20. The apparatus of claim 19, further comprising means for performing echo cancellation on each of the multiple microphone signals.
21. A computer-readable media including instructions that may be utilized by one or more processors, the computer-readable media comprising:
instructions for de-correlating multiple received microphone signals;
instructions for separating a speech signal component from a noise signal in at least one of the multiple received microphone signals to generate separated microphone signals; and
instructions for performing non-linear noise suppression on a speech reference signal of the separated microphone signals.
22. The computer-readable media of claim 21, wherein the instructions for separating the speech signal comprise instructions for Blind Source Separating the speech signal component.
23. A circuit for noise reduction in multiple microphone systems, the circuit comprising:
a first echo canceller configured to cancel an echo in a first microphone signal to generate a first echo canceled microphone signal;
a second echo canceller configured to cancel an echo in a second microphone signal to generate a second echo canceled microphone signal;
a signal separator configured to receive the first and second echo canceled microphone signals and separate a speech signal component from a noise signal component to generate a speech reference signal and a noise reference signal; and
a non-linear processing module configured to receive the speech reference signal and noise reference signal and perform non-linear processing on the speech reference signal.
24. The circuit of claim 23, further comprising a post processing module configured to implement center clipping on the speech reference signal output by the non-linear processing module based on a voice activity state.
25. The circuit of claim 24, wherein the circuit is an integrated circuit.
US11/864,906 2007-09-28 2007-09-28 Apparatus and method of noise and echo reduction in multiple microphone audio systems Active 2031-03-08 US8175871B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/864,906 US8175871B2 (en) 2007-09-28 2007-09-28 Apparatus and method of noise and echo reduction in multiple microphone audio systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/864,906 US8175871B2 (en) 2007-09-28 2007-09-28 Apparatus and method of noise and echo reduction in multiple microphone audio systems

Publications (2)

Publication Number Publication Date
US20090089054A1 US20090089054A1 (en) 2009-04-02
US8175871B2 US8175871B2 (en) 2012-05-08

Family

ID=40509369

Family Applications (1)

Application Number Title Priority Date Filing Date
US11/864,906 Active 2031-03-08 US8175871B2 (en) 2007-09-28 2007-09-28 Apparatus and method of noise and echo reduction in multiple microphone audio systems

Country Status (1)

Country Link
US (1) US8175871B2 (en)

Cited By (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080240465A1 (en) * 2007-03-27 2008-10-02 Sony Corporation Sound reproducing device and sound reproduction method
US20090192803A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods, and apparatus for context replacement by audio level
US20090238377A1 (en) * 2008-03-18 2009-09-24 Qualcomm Incorporated Speech enhancement using multiple microphones on multiple devices
US20090323924A1 (en) * 2008-06-25 2009-12-31 Microsoft Corporation Acoustic echo suppression
US20100232616A1 (en) * 2009-03-13 2010-09-16 Harris Corporation Noise error amplitude reduction
US20110081026A1 (en) * 2009-10-01 2011-04-07 Qualcomm Incorporated Suppressing noise in an audio signal
US20120045074A1 (en) * 2010-08-17 2012-02-23 C-Media Electronics Inc. System, method and apparatus with environmental noise cancellation
US20120095755A1 (en) * 2009-06-19 2012-04-19 Fujitsu Limited Audio signal processing system and audio signal processing method
US20120128168A1 (en) * 2010-11-18 2012-05-24 Texas Instruments Incorporated Method and apparatus for noise and echo cancellation for two microphone system subject to cross-talk
US20120163612A1 (en) * 2008-01-21 2012-06-28 Skype Limited Communication System
US20120232890A1 (en) * 2011-03-11 2012-09-13 Kabushiki Kaisha Toshiba Apparatus and method for discriminating speech, and computer readable medium
US20130136266A1 (en) * 2011-11-30 2013-05-30 David McClain System for Dynamic Spectral Correction of Audio Signals to Compensate for Ambient Noise
US20130294611A1 (en) * 2012-05-04 2013-11-07 Sony Computer Entertainment Inc. Source separation by independent component analysis in conjuction with optimization of acoustic echo cancellation
US20130301840A1 (en) * 2012-05-11 2013-11-14 Christelle Yemdji Methods for processing audio signals and circuit arrangements therefor
TWI423688B (en) * 2010-04-14 2014-01-11 Alcor Micro Corp Voice sensor with electromagnetic wave receiver
US20140119552A1 (en) * 2012-10-26 2014-05-01 Broadcom Corporation Loudspeaker localization with a microphone array
TWI458361B (en) * 2010-09-14 2014-10-21 C Media Electronics Inc System, method and apparatus with environment noise cancellation
US8880395B2 (en) 2012-05-04 2014-11-04 Sony Computer Entertainment Inc. Source separation by independent component analysis in conjunction with source direction information
US8886526B2 (en) 2012-05-04 2014-11-11 Sony Computer Entertainment Inc. Source separation using independent component analysis with mixed multi-variate probability density function
US8964998B1 (en) * 2011-06-07 2015-02-24 Sound Enhancement Technology, Llc System for dynamic spectral correction of audio signals to compensate for ambient noise in the listener's environment
US20150065199A1 (en) * 2013-09-05 2015-03-05 Saurin Shah Mobile phone with variable energy consuming speech recognition module
US9099096B2 (en) 2012-05-04 2015-08-04 Sony Computer Entertainment Inc. Source separation by independent component analysis with moving constraint
US9185487B2 (en) 2006-01-30 2015-11-10 Audience, Inc. System and method for providing noise suppression utilizing null processing noise subtraction
EP2966646A1 (en) * 2014-07-09 2016-01-13 2236008 Ontario Inc. System and method for acoustic management
CN105261363A (en) * 2015-09-18 2016-01-20 深圳前海达闼科技有限公司 Voice recognition method, device and terminal
US20160066087A1 (en) * 2006-01-30 2016-03-03 Ludger Solbach Joint noise suppression and acoustic echo cancellation
US20160127561A1 (en) * 2014-10-31 2016-05-05 Imagination Technologies Limited Automatic Tuning of a Gain Controller
US9424859B2 (en) 2012-11-21 2016-08-23 Harman International Industries Canada Ltd. System to control audio effect parameters of vocal signals
US9461702B2 (en) * 2012-09-06 2016-10-04 Imagination Technologies Limited Systems and methods of echo and noise cancellation in voice communication
WO2017000772A1 (en) * 2015-06-30 2017-01-05 芋头科技(杭州)有限公司 Front-end audio processing system
US9558755B1 (en) * 2010-05-20 2017-01-31 Knowles Electronics, Llc Noise suppression assisted automatic speech recognition
US9595997B1 (en) * 2013-01-02 2017-03-14 Amazon Technologies, Inc. Adaption-based reduction of echo and noise
JP2017069745A (en) * 2015-09-30 2017-04-06 沖電気工業株式会社 Sound source separation and echo suppression device, sound source separation and echo suppression program, and sound source separation and echo suppression method
US20170110142A1 (en) * 2015-10-18 2017-04-20 Kopin Corporation Apparatuses and methods for enhanced speech recognition in variable environments
US9640194B1 (en) 2012-10-04 2017-05-02 Knowles Electronics, Llc Noise suppression for speech processing based on machine-learning mask estimation
US9648421B2 (en) 2011-12-14 2017-05-09 Harris Corporation Systems and methods for matching gain levels of transducers
US9668048B2 (en) 2015-01-30 2017-05-30 Knowles Electronics, Llc Contextual switching of microphones
WO2017112343A1 (en) * 2015-12-24 2017-06-29 Intel Corporation Audio signal processing in noisy environments
US9699554B1 (en) 2010-04-21 2017-07-04 Knowles Electronics, Llc Adaptive signal equalization
US20170208170A1 (en) * 2013-12-23 2017-07-20 Imagination Technologies Limited Echo Path Change Detector
US9799330B2 (en) 2014-08-28 2017-10-24 Knowles Electronics, Llc Multi-sourced noise suppression
US9830899B1 (en) * 2006-05-25 2017-11-28 Knowles Electronics, Llc Adaptive noise cancellation
US9838784B2 (en) 2009-12-02 2017-12-05 Knowles Electronics, Llc Directional audio capture
CN107483761A (en) * 2016-06-07 2017-12-15 电信科学技术研究院 A kind of echo suppressing method and device
US9978388B2 (en) 2014-09-12 2018-05-22 Knowles Electronics, Llc Systems and methods for restoration of speech components
CN108275159A (en) * 2017-01-04 2018-07-13 2236008安大略有限公司 Speech interfaces and vocal music entertainment systems
WO2018229464A1 (en) * 2017-06-13 2018-12-20 Sandeep Kumar Chintala Noise cancellation in voice communication systems
CN109313910A (en) * 2016-05-19 2019-02-05 微软技术许可有限责任公司 The constant training of displacement of the more speaker speech separation unrelated for talker
US10306389B2 (en) 2013-03-13 2019-05-28 Kopin Corporation Head wearable acoustic system with noise canceling microphone geometry apparatuses and methods
US10339952B2 (en) 2013-03-13 2019-07-02 Kopin Corporation Apparatuses and systems for acoustic channel auto-balancing during multi-channel signal extraction
WO2019172865A1 (en) * 2018-03-05 2019-09-12 Hewlett-Packard Development Company, L.P. Use mode-based microphone processing application modifications
US10468020B2 (en) * 2017-06-06 2019-11-05 Cypress Semiconductor Corporation Systems and methods for removing interference for audio pattern recognition
US10535360B1 (en) * 2017-05-25 2020-01-14 Tp Lab, Inc. Phone stand using a plurality of directional speakers
CN111755020A (en) * 2020-08-07 2020-10-09 南京时保联信息科技有限公司 Stereo echo cancellation method
US10819858B2 (en) * 2018-08-12 2020-10-27 AAC Technologies Pte. Ltd. Method for improving echo cancellation effect and system thereof
CN112002339A (en) * 2020-07-22 2020-11-27 海尔优家智能科技(北京)有限公司 Voice noise reduction method and device, computer-readable storage medium and electronic device
US10999444B2 (en) * 2018-12-12 2021-05-04 Panasonic Intellectual Property Corporation Of America Acoustic echo cancellation device, acoustic echo cancellation method and non-transitory computer readable recording medium recording acoustic echo cancellation program
CN113077808A (en) * 2021-03-22 2021-07-06 北京搜狗科技发展有限公司 Voice processing method and device for voice processing
CN114222225A (en) * 2022-02-22 2022-03-22 深圳市技湛科技有限公司 Howling suppression method and device for sound amplification equipment, sound amplification equipment and storage medium
US11368803B2 (en) * 2012-06-28 2022-06-21 Sonos, Inc. Calibration of playback device(s)
US11516612B2 (en) 2016-01-25 2022-11-29 Sonos, Inc. Calibration based on audio content
US11528578B2 (en) 2011-12-29 2022-12-13 Sonos, Inc. Media playback based on sensor data
US11531514B2 (en) 2016-07-22 2022-12-20 Sonos, Inc. Calibration assistance
US11625219B2 (en) 2014-09-09 2023-04-11 Sonos, Inc. Audio processing algorithms
US11696081B2 (en) 2014-03-17 2023-07-04 Sonos, Inc. Audio settings based on environment
US11698770B2 (en) 2016-08-05 2023-07-11 Sonos, Inc. Calibration of a playback device based on an estimated frequency response
US11706579B2 (en) 2015-09-17 2023-07-18 Sonos, Inc. Validation of audio calibration using multi-dimensional motion check
US11728780B2 (en) 2019-08-12 2023-08-15 Sonos, Inc. Audio calibration of a portable playback device
US11736877B2 (en) 2016-04-01 2023-08-22 Sonos, Inc. Updating playback device configuration information based on calibration data
US11736878B2 (en) 2016-07-15 2023-08-22 Sonos, Inc. Spatial audio correction
US20230282197A1 (en) * 2022-03-07 2023-09-07 Mediatek Singapore Pte. Ltd. Heterogeneous Computing for Hybrid Acoustic Echo Cancellation
US11800306B2 (en) 2016-01-18 2023-10-24 Sonos, Inc. Calibration using multiple recording devices
US11803350B2 (en) 2015-09-17 2023-10-31 Sonos, Inc. Facilitating calibration of an audio playback device
US11877139B2 (en) 2018-08-28 2024-01-16 Sonos, Inc. Playback device calibration
US11889276B2 (en) 2016-04-12 2024-01-30 Sonos, Inc. Calibration of audio playback devices

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7398241B2 (en) * 2001-06-08 2008-07-08 Genworth Financial, Inc. Method and system for portable retirement investment
EP2171714B1 (en) * 2007-06-21 2012-08-15 Koninklijke Philips Electronics N.V. A device for and a method of processing audio signals
US8954324B2 (en) * 2007-09-28 2015-02-10 Qualcomm Incorporated Multiple microphone voice activity detector
US9247346B2 (en) * 2007-12-07 2016-01-26 Northern Illinois Research Foundation Apparatus, system and method for noise cancellation and communication for incubators and related devices
JP5375400B2 (en) * 2009-07-22 2013-12-25 ソニー株式会社 Audio processing apparatus, audio processing method and program
JP5489778B2 (en) * 2010-02-25 2014-05-14 キヤノン株式会社 Information processing apparatus and processing method thereof
US8583428B2 (en) * 2010-06-15 2013-11-12 Microsoft Corporation Sound source separation using spatial filtering and regularization phases
US8751220B2 (en) * 2011-11-07 2014-06-10 Broadcom Corporation Multiple microphone based low complexity pitch detector
US9065895B2 (en) * 2012-02-22 2015-06-23 Broadcom Corporation Non-linear echo cancellation
US9516418B2 (en) 2013-01-29 2016-12-06 2236008 Ontario Inc. Sound field spatial stabilizer
US9106196B2 (en) * 2013-06-20 2015-08-11 2236008 Ontario Inc. Sound field spatial stabilizer with echo spectral coherence compensation
US9099973B2 (en) 2013-06-20 2015-08-04 2236008 Ontario Inc. Sound field spatial stabilizer with structured noise compensation
US9271100B2 (en) 2013-06-20 2016-02-23 2236008 Ontario Inc. Sound field spatial stabilizer with spectral coherence compensation
TWI520127B (en) * 2013-08-28 2016-02-01 晨星半導體股份有限公司 Controller for audio device and associated operation method
KR20150103972A (en) 2014-03-04 2015-09-14 삼성전자주식회사 Method for controlling video function and call function and electronic device implementing the same
JP6349899B2 (en) * 2014-04-14 2018-07-04 ヤマハ株式会社 Sound emission and collection device
US9589556B2 (en) * 2014-06-19 2017-03-07 Yang Gao Energy adjustment of acoustic echo replica signal for speech enhancement
US9516159B2 (en) * 2014-11-04 2016-12-06 Apple Inc. System and method of double talk detection with acoustic echo and noise control
US9712866B2 (en) 2015-04-16 2017-07-18 Comigo Ltd. Cancelling TV audio disturbance by set-top boxes in conferences
US9554207B2 (en) 2015-04-30 2017-01-24 Shure Acquisition Holdings, Inc. Offset cartridge microphones
US9565493B2 (en) 2015-04-30 2017-02-07 Shure Acquisition Holdings, Inc. Array microphone system and method of assembling the same
BR112018007055A2 (en) * 2015-10-13 2018-10-23 Sony Corporation Information processor
US10367948B2 (en) 2017-01-13 2019-07-30 Shure Acquisition Holdings, Inc. Post-mixing acoustic echo cancellation systems and methods
US11056129B2 (en) * 2017-04-06 2021-07-06 Dean Robert Gary Anderson Adaptive parametrically formulated noise systems, devices, and methods
US10192567B1 (en) * 2017-10-18 2019-01-29 Motorola Mobility Llc Echo cancellation and suppression in electronic device
EP3804356A1 (en) 2018-06-01 2021-04-14 Shure Acquisition Holdings, Inc. Pattern-forming microphone array
US11297423B2 (en) 2018-06-15 2022-04-05 Shure Acquisition Holdings, Inc. Endfire linear array microphone
WO2020061353A1 (en) 2018-09-20 2020-03-26 Shure Acquisition Holdings, Inc. Adjustable lobe shape for array microphones
US11211061B2 (en) 2019-01-07 2021-12-28 2236008 Ontario Inc. Voice control in a multi-talker and multimedia environment
EP3942842A1 (en) 2019-03-21 2022-01-26 Shure Acquisition Holdings, Inc. Housings and associated design features for ceiling array microphones
US11558693B2 (en) 2019-03-21 2023-01-17 Shure Acquisition Holdings, Inc. Auto focus, auto focus within regions, and auto placement of beamformed microphone lobes with inhibition and voice activity detection functionality
CN113841421A (en) 2019-03-21 2021-12-24 舒尔获得控股公司 Auto-focus, in-region auto-focus, and auto-configuration of beamforming microphone lobes with suppression
WO2020237206A1 (en) 2019-05-23 2020-11-26 Shure Acquisition Holdings, Inc. Steerable speaker array, system, and method for the same
US11302347B2 (en) 2019-05-31 2022-04-12 Shure Acquisition Holdings, Inc. Low latency automixer integrated with voice and noise activity detection
EP4018680A1 (en) 2019-08-23 2022-06-29 Shure Acquisition Holdings, Inc. Two-dimensional microphone array with improved directivity
US10984815B1 (en) * 2019-09-27 2021-04-20 Cypress Semiconductor Corporation Techniques for removing non-linear echo in acoustic echo cancellers
US11823706B1 (en) * 2019-10-14 2023-11-21 Meta Platforms, Inc. Voice activity detection in audio signal
US11552611B2 (en) 2020-02-07 2023-01-10 Shure Acquisition Holdings, Inc. System and method for automatic adjustment of reference gain
CN111161748B (en) * 2020-02-20 2022-09-23 百度在线网络技术(北京)有限公司 Double-talk state detection method and device and electronic equipment
US11706562B2 (en) 2020-05-29 2023-07-18 Shure Acquisition Holdings, Inc. Transducer steering and configuration systems and methods using a local positioning system
EP4285605A1 (en) 2021-01-28 2023-12-06 Shure Acquisition Holdings, Inc. Hybrid audio beamforming system
US11503415B1 (en) * 2021-04-23 2022-11-15 Eargo, Inc. Detection of feedback path change

Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5539832A (en) * 1992-04-10 1996-07-23 Ramot University Authority For Applied Research & Industrial Development Ltd. Multi-channel signal separation using cross-polyspectra
US5825671A (en) * 1994-03-16 1998-10-20 U.S. Philips Corporation Signal-source characterization system
US20020114472A1 (en) * 2000-11-30 2002-08-22 Lee Soo Young Method for active noise cancellation using independent component analysis
US20020172374A1 (en) * 1999-11-29 2002-11-21 Bizjak Karl M. Noise extractor system and method
US6526148B1 (en) * 1999-05-18 2003-02-25 Siemens Corporate Research, Inc. Device and method for demixing signal mixtures using fast blind source separation technique based on delay and attenuation compensation, and for selecting channels for the demixed signals
US20030061185A1 (en) * 1999-10-14 2003-03-27 Te-Won Lee System and method of separating signals
US20030179888A1 (en) * 2002-03-05 2003-09-25 Burnett Gregory C. Voice activity detection (VAD) devices and methods for use with noise suppression systems
US6694020B1 (en) * 1999-09-14 2004-02-17 Agere Systems, Inc. Frequency domain stereophonic acoustic echo canceller utilizing non-linear transformations
US20050060142A1 (en) * 2003-09-12 2005-03-17 Erik Visser Separation of target acoustic signals in a multi-transducer arrangement
US20050105644A1 (en) * 2002-02-27 2005-05-19 Qinetiq Limited Blind signal separation
US6904146B2 (en) * 2002-05-03 2005-06-07 Acoustic Technology, Inc. Full duplex echo cancelling circuit
US20060013101A1 (en) * 2002-05-13 2006-01-19 Kazuhiro Kawana Audio apparatus and its reproduction program
US20060080089A1 (en) * 2004-10-08 2006-04-13 Matthias Vierthaler Circuit arrangement and method for audio signals containing speech
US20070021958A1 (en) * 2005-07-22 2007-01-25 Erik Visser Robust separation of speech signals in a noisy environment
US7359504B1 (en) * 2002-12-03 2008-04-15 Plantronics, Inc. Method and apparatus for reducing echo and noise
US7496482B2 (en) * 2003-09-02 2009-02-24 Nippon Telegraph And Telephone Corporation Signal separation method, signal separation device and recording medium
US7630502B2 (en) * 2003-09-16 2009-12-08 Mitel Networks Corporation Method for optimal microphone array design under uniform acoustic coupling constraints
US7653537B2 (en) * 2003-09-30 2010-01-26 Stmicroelectronics Asia Pacific Pte. Ltd. Method and system for detecting voice activity based on cross-correlation
US7817808B2 (en) * 2007-07-19 2010-10-19 Alon Konchitsky Dual adaptive structure for speech enhancement
US7970564B2 (en) * 2006-05-02 2011-06-28 Qualcomm Incorporated Enhancement techniques for blind source separation (BSS)

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR0161258B1 (en) 1988-03-11 1999-03-20 프레드릭 제이 비스코 Voice activity detection
JP2758846B2 (en) 1995-02-27 1998-05-28 埼玉日本電気株式会社 Noise canceller device
US5694474A (en) 1995-09-18 1997-12-02 Interval Research Corporation Adaptive filter for signal processing and method therefor
JP3505085B2 (en) 1998-04-14 2004-03-08 アルパイン株式会社 Audio equipment
WO2001095666A2 (en) 2000-06-05 2001-12-13 Nanyang Technological University Adaptive directional noise cancelling microphone system
JP3364487B2 (en) 2001-06-25 2003-01-08 隆義 山本 Speech separation method for composite speech data, speaker identification method, speech separation device for composite speech data, speaker identification device, computer program, and recording medium
US7082204B2 (en) 2002-07-15 2006-07-25 Sony Ericsson Mobile Communications Ab Electronic devices, methods of operating the same, and computer program products for detecting noise in a signal based on a combination of spatial correlation and time correlation
US7383178B2 (en) 2002-12-11 2008-06-03 Softmax, Inc. System and method for speech processing using independent component analysis under stability constraints
JP2004274683A (en) 2003-03-12 2004-09-30 Matsushita Electric Ind Co Ltd Echo canceler, echo canceling method, program, and recording medium
JP2005227512A (en) 2004-02-12 2005-08-25 Yamaha Motor Co Ltd Sound signal processing method and its apparatus, voice recognition device, and program
WO2006131959A1 (en) 2005-06-06 2006-12-14 Saga University Signal separating apparatus
JP4556875B2 (en) 2006-01-18 2010-10-06 ソニー株式会社 Audio signal separation apparatus and method

Patent Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5539832A (en) * 1992-04-10 1996-07-23 Ramot University Authority For Applied Research & Industrial Development Ltd. Multi-channel signal separation using cross-polyspectra
US5825671A (en) * 1994-03-16 1998-10-20 U.S. Philips Corporation Signal-source characterization system
US6526148B1 (en) * 1999-05-18 2003-02-25 Siemens Corporate Research, Inc. Device and method for demixing signal mixtures using fast blind source separation technique based on delay and attenuation compensation, and for selecting channels for the demixed signals
US6694020B1 (en) * 1999-09-14 2004-02-17 Agere Systems, Inc. Frequency domain stereophonic acoustic echo canceller utilizing non-linear transformations
US20030061185A1 (en) * 1999-10-14 2003-03-27 Te-Won Lee System and method of separating signals
US20020172374A1 (en) * 1999-11-29 2002-11-21 Bizjak Karl M. Noise extractor system and method
US20020114472A1 (en) * 2000-11-30 2002-08-22 Lee Soo Young Method for active noise cancellation using independent component analysis
US20050105644A1 (en) * 2002-02-27 2005-05-19 Qinetiq Limited Blind signal separation
US20030179888A1 (en) * 2002-03-05 2003-09-25 Burnett Gregory C. Voice activity detection (VAD) devices and methods for use with noise suppression systems
US6904146B2 (en) * 2002-05-03 2005-06-07 Acoustic Technology, Inc. Full duplex echo cancelling circuit
US20060013101A1 (en) * 2002-05-13 2006-01-19 Kazuhiro Kawana Audio apparatus and its reproduction program
US7359504B1 (en) * 2002-12-03 2008-04-15 Plantronics, Inc. Method and apparatus for reducing echo and noise
US7496482B2 (en) * 2003-09-02 2009-02-24 Nippon Telegraph And Telephone Corporation Signal separation method, signal separation device and recording medium
US20050060142A1 (en) * 2003-09-12 2005-03-17 Erik Visser Separation of target acoustic signals in a multi-transducer arrangement
US7630502B2 (en) * 2003-09-16 2009-12-08 Mitel Networks Corporation Method for optimal microphone array design under uniform acoustic coupling constraints
US7653537B2 (en) * 2003-09-30 2010-01-26 Stmicroelectronics Asia Pacific Pte. Ltd. Method and system for detecting voice activity based on cross-correlation
US20060080089A1 (en) * 2004-10-08 2006-04-13 Matthias Vierthaler Circuit arrangement and method for audio signals containing speech
US20070021958A1 (en) * 2005-07-22 2007-01-25 Erik Visser Robust separation of speech signals in a noisy environment
US7464029B2 (en) * 2005-07-22 2008-12-09 Qualcomm Incorporated Robust separation of speech signals in a noisy environment
US7970564B2 (en) * 2006-05-02 2011-06-28 Qualcomm Incorporated Enhancement techniques for blind source separation (BSS)
US7817808B2 (en) * 2007-07-19 2010-10-19 Alon Konchitsky Dual adaptive structure for speech enhancement

Cited By (118)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160066087A1 (en) * 2006-01-30 2016-03-03 Ludger Solbach Joint noise suppression and acoustic echo cancellation
US9185487B2 (en) 2006-01-30 2015-11-10 Audience, Inc. System and method for providing noise suppression utilizing null processing noise subtraction
US9830899B1 (en) * 2006-05-25 2017-11-28 Knowles Electronics, Llc Adaptive noise cancellation
US20080240465A1 (en) * 2007-03-27 2008-10-02 Sony Corporation Sound reproducing device and sound reproduction method
US8265297B2 (en) * 2007-03-27 2012-09-11 Sony Corporation Sound reproducing device and sound reproduction method for echo cancelling and noise reduction
US20120163612A1 (en) * 2008-01-21 2012-06-28 Skype Limited Communication System
US9172817B2 (en) * 2008-01-21 2015-10-27 Skype Communication system
US8554551B2 (en) 2008-01-28 2013-10-08 Qualcomm Incorporated Systems, methods, and apparatus for context replacement by audio level
US20090192791A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods and apparatus for context descriptor transmission
US8600740B2 (en) 2008-01-28 2013-12-03 Qualcomm Incorporated Systems, methods and apparatus for context descriptor transmission
US20090190780A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods, and apparatus for context processing using multiple microphones
US8560307B2 (en) 2008-01-28 2013-10-15 Qualcomm Incorporated Systems, methods, and apparatus for context suppression using receivers
US8554550B2 (en) 2008-01-28 2013-10-08 Qualcomm Incorporated Systems, methods, and apparatus for context processing using multi resolution analysis
US20090192790A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods, and apparatus for context suppression using receivers
US8483854B2 (en) 2008-01-28 2013-07-09 Qualcomm Incorporated Systems, methods, and apparatus for context processing using multiple microphones
US20090192803A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods, and apparatus for context replacement by audio level
US20090192802A1 (en) * 2008-01-28 2009-07-30 Qualcomm Incorporated Systems, methods, and apparatus for context processing using multi resolution analysis
US9113240B2 (en) 2008-03-18 2015-08-18 Qualcomm Incorporated Speech enhancement using multiple microphones on multiple devices
US20090238377A1 (en) * 2008-03-18 2009-09-24 Qualcomm Incorporated Speech enhancement using multiple microphones on multiple devices
US20090323924A1 (en) * 2008-06-25 2009-12-31 Microsoft Corporation Acoustic echo suppression
US8325909B2 (en) * 2008-06-25 2012-12-04 Microsoft Corporation Acoustic echo suppression
US8229126B2 (en) * 2009-03-13 2012-07-24 Harris Corporation Noise error amplitude reduction
US20100232616A1 (en) * 2009-03-13 2010-09-16 Harris Corporation Noise error amplitude reduction
US20120095755A1 (en) * 2009-06-19 2012-04-19 Fujitsu Limited Audio signal processing system and audio signal processing method
US8676571B2 (en) * 2009-06-19 2014-03-18 Fujitsu Limited Audio signal processing system and audio signal processing method
CN102549659A (en) * 2009-10-01 2012-07-04 高通股份有限公司 Suppressing noise in an audio signal
US8571231B2 (en) * 2009-10-01 2013-10-29 Qualcomm Incorporated Suppressing noise in an audio signal
US20110081026A1 (en) * 2009-10-01 2011-04-07 Qualcomm Incorporated Suppressing noise in an audio signal
US9838784B2 (en) 2009-12-02 2017-12-05 Knowles Electronics, Llc Directional audio capture
TWI423688B (en) * 2010-04-14 2014-01-11 Alcor Micro Corp Voice sensor with electromagnetic wave receiver
US9699554B1 (en) 2010-04-21 2017-07-04 Knowles Electronics, Llc Adaptive signal equalization
US9558755B1 (en) * 2010-05-20 2017-01-31 Knowles Electronics, Llc Noise suppression assisted automatic speech recognition
US20120045074A1 (en) * 2010-08-17 2012-02-23 C-Media Electronics Inc. System, method and apparatus with environmental noise cancellation
US8737640B2 (en) * 2010-08-17 2014-05-27 C-Media Electronics Inc. System, method and apparatus with environmental noise cancellation
TWI458361B (en) * 2010-09-14 2014-10-21 C Media Electronics Inc System, method and apparatus with environment noise cancellation
US20120128168A1 (en) * 2010-11-18 2012-05-24 Texas Instruments Incorporated Method and apparatus for noise and echo cancellation for two microphone system subject to cross-talk
US20120232890A1 (en) * 2011-03-11 2012-09-13 Kabushiki Kaisha Toshiba Apparatus and method for discriminating speech, and computer readable medium
US9330682B2 (en) * 2011-03-11 2016-05-03 Kabushiki Kaisha Toshiba Apparatus and method for discriminating speech, and computer readable medium
US8964998B1 (en) * 2011-06-07 2015-02-24 Sound Enhancement Technology, Llc System for dynamic spectral correction of audio signals to compensate for ambient noise in the listener's environment
WO2013081670A1 (en) * 2011-11-30 2013-06-06 Sound Enhancement Technology, Llc System for dynamic spectral correction of audio signals to compensate for ambient noise
US20130136266A1 (en) * 2011-11-30 2013-05-30 David McClain System for Dynamic Spectral Correction of Audio Signals to Compensate for Ambient Noise
US8913754B2 (en) * 2011-11-30 2014-12-16 Sound Enhancement Technology, Llc System for dynamic spectral correction of audio signals to compensate for ambient noise
US9648421B2 (en) 2011-12-14 2017-05-09 Harris Corporation Systems and methods for matching gain levels of transducers
US11889290B2 (en) 2011-12-29 2024-01-30 Sonos, Inc. Media playback based on sensor data
US11849299B2 (en) 2011-12-29 2023-12-19 Sonos, Inc. Media playback based on sensor data
US11528578B2 (en) 2011-12-29 2022-12-13 Sonos, Inc. Media playback based on sensor data
US11825290B2 (en) 2011-12-29 2023-11-21 Sonos, Inc. Media playback based on sensor data
US11910181B2 (en) 2011-12-29 2024-02-20 Sonos, Inc. Media playback based on sensor data
US11825289B2 (en) 2011-12-29 2023-11-21 Sonos, Inc. Media playback based on sensor data
US20130294611A1 (en) * 2012-05-04 2013-11-07 Sony Computer Entertainment Inc. Source separation by independent component analysis in conjuction with optimization of acoustic echo cancellation
CN103426436A (en) * 2012-05-04 2013-12-04 索尼电脑娱乐公司 Source separation by independent component analysis in conjuction with optimization of acoustic echo cancellation
US8886526B2 (en) 2012-05-04 2014-11-11 Sony Computer Entertainment Inc. Source separation using independent component analysis with mixed multi-variate probability density function
US8880395B2 (en) 2012-05-04 2014-11-04 Sony Computer Entertainment Inc. Source separation by independent component analysis in conjunction with source direction information
US9099096B2 (en) 2012-05-04 2015-08-04 Sony Computer Entertainment Inc. Source separation by independent component analysis with moving constraint
US9768829B2 (en) * 2012-05-11 2017-09-19 Intel Deutschland Gmbh Methods for processing audio signals and circuit arrangements therefor
US20130301840A1 (en) * 2012-05-11 2013-11-14 Christelle Yemdji Methods for processing audio signals and circuit arrangements therefor
US11516608B2 (en) 2012-06-28 2022-11-29 Sonos, Inc. Calibration state variable
US11800305B2 (en) 2012-06-28 2023-10-24 Sonos, Inc. Calibration interface
US11516606B2 (en) 2012-06-28 2022-11-29 Sonos, Inc. Calibration interface
US11368803B2 (en) * 2012-06-28 2022-06-21 Sonos, Inc. Calibration of playback device(s)
US9461702B2 (en) * 2012-09-06 2016-10-04 Imagination Technologies Limited Systems and methods of echo and noise cancellation in voice communication
US9640194B1 (en) 2012-10-04 2017-05-02 Knowles Electronics, Llc Noise suppression for speech processing based on machine-learning mask estimation
US20140119552A1 (en) * 2012-10-26 2014-05-01 Broadcom Corporation Loudspeaker localization with a microphone array
US9609141B2 (en) * 2012-10-26 2017-03-28 Avago Technologies General Ip (Singapore) Pte. Ltd. Loudspeaker localization with a microphone array
US9424859B2 (en) 2012-11-21 2016-08-23 Harman International Industries Canada Ltd. System to control audio effect parameters of vocal signals
US9595997B1 (en) * 2013-01-02 2017-03-14 Amazon Technologies, Inc. Adaption-based reduction of echo and noise
US10306389B2 (en) 2013-03-13 2019-05-28 Kopin Corporation Head wearable acoustic system with noise canceling microphone geometry apparatuses and methods
US10339952B2 (en) 2013-03-13 2019-07-02 Kopin Corporation Apparatuses and systems for acoustic channel auto-balancing during multi-channel signal extraction
US9251806B2 (en) * 2013-09-05 2016-02-02 Intel Corporation Mobile phone with variable energy consuming speech recognition module
US20150065199A1 (en) * 2013-09-05 2015-03-05 Saurin Shah Mobile phone with variable energy consuming speech recognition module
US10250740B2 (en) * 2013-12-23 2019-04-02 Imagination Technologies Limited Echo path change detector
US20170208170A1 (en) * 2013-12-23 2017-07-20 Imagination Technologies Limited Echo Path Change Detector
US11696081B2 (en) 2014-03-17 2023-07-04 Sonos, Inc. Audio settings based on environment
EP2966646A1 (en) * 2014-07-09 2016-01-13 2236008 Ontario Inc. System and method for acoustic management
US9978355B2 (en) * 2014-07-09 2018-05-22 2236008 Ontario Inc. System and method for acoustic management
US9767784B2 (en) 2014-07-09 2017-09-19 2236008 Ontario Inc. System and method for acoustic management
US9799330B2 (en) 2014-08-28 2017-10-24 Knowles Electronics, Llc Multi-sourced noise suppression
US11625219B2 (en) 2014-09-09 2023-04-11 Sonos, Inc. Audio processing algorithms
US9978388B2 (en) 2014-09-12 2018-05-22 Knowles Electronics, Llc Systems and methods for restoration of speech components
US10244121B2 (en) * 2014-10-31 2019-03-26 Imagination Technologies Limited Automatic tuning of a gain controller
US20160127561A1 (en) * 2014-10-31 2016-05-05 Imagination Technologies Limited Automatic Tuning of a Gain Controller
US9668048B2 (en) 2015-01-30 2017-05-30 Knowles Electronics, Llc Contextual switching of microphones
WO2017000772A1 (en) * 2015-06-30 2017-01-05 芋头科技(杭州)有限公司 Front-end audio processing system
US11803350B2 (en) 2015-09-17 2023-10-31 Sonos, Inc. Facilitating calibration of an audio playback device
US11706579B2 (en) 2015-09-17 2023-07-18 Sonos, Inc. Validation of audio calibration using multi-dimensional motion check
WO2017045512A1 (en) * 2015-09-18 2017-03-23 深圳前海达闼科技有限公司 Voice recognition method and apparatus, terminal, and voice recognition device
CN105261363A (en) * 2015-09-18 2016-01-20 深圳前海达闼科技有限公司 Voice recognition method, device and terminal
JP2017069745A (en) * 2015-09-30 2017-04-06 沖電気工業株式会社 Sound source separation and echo suppression device, sound source separation and echo suppression program, and sound source separation and echo suppression method
US20170110142A1 (en) * 2015-10-18 2017-04-20 Kopin Corporation Apparatuses and methods for enhanced speech recognition in variable environments
US11631421B2 (en) * 2015-10-18 2023-04-18 Solos Technology Limited Apparatuses and methods for enhanced speech recognition in variable environments
US9928848B2 (en) * 2015-12-24 2018-03-27 Intel Corporation Audio signal noise reduction in noisy environments
WO2017112343A1 (en) * 2015-12-24 2017-06-29 Intel Corporation Audio signal processing in noisy environments
US20170186442A1 (en) * 2015-12-24 2017-06-29 Intel Corporation Audio signal processing in noisy environments
US11800306B2 (en) 2016-01-18 2023-10-24 Sonos, Inc. Calibration using multiple recording devices
US11516612B2 (en) 2016-01-25 2022-11-29 Sonos, Inc. Calibration based on audio content
US11736877B2 (en) 2016-04-01 2023-08-22 Sonos, Inc. Updating playback device configuration information based on calibration data
US11889276B2 (en) 2016-04-12 2024-01-30 Sonos, Inc. Calibration of audio playback devices
CN109313910A (en) * 2016-05-19 2019-02-05 微软技术许可有限责任公司 Permutation invariant training for talker-independent multi-talker speech separation
CN107483761A (en) * 2016-06-07 2017-12-15 电信科学技术研究院 Echo suppression method and device
US11736878B2 (en) 2016-07-15 2023-08-22 Sonos, Inc. Spatial audio correction
US11531514B2 (en) 2016-07-22 2022-12-20 Sonos, Inc. Calibration assistance
US11698770B2 (en) 2016-08-05 2023-07-11 Sonos, Inc. Calibration of a playback device based on an estimated frequency response
CN108275159A (en) * 2017-01-04 2018-07-13 2236008安大略有限公司 Speech interfaces and vocal music entertainment systems
US10535360B1 (en) * 2017-05-25 2020-01-14 Tp Lab, Inc. Phone stand using a plurality of directional speakers
US11355135B1 (en) * 2017-05-25 2022-06-07 Tp Lab, Inc. Phone stand using a plurality of microphones
US10468020B2 (en) * 2017-06-06 2019-11-05 Cypress Semiconductor Corporation Systems and methods for removing interference for audio pattern recognition
WO2018229464A1 (en) * 2017-06-13 2018-12-20 Sandeep Kumar Chintala Noise cancellation in voice communication systems
US11081125B2 (en) 2017-06-13 2021-08-03 Sandeep Kumar Chintala Noise cancellation in voice communication systems
WO2019172865A1 (en) * 2018-03-05 2019-09-12 Hewlett-Packard Development Company, L.P. Use mode-based microphone processing application modifications
US10819858B2 (en) * 2018-08-12 2020-10-27 AAC Technologies Pte. Ltd. Method for improving echo cancellation effect and system thereof
US11877139B2 (en) 2018-08-28 2024-01-16 Sonos, Inc. Playback device calibration
US10999444B2 (en) * 2018-12-12 2021-05-04 Panasonic Intellectual Property Corporation Of America Acoustic echo cancellation device, acoustic echo cancellation method and non-transitory computer readable recording medium recording acoustic echo cancellation program
US11728780B2 (en) 2019-08-12 2023-08-15 Sonos, Inc. Audio calibration of a portable playback device
CN112002339A (en) * 2020-07-22 2020-11-27 海尔优家智能科技(北京)有限公司 Voice noise reduction method and device, computer-readable storage medium and electronic device
CN111755020A (en) * 2020-08-07 2020-10-09 南京时保联信息科技有限公司 Stereo echo cancellation method
CN113077808A (en) * 2021-03-22 2021-07-06 北京搜狗科技发展有限公司 Voice processing method and device for voice processing
CN114222225A (en) * 2022-02-22 2022-03-22 深圳市技湛科技有限公司 Howling suppression method and device for sound amplification equipment, sound amplification equipment and storage medium
US20230282197A1 (en) * 2022-03-07 2023-09-07 Mediatek Singapore Pte. Ltd. Heterogeneous Computing for Hybrid Acoustic Echo Cancellation

Also Published As

Publication number Publication date
US8175871B2 (en) 2012-05-08

Similar Documents

Publication Publication Date Title
US8175871B2 (en) Apparatus and method of noise and echo reduction in multiple microphone audio systems
US10964314B2 (en) System and method for optimized noise reduction in the presence of speech distortion using adaptive microphone array
US9589556B2 (en) Energy adjustment of acoustic echo replica signal for speech enhancement
US9520139B2 (en) Post tone suppression for speech enhancement
RU2483439C2 (en) Robust two microphone noise suppression system
US7092529B2 (en) Adaptive control system for noise cancellation
EP2045928B1 (en) Multi-channel echo cancellation with round robin regularization
US9280965B2 (en) Method for determining a noise reference signal for noise compensation and/or noise reduction
EP1848243B1 (en) Multi-channel echo compensation system and method
KR101520123B1 (en) Acoustic echo cancellation based on noise environment
US7003099B1 (en) Small array microphone for acoustic echo cancellation and noise suppression
US9613634B2 (en) Control of acoustic echo canceller adaptive filter for speech enhancement
US8306215B2 (en) Echo canceller for eliminating echo without being affected by noise
KR101532531B1 (en) Echo cancellation device and method for small-scale hands-free voice communication system
KR20100113146A (en) Enhanced blind source separation algorithm for highly correlated mixtures
US8468018B2 (en) Apparatus and method for canceling noise of voice signal in electronic apparatus
US9699554B1 (en) Adaptive signal equalization
US8462962B2 (en) Sound processor, sound processing method and recording medium storing sound processing program
WO2009130513A1 (en) Two microphone noise reduction system
WO2011129725A1 (en) Method and arrangement for noise cancellation in a speech encoder
KR20110038024A (en) System and method for providing noise suppression utilizing null processing noise subtraction
US9508359B2 (en) Acoustic echo preprocessing for speech enhancement
CN109273019B (en) Method for double-talk detection for echo suppression and echo suppression
WO2013096159A2 (en) Apparatus and method for noise removal
US9729967B2 (en) Feedback canceling system and method

Legal Events

Date Code Title Description
AS Assignment

Owner name: QUALCOMM INCORPORATED, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WANG, SONG;GUPTA, SAMIR KUMAR;CHOY, EDDIE L. T.;REEL/FRAME:020030/0584;SIGNING DATES FROM 20071017 TO 20071025

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 12