WO2000063887A1 - Noise suppression using external voice activity detection - Google Patents

Publication number
WO2000063887A1
WO2000063887A1 PCT/US2000/007090 US0007090W WO0063887A1 WO 2000063887 A1 WO2000063887 A1 WO 2000063887A1 US 0007090 W US0007090 W US 0007090W WO 0063887 A1 WO0063887 A1 WO 0063887A1
Authority
WO
WIPO (PCT)
Prior art keywords
estimate
noise
voice activity
noise floor
voice
Prior art date
Application number
PCT/US2000/007090
Other languages
French (fr)
Inventor
James Brian Piket
Christopher Wayne Springfield
Ernest Pei-Ching Chen
Original Assignee
Motorola Inc.
Priority date
Filing date
Publication date
Application filed by Motorola Inc.
Priority to AU38937/00A (AU3893700A)
Priority to EP00918063A (EP1086453B1)
Priority to JP2000612931A (JP2002542692A)
Priority to DE60020317T (DE60020317T2)
Publication of WO2000063887A1
Priority to HK01107509A (HK1041739A1)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Abstract

A communications transmitter which operates as a mobile telephone incorporates a noise suppressor (100) which reduces the background noise in the transmitted voice signal. An external voice activity detector (150), which operates in conjunction with the noise suppressor (100), estimates the signal power of the incoming voice signal and compares this to an estimated noise floor. As a result of this comparison, a voice activity factor is applied to an updated noise floor estimate to create a voice activity threshold estimate. The voice activity threshold estimate is then used to decide whether or not to force the noise suppressor (100) to perform an update of a noise content estimate of the incoming voice signal.

Description

NOISE SUPPRESSION USING EXTERNAL VOICE ACTIVITY DETECTION
Field of the Invention
The invention relates to communication systems and, more particularly, to noise suppression of transmitted voice signals.
Background of the Invention
In a communications system, a transmitting station may employ a noise suppression mechanism in order to reduce the noise content of a transmitted voice signal. This can be particularly useful when the transmitting station is a mobile handset or hands-free telephone operating in the presence of background noise. In these environments, a sudden increase in background noise can cause a far-end listener to hear an undesirable level of noise. This problem is particularly apparent when the transmitter station is operating as a mobile station and the transmitter station includes noise suppression technology. While current noise suppression techniques are effective in reducing background noise in a static or slowly changing noise environment, noise suppression performance can be significantly degraded when the transmitting station is operated in the presence of a rapidly changing noise environment.
In mobile environments, large changes in background noise can be brought about when the user of the mobile transmitter activates a fan, lowers a window while the mobile station is in motion, or is otherwise subjected to significant and sudden changes in the background noise within the mobile station. The background noise within the mobile unit can also be affected by numerous other changes within the mobile station.
In typical mobile transmitters which use voice activity detection internal to a noise suppression algorithm, an increase in background noise can be interpreted by the noise suppression algorithm as a voice signal from the user of the mobile transmitter. This condition is brought about by the inter-dependency between the voice activity detection and the noise floor estimate computed by the noise suppression algorithm. One noise suppression technique, a stationary spectral check, has been used with some success in order to mitigate the effects of sudden increases in background noise. However, in practice, this solution has been shown to be inadequate in many cases due to the time required for the noise suppression algorithm to reduce the background noise to an acceptable level. In some cases, this time period can be 10-20 seconds in duration. In other cases, the system can experience a locked fault condition in which noise floor updates cease to occur. This results in the transmitter being placed in a condition where the listener is subjected to an unacceptable amount of noise for an extended period of time.
Therefore, it is highly desirable for the noise suppression method and system to adapt to sudden increases in background noise through the use of a voice activity detector with reduced inter-dependency between voice activity detection and noise floor estimates. Such a system would provide a capability for lower noise transmissions while a mobile station is operating in the presence of widely varying background noise.
Brief Description of the Drawings
The invention is pointed out with particularity in the appended claims. However, a more complete understanding of the present invention may be derived by referring to the detailed description and claims when considered in connection with the figures, wherein like reference numbers refer to similar items throughout the figures, and:
FIG. 1 is a block diagram of a transmitter which employs voice activity detection using an external voice activity detector in accordance with a preferred embodiment of the invention;
FIG. 2 is a flowchart of a method for noise suppression using an external voice activity detector in accordance with a preferred embodiment of the invention; and
FIG. 3 is a flowchart of a method used by an external voice activity detector to control the updating of a noise content estimate performed by a noise suppression algorithm in accordance with a preferred embodiment of the invention.
Description of the Preferred Embodiments
A method and system for improved noise suppression using an external voice activity detector provides a capability to conduct voice communications in the presence of widely varying background noise. The method and system correct a shortcoming in many noise suppression techniques by providing faster noise updates which minimizes the noise heard by the listening station. Additionally, the locked fault condition where noise updates cease to occur is avoided. These result in a hands-free communications system which does not subject a far-end listener to a noise burst when an increase in background noise occurs.
FIG. 1 is a block diagram of a transmitter which employs voice activity detection using an external voice activity detector in accordance with a preferred embodiment of the invention. In FIG. 1, microphone 50 receives acoustic energy and converts this energy to an electrical signal. Microphone 50 can be any type of microphone or other transducer which converts mechanical or acoustic vibrations into electrical signals. Microphone 50 is coupled to analog to digital converter 75 which converts the incoming analog electrical signal to a digital representation. Analog to digital converter 75 can be any general purpose type of converter which preferably possesses sufficient sampling rate and dynamic range in order to produce accurate digital representations of the incoming analog voice signals from microphone 50.
The output of analog to digital converter 75 is input to noise suppressor 100 which includes preprocessor 110, voice activity detector 120, noise content estimator 130, and channel gain calculation element 140. An output of analog to digital converter 75 is additionally coupled to external voice activity detector 150. In a preferred embodiment, noise suppressor 100 is illustrative of a variety of noise suppressors suitable for use in conjunction with the present invention. Additionally, the functions of noise suppressor 100 may be performed entirely as one or more software processing elements, or may be performed in hardware where individual functions are performed by discrete and dedicated processing elements. In FIG. 1 , preprocessor 110 receives the digital representations of voice signals from analog to digital converter 75. In a preferred embodiment, preprocessor 110 performs any required spectral conditioning functions in which certain spectral bands, preferably those which contain primarily voice, are emphasized, while other spectral bands, such as those which contain primarily noise, are de-emphasized. Additionally, preprocessor 110 may also perform conversion from a time domain signal to a frequency domain signal in order to allow the remaining portions of noise suppressor 100 to perform additional manipulations on the digital representations of the voice signals.
The output of preprocessor 110 is coupled to voice activity detector 120, and noise content estimator 130. In a preferred embodiment, voice activity detector 120 performs voice detection based on the noise floor and channel energy statistics of the digital representations of the voice signals from preprocessor 110. Noise content estimator 130 measures the background noise present in the digital representations of the voice signals from preprocessor 110.
The outputs of voice activity detector 120 and noise content estimator 130 are then coupled to channel gain calculation element 140. In a preferred embodiment, channel gain calculation element 140 segments the digital representations of the voice signals into a group of frequency bins. By way of the segmentation of voice signals into frequency bins, channel and gain calculations can be performed on specific frequency bands which primarily contain voice information. Additionally, those frequency bands which primarily contain noise information can be attenuated.
As shown in FIG. 1, noise content estimator 130 and voice activity detector 120 are coupled in order to perform a voice activity decision which is based on the noise content of the digital representations of the voice signal from preprocessor 110. Thus, voice activity detector 120 determines voice activity by way of receiving an input from noise content estimator 130.
In FIG. 1, external voice activity detector 150 performs a separate voice activity determination in order to assist noise content estimator 130 in determining the noise content of the digital representation of the voice signals from preprocessor 110. In a preferred embodiment, external voice activity detector 150 determines voice activity without an input from noise content estimator 130. Importantly, the external noise floor estimate is not tied to voice activity detection decisions. Through removing the dependency of noise floor determination on voice activity detection decisions, a more reliable voice activity detection mechanism can be provided for use in environments where background noise changes rapidly.
External voice activity detector 150 accepts inputs of digital representations of voice signals from analog to digital converter 75. These inputs are coupled to signal power estimator 154 and noise floor estimator 156. Signal power estimator 154 performs computations in order to determine the signal power present in the input signal. Noise floor estimator 156 performs calculations on the input signal in order to ascertain the noise floor of the signal input.
Outputs from signal power estimator 154 and noise floor estimator 156 are coupled to voice activity processor 158, which compares the levels of signal power and noise floor in order to determine whether an update of noise content estimator 130 should be performed. The method used by signal power estimator 154, noise floor estimator 156, and voice activity processor 158 is discussed further in reference to FIG. 3. The output of voice activity processor 158 is coupled to noise suppressor 100. In a preferred embodiment, this output consists of an indicator which can force noise content estimator 130 to perform a noise estimate of the digital representations of the voice signal from preprocessor 110.
FIG. 2 is a flow chart of a method performed by an external voice activity detector in accordance with a preferred embodiment of the invention. External voice activity detector 150 of FIG. 1 is suitable for performing the method. The method of FIG. 2 begins with the voice activity detector computing a background noise floor estimate. By way of example, and not by way of limitation, this estimate is based upon a slow-rise/fast-fall technique designed to track changes in the noise floor of a particular signal. Preferably, the technique does not require an assumption as to whether the incoming digital representation of a voice signal is voice or noise. As each sample, denoted by y(n), is processed, an estimate of the current signal power is desirably updated in step 220 by way of an integration function such as the leaky integrator shown in the equation below.
$P_y(n) = (1 - \lambda)\,y^2(n) + \lambda\,P_y(n-1)$, where $\lambda \approx 0.9875$
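The leaky-integrator update of step 220 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names, the zero initial power, and the coefficient name `lam` (a name chosen for this sketch) are assumptions; only the value 0.9875 and the form of the recursion come from the text.

```python
def update_signal_power(prev_power, sample, lam=0.9875):
    """One leaky-integrator step: P(n) = (1 - lam) * y(n)**2 + lam * P(n-1)."""
    return (1.0 - lam) * sample * sample + lam * prev_power


def signal_power_track(samples, lam=0.9875, initial_power=0.0):
    """Return the running power estimate after each input sample.

    initial_power = 0.0 is an assumption; the text does not say how the
    estimate is initialized.
    """
    power = initial_power
    track = []
    for y in samples:
        power = update_signal_power(power, y, lam)
        track.append(power)
    return track
```

With a coefficient this close to 1, the estimate weights roughly the last 1 / (1 - lam) = 80 samples, so it smooths individual samples while still following changes in level fairly quickly.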
In step 230, the current signal power estimate is compared to the noise floor estimate. If the signal power estimate is less than the noise floor estimate, which can indicate a decrease in the noise level of the incoming voice signal, the updated noise floor is set equal to the signal power estimate in step 245. This produces the desired "fast fall" in the noise floor. If the signal power estimate exceeds the noise floor estimate, suggesting an increase in noise level, a slope factor, β, is applied to the noise floor estimate (in step 240) to cause a slow-rise ramping of the current noise floor estimate at a rate of β decibels per second. The algorithm for steps 230, 240 and 245 can be expressed as:
If $P_y(n) < NF_y(n-1)$ then
    $NF_y(n) = P_y(n)$
else
    $NF_y(n) = \beta \cdot NF_y(n-1)$, where $\beta \approx 2$ to 8 dB per second
endif.
In step 250, a voice activity factor, α, is applied to the updated noise floor estimate to create a voice activity threshold estimate, $\alpha \cdot NF_y(n)$. The method then continues in step 260 where the signal power estimate is compared with the voice activity threshold estimate from step 250. Step 260 is the primary decision as to whether or not to force the noise suppression technique to update the noise content estimate of the digital representations of the voice signal, although a typical implementation would preferably also employ well-known techniques such as hangover periods and hysteresis.
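The slow-rise/fast-fall update of steps 230 through 245 and the threshold of step 250 can be sketched in Python as below. Several details are assumptions made for illustration: the estimates are treated as power values (so decibels are 10·log10), the rate in dB per second is converted to a per-sample multiplier using a caller-supplied sample rate, and `min_floor` is a hypothetical guard that keeps a floor of exactly zero from staying at zero. The text gives no value for the voice activity factor, so `alpha` is left to the caller.

```python
import math


def rise_factor_per_sample(rise_db_per_s, sample_rate_hz):
    """Per-sample multiplier giving a power rise of rise_db_per_s dB each second."""
    return 10.0 ** (rise_db_per_s / (10.0 * sample_rate_hz))


def power_to_db(power):
    """Express a positive power value in decibels."""
    return 10.0 * math.log10(power)


def update_noise_floor(prev_floor, power, rise_factor, min_floor=1e-12):
    """Steps 230-245: fast fall to the power estimate, otherwise slow rise."""
    if power < prev_floor:
        return power  # step 245: fast fall
    # Step 240: slow rise. min_floor is an assumption, not part of the text.
    return max(prev_floor, min_floor) * rise_factor


def voice_activity_threshold(noise_floor, alpha):
    """Step 250: scale the updated noise floor by the voice activity factor."""
    return alpha * noise_floor
```

For example, with a rise rate of 4 dB per second at an assumed 8000 samples per second, the floor climbs 4 dB over 8000 consecutive samples in which the power estimate stays above it, while a single sample with a lower power estimate pulls the floor down at once.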
If the signal power estimate exceeds the voice activity threshold estimate, then the external voice activity detector allows the noise suppression technique to update the noise content estimate, as in step 270. In the event that the signal power estimate does not exceed the voice activity threshold estimate, step 262 is executed in which a determination is made as to whether an upper limit of a silence counter has been reached. If the upper limit of the silence counter has not been reached, step 263 is executed in which the counter is incremented, and the method returns to step 260. The purpose and preferred numerical values of the silence counter are described with reference to FIG. 3.
If the decision of step 262 indicates that the upper limit of the silence counter has been reached, step 265 is executed in which the external voice activity detector forces the noise suppression technique to update the noise content estimate. Step 280 is then executed where the silence counter is reset. After executing steps 265 through 280, the method returns to step 210, where the next frame of digital representations of voice signals is evaluated. The algorithm for steps 250 through 280 can be expressed as:
If $P_y(n) > \alpha \cdot NF_y(n)$ then
    do not force update
else
    force update, increment silence counter, and check threshold
endif.
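One possible reading of steps 260 through 280 is sketched below in Python. The class and method names are hypothetical, the default limit of 20 is taken from the FIG. 3 discussion, and resetting the counter when voice is detected follows step 315 of FIG. 3 rather than anything stated for FIG. 2, so it is exposed as an option. The value of `alpha` is not given in the text and must be supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class ForcedUpdateController:
    alpha: float                  # voice activity factor (value not given in the text)
    silence_limit: int = 20       # upper limit quoted in the FIG. 3 discussion
    reset_on_voice: bool = True   # assumption borrowed from step 315 of FIG. 3
    silence_count: int = 0

    def step(self, power: float, noise_floor: float) -> bool:
        """Return True when the noise suppressor should be forced to update."""
        if power > self.alpha * noise_floor:
            # Step 270: likely voice; leave the update decision to the suppressor.
            if self.reset_on_voice:
                self.silence_count = 0
            return False
        if self.silence_count >= self.silence_limit:
            # Steps 265 and 280: force the update, then reset the counter.
            self.silence_count = 0
            return True
        # Step 263: limit not yet reached, so keep counting.
        self.silence_count += 1
        return False
```

In this sketch, 20 consecutive calls below the threshold only advance the counter, and the next such call forces the update and starts the count again.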
FIG. 3 is a flow chart of a method used by an external voice activity detector to control the updating of a noise content estimate performed by a noise suppression algorithm in accordance with a preferred embodiment of the invention. The method begins in step 310 where an external voice activity detector, such as external voice activity detector 150 of FIG. 1 , determines if voice activity is present. Step 310 represents the outcome of voice activity detection, such as that described in reference to FIG. 2, in which a noise content estimate is forced if the appropriate conditions are present. If step 310 determines that voice activity is not present, step 320 is executed where a counter is incremented. In step 330, a check is performed to determine if the current value of the counter has reached an upper limit. In a preferred embodiment, the upper limit for the counter is set to equal 20.
If the upper limit of the counter has been reached, the external voice activity detector forces an update of the noise content of the incoming digital representations of a voice signal and the method returns to step 310. If, however, step 330 determines that the upper limit has not been reached, the method executes step 350 where the external voice activity detector allows the noise suppression algorithm to determine if an update in the noise content of an incoming digital representation of a voice signal is required. The method then returns to step 310. If the external voice activity detector determines that a voice signal is present, as in step 310, a counter is reset in step 315 and the method returns to step 310.
Steps 320 through 340 allow a noise update only after a relatively long "hangover" period has occurred. The use of a hangover period restricts the noise suppression algorithm to performing a noise content estimate only after a hands-free subscriber has stopped talking. Thus, noise content estimates are not performed during the pauses which occur during normal speech. Additionally, the use of a counter to limit the time between forced updates of the noise content of the voice signal limits the length of the hangover period. By limiting the length of the hangover period, the locked fault condition in which the noise suppression algorithm ceases to update the noise content estimate can be avoided, thus preventing the far-end listener from being subjected to high levels of noise.
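The counter logic of FIG. 3 might be sketched in Python as shown below. This is a minimal illustration rather than the disclosed implementation: the class name `SilenceCounter`, the string return values and the choice to reset the counter after a forced update (borrowed from step 280 of FIG. 2) are assumptions. Only the upper limit of 20 comes from the description.

```python
class SilenceCounter:
    """Hypothetical hangover counter controlling forced noise updates."""

    def __init__(self, upper_limit: int = 20) -> None:
        self.upper_limit = upper_limit  # preferred upper limit from the description
        self.count = 0

    def step(self, voice_present: bool) -> str:
        """Return "force" to force a noise update, or "defer" to leave the
        decision to the noise suppression algorithm's own detector."""
        if voice_present:
            # Step 315: voice detected, so the counter is reset.
            self.count = 0
            return "defer"
        # Step 320: no voice detected, so the counter is incremented.
        self.count += 1
        # Step 330: check whether the upper limit has been reached.
        if self.count >= self.upper_limit:
            # Assumption: reset after forcing, as step 280 of FIG. 2 does.
            self.count = 0
            return "force"
        # Step 350: limit not reached, defer to the noise suppression algorithm.
        return "defer"
```

With these assumptions, a pause shorter than 20 consecutive non-voice frames never produces a forced update, while sustained silence produces one forced update every 20 frames.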
A method and system for improved noise suppression using an external voice activity detector provides a capability to conduct voice communications in the presence of widely varying background noise. The method and system correct a shortcoming present in many noise suppression techniques by forcing the noise suppression technique to perform noise content estimates on incoming digital representations of voice signals under certain conditions. This, in turn, minimizes the noise heard by the listening station. Additionally, the locked fault condition, in which noise updates cease to occur, is avoided. The method and system result in a hands-free communications system which does not subject a far-end listener to a noise burst when an increase in background noise occurs.
Accordingly, it is intended by the appended claims to cover all modifications of the invention that fall within the true spirit and scope of the invention.

Claims

What is claimed is:
1. In a transmitter which performs a noise suppression technique on an incoming voice signal, the noise suppression technique using an internal voice activity detector, a method for controlling an update of a noise content estimate of said incoming voice signal, comprising the steps of: estimating a background noise floor of the incoming voice signal using a second voice activity detector external to the noise suppression technique; estimating a signal power of the incoming voice signal using the second voice activity detector; determining voice activity based on the background noise floor estimate and the signal power estimate; and controlling the update of the noise content estimate based on the determining step.
2. The method of claim 1, wherein the determining step further comprises the step of applying a slope factor to the background noise floor estimate to form an updated noise floor estimate if the signal power estimate exceeds the background noise floor estimate.
3. The method of claim 2, wherein the slope factor is approximately in the range of 2 to 8 decibels per second.
4. The method of claim 3, wherein the determining step further comprises the step of applying a voice activity factor to the updated noise floor estimate to create a voice activity threshold estimate.
5. The method of claim 4, wherein the voice activity factor is approximately in the range of 8 decibels.
6. The method of claim 4, wherein the controlling step further comprises the step of allowing the internal voice activity detector to update a noise content estimate if the signal power estimate is greater than the voice activity threshold estimate.
7. The method of claim 1, wherein the determining step further comprises the step of equating the background noise floor estimate with the signal power estimate if the signal power estimate is less than the background noise floor estimate.
8. The method of claim 7, wherein the determining step further comprises the step of applying a voice activity factor to the background noise floor estimate to create a voice activity threshold estimate.
9. The method of claim 8, wherein the controlling step further comprises the step of updating the noise content estimate if the signal power estimate is less than the voice activity threshold estimate.
10. The method of claim 1, wherein the second estimating step comprises the step of integrating a previous signal power estimate.
11. The method of claim 10 wherein said integrating step further comprises the step of applying a leaky integrator factor.
12. The method of claim 11 wherein the leaky integrator factor is approximately in the range of 99/100.
13. A transmitter for conveying voice signals to a remote receiver, comprising: a first voice activity detector; a noise floor estimator coupled to the first voice activity detector; and a second voice activity detector, coupled to the noise floor estimator, for controlling an update of a noise floor content of the voice signals performed by the noise floor estimator.
14. The transmitter of claim 13, wherein the second voice activity detector comprises a signal power estimator for computing a signal power estimate of an incoming voice signal.
15. The transmitter of claim 13, wherein the second voice activity detector comprises a noise floor estimator for estimating a noise floor of an incoming voice signal independent of a voice activity state.
16. The transmitter of claim 13, wherein the second voice activity detector comprises a voice activity processor for controlling an update of a noise content of the voice signals performed by the noise floor estimator.
17. In a communications system which performs noise suppression on an incoming voice signal, a method for controlling an update of a noise content of the incoming voice signal, comprising: estimating a noise content of an incoming voice signal using a first voice activity detector; determining voice activity using a second voice activity detector; and forcing an update of the noise content of the incoming voice signal based on the determining step.
18. The method of claim 17, wherein the determining step further comprises the step of comparing an estimate of a signal power of the incoming voice signal with a noise floor estimate.
19. The method of claim 18, wherein the determining step further comprises the step of equating the signal power with an updated version of the noise floor estimate if the signal power is less than the noise floor estimate.
20. The method of claim 18, wherein the determining step further comprises the step of equating an updated version of the noise floor estimate with the noise floor estimate multiplied by a slope factor, if the signal power is greater than the noise floor estimate.
PCT/US2000/007090 1999-04-19 2000-03-16 Noise suppression using external voice activity detection WO2000063887A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
AU38937/00A AU3893700A (en) 1999-04-19 2000-03-16 Noise suppression using external voice activity detection
EP00918063A EP1086453B1 (en) 1999-04-19 2000-03-16 Noise suppression using external voice activity detection
JP2000612931A JP2002542692A (en) 1999-04-19 2000-03-16 Noise suppression using external voice activity detection
DE60020317T DE60020317T2 (en) 1999-04-19 2000-03-16 Noise reduction using an external voice activity detector
HK01107509A HK1041739A1 (en) 1999-04-19 2001-10-29 Noise suppression using external voice activity detection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/293,901 US6618701B2 (en) 1999-04-19 1999-04-19 Method and system for noise suppression using external voice activity detection
US09/293,901 1999-04-19

Publications (1)

Publication Number Publication Date
WO2000063887A1 (en) 2000-10-26

Family

ID=23131053

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/007090 WO2000063887A1 (en) 1999-04-19 2000-03-16 Noise suppression using external voice activity detection

Country Status (9)

Country Link
US (1) US6618701B2 (en)
EP (1) EP1086453B1 (en)
JP (1) JP2002542692A (en)
KR (1) KR100676216B1 (en)
CN (1) CN1133152C (en)
AU (1) AU3893700A (en)
DE (1) DE60020317T2 (en)
HK (1) HK1041739A1 (en)
WO (1) WO2000063887A1 (en)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7933295B2 (en) 1999-04-13 2011-04-26 Broadcom Corporation Cable modem with voice processing capability
US7263074B2 (en) * 1999-12-09 2007-08-28 Broadcom Corporation Voice activity detection based on far-end and near-end statistics
AU3041800A (en) * 1999-12-21 2001-07-09 Nokia Networks Oy Equaliser with a cost function taking into account noise energy
US7617099B2 (en) * 2001-02-12 2009-11-10 FortMedia Inc. Noise suppression by two-channel tandem spectrum modification for speech signal in an automobile
US7236929B2 (en) * 2001-05-09 2007-06-26 Plantronics, Inc. Echo suppression and speech detection techniques for telephony applications
US20020172350A1 (en) * 2001-05-15 2002-11-21 Edwards Brent W. Method for generating a final signal from a near-end signal and a far-end signal
US7295976B2 (en) * 2002-01-25 2007-11-13 Acoustic Technologies, Inc. Voice activity detector for telephone
US20040073422A1 (en) * 2002-10-14 2004-04-15 Simpson Gregory A. Apparatus and methods for surreptitiously recording and analyzing audio for later auditioning and application
JP4282317B2 (en) * 2002-12-05 2009-06-17 アルパイン株式会社 Voice communication device
US8271279B2 (en) 2003-02-21 2012-09-18 Qnx Software Systems Limited Signature noise removal
US8326621B2 (en) * 2003-02-21 2012-12-04 Qnx Software Systems Limited Repetitive transient noise removal
US7885420B2 (en) 2003-02-21 2011-02-08 Qnx Software Systems Co. Wind noise suppression system
US7949522B2 (en) 2003-02-21 2011-05-24 Qnx Software Systems Co. System for suppressing rain noise
US20040218519A1 (en) * 2003-05-01 2004-11-04 Rong-Liang Chiou Apparatus and method for estimation of channel state information in OFDM receivers
KR20060094078A (en) 2003-10-16 2006-08-28 코닌클리즈케 필립스 일렉트로닉스 엔.브이. Voice activity detection with adaptive noise floor tracking
JP4490090B2 (en) * 2003-12-25 2010-06-23 株式会社エヌ・ティ・ティ・ドコモ Sound / silence determination device and sound / silence determination method
JP4601970B2 (en) * 2004-01-28 2010-12-22 株式会社エヌ・ティ・ティ・ドコモ Sound / silence determination device and sound / silence determination method
CA2454296A1 (en) * 2003-12-29 2005-06-29 Nokia Corporation Method and device for speech enhancement in the presence of background noise
DE102004049347A1 (en) * 2004-10-08 2006-04-20 Micronas Gmbh Circuit arrangement or method for speech-containing audio signals
KR100677396B1 (en) 2004-11-20 2007-02-02 엘지전자 주식회사 A method and a apparatus of detecting voice area on voice recognition device
WO2007026691A1 (en) * 2005-09-02 2007-03-08 Nec Corporation Noise suppressing method and apparatus and computer program
US7764634B2 (en) * 2005-12-29 2010-07-27 Microsoft Corporation Suppression of acoustic feedback in voice communications
US8204754B2 (en) * 2006-02-10 2012-06-19 Telefonaktiebolaget L M Ericsson (Publ) System and method for an improved voice detector
US7720681B2 (en) * 2006-03-23 2010-05-18 Microsoft Corporation Digital voice profiles
US9462118B2 (en) * 2006-05-30 2016-10-04 Microsoft Technology Licensing, Llc VoIP communication content control
US8971217B2 (en) * 2006-06-30 2015-03-03 Microsoft Technology Licensing, Llc Transmitting packet-based data items
US9966085B2 (en) * 2006-12-30 2018-05-08 Google Technology Holdings LLC Method and noise suppression circuit incorporating a plurality of noise suppression techniques
RU2440627C2 (en) 2007-02-26 2012-01-20 Долби Лэборетериз Лайсенсинг Корпорейшн Increasing speech intelligibility in sound recordings of entertainment programmes
CN101320559B (en) * 2007-06-07 2011-05-18 华为技术有限公司 Sound activation detection apparatus and method
EP2107553B1 (en) * 2008-03-31 2011-05-18 Harman Becker Automotive Systems GmbH Method for determining barge-in
US9575715B2 (en) * 2008-05-16 2017-02-21 Adobe Systems Incorporated Leveling audio signals
CN101625860B (en) * 2008-07-10 2012-07-04 新奥特(北京)视频技术有限公司 Method for self-adaptively adjusting background noise in voice endpoint detection
US8184791B2 (en) * 2009-03-30 2012-05-22 Verizon Patent And Licensing Inc. Method and system for compensating audio signals during a communication session
CN101859568B (en) * 2009-04-10 2012-05-30 比亚迪股份有限公司 Method and device for eliminating voice background noise
WO2011049516A1 (en) * 2009-10-19 2011-04-28 Telefonaktiebolaget Lm Ericsson (Publ) Detector and method for voice activity detection
AU2010308597B2 (en) * 2009-10-19 2015-10-01 Telefonaktiebolaget Lm Ericsson (Publ) Method and background estimator for voice activity detection
US8626498B2 (en) * 2010-02-24 2014-01-07 Qualcomm Incorporated Voice activity detection based on plural voice activity detectors
JP5528538B2 (en) * 2010-03-09 2014-06-25 三菱電機株式会社 Noise suppressor
US8447595B2 (en) 2010-06-03 2013-05-21 Apple Inc. Echo-related decisions on automatic gain control of uplink speech signal in a communications device
WO2012070668A1 (en) * 2010-11-25 2012-05-31 日本電気株式会社 Signal processing device, signal processing method, and signal processing program
HUE053127T2 (en) * 2010-12-24 2021-06-28 Huawei Tech Co Ltd Method and apparatus for adaptively detecting a voice activity in an input audio signal
CN102543092B (en) * 2010-12-29 2014-02-05 联芯科技有限公司 Noise estimation method and device
US8990074B2 (en) 2011-05-24 2015-03-24 Qualcomm Incorporated Noise-robust speech coding mode classification
US9210507B2 (en) * 2013-01-29 2015-12-08 2236008 Ontario Inc. Microphone hiss mitigation
CN105830154B (en) * 2013-12-19 2019-06-28 瑞典爱立信有限公司 Estimate the ambient noise in audio signal
CN104269178A (en) * 2014-08-08 2015-01-07 华迪计算机集团有限公司 Method and device for conducting self-adaption spectrum reduction and wavelet packet noise elimination processing on voice signals
US9953661B2 (en) * 2014-09-26 2018-04-24 Cirrus Logic Inc. Neural network voice activity detection employing running range normalization
US10771631B2 (en) * 2016-08-03 2020-09-08 Dolby Laboratories Licensing Corporation State-based endpoint conference interaction
CN107123419A (en) * 2017-05-18 2017-09-01 北京大生在线科技有限公司 The optimization method of background noise reduction in the identification of Sphinx word speeds
WO2019068915A1 (en) * 2017-10-06 2019-04-11 Sony Europe Limited Audio file envelope based on rms power in sequences of sub-windows

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0335521A1 (en) * 1988-03-11 1989-10-04 BRITISH TELECOMMUNICATIONS public limited company Voice activity detection
EP0665530A1 (en) * 1994-01-28 1995-08-02 AT&T Corp. Voice activity detection driven noise remediator
EP0784311A1 (en) * 1995-12-12 1997-07-16 Nokia Mobile Phones Ltd. Method and device for voice activity detection and a communication device
WO1998001847A1 (en) * 1996-07-03 1998-01-15 British Telecommunications Public Limited Company Voice activity detector

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4052568A (en) * 1976-04-23 1977-10-04 Communications Satellite Corporation Digital voice switch
EP0127718B1 (en) * 1983-06-07 1987-03-18 International Business Machines Corporation Process for activity detection in a voice transmission system
US5276765A (en) * 1988-03-11 1994-01-04 British Telecommunications Public Limited Company Voice activity detection
JP2842026B2 (en) * 1991-02-20 1998-12-24 日本電気株式会社 Adaptive filter coefficient control method and apparatus
US5278944A (en) * 1992-07-15 1994-01-11 Kokusai Electric Co., Ltd. Speech coding circuit
IN184794B (en) * 1993-09-14 2000-09-30 British Telecomm
EP0681730A4 (en) * 1993-11-30 1997-12-17 At & T Corp Transmitted noise reduction in communications systems.
US5526419A (en) * 1993-12-29 1996-06-11 At&T Corp. Background noise compensation in a telephone set
US5659622A (en) 1995-11-13 1997-08-19 Motorola, Inc. Method and apparatus for suppressing noise in a communication system
US5881091A (en) 1996-02-05 1999-03-09 Hewlett-Packard Company Spread spectrum linearization for digitizing receivers
US5926060A (en) * 1996-05-10 1999-07-20 National Semiconductor Corporation Mirror model for designing a continuous-time filter with reduced filter noise
US6097820A (en) * 1996-12-23 2000-08-01 Lucent Technologies Inc. System and method for suppressing noise in digitally represented voice signals
JPH10247098A (en) 1997-03-04 1998-09-14 Mitsubishi Electric Corp Method for variable rate speech encoding and method for variable rate speech decoding
US6023674A (en) * 1998-01-23 2000-02-08 Telefonaktiebolaget L M Ericsson Non-parametric voice activity detection
US6108610A (en) * 1998-10-13 2000-08-22 Noise Cancellation Technologies, Inc. Method and system for updating noise estimates during pauses in an information signal

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1246167A1 (en) * 2001-03-29 2002-10-02 Nokia Corporation Arrangement for de-activating automatic noise cancellation in a mobile station
WO2009035613A1 (en) * 2007-09-12 2009-03-19 Dolby Laboratories Licensing Corporation Speech enhancement with noise level estimation adjustment
US8538763B2 (en) 2007-09-12 2013-09-17 Dolby Laboratories Licensing Corporation Speech enhancement with noise level estimation adjustment
WO2010003544A1 (en) * 2008-07-11 2010-01-14 Fraunhofer-Gesellschaft Zur Förderung Der Angewandten Forschung E.V. An apparatus and a method for generating bandwidth extension output data
US8275626B2 (en) 2008-07-11 2012-09-25 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for decoding an encoded audio signal
US8296159B2 (en) 2008-07-11 2012-10-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for calculating a number of spectral envelopes
KR101278546B1 (en) 2008-07-11 2013-06-24 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. An apparatus and a method for generating bandwidth extension output data
TWI415115B (en) * 2008-07-11 2013-11-11 Fraunhofer Ges Forschung An apparatus and a method for generating bandwidth extension output data
US8612214B2 (en) 2008-07-11 2013-12-17 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and a method for generating bandwidth extension output data
KR101345695B1 (en) 2008-07-11 2013-12-30 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. An apparatus and a method for generating bandwidth extension output data
EP2360685A1 (en) * 2010-01-13 2011-08-24 Yamaha Corporation Noise suppressing device
WO2012127278A1 (en) * 2011-03-18 2012-09-27 Nokia Corporation Apparatus for audio signal processing

Also Published As

Publication number Publication date
JP2002542692A (en) 2002-12-10
AU3893700A (en) 2000-11-02
HK1041739A1 (en) 2002-07-19
EP1086453B1 (en) 2005-05-25
CN1300417A (en) 2001-06-20
US20020152066A1 (en) 2002-10-17
KR20010052483A (en) 2001-06-25
US6618701B2 (en) 2003-09-09
DE60020317T2 (en) 2005-11-17
KR100676216B1 (en) 2007-01-30
DE60020317D1 (en) 2005-06-30
EP1086453A1 (en) 2001-03-28
CN1133152C (en) 2003-12-31

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase Ref document number: 00800589.3; Country of ref document: CN
AK Designated states Kind code of ref document: A1; Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KR KZ LC LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG UZ VN YU ZA ZW
AL Designated countries for regional patents Kind code of ref document: A1; Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
WWE Wipo information: entry into national phase Ref document number: 1020007013593; Country of ref document: KR
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase Ref document number: 2000918063; Country of ref document: EP
WWP Wipo information: published in national office Ref document number: 2000918063; Country of ref document: EP
WWP Wipo information: published in national office Ref document number: 1020007013593; Country of ref document: KR
REG Reference to national code Ref country code: DE; Ref legal event code: 8642
WWG Wipo information: grant in national office Ref document number: 2000918063; Country of ref document: EP
WWG Wipo information: grant in national office Ref document number: 1020007013593; Country of ref document: KR