EP0615226A2 - Method for noise reduction in disturbed voice channels - Google Patents

Method for noise reduction in disturbed voice channels

Info

Publication number
EP0615226A2
Authority
EP
European Patent Office
Prior art keywords
channels
speech
interference
noise reduction
individual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP94102963A
Other languages
German (de)
French (fr)
Other versions
EP0615226B1 (en)
EP0615226A3 (en)
Inventor
Klaus Dr. Ing. Linhard
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mercedes Benz Group AG
Original Assignee
Daimler Benz AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Daimler Benz AG
Publication of EP0615226A2
Publication of EP0615226A3
Application granted
Publication of EP0615226B1
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Abstract

The invention relates to a method which can be used not only for removing noise, for example in automatic speech recognition, but also for improving the speech quality for humans, for example in hands-free speech on a car phone. The noise reduction is carried out in a two-channel or multiple-channel manner, in such a way that the temporal and spatial-acoustic signal properties of speech and noise are used systematically in a stepwise fashion.

Description

The invention relates to a method according to the preamble of patent claim 1.

Such a method is used in automatic speech recognition or in hands-free systems to improve speech quality, e.g. in offices or in motor vehicles.

Disturbed speech is easier to capture if it is recorded with two or more channels. Both speech and interference should be present in each channel. The multi-channel signals are conditioned by digital signal processing.

In multi-channel systems, the transit-time difference of the useful signal between the individual channels must first be determined. This later makes it possible to merge the individual channels into one channel with the correct phase.

Systems with 2 channels are of particular interest, because they already allow a spatial sound field to be resolved by direction while the computational effort remains tolerable.

If the direction from which the sound event of interest arrives is known, an acoustic directional lobe is steered toward this event.

The noise reduction is first carried out in each individual channel. Because the noise reduction does not work without error, distortions and artificial insertions (e.g. "musical tones") can arise. Merging the individually processed channels averages these errors and thereby reduces them.

The sum signal is then post-processed using the cross-correlation of the signals in the individual channels. This assumes that interference and reverberation are less correlated between the channels than the useful signal.

A method for merging 2 disturbed speech channels is known from the publication "Multimicrophone signal-processing technique to remove room reverberation from speech signals" by Allen, Berkley and Blauert (J. Acoust. Soc. Am., Vol. 62, No. 4, October 1977) and from "Noise Suppression Signal Processing Using 2-Point Received Signals" by Kaneda and Tohyama (Electronics and Communications in Japan, Vol. 67-A, No. 12, 1984). The first method is intended for the dereverberation of speech signals; it does not use a true phase compensation of the useful signal, and the dereverberation with noise reduction is carried out only in a post-processing stage. The second method uses a simple linear phase compensation of the channels, but here too the noise reduction takes place only in the post-processing stage.

The invention is therefore based on the object of specifying a method for noise reduction in which the noise reduction is carried out in several stages and a significant improvement in speech quality is achieved.

The object is achieved by the features specified in the characterizing part of patent claim 1. Advantageous refinements and/or further developments can be found in the subclaims.

With the method according to the invention, the spatial and temporal properties of the useful signal and of the interference are exploited systematically:

  • 1.) Spatial properties of the sound fields:
    • a) Attenuation of point sources of interference
      Digital directional filters at the channel inputs, together with the phase estimation, align an acoustic directional lobe with the speaker. The method described in the unpublished German patent application P 42 43 831 is used for the phase estimation. It is robust against interference and requires little computing effort. The directional filters are fixed. It is assumed that the speaker is relatively close to the microphones (distance ≤ 1 m) and moves only within a limited area. Non-stationary and stationary point sources of interference are attenuated by this spatial evaluation.
    • b) Attenuation of diffuse sources of interference
      In post-processing, the diffuse interference and reverberation components are attenuated with the aid of the cross-correlation.
  • 2.) Temporal signal properties:
    The spectral subtraction estimates the interference during the speech pauses and carries out a magnitude subtraction in the spectral domain. This attenuates the temporally stationary interference components.
  • 3.) Averaging the channels (addition):
    Because the recording channels are spatially separated (microphones at a certain distance), the errors of the spectral subtraction (distortion and "musical tones") occur in the individual channels partly at random times. Averaging the channels reduces this error.

The invention is explained in more detail below using exemplary embodiments and with reference to schematic drawings.
FIG. 1
shows a block diagram of the entire method.
FIG. 2
shows a comparison of the averaged output powers Z of different methods with the power of the original noise signal (example: microphone spacing 12 cm, vehicle at 140 km/h). It shows the increasing noise reduction as the processing is carried out with one channel, with two channels, and with two channels plus post-processing.

The microphone signals $x$ and $y$ are transformed into the frequency domain (FFT, Fast Fourier Transform). The segments overlap by half and are weighted with a Hanning window. Each segment is $N$ values long and is extended by a further $N$ zeros. The transform length is chosen, for example, as $2N = 512$. This yields the transformed segments $X_l(i)$ and $Y_l(i)$. The output signal $z$ is obtained after inverse transformation and overlapping of the segments. $l$ denotes the block index of the segments and $i$ the discrete frequency ($i = 0, 1, 2, \ldots, 2N-1$). The sampling rate of the signals $x$ and $y$ is, for example, 12 kHz.
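The segmentation described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `hann`, `frames` and `overlap_add` are hypothetical, the window is taken to be the periodic Hann form (the text does not say which variant is used), and the FFT itself is left out, since any FFT routine could be applied to each zero-padded segment of length $2N$.

```python
import math

def hann(n_len):
    # Periodic Hann window (assumption); half-overlapped copies sum to 1.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]

def frames(x, n_len):
    # Half-overlapped, Hann-weighted segments of n_len samples,
    # each extended by n_len zeros (transform length 2 * n_len).
    hop = n_len // 2
    w = hann(n_len)
    out = []
    for start in range(0, len(x) - n_len + 1, hop):
        seg = [x[start + n] * w[n] for n in range(n_len)]
        out.append(seg + [0.0] * n_len)
    return out

def overlap_add(segs, n_len, total_len):
    # Add every (possibly processed) segment back at its hop position.
    # The zero-padded tail is included because filtering may spill into it.
    hop = n_len // 2
    y = [0.0] * total_len
    for k, seg in enumerate(segs):
        for n, v in enumerate(seg):
            if k * hop + n < total_len:
                y[k * hop + n] += v
    return y
```

With unprocessed segments, `overlap_add(frames(x, N), N, len(x))` reproduces `x` everywhere except in the first and last half segment, where only one window contributes.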

In the frequency domain, the long-term mean of the magnitude spectrum is subtracted (spectral subtraction $H_{SPS}$). The short-term average $K$ and the long-term average $L$ are used to calculate a first adaptive smoothing constant $\beta$, with which the interference spectrum $S_{nn}(i)$ is estimated. This adaptive smoothing constant replaces the otherwise customary speech pause detector. $l$ denotes the block index and $i$ the discrete frequency. For the smoothing constant $\beta_0$, a value of $\beta_0 = 0.03$ is used, for example.

$$\beta_l = g_l \, \beta_0$$

with

$$g_l = \frac{2 L_{l-1}}{L_{l-1} + K_l}$$

$$L_l = (1 - \beta_l) \, L_{l-1} + \beta_l \, K_l$$

$$\hat{S}_{nn,l}(i) = (1 - \beta_l) \, \hat{S}_{nn,l-1}(i) + \beta_l \, |X_l(i)|^2$$
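A minimal sketch of one update step of this noise estimate is given below. The function name `update_noise_estimate` is hypothetical, and the short-term average $K_l$ is assumed here to be the mean power of the current block over all frequency bins, since the text does not define how $K$ is formed.

```python
def update_noise_estimate(s_nn_prev, l_prev, x_frame, beta0=0.03):
    # Short-term average K_l: mean block power (assumption, not from the text).
    k = sum(abs(v) ** 2 for v in x_frame) / len(x_frame)
    denom = l_prev + k
    # g_l = 2 L_{l-1} / (L_{l-1} + K_l); fall back to 1 for an all-zero history.
    g = 2.0 * l_prev / denom if denom > 0.0 else 1.0
    beta = g * beta0
    # Long-term average and per-bin interference spectrum, both smoothed with beta_l.
    l_new = (1.0 - beta) * l_prev + beta * k
    s_nn = [(1.0 - beta) * s + beta * abs(v) ** 2
            for s, v in zip(s_nn_prev, x_frame)]
    return s_nn, l_new, beta
```

When the block power is close to the long-term average (a pause), $g_l \approx 1$ and the estimate adapts with roughly $\beta_0$. When the block power is much larger (speech), $g_l$ and therefore $\beta_l$ become small, so the estimate is nearly frozen, which is how the adaptive constant can stand in for a pause detector.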

The interference spectrum is normalized and subtracted:

$$|\hat{X}_l(i)| = |X_l(i)| - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|}$$

$$\hat{X}_l(i) = \left(1 - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|^2}\right) X_l(i)$$
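As a sketch of this step, the subtraction can be applied as a real gain per frequency bin, which leaves the phase of $X_l(i)$ untouched. The function name `spectral_subtract` is hypothetical, and the handling of an exactly zero bin is an assumption made here to avoid a division by zero.

```python
def spectral_subtract(x_frame, s_nn):
    # Per-bin gain 1 - S_nn / |X|^2 applied to the complex spectrum.
    out = []
    for x, s in zip(x_frame, s_nn):
        p = abs(x) ** 2
        gain = 1.0 - s / p if p > 0.0 else 0.0  # zero bins stay zero (assumption)
        out.append(gain * x)
    return out
```

For a bin $X = 3 + 4j$ with $\hat{S}_{nn} = 5$ the gain is $0.8$, giving a magnitude of $4 = 5 - 5/5$, in agreement with the magnitude form of the equation. Note that the gain becomes negative whenever the noise estimate exceeds $|X|^2$; this sketch does not clamp it.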

A modified form is given by:

$$\hat{X}_l(i) = \left(1 - \alpha \, \frac{\hat{S}_{nn,l}(i)}{S_{xx,l}(i)}\right) X_l(i); \quad \text{for } \left(1 - \alpha \, \frac{\hat{S}_{nn,l}(i)}{S_{xx,l}(i)}\right) X_l(i) < f_0 \, S_{nn,l}(i)$$

$$\hat{X}_l(i) = f_0 \, S_{nn,l}(i); \quad \text{otherwise}$$
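The modified form can be sketched as below, under several assumptions that are not confirmed by the text. The function name `subtract_with_floor` is hypothetical. The floor $f_0 S_{nn}$ is read as a lower bound on the output magnitude, as the term "spectral floor" suggests, although the inequality as printed points the other way. The floor is applied to the magnitude while the phase of the input bin is kept.

```python
import cmath

def subtract_with_floor(x_frame, s_nn, s_xx, alpha=1.5, f0=0.2):
    # Over-subtraction with factor alpha, limited from below by f0 * S_nn.
    out = []
    for x, n, p in zip(x_frame, s_nn, s_xx):
        gain = 1.0 - alpha * n / p if p > 0.0 else 0.0
        y = gain * x
        floor = f0 * n
        if gain <= 0.0 or abs(y) < floor:
            # Keep a residual noise level at the input phase (assumption).
            y = cmath.rect(floor, cmath.phase(x))
        out.append(y)
    return out
```

The default values `alpha=1.5` and `f0=0.2` are the example values named in the description.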

The power density $S_{xx,l}$ of a channel is given by:

$$\hat{S}_{xx,l}(i) = (1 - \alpha_l) \, \hat{S}_{xx,l-1}(i) + \alpha_l \, |X_l(i)|^2$$

$$\alpha_l = 2 - g_l; \quad \text{for } 0.5 < 2 - g_l < 2.0$$

$$\alpha_l = 0.5; \quad \text{for } 0.5 > 2 - g_l$$

$$\alpha_l = 2; \quad \text{for } 2 < 2 - g_l$$

$f_0$ is referred to as the "spectral floor". Part of the background noise is let through in order to create a natural auditory impression and to mask part of the "musical tones". $\alpha$ is an overestimation factor for the noise and serves to further reduce the residual noise. For these values, $f_0 = 0.2$ and $\alpha = 1.5$ can be chosen, for example.

In contrast to the known forms of spectral subtraction, a second adaptive smoothing with $\alpha$ is used to reduce a further part of the "musical tones" by smoothing the power density $S_{xx}$ only slightly during speech and strongly during pauses.

The corresponding equations apply to the second channel $Y$.

The method specified in the not previously published patent application P 42 43 831 is used to calculate the linear phase shift between the useful components in the channels. This method fits seamlessly into the noise reduction method according to the invention. The phase shift is estimated at a selected number of maxima of the cross power density, and the phase correction is achieved by multiplication in the frequency domain with the all-pass function $H_{ALLP}$:

$$\hat{X}_l(i) := \hat{X}_l(i) \, H_{ALLP,l}$$

$$\hat{X}_l(i) := \hat{X}_l(i) \, (\cos(i \cdot \varphi) + j \sin(i \cdot \varphi))$$
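The all-pass multiplication can be sketched as follows. The function name `phase_correct` is hypothetical, and the phase increment `phi` per frequency bin is assumed to have been estimated already; the estimation procedure of P 42 43 831 is not reproduced here.

```python
import cmath

def phase_correct(x_frame, phi):
    # Multiply bin i by cos(i*phi) + j*sin(i*phi) = exp(j*i*phi).
    # Magnitudes are unchanged; only a linear phase is added.
    return [v * cmath.exp(1j * i * phi) for i, v in enumerate(x_frame)]
```

Because every factor has magnitude 1, the correction changes only the delay of the channel, not its spectrum. For a real-valued time signal, the bins above $N$ represent negative frequencies, so a practical implementation would have to keep the spectrum conjugate-symmetric; this sketch applies the formula literally to all bins.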

If there are more than two channels, the phase correction is carried out for each additional channel. The first channel serves as the reference.

The directional filters for the channels are calculated using a "beamforming method". Different noise scenarios can be considered, and different directional filters $H_R$ result depending on the noise situation. One set of these filters is selected; however, if the system state is known during later operation, it is possible to switch to a specific set, or the filters can be adapted continuously. The "beamforming method" used is, for example, the gradient method according to Frost ("An Algorithm for Linearly Constrained Adaptive Array Processing", Proc. IEEE, Vol. 60, No. 8, 1972) or the method according to Sondhi and Elko ("Adaptive Optimization of Microphone Arrays under a Nonlinear Constraint", Int. Conf. on ASSP, Tokyo, 1986, pp. 981-984).

In the frequency domain, the directional filtering is the multiplication:

$$\hat{X}_l(i) := \hat{X}_l(i) \, H_R(i)$$

Adding the channels, together with the directional filters, yields the overall directional characteristic and the output signal $Z$:

$$Z_l(i) = \hat{X}_l(i) + \hat{Y}_l(i)$$

In addition, adding the channels leads to an averaging and thus a reduction of the statistical errors of the spectral subtraction.

The cross power density of the two channels is then calculated using a smoothing constant (e.g. $\gamma = 0.3$):

$$S_{xy,l}(i) = (1 - \gamma) \, S_{xy,l-1}(i) + \gamma \, \hat{X}_l(i) \, \hat{Y}_l^*(i)$$

The cross power density $S_{xy}$ is normalized by the sum of the power densities $S_{xx}$, $S_{yy}$ of the individual channels. The result is a modified coherence function:

$$H_{KKF,l}(i) = \frac{S_{xy,l}(i)}{S_{xx,l}(i) + S_{yy,l}(i)}; \quad \text{for } \frac{S_{xy,l}(i)}{S_{xx,l}(i) + S_{yy,l}(i)} > 0.3$$

$$H_{KKF,l}(i) = 0.3; \quad \text{otherwise}$$

with

$$S_{xx,l}(i) = (1 - \gamma) \, S_{xx,l-1}(i) + \gamma \, \hat{X}_l(i) \, \hat{X}_l^*(i)$$

$$S_{yy,l}(i) = (1 - \gamma) \, S_{yy,l-1}(i) + \gamma \, \hat{Y}_l(i) \, \hat{Y}_l^*(i)$$

The following applies to the output signal $Z$:

$$Z_l(i) := Z_l(i) \, H_{KKF,l}(i)$$
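The summation and the coherence-based post-processing can be sketched together as one step per block. The function name `coherence_postfilter` and the `state` dictionary are hypothetical. Since $S_{xy}$ is complex and the comparison with 0.3 needs a real quantity, the magnitude $|S_{xy}|$ is used here; that choice is an assumption, and the real part would be another possible reading.

```python
def coherence_postfilter(x_frame, y_frame, state, gamma=0.3, h_min=0.3):
    # state holds per-bin lists 'sxy' (complex), 'sxx' and 'syy' (real),
    # which are updated in place from block to block.
    z_out, h_out = [], []
    for i, (x, y) in enumerate(zip(x_frame, y_frame)):
        state["sxy"][i] = (1 - gamma) * state["sxy"][i] + gamma * x * y.conjugate()
        state["sxx"][i] = (1 - gamma) * state["sxx"][i] + gamma * abs(x) ** 2
        state["syy"][i] = (1 - gamma) * state["syy"][i] + gamma * abs(y) ** 2
        denom = state["sxx"][i] + state["syy"][i]
        # Modified coherence, limited from below by h_min (0.3 in the text).
        h = abs(state["sxy"][i]) / denom if denom > 0.0 else h_min
        h = max(h, h_min)
        h_out.append(h)
        z_out.append((x + y) * h)  # sum of the channels, then post-filter
    return z_out, h_out
```

With this normalization, two identical channels give a gain of 0.5, so the post-filter gain stays between 0.3 and 0.5; bins in which the channels are weakly correlated are pushed toward the lower limit.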

If directional filters according to the method of Sondhi and Elko are used, an inverse filter for frequency response correction is required. This filter raises the lower frequencies, because the frequency response of the directional filters (for the desired direction, i.e. the direction of the speaker) attenuates these frequencies. This filter $H_{INV}$ can be approximated in a simple manner from the calculated frequency response:

$$Z_l(i) := Z_l(i) \, H_{INV,l}(i)$$

If the adaptation is carried out using the Frost method, no inverse filter is required, because the frequency response in the direction of the speaker has the constant value 1.

The method according to the invention is not limited to systems with two channels, but can also be applied to multi-channel systems (3 or more channels).

Claims (4)

1. Method for noise reduction of at least two disturbed speech channels, the disturbed speech channels being combined into one output channel, characterized in that
- a steerable acoustic directional lobe is generated by means of digital directional filters and a linear phase estimate for the individual channels, which follows the speaker's movement and thereby attenuates the spatial interference sources,
- the interference is estimated in the individual channels during speech pauses, and the temporally stationary interference sources are attenuated by spectral subtraction,
- the individual speech channels are subsequently added, thereby averaging the statistical disturbances of the spectral subtraction, and
- the sum signal is post-processed with a modified coherence function, thereby attenuating the diffuse interference and reverberation components.

2. Method according to claim 1, characterized in that
- the spectral subtraction is carried out with two adaptive smoothing constants α, β,
- the interference spectrum S_nn is estimated with the first adaptive smoothing constant β, and
- with the second adaptive smoothing constant α, the power density S_xx of the individual channels is smoothed strongly during speech pauses and only slightly during speech.

3. Method according to claim 1, characterized in that the linear phase shift of at least two signals is determined from a certain number of maxima of the cross power density in the frequency domain.

4. Method according to claim 1, characterized in that the phase correction, the directional filtering and any necessary inverse filtering are carried out in the frequency domain.
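The spectral subtraction with two smoothing constants described in claim 2 can be sketched as follows. This is a hedged illustration only: the first-order recursive update form, the numeric values of α and β, the externally supplied speech/pause flag `is_speech`, the gain rule `1 - S_nn/S_xx`, and the gain floor are assumptions chosen for the example and are not taken from the claims. Here β is made "adaptive" simply by freezing the noise estimate while speech is present.

```python
def spectral_subtraction_step(power, s_xx, s_nn, is_speech,
                              alpha_pause=0.9, alpha_speech=0.3,
                              beta=0.95, floor=0.05):
    """Process one frame of one channel (illustrative only).

    power: current periodogram |X(k)|^2 per frequency bin
    s_xx:  smoothed power density from the previous frame
    s_nn:  noise spectrum estimate from the previous frame
    Returns (gains, new_s_xx, new_s_nn); gains apply to the power spectrum.
    """
    # Strong smoothing in speech pauses, light smoothing during speech.
    alpha = alpha_speech if is_speech else alpha_pause
    new_s_xx = [alpha * sx + (1.0 - alpha) * p for sx, p in zip(s_xx, power)]

    if is_speech:
        # Assumed rule: do not update the noise estimate while speech is active.
        new_s_nn = list(s_nn)
    else:
        new_s_nn = [beta * sn + (1.0 - beta) * p for sn, p in zip(s_nn, power)]

    # Subtract the noise estimate; the floor limits over-attenuation.
    gains = [max(1.0 - sn / sx, floor) if sx > 0.0 else floor
             for sx, sn in zip(new_s_xx, new_s_nn)]
    return gains, new_s_xx, new_s_nn
```

In a full system this step would run per channel before the channels are added, so that the residual errors of the subtraction are averaged in the sum.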
EP94102963A 1993-03-11 1994-02-28 Method for noise reduction in disturbed voice channels Expired - Lifetime EP0615226B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE4307688 1993-03-11
DE4307688A DE4307688A1 (en) 1993-03-11 1993-03-11 Method of noise reduction for disturbed voice channels

Publications (3)

Publication Number Publication Date
EP0615226A2 (en) 1994-09-14
EP0615226A3 (en) 1995-08-23
EP0615226B1 (en) 1999-05-06

Family

ID=6482502

Family Applications (1)

Application Number Title Priority Date Filing Date
EP94102963A Expired - Lifetime EP0615226B1 (en) 1993-03-11 1994-02-28 Method for noise reduction in disturbed voice channels

Country Status (2)

Country Link
EP (1) EP0615226B1 (en)
DE (2) DE4307688A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998003965A1 (en) * 1996-07-19 1998-01-29 Daimler-Benz Ag Method of reducing voice signal interference
EP1251493A2 (en) * 2001-04-14 2002-10-23 DaimlerChrysler AG Method for noise reduction with self-adjusting spurious frequency
EP1286333A1 (en) * 2001-08-21 2003-02-26 Culturecom Technology (Macau) Ltd. Method and apparatus for processing a sound signal
US11741973B2 (en) 2015-03-09 2023-08-29 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19942868A1 (en) * 1999-09-08 2001-03-15 Volkswagen Ag Method for operating a multiple microphone arrangement in a motor vehicle and a multiple microphone arrangement itself
DE19955156A1 (en) * 1999-11-17 2001-06-21 Univ Karlsruhe Method and device for suppressing an interference signal component in the output signal of a sound transducer means
DE10120231A1 (en) * 2001-04-19 2002-10-24 Deutsche Telekom Ag Single-channel noise reduction of speech signals whose noise changes more slowly than speech signals, by estimating non-steady noise using power calculation and time-delay stages

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4112430A (en) * 1977-06-01 1978-09-05 The United States Of America As Represented By The Secretary Of The Navy Beamformer for wideband signals
US4653102A (en) * 1985-11-05 1987-03-24 Position Orientation Systems Directional microphone system

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4066842A (en) * 1977-04-27 1978-01-03 Bell Telephone Laboratories, Incorporated Method and apparatus for cancelling room reverberation and noise pickup
JPS5715597A (en) * 1980-07-02 1982-01-26 Nippon Gakki Seizo Kk Microphone device
US4811404A (en) * 1987-10-01 1989-03-07 Motorola, Inc. Noise suppression system
JPH01118900A (en) * 1987-11-01 1989-05-11 Ricoh Co Ltd Noise suppressor
DE4012349A1 (en) * 1989-04-19 1990-10-25 Ricoh Kk Noise elimination device for speech recognition system - uses spectral subtraction of sampled noise values from sampled speech values
GB8911153D0 (en) * 1989-05-16 1989-09-20 Smiths Industries Plc Speech recognition apparatus and methods
US5267323A (en) * 1989-12-29 1993-11-30 Pioneer Electronic Corporation Voice-operated remote control system
DE4106405C2 (en) * 1990-03-23 1996-02-29 Ricoh Kk Noise suppression device for a speech recognition system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING 1991, vol. 5, 14 May 1991 - 17 May 1991, TORONTO, CA, pages 3581-3584, KELLERMANN 'A self-steering digital microphone array' *
SIGNAL PROCESSING VI - THEORIES AND APPLICATIONS. PROCEEDINGS OF EUSIPCO-92, SIXTH EUROPEAN SIGNAL PROCESSING CONFERENCE, 24 August 1992 - 27 August 1992, BRUSSELS, BE, pages 1633-1636, LE BOUQUIN ET AL. 'Study of a noise cancellation system based on the coherence function' *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998003965A1 (en) * 1996-07-19 1998-01-29 Daimler-Benz Ag Method of reducing voice signal interference
US6687669B1 (en) 1996-07-19 2004-02-03 Schroegmeier Peter Method of reducing voice signal interference
EP1251493A2 (en) * 2001-04-14 2002-10-23 DaimlerChrysler AG Method for noise reduction with self-adjusting spurious frequency
EP1251493A3 (en) * 2001-04-14 2003-11-19 DaimlerChrysler AG Method for noise reduction with self-adjusting spurious frequency
US7020291B2 (en) 2001-04-14 2006-03-28 Harman Becker Automotive Systems Gmbh Noise reduction method with self-controlling interference frequency
EP1286333A1 (en) * 2001-08-21 2003-02-26 Culturecom Technology (Macau) Ltd. Method and apparatus for processing a sound signal
US11741973B2 (en) 2015-03-09 2023-08-29 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal
US11881225B2 (en) 2015-03-09 2024-01-23 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Audio encoder for encoding a multichannel signal and audio decoder for decoding an encoded audio signal

Also Published As

Publication number Publication date
DE4307688A1 (en) 1994-09-15
DE59408194D1 (en) 1999-06-10
EP0615226B1 (en) 1999-05-06
EP0615226A3 (en) 1995-08-23

Similar Documents

Publication Publication Date Title
DE69531136T2 (en) Method and device for multi-channel compensation of an acoustic echo
US5400409A (en) Noise-reduction method for noise-affected voice channels
DE60316704T2 (en) MULTI-CHANNEL LANGUAGE RECOGNITION IN UNUSUAL ENVIRONMENTS
DE102008027848B4 (en) Echo cancellers, echo cancellation and computer readable storage media
DE69932626T2 (en) SIGNAL PROCESSING DEVICE AND METHOD
DE102010023615B4 (en) Signal processing apparatus and signal processing method
EP0747880B1 (en) System for speech recognition
EP1143416A2 (en) Time domain noise reduction
EP0612059A2 (en) Method for estimating the propagation time in noisy speech channels
EP1251493A2 (en) Method for noise reduction with self-adjusting spurious frequency
EP3375204B1 (en) Audio signal processing in a vehicle
DE112007003625T5 (en) Echo cancellation device, echo cancellation system, echo cancellation method and computer program
EP1189419B1 (en) Method and device for eliminating the loudspeaker interference on microphone signals
EP1456839A2 (en) Method and device for the suppression of periodic interference signals
EP1155561B1 (en) Method and device for suppressing noise in telephone devices
EP3065417B1 (en) Method for suppressing interference noise in an acoustic system
EP0615226B1 (en) Method for noise reduction in disturbed voice channels
DE69817461T2 (en) Method and device for the optimized processing of an interference signal during a sound recording
DE602005000897T2 (en) Input sound processor
DE10137348A1 (en) Noise filtering method in voice communication apparatus, involves controlling overestimation factor and background noise variable in transfer function of wiener filter based on ratio of speech and noise signal
DE19729521B4 (en) Method and device for suppressing noise and echo
DE102018117558A1 (en) ADAPTIVE AFTER-FILTERING
DE10025655B4 (en) A method of removing an unwanted component of a signal and system for distinguishing between unwanted and desired signal components
DE3230391C2 (en)
DE19818608C2 (en) Method and device for speech detection and noise parameter estimation

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB IT

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): DE FR GB IT

17P Request for examination filed

Effective date: 19950920

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

17Q First examination report despatched

Effective date: 19980610

GRAG Despatch of communication of intention to grant

Free format text: ORIGINAL CODE: EPIDOS AGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAH Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOS IGRA

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: DAIMLERCHRYSLER AG

ITF It: translation for a ep patent filed

Owner name: BARZANO' E ZANARDO MILANO S.P.A.

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): DE FR GB IT

REF Corresponds to:

Ref document number: 59408194

Country of ref document: DE

Date of ref document: 19990610

GBT Gb: translation of ep patent filed (gb section 77(6)(a)/1977)

Effective date: 19990702

ET Fr: translation filed
PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

26N No opposition filed
REG Reference to a national code

Ref country code: GB

Ref legal event code: IF02

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 59408194

Country of ref document: DE

Representative's name: GRUENECKER, KINKELDEY, STOCKMAIR & SCHWANHAEUS, DE

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 59408194

Country of ref document: DE

Representative's name: GRUENECKER, KINKELDEY, STOCKMAIR & SCHWANHAEUS, DE

Effective date: 20120411

Ref country code: DE

Ref legal event code: R081

Ref document number: 59408194

Country of ref document: DE

Owner name: NUANCE COMMUNICATIONS, INC. (N.D.GES.D. STAATE, US

Free format text: FORMER OWNER: HARMAN BECKER AUTOMOTIVE SYSTEMS GMBH, 76307 KARLSBAD, DE

Effective date: 20120411

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20120224

Year of fee payment: 19

REG Reference to a national code

Ref country code: FR

Ref legal event code: TP

Owner name: NUANCE COMMUNICATIONS, INC., US

Effective date: 20120924

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20130228

Year of fee payment: 20

Ref country code: FR

Payment date: 20130301

Year of fee payment: 20

Ref country code: DE

Payment date: 20130220

Year of fee payment: 20

REG Reference to a national code

Ref country code: DE

Ref legal event code: R071

Ref document number: 59408194

Country of ref document: DE

REG Reference to a national code

Ref country code: GB

Ref legal event code: PE20

Expiry date: 20140227

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20140301

Ref country code: GB

Free format text: LAPSE BECAUSE OF EXPIRATION OF PROTECTION

Effective date: 20140227