EP0615226A2

EP0615226A2 - Method for noise reduction in disturbed voice channels

Info

Publication number: EP0615226A2
Application number: EP94102963A
Authority: EP
Inventors: Klaus Dr. Ing. Linhard
Original assignee: Daimler Benz AG
Current assignee: Mercedes Benz Group AG
Priority date: 1993-03-11
Filing date: 1994-02-28
Publication date: 1994-09-14
Anticipated expiration: 2014-02-28
Also published as: DE4307688A1; DE59408194D1; EP0615226B1; EP0615226A3

Abstract

The invention relates to a method that can be used both for noise removal, for example in automatic speech recognition, and for improving speech quality for human listeners, for example in hands-free operation of a car phone. The noise reduction is carried out with two or more channels in such a way that the temporal and spatial-acoustic signal properties of speech and noise are exploited systematically, step by step.

Description

The invention relates to a method according to the preamble of patent claim 1.

Such a method is used in automatic speech recognition or in hands-free systems to improve speech quality, e.g. in offices or in motor vehicles.

Disturbed speech can be captured better if it is recorded with two or more channels. Speech and interference should be present in each channel. The multi-channel signals are conditioned by digital signal processing.

In multi-channel systems, the propagation delay difference of the useful signal between the individual channels must first be determined. This later makes it possible to merge the individual channels into one channel with the correct phase.

Systems with 2 channels are of particular interest because they already allow a spatial sound field to be resolved by direction while the computational effort remains tolerable.

If the direction from which the sound event of interest arrives is known, an acoustic directional lobe is steered toward this event.

The noise reduction is first carried out in each individual channel. Since the noise reduction does not work without errors, distortions and artificial insertions (e.g. "musical tones") can arise. Merging the individually processed channels averages these errors and thereby reduces them.

The sum signal is then post-processed using the cross-correlation of the signals in the individual channels. It is assumed that interference and reverberation are less correlated between the channels than the useful signal.

A method for combining 2 disturbed speech channels is known from the publication "Multimicrophone signal-processing technique to remove room reverberation from speech signals" by Allen, Berkley and Blauert (J. Acoust. Soc. Am., Vol. 62, No. 4, October 1977) and from "Noise Suppression Signal Processing Using 2-Point Received Signals" by Kaneda and Tohyame (Electronics and Communication in Japan, Vol. 67-A, No. 12, 1984). The first method is intended for the dereverberation of speech signals; it does not use a true phase compensation of the useful signal, and the dereverberation with noise reduction is carried out only in a post-processing stage. The second method uses a simple linear phase compensation of the channels, but here too the noise reduction takes place only in the post-processing stage.

The invention is therefore based on the object of specifying a method for noise reduction in which the noise reduction is carried out in several stages and a significant improvement in speech quality is achieved.

The object is achieved by the features specified in the characterizing part of patent claim 1. Advantageous refinements and/or further developments can be found in the subclaims.

With the method according to the invention, the spatial and temporal properties of the useful signal and of the interference are exploited systematically:

1.) Spatial properties of the sound fields:
- a) Attenuation of point sources of interference
  Digital directional filters at the input of the channels, together with the phase estimation, steer an acoustic directional lobe toward the speaker. The method described in the unpublished German patent application P 42 43 831 is used for the phase estimation. It is robust against interference and requires little computational effort. The directional filters are fixed. It is assumed that the speaker is relatively close to the microphones (distance ≦ 1 m) and moves only within a limited area. Non-stationary and stationary point sources of interference are attenuated by this spatial evaluation.
- b) Attenuation of diffuse sources of interference
  In the post-processing, the diffuse interference and reverberation components are attenuated with the aid of the cross-correlation.
2.) Temporal signal properties:
The spectral subtraction estimates the interference during the speech pauses and performs a magnitude subtraction in the spectral domain. This attenuates the temporally stationary interference components.
3.) Averaging of the channels (addition):
Because the recording channels are spatially separated (microphones at a certain distance), errors of the spectral subtraction (distortion and "musical tones") occur partly at random times in the individual channels. Averaging the channels reduces this error.

The invention is explained in more detail using exemplary embodiments and with reference to schematic drawings.

FIG. 1 shows a block diagram of the entire method.
FIG. 2 shows a comparison of the averaged output powers Z of different methods with the power of the original noise signal (example: microphone spacing 12 cm, vehicle at 140 km/h). It shows the increasing noise reduction when the processing is carried out with one channel, with two channels, and with two channels with post-processing.

The microphone signals x and y are transformed into the frequency domain (FFT, fast Fourier transform). The segments overlap by half and are weighted with a Hanning window. Each segment is N samples long and is extended by a further N zeros. The transform length is chosen, for example, as 2N = 512. This yields the transformed segments X_l(i) and Y_l(i). The output signal z is obtained after the inverse transform and the overlapping of the segments. l denotes the block index of the segments and i the discrete frequency (i = 0, 1, 2, ..., 2N-1). The sampling rate of the signals x and y is, for example, 12 kHz.
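The segmentation described above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names `hann`, `frames` and `overlap_add` are hypothetical, a periodic Hann window is assumed (the text only names a Hanning window), and the FFT and inverse FFT that would sit between the two steps are left out.

```python
import math


def hann(n_len):
    """Periodic Hann window (ASSUMPTION: the text only names a Hanning window)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / n_len) for n in range(n_len)]


def frames(signal, n_len=256):
    """Cut half-overlapped, Hann-weighted segments and zero-pad each to 2 * n_len."""
    hop = n_len // 2
    window = hann(n_len)
    out = []
    for start in range(0, len(signal) - n_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(n_len)]
        out.append(seg + [0.0] * n_len)  # N samples followed by N zeros
    return out


def overlap_add(segments, n_len=256):
    """Recombine the (processed) 2N-long segments with a hop of N / 2."""
    hop = n_len // 2
    total = hop * (len(segments) - 1) + 2 * n_len if segments else 0
    out = [0.0] * total
    for k, seg in enumerate(segments):
        for n, value in enumerate(seg):
            out[k * hop + n] += value
    return out


signal = [1.0] * 1024
segments = frames(signal)          # 2N = 512 samples per segment
restored = overlap_add(segments)   # interior samples sum back to the input
```

With half overlap the shifted Hann windows add up to one, so an unprocessed signal is restored exactly away from the edges. The N appended zeros give a multiplication in the frequency domain room to lengthen a segment without circular wrap-around, which is the usual reason for such padding.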

In the frequency domain, the long-term mean of the magnitude spectrum is subtracted (spectral subtraction H_SPS). The short-term mean K and the long-term mean L are used to calculate a first adaptive smoothing constant β. The interference spectrum S_nn(i) is estimated with β. This adaptive smoothing constant replaces the otherwise customary speech pause detector. l denotes the block index, i the discrete frequency. For example, β_0 = 0.03 is used as the smoothing constant β_0.

$\beta_l = g_l \beta_0$

with

$g_l = \frac{2 L_{l-1}}{L_{l-1} + K_l}$

$L_l = (1 - \beta_l) L_{l-1} + \beta_l K_l$

$\hat{S}_{nn,l}(i) = (1 - \beta_l) \hat{S}_{nn,l-1}(i) + \beta_l |X_l(i)|^2$
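The recursions for g, β, L and the interference spectrum can be tried out with a short sketch. The function name `update_noise_estimate` is hypothetical, and the text does not say how the short-term mean K is formed; the sketch assumes it is the mean power of the current block.

```python
def update_noise_estimate(power, s_nn_prev, l_prev, beta0=0.03):
    """One block update of the recursions for g_l, beta_l, L_l and S_nn,l(i).

    power      -- |X_l(i)|^2 for every bin i of the current block
    s_nn_prev  -- noise power estimate of the previous block, S_nn,l-1(i)
    l_prev     -- long-term mean of the previous block, L_{l-1}
    """
    # ASSUMPTION: the short-term mean K_l is the mean power of the current block.
    k = sum(power) / len(power)
    denom = l_prev + k
    g = 2.0 * l_prev / denom if denom > 0.0 else 1.0
    beta = g * beta0
    l_new = (1.0 - beta) * l_prev + beta * k
    s_nn = [(1.0 - beta) * s + beta * p for s, p in zip(s_nn_prev, power)]
    return s_nn, l_new, g


# Block at the long-term level: g = 1, so beta equals beta0.
_, _, g_pause = update_noise_estimate([1.0, 1.0], [1.0, 1.0], l_prev=1.0)
# Much louder block (for example a speech onset): g and beta shrink,
# so the noise estimate barely moves.
s_nn, _, g_speech = update_noise_estimate([99.0, 99.0], [1.0, 1.0], l_prev=1.0)
```

Because g falls toward zero when the short-term mean exceeds the long-term mean, the update slows down by itself during loud blocks, which is how the adaptive constant can stand in for a speech pause detector.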

The interference spectrum is normalized and subtracted.

$|\hat{X}_l(i)| = |X_l(i)| - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|}$

$\hat{X}_l(i) = \left(1 - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|^2}\right) X_l(i)$
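The two forms of the subtraction rule are equivalent, which a short derivation makes explicit. The only step added here is that the phase of X_l(i) is carried over unchanged, which is what the second equation expresses.

```latex
|\hat{X}_l(i)| = |X_l(i)| - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|}
              = \left(1 - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|^2}\right) |X_l(i)|
% Keeping the phase of X_l(i), i.e. multiplying by X_l(i) / |X_l(i)|:
\hat{X}_l(i) = \left(1 - \frac{\hat{S}_{nn,l}(i)}{|X_l(i)|^2}\right) X_l(i)
```

For example, with |X| = 2 and an estimated interference power of 1, the factor is 1 - 1/4 = 0.75 and the corrected magnitude is 1.5.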

A modified form is obtained with:

$\hat{X}_l(i) = \left(1 - \alpha \sqrt{\frac{\hat{S}_{nn,l}(i)}{S_{xx,l}(i)}}\right) X_l(i)$; for $\left(1 - \alpha \sqrt{\frac{\hat{S}_{nn,l}(i)}{S_{xx,l}(i)}}\right) X_l(i) < f_0 S_{nn,l}(i)$

$\hat{X}_l(i) = f_0 S_{nn,l}(i)$; otherwise
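A hedged sketch of the modified form follows. The function name `subtract_with_floor` is hypothetical. The sketch applies the floor whenever the subtracted magnitude falls below f_0 times the interference estimate, which is the usual meaning of a spectral floor; that reading of the case distinction, the comparison on magnitudes, and the retained phase are assumptions rather than statements of the text.

```python
import math


def subtract_with_floor(x, s_nn, s_xx, alpha=1.5, f0=0.2):
    """Modified spectral subtraction for one block.

    x     -- complex spectrum X_l(i)
    s_nn  -- estimated interference power S_nn,l(i), assumed non-negative
    s_xx  -- smoothed power density S_xx,l(i)
    """
    out = []
    for xi, n, p in zip(x, s_nn, s_xx):
        # Gain with overestimation factor alpha; bins without power get gain 0.
        gain = 1.0 - alpha * math.sqrt(n / p) if p > 0.0 else 0.0
        floor = f0 * n
        mag = abs(xi)
        if gain * mag > floor:
            out.append(gain * xi)
        elif mag > 0.0:
            # ASSUMPTION: the floor value keeps the phase of X_l(i).
            out.append(floor * xi / mag)
        else:
            out.append(complex(floor))
    return out


strong = subtract_with_floor([10 + 0j], [1.0], [100.0])  # gain 0.85 is kept
weak = subtract_with_floor([1 + 0j], [1.0], [1.0])       # gain is negative, floor applies
```

The floor also catches the bins in which the overestimated interference exceeds the signal, where the plain gain would turn negative.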

The following applies to the power density S_xx,l of a channel:

$\hat{S}_{xx,l}(i) = (1 - \alpha_l) \hat{S}_{xx,l-1}(i) + \alpha_l |X_l(i)|^2$

$\alpha_l = 2 - g_l$; for $0.5 < 2 - g_l < 2.0$

$\alpha_l = 0.5$; for $0.5 > 2 - g_l$

$\alpha_l = 2$; for $2 < 2 - g_l$

f_0 is referred to as the "spectral floor". Part of the background noise is allowed through in order to create a natural auditory impression and to mask part of the "musical tones". α is an overestimation factor for the noise and serves to further reduce the residual noise. For example, f_0 = 0.2 and α = 1.5 can be chosen for these values.

In contrast to the known forms of spectral subtraction, a second adaptive smoothing with α is used to reduce a further part of the "musical tones" by smoothing the power density S_xx only slightly during speech and strongly during pauses.

The corresponding equations apply to the second channel Y.
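The second adaptive constant and the recursion for the power density can be written down literally as printed. The function names `alpha_from_g` and `update_power_density` are hypothetical, and g is assumed to come from the interference estimation step of the same block.

```python
def alpha_from_g(g):
    """Second adaptive constant: alpha_l = 2 - g_l, limited to [0.5, 2.0]."""
    return min(2.0, max(0.5, 2.0 - g))


def update_power_density(power, s_xx_prev, g):
    """S_xx,l(i) = (1 - alpha_l) * S_xx,l-1(i) + alpha_l * |X_l(i)|^2, as printed."""
    alpha = alpha_from_g(g)
    return [(1.0 - alpha) * s + alpha * p for s, p in zip(s_xx_prev, power)]


a_low = alpha_from_g(1.8)    # 2 - 1.8 = 0.2 lies below the range, so 0.5
a_mid = alpha_from_g(1.0)    # 1.0 lies inside the range
s_xx = update_power_density([4.0], [2.0], g=1.8)  # 0.5 * 2 + 0.5 * 4 = 3
```

The same two functions would be called once per block for each channel, with the power spectrum of that channel.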

The method specified in the not previously published patent application P 42 43 831 is used to calculate the linear phase shift between the useful components in the channels. This method fits seamlessly into the noise reduction method according to the invention. The phase shift is estimated at a selected number of the maxima of the cross power density, and the phase correction is achieved by multiplication in the frequency domain with the all-pass function H_ALLP.

$\hat{X}_l(i) := \hat{X}_l(i) H_{ALLP,l}(i)$

$\hat{X}_l(i) := \hat{X}_l(i) (\cos(i \varphi) + j \sin(i \varphi))$
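The effect of the all-pass multiplication can be checked on a tiny example: a phase that grows linearly with the frequency index shifts the signal in time. The sketch below assumes that φ is already known (its estimation is described in the cited application and is not reproduced here), uses the hypothetical helper names `dft` and `phase_correct`, and replaces the FFT with a naive DFT for brevity.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT, enough for a small demonstration (an FFT would be used in practice)."""
    m = len(x)
    sign = 1.0 if inverse else -1.0
    out = [
        sum(x[n] * cmath.exp(sign * 2j * cmath.pi * i * n / m) for n in range(m))
        for i in range(m)
    ]
    return [v / m for v in out] if inverse else out


def phase_correct(spectrum, phi):
    """Multiply bin i by cos(i * phi) + j * sin(i * phi), i.e. the all-pass H_ALLP."""
    return [xi * cmath.exp(1j * i * phi) for i, xi in enumerate(spectrum)]


x = [1.0, 2.0, 3.0, 4.0]
phi = 2.0 * cmath.pi / len(x)            # corresponds to a shift by one sample
shifted = dft(phase_correct(dft(x), phi), inverse=True)
result = [round(v.real, 6) for v in shifted]   # [2.0, 3.0, 4.0, 1.0]
```

The magnitudes of all bins are unchanged, which is what makes the correction an all-pass. For shifts that are not a whole number of samples, the bins above half the transform length would have to be treated as negative frequencies; the sketch follows the printed formula and ignores that detail.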

If there are more than two channels, the phase correction is carried out for each additional channel. The first channel serves as the reference.

The directional filters for the channels are calculated using a "beamforming method". Various cases can be considered as noise, and different directional filters H_R result depending on the noise situation. One set of these filters is selected; however, if the system state is known during later operation, it is possible to switch to a specific set, or the filters can be adapted continuously. The beamforming method used is, for example, the gradient method according to Frost ("An Algorithm for Linearly Constrained Adaptive Array Processing", Proc. IEEE, Vol. 60, No. 8, 1972) or according to Sondhi and Elko ("Adaptive Optimization of Microphone Arrays under a Nonlinear Constraint", Int. Conf. on ASSP, Tokyo, 1986, pp. 981-984).

In the frequency domain, the directional filtering is given by the multiplication:

$\hat{X}_l(i) := \hat{X}_l(i) H_R(i)$

Together with the directional filters, the addition of the channels yields the overall directional characteristic and the output signal

$Z_l(i) = \hat{X}_l(i) + \hat{Y}_l(i)$

In addition, the addition of the channels leads to an averaging and thus to a reduction of the statistical errors of the spectral subtraction.
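The averaging effect can be illustrated numerically. The sketch below is a toy model, not a measurement: it assumes that the residual errors of the two channels are independent, zero-mean and Gaussian, and it scales the sum by 1/2 only so that the result can be compared directly with a single channel (the scaling does not change the ratio of signal to error).

```python
import random


def error_power(values, target):
    """Mean squared deviation of the values from the error-free target."""
    return sum((v - target) ** 2 for v in values) / len(values)


rng = random.Random(0)
target = 1.0
# ASSUMPTION: independent Gaussian residual errors in the two channels.
x_hat = [target + rng.gauss(0.0, 0.5) for _ in range(20000)]
y_hat = [target + rng.gauss(0.0, 0.5) for _ in range(20000)]
z = [0.5 * (a + b) for a, b in zip(x_hat, y_hat)]  # sum, scaled for comparison

ratio = error_power(z, target) / error_power(x_hat, target)  # close to 0.5
```

Under these assumptions the error power is roughly halved. Errors that are correlated between the channels would be reduced less.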

The cross power density of the two channels is then calculated with the aid of a smoothing constant (e.g. γ = 0.3).

$S_{xy,l}(i) = (1 - \gamma) S_{xy,l-1}(i) + \gamma \hat{X}_l(i) \hat{Y}_l^*(i)$

The cross power density S_xy is normalized by the sum of the power densities S_xx and S_yy of the individual channels. This yields a modified coherence function:

$H_{KKF,l}(i) = \frac{S_{xy,l}(i)}{S_{xx,l}(i) + S_{yy,l}(i)}$; for $\frac{S_{xy,l}(i)}{S_{xx,l}(i) + S_{yy,l}(i)} > 0.3$

$H_{KKF,l}(i) = 0.3$; otherwise

with

$S_{xx,l}(i) = (1 - \gamma) S_{xx,l-1}(i) + \gamma \hat{X}_l(i) \hat{X}_l^*(i)$

$S_{yy,l}(i) = (1 - \gamma) S_{yy,l-1}(i) + \gamma \hat{Y}_l(i) \hat{Y}_l^*(i)$

The following applies to the output signal Z:

$Z_l(i) := Z_l(i) H_{KKF,l}(i)$
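The recursive power densities and the modified coherence function can be combined into a small post-filter sketch. The class name `CoherencePostfilter` is hypothetical. Since the cross power density is complex, the sketch compares its magnitude with the threshold of 0.3; the text does not state whether the magnitude or the real part is meant, so this is an assumption.

```python
class CoherencePostfilter:
    """Recursive estimates of S_xy, S_xx, S_yy and the gain H_KKF per bin."""

    def __init__(self, n_bins, gamma=0.3, floor=0.3):
        self.gamma = gamma
        self.floor = floor
        self.s_xy = [0j] * n_bins
        self.s_xx = [0.0] * n_bins
        self.s_yy = [0.0] * n_bins

    def update(self, x, y):
        """Feed one block of both channels and return H_KKF,l(i)."""
        g = self.gamma
        gains = []
        for i, (xi, yi) in enumerate(zip(x, y)):
            self.s_xy[i] = (1.0 - g) * self.s_xy[i] + g * xi * yi.conjugate()
            self.s_xx[i] = (1.0 - g) * self.s_xx[i] + g * abs(xi) ** 2
            self.s_yy[i] = (1.0 - g) * self.s_yy[i] + g * abs(yi) ** 2
            denom = self.s_xx[i] + self.s_yy[i]
            # ASSUMPTION: the magnitude of S_xy is compared with the threshold.
            h = abs(self.s_xy[i]) / denom if denom > 0.0 else self.floor
            gains.append(h if h > self.floor else self.floor)
        return gains


post = CoherencePostfilter(n_bins=2)
x = [1 + 1j, 2 + 0j]
h_same = post.update(x, x)                       # identical channels
z = [(a + a) * h for a, h in zip(x, h_same)]     # Z = X + Y, then Z * H_KKF
h_diff = CoherencePostfilter(n_bins=1).update([1 + 0j], [0j])  # no common part
```

Because the normalization uses the sum of both power densities, the gain reaches at most 0.5. For identical channels the post-filtered sum therefore equals the single-channel spectrum, while bins without a common component are scaled by the lower limit of 0.3.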

If directional filters according to the method of Sondhi and Elko are used, an inverse filter is required to correct the frequency response. This filter raises the lower frequencies, because the frequency response of the directional filters (for the desired direction, i.e. the direction of the speaker) attenuates these frequencies. This filter H_INV can be approximated in a simple manner from the calculated frequency response.

$Z_l(i) := Z_l(i) H_{INV,l}(i)$
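The text does not say how H_INV is approximated. One simple possibility, shown purely as an assumption, is a bounded inverse of the magnitude response in the direction of the speaker. The function name `inverse_filter` and the gain limit `max_gain` are hypothetical.

```python
def inverse_filter(look_response, max_gain=4.0):
    """Bounded inverse of the magnitude response toward the speaker.

    look_response -- response of the summed directional filters in the
                     direction of the speaker, one value per bin
    max_gain      -- ASSUMPTION: cap that keeps deep notches from being
                     boosted without limit
    """
    gains = []
    for h in look_response:
        mag = abs(h)
        gains.append(min(max_gain, 1.0 / mag) if mag > 0.0 else max_gain)
    return gains


h_inv = inverse_filter([0.1, 0.5, 1.0])  # low bins are lifted, capped at max_gain
```

A cap of this kind limits how much residual noise is amplified at the frequencies where the directional filters attenuate most.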

If the adaptation is carried out according to the method of Frost, no inverse filter is required, because the frequency response in the direction of the speaker has the constant value 1.

The method according to the invention is not limited to systems with two channels but can also be applied to multi-channel systems (3 or more channels).

Claims

1. Method for noise reduction for at least two disturbed speech channels, the disturbed speech channels being combined to form an output channel, characterized in

- that a steerable acoustic directional lobe is generated for the individual channels by means of digital directional filters and a linear phase estimation, which follows the movement of the speaker and thereby attenuates the spatial interference sources,

- that the interference is estimated in the individual channels during the speech pauses and the temporally stationary interference sources are attenuated by spectral subtraction,

- that the individual speech channels are then added, whereby the statistical disturbances of the spectral subtraction are averaged, and

- that the sum signal is post-processed with a modified coherence function, whereby the diffuse interference and reverberation components are attenuated.

2. Method according to claim 1, characterized in that the spectral subtraction is carried out with two adaptive smoothing constants α, β,

- that the interference spectrum S_nn is estimated with the first adaptive smoothing constant β, and

- that with the second adaptive smoothing constant α, the power density S_xx of the individual channels is smoothed strongly during speech pauses and only slightly during speech.

3. Method according to claim 1, characterized in that the linear phase shift of at least two signals is determined over a certain number of maxima of the cross power density in the frequency domain.

4. Method according to claim 1, characterized in that the phase correction, the directional filtering and any necessary inverse filtering are carried out in the frequency domain.