METHOD AND APPARATUS FOR MULTI-WAY
CONFERENCE CALLING WITH A FULL-DUPLEX
SPEAKERPHONE AND A DIGITAL WIRELESS
COMMUNICATION INTERFACE
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to speakerphones and more specifically pertains to speakerphones for multi-way conferencing which incorporate one or more digital wireless communication interfaces, such as cordless handsets.
2. Description of Related Art
A full-duplex speakerphone includes a hands-free telephone having a base with a microphone and a loudspeaker. A third interface may be added to allow a third party to participate on an equal basis with the room talker and line talker for three-way conferencing.
Such three-way conferencing machines typically operate in the analog domain and do not utilize signal processors with echo cancellers. Moreover, the devices proposed to date are believed to work in half-duplex mode but not in full-duplex mode.
Other conferencing machines may operate in full-duplex mode, but they can only accommodate two talkers and do not have a digital wireless communication interface. One such device is shown in Figure 1. A speakerphone 2 operates in the digital domain but only has a speakerphone base 10 with a microphone 12 for room talk and a loudspeaker 14, connected to a Public Switched Telephone Network (PSTN) line, for line talk. Its construction is well known in the art.
The speakerphone 2 allows a talker in a room to converse with another talker speaking on the PSTN input line. The room signal is picked up by the microphone 12. Since the room talker may vary his distance from the microphone 12, an automatic gain control (AGC) circuit 42 is required to amplify the signal from the room talker to a listenable level and to maintain the power level of the signal before it is transmitted to the PSTN output line. Similarly, the PSTN input line signal is amplified with another AGC circuit 44 to an appropriate level before it is applied to the loudspeaker 14. These amplifications are performed in the AGC circuits 42, 44 with the help of automatic gain control algorithms which keep the power of the output signals within a specified range.
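The gain control described above can be illustrated with a minimal sketch. The source does not specify the AGC algorithm, so the exponentially smoothed power estimate, the function name agc, and the parameters target_rms, smoothing and max_gain are all assumptions chosen for illustration only.

```python
import math

def agc(samples, target_rms=0.1, smoothing=0.99, max_gain=50.0):
    """Scale samples so the output RMS level approaches target_rms.

    Hypothetical sketch: the power estimator and the gain cap are
    assumptions, not the algorithm used by AGC circuits 42, 44.
    """
    power = target_rms ** 2  # start the running power estimate at the target
    out = []
    for s in samples:
        # Exponentially smoothed estimate of the input power.
        power = smoothing * power + (1.0 - smoothing) * s * s
        # Gain that would bring the estimated RMS to the target, capped
        # so that silence is not amplified without bound.
        gain = min(max_gain, target_rms / math.sqrt(power + 1e-12))
        out.append(gain * s)
    return out
```

With this sketch, a talker far from the microphone (small input amplitude) and a talker close to it (large input amplitude) both settle to roughly the same output level, which is the behavior the paragraph attributes to the AGC circuits.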
It is well known in the art that the gains in a speakerphone system can cause feedback if the electrical echo from the PSTN input line and the room acoustic echo, between the loudspeaker and the microphone, are not removed. Uncanceled echo may be perceptually irritating, and these echoes should be reduced below audible levels. Signal processing technology is able to measure room acoustic echo and line echo to cancel them automatically.
In some speakerphones, echo is canceled with two signal processing mechanisms called echo cancellers (ECs) 46, 48, one of which, the acoustic echo canceller (AEC) 46, is used for acoustic echo and the other, the line echo canceller (LEC) 48, for line echo. Each EC 46, 48 has a signal input, a feedback input connected to the output of a corresponding summing element 50, 52 and an output connected to a negative input of the same summing element 50, 52. ECs typically use linear adaptive filters, such as adaptive finite impulse response (AFIR) filters, whose coefficients are weighted in accordance with the room and line acoustics. However, it is understood that any device that filters echo may be used for echo cancellers 46, 48. Adaptive filters in the AEC 46 and LEC 48 generate a replica of the corresponding acoustic or line echo which is subtracted in the corresponding summing element 50, 52 from the signal on the positive input of the summing element 50, 52, which is either the room talk or PSTN line talk. Coefficients of the adaptive filters are updated every sample period to minimize the error signals according to an algorithm. They must be constantly updated to account for changes in the acoustic environment.
Although this speakerphone 2 works in full-duplex mode, it can only support two-way conferencing. Therefore, there is a need for a speakerphone for multi-way conference calling, equipped with one or more digital wireless communication interfaces to allow additional parties to participate on an equal basis with the room talker and line talker. Moreover, there is a need for a speakerphone system which works in full-duplex mode.
OBJECTS AND SUMMARY OF THE INVENTION
It is a primary object of the present invention to overcome the aforementioned shortcomings associated with the prior art and to provide a speakerphone for multi-way conference calling with one or more digital wireless communication interfaces.
Another object of the present invention is to provide a speakerphone system with a digital wireless communication interface which works in full-duplex mode. These as well as additional objects and advantages of the present invention are achieved by providing a full-duplex device for conference calling which allows multi-way conferencing and has at least one digital wireless communication interface. The device uses a plurality of analog-to-digital converters for sampling a room talker signal from the speakerphone microphone to be sent to an output line, and a signal from an input line.
A digital wireless communication interface is attached at the input of each echo canceller. The acoustic echo canceller receives the room talker signal, the line input signal and a signal from the digital wireless communication interface and cancels acoustic echo from the room talker signal. The line echo canceller receives the line input signal, the room talker signal and the signal from the digital wireless communication interface and cancels line echo from the line input signal. In order to support the digital wireless communication interface, the full-duplex speakerphone device is provided with a first summing device for digitally adding the signal from the digital wireless communication interface to the line input signal, a second summing device for digitally adding the signal from the digital wireless communication interface to the room talker signal, and a third summing device for digitally adding the room talker signal and the line input signal, and providing a signal to be input into the digital wireless communication interface.
In a more specific aspect of the invention, the digital wireless communication interface is preferably a spread-spectrum cordless handset. Each echo canceller is preferably an adaptive finite impulse response filter using a predetermined algorithm, preferably the least-mean-square algorithm, to provide an echo-canceled output signal.
BRIEF DESCRIPTION OF THE DRAWINGS
The objects and features of the present invention, which are believed to be novel, are set forth with particularity in the appended claims. The present invention, both as to its organization and manner of operation, together with further objects and advantages, may best be understood by reference to the following description, taken in connection with the accompanying drawings.
Figure 1 is a schematic illustration of a conventional speakerphone device.
Figure 2 is a schematic illustration of the components of a conferencing device using a speakerphone and a digital wireless communication interface, in accordance with a preferred embodiment of the present invention.
Figure 3 is a schematic illustration of the components of a conferencing device using a speakerphone and two digital wireless communication interfaces, in accordance with a preferred embodiment of the present invention.
Figure 4 is a schematic illustration of the components of a conferencing device using two speakerphones, in accordance with a preferred embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
The following description is provided to enable any person skilled in the art to make and use the invention and sets forth the best modes contemplated by the inventor(s) of carrying out the invention. Various modifications, however, will remain readily apparent to those skilled in the art, since the general principles of the present invention have been defined herein specifically to provide a device for multi-way conference calling using a speakerphone with one or more digital wireless communication interfaces.
Figure 2 is a schematic illustration of the components of a conferencing device using a speakerphone 4 and a digital wireless communication interface 20, in accordance with the preferred embodiment of the present invention. In the illustrated embodiment the speakerphone 4 is used to accommodate more signal sources than just one room talker and the line talker, and to allow a multi-way conversation. Therefore, the speakerphone base 10 is connected to one or more digital wireless communication interfaces 20. Preferably, the digital wireless communication interface 20 is a digital spread-spectrum cordless handset 20 with a microphone 22 and an earpiece 24. The cordless handset 20 is preferably a digital 900 MHz spread-spectrum cordless telephone.
The speakerphone system 4 has a receive signal path and a transmit signal path. In the receive signal path, the line signal from the PSTN input line i(t) 21 is converted to digital form in an analog-to-digital converter (ADC) 58 and any line echo is estimated by the line echo canceller 48 and subtracted from the digitized input signal i(n) 23. The residual input signal r(n) 25, ideally containing only the line talker signal, is amplified by the AGC 44, which maintains its output power at a specified level. This amplified residual input signal a(n) 26 is converted into analog form in a digital-to-analog converter (DAC) 60 and output to the loudspeaker 14.
In the transmit signal path, the signal from the room talker y(t) 27 is picked up by the speakerphone microphone 12 and converted to digital form y(n) 28 in another analog-to-digital converter (ADC) 62. Room echo is estimated by the acoustic echo canceller 46 and subtracted from the digitized room signal y(n) 28. The residual output signal z(n) 29, ideally containing only the room talker signal, is amplified by the AGC 42 to a specific level x(n) 31. The amplified residual output signal x(n) 31 is converted to an analog signal in a digital-to-analog converter (DAC) 64 and output to the PSTN output line as signal o(t) 33.
As shown in Figure 2, a signal m(n) 36 from the handset microphone 22 is added digitally into the speakerphone system 4 signals a(n) 26 and x(n) 31 with summing devices 32 and 34. Similarly, the digitized room signal y(n) 28, after room echo removal by AEC 46 and processing by AGC 42 and attenuator 66, and the PSTN input line signal i(n) 23, after line echo removal by LEC 48, are extracted from the speakerphone 4 lines and added in a summing device 40 to form a signal e(n) 38 for transmission to the handset earpiece 24. Before the summation, signals m(n) 36 and e(n) 38 are each amplified in one of the multipliers 35, 37, 39, 41. Therefore, signal m(n) 36 from the handset microphone 22 is digitally added to the signal a(n) 26, with the sum input into the acoustic echo canceller 46, and to the signal x(n) 31, with the sum input into the line echo canceller 48.
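The three summations described above can be sketched per sample as follows. The function name conference_mix and the four gain parameters are hypothetical; the source names multipliers 35, 37, 39, 41 but does not give their values, so unity gains are assumed here.

```python
def conference_mix(a, x, r, m,
                   g_m_to_room=1.0, g_m_to_line=1.0,
                   g_room_to_handset=1.0, g_line_to_handset=1.0):
    """Return (loudspeaker_feed, line_feed, e) for one sample period.

    a: amplified residual input signal a(n), line talker toward the room
    x: amplified residual output signal x(n), room talker toward the line
    r: residual input signal r(n), line talker after line echo removal
    m: handset microphone signal m(n)
    The gain arguments stand in for the multipliers and are assumptions.
    """
    loudspeaker_feed = a + g_m_to_room * m   # summing device 32, also the AEC input
    line_feed = x + g_m_to_line * m          # summing device 34, also the LEC input
    e = g_room_to_handset * x + g_line_to_handset * r  # summing device 40
    return loudspeaker_feed, line_feed, e
```

In this arrangement each of the three parties receives the sum of the other two talkers and not its own signal.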
With this connection, the acoustic echo canceller 46 can be trained by a signal from either the PSTN input line i(n) 23 or the handset 20 signal m(n) 36, and the line echo canceller 48 can be trained by either the room talker signal y(n) 28 or the handset talker signal m(n) 36 (assuming absence of double-talk). This connection cancels echoes heard in the handset 20 as well as by the room talker and PSTN line talker.
The acoustic echo canceller 46 of Figure 2 receives the digitized room talker signal y(n) 28 which has a room echo, the outside line input signal r(n) 25 after echo removal by LEC 48, and the wireless interface talker signal m(n) 36, and provides an estimate of the acoustic echo in the room talker signal y(n) 28 at its output. The summing element 50 is a difference means having a first input coupled to the acoustic echo canceller 46 output and a second input coupled to the room talker signal y(n) 28. The summing element 50 subtracts the acoustic echo estimate from the room talker signal y(n) 28 to produce an echo-canceled room talker signal z(n) 29. The line echo canceller 48 performs the same operation on the outside line input signal i(n) 23, which has line echo, using the summing element 52, and produces an echo-canceled outside line input signal r(n) 25.
Each EC 46, 48 uses a training algorithm for the adjustment of its coefficients. In the present invention it is preferably the least-mean-square (LMS) algorithm, which is an implementation of the steepest descent method. It could also be a variant of the LMS algorithm, preferably the variant with partial block updating. The operation of these algorithms is well known in the art. Any other filtering algorithm can also be used if it provides quick convergence and filter stability.
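As an illustration of the LMS adjustment described above, the following is a minimal sketch of an adaptive FIR echo canceller. The function name lms_echo_cancel, the tap count, the step size mu, and the three-tap echo path used in the demonstration are all assumptions made for this example and are not taken from the source.

```python
import random

def lms_echo_cancel(reference, mic, taps=8, mu=0.05):
    """Adaptive FIR echo canceller trained with the LMS rule.

    reference: samples driving the echo path (e.g. the loudspeaker feed)
    mic: samples containing the echo to be removed
    Returns (residual, w): the echo-canceled signal and final coefficients.
    """
    w = [0.0] * taps      # filter coefficients
    hist = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for ref, d in zip(reference, mic):
        hist = [ref] + hist[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, hist))  # echo replica
        err = d - echo_est                                  # summing element
        # LMS update: step along the negative gradient of err**2.
        w = [wi + mu * err * xi for wi, xi in zip(w, hist)]
        residual.append(err)
    return residual, w

# Demonstration with a made-up echo path and a white training signal.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
echo_path = [0.5, -0.3, 0.2]
mic = [sum(h * reference[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(reference))]
residual, w = lms_echo_cancel(reference, mic)
```

In the demonstration there is no near-end talker, so the residual should decay toward zero as the leading coefficients approach the assumed echo path.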
In order to train an EC, there must be a substantial broadband input signal and negligible disturbance to the echo it seeks to cancel. In the conventional speakerphone configuration 2, only one EC can be trained at any given time due to the above constraints. In the speakerphone 4 of the present invention, both echo cancellers 46, 48 may be trained at once using only the signal from the handset microphone 22, m(n) 36, as a training signal, if there is no room talk or PSTN line talk. Additionally, AEC 46 can be trained when there is a significant line input signal i(n) 23 and negligible room talker signal y(n) 28. Similarly, training of the LEC 48 is not allowed when talk is detected on the PSTN input line i(n) 23.
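The training conditions above can be summarized as a small decision function. This is a sketch only: the boolean talk-detection flags are assumed to be supplied by detectors the source does not describe, and the function name training_permissions is hypothetical.

```python
def training_permissions(room_talk, line_talk, handset_talk):
    """Return (train_aec, train_lec) for the current talk state.

    Each flag is True when significant talk is detected from that source.
    How talk is detected is outside the scope of this sketch.
    """
    # The AEC is excited by the loudspeaker feed (line talk or handset talk);
    # room talk disturbs the acoustic echo it is trying to model.
    train_aec = (line_talk or handset_talk) and not room_talk
    # The LEC is excited by the line output feed (room talk or handset talk);
    # line talk disturbs the line echo it is trying to model.
    train_lec = (room_talk or handset_talk) and not line_talk
    return train_aec, train_lec
```

With only the handset talker active, both flags are true, which corresponds to the case in which both echo cancellers are trained at once from m(n) 36.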
The device of the present invention works in four modes: handset mode, speakerphone mode, intercom mode and conference mode. In the speakerphone mode the handset 20 is not used and two-way conferencing is performed between a line talker and a room talker. In the handset mode, the handset 20 replaces the speakerphone base 10 as the source and destination of signals to the PSTN output line and from the PSTN input line. The AEC 46 is disabled in handset mode and there is no need for cancellation of acoustic echo because there is negligible acoustic echo from the handset 20. Moreover, since the level of the handset 20 output signal m(n) 36 does not vary much, this connection does not need an AGC circuit. In the intercom mode, there is no PSTN input signal i(t) 21 and output signal o(t) 33 and the AGC 44 is not used. The room signal y(n) 28 is directed to and received from the cordless handset 20 and instead of the line echo there is a small acoustic coupling between the earpiece 24 and the microphone 22 in the handset 20. In conference mode the speakerphone 4 of the illustrated invention operates as previously described. The handset microphone 22 signal m(n) 36 is digitally added to the amplified residual input signal a(n) 26 in the summing device 32, positioned before the AEC 46. The handset microphone 22 signal m(n) 36 is also digitally added to the amplified residual output signal x(n) 31 in the summing device 34, positioned before the LEC 48. AEC 46 and LEC 48 cancel acoustic and line echo. Amplified residual output signal from the room x(n) 31 and the residual input signal r(n) 25 are added in the summing device 40 and output to the handset earpiece 24 as the signal e(n) 38.
In practice, in conference or speakerphone mode, echo cancellers 46, 48 do not provide perfect cancellation. Usually, there is a leakage of the signal from the transmit path to the receive path and a potential feedback loop, i.e., a gain loop, is formed around the elements of the transmit path 12 to 33 and elements of the receive path 21 to 14. In the gain loop, AGC circuit 42 in the transmit path amplifies the signal z(n) 29 to be sent to the PSTN output line as o(t) 33. Also, AGC circuit 44 performs the same function for the receive path. Therefore, the gain around the gain loop may become greater than unity at some frequencies and the speakerphone 4 is prone to oscillation. This is not a problem in handset mode, since there is not enough coupling from the earpiece 24 to the microphone 22 to yield an unstable gain loop.
In order to prevent the oscillation and stabilize the system, it may be necessary to attenuate the gain in the gain loop by introducing appropriate losses into the gain loop if the gain is too large. The attenuation is applied to the path of the gain loop opposite to the one carrying the active talk. For example, if there is no talk received from the PSTN input line i(t) 21 but there is room talk y(t) 27, the attenuation is applied to the receive path of the gain loop. Similarly, if there is no talk signal y(t) 27 received from the room but there is the input line talk signal i(t) 21, the attenuation is applied to the transmit path of the gain loop. Accordingly, the present invention includes attenuation of the gain loop signal. Therefore, two stability attenuators 66, 68, one in each path of the gain loop, are used to reduce gain during the time when there is no talk in that path and the amplification is not needed. These stability attenuators 66, 68 are digital multipliers, each positioned after the respective AGCs 42, 44, and may be dynamically adjusted by an algorithm to stabilize the gain loop while attenuating active talkers as little as possible.
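The attenuator selection described above can be sketched as follows. The function name stability_gains and the loss value are hypothetical, and the handling of the cases in which both or neither party is talking (splitting the loss evenly between the two paths) is an assumption, since the source leaves the adjustment algorithm open.

```python
def stability_gains(room_talk, line_talk, loss=0.25):
    """Return (transmit_gain, receive_gain) for attenuators 66 and 68.

    loss is the total linear attenuation to insert into the gain loop;
    its value here is an assumption for illustration.
    """
    if room_talk and not line_talk:
        return 1.0, loss   # room talker active: attenuate the receive path
    if line_talk and not room_talk:
        return loss, 1.0   # line talker active: attenuate the transmit path
    # Assumed policy for double-talk or silence: split the loss so that
    # the product of the two gains still equals the required loop loss.
    split = loss ** 0.5
    return split, split
```

In every case the product of the two gains equals the chosen loss, so the loop gain is reduced by the same amount while the active talker's own path is left unattenuated where possible.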
The handset 20 itself does not introduce significant level variation into the input signal m(n) 36 because the signal level for a given person is fairly constant and the microphone 22 of the handset is held closer to the person's mouth than the microphone 12 of the speakerphone 4. Therefore, there is no need to attenuate the amplified input signal m(n) 36 in the handset mode since there is no feedback path. Moreover, the present invention does not need additional AGC circuits for the handset-to-room and room-to-handset signals. Furthermore, the PSTN-handset two-way connection does not need any AGC circuits because the handset 20 can be easily matched to the levels of the PSTN line input signal i(t) 21 and output signal o(t) 33 required by existing standards.
Another way to prevent oscillations in the gain loop and avoid the possible feedback would be to break the gain loop by using the system in half-duplex mode. Since in this mode only one party can be heard at a time and the other party cannot interrupt, this solution produces an inferior device. However, at the startup period at the beginning of the speakerphone mode, since the ECs 46, 48 have to be trained and no echo cancellation is provided, the system may have to be started and run for a short period of time in half-duplex mode until the training is accomplished. During the half-duplex period, the ECs 46, 48 are trained whenever one, but not both, of the room and PSTN talkers is active. AEC 46 is trained during PSTN input line talk or handset talk and LEC 48 is trained during room talk.
The principles of the invention were proven by simulation methods. In a simulation model, ECs 46, 48 were AFIR filters running the LMS algorithm with partial block updating. The AEC 46 had 520 taps and a block size of 26, and the LEC 48 had 168 taps and a block size of 12.
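The source does not define partial block updating. One common reading, assumed here, is a sequential partial update in which only one block of coefficients is adjusted per sample period and the blocks are visited in rotation, which reduces the per-sample update cost. Under that reading the reported sizes divide evenly: 520 / 26 = 20 blocks for the AEC and 168 / 12 = 14 blocks for the LEC. The function name, step size and the small demonstration sizes below are hypothetical.

```python
import random

def partial_block_lms(reference, desired, taps, block, mu):
    """LMS in which one block of coefficients is updated per sample.

    Assumed scheme: blocks of `block` adjacent taps are updated in rotation.
    Returns (residual, w).
    """
    if taps % block != 0:
        raise ValueError("taps must be a multiple of block")
    w = [0.0] * taps
    hist = [0.0] * taps   # most recent reference samples, newest first
    start = 0             # first tap of the block to update this sample
    residual = []
    for ref, d in zip(reference, desired):
        hist = [ref] + hist[:-1]
        # The full filter output is still computed every sample.
        err = d - sum(wi * xi for wi, xi in zip(w, hist))
        # Only the current block of coefficients is adjusted.
        for k in range(start, start + block):
            w[k] += mu * err * hist[k]
        start = (start + block) % taps
        residual.append(err)
    return residual, w

# Small demonstration (sizes far below the 520-tap and 168-tap filters).
rng = random.Random(1)
reference = [rng.uniform(-1.0, 1.0) for _ in range(6000)]
echo_path = [0.4, 0.0, -0.2, 0.1, 0.0, 0.05]  # made-up echo path
desired = [sum(h * reference[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
           for n in range(len(reference))]
residual, w = partial_block_lms(reference, desired, taps=12, block=4, mu=0.05)
```

Because each coefficient is adjusted only once per rotation, convergence under this scheme is expected to be slower than with a full update by roughly the number of blocks.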
Although Figure 2 only shows one handset 20, connections for additional handsets may be applied in the same way as discussed above. In an alternative embodiment of the present invention, in a speakerphone 6 shown in Figure 3, an additional handset 70 receives at its earpiece 74 the signal m(n) 36 from the first handset 20 added in a summing element 76 to the signal e(n) 38. The output from the handset microphone 72 is added to the signal e(n) 38 in the summing element 40 for transmission to the first handset earpiece 24. The output from the handset microphone 72 is added to the input signal m(n) 36 in a summing element 78 for transmission to the multiplier 41, and replaces the output of the first handset 36 as the input to the multiplier 35.
In another alternative embodiment of the present invention, in a speakerphone system 8 shown in Figure 4, an additional speakerphone 80, which is preferably placed in another room, replaces the handset earpiece 24 and microphone 22. The signal from the second speakerphone microphone 82 is digitized in an analog-to-digital converter 83. Room echo is estimated by the acoustic echo canceller 81 and subtracted from the digitized room signal. The residual output signal is amplified by an AGC 84 and sent through a stability attenuator 85 to the amplifiers 41 and 35 in the first speakerphone. Signal e(n) 38 is amplified in an AGC 86 and sent through a stability attenuator 87 to a digital-to-analog converter 88 to be output to a loudspeaker 89 of the second speakerphone.
The present invention, though applicable to any digital cordless telephone, is believed to be especially applicable to the telephones with digital spread spectrum. The third and additional parties with cordless handsets may be located in the same room or be positioned at another location in the vicinity of the room with the speakerphone 4. It is understood that the principles of this invention may be applied to other digital devices which work in full-duplex mode, like digital telephone answering devices.
Those skilled in the art will appreciate that various adaptations and modifications of the just-described preferred embodiment can be configured without departing from the scope and spirit of the invention. For instance, while the present invention has been described for three-way use, it is understood that the speakerphone 4 of the present invention could be modified to handle any number of users by incorporating additional digital wireless communication interfaces 20. Furthermore, the digital wireless communication interface 20, when located in a room away from the speakerphone 4, may be substituted with another speakerphone. This speakerphone only needs one echo canceller, AEC 46, because there is no other outside line and no other line echo signal. Therefore, it is to be understood that, within the scope of the appended claims, the invention may be practiced other than as specifically described herein.