US20020021799A1 - Multi-device audio-video combines echo canceling - Google Patents
- Publication number
- US20020021799A1 (application US09/928,553)
- Authority
- US
- United States
- Prior art keywords
- facilities
- speech
- echo canceling
- recognizing
- canceling
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Abstract
A multi-device audio-video system contains speech recognizing facilities and echo canceling facilities. In particular, plural and functionally separate such speech recognizing facilities and echo canceling facilities are present. The echo canceling facilities combine their forces: one or more of them cancels one or more mutually unique cancelable speech entities, and the cancelled entities are combined for overall non-recognition by the system.
Description
- The invention relates to a method for operating a multi-device audio-video system that contains speech recognizing and echo canceling facilities. More particularly, the invention relates to a method as recited in the preamble of claim 1. Speech recognition has come into wide use, including applications in consumer systems for the general market. The echo canceling in this respect functions on an operational level in that a particular device will not recognize speech that it is presently producing itself. A human or other external user must nevertheless receive the full spectral sound being produced by the device. Thus, the canceling is effected internally in the device, whereby the sound emitted by the device itself is functionally blocked from consideration. Now, systems may be composed of various devices that each may have to recognize certain speech items from the user, it being impossible, however, to predict which items should not be recognized. In particular, the problem is aggravated in that the various devices of a particular system may come from different manufacturers. In other cases, devices may be combined that had never been intended to be operated as a combination. Devices originating from the same manufacturer or from different manufacturers may contain various audio sources.
- In consequence, amongst other things, it is an object of the present invention to provide a method for operating a multi-device system, wherein echo canceling has been designed on the level of the various devices, but is operative on the level of the comprehensive system.
- Now therefore, according to one of its aspects, the invention is characterized according to the characterizing part of claim 1.
- The invention also relates to a multi-device system so operated as claimed in claim 8. The invention also relates to a speech-enhanced device for use in a system according to the invention, as claimed in claim 15. Further advantageous aspects of the invention are recited in the dependent claims.
- These and further aspects and advantages of the invention will be discussed more in detail hereinafter with reference to the disclosure of preferred embodiments, and in particular with reference to the appended Figures that show:
- FIG. 1, a general speech-enhanced device for use with the present invention;
- FIG. 2, a multi-device speech-enhanced system with distributed automatic speech recognition (ASR) and distributed automatic echo canceling (AEC);
- FIG. 3, ditto with distributed ASR and distributed AEC in a star configuration;
- FIG. 4, ditto with distributed ASR and centralized AEC;
- FIG. 5, ditto, with centralized ASR and centralized AEC;
- FIG. 6, ditto with centralized ASR and distributed AEC;
- FIG. 7, ditto with distributed ASR and distributed AEC in an advanced setup.
- FIG. 1 illustrates a general speech-enhanced device 20 for use with the present invention. For simplicity, the prime user-directed functionality has been played down. Such functionality may, without any express or implied limitation, represent an audio or audio-video tuner, an audio player, an audio or audio-video recorder, or an audio or audio-video composer. In contradistinction, the detailing of the Figure has been limited to the control functionality. Generally, user control inputting has been immediate, as symbolized by the ingoing line of bi-directional line pair 46; such control may be mechanical through user buttons or the like, or remote through IR signaling or the like. The outputting of control signalizations has been through lamps or other visual display indicators, through text display, buzzers, and the like. Furthermore, control signalizations may be exchanged through line pair 46 with other connected audio-video devices.
- Item 30 represents the user functionality of the general speech-enhanced device; it receives external control from lines 46, and optionally produces audio on output 46 for general usability, such as broadcast audio, and on line 38 for other purposes as will be discussed hereinafter. The latter is sent via addition mechanism 32 to loudspeakers 48. Item 22 represents a Voice-Controlled User Interface (VCUI) that may produce feedback on line 34 to addition mechanism 32, thereby canceling feedback sounds from being output on loudspeakers 48. Otherwise, item 22 may produce non-audio output on interface 46 for external usage, or for controlling device 30.
- Speech input by an operator to the device may be done on microphone 28. The speech so received can be output on the outgoing line of line pair 42. It may also be used as an alternative to speech received on the ingoing line of line pair 42 for communicating to Automatic Echo Canceller block 26. The latter will output a speech signal on the outgoing channel of bi-directional channel 40. This speech signal closely corresponds to the speech signal received on microphone 28, from which, however, any audio signal output by the device via item 48 illustrated in FIG. 1 has been deleted to a great extent. Such audio signal has been received on a dedicated channel indicated by 60 in the Figure. The speech signal so corrected for the audio output of the device itself can either be output on the outgoing channel of bi-directional speech channel 40, or rather be sent to the input of speech recognition item 24. The latter may alternatively select to receive externally transmitted speech on the ingoing channel of bi-directional speech channel 40. Item 24 will recognize the speech so received according to a strategy that, without limitation, may be conventional. The recognition result may be output as text on the outgoing channel of bi-directional channel pair 44, or may be forwarded to Voice-Controlled User Interface item 22. The latter may alternatively receive externally input text along the ingoing channel of bi-directional channel pair 44. The VCUI module 22 can produce further control signals as discussed earlier, produce audio output for feeding to loudspeaker boxes 48, or output video display, which has not been discussed for brevity. Still further, the VCUI module may generate a selective disable signal on line 36 for any or all of modules 24, 26, 28, 48 for application in cascaded architectures. The usage thereof will be discussed in detail hereinafter.
- In the various embodiments, certain elements of the device of FIG. 1 may be left out. In particular, line pair 44 is optional, the line out in line pair 42 may be left out, and certain other elements are not really necessary in one or more of the embodiments shown hereinafter. However, the microphone-in line in line pair 42 will be of great use in FIGS. 6 and 7 (cf. connection 100 in particular).
- FIG. 2 illustrates a multi-device speech-enhanced system with distributed automatic speech recognition (ASR) and distributed automatic echo canceling (AEC). The system has been illustrated as a combination of an audio set and a TV, although various other multi-device systems may be configured, including the usage of more than two devices. In all subsequent Figures, a two-channel parallel setup, such as for stereo audio, or a multi-channel setup, such as for surround sound and other sophisticated reproduction techniques, may be used, without separate indication in the Figures of the various channels. Now, each device will need its own software layer for the VC user interface. However, with such functionality built into various independent devices, the voice control may effectively fail when both devices are playing simultaneously. A brute-force remedy for stereo application would be to feed all four channels, two for each device, and to execute echo canceling in each device separately. Internally, each device will then require at least five channels, if a microphone channel is also required. If the number of channels rises further, the problem grows rapidly. Furthermore, each device must have enough processing power to execute at least fourfold echo canceling. The different devices must furthermore be connected to each other. Obviously, the solution so recited is both hardware and software intensive, and as such both expensive and prone to errors and malfunctioning.
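The per-device pipeline of FIG. 1 (microphone input, cancellation of the device's own loudspeaker output, then recognition) can be illustrated with a toy numerical sketch. This is not part of the patent: the one-tap LMS echo-path estimate and all signal names are assumptions made purely for illustration.

```python
# Toy sketch (assumed, not from the patent): one speech-enhanced device
# cancels its own loudspeaker playback from the microphone signal before
# passing it to the recognizer. The acoustic echo path is modeled as a
# single gain; a least-mean-squares (LMS) filter adapts a one-tap estimate.
import random

random.seed(0)

def lms_cancel(mic, reference, mu=0.05):
    """Subtract an adaptive estimate of `reference` leaking into `mic`."""
    w = 0.0                      # one-tap estimate of the echo path
    residual = []
    for m, x in zip(mic, reference):
        e = m - w * x            # cancel the current echo estimate
        w += mu * e * x          # LMS update toward the true path
        residual.append(e)
    return residual

# The device's own playback, known internally (cf. dedicated channel 60).
playback = [random.uniform(-1, 1) for _ in range(2000)]
echo_path = 0.8                  # unknown acoustic coupling to the microphone
user_speech = [0.0] * 2000       # silence, to measure pure echo suppression
mic = [s + echo_path * x for s, x in zip(user_speech, playback)]

residual = lms_cancel(mic, playback)
# After adaptation, the residual echo is far smaller than the raw echo.
print(sum(e * e for e in residual[-500:]) < 0.01 * sum(m * m for m in mic[-500:]))
```

With more devices and channels, each device would need one such adaptive filter per reference channel, which is exactly the channel-count burden that the brute-force remedy above runs into.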
- In this respect, FIG. 3 illustrates the configuration of FIG. 2 enhanced with an interconnection pattern in a star configuration. The requirements are network interconnection, audio out, and multiple-channel automatic echo canceling. Note that the requirements will grow rapidly if more than two devices make up the system, or if the number of audio channels used for audio rendering grows, such as for effecting above-HIFI quality. It is recognized that in many situations such required technical facilities would prove to be excessive.
- Now, a more straightforward solution uses only a single loudspeaker arrangement, in which only a single device outputs all sounds generated by any of the devices in the system.
- The further Figures illustrate various non-limiting embodiments of systems according to the invention. In this respect, FIG. 4 shows such a system with distributed ASR and central AEC. Now, only canceling of a single n-channel audio signal is needed, wherein n may have any realistic integer value. The wiring may often be quite simple, such as by connecting the TV audio-out to an auxiliary audio input that is often present on audio sets. Additionally, however, after AEC the speech signal must be transferred to the “line in” of the other device(s) to recognize the cleaned-up signal. The speech UI remains in fact separate in each device. Additionally, further input channels may be used for future beamforming technology, which requires multiple microphones and associated extra input channels. The system illustrated in the Figure is in the context of a VCR hooked up to a television set. The requirements for this approach are: speech out after echo canceling, speech in before automatic speech recognition, disable AEC, disable microphone, and two-channel audio out. Note that in the VCR box the subsystems AEC, microphone, and loudspeakers are not operational, through the selective blocking in the device of FIG. 1 as incorporated in the VCR, and as indicated by their light printing.
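The centralized-AEC idea of FIG. 4 can be sketched in the same toy fashion (again an illustration under stated assumptions, not the patent's implementation): one device holds a reference for every audio channel in the system and cancels them all from the microphone in a single multi-channel step, after which the cleaned speech could be passed to the other devices' recognizers via their "line in".

```python
# Toy sketch (assumed): centralized multi-channel echo canceling as in
# FIG. 4. Two stereo devices give four reference channels, all wired to
# the one device that performs AEC; a one-tap LMS weight per channel
# estimates each channel's coupling into the shared microphone.
import random

random.seed(1)

N = 2000
refs = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(4)]
paths = [0.5, -0.3, 0.7, 0.2]   # unknown coupling of each channel to the mic
mic = [sum(p * r[i] for p, r in zip(paths, refs)) for i in range(N)]

w = [0.0] * 4                   # one adaptive tap per reference channel
mu = 0.03
clean = []
for i in range(N):
    e = mic[i] - sum(w[c] * refs[c][i] for c in range(4))
    for c in range(4):
        w[c] += mu * e * refs[c][i]
    clean.append(e)

# `clean` is the speech signal that would be handed to the recognizers;
# the learned weights approach the true per-channel echo paths.
print(all(abs(w[c] - paths[c]) < 0.05 for c in range(4)))
```

The design point is that the four-channel cancellation happens exactly once, in one device, instead of being replicated inside every device as in the brute-force setup of FIG. 2.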
- FIG. 5 illustrates a system with centralized ASR and centralized AEC, which may boil down to using a central Speech Control Box (SCB). A possible platform may be realized in a set-top box. The organization realizes all advantages of the FIG. 4 configuration. Moreover, only a single speech recognizer mechanism is needed. The most apparent advantage in a user environment is the inherent absence of multiple recognizers in a single room, and furthermore, the possibility of improved control of various different devices and possible extension to a more powerful system. For simplicity, the Figure is limited to only two devices, each with 2-channel AEC. Requirements now are: a bi-directional control link for each device, which can readily be effected through a network such as a HAVi network, audio out, and possibly additional audio inputs for still another audio device. As far as present in the Audio Set and TV devices, all elements depicted in FIG. 1, except the Audio Set's loudspeakers, will be disabled, as indicated by their having been left out from the Figure.
- Now, in the setup of FIG. 5, one of the connected devices will still play the final audio via a two-channel output, which is usually effected by the audio device itself. This forces the user to connect all other devices directly to a single audio output device. With distributed AEC, this option may be visualized as only a minor change to the SCB architecture, which will allow different speech-enhanced audio devices to each play their own audio. Acoustic echo cancellation is then done for all devices in a distributed manner, and therefore sequentially in each separate device.
- Technically, we are now using two or more ASR-AEC devices with two channels each in order to cancel two or more audio channels. For example, a speech-enhanced audio set and a speech-enhanced television set may each have their own audio output, whereas the various stereo channels will be echo-cancelled in sequence. The final and clean speech signal is used in the central SCB in order to control the various devices. Now, there are various different speech signals, all of which may be distorted. Furthermore, the delay incurred through executing the various steps in sequence may also cause problems.
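The sequential, distributed cancellation just described can be sketched as a cascade (a toy model under assumed signal names, not from the patent): each device's stage cancels only that device's own playback and hands the residual to the next device, so no device needs references for the others' audio.

```python
# Toy sketch (assumed): echo canceling executed in sequence, one stage per
# device, as in the distributed-AEC arrangements. Each stage holds a
# one-tap LMS estimate of its own device's echo path only.
import random

random.seed(2)

N = 2000
mu = 0.05

def make_stage():
    """One device's AEC stage: cancels that device's own playback only."""
    state = {"w": 0.0}
    def stage(sample_in, ref):
        e = sample_in - state["w"] * ref
        state["w"] += mu * e * ref
        return e
    return stage

# Each device plays its own audio; couplings to the shared microphone differ.
playbacks = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(2)]
paths = [0.6, 0.4]
mic = [sum(p * pb[i] for p, pb in zip(paths, playbacks)) for i in range(N)]

stages = [make_stage() for _ in range(2)]
residual = []
for i in range(N):
    s = mic[i]
    for stage, pb in zip(stages, playbacks):   # devices cancel in series
        s = stage(s, pb[i])
    residual.append(s)

# The cascaded residual is far smaller than the raw microphone signal.
print(sum(e * e for e in residual[-500:]) < 0.05 * sum(m * m for m in mic[-500:]))
```

The sketch also hints at the drawbacks noted above: each stage initially adapts against the other devices' audio as noise, and the serial pass adds delay.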
- In this respect, FIG. 6 illustrates another system embodiment comprising audio, TV, and SCB, with centralized ASR and distributed AEC, thus mitigating several of the above disadvantages. Particular requirements now include: speech out after echo canceling, disable ASR, disable AEC, disable microphone, line in, and a bidirectional control link for each device, which may again be realized through a network. As shown, in the audio device the ASR has been selectively disabled. Furthermore, in the TV, the ASR and microphone have been selectively disabled. Still further, in the SCB device, the microphone and AEC have been disabled. In this setup, both the audio device and the television set may use their loudspeakers as shown.
- In particular, the SCB may be replaced by only the connected devices, where the clean speech signal is fed back to all other devices. This in fact leads to a system that resembles the option of FIG. 2, which, although perhaps a less obvious choice, could nevertheless be a very practical one. From a packaging point of view, the key idea is to introduce robust ASR technology without the immediate need to connect all devices, and without the obligation to use exclusively the audio device for outputting the sound. This leads in fact to the option of FIG. 7 with distributed ASR and distributed AEC in an advanced setup. This scheme has the following functional requirements: speech out after automatic echo canceling, disable microphone, and line in. As shown, the TV set has its microphone selectively disabled.
Claims (17)
1. A method for operating a user-interactive multi-device audio-video system that contains user speech recognizing facilities and echo canceling facilities for avoiding the recognizing of speech output from the system as user speech,
characterized in that in the presence of plural and functionally separate such speech recognizing facilities and echo canceling facilities, driving the echo canceling facilities to combine their forces for by one or more thereof canceling one or more mutually unique cancelable speech entities and combining such cancelled entities for overall non-recognition by the system.
2. A method as claimed in claim 1, wherein such combining operates by arranging various echo canceling facilities in series (FIGS. 6, 7).
3. A method as claimed in claim 2, and from said series arrangement feeding the speech recognizing facility in a centralized manner (FIG. 6).
4. A method as claimed in claim 2, and from said series arrangement feeding various speech recognizing facilities in a distributed manner (FIG. 7).
5. A method as claimed in claim 1, wherein such combining operates by centralizing said echo canceling facilities in the system and therefrom feeding various speech recognizing facilities in a distributed manner (FIG. 4).
6. A method as claimed in claim 1, wherein such combining operates by centralizing said echo canceling facilities and speech recognizing facilities in a joint control facility (FIG. 5).
7. A method as claimed in claim 1, wherein such combining operates by arranging various echo canceling facilities in a centralized control device (FIG. 4) and therefrom feeding various speech recognizing facilities in parallel.
8. A multi-device audio-video system that contains speech recognizing facilities and echo canceling facilities for avoiding the recognizing of speech output from the system as user speech,
characterized in that in the presence of plural and functionally separate such speech recognizing facilities and echo canceling facilities, the echo canceling facilities are arranged to combine their forces through joint canceling means for canceling one or more mutually unique cancelable speech entities and combining means for combining such cancelled entities for overall non-recognition by the system.
9. A system as claimed in claim 8, wherein such combining means include a serial arrangement that arranges various echo canceling facilities in series (FIGS. 6, 7).
10. A system as claimed in claim 9, arranged for, from said series arrangement, feeding the speech recognizing facility in a centralized manner (FIG. 6).
11. A system as claimed in claim 9, arranged for, from said series arrangement, feeding various speech recognizing facilities in a distributed manner (FIG. 7).
12. A system as claimed in claim 8, wherein such combining means have said echo canceling facilities centralized in a control device and are arranged for feeding various speech recognizing facilities in a distributed manner (FIG. 4).
13. A system as claimed in claim 8, wherein such combining means are arranged for centralizing said echo canceling facilities and speech recognizing facilities in a joint control facility (FIG. 5).
14. A system as claimed in claim 8, wherein such combining means are arranged for centralizing various echo canceling facilities (FIG. 4) and therefrom feeding various speech recognizing facilities in parallel.
15. A speech enhanced device for use in a system as claimed in claim 8 and having speech recognizing facilities and echo canceling facilities for avoiding the recognizing of speech output from the device as user speech,
being characterized by having speech input/output means interposed between said interconnected speech recognizing and echo canceling facilities, for intercoupling another such device.
16. A device as claimed in claim 15, and having control means for selectively disabling one or more of said speech-recognizing facilities, said echo canceling facilities and audio output facilities of the device.
17. A device as claimed in claim 15, and having microphone out means and furthermore control means for selectively controlling one or more of said speech recognizing facilities, said echo canceling facilities, and said microphone out means.
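The serial arrangement of claims 9-10 (FIG. 6) chains one echo canceler per device, so that each device's own audio output is removed in turn from the shared microphone signal before the cleaned signal feeds a centralized speech recognizer. The patent does not specify a cancellation algorithm, so the following is only an illustrative sketch: the single-gain `EchoCanceler` stands in for a real adaptive filter (e.g. NLMS), and all names, signals, and gain values are hypothetical.

```python
class EchoCanceler:
    """Toy single-tap canceler: subtracts a scaled copy of one device's
    loudspeaker reference from the incoming signal. A real canceler would
    adapt an FIR filter to the actual echo path."""
    def __init__(self, gain: float):
        self.gain = gain  # assumed echo-path attenuation for this device

    def cancel(self, signal: list[float], reference: list[float]) -> list[float]:
        return [s - self.gain * r for s, r in zip(signal, reference)]


def serial_cancel(mic: list[float],
                  device_outputs: list[list[float]],
                  gains: list[float]) -> list[float]:
    """Chain one canceler per device in series (FIG. 6 style): the output
    of each stage is the input of the next, so each device's echo is
    removed before the result feeds the central speech recognizer."""
    signal = mic
    for reference, gain in zip(device_outputs, gains):
        signal = EchoCanceler(gain).cancel(signal, reference)
    return signal


# Example: user speech overlaid with echoes from two devices (TV, radio).
speech = [1.0, 1.0, 1.0]
tv_out = [0.5, 0.0, 0.5]     # TV loudspeaker reference signal
radio_out = [0.0, 0.8, 0.0]  # radio loudspeaker reference signal
mic = [s + 0.6 * t + 0.4 * r for s, t, r in zip(speech, tv_out, radio_out)]

clean = serial_cancel(mic, [tv_out, radio_out], gains=[0.6, 0.4])
print(clean)  # approximately [1.0, 1.0, 1.0]: only the user speech remains
```

The distributed variants of claims 11-14 differ only in where the stages and recognizers sit: the same per-device cancellation runs either inside each device or centralized in one control device, with one or several recognizers fed from the cleaned signal.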
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP00202856.1 | 2000-08-15 | ||
EP00202856 | 2000-08-15 |
Publications (1)
Publication Number | Publication Date |
---|---|
US20020021799A1 true US20020021799A1 (en) | 2002-02-21 |
Family
ID=8171920
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/928,553 Abandoned US20020021799A1 (en) | 2000-08-15 | 2001-08-13 | Multi-device audio-video combines echo canceling |
Country Status (6)
Country | Link |
---|---|
US (1) | US20020021799A1 (en) |
EP (1) | EP1312078A1 (en) |
JP (1) | JP2004506944A (en) |
KR (1) | KR20020040850A (en) |
CN (1) | CN1190775C (en) |
WO (1) | WO2002015169A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1314000C (en) * | 2004-10-12 | 2007-05-02 | 上海大学 | Voice enhancing device based on blind signal separation |
CN107396158A (en) * | 2017-08-21 | 2017-11-24 | 深圳创维-Rgb电子有限公司 | A kind of acoustic control interactive device, acoustic control exchange method and television set |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5548681A (en) * | 1991-08-13 | 1996-08-20 | Kabushiki Kaisha Toshiba | Speech dialogue system for realizing improved communication between user and system |
US5583965A (en) * | 1994-09-12 | 1996-12-10 | Sony Corporation | Methods and apparatus for training and operating voice recognition systems |
US5657425A (en) * | 1993-11-15 | 1997-08-12 | International Business Machines Corporation | Location dependent verbal command execution in a computer based control system |
US5761638A (en) * | 1995-03-17 | 1998-06-02 | Us West Inc | Telephone network apparatus and method using echo delay and attenuation |
US5765130A (en) * | 1996-05-21 | 1998-06-09 | Applied Language Technologies, Inc. | Method and apparatus for facilitating speech barge-in in connection with voice recognition systems |
US5867495A (en) * | 1996-11-18 | 1999-02-02 | MCI Communications Corporation | System, method and article of manufacture for communications utilizing calling plans in a hybrid network |
US6006108A (en) * | 1996-01-31 | 1999-12-21 | Qualcomm Incorporated | Digital audio processing in a dual-mode telephone |
US6061653A (en) * | 1998-07-14 | 2000-05-09 | Alcatel Usa Sourcing, L.P. | Speech recognition system using shared speech models for multiple recognition processes |
US6219645B1 (en) * | 1999-12-02 | 2001-04-17 | Lucent Technologies, Inc. | Enhanced automatic speech recognition using multiple directional microphones |
US6230137B1 (en) * | 1997-06-06 | 2001-05-08 | Bsh Bosch Und Siemens Hausgeraete Gmbh | Household appliance, in particular an electrically operated household appliance |
US6505057B1 (en) * | 1998-01-23 | 2003-01-07 | Digisonix Llc | Integrated vehicle voice enhancement system and hands-free cellular telephone system |
US6587822B2 (en) * | 1998-10-06 | 2003-07-01 | Lucent Technologies Inc. | Web-based platform for interactive voice response (IVR) |
US6665645B1 (en) * | 1999-07-28 | 2003-12-16 | Matsushita Electric Industrial Co., Ltd. | Speech recognition apparatus for AV equipment |
US6839670B1 (en) * | 1995-09-11 | 2005-01-04 | Harman Becker Automotive Systems Gmbh | Process for automatic control of one or more devices by voice commands or by real-time voice dialog and apparatus for carrying out this process |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10257583A (en) * | 1997-03-06 | 1998-09-25 | Asahi Chem Ind Co Ltd | Voice processing unit and its voice processing method |
2001
- 2001-08-02 EP EP01967231A patent/EP1312078A1/en not_active Withdrawn
- 2001-08-02 JP JP2002520213A patent/JP2004506944A/en active Pending
- 2001-08-02 WO PCT/EP2001/008929 patent/WO2002015169A1/en active Application Filing
- 2001-08-02 CN CNB018024017A patent/CN1190775C/en not_active Expired - Fee Related
- 2001-08-02 KR KR1020027004598A patent/KR20020040850A/en not_active Application Discontinuation
- 2001-08-13 US US09/928,553 patent/US20020021799A1/en not_active Abandoned
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050033572A1 (en) * | 2003-07-07 | 2005-02-10 | Jin Min Ho | Apparatus and method of voice recognition system for AV system |
US8046223B2 (en) | 2003-07-07 | 2011-10-25 | Lg Electronics Inc. | Apparatus and method of voice recognition system for AV system |
EP1496499A3 (en) * | 2003-07-07 | 2005-02-02 | Lg Electronics Inc. | Apparatus and method of voice recognition in an audio-video system |
US20090034712A1 (en) * | 2007-07-31 | 2009-02-05 | Scott Grasley | Echo cancellation in which sound source signals are spatially distributed to all speaker devices |
US8223959B2 (en) | 2007-07-31 | 2012-07-17 | Hewlett-Packard Development Company, L.P. | Echo cancellation in which sound source signals are spatially distributed to all speaker devices |
US8433058B2 (en) * | 2008-08-08 | 2013-04-30 | Avaya Inc. | Method and system for distributed speakerphone echo cancellation |
US20100034372A1 (en) * | 2008-08-08 | 2010-02-11 | Norman Nelson | Method and system for distributed speakerphone echo cancellation |
US8868416B2 (en) * | 2010-01-13 | 2014-10-21 | Goertek Inc. | Apparatus and method for cancelling echo in joint time domain and frequency domain |
US20120136654A1 (en) * | 2010-01-13 | 2012-05-31 | Goertek Inc. | Apparatus And Method For Cancelling Echo In Joint Time Domain And Frequency Domain |
US9107012B2 (en) | 2011-12-01 | 2015-08-11 | Elwha Llc | Vehicular threat detection based on audio signals |
US8811638B2 (en) * | 2011-12-01 | 2014-08-19 | Elwha Llc | Audible assistance |
US8934652B2 (en) | 2011-12-01 | 2015-01-13 | Elwha Llc | Visual presentation of speaker-related information |
US9053096B2 (en) | 2011-12-01 | 2015-06-09 | Elwha Llc | Language translation based on speaker-related information |
US9064152B2 (en) | 2011-12-01 | 2015-06-23 | Elwha Llc | Vehicular threat detection based on image analysis |
US20130142365A1 (en) * | 2011-12-01 | 2013-06-06 | Richard T. Lord | Audible assistance |
US9159236B2 (en) | 2011-12-01 | 2015-10-13 | Elwha Llc | Presentation of shared threat information in a transportation-related context |
US9245254B2 (en) | 2011-12-01 | 2016-01-26 | Elwha Llc | Enhanced voice conferencing with history, language translation and identification |
US9368028B2 (en) | 2011-12-01 | 2016-06-14 | Microsoft Technology Licensing, Llc | Determining threats based on information from road-based devices in a transportation-related context |
US10079929B2 (en) | 2011-12-01 | 2018-09-18 | Microsoft Technology Licensing, Llc | Determining threats based on information from road-based devices in a transportation-related context |
US10875525B2 (en) | 2011-12-01 | 2020-12-29 | Microsoft Technology Licensing Llc | Ability enhancement |
US20220369030A1 (en) * | 2021-05-17 | 2022-11-17 | Apple Inc. | Spatially informed acoustic echo cancelation |
US11849291B2 (en) * | 2021-05-17 | 2023-12-19 | Apple Inc. | Spatially informed acoustic echo cancelation |
Also Published As
Publication number | Publication date |
---|---|
EP1312078A1 (en) | 2003-05-21 |
CN1190775C (en) | 2005-02-23 |
JP2004506944A (en) | 2004-03-04 |
KR20020040850A (en) | 2002-05-30 |
CN1388956A (en) | 2003-01-01 |
WO2002015169A1 (en) | 2002-02-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20020021799A1 (en) | Multi-device audio-video combines echo canceling | |
US10359991B2 (en) | Apparatus, systems and methods for audio content diagnostics | |
JP4792156B2 (en) | Voice control system with microphone array | |
JP4897169B2 (en) | Voice recognition device and consumer electronic system | |
EP1166596A4 (en) | Audio/video system and method enabling a user to select different views and sounds associated with an event | |
JP5859600B2 (en) | Speech capture and speech rendering | |
WO2004038697A1 (en) | Controlling an apparatus based on speech | |
JPH0316324A (en) | Training system for acoustic echo canceller | |
KR20000053029A (en) | Method and device for projecting sound sources onto loudspeakers | |
KR100629513B1 (en) | Optical reproducing apparatus and method capable of transforming external acoustic into multi-channel | |
JP3856136B2 (en) | AV system | |
JP2005094112A (en) | Performance monitor apparatus, control room speech unit, signal distributor, and studio speech system | |
JP2006101081A (en) | Acoustic reproduction device | |
KR0138211B1 (en) | Video compact disc player controller | |
JPH0815288B2 (en) | Audio transmission system | |
KR100309708B1 (en) | A guidance broadcasting apparatus for subway | |
JPH089500A (en) | Sound receiver | |
JP2021189283A (en) | Voice guidance device and voice guidance method | |
US893286A (en) | Multiphone. | |
JPH10136101A (en) | Video conference system | |
JP2002374600A (en) | Audio signal processing unit | |
KR20050068411A (en) | Mpeg dvd player based on most protocols | |
JPH0923389A (en) | Television receiver, remote control transmitter for television receiver, and television receiver system | |
US20040230433A1 (en) | Microphone system | |
JP2008211302A (en) | Signal communication apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: KONINKLIJKE PHILIPS ELECTRONICS N.V., NETHERLANDS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:KAUFHOLZ, PAUL AUGUSTINUS PETER;REEL/FRAME:012255/0244
Effective date: 20010904
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION |