US20050049856A1 - Method and means for creating prosody in speech regeneration for laryngectomees - Google Patents


Info

Publication number
US20050049856A1
Authority
US
United States
Prior art keywords
speech
consonant
vowel
component
sound
Prior art date
Legal status
Abandoned
Application number
US10/940,183
Inventor
David Baraff
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Priority to US10/940,183
Publication of US20050049856A1
Legal status: Abandoned (current)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 Speech synthesis; Text to speech systems
    • G10L 13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L 13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/04 Time compression or expansion
    • G10L 21/057 Time compression or expansion for improving intelligibility
    • G10L 2021/0575 Aids for the handicapped in speaking

Abstract

A device and a method to be used by laryngeally impaired people to improve the naturalness of their speech. An artificial sound creating mechanism which forms a simulated glottal pulse in the vocal tract is utilized. An artificial glottal pulse is compared with the natural spectrum and an inverse filter is generated to provide an output signal which would better reproduce natural sound. A digital signal processor introduces a variation of pitch based on an algorithm developed for this purpose; i.e. creating prosody. The algorithm uses primarily the relative amplitude of the speech signal and the rise and fall rates of the amplitude as a basis for setting the frequency of the speech. The invention also clarifies speech of laryngectomees by sensing the presence of consonants in the speech and appropriately amplifying them with respect to the vowel sounds.

Description

    REFERENCE TO PRIOR APPLICATIONS
  • This application is a continuation of application Ser. No. 09/641,157 filed Aug. 17, 2000, being U.S. Pat. No. 6,795,807 issued Sep. 21, 2004, which in turn claimed the benefit of the filing date of the applicant's provisional patent application No. 60/149,106 filed Aug. 17, 1999.
  • REFERENCE TO COMPUTER PROGRAM LISTING ON COMPACT DISC
  • Included with this application is a compact disc named application Ser. No. 09/641,157 Baraff which contains five separate files, together which comprise table 1 referenced in this specification. The file names, date of creation on compact disc and file sizes are as follows:
  • Main program file application Ser. No. 09/641,157 Baraff.txt, created Sep. 12, 2004 of size 27.0 KB;
  • Pitch program file application Ser. No. 09/641,157 Baraff.txt, created Sep. 12, 2004 of size 4.09 KB;
  • Synth program file application Ser. No. 09/641,157 Baraff.txt, created Sep. 12, 2004 of size 5.46 KB;
  • LPC program file application Ser. No. 09/641,157 Baraff.txt, created Sep. 12, 2004 of size 1.86 KB; and
  • Vowel program file application Ser. No. 09/641,157 Baraff.txt created Sep. 12, 2004 of size 1.56 KB.
  • AUTHORIZATION UNDER 37 C.F.R. §1.71(d)
  • A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyrights whatsoever.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • This invention relates in general to the field of artificial speech for laryngectomees, (a laryngeally impaired individual). It relates as well to the field of voice analysis and synthesis such as has been used in the field of communications. It also relates to the field of voice instruction and training. It also relates to the field of computer controlled prosthetics, particularly as such involves correction of human speech from a voice impaired individual to enable such individual to create natural sounding speech by creating or reproducing prosody and other natural inflections in a human voice.
  • 2. Description of Prior Art
  • There have been attempts in the past to create means to improve impaired speech, particularly from laryngeally impaired individuals. No speech devices to date have been able to capture, in sufficient detail, information about the specific speaker to recreate his/her own voice. Artificial devices to create a simulated glottal pulse with a manual ability to change frequency have been known for many years. One of the more recent devices has utilized a small loudspeaker mounted in the mouth of the laryngectomee, typically on a denture. This was described in U.S. Pat. No. 5,326,349 by Baraff. Some devices which vibrate the neck have been fitted with a control to enable the user to change the pitch of the speech manually, as described in U.S. Pat. No. 5,812,681 by Griffin. All of these devices have the drawback of sounding very mechanical. Even when a user has manually changed the pitch, the sound has not been close to the natural sound of a human being. In devices without myoelectric control it is still necessary for the user to time the onset and fall of the glottal pulse sound manually. This timing takes practice, and corrective feedback is useful in minimizing the training time.
  • There are a number of reasons that laryngectomees have not been able to use previous devices to their fullest potential. Firstly, even with devices which have built in pitch control, it is extremely difficult to coordinate the fingers to imitate natural speech prosody. The speaker requires a “good ear” for speech sound coupled with a very strong desire to spend hours of practicing to gain coordination. Many laryngectomees do not possess either the desire or the skill. Secondly, some of the subtleties of creating true prosody may occur in time scales faster than could be manually controlled.
  • A number of schemes have been developed to create speech from text. One such process is described in the patent by Sharman, U.S. Pat. No. 5,774,854. Conventional speech systems operate in a sequential manner, hence, they do not create prosody until an entire sentence is divided into elements of speech such as words and phonemes. Most of these schemes rely on pre-programmed templates to create prosody. These schemes using a programmed template would not be useful in a real time creation of speech for the laryngectomee because they require the understanding of the word and context to be applied. Although Sharman refers to “real-time” operation, because the text is already present in sentence form, it is not in “real-time” with regard to a speech input such as in the present invention. Real-time speech to speech requires that the analysis be completed within 50 milliseconds or less, that is, well before the entire word has even been spoken. Clearly techniques which are based on understanding the word before applying prosody will not be useful to solve this problem.
  • A further element of the disclosed invention, the ability to simulate emotions in speech, is perhaps suggested in U.S. Pat. No. 5,860,064, which creates emotion in speech output only in a text to speech system. This system again does not operate in real time with regard to a speech to speech function.
  • Another feature of the present invention is its use for training of speech, insofar as it includes pattern recognition of real time speech input. A system for recognizing and coding speech is described in U.S. Pat. No. 5,729,694 by Holzrichter et al. That speech system relies on pre-coding parts of speech, including feature vectors generated both by classical LPC coefficients and by a physical mapping of the vocal tract elements using electromagnetic radiation. The system disclosed presently does not rely on electromagnetic radiation and includes the ability to pre-program specific lessons as generated by the laryngeally impaired individual in conjunction with his speech pathologist. Other devices found in the prior art have left the control of prosody to the laryngectomee and required a high level of manual dexterity to provide inflection and naturalness. In practice, very few laryngectomees use this capability because the timing and control is too difficult.
  • SUMMARY OF THE INVENTION
  • The disclosed invention provides natural prosody in real time to the speech of laryngeally impaired people (laryngectomees). This present continuation application incorporates by reference the applicant's co-pending patent application Ser. No. 09/641,157, now U.S. Pat. No. 6,795,807 issued Sep. 21, 2004, as though fully set forth herein. The invention provides prosody by means of a software program running in real time on a digital signal processor, thereby providing more natural speech than is achievable through any manually controlled system.
  • In addition to providing prosody, the disclosed system has other capabilities providing increased naturalness, including: noise cancellation of sound from a neck vibrator excitation source, feedback control to allow use of a microphone distant from the mouth, aspiration noise to mimic real speech, selective amplification of consonants over vowels to assist in intelligibility, automatic gain control to allow for movement of the head with respect to the microphone, user selection of the mood of speech, volume control, whisper speech, telephone mode, training aids, the ability to interface with myoelectric signals to provide automatic hands-free starting and stopping control as well as user controlled intonation, and the extraction of voice parameters from a user before laryngeal impairment to recreate the voice.
  • An automatic gain control system has been provided to regulate the output. The unit provides “whisper” speech by using a white noise excitation instead of the glottal pulse excitation. The unit can be used to change the excitation frequency of the sound source in real time. This is useful over the telephone or in a stand-alone unit which may be used without the loudspeaker. Training aids using pattern recognition are programmed into the device to allow speech pathologists to provide lessons whereby the user gets feedback as to whether his articulation and timing follow the instruction. The unit is capable of being adapted to receive myoelectric signals for hands-free operation. In addition, in the case of laryngeally impaired individuals whose larynx nerve has been connected to a neck muscle nerve, the myoelectric signal can automatically turn the unit on and off and provide user directed intonation. Without the myoelectric attachment the user can select from moods of speech which help him express himself depending upon the situation. Moods such as relaxed, tense, angry, or confident can be generated by selecting various components of the prosody algorithm in combination with the glottal pulse parameters. The algorithm disclosed with the present invention provides a means to determine and reproduce a speaker's pitch so as to best reproduce the original voice and inflections of a speaker and make the speech more natural. A computer software program listing is included with this disclosure which teaches one means to carry out the pitch determining algorithm taught herein.
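  • As an illustration of how the glottal-pulse and “whisper” excitations differ, the following minimal C sketch generates one excitation sample from either a periodic pulse train or white noise. It assumes an 8 kHz sample rate; the impulse train is only a crude stand-in for the shaped glottal pulse actually generated by the device, and the function and parameter names are hypothetical rather than taken from the applicant's program listing.

    #include <stdlib.h>

    #define FS 8000.0f   /* assumed sample rate, matching the 8 kHz analysis rate */

    /* One excitation sample. In normal mode a unit impulse is emitted once per
     * pitch period (a crude stand-in for the device's shaped glottal pulse); in
     * whisper mode white noise is returned instead. *phase persists across calls. */
    static float next_excitation(float pitch_hz, int whisper, float *phase)
    {
        if (whisper)
            return (float)rand() / (float)RAND_MAX - 0.5f;   /* white noise */

        *phase += pitch_hz / FS;             /* advance by a fraction of a period */
        if (*phase >= 1.0f) {
            *phase -= 1.0f;
            return 1.0f;                     /* impulse at each pitch period */
        }
        return 0.0f;
    }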
  • It is, therefore, the primary objective of the present invention to provide intelligible and natural sounding speech for individuals with laryngeal impairment while including the feature of prosody as they speak. Accordingly, it is an object of this invention to recreate natural prosody without the conscious intervention of the user through use of a computer algorithm to process speech. It is also an object of the disclosed invention to provide for prosody and speech improvement by tapping the nerve signal generated in the larynx nerve which controls the larynx in normal speakers, so that a signal can be provided for stopping and starting speech. It is also an object of the invention to utilize the same signal to provide information as to the larynx tension, which relates to the pitch of speech, such that the speaker's intent can be realized by utilization of the myoelectric signal to process speech.
  • A second object of the invention is to recreate speech sounding as much like the original voice of the speaker as possible by applying algorithms which duplicate the frequency range, the rise and fall times and other characteristics of the speaker's original speech, comparing them with the rise and fall times of speech created using an artificial glottal pulse, and utilizing a digital signal processor to correct for the difference to create speech similar to the speaker's original voice.
  • A third objective of the invention is to provide feedback to the user as to how well he/she is doing in learning some of the fundamentals of how to make the speech device sound clearer by using pattern recognition such that useful information in the form of instruction can be provided for the user.
  • It is also an object of the invention to allow the user to change the mood of his speech through various algorithms which signal calmness, levity, anger, friendship, command etc., by altering settings of the disclosed prosody algorithm.
  • A further object of the invention is to recreate the natural voice of an individual which existed prior to laryngeal damage or removal.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a pictorial view depicting a user wearing an embodiment of the present invention and particularly illustrating a contact microphone and a neck vibrator worn about the neck of a user of the invention.
  • FIG. 2 is a block diagram of the electronic control circuit components used in the invention.
  • FIG. 3 is a block diagram of the algorithm used in the signal processing illustrating the main processing steps used in processing speech in the invention.
  • FIG. 4 describes the algorithm used to determine the pitch as described in the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
  • FIG. 1 depicts some of the major components of the current invention, including an excitation device 2 on the neck together with a contact microphone 4. Generally for devices mounted inside the mouth, a radio frequency signal carries the information about the glottal pulse. For neck mounted vibrators, wires would generally be used to carry the signal. However, a self contained neck vibrator 6 using an rf signal and its own batteries for power could be used. For the case of some tracheo-esophageal puncture speakers, their own voice sound may be used as the primary excitation.
  • A microphone is worn in front of the mouth, in the mouth, or coupled through tissue or bone to the vocal tract. The neck mounted device and the microphone are connected to a control circuit directly by wires, or through electromagnetic field transmission such as a radio frequency transmission or an infrared light coupling system. The unit may also be adapted to connect directly to a telecommunication device rather than be coupled to an audio output device for local voice reproduction. The control unit may be worn on the belt or in any other convenient location such as a pocket or other element of clothing. The control unit performs the following functions. The analog electrical signal from the microphone input 10 is converted to a digital signal by an analog to digital converter 12. The digital signal is analyzed within the digital signal processor 14. The digital signal processor 14 converts the basic voice signals into LPC parameters. The voice signal is re-synthesized using the LPC method and the generation of a glottal pulse, which has been designed to sound like a normal human glottal pulse. The voice frequency is selected on the basis of an algorithm which determines both the amplitude and the rate of change of the amplitude of the voice signal. A calculation is performed using both the amplitude and the rate of change of amplitude to determine what the voice frequency should be to adjust the sound of the voice to be more natural.
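  • Taken together, the control unit's per-frame flow is: condition the input, extract LPC parameters, classify the frame, derive a pitch from the amplitude and its rate of change, and resynthesize with an artificial excitation. The outline below is only a hedged C sketch of that flow; every type, field and helper name in it is hypothetical (several helpers are sketched separately elsewhere in this description), and it is not the applicant's Table 1 code.

    #include <math.h>

    #define LPC_ORDER 10                       /* assumed analysis order */

    /* Minimal per-channel state for the outline (hypothetical layout). */
    typedef struct {
        float preemph_z;                       /* preemphasis history sample       */
        float r[LPC_ORDER + 1];                /* autocorrelation                  */
        float k[LPC_ORDER + 1];                /* reflection (PARCOR) coefficients */
        float b[LPC_ORDER + 1];                /* lattice synthesis state          */
        float gain, prev_rms, pitch, exc_phase;
        int   phon;                            /* 0 silence, 1 consonant, 2 vowel  */
    } state_t;

    /* Helpers assumed here; several are sketched, with possibly differing
     * signatures, elsewhere in this description. */
    void  preemphasize(float *s, int n, float *z);
    void  hamming(float *s, int n);
    void  autocorr(const float *s, int n, float *r, int p);
    float durbin(const float *r, float *k, int p);
    float frame_rms(const float *s, int n);
    int   classify(int prev, float power, float denergy);
    float update_pitch(float pitch, float power);
    float next_excitation(float pitch_hz, int whisper, float *phase);
    float lattice_synth(float excite, const float *k, float *b, int p);

    /* One pass of the real-time loop: condition, analyze, decide, resynthesize. */
    void process_frame(float *samples, float *out, int n, state_t *st, int whisper)
    {
        preemphasize(samples, n, &st->preemph_z);
        hamming(samples, n);
        autocorr(samples, n, st->r, LPC_ORDER);
        st->gain = sqrtf(durbin(st->r, st->k, LPC_ORDER));  /* gain from residual energy */

        float rms  = frame_rms(samples, n);
        float drms = rms - st->prev_rms;               /* rate of change of level */
        st->phon   = classify(st->phon, rms, drms);    /* silence/consonant/vowel */
        st->pitch  = update_pitch(st->pitch, rms);     /* amplitude -> frequency  */

        for (int i = 0; i < n; i++)                    /* glottal pulse or noise  */
            out[i] = lattice_synth(next_excitation(st->pitch, whisper, &st->exc_phase)
                                   * st->gain, st->k, st->b, LPC_ORDER);
        st->prev_rms = rms;
    }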
  • Turning now to FIG. 2, the control circuitry is more particularly described using the main hardware elements which carry out the method disclosed. The major hardware components include the microphone input 10 and loud speaker output devices 8, which are interfaced through an analog to digital converter 12 such as the Motorola MC145483. Additional power gain is provided to the loud speaker through an amplifier such as could be provided by a device such as the LM871 chip. The digital signal resulting from the conversion of the speech input is introduced into a digital signal processor (DSP) 14 such as the Texas Instruments TMS320C31, which is a high speed processor that requires little power to operate, making it a good choice for portable operation. This processor 14 is interfaced with erasable, programmable read-only memory 18 containing the program control and with random access memory 16 for performing calculations in real time. It can be appreciated that various switches can be implemented, such as power switch 24, calibration switch 26 and pitch adjustment switches 22, as shown in FIG. 2. A power supply 30 converts and conditions the voltage from rechargeable batteries 34. Signal output from the DSP 14 also goes either to the transmitter circuit which sends a signal to the oral unit to recreate voice or to an amplifier which drives a conventional neck vibrator 6 with a square wave signal. A square wave signal provides the best power efficiency for driving the neck vibrator if such a vibrator is attached. Oscillator 28 determines the clock speed or cycle speed of DSP 14. It can be appreciated by those skilled in the art that the design and operation of DSP 14 can be varied and implemented with a variety of different commonly available hardware.
  • Glottal pulse generation 20 is driven by digital signal processor 14 as shown in FIG. 2. The system need only be able to process the speech input from the user by applying the decision making process inherent in the algorithm disclosed below so as to generate reconditioned speech, providing a more natural reproduction of the speaker's otherwise impaired voice. Whether such processing is accomplished with a digital signal processor, in the analog domain or in some other fashion, the output of the system can be accomplished by carrying out the processing technique and algorithm method described in the present invention.
  • Turning now to FIG. 3, a flow chart diagram describing the main processing and overall logic approach to the operation of the device is disclosed. When the power is applied to the circuit, the processor resets and initializes all parameters. Parameters to be set are, for example, male or female voice, telephone mode, whisper mode and other parameters relating to frequency adjustment. If the activate button is pressed, the processor starts to analyze speech information coming in through the microphone input 10. If the activate button is not depressed, the unit goes into the sleep mode where the parametric information is saved and ready to use, but the processor is drawing very low current.
  • When the activate button is depressed, the input signal undergoes a gain boost for the lower frequencies. Then the signal is pre-emphasized with another filter. (Preemphasis: the digitized speech signal (proc_array in the main program echo.c) is put through a first-order system. In this case, the output s1(n) is related to the input s(n) by the difference equation s1(n) = s(n) − 0.94 s(n−1), where n is the sample index.) The framesize is 128 samples; the frame overlap is 48 samples. Accordingly, only 80 new samples are required to complete a frame for analysis. With a framesize of 128 samples and a sample rate of eight kilohertz, the frame time would be 16 milliseconds in the absence of the overlap; however, taking the overlap into account, the frame time is only ten milliseconds. (In the example computer program shown in Table 1 attached, the term FRAMESIZE is set to 128 and the term OVERLAP is set to 48.) The signal is windowed using a Hamming window, and then it goes through LPC analysis. The LPC method uses the reflection (or PARCOR) coefficients, the RMS (root mean square) of the energy and the gain term of the LPC model based on Durbin's algorithm. This technique is well known and described in the literature. (See, for example, page 115 of Fundamentals of Speech Recognition by Rabiner and Juang.) A comb filter is added. In effect the comb filter calculates the minimum energy in the signal. This energy level is typical of silence in the speech, but either the oral stimulator or the neck vibrator may have some residual noise associated with it, which is then removed.
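  • The framing, preemphasis, windowing and Durbin recursion described above are standard operations; the C sketch below shows one conventional way they could be coded, using the FRAMESIZE, OVERLAP and 0.94 preemphasis values stated in the text. The LPC order and the function names are assumptions for illustration and are not taken from the applicant's echo.c listing.

    #include <math.h>

    #define FRAMESIZE 128
    #define OVERLAP    48
    #define LPC_ORDER  10          /* assumed analysis order; not fixed by the text */

    /* First-order preemphasis, s1(n) = s(n) - 0.94*s(n-1); *prev carries the last
     * input sample of the previous block so successive frames stay continuous. */
    static void preemphasize(float *s, int n, float *prev)
    {
        for (int i = 0; i < n; i++) {
            float cur = s[i];
            s[i] = cur - 0.94f * *prev;
            *prev = cur;
        }
    }

    /* Hamming window applied in place to one analysis frame. */
    static void hamming(float *frame, int n)
    {
        const float PI = 3.14159265f;
        for (int i = 0; i < n; i++)
            frame[i] *= 0.54f - 0.46f * cosf(2.0f * PI * (float)i / (float)(n - 1));
    }

    /* Autocorrelation of the windowed frame for lags 0..p. */
    static void autocorr(const float *frame, int n, float *r, int p)
    {
        for (int lag = 0; lag <= p; lag++) {
            r[lag] = 0.0f;
            for (int i = lag; i < n; i++)
                r[lag] += frame[i] * frame[i - lag];
        }
    }

    /* Levinson-Durbin recursion: reflection (PARCOR) coefficients k[1..p] from
     * the autocorrelation r[0..p]; returns the residual prediction energy from
     * which the gain term of the LPC model is obtained. */
    static float durbin(const float *r, float *k, int p)
    {
        float a[LPC_ORDER + 1] = { 0.0f }, tmp[LPC_ORDER + 1];
        float err = r[0];
        for (int i = 1; i <= p; i++) {
            float acc = r[i];
            for (int j = 1; j < i; j++)
                acc -= a[j] * r[i - j];
            k[i] = (err != 0.0f) ? acc / err : 0.0f;
            a[i] = k[i];
            for (int j = 1; j < i; j++)
                tmp[j] = a[j] - k[i] * a[i - j];
            for (int j = 1; j < i; j++)
                a[j] = tmp[j];
            err *= 1.0f - k[i] * k[i];
        }
        return err;
    }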
  • An autocalibration algorithm continuously calculates the average RMS energy of the signal to update the variable detection discrimination function. This is important because variation in the input level can affect the decision level of the frequency determining algorithm.
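  • One plausible realization of this autocalibration, sketched in C, keeps a slowly updated average of the frame RMS and expresses the decision thresholds relative to it, so that the absolute thresholds track changes in input level. The relative-threshold scheme and all names below are assumptions about how the variable detection discrimination function might be updated, not the applicant's implementation.

    /* Track the long-term average RMS and derive absolute decision levels from
     * nominal thresholds expressed as fractions of that average. State persists
     * across calls; BETA sets how slowly the calibration adapts. */
    static void autocalibrate(float frame_rms, float *avg_rms,
                              const float *k_rel, float *k_abs, int nk)
    {
        const float BETA = 0.01f;
        *avg_rms = (1.0f - BETA) * *avg_rms + BETA * frame_rms;
        for (int i = 0; i < nk; i++)
            k_abs[i] = k_rel[i] * *avg_rms;
    }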
  • The phone vibration unit takes the calculated pitch of the output signal and modulates the neck vibrator or oral unit output signal to track the dominant pitch of speech. This is useful when a speaker is talking directly into a telephone device.
  • Automatic gain control is also used on the output to adjust the sound level from the loud speakers. This prevents the output from overloading and keeps a relatively constant output level.
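  • A minimal C sketch of such an output automatic gain control follows, assuming a smoothed RMS tracker driving a bounded gain; the constants and names are illustrative placeholders rather than values from the disclosure.

    #include <math.h>

    /* Smoothed automatic gain control over one output frame. The running level
     * and gain persist across calls (e.g. start with level = 0, gain = 1). */
    static void output_agc(float *frame, int n, float *level, float *gain)
    {
        const float TARGET  = 0.25f;   /* desired output RMS                   */
        const float ALPHA   = 0.05f;   /* smoothing factor for level and gain  */
        const float MAXGAIN = 8.0f;    /* limit to prevent overload            */

        float sum = 0.0f;
        for (int i = 0; i < n; i++)
            sum += frame[i] * frame[i];
        float rms = sqrtf(sum / (float)n);

        *level = (1.0f - ALPHA) * *level + ALPHA * rms;          /* track level */

        float want = (*level > 1e-6f) ? TARGET / *level : MAXGAIN;
        if (want > MAXGAIN)
            want = MAXGAIN;
        *gain = (1.0f - ALPHA) * *gain + ALPHA * want;           /* smooth gain */

        for (int i = 0; i < n; i++)
            frame[i] *= *gain;                                   /* apply gain  */
    }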
  • When the activate button is not pressed the unit goes into the sleep mode. This disables the serial port, enables the initialization and sets the processor to idle. When the activate button is depressed again the unit comes out of sleep mode using initialization settings which were present following reset.
  • FIG. 4 discloses the analysis method used in the pitch determining algorithm. The algorithm to determine pitch uses phoneme detection and is based on the relative amplitude of the signal. Depending on the amplitude a phoneme is classified either as a vowel, a consonant or silence. An averaging function is used to prevent “unnatural” gain changes from frame to frame. A pitch generation function estimates the pitch based on the RMS of the current and adjacent frames. A synthesis function provides the synthesis of the output speech using a lattice filter model. In considering FIG. 4, there are certain input voice parameters of interest. T.G. determines the ratio of pitch change with change in power of the signal. Minimum pitch is defined as the lowest frequency of the output. The maximum pitch is defined as the highest frequency of the output. The rate increase is simply the rate at which the pitch increases. Likewise, rate decrease is simply the rate at which the pitch decreases. The consonant noise level is the relative noise level of consonants in the voice signal being processed.
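  • The lattice filter synthesis mentioned above is the standard all-pole lattice driven by the reflection (PARCOR) coefficients from the analysis stage; a per-sample C sketch follows. Sign conventions depend on how the reflection coefficients are defined, and the function name is hypothetical.

    /* One sample of all-pole lattice synthesis.
     * k[1..p] : reflection (PARCOR) coefficients for the current frame
     * b[0..p] : backward prediction state, persistent across samples
     * excite  : glottal-pulse or noise excitation sample, already scaled by gain */
    static float lattice_synth(float excite, const float *k, float *b, int p)
    {
        float f = excite;                    /* f_p(n)                           */
        for (int i = p; i >= 1; i--) {
            f    = f + k[i] * b[i - 1];      /* f_{i-1}(n) uses b_{i-1}(n-1)     */
            b[i] = b[i - 1] - k[i] * f;      /* b_i(n)                           */
        }
        b[0] = f;                            /* b_0(n) = f_0(n)                  */
        return f;                            /* synthesized speech sample        */
    }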
  • A level is set for the minimum pitch. Another level is set for the maximum pitch. An independent parameter is set for the rate of pitch increase and another is set for the rate of decrease. A third parameter determines the overall ratio of pitch change with change in power.
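  • These settings can be read as a mapping from frame power to a target pitch through the T.G. ratio, slew-limited by the rate-increase and rate-decrease parameters and clamped between the minimum and maximum pitch. The C sketch below illustrates that reading only; the numeric values are placeholders, and the applicant's Table 1 listing remains the authoritative version of the algorithm.

    /* Per-frame pitch update. power is the frame's relative power (e.g. RMS) and
     * pitch is the previous frame's output pitch in Hz. Constants are
     * illustrative placeholders, not values taken from the disclosure. */
    static float update_pitch(float pitch, float power)
    {
        const float MIN_PITCH = 80.0f;     /* lowest output frequency, Hz        */
        const float MAX_PITCH = 160.0f;    /* highest output frequency, Hz       */
        const float TG        = 200.0f;    /* Hz of pitch change per unit power  */
        const float RATE_INC  = 6.0f;      /* maximum rise per frame, Hz         */
        const float RATE_DEC  = 4.0f;      /* maximum fall per frame, Hz         */

        float target = MIN_PITCH + TG * power;                     /* power -> pitch */

        if (target > pitch + RATE_INC) target = pitch + RATE_INC;  /* limit rise     */
        if (target < pitch - RATE_DEC) target = pitch - RATE_DEC;  /* limit fall     */

        if (target < MIN_PITCH) target = MIN_PITCH;                /* clamp to range */
        if (target > MAX_PITCH) target = MAX_PITCH;
        return target;
    }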
  • Certain decision levels trigger various pitch increase and decreases rules. The decision levels which are important include:
  • K1—determines the threshold (relative power level) to change from a consonant to vowel.
  • K2—determines the threshold that must be reached to change from silence to consonant.
  • K3—determines the threshold to change from vowel to consonant.
  • K4—determines the threshold to change from consonant to vowel.
  • K5—a consonant decision will remain a consonant unless the K4 threshold is reached and the change in energy is less than the K5 threshold.
  • K6—a consonant decision will remain a consonant unless the K4 threshold is reached and the change in energy is greater than the K6 threshold.
  • The signal power level is compared with K1, K2 or K3. If it is less than K2, it is classified as silence and no LPC speech construction occurs. If it is greater than K2 it is tested as a consonant. There is no direct path from silence to vowel. Once the signal has been classified as a consonant it is tested against new parameters. If the level is greater than K1 it is classified as a vowel. If it is less than K1 it is tested against K4. If it is greater than K4 it is classified as a vowel. If it is less than K4 it remains a consonant. The decision will maintain consonant status unless the K4 threshold is reached and the change in energy is less than the K5 threshold. If the K4 threshold is reached and the change in energy is greater than the K6 threshold, a vowel decision is made. The reason for these various levels is to generate a hysteresis so that the signal level does not rapidly swing from consonant to vowel or silence with minor fluctuations in signal power.
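  • One way to read the K1 through K6 rules is as the hysteresis state machine sketched below in C. The transitions follow the description above; the fallback from consonant to silence below K2, the numeric threshold values and the names are assumptions added for illustration and would be tuned per speaker.

    typedef enum { SIL, CONS, VOW } phon_t;

    /* Classify one frame from its relative power and the change in energy since
     * the previous frame, with hysteresis around the previous decision. */
    static phon_t classify(phon_t prev, float power, float denergy)
    {
        const float K1 = 0.60f, K2 = 0.05f, K3 = 0.30f,   /* placeholder values */
                    K4 = 0.40f, K5 = 0.02f, K6 = 0.10f;

        switch (prev) {
        case SIL:                                   /* no direct path to vowel     */
            return (power > K2) ? CONS : SIL;

        case CONS:
            if (power < K2)                         /* assumed fallback to silence */
                return SIL;
            if (power > K1)                         /* strong level: vowel         */
                return VOW;
            if (power > K4 && (denergy < K5 || denergy > K6))
                return VOW;                         /* K4 reached and the energy   */
                                                    /* change is below K5 or above */
                                                    /* K6: vowel decision          */
            return CONS;                            /* otherwise stay consonant    */

        case VOW:
            return (power < K3) ? CONS : VOW;       /* below K3: back to consonant */
        }
        return prev;
    }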
  • The selection of the threshold values is determined by the desired reproduction of the sound of the voice being processed. It is useful to record and analyze the natural sound of an intended user of the invention, if the opportunity is present, prior to any surgical procedure which may alter the voice. In such a fashion, the constants desirable to dial into the processing for switching or selection may be more readily determined, rather than empirically adjusting the values of K to match the desired end effect. However, in accordance with the disclosed invention, a computer listing which allows one to practice the method so described, and which carries out the invention as illustrated in this disclosure, is provided in the following table. Table 1 attached provides a computer code listing which one skilled in the art may use to carry out the invention utilizing digital processing means.
  • From the foregoing description it will be readily apparent that a speaking device for laryngectomees has been developed which allows for more natural and more understandable speech. The naturalness is provided primarily by the inclusion of prosody. Other effects are included, such as consonant amplification, the inclusion of aspiration noise, and variation of the glottal pulse with frequency. The improved understandability is due to the relative amplification of consonants, the injection of aspiration sounds, and also the injection of white noise to accentuate fricative sounds. The entire device is conveniently packaged to be worn or carried easily and is battery powered. The present disclosure also teaches a method of processing speech in real time to provide a more natural sounding output from an altered or impaired voice input.
  • Although the invention has been described in terms of the preferred embodiment and with particular examples that are used to illustrate carrying out the principles of the invention, it will be appreciated by those skilled in the art that other variations or adaptations of the principles disclosed herein could be adopted using the same ideas taught herewith. Such applications and principles are considered to be within the scope and spirit of the invention disclosed and otherwise described in the appended claims. Such adaptations further include use of analog processing to select and analyze the input speech to be processed. The method of impaired speech correction may be carried out by other electronic means, whether digital or analog, which provide the same type of signal processing to accomplish the speech conversion taught herein in real time or in a delayed environment. Such uses could include adaptation of speech to text conversion for laryngeally impaired individuals, or similar applications in telecommunications devices.

Claims (6)

1. A method of creating or reproducing prosody in speech using a Linear Predictive Coding algorithm, comprising the steps of:
dividing speech to be processed into components of silent, consonant and vowel;
processing said silent component to determine a threshold level to alter said component to consonant sound or to maintain silent sound;
wherein further said consonant component is selected from a threshold value to determine whether said consonant component exceeds a threshold to be modified to a vowel, or selected for additional threshold measurement to change said consonant component from a consonant to a vowel;
wherein further said vowel component is measured against a threshold level set to determine whether said vowel component is changed from a vowel to a consonant.
2. A device for creating prosody comprising:
an algorithm means for analyzing and synthesizing speech when information about input speech frequency is not available;
wherein the algorithm varies amplification of consonant sounds with respect to vowel sounds;
wherein further, said device creates aspiration noise enabling the user to produce prosody.
3. A method of analyzing speech in real time when information about speech frequency is not available, and creating prosody, comprising:
division of speech, namely into silence, consonant, and vowel sounds;
wherein the component categories of silence, consonant, and vowel are characterized by different threshold levels.
4. A means for creating or reproducing prosody in speech comprising:
an analog to digital converting means to convert analog human speech to a digital equivalent;
a digital signal processor to process said digital equivalent signal;
an electronic memory means to store an instruction set to operate said digital signal processing means;
means to process said digital signal processor output to convert said output to a reconditioned analog voice signal; and
an instruction set stored in said electronic memory means to control said processing by said digital signal processor to alter the reconditioned analog voice signal in accordance with the intended sound of the speech being processed.
5. The invention of claim 4 wherein further said digital signal processing means selects the input to said digital signal processing means to alternate and select between silent, consonant and vowel components of the inputted human speech being processed;
wherein further, the silence component is capable of being further divided into silence or a consonant sound;
wherein the consonant component is capable of being further divided into silence or, upon reaching another pre-set threshold level, into a vowel sound or a consonant sound;
wherein the vowel component is processed to be further divided into a consonant sound or a vowel sound.
6. A means for creating or reproducing prosody in speech comprising:
an analog to digital converting means to convert analog human speech to a digital equivalent;
a digital signal processor to process said digital equivalent signal;
an electronic memory means to store an instruction set to operate said digital signal processing means;
means to process said digital signal processor output to convert said output to a reconditioned analog voice signal; and
an instruction set stored in said electronic memory means to control said processing by said digital signal processor to alter the reconditioned analog voice signal in accordance with the intended sound of the speech being processed;
wherein further said digital signal processing means selects the input to said digital signal processing means to alternate and select between silent, consonant and vowel components of the inputted human speech being processed;
wherein further, the silence component is capable of being further divided into silence or a consonant sound;
wherein the consonant component is capable of being further divided into silence or, upon reaching another pre-set threshold level, into a vowel sound or a consonant sound;
wherein the vowel component is processed to be further divided into a consonant sound or a vowel sound.
US10/940,183 1999-08-17 2004-09-14 Method and means for creating prosody in speech regeneration for laryngectomees Abandoned US20050049856A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/940,183 US20050049856A1 (en) 1999-08-17 2004-09-14 Method and means for creating prosody in speech regeneration for laryngectomees

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14910699P 1999-08-17 1999-08-17
US09/641,157 US6795807B1 (en) 1999-08-17 2000-08-17 Method and means for creating prosody in speech regeneration for laryngectomees
US10/940,183 US20050049856A1 (en) 1999-08-17 2004-09-14 Method and means for creating prosody in speech regeneration for laryngectomees

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/641,157 Continuation US6795807B1 (en) 1999-08-17 2000-08-17 Method and means for creating prosody in speech regeneration for laryngectomees

Publications (1)

Publication Number Publication Date
US20050049856A1 true US20050049856A1 (en) 2005-03-03

Family

ID=32993450

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/641,157 Expired - Lifetime US6795807B1 (en) 1999-08-17 2000-08-17 Method and means for creating prosody in speech regeneration for laryngectomees
US10/940,183 Abandoned US20050049856A1 (en) 1999-08-17 2004-09-14 Method and means for creating prosody in speech regeneration for laryngectomees

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/641,157 Expired - Lifetime US6795807B1 (en) 1999-08-17 2000-08-17 Method and means for creating prosody in speech regeneration for laryngectomees

Country Status (1)

Country Link
US (2) US6795807B1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060189462A1 (en) * 2005-01-14 2006-08-24 Nautilus, Inc. Exercise device
US20080065381A1 (en) * 2006-09-13 2008-03-13 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancing method, and speech recording method
JP2008544832A (en) * 2005-07-01 2008-12-11 ザ・ユーエスエー・アズ・リプリゼンティド・バイ・ザ・セクレタリー・デパートメント・オブ・ヘルス・アンド・ヒューマン・サーヴィスィズ System and method for restoring motor control via stimulation of an alternative site instead of an affected part
US20090054980A1 (en) * 2006-03-30 2009-02-26 The Government Of The U.S, As Represented By The Secretary,Department Of Health And Human Services Device for Volitional Swallowing with a Substitute Sensory System
US20090187124A1 (en) * 2005-07-01 2009-07-23 The Government Of The Usa, As Represented By The Secretary, Dept. Of Health & Human Services Systems and methods for recovery from motor control via stimulation to a substituted site to an affected area
CN108461090A (en) * 2017-02-21 2018-08-28 宏碁股份有限公司 Speech signal processing device and audio signal processing method
US11344471B2 (en) 2013-03-13 2022-05-31 Passy-Muir, Inc. Systems and methods for stimulating swallowing
US11413214B2 (en) * 2019-04-19 2022-08-16 Passy-Muir, Inc. Methods of vibrationally exciting a laryngeal nerve
US20230290333A1 (en) * 2019-05-06 2023-09-14 Gn Hearing A/S Hearing apparatus with bone conduction sensor

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050004604A1 (en) * 1999-03-23 2005-01-06 Jerry Liebler Artificial larynx using coherent processing to remove stimulus artifacts
US20060069567A1 (en) * 2001-12-10 2006-03-30 Tischer Steven N Methods, systems, and products for translating text to speech
US7483832B2 (en) * 2001-12-10 2009-01-27 At&T Intellectual Property I, L.P. Method and system for customizing voice translation of text to speech
JP3908965B2 (en) * 2002-02-28 2007-04-25 株式会社エヌ・ティ・ティ・ドコモ Speech recognition apparatus and speech recognition method
JP4447857B2 (en) * 2003-06-20 2010-04-07 株式会社エヌ・ティ・ティ・ドコモ Voice detection device
US20060046232A1 (en) * 2004-09-02 2006-03-02 Eran Peter Methods for acquiring language skills by mimicking natural environment learning
US20060167691A1 (en) * 2005-01-25 2006-07-27 Tuli Raja S Barely audible whisper transforming and transmitting electronic device
US20070033009A1 (en) * 2005-08-05 2007-02-08 Samsung Electronics Co., Ltd. Apparatus and method for modulating voice in portable terminal
CN101578659B (en) * 2007-05-14 2012-01-18 松下电器产业株式会社 Voice tone converting device and voice tone converting method
JP5500125B2 (en) * 2010-10-26 2014-05-21 パナソニック株式会社 Hearing aid
JP5862349B2 (en) * 2012-02-16 2016-02-16 株式会社Jvcケンウッド Noise reduction device, voice input device, wireless communication device, and noise reduction method
US9508329B2 (en) * 2012-11-20 2016-11-29 Huawei Technologies Co., Ltd. Method for producing audio file and terminal device
RU2530268C2 (en) 2012-11-28 2014-10-10 Общество с ограниченной ответственностью "Спиктуит" Method for user training of information dialogue system
TW201446226A (en) * 2013-06-04 2014-12-16 jing-feng Liu Artificial sounding device
US20150325249A1 (en) * 2013-07-26 2015-11-12 Marlena Nunn Russell Reverse Hearing Aid [RHA]
US10255903B2 (en) 2014-05-28 2019-04-09 Interactive Intelligence Group, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
EP3149727B1 (en) * 2014-05-28 2021-01-27 Interactive Intelligence Group, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10014007B2 (en) 2014-05-28 2018-07-03 Interactive Intelligence, Inc. Method for forming the excitation signal for a glottal pulse model based parametric speech synthesis system
US10154899B1 (en) 2016-05-12 2018-12-18 Archer Medical Devices LLC Automatic variable frequency electrolarynx
CN106843490B (en) * 2017-02-04 2020-02-21 Guangdong Genius Technology Co., Ltd. Ball hitting detection method based on wearable device and wearable device
US10916250B2 (en) 2018-06-01 2021-02-09 Sony Corporation Duplicate speech to text display for the deaf
US10916159B2 (en) 2018-06-01 2021-02-09 Sony Corporation Speech translation and recognition for the deaf

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5737719A (en) * 1995-12-19 1998-04-07 U S West, Inc. Method and apparatus for enhancement of telephonic speech signals
US6067518A (en) * 1994-12-19 2000-05-23 Matsushita Electric Industrial Co., Ltd. Linear prediction speech coding apparatus
US6332121B1 (en) * 1995-12-04 2001-12-18 Kabushiki Kaisha Toshiba Speech synthesis method

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3704345A (en) * 1971-03-19 1972-11-28 Bell Telephone Labor Inc Conversion of printed text into synthetic speech
US3894195A (en) * 1974-06-12 1975-07-08 Karl D Kryter Method of and apparatus for aiding hearing and the like
JPS58143394A (en) * 1982-02-19 1983-08-25 Hitachi, Ltd. Detection/classification system for voice section
US4696040A (en) * 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with energy normalization and silence suppression
US5748838A (en) * 1991-09-24 1998-05-05 Sensimetrics Corporation Method of speech representation and synthesis using a set of high level constrained parameters
US5305420A (en) * 1991-09-25 1994-04-19 Nippon Hoso Kyokai Method and apparatus for hearing assistance with speech speed control function
US5326349A (en) 1992-07-09 1994-07-05 Baraff David R Artificial larynx
US5636325A (en) * 1992-11-13 1997-06-03 International Business Machines Corporation Speech synthesis and analysis of dialects
US5860064A (en) 1993-05-13 1999-01-12 Apple Computer, Inc. Method and apparatus for automatic generation of vocal emotion in a synthetic text-to-speech system
GB2291571A (en) 1994-07-19 1996-01-24 Ibm Text to speech system; acoustic processor requests linguistic processor output
US5592585A (en) * 1995-01-26 1997-01-07 Lernout & Hauspie Speech Products N.V. Method for electronically generating a spoken message
US5920840A (en) 1995-02-28 1999-07-06 Motorola, Inc. Communication system and method using a speaker dependent time-scaling technique
WO1997016134A1 (en) 1995-10-30 1997-05-09 Clifford Jay Griffin Artificial larynx with pressure sensitive frequency control
US6006175A (en) * 1996-02-06 1999-12-21 The Regents Of The University Of California Methods and apparatus for non-acoustic speech characterization and recognition
US5729694A (en) * 1996-02-06 1998-03-17 The Regents Of The University Of California Speech coding, reconstruction and recognition using acoustics and electromagnetic waves
JP3687181B2 (en) * 1996-04-15 2005-08-24 Sony Corporation Voiced/unvoiced sound determination method and apparatus, and voice encoding method
JP3006677B2 (en) 1996-10-28 2000-02-07 NEC Corporation Voice recognition device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6067518A (en) * 1994-12-19 2000-05-23 Matsushita Electric Industrial Co., Ltd. Linear prediction speech coding apparatus
US6332121B1 (en) * 1995-12-04 2001-12-18 Kabushiki Kaisha Toshiba Speech synthesis method
US5737719A (en) * 1995-12-19 1998-04-07 U S West, Inc. Method and apparatus for enhancement of telephonic speech signals

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060189462A1 (en) * 2005-01-14 2006-08-24 Nautilus, Inc. Exercise device
US10071016B2 (en) 2005-07-01 2018-09-11 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Systems for recovery from motor control via stimulation to a substituted site to an affected area
US8579839B2 (en) 2005-07-01 2013-11-12 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Methods for recovery from motor control via stimulation to a substituted site to an affected area
US20090187124A1 (en) * 2005-07-01 2009-07-23 The Government Of The Usa, As Represented By The Secretary, Dept. Of Health & Human Services Systems and methods for recovery from motor control via stimulation to a substituted site to an affected area
US20100049103A1 (en) * 2005-07-01 2010-02-25 The Usa As Represented By The Secretary, Dept Of Health And Human Services Systems and methods for recovery from motor control via stimulation to a substituted site to an affected area
EP2382958A1 (en) * 2005-07-01 2011-11-02 The Government of the United States of America, as represented by the Secretary, Department of Health and Human Services Systems for recovery of motor control via stimulation to a substitute site for an affected area
JP2008544832A (en) * 2005-07-01 2008-12-11 The USA As Represented By The Secretary, Department Of Health And Human Services System and method for restoring motor control via stimulation of an alternative site instead of an affected part
US8388561B2 (en) 2005-07-01 2013-03-05 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Systems and methods for recovery from motor control via stimulation to a substituted site to an affected area
AU2011201177B2 (en) * 2005-07-01 2013-03-21 The Usa As Represented By The Secretary, Department Of Health And Human Services Systems and methods for recovery of motor control via stimulation to a substituted site to an affected area
US20090054980A1 (en) * 2006-03-30 2009-02-26 The Government Of The U.S., As Represented By The Secretary, Department Of Health And Human Services Device for Volitional Swallowing with a Substitute Sensory System
US8852074B2 (en) 2006-03-30 2014-10-07 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Device for volitional swallowing with a substitute sensory system
US8449445B2 (en) 2006-03-30 2013-05-28 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Device for volitional swallowing with a substitute sensory system
US8190432B2 (en) * 2006-09-13 2012-05-29 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancing method, and speech recording method
US20080065381A1 (en) * 2006-09-13 2008-03-13 Fujitsu Limited Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancing method, and speech recording method
US8808207B2 (en) 2008-09-16 2014-08-19 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Systems and methods for recovery from motor control via stimulation to a substituted site to an affected area
US11344471B2 (en) 2013-03-13 2022-05-31 Passy-Muir, Inc. Systems and methods for stimulating swallowing
US11850203B2 (en) 2013-03-13 2023-12-26 Passy-Muir, Inc. Systems and methods for stimulating swallowing
CN108461090 (en) * 2017-02-21 2018-08-28 Acer Incorporated Speech signal processing device and audio signal processing method
US11850205B2 (en) * 2019-04-19 2023-12-26 Passy-Muir, Inc. Methods of vibrationally exciting a laryngeal nerve
US11413214B2 (en) * 2019-04-19 2022-08-16 Passy-Muir, Inc. Methods of vibrationally exciting a laryngeal nerve
US11419784B2 (en) * 2019-04-19 2022-08-23 Passy-Muir, Inc. Vibratory nerve exciter
US20220378650A1 (en) * 2019-04-19 2022-12-01 Passy-Muir, Inc. Methods of vibrationally exciting a laryngeal nerve
US20230290333A1 (en) * 2019-05-06 2023-09-14 Gn Hearing A/S Hearing apparatus with bone conduction sensor

Also Published As

Publication number Publication date
US6795807B1 (en) 2004-09-21

Similar Documents

Publication Publication Date Title
US6795807B1 (en) Method and means for creating prosody in speech regeneration for laryngectomees
US11878169B2 (en) Somatic, auditory and cochlear communication system and method
Traunmüller et al. Acoustic effects of variation in vocal effort by men, women, and children
US7162415B2 (en) Ultra-narrow bandwidth voice coding
Syrdal et al. Applied speech technology
Doi et al. Alaryngeal speech enhancement based on one-to-many eigenvoice conversion
KR101475894B1 (en) Method and apparatus for improving disordered voice
KR20170071585A (en) Systems, methods, and devices for intelligent speech recognition and processing
Fuchs et al. The new bionic electro-larynx speech system
Strik et al. Control of fundamental frequency, intensity and voice quality in speech
EP1271469A1 (en) Method for generating personality patterns and for synthesizing speech
Greenberg et al. The analysis and representation of speech
JPH05307395A (en) Voice synthesizer
Raitio Hidden Markov model based Finnish text-to-speech system utilizing glottal inverse filtering
Barney XLIV A discussion of some technical aspects of speech aids for postlaryngectomized patients
Deng et al. Speech analysis: the production-perception perspective
JPH0475520B2 (en)
JP3742206B2 (en) Speech synthesis method and apparatus
Raitio Voice source modelling techniques for statistical parametric speech synthesis
Lawlor A novel efficient algorithm for voice gender conversion
JP3368949B2 (en) Voice analysis and synthesis device
JP2019087798A (en) Voice input device
WO2001018791A1 (en) Pitch variation in artificial speech
Nakamura Speaking-aid systems using statistical voice conversion for electrolaryngeal speech
Hagmüller Speech enhancement for disordered and substitution voices

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION