CA2017703C - Text-to-speech synthesizer having formant-rule and speech-parameter synthesis modes - Google Patents

Text-to-speech synthesizer having formant-rule and speech-parameter synthesis modes

Info

Publication number
CA2017703C
CA2017703C
Authority
CA
Canada
Prior art keywords
speech
formant
phoneme
group
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA002017703A
Other languages
French (fr)
Other versions
CA2017703A1 (en)
Inventor
Yukio Mitome
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Publication of CA2017703A1
Application granted
Publication of CA2017703C
Anticipated expiration
Status: Expired - Fee Related

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/02Methods for producing synthetic speech; Speech synthesisers

Abstract

ABSTRACT OF THE DISCLOSURE

A text-to-speech synthesizer comprises an analyzer that decomposes a sequence of input characters into phoneme components and classifies them as a first group of phoneme components or a second group, depending on whether they are to be synthesized by a speech parameter or by a formant rule, respectively. Speech parameters derived from natural human speech are stored in first memory locations corresponding to the phoneme components of the first group, and the stored speech parameters are recalled from the first memory in response to each of the phoneme components of the first group.
Formant rules capable of generating formant transition patterns are stored in second memory locations corresponding to the phoneme components of the second group, the formant rules being recalled from the second memory in response to each of the phoneme components of the second group. Formant transition patterns are derived from the formant rule recalled from the second memory, and formants of the derived transition patterns are converted into corresponding speech parameters. Spoken words are digitally synthesized from the speech parameters recalled from the first memory as well as from the converted speech parameters.

Description

TITLE OF THE INVENTION

"Text-to-Speech Synthesizer Having Formant-Rule And Speech-Parameter Synthesis Modes"

BACKGROUND OF THE INVENTION

The present invention relates generally to speech synthesis systems, and more particularly to a text-to-speech synthesizer.
Two approaches are available for text-to-speech synthesis systems.
In the first approach, speech parameters are extracted from human speech by analyzing semisyllables, consonants and vowels and their various combinations, and stored in memory. Text inputs are used to address the memory to read speech parameters, and an original sound corresponding to an input character string is reconstructed by concatenating the speech parameters. As described in "Japanese Text-to-Speech Synthesizer Based On Residual Excited Speech Synthesis", Kazuo Hakoda et al., ICASSP '86 (International Conference On Acoustics, Speech and Signal Processing '86, Proceedings 45-8, pages 2431 to 2434), the Linear Predictive Coding (LPC) technique is employed to analyze human speech into consonant-vowel (CV) sequences, vowel (V) sequences, vowel-consonant (VC) sequences and vowel-vowel (VV) sequences as speech units, and speech parameters known as LSP (Line Spectrum Pair) are extracted from the analyzed speech units. Text input is represented by speech units, and speech parameters corresponding to the speech units are concatenated to produce continuous speech parameters. These speech parameters are given to an LSP synthesizer.
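As an illustration of this first approach, the following is a minimal sketch of parameter concatenation. The unit inventory, frame layout and names (UNIT_TABLE, concatenate_units) are hypothetical, not taken from the cited system; only the look-up-and-concatenate scheme comes from the description above.

```python
# Minimal sketch of speech-parameter concatenation (first approach).
# UNIT_TABLE and its frame layout are hypothetical: each speech unit
# (a CV, V, VC or VV sequence) maps to a list of per-frame parameter
# vectors extracted offline from natural speech.

UNIT_TABLE = {
    "s-e": [[0.12, 0.45, 0.31], [0.14, 0.44, 0.30]],  # consonant-vowel unit
    "e-i": [[0.20, 0.40, 0.28], [0.22, 0.38, 0.27]],  # vowel-vowel unit
}

def concatenate_units(unit_names):
    """Concatenate stored parameter frames for a string of speech units."""
    frames = []
    for name in unit_names:
        frames.extend(UNIT_TABLE[name])  # look up pre-analyzed frames
    return frames  # fed to the parameter-driven synthesizer

print(concatenate_units(["s-e", "e-i"]))
```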
Although a high degree of articulation can be obtained if a sufficient number of high-quality speech units are collected, there is a substantial difference between sounds collected from speech units and those appearing in texts, resulting in a loss of naturalness. For example, a concatenation of recorded semisyllables lacks smoothness in the synthesized speech and gives an impression that they were simply linked together.
According to the second approach, rules for formants are derived from strings of phonemes and stored in a memory, as described in "Speech Synthesis And Recognition", pages 81 to 101, J. N. Holmes, Van Nostrand Reinhold (UK) Co. Ltd. Speech sounds are synthesized from the formant transition patterns by reading the formant rules from the memory in response to an input character string. While this technique is advantageous for improving the naturalness of speech by repetitive experiments of synthesis, the formant rules are difficult to improve for consonants because of their short durations and low power levels, resulting in a low degree of articulation with respect to consonants.
SUMMARY OF THE INVENTION

It is therefore an object of the present invention to provide a text-to-speech synthesizer which provides a high degree of articulation and a high degree of flexibility to improve the naturalness of synthesized speech.

This object is obtained by combining the advantageous features of speech-parameter synthesis and formant-rule-based speech synthesis.
According to the present invention, there is provided a text-to-speech synthesizer which comprises an analyzer that decomposes a sequence of input characters into phoneme components and classifies them as a first group of phoneme components or a second group if they are to be synthesized by a speech parameter or by a formant rule, respectively. Speech parameters derived from natural human speech are stored in first memory locations corresponding to the phoneme components of the first group, and the stored speech parameters are recalled from the first memory in response to each of the phoneme components of the first group. Formant rules capable of generating formant transition patterns are stored in second memory locations corresponding to the phoneme components of the second group, the formant rules being recalled from the second memory in response to each of the phoneme components of the second group. Formant transition patterns are derived from the formant rule recalled from the second memory. A parameter converter is provided for converting formants of the derived formant transition patterns into corresponding speech parameters. A speech synthesizer is responsive to the speech parameters recalled from the first memory and to the speech parameters converted by the parameter converter for synthesizing a human speech.
BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be described in further detail with reference to the accompanying drawings, in which:

Fig. 1 is a block diagram of a rule-based text-to-speech synthesizer of the present invention;

Fig. 2 shows details of the parameter memory of Fig. 1;

Fig. 3 shows details of the formant rule memory of Fig. 1;

Fig. 4 is a block diagram of the parameter converter of Fig. 1;

Fig. 5 is a timing diagram associated with the parameter converter of Fig. 4; and

Fig. 6 is a block diagram of the digital speech synthesizer of Fig. 1.

DETAILED DESCRIPTION

In Fig. 1, there is shown a text-to-speech synthesizer according to the present invention. The synthesizer of this invention generally comprises a text analysis system 10 of well-known circuitry and a rule-based speech synthesis system 20. Text analysis system 10 is made up of a text-to-phoneme conversion unit 11 and a prosodic rule procedural unit 12. A text input, or a string of characters, is fed to the text analysis system 10 and converted into a string of phonemes. If a word "say" is the text input, it is translated into a string of phonetic signs "s [t 120] ei [t 90, f (0, 120) (30, 140) ....]", where t in the brackets [ ] indicates the duration (in milliseconds) of the phoneme preceding the left bracket, and the numerals in each parenthesis respectively represent the time (in milliseconds) with respect to the beginning of the phoneme preceding the left bracket and the frequency (in Hz) of a component of the phoneme at each instant of time.
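To make the annotation concrete, here is one possible in-code representation of such an annotated phoneme string; the class and field names are hypothetical, and only the duration/pitch-point convention comes from the notation above.

```python
from dataclasses import dataclass, field

@dataclass
class Phoneme:
    """One annotated phoneme from the text analysis system (hypothetical layout)."""
    symbol: str                 # phonetic sign, e.g. "s" or "ei"
    duration_ms: int            # the t value inside the brackets
    pitch_points: list = field(default_factory=list)  # (time_ms, freq_hz) pairs

# The word "say" as produced by the text analysis stage:
# "s [t 120] ei [t 90, f (0, 120) (30, 140) ....]"
say = [
    Phoneme("s", 120),
    Phoneme("ei", 90, [(0, 120), (30, 140)]),
]
```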
Rule-based speech synthesis system 20 comprises a phoneme string analyzer 21 connected to the output of text analysis system 10 and a mode discrimination table 22 which is accessed by the analyzer 21 with the input phoneme strings. Mode discrimination table 22 is a dictionary that holds a multitude of sets of phoneme strings and corresponding synthesis modes indicating whether the corresponding phoneme strings are to be synthesized with a speech parameter or a formant rule. The application of the phoneme strings from analyzer 21 to table 22 will cause phoneme strings having the same phoneme as the input string to be sequentially read out of table 22 into analyzer 21 along with corresponding synthesis mode data. Analyzer 21 seeks a match between each of the constituent phonemes of the input string and each phoneme in the output strings from table 22 by ignoring the brackets in both the input and output strings.
Using the above example, there will be a match between the input characters "se" and "s[e]" in the output string, and the corresponding mode data indicates that the character "s" is to be synthesized using a formant rule. Analyzer 21 proceeds to detect a further match between the characters "ei" of the input string and the characters "ei" of the output string "[s]ei", which is classified as one to be synthesized with a speech parameter. If a "parameter mode" indication is given by table 22, analyzer 21 supplies a corresponding phoneme to a parameter address table 24 and communicates this fact to a sequence controller 23. If a "formant mode" indication is given, analyzer 21 supplies a corresponding phoneme to a formant rule address table 28 and communicates this fact to controller 23.
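The mode discrimination step can be pictured as a dictionary lookup. The following sketch is hypothetical (the table contents and function names are illustrative), but it follows the two-way classification just described.

```python
# Hypothetical mode discrimination table: phoneme-string entries mapped to
# the synthesis mode described above ("formant" rule vs. speech "parameter").
MODE_TABLE = {
    "s[e]": "formant",    # consonant "s" before "e": synthesize by formant rule
    "[s]ei": "parameter", # vowel "ei" after "s": synthesize by speech parameter
}

def classify(entry):
    """Return which synthesis path a matched phoneme string takes."""
    mode = MODE_TABLE[entry]
    if mode == "parameter":
        return "first group: parameter address table 24"
    return "second group: formant rule address table 28"

print(classify("s[e]"))   # -> second group: formant rule address table 28
print(classify("[s]ei"))  # -> first group: parameter address table 24
```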
Sequence controller 23 supplies various timing signals to all parts of the system. During a parameter synthesis mode, controller 23 applies a command signal to a parameter memory 25 to permit it to read its contents in response to an address from table 24 and supply its output to the right position of a switch 27, and thence to a digital speech synthesizer 32. During a rule synthesis mode, controller 23 supplies timing signals to a formant rule memory 29 to cause it to read its contents, in response to an address given by address table 28, into formant pattern generator 30, which is also controlled to provide its output to a parameter converter 31.

Parameter address table 24 holds parameter-related phoneme strings as its entries, starting addresses respectively corresponding to the entries and identifying the beginning of storage locations of memory 25, and numbers of data sets contained in each storage location of memory 25. For example, the phoneme string "[s]ei" has a corresponding starting address "XXXXX" of a location of memory 25 in which "400" data sets are stored.
According to linear predictive coding techniques, coefficients known as AR (Auto-Regressive) parameters are used as equivalents to LPC parameters. These parameters can be obtained by a computer analysis of human speech with a relatively small amount of computation to approximate the spectrum of speech, while ensuring a high level of articulation. Parameter memory 25 stores the AR parameters as well as ARMA (Auto-Regressive Moving Average) parameters, which are also known in the art. As shown in Fig. 2, parameter memory 25 stores source codes, AR parameters ai and MA parameters bi. Data in each item are addressed by a starting address supplied from parameter address table 24. The source code includes entries for identifying the type of a source wave (noise or periodic pulse) and the amplitude of the source wave. A starting address is supplied from table 24 to memory 25 to read a source code and AR and MA parameters in the amount indicated by the corresponding quantity data. The AR parameters are supplied in the form of a series of digital data a1 ~ a2N and the MA parameters as a series of digital data b1 ~ b2N, and coupled through the right position of switch 27 to synthesizer 32.
Formant rule address table 28 contains phoneme strings as its entries and addresses of the formant rule memory 29 corresponding to the phoneme strings. In response to a phoneme string supplied from analyzer 21, a corresponding address is read out of address table 28 into formant rule memory 29.
As shown in Fig. 3, formant rule memory 29 stores a set of formants and preferably a set of antiformants that are used by formant pattern generator 30 to generate formant transition patterns. Each formant is defined by frequency data F(ti, fi) and bandwidth data B(ti, bi), where t indicates time, f indicates frequency, and b indicates bandwidth, and each antiformant is defined by frequency data AF(ti, fi) and bandwidth data AB(ti, bi). The formant and antiformant data are sequentially read out of memory 29 into formant pattern generator 30 as a function of a corresponding address supplied from address table 28. Formant pattern generator 30 produces a set of frequency and bandwidth parameters for each formant transition and supplies its output to parameter converter 31. Details of formant pattern generator 30 are described in pages 84 to 90 of "Speech Synthesis And Recognition" referred to above.
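A formant transition pattern can be produced from these (time, value) breakpoints by interpolation. The sketch below is a hypothetical illustration (the breakpoint values, frame rate and function name are invented), not the generator described in the cited reference; it assumes each track starts with a breakpoint at time 0.

```python
def formant_track(breakpoints, duration_ms, frame_ms=10):
    """Linearly interpolate (time_ms, value) breakpoints into per-frame values."""
    track = []
    for t in range(0, duration_ms, frame_ms):
        # latest breakpoint at or before t (a breakpoint at time 0 is assumed)
        p0 = max((p for p in breakpoints if p[0] <= t), key=lambda p: p[0])
        later = [p for p in breakpoints if p[0] > t]
        if not later:
            track.append(p0[1])          # hold the last value
            continue
        p1 = min(later, key=lambda p: p[0])
        frac = (t - p0[0]) / (p1[0] - p0[0])
        track.append(p0[1] + frac * (p1[1] - p0[1]))
    return track

# Frequency breakpoints F(ti, fi) for one transition (invented values):
print(formant_track([(0, 120), (30, 140)], duration_ms=60))
```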
The effect of parameter converter 31 is to convert the formant parameter sequence from pattern generator 30 into a sequence of speech synthesis parameters of the same format as those stored in parameter memory 25.
As illustrated in Fig. 4, parameter converter 31 comprises a coefficients memory 40, a coefficient generator 41, a digital all-zero filter 42 and a digital unit impulse generator 43. Memory 40 includes a frequency table 50 and a bandwidth table 51 for respectively receiving frequency and bandwidth parameters from the formant pattern generator 30. Each of the frequency parameters in table 50 is recalled in response to the frequency value F or AF from the formant pattern generator 30 and represents the cosine of the displacement angle of a resonance pole for each formant frequency as given by C = cos(2πF/fs), where F is the frequency parameter of either a formant or an antiformant and fs represents the sampling frequency. On the other hand, each of the parameters in table 51 is recalled in response to the bandwidth value B or AB from the pattern generator 30 and represents the radius of the pole for each bandwidth as given by R = exp(-πB/fs), where B is the bandwidth parameter from generator 30 for both formants and antiformants.
Coefficient generator 41 is made up of a C-register 52 and an R-register 53 which are connected to receive data from tables 50 and 51, respectively. The output of C-register 52 is multiplied by "2" by a multiplier 54 and supplied through a switch 55 to a multiplier 56, where it is multiplied with the output of R-register 53 to produce a first-order coefficient A which is equal to 2×C×R when switch 55 is positioned to the left in response to a timing signal from controller 23. When switch 55 is positioned to the right in response to a timing signal from controller 23, the output of R-register 53 is squared by multiplier 56 to produce a second-order coefficient B which is equal to R×R.
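In code, the conversion from a (frequency, bandwidth) pair to the two section coefficients follows directly from the formulas above; this is a minimal sketch, with an illustrative function name and test values.

```python
import math

def pole_coefficients(freq_hz, bandwidth_hz, fs_hz):
    """Convert a formant (or antiformant) to second-order filter coefficients.

    C = cos(2*pi*F/fs) is the cosine of the pole's displacement angle and
    R = exp(-pi*B/fs) is the pole radius; the section coefficients are
    A = 2*C*R (first order) and B = R*R (second order).
    """
    c = math.cos(2.0 * math.pi * freq_hz / fs_hz)
    r = math.exp(-math.pi * bandwidth_hz / fs_hz)
    return 2.0 * c * r, r * r  # (A, B)

# Illustrative values: a 500 Hz formant with 60 Hz bandwidth at fs = 10 kHz.
A, B = pole_coefficients(500.0, 60.0, 10000.0)
print(A, B)
```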
Digital all-zero filter 42 comprises a selector means 57 and a series of digital second-order transversal filters 58-1 ~ 58-N which are connected from unit impulse generator 43 to the left position of switch 27. The signals A and B from generator 41 are alternately supplied through selector 57 as a sequence (-A1, B1), (-A2, B2), ..., (-AN, BN) to transversal filters 58-1 ~ 58-N, respectively. Each transversal filter comprises a tapped delay line consisting of delay elements 60 and 61. Multipliers 62 and 63 are coupled respectively to successive taps of the delay line for multiplying digital values appearing at the respective taps with the digital values A and B from selector 57. The output of impulse generator 43 and the outputs of multipliers 62 and 63 are summed together by an adder 64 and fed to the succeeding transversal filter. Data representing a unit impulse is generated by impulse generator 43 in response to an enable pulse from controller 23. This unit impulse is successively converted into a series of impulse responses, or digital values a1 ~ a2N of different height and polarity, as formant parameters, as shown in Fig. 5, and supplied through the left position of switch 27 to speech synthesizer 32. Likewise, a series of digital values b1 ~ b2N is generated as antiformant parameters in response to a subsequent digital unit impulse.
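The cascade of second-order all-zero sections is equivalent to multiplying out the polynomials 1 - A_i z^-1 + B_i z^-2, so feeding a unit impulse through the cascade reads out the 2N product coefficients a1 ~ a2N. A minimal sketch under that reading, with invented section coefficients:

```python
def cascade_impulse_response(sections):
    """Impulse response of cascaded second-order all-zero filters.

    Each section has taps (1, -A, B), so the cascade's impulse response is
    the coefficient list of the product of the polynomials 1 - A*z^-1 + B*z^-2.
    """
    response = [1.0]  # unit impulse through an empty cascade
    for a, b in sections:
        taps = [1.0, -a, b]
        out = [0.0] * (len(response) + 2)
        for i, x in enumerate(response):     # discrete convolution
            for j, t in enumerate(taps):
                out[i + j] += x * t
        response = out
    return response  # leading 1 followed by the 2N parameters a1 .. a2N

# Two hypothetical sections (A_i, B_i) from coefficient generator 41:
print(cascade_impulse_response([(1.8, 0.94), (1.2, 0.90)]))
```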
In Fig. 6, speech synthesizer 32 is shown as comprising a digital source wave generator 70 which generates noise or a periodic pulse in digital form. During a parameter synthesis mode, speech synthesizer 32 is responsive to a source code supplied through a selector means 71 from the output of switch 27, and during a rule synthesis mode it is responsive to a source code supplied from controller 23. The output of source wave generator 70 is fed to an input adder 72 whose output is coupled to an output adder 76. A tapped delay line consisting of delay elements 73-1 ~ 73-2N is connected to the output of adder 72, and tap-weight multipliers 74-1 ~ 74-2N are connected respectively to successive taps of the delay line to supply weighted successive outputs to input adder 72. Similarly, tap-weight multipliers 75-1 ~ 75-2N are connected respectively to successive taps of the delay line to supply weighted successive outputs to output adder 76. The tap weights of multipliers 74-1 to 74-2N are respectively controlled by the tap-weight values a1 through a2N, supplied sequentially through selector 71, to reflect the AR parameters, and those of multipliers 75-1 to 75-2N are respectively controlled by the digital values b1 through b2N, which are also supplied sequentially through selector 71, to reflect the ARMA parameters. In this way, spoken words are digitally synthesized at the output of adder 76 and coupled through an output terminal 77 to a digital-to-analog converter, not shown, where they are converted to analog form.
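The adder/delay-line arrangement reads as a direct-form ARMA filter: the input adder realizes the auto-regressive feedback with weights a_k, and the output adder realizes the moving-average feedforward with weights b_k. A minimal sketch under that reading follows; the sign conventions and test data are assumptions, not taken from the patent.

```python
def arma_synthesize(source, ar, ma):
    """Direct-form ARMA synthesis filter in the spirit of Fig. 6.

    w[n] = x[n] + sum_k ar[k] * w[n-1-k]   (input adder 72, multipliers 74)
    y[n] = w[n] + sum_k ma[k] * w[n-1-k]   (output adder 76, multipliers 75)
    Sign conventions on the tap weights are an assumption of this sketch;
    ar and ma are assumed to have equal length (the 2N taps).
    """
    delay = [0.0] * len(ar)  # tapped delay line 73-1 .. 73-2N
    out = []
    for x in source:
        w = x + sum(a * d for a, d in zip(ar, delay))
        y = w + sum(b * d for b, d in zip(ma, delay))
        delay = [w] + delay[:-1]  # shift the delay line
        out.append(y)
    return out

# Hypothetical excitation (periodic pulse train) and parameters (2N = 4 taps):
pulse_train = [1.0 if n % 50 == 0 else 0.0 for n in range(200)]
print(arma_synthesize(pulse_train,
                      ar=[0.5, -0.2, 0.1, -0.05],
                      ma=[0.3, 0.1, 0.0, 0.0])[:8])
```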
The foregoing description shows only one preferred embodiment of the present invention. Various modifications are apparent to those skilled in the art without departing from the scope of the present invention, which is only limited by the appended claims. For example, the ARMA parameters could be dispensed with depending on the degree of quality required.

Claims (5)

1. A text-to-speech synthesizer comprising:
analyzer means for decomposing a sequence of input characters into phoneme components and classifying the decomposed phoneme components as a first group of phoneme components if each phoneme component is to be synthesized by a speech parameter and classifying said phoneme components as a second group of phoneme components if each phoneme component is to be synthesized by a formant rule;
first memory means for storing speech parameters derived from natural human speech, said speech parameters corresponding to the phoneme components of said first group and being recalled from said first memory means in response to each of the phoneme components of the first group;
second memory means for storing formant rules capable of generating formant transition patterns, said formant rules corresponding to the phoneme components of said second group and being recalled from said second memory means in response to each of the phoneme components of the second group;
means for deriving formant transition patterns from the formant rule recalled from said second memory means;
parameter converter means for converting formants of said derived formant transition patterns into corresponding speech parameters; and speech synthesizer means responsive to the speech parameters recalled from said first memory means and to the speech parameters converted by said parameter converter means for synthesizing a human speech.
2. A text-to-speech synthesizer as claimed in claim 1, wherein said speech parameters stored in said first memory means are represented by auto-regressive (AR) parameters, and said formants of said derived formant transition patterns are represented by frequency and bandwidth values, wherein said parameter converter means comprises:
means for converting the frequency value of said formants into a value equal to C = cos(2πF/fs), where F is said frequency value and fs represents a sampling frequency, and converting the bandwidth value of said formants into a value equal to R = exp(-πB/fs), where B is the bandwidth value;
means for generating a first signal representative of a value 2×C×R and a second signal representative of a value R²;
unit impulse generator for generating a unit impulse; and a series of second-order transversal filters connected in series from said unit impulse generator to said speech synthesizer means, each of said second-order transversal filters including a tapped delay line, first and second tap-weight multipliers connected respectively to successive taps of said tapped delay line, and an adder for summing the outputs of said multipliers with said unit impulse, said first and second multipliers multiplying signals at said successive taps with said first and second signals, respectively.
3. A text-to-speech synthesizer as claimed in claim 1, wherein said speech parameters in said first memory means are represented by auto-regressive (AR) parameters and auto-regressive moving average (ARMA) parameters, and said formant rules in said second memory means being further capable of generating antiformant transition patterns, each of said formants and said antiformants being represented by frequency and bandwidth values, wherein said parameter converter means comprises:
means for converting the frequency value of said formants into a value equal to C = cos(2πF/fs), where F is said frequency value and fs represents a sampling frequency, and converting the bandwidth value of said formants into a value equal to R = exp(-πB/fs), where B is the bandwidth value;
means for generating a first signal representative of a value 2×C×R and a second signal representative of a value R²;
unit impulse generator means for generating a unit impulse; and a series of second-order transversal filters connected in series from said unit impulse generator to said speech synthesizer means, each of said second-order transversal filters including a tapped delay line, first and second tap-weight multipliers connected respectively to successive taps of said tapped delay line, and an adder for summing the outputs of said multipliers with said unit impulse, said first and second multipliers multiplying signals at said successive taps with said first and second signals, respectively.
4. A text-to-speech synthesizer as claimed in claim 1, wherein said analyzer means comprises a memory for storing a plurality of phoneme component strings and a corresponding number of indications classifying said phoneme component strings as falling into said first group or said second group, and means for detecting a match between a decomposed phoneme component and a phoneme component in said phoneme component strings and classifying the decomposed phoneme component as said first or second group according to the corresponding indication if said match is detected.
5. A text-to-speech synthesizer as claimed in claim 1, wherein said speech synthesizer means comprises:
source wave generator means for generating a source wave;
input and output adders connected in series from said source wave generator means to an output terminal of said text-to-speech synthesizer;
a tapped delay line connected to the output of said input adder;
a plurality of first tap-weight multipliers having input terminals respectively connected to successive taps of said tapped-delay line and output terminals connected to input terminals of said input adder, said first tap-weight multipliers respectively multiplying signals at said successive taps with signals supplied from said first memory means and said parameter converter means; and a plurality of second tap-weight multipliers having input terminals respectively connected to successive taps of said tapped-delay line and output terminals connected to input terminals of said output adder, said second tap-weight multipliers respectively multiplying signals at said successive taps with signals supplied from said first memory means and said parameter converter means.
CA002017703A 1989-05-29 1990-05-29 Text-to-speech synthesizer having formant-rule and speech-parameter synthesis modes Expired - Fee Related CA2017703C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP1135595A JPH031200A (en) 1989-05-29 1989-05-29 Regulation type voice synthesizing device
JP1-135595 1989-05-29

Publications (2)

Publication Number Publication Date
CA2017703A1 (en) 1990-11-29
CA2017703C (en) 1993-11-30

Family

ID=15155495

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002017703A Expired - Fee Related CA2017703C (en) 1989-05-29 1990-05-29 Text-to-speech synthesizer having formant-rule and speech-parameter synthesis modes

Country Status (3)

Country Link
US (1) US5204905A (en)
JP (1) JPH031200A (en)
CA (1) CA2017703C (en)

Families Citing this family (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0573100A (en) * 1991-09-11 1993-03-26 Canon Inc Method and device for synthesising speech
JPH05181491A (en) * 1991-12-30 1993-07-23 Sony Corp Speech synthesizing device
JP2782147B2 (en) * 1993-03-10 1998-07-30 日本電信電話株式会社 Waveform editing type speech synthesizer
CA2119397C (en) * 1993-03-19 2007-10-02 Kim E.A. Silverman Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation
US6502074B1 (en) * 1993-08-04 2002-12-31 British Telecommunications Public Limited Company Synthesising speech by converting phonemes to digital waveforms
US5987412A (en) * 1993-08-04 1999-11-16 British Telecommunications Public Limited Company Synthesising speech by converting phonemes to digital waveforms
JP3450411B2 (en) * 1994-03-22 2003-09-22 キヤノン株式会社 Voice information processing method and apparatus
US5633983A (en) * 1994-09-13 1997-05-27 Lucent Technologies Inc. Systems and methods for performing phonemic synthesis
US5787231A (en) * 1995-02-02 1998-07-28 International Business Machines Corporation Method and system for improving pronunciation in a voice control system
US6038533A (en) * 1995-07-07 2000-03-14 Lucent Technologies Inc. System and method for selecting training text
US5751907A (en) * 1995-08-16 1998-05-12 Lucent Technologies Inc. Speech synthesizer having an acoustic element database
US6240384B1 (en) * 1995-12-04 2001-05-29 Kabushiki Kaisha Toshiba Speech synthesis method
US5761640A (en) * 1995-12-18 1998-06-02 Nynex Science & Technology, Inc. Name and address processor
US5832433A (en) * 1996-06-24 1998-11-03 Nynex Science And Technology, Inc. Speech synthesis method for operator assistance telecommunications calls comprising a plurality of text-to-speech (TTS) devices
JPH10153998A (en) * 1996-09-24 1998-06-09 Nippon Telegr & Teleph Corp <Ntt> Auxiliary information utilizing type voice synthesizing method, recording medium recording procedure performing this method, and device performing this method
US5956667A (en) * 1996-11-08 1999-09-21 Research Foundation Of State University Of New York System and methods for frame-based augmentative communication
US5924068A (en) * 1997-02-04 1999-07-13 Matsushita Electric Industrial Co. Ltd. Electronic news reception apparatus that selectively retains sections and searches by keyword or index for text to speech conversion
US6587822B2 (en) * 1998-10-06 2003-07-01 Lucent Technologies Inc. Web-based platform for interactive voice response (IVR)
US6870914B1 (en) * 1999-01-29 2005-03-22 Sbc Properties, L.P. Distributed text-to-speech synthesis between a telephone network and a telephone subscriber unit
US6400809B1 (en) * 1999-01-29 2002-06-04 Ameritech Corporation Method and system for text-to-speech conversion of caller information
US6618699B1 (en) * 1999-08-30 2003-09-09 Lucent Technologies Inc. Formant tracking based on phoneme information
US20020007315A1 (en) * 2000-04-14 2002-01-17 Eric Rose Methods and apparatus for voice activated audible order system
JP2002169581A (en) * 2000-11-29 2002-06-14 Matsushita Electric Ind Co Ltd Method and device for voice synthesis
JP2004149000A (en) 2002-10-30 2004-05-27 Showa Corp Gas cylinder device for vessel
DE50305344D1 (en) * 2003-01-29 2006-11-23 Harman Becker Automotive Sys Method and apparatus for restricting the scope of search in a dictionary for speech recognition
US7308407B2 (en) * 2003-03-03 2007-12-11 International Business Machines Corporation Method and system for generating natural sounding concatenative synthetic speech
GB2412046A (en) * 2004-03-11 2005-09-14 Seiko Epson Corp Semiconductor device having a TTS system to which is applied a voice parameter set
US7536304B2 (en) * 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
US8452604B2 (en) * 2005-08-15 2013-05-28 At&T Intellectual Property I, L.P. Systems, methods and computer program products providing signed visual and/or audio records for digital distribution using patterned recognizable artifacts
JP4878538B2 (en) * 2006-10-24 2012-02-15 株式会社日立製作所 Speech synthesizer
JP5093239B2 (en) * 2007-07-24 2012-12-12 パナソニック株式会社 Character information presentation device
CN110459211B (en) 2018-05-07 2023-06-23 阿里巴巴集团控股有限公司 Man-machine conversation method, client, electronic equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4467440A (en) * 1980-07-09 1984-08-21 Casio Computer Co., Ltd. Digital filter apparatus with resonance characteristics
JPS57142022A (en) * 1981-02-26 1982-09-02 Casio Comput Co Ltd Resonance characteristic controlling system in digital filter
JPS6054680B2 (en) * 1981-07-16 1985-11-30 カシオ計算機株式会社 LSP speech synthesizer
US4597318A (en) * 1983-01-18 1986-07-01 Matsushita Electric Industrial Co., Ltd. Wave generating method and apparatus using same
US4692941A (en) * 1984-04-10 1987-09-08 First Byte Real-time text-to-speech conversion system
JPS62215299A (en) * 1986-03-17 1987-09-21 富士通株式会社 Sentence reciting apparatus
US4829573A (en) * 1986-12-04 1989-05-09 Votrax International, Inc. Speech synthesizer
JPS63285597A (en) * 1987-05-18 1988-11-22 ケイディディ株式会社 Phoneme connection type parameter rule synthesization system
DE58903262D1 (en) * 1988-07-06 1993-02-25 Rieter Ag Maschf SYNCHRONIZABLE DRIVE SYSTEMS.
US4979216A (en) * 1989-02-17 1990-12-18 Malsheen Bathsheba J Text to speech synthesis system and method using context dependent vowel allophones

Also Published As

Publication number Publication date
JPH031200A (en) 1991-01-07
US5204905A (en) 1993-04-20
CA2017703A1 (en) 1990-11-29

Similar Documents

Publication Publication Date Title
CA2017703C (en) Text-to-speech synthesizer having formant-rule and speech-parameter synthesis modes
US4220819A (en) Residual excited predictive speech coding system
US4979216A (en) Text to speech synthesis system and method using context dependent vowel allophones
US5400434A (en) Voice source for synthetic speech system
JP3294604B2 (en) Processor for speech synthesis by adding and superimposing waveforms
US3909533A (en) Method and apparatus for the analysis and synthesis of speech signals
EP0239394B1 (en) Speech synthesis system
EP0384587A1 (en) Voice synthesizing apparatus
US6829577B1 (en) Generating non-stationary additive noise for addition to synthesized speech
O'Shaughnessy Design of a real-time French text-to-speech system
van Rijnsoever A multilingual text-to-speech system
Furtado et al. Synthesis of unlimited speech in Indian languages using formant-based rules
JP3059751B2 (en) Residual driven speech synthesizer
JP3081300B2 (en) Residual driven speech synthesizer
JP3083624B2 (en) Voice rule synthesizer
JPH09179576A (en) Voice synthesizing method
JPH0258640B2 (en)
JP2956936B2 (en) Speech rate control circuit of speech synthesizer
Sassi et al. A text-to-speech system for Arabic using neural networks
Teixeira et al. Automatic system of reading numbers
JPH037999A (en) Voice output device
Eady et al. Pitch assignment rules for speech synthesis by word concatenation
Yazu et al. The speech synthesis system for an unlimited Japanese vocabulary
KR970003092B1 (en) Method for constituting speech synthesis unit and sentence speech synthesis method
KR950013373B1 (en) Speech message suppling device and speech message reviving method

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed