US007295980B2
(12) United States Patent (10) Patent No.: US 7,295,980 B2
Garner et al. (45) Date of Patent: Nov. 13, 2007
(54) PATTERN MATCHING METHOD AND APPARATUS
(75) Inventors: Philip Neil Garner, Guildford (GB);
Jason Peter Andrew Charlesworth, Guildford (GB); Asako Higuchi, Tokyo (JP)
(73) Assignee: Canon Kabushiki Kaisha, Tokyo (JP)
( * ) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
(21) Appl. No.: 11/513,064
(22) Filed: Aug. 31, 2006
(65) Prior Publication Data
US 2007/0150275 A1 Jun. 28, 2007
Related U.S. Application Data
(62) Division of application No. 10/111,051, filed as application No. PCT/GB00/04112 on Oct. 25, 2000, now Pat. No. 7,212,968.
(30) Foreign Application Priority Data
Oct. 28, 1999 (GB) 9925560.6
Oct. 28, 1999 (GB) 9925561.4
Oct. 13, 2000 (GB) 0025143.9
(51) Int. Cl.
G10L 15/00 (2006.01)
G10L 15/04 (2006.01)
(52) U.S. Cl. 704/254; 704/231; 704/251
(58) Field of Classification Search 704/254
See application file for complete search history.
(56) References Cited
U.S. PATENT DOCUMENTS
4,736,429 A 4/1988 Niyada et al 381/43
4,783,803 A * 11/1988 Baker et al 704/252
4,903,305 A 2/1990 Gillick et al 381/41
4,975,959 A 12/1990 Benbassat 381/41
4,980,918 A 12/1990 Bahl et al 381/43
4,985,924 A 1/1991 Matsuura 381/43
5,075,896 A 12/1991 Wilcox et al 382/39
5,131,043 A 7/1992 Fujii et al 381/41
(Continued)
FOREIGN PATENT DOCUMENTS
EP 0 597 798 5/1994
(Continued)
OTHER PUBLICATIONS
Kenney Ng, et al., "Subword Unit Representations for Spoken Document Retrieval", Spoken Language Systems Group, MIT Laboratory for Computer Science, Cambridge, MA.
Witbrock, M.J. et al., "Using Words and Phonetic Strings for Efficient Information Retrieval from Imperfectly Transcribed Spoken Documents", School of Computer Science, Carnegie Mellon University.
(Continued)
Primary Examiner: Talivadis Ivars Smits
Assistant Examiner: Justin W. Rider
(74) Attorney, Agent, or Firm: Fitzpatrick, Cella, Harper & Scinto
5,136,655 A 8/1992 Bronson 381/41
5,202,952 A 4/1993 Gillick et al 395/2
5,327,521 A * 7/1994 Savic et al 704/272
5,333,275 A 7/1994 Wheatley et al 395/2.52
5,390,278 A 2/1995 Gupta et al 379/88.01
5,500,920 A 3/1996 Kupiec 395/2.79
5,577,249 A 11/1996 Califano 395/611
5,594,641 A 1/1997 Kaplan et al 395/601
5,638,425 A 6/1997 Meador, III et al 379/88
5,640,487 A 6/1997 Lau et al 395/2.52
5,649,060 A 7/1997 Ellozy et al 395/2.87
5,675,706 A 10/1997 Lee et al 395/2.65
5,680,605 A 10/1997 Torres 395/603
5,684,925 A 11/1997 Morin et al 395/2.63
5,708,759 A 1/1998 Kemeny 395/2.63
5,721,939 A 2/1998 Kaplan 395/759
5,729,741 A 3/1998 Liaguno et al 395/615
5,737,489 A 4/1998 Chou et al 395/2.65
5,737,723 A 4/1998 Riley et al 704/243
5,752,227 A 5/1998 Lyberg 704/235
5,781,884 A 7/1998 Pereira et al 704/260
5,787,414 A 7/1998 Miike et al 707/2
5,799,267 A 8/1998 Siegel 704/1
5,835,667 A 11/1998 Wactlar et al 386/96
5,852,822 A 12/1998 Srinivasan et al 707/4
5,870,740 A 2/1999 Rose et al 707/5
5,873,061 A 2/1999 Haeb-Umbach et al 704/254
5,907,821 A 5/1999 Kaji et al 704/4
5,983,177 A 11/1999 Wu et al 704/244
5,999,902 A 12/1999 Scahill et al 704/240
6,006,182 A 12/1999 Fakhr et al 704/251
6,023,536 A 2/2000 Visser 382/310
6,026,398 A 2/2000 Brown et al 707/5
6,061,679 A 5/2000 Bournas et al 707/3
6,070,140 A 5/2000 Tran 704/275
6,122,613 A 9/2000 Baker 704/252
6,172,675 B1 1/2001 Ahmad et al 345/328
6,182,039 B1 * 1/2001 Rigazio et al 704/257
6,192,337 B1 2/2001 Ittycheriah et al 704/231
6,236,964 B1 5/2001 Tamura et al 704/254
6,243,676 B1 6/2001 Witteman 704/243
6,243,680 B1 6/2001 Gupta et al 704/260
6,272,242 B1 8/2001 Saitoh et al 382/187
6,289,140 B1 9/2001 Oliver 382/313
6,314,400 B1 11/2001 Klakow 704/257
6,321,226 B1 11/2001 Garber et al 707/10
6,389,395 B1 5/2002 Ringland 704/254
6,463,413 B1 10/2002 Applebaum et al 704/256
6,466,907 B1 10/2002 Ferrieux et al 704/254
6,487,532 B1 11/2002 Schoofs et al 704/251
6,490,563 B2 12/2002 Hon et al 704/260
6,535,849 B1 * 3/2003 Pakhomov et al 704/235
6,535,850 B1 3/2003 Bayya 704/239
6,567,778 B1 5/2003 Chao Chang et al 704/257
6,567,816 B1 5/2003 Desai et al 707/102
6,662,180 B1 12/2003 Aref et al 707/6
7,054,812 B2 * 5/2006 Charlesworth et al 704/251
2002/0022960 A1 2/2002 Charlesworth et al 704/251
2002/0052740 A1 5/2002 Charlesworth et al 704/220
WO WO 99/05681 2/1999
WO WO 00/31723 6/2000
WO WO 00/54168 9/2000
WO WO 02/27546 4/2002
OTHER PUBLICATIONS
R. Haeb-Umbach, et al., "Automatic Transcription of Unknown Words in a Speech Recognition System", IEEE (1995) pp. 840-843.
F. Schiel, et al., "The Partitur Format at BAS", In Proc. of the First Int'l Conference on Language Resources and Evaluation, Granada, Spain, 1998.
Wright, J. et al., "Statistical Models for Topic Identification Using Phoneme Substrings", IEEE (1996), pp. 307-310.
Schmid, P. et al., "Automatically generated word pronunciations from phoneme classifier output", Statistical Signal and Array Processing, Minneapolis, Apr. 27-30, 1993, (ICASSP) Proceedings, New York, IEEE, vol. 4, pp. 223-226.
Jain N. et al., "Creating speaker-specific phonetic templates with a speaker-independent phonetic recognizer, Implications for voice dialing", 1996 IEEE (ICASSP) Proceedings vol. 2, pp. 881-884.
"Template Averaging For Adapting A Dynamic Time Warping Speech", IBM Technical Disclosure Bulletin, IBM Corp. New York, Apr. 1, 1990, vol. 32, No. 11, pp. 422-426.
Gerber, C., "A General Approach to Speech Recognition", Proceedings of the Final Workshop on Multimedia Information Retrieval (MIRO '95), Glasgow, Scotland (Sep. 18-20, 1995) pp. 1-12.
Zobel J. et al., "Phonetic String Matching, Lessons From Information Retrieval", SIGIR Forum, Association for Computing Machinery, New York, 1996, pp. 166-172.
Okawa, S. et al., "Automatic Training of Phoneme Dictionary Based on Mutual Information Criterion", IEEE (1994), pp. 241-244.
Rahim, M. et al., "A Neural Tree Network for Phoneme Classification With Experiments on the Timit Database", IEEE (1992), pp. II-345-II-348.
Sankoff, D., et al., "Time Warps, String Edits, And Macromolecules: The Theory And Practice Of Sequence Comparison", Bell Laboratories and David Sankoff (1983), Ch. One, pp. 1-44; Part Three, pp. 211-214; Ch. Eleven, pp. 311-321; Ch. Sixteen, pp. 359-362.
Wang, H., "Retrieval of Mandarin Spoken Documents Based on Syllable Lattice Matching", Pattern Recognition Letters (Jun. 2000), 21, pp. 615-624.
Berge, C., "Graphs And Hypergraphs", North Holland Mathematical Library, Amsterdam (1976) p. 175.
Gelin, P. & Wellekens, C. J., "Keyword Spotting for Video Soundtrack Indexing", 1996 IEEE Int. Conf. on Acoustics, Speech, and Sig. Proc., ICASSP-96, Conference Proceedings (May 7-10, 1996), vol. 1, pp. 299-302.
James, D.A. & Young, S. J., "A Fast Lattice-Based Approach To Vocabulary Independent Wordspotting", 1994 IEEE Int. Conf. on Acoustics, Speech and Sig. Proc., ICASSP-94, vol. 1 (Adelaide, Australia, Apr. 19-22, 1994), pp. I-377-380.
Cassidy, S. & Harrington, J., "EMU: An Enhanced Hierarchical Speech Data Management System," Proceedings of the 6th Australian Speech Science and Technology Conference, (Adelaide, 1996), pp. 381-386.
Gagnoulet, C., et al., "Marievox: A Voice-Activated Information System," Speech Communication, vol. 10, No. 1 (Feb. 1991), pp. 23-31.
Bird, S. & Liberman, M., "Towards a Formal Framework for Linguistic Annotations," International Conference On Spoken Language Processing, (Sydney, Australia, Dec. 1998).
Bird, S. & Liberman, M., "A Formal Framework for Linguistic Annotation," Aug. 13, 1999.
Bird, S. & Liberman, M., "A Formal Framework for Linguistic Annotation," Mar. 1999.
Wold, E., et al., "Content-Based Classification, Speech, and Retrieval of Audio," IEEE Multimedia, vol. 3, No. 3 (Fall 1996), pp. 27-38.
Wechsler, M., "Spoken Document Retrieval Based on Phoneme Recognition," Swiss Federal Institute of Technology, Zurich (1998).
Bahl, L. R., et al., "A Method For The Construction Of Acoustic Markov Models For Words," IEEE Transactions on Speech and Audio Processing, vol. 1, No. 4 (Oct. 1993), pp. 443-452.
Markowitz, J.A., Using Speech Recognition, Prentice Hall (1996), pp. 220-221.
Srinivasan, S., et al., "Phonetic Confusion Matrix Based Spoken Document Retrieval," Proceedings Of The 23rd Annual International ACM SIGIR Conference On Research And Development In Information Retrieval (Jul. 24-28, 2000), pp. 81-87.
Stalling, John, "Classic Maximum Entropy", Maximum Entropy and Bayesian Methods, Kluwer Academic Publishers (1989), pp. 45-51.
Lee, Kai-Fu, Automatic Speech Recognition: The Development Of The SPHINX System, Kluwer Academic Publishers (1989), pp. 28-29.
P. Jokinen, "A Comparison of Approximate String Matching Algorithms", Software-Practice and Experience, vol. 26(12), 1996, pp. 1439-1458.
S. Besling, "A Statistical Approach to Multilingual Phonetic Transcription", Philips J. Res. 49 (1995) pp. 367-379.
D. Sankoff et al., "Time Warps, String Edits, and Macromolecules", CSLI Publications (1999) ISBN: 1-57586-217-4.
M. Wechsler, "Spoken Document Retrieval Based on Phoneme Recognition", DISS. ETH. No. 12879 (1998) pp. 1-120.
J.T. Foote, et al., "Unconstrained Keyword Spotting Using Phone Lattices with Application to Spoken Document Retrieval", Computer Speech and Language (1997), 11, pp. 207-224.
Y. Kobayashi et al., "Matching Algorithms Between a Phonetic Lattice and Two Types of Templates: Lattice and Graph", Department of Computer Sciences, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto, Japan, pp. 1597-1600 (1985).
G. Micca, et al., "Three Dimensional DP for Phonetic Lattice Matching", Digital Signal Processing, Elsevier Science Publishers (1987), pp. 547-551.
Kenney Ng, et al., "Phonetic Recognition for Spoken Document Retrieval", Spoken Language Systems Group, MIT Laboratory for Computer Science, Cambridge, MA, ICASSP (1998).
Kenney Ng, "Survey of Approaches to Information Retrieval of Speech Messages", Laboratory for Computer Science, Massachusetts Institute of Technology (1996), pp. 1-34.
* cited by examiner