
(12) United States Patent
Junqua

(10) Patent No.: US 7,069,214 B2
(45) Date of Patent: Jun. 27, 2006

(54) FACTORIZATION FOR GENERATING A LIBRARY OF MOUTH SHAPES

(75) Inventor: Jean-Claude Junqua, Santa Barbara, CA (US)

(73) Assignee: Matsushita Electric Industrial Co., Ltd., Osaka (JP)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 816 days.

(21) Appl. No.: 10/095,813

(22) Filed: Mar. 12, 2002

(65) Prior Publication Data

US 2002/0152074 A1 Oct. 17, 2002

Related U.S. Application Data

(63) Continuation-in-part of application No. 09/792,928, filed on Feb. 26, 2001.

(51) (52) (58) [classification fields not legible in source]

(56) References Cited

U.S. PATENT DOCUMENTS

5,608,839 A *      3/1997  Chen             704/235
6,112,177 A        8/2000  Cosatto et al.
6,188,776 B1 *     2/2001  Covell et al.    382/100
2003/0072482 A1 *  4/2003  Brand            382/154

OTHER PUBLICATIONS

Bregler et al., "Video Rewrite: Driving Visual Speech with Audio," AVSP, 1997, pp. 153-156.*

Ezzat et al., "MikeTalk: A Talking Facial Display Based on Morphing Visemes," Proc. of the Computer Animation Conference, Philadelphia, Pa., Jun. 1998.*

Shih et al., "Efficient Adaptation of TTS Duration Model to New Speakers," ICSLP, 1998.*

Bregler et al., "Video Rewrite: Driving Visual Speech with Audio," Proc. ACM SIGGRAPH 1997, in Computer Graphics Proceedings, Annual Conference Series, 1997.*

Bregler et al., "Video Rewrite: Visual Speech Synthesis from Video," Proc. of the AVSP '97 Workshop, Rhodes (Greece), Sep. 26-27, 1997.*
* cited by examiner
Primary Examiner—V. Paul Harper

(74) Attorney, Agent, or Firm—Harness, Dickey & Pierce, PLC

(57) ABSTRACT

A library of mouth shapes is created by separating speaker dependent and speaker independent variability. Preferably, speaker dependent variability is modeled by a speaker space while the speaker independent variability (i.e., context dependency) is modeled by a set of normalized mouth shapes that need be built only once. Given a small amount of data from a new speaker, it is possible to construct a corresponding mouth shape library by estimating a point in speaker space that maximizes the likelihood of adaptation data and by combining speaker dependent and speaker independent variability. Creation of talking heads is simplified because creation of a library of mouth shapes is enabled with only a few mouth shape instances. To build the speaker space, a context independent mouth shape parametric representation is obtained. Then a supervector containing the set of context-independent mouth shapes is formed for each speaker included in the speaker space. Dimensionality reduction is used to find the axes of the speaker space.

20 Claims, 4 Drawing Sheets
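
The two steps named in the abstract, building a speaker space from per-speaker supervectors and locating a new speaker in it from a few observed mouth shape parameters, can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`dot`, `speaker_space`, `solve`, `adapt`) and the toy data are hypothetical, principal component analysis by power iteration is assumed as the dimensionality reduction, and an isotropic Gaussian noise model is assumed so that the maximum-likelihood point reduces to a least-squares fit. Combining the result with the normalized, context-dependent mouth shapes is not covered.

```python
import math

# Hypothetical helpers for illustration only; none of these names come from the patent.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def speaker_space(supervectors, k, iters=500):
    """Return (mean, axes): the mean supervector and k principal axes.

    Assumption: PCA via power iteration with deflation stands in for the
    unspecified dimensionality reduction. k must not exceed the data rank.
    """
    n, dim = len(supervectors), len(supervectors[0])
    mean = [sum(col) / n for col in zip(*supervectors)]
    centered = [[x - m for x, m in zip(v, mean)] for v in supervectors]
    axes = []
    for _axis in range(k):
        w = [float(j + 1) for j in range(dim)]  # arbitrary deterministic start
        for _step in range(iters):
            for a in axes:  # remove components along axes already found
                p = dot(w, a)
                w = [x - p * y for x, y in zip(w, a)]
            scores = [dot(c, w) for c in centered]
            # Apply the scatter matrix: sum over speakers of (c . w) * c
            w = [sum(s * c[j] for s, c in zip(scores, centered)) for j in range(dim)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                raise ValueError("k exceeds the rank of the training data")
            w = [x / norm for x in w]
        axes.append(w)
    return mean, axes

def solve(A, b):
    """Solve a small square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def adapt(mean, axes, observed):
    """Estimate a new speaker's point in speaker space from partial data.

    observed maps supervector index -> measured value. Under the assumed
    isotropic Gaussian noise, the least-squares fit is the ML estimate.
    Returns (coords, full): speaker-space coordinates and the full supervector.
    """
    idx = sorted(observed)
    resid = [observed[i] - mean[i] for i in idx]
    sub = [[a[i] for i in idx] for a in axes]  # axes restricted to observed entries
    gram = [[dot(u, v) for v in sub] for u in sub]
    rhs = [dot(u, resid) for u in sub]
    coords = solve(gram, rhs)
    full = [m + sum(c * a[j] for c, a in zip(coords, axes)) for j, m in enumerate(mean)]
    return coords, full

# Toy data: five training speakers whose supervectors vary along two directions.
base = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d1 = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
d2 = [0.0, 1.0, 0.0, -1.0, 0.0, 2.0]

def make(s, t):
    return [b + s * x + t * y for b, x, y in zip(base, d1, d2)]

speakers = [make(s, t) for s, t in [(-2, 0), (2, 0), (0, -1), (0, 1), (1, 0.5)]]
mean, axes = speaker_space(speakers, k=2)

# New speaker: only three of the six supervector entries are observed.
target = make(1.5, -0.7)
coords, full = adapt(mean, axes, {i: target[i] for i in (0, 1, 3)})
print(coords, full)
```

In this toy setting the new speaker lies exactly in the two-dimensional speaker space, so the three observed entries are enough to recover the remaining three. With real, noisy data the fit would only approximate the unobserved mouth shapes.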
