US20020097244A1 - System and method for automatic animation generation - Google Patents


Info

Publication number
US20020097244A1
Authority
US
United States
Prior art keywords
animation
character
file
gesture
dialog
Prior art date
Legal status
Granted
Application number
US09/112,692
Other versions
US6433784B1
Inventor
Richard Merrick
Michael Thenhaus
Wesley Bell
Mark Zartler
Current Assignee
AFLUO LLC
Original Assignee
Learn2 Corp
Priority date
Filing date
Publication date
Priority claimed from US09/031,488 (external priority: US6636219B2)
Application filed by Learn2 Corp filed Critical Learn2 Corp
Priority to US09/112,692 (US6433784B1)
Assigned to 7TH LEVEL, INC. (assignment of assignors interest). Assignors: BELL, WESLEY; MERRICK, RICHARD; THENHAUS, MICHAEL; ZARTLER, MARK
Assigned to LEARN2 CORPORATION (merger and change of name). Assignor: 7TH LEVEL, INC.
Publication of US20020097244A1
Application granted
Publication of US6433784B1
Assigned to LEARN.COM, INC. (assignment of assignors interest). Assignor: LEARN2 CORPORATION
Assigned to SILICON VALLEY BANK (security agreement). Assignor: LEARN.COM, INC.
Released to LEARN.COM, INC. Assignor: SILICON VALLEY BANK
Assigned to L2 TECHNOLOGY, LLC (assignment of assignors interest). Assignor: LEARN.COM, INC.
Assigned to AFLUO, LLC (assignment of assignors interest). Assignor: L2 TECHNOLOGY, LLC
Anticipated expiration
Status: Expired - Lifetime


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • H04N21/8543: Content authoring using a description language, e.g. Multimedia and Hypermedia information coding Expert Group [MHEG], eXtensible Markup Language [XML]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048: Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481: Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00: Animation
    • G06T13/20: 3D [Three Dimensional] animation
    • G06T13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81: Monomedia components thereof
    • H04N21/8106: Monomedia components thereof involving special audio data, e.g. different tracks for different languages
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81: Monomedia components thereof
    • H04N21/8146: Monomedia components thereof involving graphical data, e.g. 3D object, 2D graphics
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85: Assembly of content; Generation of multimedia applications
    • H04N21/854: Content authoring
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G10L2015/025: Phonemes, fenemes or fenones being the recognition units
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06: Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10: Transforming into visible information
    • G10L2021/105: Synthesis of the lips movements from speech, e.g. for talking heads

Definitions

  • the system and method disclosed herein relate generally to animation production and more specifically to methods and systems for automatically generating animation for use in connection with Internet web pages and/or for embedding into applications such as Windows applications that use Microsoft Object Linking and Embedding (“OLE”) or other applications.
  • Examples include animated talking characters in Microsoft PowerPoint presentations, Microsoft Word word-processing documents, and Microsoft Outlook e-mail.
  • Other examples include playing the characters on a stand-alone window floating on the desktop outside of any other application such as an Internet browser or a productivity application.
  • Animated characters can be effective communication tools in many fields but generally are expensive to produce and change. They can be used in many settings, from making Internet pages more interesting, to customized presentations to groups or individuals, and even in interactive presentations, whether delivered through the Internet, by e-mail, or directly in an application such as PowerPoint or a word processor.
  • One key benefit of Internet based advertising is the availability of real time interaction with the audience (i.e. the Internet user). For example, it is possible for web developers, working at the behest of advertisers, to script multiple dialogs, scenes, and/or interactions in connection with a web site such that a visitor to that site may be made to feel that the “advertisement” was produced specifically for his or her interests. In other words, based upon the particular HTML links and selections that a user follows or makes, respectively, a user will be presented with information of specific interest to that user. This is in contrast to, for example, a television commercial, where an advertiser produces a commercial of general interest to the universe of its potential customers.
  • Web sites may include information taking the form of plain text, still photographs, still animation, movies, spoken words, scrolling text, dynamic animation and music among others. A combination of these forms of information can create a powerful, enjoyable and lasting image in the mind of the potential customer.
  • One aspect of web site content that is becoming increasingly popular is dynamic animation.
  • an animated character may appear on the user's display, move around the display in a lifelike fashion, point to various objects or text on the screen and speak to the user.
  • the dialog is synchronized with lip movements representing the phonemes being spoken so that it appears that the words are actually emanating from the character's mouth.
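  • The phoneme-to-lip-position synchronization described above can be sketched roughly as follows. This is an illustrative Python sketch only: the phoneme labels, viseme names, and the collapsing of repeated mouth positions are assumptions for demonstration, not the patent's actual tables or algorithm.

```python
# Map a timestamped phoneme sequence to mouth positions ("visemes") so
# lip movement can be synchronized with the spoken dialog. The mapping
# table below is a small invented example, not a complete phoneme set.

PHONEME_TO_VISEME = {
    "AA": "open_wide", "AE": "open_wide", "IY": "smile",
    "M": "closed", "B": "closed", "P": "closed",
    "F": "lip_bite", "V": "lip_bite", "OW": "round",
}

def lip_track(phonemes):
    """phonemes: list of (start_ms, label) -> list of (start_ms, viseme)."""
    track = []
    for start_ms, label in phonemes:
        viseme = PHONEME_TO_VISEME.get(label, "neutral")
        # Collapse consecutive identical mouth positions into one event.
        if not track or track[-1][1] != viseme:
            track.append((start_ms, viseme))
    return track
```

The resulting timed viseme list corresponds to the kind of mouth-position events the choreography file would carry alongside gesture commands.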
  • dynamic animation can provide an interesting, informative and fun environment through which products and services may be advertised.
  • a company may include its mascot (e.g. an animal, persona, fictional character) in its web page content. In this way, the mascot can walk around the web page, speak to the user and use hand and other body movements to convey messages to the user.
  • the mascot may point to specific items on the page, make movements and/or recite dialog based specifically and in real time upon user input. For example, in the case of a web site for the sale of automobiles, a user might click on the graphic of the particular model that interests him or her resulting in the display of a web page completely dedicated to that model. That page may also include the dynamic animation (often including dialog) representing the company's mascot welcoming the user to the page concerning the particular model. Additionally, the advantages of the real time interaction may be effected such that the character, for example, describes and points to various features of the car based upon user input (e.g. the user points to a portion of the automobile graphic which is of interest).
  • Dynamic animation is also generally referred to as “choreography” herein.
  • Choreography must generally be conceived and created by an individual having both artistic capabilities and technical knowledge of the animation environment. Having material choreographed is thus quite expensive in terms of both time and money.
  • a second difficulty arising in the creation of dynamic animation is the inherent difficulty in reusing such animation in significantly or even slightly different applications. For example, it is exceedingly difficult to reuse animation produced in accordance with a specific dialog with another dialog. In other words, it is a complex task to “re-purpose” choreography even after it is initially produced at great expense. Additionally, no tools that significantly automate this task are known to the inventors herein. Thus, borrowing from the above example, if an automobile salesman animation was produced with specific dialog to recite and point to features on the automobile as selected by the user, it would not be a simple or inexpensive task to use the same salesman character along with the same general class of body movements to add a discussion of a newly added automobile feature. On the contrary, it would heretofore be necessary to produce a new animation for synchronization with the new dialog.
  • the system and method described herein provide these and other advantages in the form of an easy-to-use tool for preparing animated characters for use on the Internet and in other applications that need not involve the Internet. Requiring only limited user input and selection, the system and method described herein essentially automatically choreograph and synchronize reusable animation components with dialog streams. Once generated, the resulting choreography can be embedded into a hypertext markup language (HTML) web page with an appropriate animation player and audio player plug-in to deliver any number of animated dialogues with minimal wait time and minimal developer effort.
  • the choreography can be embedded into other applications, such as, without limitation, Windows applications that use Microsoft Object Linking and Embedding (“OLE”); e.g., animated talking characters can be included in Microsoft PowerPoint presentations, Microsoft Word word processing, and Microsoft Outlook e-mail.
  • the talking characters can be played in a stand-alone window floating on top of the desktop outside of any other application such as an Internet browser or a productivity application.
  • the automatic animation preparation system described herein includes an animation preparation application that assigns dialog to pre-existing character templates and automatically generates lip movements, body movements, and behaviors synchronized with streamed audio dialog.
  • the AAPS interacts with a browser control (plug-in) located on the client.
  • the browser control includes an animation engine supporting AAPS generated animation and also supports runtime execution of audio streaming.
  • the AAPS can be embedded in other applications, such as Windows applications that use OLE, and can be played back in a standalone window floating on the desktop outside of any other application.
  • a non-limiting object of the system and method described herein is to provide a system and method for generating character animation that can be used in an Internet environment, or in other environments, and address the shortcomings discussed above.
  • Another non-limiting object is to provide a tool for automatically generating easily modifiable dynamic animations synchronized with audio content that can be implemented by embedding such animations in an Internet web page or in application or other software.
  • FIG. 1 is a block diagram of the automatic animation preparation system (AAPS) and the environment in which it operates;
  • FIG. 2 illustrates an example of a first dialog box used in connection with the AAPS
  • FIG. 3 illustrates an example of a second dialog box used in connection with the AAPS
  • FIG. 4 illustrates an example of a third dialog box used in connection with the AAPS.
  • FIG. 5 illustrates an example of a fourth dialog box used in connection with the AAPS.
  • the system and method described herein provide a flexible, convenient and inexpensive method by which dynamic animation may be essentially automatically produced for use in connection with an Internet web page or in other environments and application software.
  • the AAPS 80 which is disclosed herein is designed to offer a user-friendly, intuitive interface through which animation can be selected, processed, and included within a web page accessible to a user operating a client terminal having access to the generated web page, or embedded and used in other environments and in application software.
  • FIG. 1 illustrates a client/server environment whereby development occurs on the same server as the resulting real time animation.
  • It is also possible for the present system and method to operate with separate servers for development and storage of generated files.
  • It is also possible for AAPS 80 to exist in a standalone environment, e.g., on a personal computer, with transfer of files to an Internet server accomplished either by modem or some other communication device, or by copying onto transportable physical storage media.
  • HTML browser application 200 will now be described.
  • Browser application 200 preferably supports either Microsoft Internet Explorer version 3 or 4 or Netscape Navigator version 3 or 4, or any suitable successor product. Browser application 200 further preferably supports software such as one of the following: Microsoft NetShow, VivoActive, VDOLive, Liquid Audio, XING StreamWorks and/or RealAudio versions 3, 4 or 5. Browser application 200 also includes browser control 210 for processing animation generated by animation preparation application 100. Browser control 210 is preferably configured as a plug-in application for use with HTML browser application 200 and may always be resident or may be selectively resident as its use is required. In one illustrative embodiment, browser control 210 is the Topgun player available through 7th Level in Richardson, Texas, although browser control 210 can be any animation player application capable of supporting browser application 200 and the animation generated by AAPS 80.
  • Animation preparation application 100 takes input from various files and developer selections (examples of both are discussed below) and generates dynamic character animation as represented by multiple output files (also discussed below).
  • Animation preparation application 100 contains a number of components which collectively generate animation.
  • User interface control 140 interacts with developer terminal 110 so as to allow a developer working at developer terminal 110 to select and process dynamic animation characteristics in accordance with the system of the present invention.
  • user interface control 140 provides a Windows-based GUI and operates so that display and processing from the developer's point of view proceed according to “wizard” applications, which step the user through a task and which are now common in the Microsoft Windows environment.
  • Animation preparation application 100 also includes process control 170 which can incorporate both a physical processor and software or micro code for controlling the operation of animation preparation application 100 including the various components of animation preparation application 100 .
  • Animation preparation application 100 further includes various functional processes such as compression functionality 160 (to compress any data processed by animation preparation application 100 if necessary by, for example, encoding PCM wave data into one of a variety of audio formats of various bitrates), audio functionality 150 (to generate audio streaming data for playback at browser control 210 ), and character processing functionality 180 (to generate animation for playback at browser control 210 ).
  • Animation preparation application 100 references character database 135, which can, but need not, reside on secondary storage associated with the development server maintaining animation preparation application 100.
  • Character database 135 contains gesture data for any number of characters available to the developer in connection with the use of animation preparation application 100 . For each character, a fixed number of gestures associated with that character is also provided. The number of characters stored in character database 135 can be on the order of 5-50 characters but any number of characters, subject to storage and implementation issues, can be used.
  • Dialog database 125 is used to store audio clips available for use in connection with animation preparation application 100. It is also possible to provide a microphone or other recording means whereby the developer can record additional audio clips, either for storage in dialog database 125 for later use or for direct, immediate use by animation preparation application 100.
  • the first file generated by animation preparation application 100 can be referred to as the RealAudio Choreography (RAQ) file 138. While the discussion assumes the use of a RealAudio compatible player at the client, the system and method described herein can also be practiced with other players, and all of the files described below can easily be generated so as to be compatible with other players.
  • the RAQ file 138 contains lip synchronization information which corresponds to the dialog selected from dialog database 125 . This file may be converted, using available tools to generate an event file corresponding to the player employed at the client.
  • a .INF file 112 is also generated by animation preparation application 100 .
  • This file includes version information respecting the various files and applications which should be used in playing back animation.
  • Once HTML browser application 200 has received (through an Internet download) .INF file 112, it is able to request the correct files (as indicated by the contents of .INF file 112) from the animation server. Additionally, animation preparation application 100 generates control BIN file 108, which holds a set of pre-compiled character assets and behaviors.
  • animation preparation application 100 generates one or more resource segment files 105 which correspond to character models and contain components which may be composited together to form animation.
  • Using AAPS 80, the developer is prompted through several dialog boxes for character and behavior selection, dialog file import selection, and various options to select and generate an animated character for use within an HTML web page.
  • digitized dialog is automatically analyzed in order to extract phonetic information so that the proper lip positions for the selected character can be assigned.
  • Default choreography can also be automatically assigned by the animation preparation application 100 through an analysis of dialog features such as pauses, time between pauses, audio amplitude and occasional random audio activity.
  • a selected character's inherent personality traits can also be factored into the generation of default choreography. For example, one character may scratch his head while another puts his hands on his hips.
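  • The default-choreography analysis described above (pauses, amplitude, and similar dialog features) could be sketched as follows. This is an assumption-laden illustration: the thresholds, frame size, and gesture names are invented for demonstration and are not taken from the patent.

```python
# Derive default gesture events from simple audio features: silence
# frames trigger a rest pose, loud frames trigger an emphasis gesture.
# A real implementation would also use time between pauses and the
# character's personality traits, per the text.

def default_choreography(amplitudes, frame_ms=100,
                         silence=0.05, emphasis=0.8):
    """amplitudes: per-frame normalized amplitude (0..1) ->
    list of (time_ms, gesture_name) events."""
    events = []
    for i, amp in enumerate(amplitudes):
        t = i * frame_ms
        if amp < silence:
            events.append((t, "idle_gesture"))      # pause -> rest pose
        elif amp > emphasis:
            events.append((t, "emphasis_gesture"))  # loud -> hand gesture
    return events
```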
  • the resulting default choreography is preferably output into RealAudio Choreography (RAQ) file 138 (or other choreography file), which can be converted to a RAE or other event file, to trigger animation events at the user's computer through HTML browser application 200 and specifically browser control 210.
  • Selected character behaviors and assets are pre-compiled into binary control file (BIN file) 108 and corresponding resource segments 105 for initial installation prior to installation into HTML page file 115 .
  • An optional security lock may also be implemented so that playback can occur only from a specified URL.
  • Another advantage of pre-compiling characters into segment files 105 is that the resulting animation preparation time when the animation is executed by HTML browser application 200 is significantly reduced.
  • behaviors and assets may be interleaved with the audio stream and processed dynamically.
  • As a result of processing by animation preparation application 100, a series of HTML tags is generated and placed on the Windows clipboard or saved to HTML clip file 165. These tags contain all the necessary object embedding information and other parameters for direct insertion into a web page as reflected in HTML page file 115. In addition to the HTML tag file 165, the audio stream file 195, the control BIN file 108 and the segment files 105 discussed above, animation preparation application 100 also preferably generates .INF file 112, which contains associated resource files and version information.
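  • The tag-generation step could be sketched as follows. The tag structure, attribute names, and file names below are hypothetical, chosen only to illustrate emitting object-embedding markup ready for pasting into an HTML page; the actual parameters are not given in the text.

```python
# Emit a simple object-embedding HTML fragment referencing the control
# file and audio stream, of the kind placed on the clipboard or saved
# to an HTML clip file for insertion into a web page.

def embed_tags(control_file, audio_file, width=160, height=200):
    return (
        f'<OBJECT WIDTH="{width}" HEIGHT="{height}">\n'
        f'  <PARAM NAME="Control" VALUE="{control_file}">\n'
        f'  <PARAM NAME="Audio" VALUE="{audio_file}">\n'
        f'</OBJECT>'
    )
```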
  • the system and method described herein also provides a mechanism for overriding the default animation generated by animation preparation application 100 .
  • a developer may desire to override default behavior and manually select one or more specific gestures available in character database 135 .
  • a character may be talking about items in an online store and need to point in the direction of the items—say, to the character's left. In such case, even if the default animation does not provide this result, the developer can easily modify the default animation to meet his or her needs as discussed below.
  • the character may need to react with specific behaviors based upon user input in a web page or from a Java or Visual Basic (VB) script.
  • the static override option enables the developer to modify choreography file 138 containing the choreography information.
  • the choreography generated by animation preparation application 100 is stored in choreography file 138 and presented as a list of timed gestures or high-level behavior commands such as turn left, walk right or jump in the air, interleaved with timed mouth positions.
  • gestures can be manually added, modified or removed from the linear sequence with any text editor.
  • a simple syntax (discussed below) can be used so as to allow for easy identification and modification and so as to allow clear differentiation between gesture commands and mouth position commands.
  • Several additional commands are also supported, including setting user input events (mouse, keyboard) or event triggers for launching web pages or other character animations.
  • choreography file 138 can then be used in connection with HTML Page File 115 and be made available for download and execution by HTML browser application 200 .
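  • The choreography file's actual syntax is not specified in the text; the sketch below assumes a simple line format in which gesture lines are prefixed `G` and mouth-position lines `M`, purely to illustrate how timed gesture commands and mouth-position commands could be clearly differentiated and edited with any text editor.

```python
# Parse a hypothetical choreography file into separate timed gesture
# and mouth-position lists. Line format assumed: "<G|M> <time_ms> <cmd>".

def parse_choreography(text):
    gestures, mouths = [], []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        kind, t, cmd = parts[0], int(parts[1]), parts[2]
        (gestures if kind == "G" else mouths).append((t, cmd))
    return gestures, mouths
```

With such a format, a developer performing a static override need only add, modify, or delete `G` lines in a text editor.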
  • Static override can also be accomplished by allowing a user to embed specific commands, as described above, in the dialog file either in place of or in addition to the gesture file.
  • the web developer can issue gesture commands (index parameters to browser control 210 referencing a particular gesture) from a Java or VB script embedded in HTML page file 115 .
  • a web page can cause a character to say different things based upon user input or a Java application.
  • HTML browser application 200 through browser control 210 , can issue a variety of callbacks which can be used to trigger Java or VB scripts to handle special cases such as control startup, content download, beginning sequence, end sequence, and control termination.
  • a Java script can, for example, respond to embedded triggers in the character's choreography stream to drive a parallel synchronous GIF or JPEG slide show next to the character or even a guided tour through a web site.
  • Animation preparation application 100 can include a set of pre-produced characters, including, for example, salespeople, teachers, web hosts and other alternative choices. These pre-produced characters and their associated gesture sets are stored in character database 135. Each character can have exactly the same number and type of basic gestures, with each gesture composed of approximately the same number of animation “cels”. For purposes herein, the term “cel” refers to an individual image used as part of a set of images to create animation. A “cel” may be thought of as a frame, or a layer of a frame, in an animation sequence. Within character database 135 each character may have entirely different characteristics, possibly making no two characters in character database 135 the same. Nevertheless, conforming the characters' “animation architecture” (i.e. same number and type of gestures, with each gesture composed of approximately the same number of cels) provides a basis for the generation of automatic choreography by animation preparation application 100 according to the teachings of the present invention.
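  • The conformed “animation architecture” idea can be sketched as follows: because every character exposes the same named gestures, choreography generated for one character can be re-targeted to another. The gesture names and the validation rule are assumptions for illustration, not the patent's actual gesture set.

```python
# A character conforms if it provides at least the required gesture set;
# conforming characters can share the same timed choreography.

REQUIRED_GESTURES = {"idle", "point_left", "point_right", "wave", "talk"}

def conforms(character):
    """character: dict mapping gesture name -> list of cels."""
    return set(character) >= REQUIRED_GESTURES

def retarget(choreography, character):
    """Re-use a timed gesture list with any conforming character."""
    assert conforms(character)
    return [(t, character[g]) for t, g in choreography if g in character]
```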
  • the end-user using HTML browser application 200 can set a pre-installed character to be their “personal web host” for use with all web pages based upon animation preparation application 100 generated HTML, thus obviating the need for repetitive character download to the end-user.
  • This can be effected by a user by, for example, right clicking on a given character in a web page and setting the “Personal Web Host” flag in the object menu. It is also possible for the developer, using animation preparation application 100 to override the user set flag and enforce download of a specific character.
  • a character can be relatively small, ranging in download size from 15K to 50K depending upon the level of sophistication and detail required or desired.
  • each gesture for each character can reside in a separate “segment” file which can be downloaded progressively over time to create a “better than bandwidth” experience.
  • three dialogs can be created for a character where the first dialog uses a small model (15K), the second dialog uses a medium model (20K) containing all of the gestures of the small model as well as some additional gestures, and a third dialog (40K) includes yet some additional gestures.
  • the character models can be made available to the client through the distribution of a CD-ROM, other transportable storage medium or pre-loaded on a computer hard drive. In this way, a user can be provided with large character databases and attributes without the need to wait for download.
  • Choreography control information can be delivered prior to initiation of the audio stream or embedded and streamed with the audio for interpretation “on the fly”. In the latter case, callbacks can be made dynamically on the client to trigger lip movements and gestures.
  • each character is actually dynamically composited from a collection of parts such as body, arms, head, eyes and mouth layers, redundant animation cels are eliminated and therefore do not need to be downloaded.
  • animation sequences can be created through the use of animation preparation application 100 with reference to the body parts resident on the client for playback without any real-time download.
  • body parts can be positioned, timed and layered in a seemingly endless number of combinations to correspond to the desired dialog by subsequently downloading a very small control file (BIN file) 108, which is typically only a few thousand bytes.
  • the control file 108 need only reference body parts and positions already resident on the client to reflect the desired animation produced by animation preparation application 100 .
  • a standard Windows Help file is included as a component of the animation preparation application 100 .
  • the Help file can contain in-context information describing options for each selection screen as well as tutorials and examples explaining how to manipulate default character behaviors including the use of both static and dynamic overrides as discussed above.
  • FIG. 2 illustrates an example of a wizard dialog box that can be employed by animation preparation application 100 and specifically generated by user interface control 140 in the first step of the process for generating automatic dynamic animation
  • the first dialog box prompts the developer to select between automatically creating new choreography or using existing choreography.
  • the developer selects among these choices through “radio buttons”.
  • the developer proceeds to the second dialog box.
  • a Browse button becomes active to provide another dialog box for finding and selecting a previously generated choreography file. Again, after this selection, the developer proceeds to the second dialog box. Preferably, at the bottom of the dialog box, the buttons Help, Exit and Next are displayed.
  • the Help file can offer the developer context specific help with respect to the first dialog box.
  • the second dialog box is depicted in FIG. 3.
  • This box prompts the developer for selection of a character name through the use of a list box and provides a thumbnail image representative of each character when selected.
  • three radio buttons are preferably included allowing the user to select among a small, medium or large model for each character.
  • the character model limits or expands the number of gestures available for the selected character and may be selected as a tradeoff between download speed and animation richness.
  • the Help, Exit, Back, Next and Finish are provided at the bottom of the dialog box.
  • the Next button is grayed and the Finish button is active only when the developer has selected the use-existing-choreography-file option in the first dialog box.
  • the developer can optionally choose a different character for use with a preexisting or modified choreography file 175 before again using animation preparation application 100 to automatically generate animation.
  • Back returns the developer to the previous dialog box, and Finish accepts any remaining defaults and then completes the preparation of dynamic animation and all files associated therewith.
  • the third dialog box which is illustrated in FIG. 4, is used to prompt the developer to select a source WAV audio file as well as providing the ability to browse for an audio file or record a new one.
  • the third dialog box may include a Preview button (not shown) in order to allow the developer to hear a selected audio file.
  • the selected audio file preferably contains spoken dialog without background sound or noise which would make phoneme recognition difficult, even though it is possible for the resulting RealAudio file 195 or .WAV file 195 to contain music, sound effects and/or other audio.
  • An edit box is also provided for entry of the URL pointing to the encoded RealAudio file 195 or .WAV file 195 which is generated by animation preparation application 100.
  • the entered URL is also used by animation preparation application 100 to generate .INF file 112 and HTML tag file 115 .
  • An option may also be included whereby the developer can select a specific bit-rate with which to encode the audio. Encoding according to a specified bit-rate can be accomplished through the use of the RealAudio Software Development Kit (SDK) or other development tools corresponding to other players.
  • the third dialog also includes Exit, Help, Back, Next and Finish buttons which operate the same way as discussed above.
  • the fourth dialog box is illustrated in FIG. 5. This dialog is processed upon completion of the third dialog.
  • the fourth dialog box prompts the developer for choreography options using four mutually exclusive radio button options: FAVOR-LEFT, FAVOR-FRONT, FAVOR-BACK, and FAVOR-RIGHT.
  • Each of these options will cause animation preparation application 100 to tend towards selection of gestures and high-level behaviors which cause the character to orient toward a particular area of the web page or specific orientation with respect to the user.
  • the above four options are provided by way of example only and many other options might be provided either in addition to or instead of the four options above.
  • the options may reflect any particular behavior of the character which is preferred by the developer so long as the appropriate processing to accomplish the tendency is built into animation preparation application 100 .
  • choreography file 138 is generated.
  • This file may be converted to an event file using, for example, Real Network's WEVENTS.EXE in the case of a RealAudio RAC file.
  • RAC file 138 contains both a reference to the audio file and to a list of timed gestures in a “Gesture List” represented by segment files 105 .
  • RAC file 138 is hidden from the developer and automatically compiled for use with character assets contained in BIN control file 108 .
  • the segment files 105 and BIN control file 108 may be collectively compiled for immediate use in connection with HTML Page File 115 .
  • the developer can choose to edit the gestures in the segment files 105 in order to manually control character behavior as discussed above.
  • the RAC file 138 contains a series of references to gestures which are contained in the segment files 105 .
  • Each gesture reference is represented in RAC file 138 as a function call with two parameters: gesture number (constant identifier) and duration.
  • An example of a RAC file with a set of gesture references might be as follows: GestureList begin . . . Gesture(ARMS_UP, 2000) . . . end.
  • ARMS_UP is a command to move the character's arms upward and 2000 is the total time allocated to the gesture in milliseconds. In this case, if the actual animation requires only 500 milliseconds to execute, then the character would preferably be disposed to hold the ARMS_UP position for an additional 1500 milliseconds.
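The hold-time arithmetic described above can be sketched as follows. This is an illustrative sketch only; the function name is invented and is not part of the disclosed system.

```python
def hold_time(total_ms, animation_ms):
    """Return how long the final cel must be held so that the gesture
    occupies its full allotted duration (never negative)."""
    return max(0, total_ms - animation_ms)

# Gesture(ARMS_UP, 2000): the ARMS_UP animation itself takes 500 ms,
# so the character holds the final pose for the remaining time.
print(hold_time(2000, 500))   # 1500
```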
  • the use of a single entry point for gestures, rather than a different entry point for each gesture provides an open model for forward and backward compatibility between supported gestures and future streaming and control technologies.
  • the gesture list contained in RAC file 138 is automatically serviced by browser control 210 based upon callback events generated by browser control 210 .
  • Each gesture, mouth and event command is interpreted by browser control 210 essentially in real time, causing the animation to play in a manner that the user can perceive as being synchronous with the audio stream and external event message broadcasts (i.e. dynamic override events from user/JAVA control).
  • GestureList begin . . . Gesture(ARMS_UP, 2000) Gesture(ARMS_DOWN, 500) Gesture(EYES_BLINK, 1000) Gesture(ARMS_CROSS, 3000) . . . end
  • GestureList begin . . . Gesture(BODY_LEFT, 1000) Gesture(ARMS_LEFT, 3000) Gesture(EYES_BLINK, 2500) . . . end
  • phonetic information can be added as comments (using a predetermined delimiter) to help identify spoken dialog and timing.
  • mouth positions in RAC file 138 may be represented as follows: GestureList begin . . . Gesture(ARMS_UP, 2000) Mouth(LIP_A, 250) Mouth(LIP_C, 350) Mouth(LIP_B, 250) Mouth(LIP_A, 450) Gesture(ARMS_DOWN, 500) . . . end
  • LIP_A, LIP_B and LIP_C represent particular mouth positions and the number following the position is the length of time such mouth position is held.
  • generic mouth positions indicated must be converted into logical/physical mouth positions dynamically to correspond to the gesture pose in effect at any moment.
  • the mouth positions A, C, B, A should be changed into specific bitmaps depending on which gesture is being displayed, and composited onto the other character layers. This is discussed in greater detail below.
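The conversion of a generic mouth position into a gesture-specific bitmap might be sketched as a simple table lookup. The cel file names and table contents below are hypothetical; the disclosure specifies only that generic positions must be resolved against the gesture currently on screen before compositing.

```python
# Hypothetical mapping from (current gesture, generic mouth position)
# to a gesture-specific mouth cel; entries are invented for illustration.
MOUTH_CELS = {
    ("FACE_FRONT", "LIP_A"): "ff_mouth_a.bmp",
    ("FACE_FRONT", "LIP_B"): "ff_mouth_b.bmp",
    ("FACE_LEFT",  "LIP_A"): "fl_mouth_a.bmp",
}

def resolve_mouth_cel(current_gesture, generic_position):
    """Pick the physical mouth bitmap matching the pose now displayed,
    so it can be composited z-ordered above the body layers."""
    return MOUTH_CELS[(current_gesture, generic_position)]

print(resolve_mouth_cel("FACE_LEFT", "LIP_A"))  # fl_mouth_a.bmp
```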
  • each character is preferably produced to a template—gesture for gesture.
  • the template is set up to be general purpose and to include, by way of example, the following basic gestures: 1) Face Front (mouth positions active); 2) Face Left (flip for Face Right) (mouth positions active); 3) Face Rear Left (flip for Face Rear Right) (mouth positions active); 4) Foreshorten to Camera (mouth positions active); 5) Walk Cycle Left; 6) Arms Down; 7) Arms Up; 8) Arms Left (flip for Arms Right); 9) Arms Cross; 10) Arms Out (to implore or stop)
  • each locked head position should have the standard six mouth positions as well as eye blinks and common eye emotions such as eyes questioning, eyes concentrating, and eyes happy. All gestures can be animated forward or backward to central hookup position(s).
  • Each character, as discussed above, preferably has small, medium and large size versions of a given gesture, which use progressively more cels.
  • each pose which has active mouth positions preferably includes a corresponding set of planned behaviors and timings. This is discussed in further detail below.
  • the RealAudio player SDK should be used as necessary to provide audio streaming at various bit-rates and to maintain synchronization of character animation with audio stream file 195 .
  • Character choreography is delivered via RAC file 138 , which is automatically created by animation preparation application 100 and which can be converted to a RealAudio event file (or similar event file).
  • Speech recognition and emotive libraries and source code can be employed to provide automatic phoneme recognition using, for example, Voxware, VPI or AT&T Watson software.
  • using any SAPI-compliant text-to-speech processor, such as Lernout & Hauspie's TrueVoice, dialog text (input into animation preparation application 100) can be processed into phonemes for greater precision in synchronizing mouth positions to a dialog stream.
  • a dictionary providing mappings from common words to phonemes can be used.
  • Browser control 210 is also preferably configured to provide mapping of the 40 phonemes onto one of 6 mouth positions. However, the map in this case should be intrinsic with each character in that some characters may have different or more or less mouth positions.
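The many-to-few phoneme mapping can be sketched as follows. The phoneme groupings below are invented for illustration; the disclosure states only that roughly 40 phonemes map onto 6 mouth positions, with the exact map intrinsic to each character.

```python
# Hypothetical phoneme-to-mouth-position table (groupings invented).
PHONEME_TO_MOUTH = {}
for ph in ("AA", "AE", "AH"):          # open-mouth vowels
    PHONEME_TO_MOUTH[ph] = "LIP_A"
for ph in ("B", "M", "P"):             # closed-lip consonants
    PHONEME_TO_MOUTH[ph] = "LIP_B"
for ph in ("F", "V"):                  # lip-on-teeth consonants
    PHONEME_TO_MOUTH[ph] = "LIP_C"

def mouth_for(phoneme, character_map=PHONEME_TO_MOUTH):
    # Each character may supply its own map with more or fewer
    # positions; unknown phonemes fall back to a neutral position.
    return character_map.get(phoneme, "LIP_A")

print(mouth_for("M"))  # LIP_B
```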
  • All characters preferably share the same number and type of basic gestures and the same number and type of scene-planned behaviors, but do not necessarily require the same number of animation cels.
  • Each character's possible gestures and behaviors are “queues” contained in its own browser control 210 include file. These queues are character-specific animation commands, including logical cel names and hold counts. This method frees the artist to use as many cels as desired to express the character for each basic gesture or composite behavior. In addition, new or replacement gestures can be added in the future without concern for backward compatibility in browser control 210.
  • Character assets (bitmaps) for each basic gesture are compiled into a separate segment file 105 to enable separate downloading based on the size model selected (i.e., small, medium or large). For the small model, only a few gesture bitmaps and queues are needed. For the large model, all of the queues and bitmaps are necessary. Segmenting each basic gesture into its own segment file 105 enables selective downloading of assets. In this way, gestures can be accumulated in three successive dialog streams to create a “better than bandwidth” experience.
  • INF file 112 is used to identify the client browser control 210 version and the versions of any resource segment files 105 or necessary plug-ins.
  • browser control 210 selects one or more of the behavior queues for a given size model to fill the time to the next behavior trigger.
  • Each of the entries in the above table can either be a pointer to another table or simply another table dimension containing several variants of each behavior row.
  • the SPEAK_LOUD row, medium model column entry FF3M[ ] might have the following 3 scene-planned behaviors composed of primitive gestures:
  • FF3MB, the second entry in this table, might be generated as follows: GestureList begin . . . Gesture(FF3MB, 15000) ;FACE_FRONT_2, SPEAK_LOUD, medium model, 2nd version randomly selected . . . end
  • browser control 210 would have 15 seconds of time to fill before the next gesture trigger.
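The table lookup and random variant selection described above can be sketched as follows. The table contents and function names are hypothetical; only the SPEAK_LOUD/medium-model cell with FF3M* variants is suggested by the example in the text.

```python
import random

# Hypothetical behavior table: keyed by (behavior trigger, model size),
# each cell holding several scene-planned variants of that behavior.
BEHAVIOR_TABLE = {
    ("SPEAK_LOUD", "medium"): ["FF3MA", "FF3MB", "FF3MC"],
}

def pick_behavior(trigger, model, rng=random):
    """Randomly select one scene-planned variant for the given
    trigger and size model, as the browser control might."""
    variants = BEHAVIOR_TABLE[(trigger, model)]
    return rng.choice(variants)

rng = random.Random(0)
print(pick_behavior("SPEAK_LOUD", "medium", rng))
```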
  • the behavior duration is adjusted automatically by browser control 210 by varying the hold value for the last cel of each gesture in the behavior.
  • the additional hold values can be calculated as a randomized percentage of the required duration minus the cumulative gesture animation times. Whenever a duration exceeds some prescribed length of time (per artist) without encountering a gesture trigger, browser control 210 selects another random entry for the same criteria (say, FF3MC) and adds this to the gesture list.
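The hold-value calculation, a randomized split of the slack time (required duration minus cumulative gesture animation times), might look like the following sketch. The weighting scheme is invented; the disclosure says only that the percentages are randomized.

```python
import random

def fill_duration(required_ms, gesture_times, rng=random):
    """Distribute the slack time as randomized holds on each gesture's
    final cel so the behavior spans the required duration."""
    slack = required_ms - sum(gesture_times)
    if slack <= 0:
        return [0] * len(gesture_times)
    # Randomized split of the slack across the gestures (weights invented).
    weights = [rng.random() for _ in gesture_times]
    total = sum(weights)
    return [int(slack * w / total) for w in weights]

# 15-second behavior built from gestures totalling 3.5 s of animation.
holds = fill_duration(15000, [2000, 500, 1000], random.Random(1))
print(holds)
```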
  • Each character can have 10 basic gestures with 4 of these poses having 6 mouth positions.
  • all characters can share the same 45 scene-planned behaviors.
  • additional entries can be scene planned to create more variety in behavior if needed.
  • browser control 210 preferably places the character into an ambient wait state that cycles through various randomized gestures. This is indicated by the command Gesture(FF_WAIT, −1).
  • the COMMAND_NAME can be a basic gesture, high-level behavior or text string containing a URL.
  • the DURATION applies only to gestures or behaviors and can be either a specific period of time in milliseconds or −1 (LOOP constant) to indicate an infinite loop.
  • control file URL points to INF file 112 describing content and/or control file to retrieve and initiate. If all files are already cached on the client, then control file 108 is processed immediately to stream the desired dialog. This format is extensible to support a variety of other commands or conditions in future versions.
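A command entry is therefore one of three things: a timed gesture or behavior, an infinitely looping one, or a URL naming an INF/control file to retrieve. A minimal dispatcher sketch, with invented function and return-value names, might be:

```python
LOOP = -1  # infinite-loop constant (the "-1" DURATION in the gesture list)

def interpret(command_name, duration=None):
    """Classify a command: a URL triggers retrieval of the INF/control
    file; a LOOP duration means cycle forever (e.g. ambient wait);
    otherwise play the gesture/behavior for the given milliseconds."""
    if isinstance(command_name, str) and command_name.startswith("http"):
        return ("fetch", command_name)
    if duration == LOOP:
        return ("loop", command_name)
    return ("play", command_name, duration)

print(interpret("FF_WAIT", LOOP))  # ('loop', 'FF_WAIT')
```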
  • in HTML, different characters and dialogs are launched by passing the .INF file 112 URL as a parameter to browser control 210.
  • the HTML simply needs to set the parameter to the desired content to cause it to begin streaming. This is accomplished either upon loading of the web page or is based on logic contained in embedded Java or VB script.
  • embedded Java or VB script logic events triggered from user input or built-in logic conditions can launch different character dialogs.
  • Numeric values can be passed into browser control 210 corresponding to each of the supported gestures and behaviors, causing that action to be performed. For instance, a fragment of Java code in an HTML page can call browser control 210 with the parameter “2” (corresponding to the FACE_LEFT gesture command) to cause the character to unconditionally animate to face left.
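The numeric-code dispatch can be sketched as a lookup table. Only the pairing "2" to FACE_LEFT is given in the text; the other table entries and the function name are invented placeholders.

```python
# Hypothetical numeric-code table; only 2 -> FACE_LEFT comes from the
# text, the remaining entries are invented for illustration.
GESTURE_CODES = {
    1: "FACE_FRONT",
    2: "FACE_LEFT",
    3: "FACE_RIGHT",
}

def on_script_call(code):
    """What a browser-control entry point might do when embedded
    Java/VB script passes in a numeric gesture parameter."""
    return GESTURE_CODES[int(code)]

print(on_script_call("2"))  # FACE_LEFT
```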
  • the Java fragment can be called from other Java script processing mouse events for a button or picture.
  • entire applications can be written in Java or VB to create any number of control programs for characters.
  • the design of the system and method described herein accounts for the dynamic mapping of mouth positions to gestures.
  • the present system and method assume that the two are maintained separately, lips z-ordered above gestures in the proper position, and played together at runtime by browser control 210 .
  • a method is provided to process gesture requests—either from the gesture list or through a dynamic gesture request—and automatically select and composite the mouth position cels corresponding to the current gesture ANIMATE queue command. More specifically, since the gesture list and mouth positions list generate asynchronous requests, a mechanism services the requests, composites them and animates them in real time.
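The servicing mechanism for these asynchronous requests might be sketched as two queues drained once per frame, with the mouth cel composited above the current gesture layers. Queue names, the frame loop and the layer representation are all invented for illustration.

```python
from collections import deque

# Invented queues for the asynchronous gesture and mouth requests.
gesture_requests = deque()
mouth_requests = deque()

current_gesture = "FACE_FRONT"

def service_frame():
    """Consume any pending requests, then return the layer stack to
    draw this frame (mouth z-ordered above the gesture layers)."""
    global current_gesture
    if gesture_requests:
        current_gesture = gesture_requests.popleft()
    mouth = mouth_requests.popleft() if mouth_requests else None
    layers = [current_gesture]
    if mouth is not None:
        # The generic mouth position is resolved against the pose in
        # effect this frame before compositing.
        layers.append((current_gesture, mouth))
    return layers

gesture_requests.append("ARMS_UP")
mouth_requests.append("LIP_A")
print(service_frame())  # ['ARMS_UP', ('ARMS_UP', 'LIP_A')]
```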

Abstract

A system and method providing an easy to use tool for preparing animated characters for use on the Internet or in other environments. Requiring only limited user input and selection, the system and method essentially automatically choreograph and synchronize reusable animation components with dialog streams and with gestures. Once generated, the resulting choreography can be embedded into a hypertext markup language (HTML) web page with an appropriate audio player plug-in to deliver any number of animated dialogues with minimal wait time and minimal developer effort, or can be similarly embedded or used with other software.

Description

    REFERENCE TO RELATED APPLICATION
  • This application is a continuation-in-part of parent application Ser. No. 09/031,488 filed on Feb. 26, 1998, which is hereby incorporated herein in its entirety.[0001]
  • BACKGROUND
  • 1. Field [0002]
  • The system and method disclosed herein relate generally to animation production and more specifically to methods and systems for automatically generating animation for use in connection with Internet web pages and/or for embedding into applications such as Windows applications that use Microsoft Object Linking and Embedding (“OLE”) or other applications. Examples include animated talking characters in Microsoft Powerpoint presentations, Word word processing, and Microsoft Outlook E-mail. Other examples include playing the characters on a stand-alone window floating on the desktop outside of any other application such as an Internet browser or a productivity application. [0003]
  • 2. General Background [0004]
  • Animated characters, particularly talking animated characters, can be effective communication tools in many fields but generally are expensive to produce and change. They can be used in many settings, from making Internet pages more interesting to customized presentations to groups or individual and even in interactive presentations, whether through the Internet or other E-mail or directly in an application such as Powerpoint or a word processor. [0005]
  • One example is the use of animated talking characters in Internet communications, which are enjoying more popularity than ever. With the number of users rising almost exponentially over the last few years, it is not surprising that a large majority of businesses have made the Internet a significant part of their overall marketing plan. In addition to the large number of “web surfers” who may come across advertising content, the Internet offers many advantages in terms of technological capabilities for advertising products and services. Current Internet technology permits advertisers to do many things which have heretofore been unavailable through any other known advertising medium. [0006]
  • One key benefit of Internet based advertising is the availability of real time interaction with the audience (i.e. the Internet user). For example, it is possible for web developers, working at the behest of advertisers, to script multiple dialogs, scenes, and/or interactions in connection with a web site such that a visitor to that site may be made to feel that the “advertisement” was produced specifically for his or her interests. In other words, based upon the particular HTML links and selections that a user follows or makes, respectively, a user will be presented with information of specific interest to that user. This is in contrast to, for example, a television commercial, where an advertiser produces a commercial of general interest to the universe of its potential customers. [0007]
  • Another major advantage available to Internet advertisers is the variety and richness of media available. Web sites may include information taking the form of plain text, still photographs, still animation, movies, spoken words, scrolling text, dynamic animation and music among others. A combination of these forms of information can create a powerful, enjoyable and lasting image in the mind of the potential customer. [0008]
  • One aspect of web site content that is becoming increasingly popular is dynamic animation. With this media format, an animated character may appear on the user's display, move around the display in a lifelike fashion, point to various objects or text on the screen and speak to the user. In most cases, when the character speaks to the user, the dialog is synchronized with lip movements representing the phonemes being spoken so that it appears that the words are actually emanating from the character's mouth. As can be imagined, dynamic animation can provide an interesting, informative and fun environment through which products and services may be advertised. By way of example, a company may include its mascot (e.g. an animal, persona, fictional character) in its web page content. In this way, the mascot can walk around the web page, speak to the user and use hand and other body movements to convey messages to the user. [0009]
  • Additionally, the mascot may point to specific items on the page, make movements and/or recite dialog based specifically and in real time upon user input. For example, in the case of a web site for the sale of automobiles, a user might click on the graphic of the particular model that interests him or her resulting in the display of a web page completely dedicated to that model. That page may also include the dynamic animation (often including dialog) representing the company's mascot welcoming the user to the page concerning the particular model. Additionally, the advantages of the real time interaction may be effected such that the character, for example, describes and points to various features of the car based upon user input (e.g. the user points to a portion of the automobile graphic which is of interest). [0010]
  • While dynamic animation presents significant opportunities for advertising (as well as other applications) on the Internet and in other applications, various implementation difficulties arise in connection with developing and revising content. First, the production of dynamic animation requires special skill not broadly available. Dynamic animation (also generally referred to as “choreography” herein) must generally be conceived and created by an individual having both artistic capabilities and a technical knowledge of the animation environment. The cost involved in having material choreographed is thus quite expensive both in terms of time and financial commitment. [0011]
  • A second difficulty arising in the creation of dynamic animation is the inherent difficulty in reusing such animation in significantly or even slightly different applications. For example, it is exceedingly difficult to reuse animation produced in accordance with a specific dialog with another dialog. In other words, it is a complex task to “re-purpose” choreography even after it is initially produced at great expense. Additionally, no tools that significantly automate this task are known to the inventors herein. Thus, borrowing from the above example, if an automobile salesman animation was produced with specific dialog to recite and point to features on the automobile as selected by the user, it would not be a simple or inexpensive task to use the same salesman character along with the same general class of body movements to add a discussion of a newly added automobile feature. On the contrary, it would heretofore be necessary to produce a new animation for synchronization with the new dialog. [0012]
  • Another problem arising in connection with the use of dynamic animation on the Internet and some other applications results from network bandwidth limitations. With current technology and network traffic, it is difficult to deliver compelling and highly expressive animation over the Internet or through certain other communication channels without downloading substantial information prior to execution of the animation. Similar considerations apply even when the animation is locally generated and stored or played out. This can result in user frustration, substantial use of storage space and other undesirable effects resulting from the download process or the storage, retrieval and play out process. Alternatively, the animation may be reduced to an acceptable size for real time narrowband delivery and lower storage and retrieval requirements. This solution, however, compromises the quality of the animation as well as, in most cases, the quality of associated audio. [0013]
  • Finally, the possibility of changing animation and/or dialog for a character on a daily or even hourly basis typically is impractical due to the inherent difficulties and time required to synchronize lip movements and behaviors to dialog. Each of the issues discussed above individually and collectively serve to create a substantial barrier to entry for the acceptance and implementation of animated characters in an Internet environment. [0014]
  • SUMMARY OF THE DISCLOSURE
  • Accordingly, it is believed that there is a need for a system and method whereby dynamic animation can be prepared at a reduced cost and without the need for significant specialized skills. [0015]
  • There is also a need for a system and method that can be used to develop flexible dynamic animation that can be easily re-purposed for use in different applications and with different dialogue. [0016]
  • There is additionally a need for a system and method that generate dynamic animation that can be used in a narrowband environment such as the Internet without the need to delete content or compromise quality in order for such animation to be processed on a real-time basis. [0017]
  • The system and method described herein provide these and other advantages in the form of an easy to use tool for preparing animated characters for use on the Internet and in other applications that need not involve the Internet. Requiring only limited user input and selection, the system and method described herein essentially automatically choreograph and synchronize reusable animation components with dialog streams. Once generated, the resulting choreography can be embedded into a hypertext markup language (HTML) web page with an appropriate animation player and audio player plug-in to deliver any number of animated dialogues with minimal wait time and minimal developer effort. In addition, the choreography (moving and talking characters) can be embedded into other applications, such as, without limitation, into Windows applications that use Microsoft Object Linking and Embedding (“OLE”), e.g., animated talking characters can be included in Microsoft Powerpoint presentations, Microsoft Word word processing, and MS Outlook E-mail. The talking characters can be played in a stand-alone window floating on top of the desktop outside of any other application such as an Internet browser or a productivity application. [0018]
  • In an illustrative and non-limiting embodiment, the automatic animation preparation system (AAPS) described herein includes an animation preparation application that assigns dialog to pre-existing character templates and automatically generates lip movements and movements and behaviors synchronized with streamed audio dialog. The AAPS interacts with a browser control (plug-in) located on the client. The browser control includes an animation engine supporting AAPS generated animation and also supports runtime execution of audio streaming. In other uses, the AAPS can be embedded in other applications, such as Windows applications that use OLE, and can be played back in a standalone window floating on the desktop outside of any other application. [0019]
  • A non-limiting object of the system and method described herein is to provide a system and method for generating character animation that can be used in an Internet environment, or in other environments, and address the shortcomings discussed above. [0020]
  • Another non-limiting object is to provide a tool for automatically generating easily modifiable dynamic animations synchronized with audio content that can be implemented by embedding such animations in an Internet web page or in application or other software. [0021]
  • In accordance with these and other objects which will be apparent hereinafter, the system and method hereof will be described with particular reference to the accompanying drawings.[0022]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, features and advantages of the system and method described herein will be apparent from the following more particular description of various illustrative and non-limiting embodiments, as illustrated in the accompanying drawings in which like reference numerals refer to like components throughout the different views and illustrations. [0023]
  • FIG. 1 is a block diagram of the automatic animation preparation system (AAPS) and the environment in which it operates; [0024]
  • FIG. 2 is an illustration of an illustrative first dialog box used in connection with the AAPS; [0025]
  • FIG. 3 illustrates an example of a second dialog box used in connection with the AAPS; [0026]
  • FIG. 4 illustrates an example of a third dialog box used in connection with the AAPS; and [0027]
  • FIG. 5 illustrates an example of a fourth dialog box used in connection with the AAPS. [0028]
  • DETAILED DESCRIPTION
  • The system and method described herein provide a flexible, convenient and inexpensive way by which dynamic animation may be essentially automatically produced for use in connection with an Internet web page or in other environments and application software. The AAPS [0029] 80 which is disclosed herein is designed to offer a user-friendly, intuitive interface through which animation can be selected, processed, and included within a web page accessible to a user operating a client terminal having access to the generated web page, or embedded and used in other environments and in application software.
  • Referring now to FIG. 1, for an illustration of aspects of the system and method described herein for dynamic animation, it should be understood that although FIG. 1 illustrates a client/server environment whereby development occurs on the same server as the resulting real time animation, there is no restriction to such an arrangement. For example, it is also possible for the present system and method to operate with separate servers for development and storage of generated files. It is also possible for AAPS [0030] 80 to exist in a standalone environment, e.g., on a personal computer, with transfer of files to an Internet server accomplished either by modem or some other communication device, or by copying onto transportable physical storage media. Returning now to FIG. 1 and the components illustrated thereon, HTML browser application 200 will now be described. Browser application 200 preferably supports either Microsoft Internet Explorer version 3 or 4 or Netscape Navigator version 3 or 4, or any suitable successor product. Browser application 200 further preferably supports software such as one of the following: Microsoft NetShow, VivoActive, VDOLive, Liquid Audio, XING Stream Works and/or RealAudio versions 3, 4 or 5. Browser application 200 also includes browser control 210 for processing animation generated by animation preparation application 100. Browser control 210 is preferably configured as a plug-in application for use with HTML browser application 200 and may always be resident or may be selectively resident as its use is required. In one illustrative embodiment, browser control 210 is the Topgun player available through 7th Level in Richardson, Texas, although browser control 210 can be any animation player application capable of supporting browser application 200 and the animation generated by AAPS 80.
  • Animation preparation application 100 takes input from various files and developer selections (examples of both are discussed below) and generates dynamic character animation as represented by multiple output files (also discussed below). Animation preparation application 100 contains a number of components which collectively generate animation. User interface control 140 interacts with developer terminal 110 so as to allow a developer working at developer terminal 110 to select and process dynamic animation characteristics in accordance with the system of the present invention. In one illustrative example, user interface control 140 provides a Windows-based GUI and operates so that display and processing from the developer's point of view proceed according to “wizard” applications which step the user through a task and which are now common in the Microsoft Windows environment. [0031]
  • Animation preparation application 100 also includes process control 170, which can incorporate both a physical processor and software or microcode for controlling the operation of animation preparation application 100, including its various components. Animation preparation application 100 further includes various functional processes such as compression functionality 160 (to compress any data processed by animation preparation application 100 if necessary by, for example, encoding PCM wave data into one of a variety of audio formats at various bitrates), audio functionality 150 (to generate audio streaming data for playback at browser control 210), and character processing functionality 180 (to generate animation for playback at browser control 210). [0032]
  • Animation preparation application 100 references character database 135, which can, but need not, reside on secondary storage associated with the development server maintaining animation preparation application 100. Character database 135 contains gesture data for any number of characters available to the developer in connection with the use of animation preparation application 100. For each character, a fixed number of gestures associated with that character is also provided. The number of characters stored in character database 135 can be on the order of 5-50, but any number of characters, subject to storage and implementation issues, can be used. [0033]
  • The system and method described herein further include dialog database 125. This database is used to store audio clips available for use in connection with animation preparation application 100. It is also possible to provide a microphone or other recording means whereby the developer can record additional audio clips either for storage in dialog database 125 for later use or for direct, immediate use by animation preparation application 100. [0034]
  • A brief discussion of the files generated by animation preparation application 100 is now provided. Further detail with respect to each file is provided below. The first file generated by animation preparation application 100 can be referred to as the RealAudio Choreography (RAC) file 138. While the discussion assumes the use of a RealAudio-compatible player at the client, the system and method described herein can also be practiced with other players, and all of the files described below can easily be generated so as to be compatible with other players. The RAC file 138 contains lip synchronization information which corresponds to the dialog selected from dialog database 125. This file may be converted, using available tools, to generate an event file corresponding to the player employed at the client. In the case of RealAudio, the file is an RAE file, and in the case of NetShow, the file would be an Advanced Streaming Format (ASF) file. The event file triggers animation events through browser control 210. Additionally, animation preparation application 100 generates HTML clip file 165, which consists of HTML commands with embedded object references so as to trigger the execution of animation, again through browser control 210 and in connection with the aforementioned event file. HTML clip file 165 can be manually pasted into HTML page file 115 in the appropriate location. Animation preparation application 100 also generates either or both a RealAudio (.RA) file 195 (or other audio file) and/or a .WAV file. These files represent the encoded dialog selected from dialog database 125 in a format which may be used by HTML browser application 200 to play audio associated with the generated animation. [0035]
  • A .INF file 112 is also generated by animation preparation application 100. This file includes version information respecting the various files and applications which should be used in playing back animation. Once HTML browser application 200 has received (through an Internet download) .INF file 112, HTML browser application 200 is able to request the correct files (as indicated by the contents of .INF file 112) from the animation server. Additionally, animation preparation application 100 generates control BIN file 108, which holds a set of pre-compiled character assets and behaviors. In addition to BIN file 108, animation preparation application 100 generates one or more resource segment files 105 which correspond to character models and contain components which may be composited together to form animation. [0036]
  • In using AAPS 80, the developer is prompted through several dialog boxes for character and behavior selection, dialog file import selection and various options to select and generate an animated character for use within an HTML web page. During the process for generating dynamic character animation, digitized dialog is automatically analyzed in order to extract phonetic information so that the proper lip positions for the selected character can be assigned. Default choreography can also be automatically assigned by animation preparation application 100 through an analysis of dialog features such as pauses, time between pauses, audio amplitude and occasional random audio activity. In addition to dialog features, a selected character's inherent personality traits can also be factored into the generation of default choreography. For example, one character may scratch his head while another puts his hands on his hips. [0037]
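  • The analysis of dialog features described above can be sketched in miniature. The following is an illustrative sketch only, not the patented implementation: it assumes the dialog has already been reduced to an amplitude envelope sampled in fixed windows, and the function name `choreograph`, the thresholds, and the 2000-millisecond window are all assumptions introduced here for illustration.

```python
# Hypothetical sketch: derive default gesture triggers from an amplitude
# envelope of the dialog. Thresholds, window size and gesture choices are
# assumptions, not values from the disclosure.

SILENCE_THRESHOLD = 0.05   # normalized amplitude treated as a pause
WINDOW_MS = 2000           # granularity at which gestures are assigned

def choreograph(envelope, favor="FRONT"):
    """envelope: list of (start_ms, amplitude) samples, one per window.
    Returns a list of (gesture_name, duration_ms) tuples."""
    gestures = []
    for start_ms, amp in envelope:
        if amp < SILENCE_THRESHOLD:
            gestures.append(("ARMS_DOWN", WINDOW_MS))      # rest pose on pauses
        elif amp > 0.7:
            gestures.append(("ARMS_UP", WINDOW_MS))        # emphatic pose on loud speech
        else:
            gestures.append((f"FACE_{favor}", WINDOW_MS))  # favored orientation otherwise
    return gestures

print(choreograph([(0, 0.02), (2000, 0.4), (4000, 0.9)]))
```

  • A character's personality traits could be factored in by substituting character-specific poses (e.g., scratching the head) for the generic ones above.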
  • The resulting default choreography is preferably output into RealAudio Choreography (RAC) file 138 (or other choreography file), which can be converted to an RAE or other event file to trigger animation events at the user's computer through HTML browser application 200 and specifically browser control 210. Selected character behaviors and assets are pre-compiled into binary control file (BIN file) 108 and corresponding resource segments 105 for initial installation prior to installation into HTML page file 115. In this way, character assets are protected from piracy and accidental deletion, thus reducing support problems due to missing or corrupted source files. An optional security lock may also be implemented so that playback can occur only from a specified URL. Another advantage of pre-compiling characters into segment files 105 is that the animation preparation time when the animation is executed by HTML browser application 200 is significantly reduced. Alternatively, behaviors and assets may be interleaved with the audio stream and processed dynamically. [0038]
  • As a result of processing by animation preparation application 100, a series of HTML tags are generated and placed on the Windows clipboard or saved to HTML clip file 165. These tags contain all the necessary object embedding information and other parameters for direct insertion into a web page as reflected in HTML page file 115. In addition to the HTML clip file 165, the audio stream file 195, the control BIN file 108 and the segment files 105 discussed above, animation preparation application 100 also preferably generates .INF file 112. .INF file 112 identifies associated resource files and version information. [0039]
  • The system and method described herein also provide a mechanism for overriding the default animation generated by animation preparation application 100. In some applications, a developer may desire to override default behavior and manually select one or more specific gestures available in character database 135. By way of example, a character may be talking about items in an online store and need to point in the direction of the items—say, to the character's left. In such a case, even if the default animation does not provide this result, the developer can easily modify the default animation to meet his or her needs as discussed below. In another case, the character may need to react with specific behaviors based upon user input in a web page or from a Java or Visual Basic (VB) script. Each of these cases is now discussed; the first case is referred to as a “static override” and the second case is referred to as a “dynamic override”. [0040]
  • Static Override Option [0041]
  • The static override option enables the developer to modify choreography file 138 containing the choreography information. The choreography generated by animation preparation application 100 is stored in choreography file 138 and presented as a list of timed gestures or high-level behavior commands, such as turn left, walk right or jump in the air, interleaved with timed mouth positions. Using a list of behaviors common to all characters, gestures can be manually added, modified or removed from the linear sequence with any text editor. A simple syntax (discussed below) can be used so as to allow for easy identification and modification and so as to allow clear differentiation between gesture commands and mouth position commands. Several additional commands are also supported, including setting user input events (mouse, keyboard) or event triggers for launching web pages or other character animations. Once modified, choreography file 138 can then be used in connection with HTML page file 115 and be made available for download and execution by HTML browser application 200. Static override can also be accomplished by allowing a user to embed specific commands, as described above, in the dialog file either in place of or in addition to the gesture file. [0042]
  • Dynamic Override Option [0043]
  • In order to dynamically override a character's default choreography, the web developer can issue gesture commands (index parameters to browser control 210 referencing a particular gesture) from a Java or VB script embedded in HTML page file 115. For example, a web page can cause a character to say different things based upon user input or a Java application. In addition, HTML browser application 200, through browser control 210, can issue a variety of callbacks which can be used to trigger Java or VB scripts to handle special cases such as control startup, content download, beginning sequence, end sequence, and control termination. In this way, a Java script can, for example, respond to embedded triggers in the character's choreography stream to drive a parallel synchronous GIF or JPEG slide show next to the character or even a guided tour through a web site. [0044]
  • Animation preparation application 100 can include a set of pre-produced characters, including, for example, salespeople, teachers, web hosts and other alternative choices. These pre-produced characters and their associated gesture sets are stored in character database 135. Each character can have exactly the same number and type of basic gestures, with each gesture composed of approximately the same number of animation “cels”. For purposes herein, the term “cel” refers to an individual image used as part of a set of images to create animation. A “cel” may be thought of as a frame, or a layer of a frame, in an animation sequence. Within character database 135, each character may have entirely different characteristics, possibly making no two characters in character database 135 the same. Nevertheless, conforming the characters' “animation architecture” (i.e., same number and type of gestures, with each gesture composed of approximately the same number of cels) provides a basis for the generation of automatic choreography by animation preparation application 100 according to the teachings of the present invention. [0045]
  • Since all characters in character database 135 can share a common set of behaviors, the end-user using HTML browser application 200 can set a pre-installed character to be their “personal web host” for use with all web pages based upon HTML generated by animation preparation application 100, thus obviating the need for repetitive character download to the end-user. This can be effected by a user by, for example, right-clicking on a given character in a web page and setting the “Personal Web Host” flag in the object menu. It is also possible for the developer, using animation preparation application 100, to override the user-set flag and enforce download of a specific character. [0046]
  • A character can be relatively small, ranging in download size from 15K to 50K depending upon the level of sophistication and detail required or desired. In fact, in one embodiment each gesture for each character can reside in a separate “segment” file which can be downloaded progressively over time to create a “better than bandwidth” experience. For instance, three dialogs can be created for a character where the first dialog uses a small model (15K), the second dialog uses a medium model (20K) containing all of the gestures of the small model as well as some additional gestures, and a third dialog (40K) includes yet some additional gestures. [0047]
  • After downloading one or more of these models (gesture sets), they are available for use by HTML browser application 200 without any further download to the client. In this way, dialogs and dynamic animation can be implemented such that very expressive sequences can be created despite bandwidth limitations. Alternatively, the character models can be made available to the client through the distribution of a CD-ROM or other transportable storage medium, or pre-loaded on a computer hard drive. In this way, a user can be provided with large character databases and attributes without the need to wait for download. Choreography control information can be delivered prior to initiation of the audio stream or embedded and streamed with the audio for interpretation “on the fly”. In the latter case, callbacks can be made dynamically on the client to trigger lip movements and gestures. [0048]
  • Since each character is actually dynamically composited from a collection of parts such as body, arms, head, eyes and mouth layers, redundant animation cels are eliminated and therefore do not need to be downloaded. In other words, once the body parts necessary to perform the desired animation have been downloaded to the end-user client, animation sequences can be created through the use of animation preparation application 100 with reference to the body parts resident on the client for playback without any real-time download. [0049]
  • Additionally, body parts can be positioned, timed and layered in a seemingly endless number of combinations to correspond to the desired dialog by subsequently downloading a very small control file (BIN file) 108 which is typically only a few thousand bytes. The control file 108 need only reference body parts and positions already resident on the client to reflect the desired animation produced by animation preparation application 100. [0050]
  • In an illustrative embodiment, a standard Windows Help file is included as a component of animation preparation application 100. The Help file can contain in-context information describing options for each selection screen as well as tutorials and examples explaining how to manipulate default character behaviors, including the use of both static and dynamic overrides as discussed above. [0051]
  • Turning now to FIGS. 2-5, a detailed description of the operation of animation preparation application 100 is now provided, both from a user point of view and with respect to internal processing steps. FIG. 2 illustrates an example of a wizard dialog box that can be employed by animation preparation application 100 and specifically generated by user interface control 140 in the first step of the process for generating automatic dynamic animation. The first dialog box prompts the developer to select between automatically creating new choreography or using existing choreography. In an illustrative embodiment, the developer selects among these choices through “radio buttons”. In the default case of automatic creation, the developer proceeds to the second dialog box. In the case where the developer selects use of an existing choreography file, a Browse button becomes active to provide another dialog box for finding and selecting a previously generated choreography file. Again, after this selection, the developer proceeds to the second dialog box. Preferably, at the bottom of the dialog box, the buttons Help, Exit and Next are displayed. The Help file can offer the developer context-specific help with respect to the first dialog box. [0052]
  • The second dialog box is depicted in FIG. 3. This box prompts the developer for selection of a character name through the use of a list box and provides a thumbnail image representative of each character when selected. In addition, three radio buttons are preferably included allowing the user to select among a small, medium or large model for each character. As discussed above, the character model limits or expands the number of gestures available for the selected character and may be selected as a tradeoff between download speed and animation richness. Once a character and gesture level (model) are selected by the developer, the fully compressed download size for the selected character/model combination is displayed in order to assist the developer in his or her selection. [0053]
  • Additionally, at the bottom of the dialog box, the Help, Exit, Back, Next and Finish buttons are provided. The Next button is grayed and the Finish button active only when the developer has selected the use-choreography-file option through the first dialog box. In this way, the developer can optionally choose a different character for use with a preexisting or modified choreography file 175 before again using animation preparation application 100 to automatically generate animation. In all dialogs, Back returns the developer to the previous dialog box and Finish accepts any remaining defaults and then completes the preparation of dynamic animation and all files associated therewith. [0054]
  • The third dialog box, which is illustrated in FIG. 4, is used to prompt the developer to select a source WAV audio file, as well as providing the ability to browse for an audio file or record a new one. The third dialog box may include a Preview button (not shown) in order to allow the developer to hear a selected audio file. The selected audio file preferably contains spoken dialog without background sound or noise which would make phoneme recognition difficult, even though it is possible for the resulting RealAudio file 195 or .WAV file 195 to contain music, sound effects and/or other audio. [0055]
  • An edit box is also provided for entry of the URL pointing to the encoded RealAudio file 195 or .WAV file 195 which is generated by animation preparation application 100. The entered URL is also used by animation preparation application 100 to generate .INF file 112 and HTML clip file 165. It is also possible to include an input field in the third dialog box whereby the user can enter text corresponding to the recorded dialog to ensure that lip synchronization is accurate. An option may also be included whereby the developer can select a specific bit-rate with which to encode the audio. Encoding according to a specified bit-rate can be accomplished through the use of the RealAudio Software Development Kit (SDK) or other development tools corresponding to other players. Finally, the third dialog also includes Exit, Help, Back, Next and Finish buttons which operate in the same way as discussed above. [0056]
  • The fourth dialog box is illustrated in FIG. 5. This dialog is processed upon completion of the third dialog. The fourth dialog box prompts the developer for choreography options using four mutually exclusive radio button options: FAVOR-LEFT, FAVOR-FRONT, FAVOR-BACK, and FAVOR-RIGHT. Each of these options will cause animation preparation application 100 to tend towards selection of gestures and high-level behaviors which cause the character to orient toward a particular area of the web page or a specific orientation with respect to the user. It will be understood that the above four options are provided by way of example only and many other options might be provided either in addition to or instead of the four options above. In other words, the options may reflect any particular behavior of the character which is preferred by the developer, so long as the appropriate processing to accomplish the tendency is built into animation preparation application 100. [0057]
  • The options are employed in connection with characteristics in the selected audio file as well as randomization techniques (discussed below) in order to automatically choreograph the character. In addition, an edit field may be provided for the developer to enter a URL from which all character content should be retrieved at runtime. This URL may be different from the location of the audio files and is used as a security lock as discussed above. Again, at the bottom of the dialog box, the Help, Exit, Back and Finish buttons are provided. After the developer has completed the fourth dialog (or selected Finish in an earlier dialog), animation preparation application 100 has all of the information which it needs to automatically generate dynamic animation for insertion into a web page. [0058]
  • As a result of the processing by animation preparation application 100, choreography file 138 is generated. This file may be converted to an event file using, for example, RealNetworks' WEVENTS.EXE in the case of a RealAudio RAC file. RAC file 138 contains both a reference to the audio file and a list of timed gestures in a “Gesture List” represented by segment files 105. In the default case, RAC file 138 is hidden from the developer and automatically compiled for use with character assets contained in BIN control file 108. The segment files 105 and BIN control file 108 may be collectively compiled for immediate use in connection with HTML page file 115. Alternatively, the developer can choose to edit the gestures in the segment files 105 in order to manually control character behavior as discussed above. [0059]
  • The RAC file 138 contains a series of references to gestures which are contained in the segment files 105. Each gesture reference is represented in RAC file 138 as a function call with two parameters: gesture number (constant identifier) and duration. An example of a RAC file with a set of gesture references might be as follows: [0060]
    GestureList
    begin
    . . .
    Gesture(ARMS_UP, 2000)
    . . .
    end.
  • where ARMS_UP is a command to move the character's arms upward and 2000 is the total time allocated to the gesture in milliseconds. In this case, if the actual animation requires only 500 milliseconds to execute, then the character would preferably be disposed to hold the ARMS_UP position for an additional 1500 milliseconds. The use of a single entry point for gestures, rather than a different entry point for each gesture, provides an open model for forward and backward compatibility between supported gestures and future streaming and control technologies. [0061]
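  • The Gesture(NAME, duration) convention and the hold-time arithmetic above can be illustrated with a short sketch. This is illustrative only; the parser, the function names and the 500-millisecond animation length are assumptions introduced here, not elements of the disclosed RAC compiler.

```python
import re

# Hypothetical sketch: parse Gesture(NAME, duration) entries from a
# RAC-style gesture list, then compute how long the final pose is held
# when the underlying animation is shorter than the allotted time.

GESTURE_RE = re.compile(r"Gesture\((\w+),\s*(\d+)\)")

def parse_gesture_list(text):
    """Return (name, allotted_ms) pairs from a GestureList block."""
    return [(m.group(1), int(m.group(2))) for m in GESTURE_RE.finditer(text)]

def hold_time(allotted_ms, animation_ms):
    """Extra time the character holds the end pose after the cels play out."""
    return max(0, allotted_ms - animation_ms)

rac = """GestureList
begin
Gesture(ARMS_UP, 2000)
end"""

entries = parse_gesture_list(rac)
print(entries)                        # [('ARMS_UP', 2000)]
print(hold_time(entries[0][1], 500))  # 1500, as in the example above
```

  • The single parameterized entry point is what lets a parser like this remain valid as new gesture constants are introduced.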
  • The gesture list contained in RAC file 138 is automatically serviced by browser control 210 based upon callback events generated by browser control 210. Each gesture, mouth and event command is interpreted by browser control 210 essentially in real time, causing the animation to play in a manner that the user can perceive as being synchronous with the audio stream and external event message broadcasts (i.e., dynamic override events from user/Java control). [0062]
  • By way of example, the developer may desire that the character point to a book image on the character's left at the moment when the dialog says “—and here is the book that you have been looking for.” This action could be accomplished by changing an “ARMS_UP” gesture parameter to “ARMS_LEFT”. If the developer wanted the new gesture to hold longer, subsequent gesture parameters could also be changed or simply deleted and duration parameters adjusted to maintain synchronization with the dialog that follows. [0063]
  • This adjustment is illustrated as follows, assuming that animation preparation application 100 generated the following: [0064]
    GestureList
    begin
    . . .
    Gesture(ARMS_UP, 2000)
    Gesture(ARMS_DOWN, 500)
    Gesture(EYES_BLINK, 1000)
    Gesture(ARMS_CROSS, 3000)
    . . .
    end
  • The following represents manual modification to achieve the desired result: [0065]
    GestureList
    begin
    . . .
    Gesture(BODY_LEFT, 1000)
    Gesture(ARMS_LEFT, 3000)
    Gesture(EYES_BLINK, 2500)
    . . .
    end
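  • The static override shown above can also be sketched programmatically rather than with a text editor. The sketch below is illustrative only; the `override` helper is an assumption, not part of the disclosure. Note that the modified sequence above preserves the original total of 6500 milliseconds (2000+500+1000+3000 versus 1000+3000+2500), which is what keeps the subsequent dialog synchronized.

```python
# Hypothetical sketch of a static override: rename gestures in a parsed
# gesture list. Renaming leaves each duration, and hence the total time,
# untouched, so lip synchronization with the following dialog holds.

def override(gestures, replacements):
    """gestures: list of (name, ms); replacements: {old_name: new_name}."""
    return [(replacements.get(name, name), ms) for name, ms in gestures]

original = [("ARMS_UP", 2000), ("ARMS_DOWN", 500),
            ("EYES_BLINK", 1000), ("ARMS_CROSS", 3000)]
modified = override(original, {"ARMS_UP": "ARMS_LEFT"})

# Total allotted time is unchanged, so downstream synchronization holds.
assert sum(ms for _, ms in original) == sum(ms for _, ms in modified)
print(modified[0])  # ('ARMS_LEFT', 2000)
```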
  • In the margin next to the gesture commands, phonetic information can be added as comments (using a predetermined delimiter) to help identify spoken dialog and timing. [0066]
  • In a typical case, there are six mouth positions used to express dialog. The mouth positions and durations are also written to RAC file 138 as a list of commands interleaved with the gesture commands. For example, mouth positions in RAC file 138 may be represented as follows: [0067]
    GestureList
    begin
    . . .
    Gesture(ARMS_UP, 2000)
    Mouth(LIP_A, 250)
    Mouth(LIP_C, 350)
    Mouth(LIP_B, 250)
    Mouth(LIP_A, 450)
    Gesture(ARMS_DOWN, 500)
    . . .
    end
  • where LIP_A, LIP_B and LIP_C represent particular mouth positions and the number following the position is the length of time such mouth position is held. It should be noted that the generic mouth positions indicated must be converted into logical/physical mouth positions dynamically to correspond to the gesture pose in effect at any moment. In the above example, the mouth positions A, C, B, A should be changed into specific bitmaps depending on which gesture is being displayed, and composited onto the other character layers. This is discussed in greater detail below. [0068]
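  • The dynamic conversion of a generic mouth position into a physical cel for the current gesture pose can be sketched as a simple lookup. This is an illustrative assumption only; the cel file names and the `resolve_mouth` helper are hypothetical and not taken from the disclosure.

```python
# Hypothetical sketch: resolve a generic mouth position (LIP_A, LIP_B, ...)
# into a physical cel for the gesture pose currently in effect; the chosen
# cel would then be composited onto the other character layers.

MOUTH_CELS = {
    ("FACE_FRONT", "LIP_A"): "front_mouth_a.bmp",
    ("FACE_FRONT", "LIP_B"): "front_mouth_b.bmp",
    ("FACE_LEFT",  "LIP_A"): "left_mouth_a.bmp",
}

def resolve_mouth(current_gesture, lip, duration_ms):
    """Return the physical cel and how long it should be held."""
    cel = MOUTH_CELS[(current_gesture, lip)]
    return (cel, duration_ms)

print(resolve_mouth("FACE_FRONT", "LIP_A", 250))
```

  • In the document's scheme this lookup would be driven by each character's own phoneme-to-lip map file, so different poses (and different characters) can map the same generic position to different bitmaps.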
  • For each character, a mapping of phonemes to lip positions is also necessary to account for differences between character personality features. This map file should be included with each character's assets and used to convert recognized phonemes into appropriate mouth positions. [0069]
  • In an illustrative embodiment, there are on the order of, and preferably at least, ten characters (and their associated gestures) in character database 135 for use by animation preparation application 100. Each character is preferably produced to a template—gesture for gesture. The template is set up to be general purpose and to include, by way of example, the following basic gestures: [0070]
    1) Face Front (mouth positions active)
    2) Face Left (flip for Face Right) (mouth positions active)
    3) Face Rear Left (flip for Face Rear Right) (mouth positions active)
    4) Foreshorten to Camera (mouth positions active)
    5) Walk Cycle Left
    6) Arms Down
    7) Arms Up
    8) Arms Left (flip for Arms Right)
    9) Arms Cross
    10)  Arms Out (to implore or stop)
  • As would be understood by one of ordinary skill in the art, other gestures may be added or substituted for the above gestures. In addition, each locked head position should have the standard six mouth positions as well as eye blinks and common eye emotions such as eyes questioning, eyes concentrating, and eyes happy. All gestures can be animated forward or backward to central hookup position(s). Each character, as discussed above, preferably has small, medium and large size versions of a given gesture which utilize less to more cels. In addition, each pose which has active mouth positions preferably includes a corresponding set of planned behaviors and timings. This is discussed in further detail below. [0071]
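  • Conformance to the common template (the “animation architecture” discussed earlier) can be sketched as a simple check. This is an illustrative assumption only; the gesture identifiers and data layout below are hypothetical, and a real database would also carry cels, mouth sets and behavior timings per gesture.

```python
# Hypothetical sketch: every character must expose the same template of
# basic gestures, even though the number of cels per gesture may differ.
# The identifiers mirror the ten template gestures listed above.

TEMPLATE = {
    "FACE_FRONT", "FACE_LEFT", "FACE_REAR_LEFT", "FORESHORTEN",
    "WALK_CYCLE_LEFT", "ARMS_DOWN", "ARMS_UP", "ARMS_LEFT",
    "ARMS_CROSS", "ARMS_OUT",
}

def conforms(character_gestures):
    """character_gestures: mapping of gesture name -> list of cels."""
    return set(character_gestures) == TEMPLATE

demo = {g: ["cel1"] for g in TEMPLATE}
print(conforms(demo))                # True
print(conforms({"FACE_FRONT": []}))  # False
```

  • It is this uniformity, rather than any per-character detail, that lets the automatic choreographer treat every character interchangeably.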
  • There are several key elements which make automatic and dynamic choreography of characters possible using browser control 210 and animation preparation application 100: [0072]
  • 1. Audio encoding SDK integration; [0073]
  • 2. Automatic phoneme recognition; [0074]
  • 3. Browser control 210 script language and compiler; [0075]
  • 4. Templated character gestures; [0076]
  • 5. Gesture asset segmentation; [0077]
  • 6. Behavior generation (scene-planned gestures); and [0078]
  • 7. Dynamic Control of Gestures [0079]
  • Each one of these key features is now discussed. [0080]
  • RealAudio SDK (or Alternative SDKs) [0081]
  • Although the following description relates to the use of the RealAudio player SDK, it will be understood that the system and method described herein can alternatively employ any of the following or similar SDKs: NetShow, VDO, VivoActive, Liquid Audio or XING Streamworks. [0082]
  • The RealAudio player SDK should be used as necessary to provide audio streaming at various bit-rates and to maintain synchronization of character animation with audio stream file 195. Character choreography is delivered via RAC file 138, which is automatically created by animation preparation application 100 and which can be converted to a RealAudio event file (or similar event file). [0083]
  • Automatic Phoneme Recognition [0084]
  • Speech recognition and emotive libraries and source code can be employed to provide automatic phoneme recognition using, for example, Voxware, VPI or AT&T Watson software. In addition, any SAPI-compliant text-to-speech processor (such as Lernout & Hauspie's TrueVoice) can be used to process dialog text (entered into animation preparation application 100) into phonemes for greater precision in synchronizing mouth positions to a dialog stream. In the event that a SAPI-compliant processor is not installed, a dictionary providing mappings from common words to phonemes can be used. Browser control 210 is also preferably configured to provide mapping of the 40 phonemes onto one of 6 mouth positions. However, the map in this case should be intrinsic to each character, in that some characters may have different or more or fewer mouth positions. [0085]
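  • The dictionary fallback and the phoneme-to-mouth-position mapping can be sketched together. This is illustrative only: the dictionary entries and the phoneme-to-lip assignments below are invented placeholders, not a real phoneme inventory or the disclosed character map files.

```python
# Hypothetical sketch: when no SAPI-compliant processor is available, a
# word-to-phoneme dictionary supplies phonemes, which are then mapped onto
# one of six generic mouth positions. All entries here are illustrative.

WORD_TO_PHONEMES = {"hello": ["HH", "AH", "L", "OW"]}  # fallback dictionary
PHONEME_TO_LIP = {"HH": "LIP_A", "AH": "LIP_C", "L": "LIP_B", "OW": "LIP_F"}

def mouth_positions(word):
    """Map a dialog word to a sequence of generic mouth positions."""
    phonemes = WORD_TO_PHONEMES.get(word.lower(), [])
    return [PHONEME_TO_LIP[p] for p in phonemes]

print(mouth_positions("hello"))  # ['LIP_A', 'LIP_C', 'LIP_B', 'LIP_F']
```

  • In the document's scheme, PHONEME_TO_LIP would be intrinsic to each character, since characters may have different numbers of mouth positions.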
  • Templated Character Gestures [0086]
  • All characters preferably share the same number and type of basic gestures and the same number and type of scene-planned behaviors, but do not necessarily require the same number of animation cels. [0087]
  • Each character's possible gestures and behaviors are “queues” contained in its own browser control 210 include file. These queues are character-specific animation commands, including logical cel names and hold counts. This method frees the artist to use as many cels as desired to express the character for each basic gesture or composite behavior. In addition, new or replacement gestures can be added in the future without concern for backward compatibility in browser control 210. [0088]
  • Gesture Asset Segmentation and Version Control [0089]
  • Character assets (bitmaps) for each basic gesture are compiled into a separate segment file 105 to enable separate downloading based on the size model selected (i.e., small, medium or large). For the small model, only a few gesture bitmaps and queues are needed. For the large model, all of the queues and bitmaps are necessary. Segmenting each basic gesture into its own segment file 105 enables selective downloading of assets. In this way, gestures can be accumulated across three successive dialog streams to create a “better than bandwidth” experience. [0090]
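  • The selective downloading just described can be sketched as a set difference. This is an illustrative assumption only; the segment file names and model groupings below are hypothetical.

```python
# Hypothetical sketch: each basic gesture lives in its own segment file,
# and the size model chosen by the developer determines which segments
# must be downloaded. Segments already resident on the client are skipped,
# which is what lets successive dialogs accumulate gestures.

SEGMENTS = {
    "small":  ["face_front.seg", "arms_down.seg"],
    "medium": ["face_front.seg", "arms_down.seg", "arms_up.seg"],
    "large":  ["face_front.seg", "arms_down.seg", "arms_up.seg",
               "arms_cross.seg", "walk_cycle_left.seg"],
}

def segments_to_download(model, already_resident):
    """Return only the segments not yet resident on the client."""
    return [s for s in SEGMENTS[model] if s not in already_resident]

print(segments_to_download("medium", {"face_front.seg", "arms_down.seg"}))
# ['arms_up.seg']
```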
  • All characters and content are made compatible with future browser control versions by locking the entry points of browser control commands in the runtime interpreter. New commands can simply be appended to support newer features. Old browser controls should also support newer content by simply ignoring new gesture commands. [0091]
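The forward-compatibility rule above amounts to an interpreter that skips command codes it does not recognize. The opcode numbers and names below are assumptions; the point is only the skip-unknown behavior.

```python
# Hypothetical v1 runtime: entry points for known commands are locked,
# so newer content may contain opcodes appended after this version.
KNOWN_COMMANDS = {1: "ANIMATE", 2: "HOLD", 3: "GESTURE"}

def interpret(stream):
    """Run a command stream, silently ignoring opcodes added in newer
    browser control versions."""
    executed = []
    for opcode, arg in stream:
        handler = KNOWN_COMMANDS.get(opcode)
        if handler is None:
            continue  # newer content: ignore the unknown gesture command
        executed.append((handler, arg))
    return executed
```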
  • For versioning between server and client, .INF file [0092] 112 is used to identify the client browser control 210 version and the versions of any resource segment files 105 or necessary plug-ins.
  • Behavior Generation [0093]
  • Automatically generating choreography from a set of gestures requires both a library of scene-planned behaviors and several input parameters. As discussed above, each character has small, medium and large size versions (as well as other possible sized versions) of a given gesture and each pose having mouth positions has a set of planned behaviors and timings. [0094]
  • For instance, the FACE_FRONT behavior in the medium model might have five versions, each lasting approximately 2000 milliseconds, which can be applied automatically in any combination to fill the time available until the next gesture trigger. It follows that, in this case, there would be 3 sizes × 5 behaviors = 15 possible behavior queues for FACE_FRONT. In general, [0095] browser control 210 selects one or more of the behavior queues for a given size model to fill the time to the next behavior trigger. These options can be driven by browser control 210 using the following table:
    FACE_FRONT Behavior Table
                  | small  | medium | large
     SILENCE      | FF1S[] | FF1M[] | FF1L[]
     SPEAK_SOFT   | FF2S[] | FF2M[] | FF2L[]
     SPEAK_LOUD   | FF3S[] | FF3M[] | FF3L[]
     SPEAK_SHORT  | FF4S[] | FF4M[] | FF4L[]
     SPEAK_LONG   | FF5S[] | FF5M[] | FF5L[]
  • Each of the entries in the above table can either be a pointer to another table or simply another table dimension containing several variants of each behavior row. For instance, the SPEAK_LOUD row, medium model column entry FF3M[ ] might have the following 3 scene-planned behaviors composed of primitive gestures: [0096]
  • FACE_FRONT, SPEAK_LOUD, medium model: [0097]
  • FF3MA FACE_FRONT, EYES_BLINK, ARMS_UP, ARMS_DOWN [0098]
  • FF3MB FACE_FRONT, ARMS_UP, ARMS_DOWN, HAND_POINT [0099]
  • FF3MC FACE_FRONT, HAND_POINT, EYES_BLINK, ARMS_UP [0100]
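The table lookup and random variant selection described above can be sketched as follows. The table contents mirror the FF3M example; everything else (the lookup key shape, the selection function) is an illustrative assumption.

```python
import random

# Hypothetical behavior table: (speech state, size model) -> named
# scene-planned variants, each composed of primitive gestures.
FACE_FRONT = {
    ("SPEAK_LOUD", "medium"): {
        "FF3MA": ["FACE_FRONT", "EYES_BLINK", "ARMS_UP", "ARMS_DOWN"],
        "FF3MB": ["FACE_FRONT", "ARMS_UP", "ARMS_DOWN", "HAND_POINT"],
        "FF3MC": ["FACE_FRONT", "HAND_POINT", "EYES_BLINK", "ARMS_UP"],
    },
}

def pick_behavior(state, model, rng=random):
    """Select one scene-planned behavior variant at random for the
    given speech state and size model."""
    variants = FACE_FRONT[(state, model)]
    name = rng.choice(sorted(variants))
    return name, variants[name]
```

Repeated calls fill the time to the next trigger with varied combinations, which is the source of the randomized choreography.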
  • In this way, selection of gestures and high-level behaviors are based on a combination of user options, gesture triggers and randomization. For example, the second entry in this table, FF3MB, might be generated as follows: [0101]
    array GestureList
    begin
      . . .
      Gesture(FF3MB, 15000)  ;FACE_FRONT, SPEAK_LOUD, medium model,
                             ;2nd version randomly selected
      . . .
    end
  • In this example, [0102] browser control 210 would have 15 seconds of time to fill before the next gesture trigger. To fill this time with action, the behavior duration is adjusted automatically by browser control 210 by varying the hold value for the last cel of each gesture in the behavior. The additional hold values can be calculated as a randomized percentage of the required duration minus the cumulative gesture animation times. Whenever a duration exceeds some prescribed length of time (per artist) without encountering a gesture trigger, browser control 210 selects another random entry matching the same criteria (say, FF3MC) and adds it to the gesture list.
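The timing fill above can be sketched numerically. The patent describes a randomized split of the slack; the sketch below splits it evenly for determinism, and the per-gesture animation times are assumptions.

```python
def fill_holds(gesture_times_ms, duration_ms):
    """Distribute the slack (required duration minus cumulative gesture
    animation time) across the last cel of each gesture, in ms.
    The patent randomizes this split; an even split is shown here."""
    slack = duration_ms - sum(gesture_times_ms)
    if slack <= 0:
        return [0] * len(gesture_times_ms)  # no time to fill
    per_gesture = slack // len(gesture_times_ms)
    return [per_gesture] * len(gesture_times_ms)
```

For a 15000 ms window and three gestures totaling 6000 ms of animation, 9000 ms of slack is spread across the three final-cel holds.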
  • Each character can have 10 basic gestures, with 4 of these poses having 6 mouth positions. In addition, all characters can share the same 45 scene-planned behaviors. However, additional entries can be scene-planned to create more variety in behavior if needed. During any significant period of silence, or at the end of a dialog stream, [0103] browser control 210 preferably places the character into an ambient wait state that cycles through various randomized gestures. This is indicated by the command Gesture(FF_WAIT, −1).
  • Interactive Controls [0104]
  • Several additional commands can be made available for manual insertion into the gesture list. These commands cause event triggers based on user input, including mouse browse, mouse click and keyboard input. [0105]
  • To handle keyboard, mouse or browse events, the developer might insert the commands: [0106]
  • Key (<COMMAND_NAME>, <DURATION>) [0107]
  • Mouse (<COMMAND_NAME>, <DURATION>) [0108]
  • Browse (<COMMAND_NAME>, <DURATION>) [0109]
  • at any point in the gesture list to cause a particular gesture or URL link to occur immediately. The assumption is that all events relate to a single character, so a character name parameter is not necessary. The COMMAND_NAME can be a basic gesture, a high-level behavior or a text string containing a URL. The DURATION applies only to gestures or behaviors and can be either a specific period of time in milliseconds or −1 (the LOOP constant) to indicate an infinite loop. [0110]
  • For example, browsing a character might cause it to point to a banner ad in the web page. Alternatively, the command could be a text string containing a URL linking to another web page (causing a launch of that page) or a URL to a dialog control file to retrieve and launch. The control file URL points to .INF file [0111] 112 describing the content and/or control file to retrieve and initiate. If all files are already cached on the client, then control file 108 is processed immediately to stream the desired dialog. This format is extensible to support a variety of other commands or conditions in future versions.
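The Key/Mouse/Browse commands and their dispatch can be sketched as below. The binding store, the dispatcher, and the example URL are assumptions; only the command shapes (COMMAND_NAME, DURATION, the −1 LOOP constant, and URL-versus-gesture handling) come from the description above.

```python
LOOP = -1  # duration constant indicating an infinite loop

bindings = {}  # hypothetical store of event -> (command, duration)

def Key(command, duration):    bindings["key"] = (command, duration)
def Mouse(command, duration):  bindings["mouse"] = (command, duration)
def Browse(command, duration): bindings["browse"] = (command, duration)

def on_event(kind):
    """Return the action an input event triggers immediately: a URL
    launch, or a gesture/behavior played for the given duration."""
    command, duration = bindings[kind]
    if command.startswith("http"):
        return ("launch_url", command)    # DURATION ignored for URLs
    return ("play", command, duration)

# Example bindings a developer might insert into the gesture list:
Browse("HAND_POINT", 2000)
Mouse("http://example.com/ad", 0)
```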
  • Dynamic Mapping of Gestures [0112]
  • At times, a developer may wish to cause specific character actions to occur under the control of HTML, Java, VB Script, server CGI script or some other software. In the case of HTML, different characters and dialogs are launched by passing the .INF file [0113] 112 URL as a parameter to browser control 210. To launch a new dialog, the HTML simply needs to set the parameter to the desired content to cause it to begin streaming. This is accomplished either upon loading of the web page or based on logic contained in embedded Java or VB script. Using embedded Java or VB script logic, events triggered from user input or built-in logic conditions can launch different character dialogs.
  • In addition to launching specific character dialogs from embedded script, a mechanism is provided for triggering specific character gestures or behaviors. Numeric values can be passed into [0114] browser control 210 corresponding to each of the supported gestures and behaviors, causing that action to be performed. For instance, a fragment of Java code in an HTML page can call browser control 210 with the parameter “2” (corresponding to the FACE_LEFT gesture command) to cause the character to unconditionally animate to face left. The Java fragment can be called from other Java script processing mouse events for a button or picture. In fact, entire applications can be written in Java or VB to create any number of control programs for characters.
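The numeric-parameter interface can be sketched as a simple code table. Code 2 mapping to FACE_LEFT follows the example above; the other code assignments are illustrative assumptions.

```python
# Hypothetical table mapping numeric values passed in from embedded
# script (HTML/Java/VB) to supported gesture commands.
GESTURE_CODES = {
    1: "FACE_FRONT",
    2: "FACE_LEFT",   # the "2" example from the text
    3: "FACE_RIGHT",
    4: "EYES_BLINK",
}

def trigger(code):
    """Resolve a numeric request from embedded script to the gesture
    the browser control should perform unconditionally."""
    gesture = GESTURE_CODES.get(code)
    if gesture is None:
        raise ValueError(f"unsupported gesture code: {code}")
    return gesture
```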
  • There is no significant problem triggering a different gesture than the one currently executing, other than a possible “snap to position” that would occur if the character was in a non-hookup state. The potential difficulties involve dynamic selection of mouth positions for a given pose and the possibility of falling out of synch with the dialog stream. [0115]
  • The design of the system and method described herein accounts for the dynamic mapping of mouth positions to gestures. The present system and method assume that the two are maintained separately, lips z-ordered above gestures in the proper position, and played together at runtime by [0116] browser control 210. To support dynamic mapping of mouth positions, a method is provided to process gesture requests—either from the gesture list or through a dynamic gesture request—and automatically select and composite the mouth position cels corresponding to the current gesture ANIMATE queue command. More specifically, since the gesture list and mouth positions list generate asynchronous requests, a mechanism services the requests, composites them and animates them in real time.
  • The solution to “falling out of synch” lies in remembering where the character should be in time synchronous with the continuing dialog stream. [0117] Browser control 210 handles this by holding the last cel of the interrupting gesture as needed and jumping to the next gesture in time in order to catch back up with the gesture list and dialog stream. If this case is not handled, the animation can remain out of synch for the remainder of the dialog.
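The catch-up step above can be sketched as locating which gesture-list entry should be active at the current dialog-stream time. The schedule format (start time in milliseconds paired with a gesture name) is an assumption for illustration.

```python
def resync(gesture_schedule, now_ms):
    """Return the index of the scheduled gesture that should be playing
    at now_ms, so playback can jump ahead and catch back up with the
    gesture list and dialog stream after an interrupting gesture."""
    current = 0
    for i, (start_ms, _name) in enumerate(gesture_schedule):
        if start_ms <= now_ms:
            current = i
        else:
            break
    return current

# Hypothetical schedule derived from a gesture list and its durations.
schedule = [(0, "FACE_FRONT"), (5000, "ARMS_UP"), (12000, "HAND_POINT")]
```

Holding the interrupting gesture's last cel until `now_ms` reaches the next entry, then jumping to `resync(...)`, keeps the animation from staying out of synch for the rest of the dialog.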
  • Another issue that arises when dynamic override of a character's choreography occurs is the possibility of triggering animation that has no mouth positions associated with it. It is the responsibility of the developer to take this possibility into account and not trigger a move that lacks associated mouth positions. Provided the chosen gestures or behaviors have mouth positions, [0118] browser control 210 can dynamically select the correct mouth positions synchronous to the audio stream, as it would normally do for any pre-produced gesture.
  • While the description above illustrates mainly animation related to Internet applications, it should be clear that the same or similar animation can be embedded into other Windows and similar applications using facilities such as Microsoft Object Linking and Embedding (“OLE”). Examples include animated talking characters in software such as Microsoft PowerPoint presentations, Microsoft Word, and other application software. The animation described above can be played back in a standalone window that can float on the desktop, as seen by the user on a computer monitor, outside of other application software such as an Internet browser or a productivity application. [0119]
  • Further, while particular illustrative embodiments have been described and illustrated, it should be understood that the system and method described herein are not limited thereto, since modifications may be made by persons skilled in the art while still falling within the scope and spirit of the appended claims. [0120]

Claims (1)

We claim:
1. An automatic animation generation system using a talking character lip-synched to speech, comprising: a developer terminal; a character database containing data representative of at least one character and at least one gesture associated with each said character; and an animation preparation application, said animation preparation application being in communication with said developer terminal and said character database, and said animation preparation application generating a plurality of output files representative of animation displaying a talking character lip-synched to speech and to gestures thereof.
US09/112,692 1998-02-26 1998-07-09 System and method for automatic animation generation Expired - Lifetime US6433784B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/112,692 US6433784B1 (en) 1998-02-26 1998-07-09 System and method for automatic animation generation

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/031,488 US6636219B2 (en) 1998-02-26 1998-02-26 System and method for automatic animation generation
US09/112,692 US6433784B1 (en) 1998-02-26 1998-07-09 System and method for automatic animation generation

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/031,488 Continuation-In-Part US6636219B2 (en) 1998-02-26 1998-02-26 System and method for automatic animation generation

Publications (2)

Publication Number Publication Date
US20020097244A1 true US20020097244A1 (en) 2002-07-25
US6433784B1 US6433784B1 (en) 2002-08-13

Family

ID=46276250

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/112,692 Expired - Lifetime US6433784B1 (en) 1998-02-26 1998-07-09 System and method for automatic animation generation

Country Status (1)

Country Link
US (1) US6433784B1 (en)


Families Citing this family (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8352400B2 (en) 1991-12-23 2013-01-08 Hoffberg Steven M Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore
US7284187B1 (en) * 1997-05-30 2007-10-16 Aol Llc, A Delaware Limited Liability Company Encapsulated document and format system
JP3384314B2 (en) * 1997-12-02 2003-03-10 ヤマハ株式会社 Tone response image generation system, method, apparatus, and recording medium therefor
US7904187B2 (en) 1999-02-01 2011-03-08 Hoffberg Steven M Internet appliance system and method
US6714202B2 (en) * 1999-12-02 2004-03-30 Canon Kabushiki Kaisha Method for encoding animation in an image file
CA2395207A1 (en) * 1999-12-23 2001-06-28 M.H. Segan Limited Partnership System for viewing content over a network and method therefor
DE10018143C5 (en) * 2000-04-12 2012-09-06 Oerlikon Trading Ag, Trübbach DLC layer system and method and apparatus for producing such a layer system
JP4547768B2 (en) * 2000-04-21 2010-09-22 ソニー株式会社 Information processing apparatus and method, and recording medium
JP3796397B2 (en) * 2000-06-15 2006-07-12 アイ・ティー・エックス翼ネット株式会社 Vehicle transaction system and vehicle transaction method
JP2002032009A (en) * 2000-07-14 2002-01-31 Sharp Corp Virtual character fostering server, virtual character fostering method and mechanically readable recording medium recorded with program realizing the method
JP2002041276A (en) * 2000-07-24 2002-02-08 Sony Corp Interactive operation-supporting system, interactive operation-supporting method and recording medium
JP2002109560A (en) * 2000-10-02 2002-04-12 Sharp Corp Animation reproducing unit, animation reproducing system, animation reproducing method, recording medium readable by computer storing program for executing animation reproducing method
US7349946B2 (en) * 2000-10-02 2008-03-25 Canon Kabushiki Kaisha Information processing system
US20080040227A1 (en) * 2000-11-03 2008-02-14 At&T Corp. System and method of marketing using a multi-media communication system
US6976082B1 (en) 2000-11-03 2005-12-13 At&T Corp. System and method for receiving multi-media messages
US6963839B1 (en) 2000-11-03 2005-11-08 At&T Corp. System and method of controlling sound in a multi-media communication application
US7203648B1 (en) 2000-11-03 2007-04-10 At&T Corp. Method for sending multi-media messages with customized audio
US6990452B1 (en) 2000-11-03 2006-01-24 At&T Corp. Method for sending multi-media messages using emoticons
US7091976B1 (en) 2000-11-03 2006-08-15 At&T Corp. System and method of customizing animated entities for use in a multi-media communication application
US6448483B1 (en) * 2001-02-28 2002-09-10 Wildtangent, Inc. Dance visualization of music
US7019741B2 (en) * 2001-03-23 2006-03-28 General Electric Company Methods and systems for simulating animation of web-based data files
US20020171746A1 (en) * 2001-04-09 2002-11-21 Eastman Kodak Company Template for an image capture device
US7671861B1 (en) * 2001-11-02 2010-03-02 At&T Intellectual Property Ii, L.P. Apparatus and method of customizing animated entities for use in a multi-media communication application
US7315820B1 (en) * 2001-11-30 2008-01-01 Total Synch, Llc Text-derived speech animation tool
JP3826073B2 (en) * 2002-06-05 2006-09-27 キヤノン株式会社 Screen saver creation system and method
US7827034B1 (en) 2002-11-27 2010-11-02 Totalsynch, Llc Text-derived speech animation tool
WO2004066200A2 (en) * 2003-01-17 2004-08-05 Yeda Research And Development Co. Ltd. Reactive animation
US7260539B2 (en) * 2003-04-25 2007-08-21 At&T Corp. System for low-latency animation of talking heads
US7173623B2 (en) * 2003-05-09 2007-02-06 Microsoft Corporation System supporting animation of graphical display elements through animation object instances
US7360151B1 (en) * 2003-05-27 2008-04-15 Walt Froloff System and method for creating custom specific text and emotive content message response templates for textual communications
US8442331B2 (en) 2004-02-15 2013-05-14 Google Inc. Capturing text from rendered documents using supplemental information
US7707039B2 (en) 2004-02-15 2010-04-27 Exbiblio B.V. Automatic modification of web pages
US10635723B2 (en) 2004-02-15 2020-04-28 Google Llc Search engines and systems with handheld document data capture devices
US7812860B2 (en) 2004-04-01 2010-10-12 Exbiblio B.V. Handheld device for capturing text from both a document printed on paper and a document displayed on a dynamic display device
US9116890B2 (en) 2004-04-01 2015-08-25 Google Inc. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US7894670B2 (en) 2004-04-01 2011-02-22 Exbiblio B.V. Triggering actions in response to optically or acoustically capturing keywords from a rendered document
US9143638B2 (en) 2004-04-01 2015-09-22 Google Inc. Data capture from rendered documents using handheld device
US7990556B2 (en) 2004-12-03 2011-08-02 Google Inc. Association of a portable scanner with input/output and storage devices
US8081849B2 (en) 2004-12-03 2011-12-20 Google Inc. Portable scanning and memory device
US20060081714A1 (en) 2004-08-23 2006-04-20 King Martin T Portable scanning device
US9008447B2 (en) 2004-04-01 2015-04-14 Google Inc. Method and system for character recognition
US20060098900A1 (en) 2004-09-27 2006-05-11 King Martin T Secure data gathering from rendered documents
US8146156B2 (en) 2004-04-01 2012-03-27 Google Inc. Archive of text captures from rendered documents
US8713418B2 (en) 2004-04-12 2014-04-29 Google Inc. Adding value to a rendered document
US8620083B2 (en) 2004-12-03 2013-12-31 Google Inc. Method and system for character recognition
US8874504B2 (en) 2004-12-03 2014-10-28 Google Inc. Processing techniques for visual capture data from a rendered document
US8489624B2 (en) 2004-05-17 2013-07-16 Google, Inc. Processing techniques for text capture from a rendered document
US8346620B2 (en) 2004-07-19 2013-01-01 Google Inc. Automatic modification of web pages
US20060109273A1 (en) * 2004-11-19 2006-05-25 Rams Joaquin S Real-time multi-media information and communications system
US7601904B2 (en) * 2005-08-03 2009-10-13 Richard Dreyfuss Interactive tool and appertaining method for creating a graphical music display
US20080284848A1 (en) * 2005-08-26 2008-11-20 Peter Martin Security surveillance planning tool kit
EP2067119A2 (en) 2006-09-08 2009-06-10 Exbiblio B.V. Optical scanners, such as hand-held optical scanners
US8547396B2 (en) * 2007-02-13 2013-10-01 Jaewoo Jung Systems and methods for generating personalized computer animation using game play data
US20090046097A1 (en) * 2007-08-09 2009-02-19 Scott Barrett Franklin Method of making animated video
US8638363B2 (en) 2009-02-18 2014-01-28 Google Inc. Automatically capturing information, such as capturing information using a document-aware device
US20090201298A1 (en) * 2008-02-08 2009-08-13 Jaewoo Jung System and method for creating computer animation with graphical user interface featuring storyboards
US9589381B2 (en) * 2008-06-12 2017-03-07 Microsoft Technology Licensing, Llc Copying of animation effects from a source object to at least one target object
US20100146388A1 (en) * 2008-12-05 2010-06-10 Nokia Corporation Method for defining content download parameters with simple gesture
US8447066B2 (en) 2009-03-12 2013-05-21 Google Inc. Performing actions based on capturing information from rendered documents, such as documents under copyright
DE202010018551U1 (en) 2009-03-12 2017-08-24 Google, Inc. Automatically deliver content associated with captured information, such as information collected in real-time
US9081799B2 (en) 2009-12-04 2015-07-14 Google Inc. Using gestalt information to identify locations in printed information
US9323784B2 (en) 2009-12-09 2016-04-26 Google Inc. Image search using text-based elements within the contents of images
US9286383B1 (en) 2014-08-28 2016-03-15 Sonic Bloom, LLC System and method for synchronization of data and audio
US11130066B1 (en) 2015-08-28 2021-09-28 Sonic Bloom, LLC System and method for synchronization of messages and events with a variable rate timeline undergoing processing delay in environments with inconsistent framerates
US10580187B2 (en) 2018-05-01 2020-03-03 Enas TARAWNEH System and method for rendering of an animated avatar

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5111409A (en) * 1989-07-21 1992-05-05 Elon Gasper Authoring and use systems for sound synchronized animation

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070226621A1 (en) * 2000-09-08 2007-09-27 Porto Ranelli, Sa Computerized advertising method and system
US20100146393A1 (en) * 2000-12-19 2010-06-10 Sparkpoint Software, Inc. System and method for multimedia authoring and playback
US10127944B2 (en) * 2000-12-19 2018-11-13 Resource Consortium Limited System and method for multimedia authoring and playback
US20020095332A1 (en) * 2001-01-16 2002-07-18 Doherty Timothy K. Internet advertisement system and method
US20030083854A1 (en) * 2001-10-26 2003-05-01 Cronin Thomas M. Particle control using a path
US20050168485A1 (en) * 2004-01-29 2005-08-04 Nattress Thomas G. System for combining a sequence of images with computer-generated 3D graphics
US7921136B1 (en) * 2004-03-11 2011-04-05 Navteq North America, Llc Method and system for using geographic data for developing scenes for entertainment features
US20060029913A1 (en) * 2004-08-06 2006-02-09 John Alfieri Alphabet based choreography method and system
WO2007033076A3 (en) * 2005-09-12 2007-11-22 Jinnyeo Jeong Interactive animation for entertainment and instruction using networked devices
US20070059676A1 (en) * 2005-09-12 2007-03-15 Jinnyeo Jeong Interactive animation for entertainment and instruction using networked devices
US20120026174A1 (en) * 2009-04-27 2012-02-02 Sonoma Data Solution, Llc Method and Apparatus for Character Animation
US10672284B2 (en) 2011-06-24 2020-06-02 Breakthrough Performance Tech, Llc Methods and systems for dynamically generating a training program
US11145216B2 (en) 2011-06-24 2021-10-12 Breakthrough Performancetech, Llc Methods and systems for dynamically generating a training program
US11769419B2 (en) 2011-06-24 2023-09-26 Breakthrough Performancetech, Llc Methods and systems for dynamically generating a training program
CN103346172A (en) * 2013-06-08 2013-10-09 英利集团有限公司 Hetero-junction solar battery and preparation method thereof
CN103531647A (en) * 2013-10-25 2014-01-22 英利集团有限公司 Heterojunction photovoltaic cell and preparation method thereof
CN109308730A (en) * 2018-09-10 2019-02-05 尹岩 A kind of action planning system based on simulation



Legal Events

Date Code Title Description
AS Assignment

Owner name: 7TH LEVEL, INC., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MERRICK, RICHARD;THENHAUS, MICHAEL;BELL, WESLEY;AND OTHERS;REEL/FRAME:009476/0157

Effective date: 19980910

AS Assignment

Owner name: LEARN2 CORPORATION, CONNECTICUT

Free format text: MERGER AND CHANGE OF NAME;ASSIGNOR:7TH LEVEL, INC.;REEL/FRAME:013032/0050

Effective date: 19990714

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: LEARN.COM, INC., FLORIDA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LEARN2 CORPORATION;REEL/FRAME:013496/0916

Effective date: 20020809

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: SILICON VALLEY BANK, GEORGIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:LEARN.COM, INC.;REEL/FRAME:018015/0782

Effective date: 20060728

AS Assignment

Owner name: SILICON VALLEY BANK, CALIFORNIA

Free format text: SECURITY AGREEMENT;ASSIGNOR:LEARN.COM, INC.;REEL/FRAME:021998/0981

Effective date: 20081125

AS Assignment

Owner name: LEARN.COM INC, FLORIDA

Free format text: RELEASE;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:023003/0462

Effective date: 20090723

Owner name: LEARN.COM INC, FLORIDA

Free format text: RELEASE;ASSIGNOR:SILICON VALLEY BANK;REEL/FRAME:023003/0449

Effective date: 20090723

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: L2 TECHNOLOGY, LLC, DELAWARE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LEARN.COM, INC.;REEL/FRAME:024933/0147

Effective date: 20100830

FEPP Fee payment procedure

Free format text: PAYER NUMBER DE-ASSIGNED (ORIGINAL EVENT CODE: RMPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: AFLUO, LLC, DELAWARE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:L 2 TECHNOLOGY, LLC;REEL/FRAME:027029/0727

Effective date: 20110930

FPAY Fee payment

Year of fee payment: 12