US20070115920A1 - Dialog authoring and execution framework - Google Patents
- Publication number
- US20070115920A1 (application number US11/253,047)
- Authority
- US
- United States
- Prior art keywords
- dialog
- communication
- interface
- computer
- message
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/107—Computer-aided management of electronic mailing [e-mailing]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L51/00—User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
Definitions
- the applications include contact center self-service applications such as call routing and customer account/personal information access.
- Other contact center applications are possible including travel reservations, financial and stock applications and customer relationship management.
- information technology groups can benefit from applications in the areas of sales and field-service automation, E-commerce, auto-attendants, help desk password reset applications and speech-enabled network management, for example.
- a framework to author and execute dialog applications is utilized in a communication architecture.
- the applications can be used with a plurality of different modes of communication.
- a message processed by the dialog application is used to determine a dialog state and provide an associated response.
- FIG. 1 is a front view of an exemplary mobile device.
- FIG. 2 is a block diagram of functional components for the mobile device of FIG. 1 .
- FIG. 3 is a front view of an exemplary phone.
- FIG. 4 is a block diagram of a general computing environment.
- FIG. 5 is a block diagram of a communication architecture for handling communication messages.
- FIG. 6 is a diagram of a plurality of dialog states.
- FIG. 7 is a block diagram of components in a user interface.
- FIG. 8 is a flow diagram of a method for handling communication messages.
- An exemplary form of a data management mobile device 30 is illustrated in FIG. 1.
- the mobile device 30 includes a housing 32 and has a user interface including a display 34 , which uses a contact sensitive display screen in conjunction with a stylus 33 .
- the stylus 33 is used to press or contact the display 34 at designated coordinates to select a field, to selectively move a starting position of a cursor, or to otherwise provide command information such as through gestures or handwriting.
- one or more buttons 35 can be included on the device 30 for navigation.
- other input mechanisms such as rotatable wheels, rollers or the like can also be provided.
- Another form of input can include a visual input such as through computer vision.
- Referring now to FIG. 2, a block diagram illustrates the functional components comprising the mobile device 30.
- a central processing unit (CPU) 50 implements the software control functions.
- CPU 50 is coupled to display 34 so that text and graphic icons generated in accordance with the controlling software appear on the display 34 .
- a speaker 43 can be coupled to CPU 50 typically with a digital-to-analog converter 59 to provide an audible output.
- RAM random access memory
- ROM read only memory
- ROM 58 can also be used to store the operating system software for the device that controls the basic functionality of the mobile device 30 and other operating system kernel functions (e.g., the loading of software components into RAM 54 ).
- RAM 54 also serves as storage for the code in a manner analogous to the function of a hard drive on a PC that is used to store application programs. It should be noted that although non-volatile memory is used for storing the code, the code can alternatively be stored in volatile memory that is not used for execution of the code.
- Wireless signals can be transmitted/received by the mobile device through a wireless transceiver 52 , which is coupled to CPU 50 .
- An optional communication interface 60 can also be provided for downloading data directly from a computer (e.g., desktop computer), or from a wired network, if desired. Accordingly, interface 60 can comprise various forms of communication devices, for example, an infrared link, modem, a network card, or the like.
- Mobile device 30 includes a microphone 29 , an analog-to-digital (A/D) converter 37 , and an optional recognition program (speech, DTMF, handwriting, gesture or computer vision) stored in store 54 .
- A/D analog-to-digital
- microphone 29 provides speech signals, which are digitized by A/D converter 37 .
- the speech recognition program can perform normalization and/or feature extraction functions on the digitized speech signals to obtain intermediate speech recognition results.
- speech and other data can be transmitted remotely, for example to an agent.
- a remote speech server can be utilized.
- Recognition results can be returned to mobile device 30 for rendering (e.g. visual and/or audible) thereon, and eventual transmission to the agent, wherein the agent and mobile device 30 interact based on communication messages.
- handwriting input can be digitized with or without pre-processing on device 30 .
- this form of input can be transmitted to a server for recognition wherein the recognition results are returned to at least one of the device 30 and/or a remote agent.
- DTMF data, gesture data and visual data can be processed similarly.
- device 30 (and the other forms of clients discussed below) would include necessary hardware such as a camera for visual input.
- FIG. 3 is a plan view of an exemplary embodiment of a portable phone 80 .
- the phone 80 includes a display 82 and a keypad 84 .
- the block diagram of FIG. 2 applies to the phone of FIG. 3 , although additional circuitry necessary to perform other functions may be required. For instance, a transceiver necessary to operate as a phone will be required for the embodiment of FIG. 2 ; however, such circuitry is not pertinent to the present invention.
- the agent is also operational with numerous other general purpose or special purpose computing systems, environments or configurations.
- Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, regular telephones (without any screen), personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, radio frequency identification (RFID) devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
- RFID radio frequency identification
- The following is a brief description of a general purpose computer 120 illustrated in FIG. 4.
- the computer 120 is again only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computer 120 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated therein.
- the invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
- program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
- the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
- program modules may be located in both local and remote computer storage media including memory storage devices. Tasks performed by the programs and modules are described below and with the aid of figures.
- the description and figures can be implemented as processor executable instructions, which can be written on any form of a computer readable medium.
- components of computer 120 may include, but are not limited to, a processing unit 140 , a system memory 150 , and a system bus 141 that couples various system components including the system memory to the processing unit 140 .
- the system bus 141 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures.
- bus architectures include Industry Standard Architecture (ISA) bus, Universal Serial Bus (USB), Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
- Computer 120 typically includes a variety of computer readable mediums.
- Computer readable mediums can be any available media that can be accessed by computer 120 and includes both volatile and nonvolatile media, removable and non-removable media.
- Computer readable mediums may comprise computer storage media and communication media.
- Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
- Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 120 .
- Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.
- modulated data signal means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
- communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
- the system memory 150 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 151 and random access memory (RAM) 152 .
- ROM read only memory
- RAM random access memory
- BIOS basic input/output system
- RAM 152 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 140 .
- FIG. 4 illustrates operating system 154, application programs 155, other program modules 156, and program data 157.
- the computer 120 may also include other removable/non-removable volatile/nonvolatile computer storage media.
- FIG. 4 illustrates a hard disk drive 161 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 171 that reads from or writes to a removable, nonvolatile magnetic disk 172 , and an optical disk drive 175 that reads from or writes to a removable, nonvolatile optical disk 176 such as a CD ROM or other optical media.
- removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like.
- the hard disk drive 161 is typically connected to the system bus 141 through a non-removable memory interface such as interface 160.
- magnetic disk drive 171 and optical disk drive 175 are typically connected to the system bus 141 by a removable memory interface, such as interface 170 .
- hard disk drive 161 is illustrated as storing operating system 164 , application programs 165 , other program modules 166 , and program data 167 . Note that these components can either be the same as or different from operating system 154 , application programs 155 , other program modules 156 , and program data 157 . Operating system 164 , application programs 165 , other program modules 166 , and program data 167 are given different numbers here to illustrate that, at a minimum, they are different copies.
- a user may enter commands and information into the computer 120 through input devices such as a keyboard 182 , a microphone 183 , and a pointing device 181 , such as a mouse, trackball or touch pad.
- Other input devices may include a joystick, game pad, satellite dish, scanner, or the like.
- a monitor 184 or other type of display device is also connected to the system bus 141 via an interface, such as a video interface 185 .
- computers may also include other peripheral output devices such as speakers 187 and printer 186 , which may be connected through an output peripheral interface 188 .
- the computer 120 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 194 .
- the remote computer 194 may be a personal computer, a hand-held device, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 120 .
- the logical connections depicted in FIG. 4 include a local area network (LAN) 191 and a wide area network (WAN) 193 , but may also include other networks.
- LAN local area network
- WAN wide area network
- Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
- When used in a LAN networking environment, the computer 120 is connected to the LAN 191 through a network interface or adapter 190.
- When used in a WAN networking environment, the computer 120 typically includes a modem 192 or other means for establishing communications over the WAN 193, such as the Internet.
- the modem 192 which may be internal or external, may be connected to the system bus 141 via the user input interface 180 , or other appropriate mechanism.
- program modules depicted relative to the computer 120 may be stored in the remote memory storage device.
- FIG. 4 illustrates remote application programs 195 as residing on remote computer 194 . It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
- GUI Graphical User Interface
- a well designed graphical user interface usually does not produce ambiguous references or require the underlying application to confirm a particular interpretation of the input received through the interface 180 .
- Because the interface is precise, there is typically no requirement that the user be queried further regarding the input, e.g., “Did you click on the ‘ok’ button?”
- an object model designed for a graphical user interface is very mechanical and rigid in its implementation.
- In contrast to an input from a graphical user interface, a natural language query or command will frequently translate into not just one, but a series of function calls to the input object model.
- natural language is a communication means in which human interlocutors rely on each other's intelligence, often unconsciously, to resolve ambiguities. In fact, natural language is regarded as “natural” exactly because it is not mechanical. Human interlocutors can resolve ambiguities based upon contextual information and cues regarding any number of domains surrounding the utterance. With human interlocutors, the sentence, “Forward the minutes to those in the review meeting on Friday” is a perfectly understandable sentence without any further explanations. However, from the mechanical point of view of a machine, specific details must be specified such as exactly what document and which meeting are being referred to, and exactly to whom the document should be sent.
- FIG. 5 illustrates an exemplary communication architecture 200 with an agent 202 .
- Agent 202 receives communication requests and/or messages from an initiator and performs tasks based on the requests and/or messages. The messages can be routed to a destination.
- An initiator can include a person, a device, a telephone, a remote personal information manager, etc. that connects to agent 202 .
- the messages from the initiator can take many forms including real time voice (for example from a simple telephone or through a voice over Internet protocol source), real time text (such as instant messaging), non-real time voice (for example a voicemail message) and non-real time text (for example through short message service (SMS) or email). Tasks are automatically performed by agent 202 , for example responding to a customer care inquiry sent by an initiator.
- SMS short message service
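The four message forms described above (voice or text, real time or non-real time) can be sketched as a small lookup. This is only an illustration: the channel names, `MessageForm`, and `classify` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from enum import Enum


class Medium(Enum):
    VOICE = "voice"
    TEXT = "text"


@dataclass(frozen=True)
class MessageForm:
    medium: Medium
    real_time: bool


# Hypothetical channel names mapped to the forms listed in the text.
CHANNEL_FORMS = {
    "telephone": MessageForm(Medium.VOICE, real_time=True),
    "voip": MessageForm(Medium.VOICE, real_time=True),
    "instant_message": MessageForm(Medium.TEXT, real_time=True),
    "voicemail": MessageForm(Medium.VOICE, real_time=False),
    "sms": MessageForm(Medium.TEXT, real_time=False),
    "email": MessageForm(Medium.TEXT, real_time=False),
}


def classify(channel: str) -> MessageForm:
    """Return the message form for a channel, or raise for unknown channels."""
    try:
        return CHANNEL_FORMS[channel]
    except KeyError:
        raise ValueError(f"unknown channel: {channel!r}") from None
```

An agent built along these lines could use the `real_time` flag to decide, for instance, whether a reply must be produced within the same session or can be sent later.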
- agent 202 can be implemented on a general purpose computer such as computer 120 discussed above.
- Agent 202 represents a single point of contact for a user dialog application. Thus, if a person wishes to interact with the dialog application, communication requests and messages are handled through agent 202. In this manner, the person need not contact agent 202 using a particular device. The person only needs to contact agent 202 through any desired device, and agent 202 handles and routes incoming communication requests and messages.
- a user can contact agent 202 through a number of different modes of communication.
- agent 202 can be accessed through a client such as a mobile device 30 (which herein also represents other forms of computing devices having a display screen, a microphone, a camera, a touch sensitive panel, etc., as required based on the form of input), or through phone 80 wherein communication is made audibly or through tones generated by phone 80 in response to keys depressed and wherein information from agent 202 can be provided audibly back to the user.
- agent 202 is unified in that whether information is obtained through device 30 or phone 80 , agent 202 can support either mode of operation.
- Agent 202 is operably coupled to multiple interfaces to receive communication messages.
- agent 202 can provide a response to different types of devices based on a mode of communication for the device.
- IP interface 204 receives and transmits information using packet switching technologies, for example using TCP/IP (Transmission Control Protocol/Internet Protocol).
- TCP/IP Transmission Control Protocol/Internet Protocol
- a computing device communicating using an internet protocol can thus interface with IP interface 204 .
- POTS interface 206 can interface with any type of circuit switching system including a Public Switched Telephone Network (PSTN), a private network (for example a corporate Private Branch Exchange (PBX)) and/or combinations thereof.
- PSTN Public Switched Telephone Network
- PBX Private Branch Exchange
- POTS interface 206 can include an FXO (Foreign Exchange Office) interface and an FXS (Foreign Exchange Station) interface for receiving information using circuit switching technologies.
- FXO Foreign Exchange Office
- FXS Foreign Exchange Station
- IP interface 204 and POTS interface 206 can be embodied in a single device such as an analog telephony adapter (ATA).
- ATA analog telephony adapter
- Other devices that can interface and transport audio data between a computer and a POTS can be used, such as “voice modems” that connect a POTS to a computer using a telephony application programming interface (TAPI).
- TAPI telephony application programming interface
- device 30 and agent 202 are commonly connected, and separately addressable, through a network 208, herein a wide area network such as the Internet. It therefore is not necessary that client 30 and agent 202 be physically located adjacent to each other.
- Client 30 can transmit data, for example speech, text and video data, using a specified protocol to IP interface 204 .
- communication between client 30 and IP interface 204 uses standardized protocols, for example SIP with RTP (Session Initiation Protocol with Real-time Transport Protocol), both Internet Engineering Task Force (IETF) standards.
- SIP with RTP Session Initiation Protocol with Real-time Transport Protocol
- IETF Internet Engineering Task Force
- Access to agent 202 through phone 80 includes connection of phone 80 to a wired or wireless telephone network 210 that, in turn, connects phone 80 to agent 202 through an FXO interface.
- phone 80 can directly connect to agent 202 through an FXS interface, which is a part of POTS interface 206.
- Both IP interface 204 and POTS interface 206 connect to agent 202 through a communication application programming interface (API) 212 .
- One implementation of communication API 212 is the Microsoft Real-Time Communication (RTC) Client API, developed by Microsoft Corporation of Redmond, Wash.
- RTC Real-Time Communication
- Another implementation of communication API 212 is Computer Supported Telecommunications Applications (ECMA-269/ISO 18051), or CSTA, an ISO/ECMA standard.
- Communication API 212 can facilitate multimodal communication applications, including applications for communication between two computers, between two phones and between a phone and a computer.
- Communication API 212 can also support audio and video calls, text-based messaging and application sharing.
- agent 202 is able to initiate communication to client 30 and/or phone 80 .
- Agent 202 also includes a dialog execution module 214 , a natural language processing unit 216 , dialog states 218 and prompts 220 .
- Dialog execution module 214 includes logic to handle communication requests and messages from communication API 212 and to perform tasks based on dialog states 218. These tasks can include transmitting a prompt from prompts 220.
- Dialog execution module 214 utilizes natural language processing unit 216 to perform various natural language processing tasks.
- Natural language processing unit 216 includes a recognition engine that is used to identify features in the user input. Recognition features for speech are usually words in the spoken language while recognition features for handwriting usually correspond to strokes in the user's handwriting.
- a language model such as a grammar can be used to recognize text within a speech utterance. As is known, recognition can also be provided for visual inputs.
- Dialog execution module 214 can use objects recognized by natural language processing unit 216 to determine a desired dialog state from dialog states 218. Dialog execution module 214 also accesses prompts 220 to provide an output to a person based on user input. Dialog states 218 can be stored as one or more files to be accessed by dialog execution module 214. Prompts 220 can be integrated into dialog states 218 or stored and accessed separately from dialog states 218. Prompts can be stored as text, audio and/or video data that is transmitted via communication API 212 to a user based on a request from the user. For example, an initial prompt may include, “Welcome to Acme Company Help Center, how can I help you?” The prompt is transmitted based on a mode of communication for the user. If the user connects to agent 202 using a phone, the prompt can be played audibly through the phone. If the user sends an email message, the agent 202 can respond with an email message.
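As a rough illustration of transmitting one prompt according to the user's mode of communication, the sketch below packages the same prompt text differently per channel. The function name `render_prompt`, the mode strings, and the shape of the returned dictionary are assumptions made for this example; the text does not specify how communication API 212 represents outgoing messages.

```python
# The welcome prompt quoted in the text.
WELCOME = "Welcome to Acme Company Help Center, how can I help you?"


def render_prompt(prompt_text: str, mode: str) -> dict:
    """Package one authored prompt for the channel the user arrived on."""
    if mode == "phone":
        # Assumed: a downstream audio player or speech synthesizer consumes this.
        return {"channel": "phone", "kind": "audio", "speak": prompt_text}
    if mode == "email":
        return {"channel": "email", "kind": "text", "body": prompt_text}
    if mode == "instant_message":
        return {"channel": "instant_message", "kind": "text", "body": prompt_text}
    raise ValueError(f"unsupported mode: {mode!r}")
```

The point of the sketch is that the prompt is authored once and only the packaging step depends on the channel.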
- dialog execution module 214 interprets communication messages received from a user in order to traverse through a dialog that includes a plurality of dialog states, for example dialog states 218 .
- the dialog can be configured as a help center with prompts for use in answering questions from a user.
- the dialog states 218 can be stored as a file to be accessed by dialog execution module 214 .
- the file can be authored independent of a particular communication mode that is used by a user to access agent 202 .
- dialog execution module 214 can include an application programming interface (API) to access dialog states 218 .
- API application programming interface
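A minimal sketch of dialog states stored as a file and read through a small access API might look as follows. The JSON layout, the `DialogStore` class, and the prompts for states 306, 308 and 304 are hypothetical; the text only says that the states can be stored in a file authored independently of the communication mode.

```python
import json

# Hypothetical file contents; only the welcome prompt is taken from the text.
DIALOG_FILE_TEXT = """
{
  "initial": "302",
  "end": "304",
  "states": {
    "302": {"prompt": "Welcome to Acme Company Help Center, how can I help you?",
            "next": ["306", "308"]},
    "306": {"prompt": "Let's reset your password.", "next": ["304"]},
    "308": {"prompt": "Let's look up your account.", "next": ["304"]},
    "304": {"prompt": "Goodbye.", "next": []}
  }
}
"""


class DialogStore:
    """Read-only access to dialog states parsed from a JSON document."""

    def __init__(self, text: str) -> None:
        data = json.loads(text)
        self.initial = data["initial"]
        self.end = data["end"]
        self._states = data["states"]
        # Reject files whose transitions point at undefined states.
        for state_id, state in self._states.items():
            for target in state["next"]:
                if target not in self._states:
                    raise ValueError(f"state {state_id} refers to unknown state {target}")

    def prompt(self, state_id: str) -> str:
        return self._states[state_id]["prompt"]

    def next_states(self, state_id: str) -> list:
        return list(self._states[state_id]["next"])
```

Nothing in the file mentions phone, email or instant messaging, which is what lets the same dialog serve every channel.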
- FIG. 6 is a diagram of an exemplary dialog 300 including a plurality of dialog states. Each state is represented by a circle and arrows represent transitions between two states.
- Dialog 300 includes an initial state 302 and an end state 304 .
- state 302 can include one or more processes or tasks to be performed.
- dialog state 302 can include a welcome prompt to be played and/or transmitted to a user.
- a further communication message can be received.
- dialog 300 moves to a next state.
- dialog 300 can transition to state 306 , state 308 , etc.
- Each of these states can include further associated tasks and prompts to conduct a dialog with a user.
- These states also include transitions to other states in dialog 300 .
- dialog 300 is traversed until end state 304 is reached.
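The traversal of dialog 300 from initial state 302 to end state 304 can be sketched as a walk over a transition table. The state numbers come from the text; the trigger words ("password", "account", "done") and the `traverse` function are assumptions for illustration only.

```python
# Hypothetical transitions: (current state, recognized input) -> next state.
TRANSITIONS = {
    ("302", "password"): "306",
    ("302", "account"): "308",
    ("306", "done"): "304",
    ("308", "done"): "304",
}


def traverse(inputs, initial="302", end="304"):
    """Follow transitions for each input and return the list of states visited."""
    state = initial
    path = [state]
    for token in inputs:
        if state == end:
            break  # the dialog is over; ignore any remaining input
        new_state = TRANSITIONS.get((state, token), state)
        if new_state != state:
            state = new_state
            path.append(state)
        # Unrecognized input leaves the dialog in its current state.
    return path
```

For example, `traverse(["password", "done"])` visits states 302, 306 and 304 in that order.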
- FIG. 7 is a block diagram of components in a user interface that allows a person to author a dialog, for example dialog 300 .
- the interface allows the person to create a state-based dialog.
- the interface enables creation of a dialog using a flowcharting tool.
- the tool allows the person to create dialog states as well as various properties associated with the dialog states. For example, the person can specify tasks 320 , a prompt 322 , a grammar 324 and next dialog states 326 for dialog state 302 .
- Tasks 320 include one or more processes that are run for dialog state 302 .
- Prompt 322 includes text, audio and/or video data that can be transmitted via communication API 212 .
- Grammar 324 allows an author to express natural language input that will drive state changes from dialog state 302 .
- grammar 324 can be a context-free grammar, n-gram, hybrid or other.
- Next dialog states 326 that can follow dialog state 302 , in this case dialog states 306 and 308 , can also be specified. Dialog states 306 and 308 can include their own specified tasks, prompts, grammars and next dialog states.
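The properties an author specifies for a dialog state (tasks 320, prompt 322, grammar 324 and next dialog states 326) could be modeled roughly as below. The `DialogState` class is hypothetical, and the grammar is reduced to a phrase-to-state mapping as a stand-in for the context-free or n-gram grammars mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DialogState:
    state_id: str
    prompt: str = ""
    # Processes to run when the state is entered.
    tasks: List[Callable[[], None]] = field(default_factory=list)
    # Simplified grammar: trigger phrase -> id of the state it leads to.
    grammar: Dict[str, str] = field(default_factory=dict)
    next_states: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Check that every grammar target is a declared next state."""
        unknown = set(self.grammar.values()) - set(self.next_states)
        if unknown:
            raise ValueError(
                f"state {self.state_id}: grammar targets {sorted(unknown)} "
                "are not listed as next states"
            )


# Dialog state 302 with next states 306 and 308, as in the text.
state_302 = DialogState(
    state_id="302",
    prompt="Welcome to Acme Company Help Center, how can I help you?",
    grammar={"password": "306", "account": "308"},
    next_states=["306", "308"],
)
state_302.validate()
```

A check like `validate` is the kind of consistency test a flowcharting tool might run before saving a dialog, though the text does not describe one.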
- FIG. 8 is a flow diagram of a method 350 performed by dialog execution module 214 .
- a communication message is received.
- a communication mode is determined based on the message received.
- the mode can be an email message, an instant message or a connection via a telephone system.
- the communication message is analyzed to determine a next dialog state for the dialog. This step can include dialog execution module 214 accessing natural language processing unit 216 to identify semantic information within the message. The semantic information can be used with a grammar to determine a next dialog state.
- tasks associated with the dialog state are executed.
- a communication message is then transmitted based on the dialog state and the communication mode at step 360 .
- the message can include one or more prompts associated with the dialog state.
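The steps of method 350 can be sketched end to end as one function. This is a simplified illustration under stated assumptions: the message is a dictionary already tagged with its channel, substring matching stands in for natural language processing unit 216, and all names (`determine_mode`, `next_state`, `handle_message`) are hypothetical.

```python
def determine_mode(message: dict) -> str:
    # Assumption: the transport layer tags each message with its channel.
    return message["channel"]


def next_state(current: str, text: str, grammar: dict) -> str:
    """Pick the next state by matching the current state's phrases in the text."""
    lowered = text.lower()
    for phrase, target in grammar.get(current, {}).items():
        if phrase in lowered:
            return target
    return current  # nothing recognized: remain in the current state


def handle_message(message: dict, current: str, grammar: dict,
                   tasks: dict, prompts: dict):
    """Receive a message, determine the mode, choose a state, run tasks, reply."""
    mode = determine_mode(message)
    state = next_state(current, message["text"], grammar)
    for task in tasks.get(state, []):
        task()
    # The reply goes back over the same mode the message arrived on.
    reply = {"channel": mode, "text": prompts.get(state, "")}
    return state, reply
```

Because the reply reuses the detected mode, the same grammar, tasks and prompts serve an email exchange and a telephone call alike.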
- a framework for authoring a dialog independent of a communication mode across a channel can thus be realized.
- a dialog execution module can communicate through various communication channels to communicate with a user. The dialog is accessed by the dialog execution module such that the dialog execution module can initiate and conduct a dialog regardless of a mode of communication that the user desires.
Abstract
A framework to author and execute dialog applications is utilized in a communication architecture. The applications can be used with a plurality of different modes of communication. A message processed by the dialog application is used to determine a dialog state and provide an associated response.
Description
- The discussion below is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.
- Remote applications from a broad variety of industries can be utilized across a computer network. For example, the applications include contact center self-service applications such as call routing and customer account/personal information access. Other contact center applications are possible including travel reservations, financial and stock applications and customer relationship management. Additionally, information technology groups can benefit from applications in the areas of sales and field-service automation, E-commerce, auto-attendants, help desk password reset applications and speech-enabled network management, for example.
- Traditional customer care has typically been handled through call centers manned by several human agents who answer telephones and respond to customer inquiries. Currently, many of these call centers are automated through telephony based Interactive Voice Response (IVR) systems employing a combination of Dual Tone Multi Frequency (DTMF) and Automatic Speech Recognition (ASR) technologies. Furthermore, customer care has been extended past telephony based systems into Instant Messaging (IM) and Email based systems. These different channels provide additional choices to the end customer, thereby increasing overall customer satisfaction. Automation of customer care across these various channels has currently been difficult as different tools are used for each channel.
- This Summary is provided to introduce some concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
- A framework to author and execute dialog applications is utilized in a communication architecture. The applications can be used with a plurality of different modes of communication. A message processed by the dialog application is used to determine a dialog state and provide an associated response.
- FIG. 1 is a front view of an exemplary mobile device.
- FIG. 2 is a block diagram of functional components for the mobile device of FIG. 1.
- FIG. 3 is a front view of an exemplary phone.
- FIG. 4 is a block diagram of a general computing environment.
- FIG. 5 is a block diagram of a communication architecture for handling communication messages.
- FIG. 6 is a diagram of a plurality of dialog states.
- FIG. 7 is a block diagram of components in a user interface.
- FIG. 8 is a flow diagram of a method for handling communication messages.
- Before describing an agent for handling communication messages and methods for implementing the same, it may be useful to generally describe computing devices that can function in a communication architecture. These devices can be used in various computing settings to utilize the agent across a computer network. For example, the devices can interact with the agent using natural language input of different modalities including text and speech. The devices discussed below are exemplary only and are not intended to limit the subject matter described herein.
- An exemplary form of a data management mobile device 30 is illustrated in FIG. 1. The mobile device 30 includes a housing 32 and has a user interface including a display 34, which uses a contact sensitive display screen in conjunction with a stylus 33. The stylus 33 is used to press or contact the display 34 at designated coordinates to select a field, to selectively move a starting position of a cursor, or to otherwise provide command information such as through gestures or handwriting. Alternatively, or in addition, one or more buttons 35 can be included on the device 30 for navigation. In addition, other input mechanisms such as rotatable wheels, rollers or the like can also be provided. Another form of input can include a visual input such as through computer vision.
- Referring now to FIG. 2, a block diagram illustrates the functional components comprising the mobile device 30. A central processing unit (CPU) 50 implements the software control functions. CPU 50 is coupled to display 34 so that text and graphic icons generated in accordance with the controlling software appear on the display 34. A speaker 43 can be coupled to CPU 50, typically with a digital-to-analog converter 59, to provide an audible output.
- Data that is downloaded or entered by the user into the mobile device 30 is stored in a non-volatile read/write random access memory store 54 bi-directionally coupled to the CPU 50. Random access memory (RAM) 54 provides volatile storage for instructions that are executed by CPU 50, and storage for temporary data, such as register values. Default values for configuration options and other variables are stored in a read only memory (ROM) 58. ROM 58 can also be used to store the operating system software for the device that controls the basic functionality of the mobile device 30 and other operating system kernel functions (e.g., the loading of software components into RAM 54).
- RAM 54 also serves as storage for the code in the manner analogous to the function of a hard drive on a PC that is used to store application programs. It should be noted that although non-volatile memory is used for storing the code, it alternatively can be stored in volatile memory that is not used for execution of the code.
- Wireless signals can be transmitted/received by the mobile device through a wireless transceiver 52, which is coupled to CPU 50. An optional communication interface 60 can also be provided for downloading data directly from a computer (e.g., desktop computer), or from a wired network, if desired. Accordingly, interface 60 can comprise various forms of communication devices, for example, an infrared link, modem, a network card, or the like.
- Mobile device 30 includes a microphone 29, an analog-to-digital (A/D) converter 37, and an optional recognition program (speech, DTMF, handwriting, gesture or computer vision) stored in store 54. By way of example, in response to audible information, instructions or commands from a user of device 30, microphone 29 provides speech signals, which are digitized by A/D converter 37. The speech recognition program can perform normalization and/or feature extraction functions on the digitized speech signals to obtain intermediate speech recognition results.
- Using wireless transceiver 52 or communication interface 60, speech and other data can be transmitted remotely, for example to an agent. When transmitting speech data, a remote speech server can be utilized. Recognition results can be returned to mobile device 30 for rendering (e.g. visual and/or audible) thereon, and eventual transmission to the agent, wherein the agent and mobile device 30 interact based on communication messages.
- Similar processing can be used for other forms of input. For example, handwriting input can be digitized with or without pre-processing on device 30. Like the speech data, this form of input can be transmitted to a server for recognition wherein the recognition results are returned to at least one of the device 30 and/or a remote agent. Likewise, DTMF data, gesture data and visual data can be processed similarly. Depending on the form of input, device 30 (and the other forms of clients discussed below) would include necessary hardware such as a camera for visual input.
- FIG. 3 is a plan view of an exemplary embodiment of a portable phone 80. The phone 80 includes a display 82 and a keypad 84. Generally, the block diagram of FIG. 2 applies to the phone of FIG. 3, although additional circuitry necessary to perform other functions may be required. For instance, a transceiver necessary to operate as a phone will be required for the embodiment of FIG. 2; however, such circuitry is not pertinent to the present invention.
- The agent is also operational with numerous other general purpose or special purpose computing systems, environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, regular telephones (without any screen), personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, radio frequency identification (RFID) devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
- The following is a brief description of a general purpose computer 120 illustrated in FIG. 4. However, the computer 120 is again only one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computer 120 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated therein.
- The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices. Tasks performed by the programs and modules are described below and with the aid of figures. Those skilled in the art can implement the description and figures as processor executable instructions, which can be written on any form of a computer readable medium.
- With reference to FIG. 4, components of computer 120 may include, but are not limited to, a processing unit 140, a system memory 150, and a system bus 141 that couples various system components including the system memory to the processing unit 140. The system bus 141 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Universal Serial Bus (USB), Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
- Computer 120 typically includes a variety of computer readable mediums. Computer readable mediums can be any available media that can be accessed by computer 120 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable mediums may comprise computer storage media and communication media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computer 120.
- Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
- The system memory 150 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) 151 and random access memory (RAM) 152. A basic input/output system 153 (BIOS), containing the basic routines that help to transfer information between elements within computer 120, such as during start-up, is typically stored in ROM 151. RAM 152 typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 140. By way of example, and not limitation, FIG. 4 illustrates operating system 154, application programs 155, other program modules 156, and program data 157.
- The computer 120 may also include other removable/non-removable volatile/nonvolatile computer storage media. By way of example only, FIG. 4 illustrates a hard disk drive 161 that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive 171 that reads from or writes to a removable, nonvolatile magnetic disk 172, and an optical disk drive 175 that reads from or writes to a removable, nonvolatile optical disk 176 such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The hard disk drive 161 is typically connected to the system bus 141 through a non-removable memory interface such as interface 160, and magnetic disk drive 171 and optical disk drive 175 are typically connected to the system bus 141 by a removable memory interface, such as interface 170.
- The drives and their associated computer storage media discussed above and illustrated in FIG. 4 provide storage of computer readable instructions, data structures, program modules and other data for the computer 120. In FIG. 4, for example, hard disk drive 161 is illustrated as storing operating system 164, application programs 165, other program modules 166, and program data 167. Note that these components can either be the same as or different from operating system 154, application programs 155, other program modules 156, and program data 157. Operating system 164, application programs 165, other program modules 166, and program data 167 are given different numbers here to illustrate that, at a minimum, they are different copies.
- A user may enter commands and information into the computer 120 through input devices such as a keyboard 182, a microphone 183, and a pointing device 181, such as a mouse, trackball or touch pad. Other input devices (not shown) may include a joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 140 through a user input interface 180 that is coupled to the system bus, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 184 or other type of display device is also connected to the system bus 141 via an interface, such as a video interface 185. In addition to the monitor, computers may also include other peripheral output devices such as speakers 187 and printer 186, which may be connected through an output peripheral interface 188.
- The computer 120 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 194. The remote computer 194 may be a personal computer, a hand-held device, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 120. The logical connections depicted in FIG. 4 include a local area network (LAN) 191 and a wide area network (WAN) 193, but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
- When used in a LAN networking environment, the computer 120 is connected to the LAN 191 through a network interface or adapter 190. When used in a WAN networking environment, the computer 120 typically includes a modem 192 or other means for establishing communications over the WAN 193, such as the Internet. The modem 192, which may be internal or external, may be connected to the system bus 141 via the user input interface 180, or other appropriate mechanism. In a networked environment, program modules depicted relative to the computer 120, or portions thereof, may be stored in the remote memory storage device. By way of example, and not limitation, FIG. 4 illustrates remote application programs 195 as residing on remote computer 194. It will be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
- Typically, application programs 155 have interacted with a user through a command line or a Graphical User Interface (GUI) through user input interface 180. However, in an effort to simplify and expand the use of computer systems, inputs have been developed which are capable of receiving natural language input from the user. In contrast to natural language or speech, a graphical user interface is precise. A well designed graphical user interface usually does not produce ambiguous references or require the underlying application to confirm a particular interpretation of the input received through the interface 180. For example, because the interface is precise, there is typically no requirement that the user be queried further regarding the input, e.g., “Did you click on the ‘ok’ button?” Typically, an object model designed for a graphical user interface is very mechanical and rigid in its implementation.
- In contrast to an input from a graphical user interface, a natural language query or command will frequently translate into not just one, but a series of function calls to the input object model. In contrast to the rigid, mechanical limitations of a traditional line input or graphical user interface, natural language is a communication means in which human interlocutors rely on each other's intelligence, often unconsciously, to resolve ambiguities. In fact, natural language is regarded as “natural” exactly because it is not mechanical. Human interlocutors can resolve ambiguities based upon contextual information and cues regarding any number of domains surrounding the utterance. With human interlocutors, the sentence, “Forward the minutes to those in the review meeting on Friday” is a perfectly understandable sentence without any further explanations. However, from the mechanical point of view of a machine, specific details must be specified such as exactly what document and which meeting are being referred to, and exactly to whom the document should be sent.
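As an illustration of the point above, the following minimal Python sketch shows how the single sentence about forwarding the minutes might expand into a series of function calls, several of which still carry details a machine would have to resolve. It is not taken from the described framework: the `Call` class, the `plan_forward_minutes` function, and every call and argument name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One hypothetical function call derived from a natural language command."""
    name: str
    args: dict = field(default_factory=dict)

    def unresolved(self):
        # Arguments left as None are details the machine would still have to ask about.
        return sorted(key for key, value in self.args.items() if value is None)


def plan_forward_minutes(utterance):
    """Expand one command into the series of calls it implies (illustrative only)."""
    text = utterance.lower()
    if "forward" not in text or "minutes" not in text:
        return []
    return [
        Call("find_meeting", {"title": "review meeting", "day": "Friday", "which_friday": None}),
        Call("find_document", {"kind": "minutes", "of_meeting": None}),
        Call("list_attendees", {"meeting": "review meeting"}),
        Call("send_document", {"recipients": "attendees"}),
    ]


calls = plan_forward_minutes("Forward the minutes to those in the review meeting on Friday")
for call in calls:
    print(call.name, "needs clarification on:", call.unresolved() or "nothing")
```

A graphical interface would collect each of these details through a separate, unambiguous control; the sketch only makes visible how many of them a single utterance leaves implicit.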
- FIG. 5 illustrates an exemplary communication architecture 200 with an agent 202. Agent 202 receives communication requests and/or messages from an initiator and performs tasks based on the requests and/or messages. The messages can be routed to a destination. An initiator can include a person, a device, a telephone, a remote personal information manager, etc. that connects to agent 202. The messages from the initiator can take many forms including real time voice (for example from a simple telephone or through a voice over Internet protocol source), real time text (such as instant messaging), non-real time voice (for example a voicemail message) and non-real time text (for example through short message service (SMS) or email). Tasks are automatically performed by agent 202, for example responding to a customer care inquiry sent by an initiator.
- In one embodiment, agent 202 can be implemented on a general purpose computer such as computer 120 discussed above. Agent 202 represents a single point of contact for a user dialog application. Thus, if a person wishes to interact with the dialog application, communication requests and messages are handled through agent 202. In this manner, the person need not contact agent 202 using a particular device. The person only needs to contact agent 202, which handles and routes incoming communication requests and messages, through any desired device.
- An initiator of a communication request or message can contact agent 202 through a number of different modes of communication. Generally, agent 202 can be accessed through a client such as a mobile device 30 (which herein also represents other forms of computing devices having a display screen, a microphone, a camera, a touch sensitive panel, etc., as required based on the form of input), or through phone 80 wherein communication is made audibly or through tones generated by phone 80 in response to keys depressed and wherein information from agent 202 can be provided audibly back to the user.
- More importantly though, agent 202 is unified in that whether information is obtained through device 30 or phone 80, agent 202 can support either mode of operation. Agent 202 is operably coupled to multiple interfaces to receive communication messages. Thus, agent 202 can provide a response to different types of devices based on a mode of communication for the device.
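The four message forms listed for the initiator, and the idea of answering in the initiator's own mode, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the agent's actual logic: the channel labels, the `classify` and `respond` functions, and the rule that voice forms receive an audio rendering are all hypothetical.

```python
from enum import Enum


class Form(Enum):
    REAL_TIME_VOICE = "real time voice"          # e.g. telephone, voice over IP
    REAL_TIME_TEXT = "real time text"            # e.g. instant messaging
    NON_REAL_TIME_VOICE = "non-real time voice"  # e.g. voicemail
    NON_REAL_TIME_TEXT = "non-real time text"    # e.g. SMS, email


# Hypothetical channel labels; the description names the channels, not these strings.
CHANNEL_FORMS = {
    "phone": Form.REAL_TIME_VOICE,
    "voip": Form.REAL_TIME_VOICE,
    "im": Form.REAL_TIME_TEXT,
    "voicemail": Form.NON_REAL_TIME_VOICE,
    "sms": Form.NON_REAL_TIME_TEXT,
    "email": Form.NON_REAL_TIME_TEXT,
}


def classify(channel):
    """Map an incoming channel label to one of the four message forms."""
    return CHANNEL_FORMS[channel]


def respond(channel, prompt_text):
    """Build a response on the same channel the initiator used."""
    form = classify(channel)
    # Assumption: voice forms are answered with audio, text forms with text.
    is_voice = form in (Form.REAL_TIME_VOICE, Form.NON_REAL_TIME_VOICE)
    return {
        "channel": channel,
        "rendering": "audio" if is_voice else "text",
        "content": prompt_text,
    }


print(respond("email", "How can I help you?"))
```

Keeping the classification in one table means the rest of the agent never branches on a specific device, which is the sense in which the agent is described as unified.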
- IP interface 204 receives and transmits information using packet switching technologies, for example using TCP/IP (Transmission Control Protocol/Internet Protocol). A computing device communicating using an internet protocol can thus interface with IP interface 204.
- POTS (Plain Old Telephone System, also referred to as Plain Old Telephone Service) interface 206 can interface with any type of circuit switching system including a Public Switched Telephone Network (PSTN), a private network (for example a corporate Private Branch Exchange (PBX)) and/or combinations thereof. Thus, POTS interface 206 can include an FXO (Foreign Exchange Office) interface and an FXS (Foreign Exchange Station) interface for receiving information using circuit switching technologies.
- IP interface 204 and POTS interface 206 can be embodied in a single device such as an analog telephony adapter (ATA). Other devices that can interface and transport audio data between a computer and a POTS can be used, such as “voice modems” that connect a POTS to a computer using a telephony application programming interface (TAPI).
- As illustrated in FIG. 5, device 30 and agent 202 are commonly connected, and separately addressable, through a network 208, herein a wide area network such as the Internet. It therefore is not necessary that client 30 and agent 202 be physically located adjacent each other. Client 30 can transmit data, for example speech, text and video data, using a specified protocol to IP interface 204. In one embodiment, communication between client 30 and IP interface 204 uses standardized protocols, for example SIP with RTP (Session Initiation Protocol with Real-time Transport Protocol), both Internet Engineering Task Force (IETF) standards.
- Access to agent 202 through phone 80 includes connection of phone 80 to a wired or wireless telephone network 210 that, in turn, connects phone 80 to agent 202 through an FXO interface. Alternatively, phone 80 can directly connect to agent 202 through an FXS interface, which is a part of POTS interface 206.
- Both IP interface 204 and POTS interface 206 connect to agent 202 through a communication application programming interface (API) 212. One implementation of communication API 212 is Microsoft Real-Time Communication (RTC) Client API, developed by Microsoft Corporation of Redmond, Wash. Another implementation of communication API 212 is Computer Supported Telecommunications Applications (ECMA-269/ISO 18051), or CSTA, an ISO/ECMA standard. Communication API 212 can facilitate multimodal communication applications, including applications for communication between two computers, between two phones and between a phone and a computer. Communication API 212 can also support audio and video calls, text-based messaging and application sharing. Thus, agent 202 is able to initiate communication to client 30 and/or phone 80.
- Agent 202 also includes a dialog execution module 214, a natural language processing unit 216, dialog states 218 and prompts 220. Dialog execution module 214 includes logic to handle communication requests and messages from communication API 212 as well as to perform tasks based on dialog states 218. These tasks can include transmitting a prompt from prompts 220.
- Dialog execution module 214 utilizes natural language processing unit 216 to perform various natural language processing tasks. Natural language processing unit 216 includes a recognition engine that is used to identify features in the user input. Recognition features for speech are usually words in the spoken language while recognition features for handwriting usually correspond to strokes in the user's handwriting. In one particular example, a language model such as a grammar can be used to recognize text within a speech utterance. As is known, recognition can also be provided for visual inputs.
- Dialog execution module 214 can use objects recognized by natural language processing unit 216 to determine a desired dialog state from dialog states 218. Dialog execution module 214 also accesses prompts 220 to provide an output to a person based on user input. Dialog states 218 can be stored as one or more files to be accessed by dialog execution module 214. Prompts 220 can be integrated into dialog states 218 or stored and accessed separately from dialog states 218. Prompts can be stored as text, audio and/or video data that is transmitted via communication API 212 to a user based on a request from the user. For example, an initial prompt may include, “Welcome to Acme Company Help Center, how can I help you?” The prompt is transmitted based on a mode of communication for the user. If the user connects to agent 202 using a phone, the prompt can be played audibly through the phone. If the user sends an email message, the agent 202 can respond with an email message.
- In operation, dialog execution module 214 interprets communication messages received from a user in order to traverse through a dialog that includes a plurality of dialog states, for example dialog states 218. In one embodiment, the dialog can be configured as a help center with prompts for use in answering questions from a user. The dialog states 218 can be stored as a file to be accessed by dialog execution module 214. The file can be authored independent of a particular communication mode that is used by a user to access agent 202. Thus, dialog execution module 214 can include an application programming interface (API) to access dialog states 218.
- FIG. 6 is a diagram of an exemplary dialog 300 including a plurality of dialog states. Each state is represented by a circle and arrows represent transitions between two states. Dialog 300 includes an initial state 302 and an end state 304. After a communication message is received by agent 202, dialog 300 is initiated and begins with state 302. State 302 can include one or more processes or tasks to be performed. For example, dialog state 302 can include a welcome prompt to be played and/or transmitted to a user. After the initial state 302, a further communication message can be received. Based on the communication message received, dialog 300 moves to a next state. For example, dialog 300 can transition to state 306, state 308, etc. Each of these states can include further associated tasks and prompts to conduct a dialog with a user. These states also include transitions to other states in dialog 300. Ultimately, dialog 300 is traversed until end state 304 is reached.
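The traversal of dialog 300 can be pictured as a walk over a directed graph. The sketch below is a minimal illustration, assuming a transition table keyed by the reference numerals of FIG. 6; the input labels (`billing`, `password`, `done`) and the choice to stay in the current state on unmatched input are assumptions made for the example, not details given in the description.

```python
# States are keyed by the reference numerals used for FIG. 6.
# The input labels ("billing", "password", "done") are invented for illustration.
TRANSITIONS = {
    302: {"billing": 306, "password": 308},
    306: {"done": 304},
    308: {"done": 304},
    304: {},  # end state: no outgoing transitions
}
INITIAL_STATE, END_STATE = 302, 304


def traverse(inputs):
    """Walk the dialog from the initial state, one recognized input per message."""
    state = INITIAL_STATE
    path = [state]
    for label in inputs:
        if state == END_STATE:
            break
        # Assumption: an input matching no transition leaves the state unchanged.
        state = TRANSITIONS[state].get(label, state)
        path.append(state)
    return path


print(traverse(["billing", "done"]))  # [302, 306, 304]
```

Because the table holds only state identifiers and input labels, nothing in it depends on whether the inputs arrived by phone, instant message or email.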
- FIG. 7 is a block diagram of components in a user interface that allows a person to author a dialog, for example dialog 300. The interface allows the person to create a state-based dialog. In one embodiment, the interface enables creation of a dialog using a flowcharting tool. The tool allows the person to create dialog states as well as various properties associated with the dialog states. For example, the person can specify tasks 320, a prompt 322, a grammar 324 and next dialog states 326 for dialog state 302.
- Tasks 320 include one or more processes that are run for dialog state 302. Prompt 322 includes text, audio and/or video data that can be transmitted via communication API 212. Grammar 324 allows an author to express natural language input that will drive state changes from dialog state 302. For example, grammar 324 can be a context-free grammar, n-gram, hybrid or other. Next dialog states 326 that can follow dialog state 302, in this case dialog states 306 and 308, can also be specified. Dialog states 306 and 308 can include their own specified tasks, prompts, grammars and next dialog states.
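The per-state properties described for FIG. 7 (tasks, a prompt, a grammar and next dialog states) map naturally onto a small record type that can be written to and read from a file. The following is a minimal sketch, assuming JSON as the file format and a keyword table as a stand-in for a real grammar; the `DialogState` field names and the `save_dialog` and `load_dialog` helpers are hypothetical and not part of the described authoring tool.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DialogState:
    """Authoring-time properties of one dialog state (field names are hypothetical)."""
    state_id: int
    tasks: list = field(default_factory=list)        # names of processes to run
    prompt: str = ""                                 # text here; could reference audio or video
    grammar: dict = field(default_factory=dict)      # keyword -> next state id (stand-in for a real grammar)
    next_states: list = field(default_factory=list)  # states that may follow this one


def save_dialog(states):
    """Serialize authored states to a JSON string standing in for a dialog file."""
    return json.dumps([asdict(state) for state in states], indent=2)


def load_dialog(text):
    """Rebuild DialogState objects, keyed by state id, from the serialized form."""
    return {entry["state_id"]: DialogState(**entry) for entry in json.loads(text)}


welcome = DialogState(
    state_id=302,
    tasks=["log_contact"],
    prompt="Welcome to Acme Company Help Center, how can I help you?",
    grammar={"billing": 306, "password": 308},
    next_states=[306, 308],
)
dialog = load_dialog(save_dialog([welcome]))
print(dialog[302].prompt)
```

Nothing in the record names a communication mode, which reflects the idea that the dialog is authored once and delivered over whichever channel the user chooses.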
- FIG. 8 is a flow diagram of a method 350 performed by dialog execution module 214. At step 352, a communication message is received. Next, at step 354, a communication mode is determined based on the message received. For example, the mode can be an email message, an instant message or a connection via a telephone system. At step 356, the communication message is analyzed to determine a next dialog state for the dialog. This step can include dialog execution module 214 accessing natural language processing unit 216 to identify semantic information within the message. The semantic information can be used with a grammar to determine a next dialog state. At step 358, tasks associated with the dialog state are executed. A communication message is then transmitted based on the dialog state and the communication mode at step 360. For example, the message can include one or more prompts associated with the dialog state. At step 362, it is determined whether or not the dialog is at an end state. If the dialog is not at an end state, the method 350 will proceed to step 352 to wait for a further communication message. If the end state has been reached, method 350 ends at step 364.
- A framework for authoring a dialog independent of a communication mode across a channel can thus be realized. A dialog execution module can communicate through various communication channels to communicate with a user. The dialog is accessed by the dialog execution module such that the dialog execution module can initiate and conduct a dialog regardless of a mode of communication that the user desires.
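The steps of method 350 in FIG. 8 can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the dialog execution module itself: the `DIALOG` table, its prompts and task names, the message dictionaries with `mode` and `body` keys, and the keyword matching that stands in for natural language processing unit 216 are all hypothetical.

```python
from collections import deque

# Hypothetical dialog: state id -> prompt, keyword transitions and task names.
DIALOG = {
    302: {"prompt": "Welcome, how can I help you?", "next": {"billing": 306}, "tasks": []},
    306: {"prompt": "Connecting you to billing.", "next": {"bye": 304}, "tasks": ["open_ticket"]},
    304: {"prompt": "Goodbye.", "next": {}, "tasks": []},
}
END_STATE = 304


def determine_mode(message):
    """Step 354: determine the communication mode (here simply read from the message)."""
    return message["mode"]


def next_state(state, message):
    """Step 356: keyword matching stands in for natural language processing."""
    for keyword, target in DIALOG[state]["next"].items():
        if keyword in message["body"].lower():
            return target
    return state  # assumption: unmatched input keeps the current state


def run_dialog(inbox, start=302):
    """Steps 352-364: loop until the end state is reached or messages run out."""
    state, sent, executed = start, [], []
    while inbox:
        message = inbox.popleft()                     # step 352: receive a message
        mode = determine_mode(message)                # step 354: determine the mode
        state = next_state(state, message)            # step 356: determine the next state
        executed.extend(DIALOG[state]["tasks"])       # step 358: execute the state's tasks
        sent.append((mode, DIALOG[state]["prompt"]))  # step 360: reply in the same mode
        if state == END_STATE:                        # step 362: end state reached?
            break                                     # step 364: method ends
    return state, sent, executed


inbox = deque([
    {"mode": "email", "body": "Hello"},
    {"mode": "email", "body": "I have a billing question"},
    {"mode": "email", "body": "Bye"},
])
final_state, sent, executed = run_dialog(inbox)
print(final_state, executed)
```

Swapping the `mode` value in the messages changes only the first element of each entry in `sent`; the path through the dialog states is unaffected, which is the mode independence the framework aims for.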
- Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Claims (20)
1. A method of handling communication messages in a communication architecture, comprising:
receiving a first communication message from a source;
identifying a mode of communication associated with the first communication message;
determining a dialog state based on the first communication message;
transmitting a second communication message based on the dialog state to the source using the mode of communication.
2. The method of claim 1 and further comprising accessing a dialog file containing a plurality of specified dialog states.
3. The method of claim 2 wherein each of the dialog states includes associated properties including at least one of a task, a prompt and a related dialog state.
4. The method of claim 1 and further comprising performing a task based on the dialog state.
5. The method of claim 1 and further comprising analyzing the first communication message to determine semantic information contained therein and wherein the dialog state is determined based on the semantic information.
6. The method of claim 1 wherein the mode of communication is one of email, instant messaging and telephony.
7. The method of claim 1 wherein the first communication message includes one of speech data and text data.
8. A computer-readable medium adapted to process a communication message from a source having a mode of communication, comprising:
a dialog execution module adapted to access a plurality of dialog states to determine a dialog state based on the communication message; and
a communication interface coupled to the dialog execution module and adapted to transmit a response to the source based on the dialog state and the mode of communication.
9. The computer-readable medium of claim 8 wherein the dialog execution module is further adapted to analyze the communication message to determine semantic information contained therein.
10. The computer-readable medium of claim 9 wherein the next dialog state is determined based on the semantic information.
11. The computer-readable medium of claim 10 wherein the dialog execution module is adapted to access a language model to determine the dialog state based on the semantic information.
12. The computer-readable medium of claim 8 wherein the communication interface is adapted to transmit the response to an internet protocol source and a POTS source.
13. The computer-readable medium of claim 8 wherein the dialog execution module is adapted to access a prompt to determine the response.
14. A system comprising:
a communication interface adapted to receive communication messages from a plurality of different modes of communication and transmit communication messages based on the plurality of different modes of communication;
a dialog file including a plurality of dialog states, each dialog state having associated properties; and
a dialog execution module coupled to the communication interface to receive communication messages therefrom, adapted to access the dialog file to determine a dialog state based on a particular communication message and provide a response associated with the dialog state to the communication interface.
15. The system of claim 14 wherein the associated properties include a prompt, a language model and a related dialog state.
16. The system of claim 14 and further comprising a natural language processing unit coupled to the dialog execution module to identify semantic information within the communication messages.
17. The system of claim 14 and further comprising an internet protocol interface and a POTS interface coupled to the communication interface.
18. The system of claim 14 wherein the dialog execution module includes an application programming interface to access the dialog file.
19. The system of claim 14 wherein the communication messages include at least one of speech data and text data.
20. The system of claim 14 wherein the communication interface is adapted to transmit at least one of an email message and an audio message.
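The arrangement recited in claims 8 through 20 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: every class name, the in-memory `DIALOG_FILE` layout, and the keyword-matching "language model" are assumptions made for illustration only. For brevity, the sketch merges the language model and the related dialog state of claim 15 into a single `transitions` mapping, and it assumes speech input has already been transcribed to text.

```python
from dataclasses import dataclass, field


@dataclass
class DialogState:
    """One dialog-file entry: a prompt plus keyword-driven links to related states."""
    prompt: str
    # Stand-in "language model": keyword -> name of the related dialog state.
    transitions: dict = field(default_factory=dict)


@dataclass
class Message:
    """A communication message and the mode it arrived on (hypothetical type)."""
    mode: str  # "text" or "speech"
    text: str  # for speech, a transcript is assumed to be available


# Hypothetical dialog file contents; the claims do not prescribe a format.
DIALOG_FILE = {
    "welcome": DialogState(
        prompt="Welcome. Say 'balance' or 'agent'.",
        transitions={"balance": "balance", "agent": "agent"},
    ),
    "balance": DialogState(prompt="Your balance is available online."),
    "agent": DialogState(prompt="Connecting you to an agent."),
}


class DialogExecutionModule:
    """Determines a dialog state from a message and returns its prompt."""

    def __init__(self, dialog_file, start="welcome"):
        self.dialog_file = dialog_file
        self.current = start

    def extract_semantics(self, message):
        # Toy semantic analysis: lower-cased words with punctuation stripped.
        return [word.strip(".,!?") for word in message.text.lower().split()]

    def handle(self, message):
        state = self.dialog_file[self.current]
        for token in self.extract_semantics(message):
            if token in state.transitions:
                self.current = state.transitions[token]
                break
        # The response is the prompt of the (possibly unchanged) state.
        return self.dialog_file[self.current].prompt


class CommunicationInterface:
    """Shapes the response according to the source's mode of communication."""

    def __init__(self, engine):
        self.engine = engine

    def receive(self, message):
        prompt = self.engine.handle(message)
        if message.mode == "speech":
            # Placeholder for text-to-speech or a recorded audio prompt.
            return {"mode": "speech", "audio_text": prompt}
        return {"mode": "text", "body": prompt}


if __name__ == "__main__":
    interface = CommunicationInterface(DialogExecutionModule(DIALOG_FILE))
    print(interface.receive(Message(mode="text", text="I want my balance, please")))
```

Keeping the dialog states in one data structure and the mode handling in a separate interface class mirrors the separation the claims draw: the same dialog file serves text and speech sources, and only the final rendering step differs.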
Priority Applications (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/253,047 US20070115920A1 (en) | 2005-10-18 | 2005-10-18 | Dialog authoring and execution framework |
JP2008536601A JP2009512393A (en) | 2005-10-18 | 2006-10-03 | Dialog creation and execution framework |
EP06816184A EP1941435A4 (en) | 2005-10-18 | 2006-10-03 | Dialog authoring and execution framework |
PCT/US2006/038740 WO2007047105A1 (en) | 2005-10-18 | 2006-10-03 | Dialog authoring and execution framework |
CNA200680038585XA CN101292256A (en) | 2005-10-18 | 2006-10-03 | Dialog authoring and execution framework |
KR1020087009169A KR101251697B1 (en) | 2005-10-18 | 2006-10-03 | Dialog authoring and execution framework |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/253,047 US20070115920A1 (en) | 2005-10-18 | 2005-10-18 | Dialog authoring and execution framework |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070115920A1 (en) | 2007-05-24 |
Family
ID=37962817
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/253,047 Abandoned US20070115920A1 (en) | 2005-10-18 | 2005-10-18 | Dialog authoring and execution framework |
Country Status (6)
Country | Link |
---|---|
US (1) | US20070115920A1 (en) |
EP (1) | EP1941435A4 (en) |
JP (1) | JP2009512393A (en) |
KR (1) | KR101251697B1 (en) |
CN (1) | CN101292256A (en) |
WO (1) | WO2007047105A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070198272A1 (en) * | 2006-02-20 | 2007-08-23 | Masaru Horioka | Voice response system |
US20100124325A1 (en) * | 2008-11-19 | 2010-05-20 | Robert Bosch Gmbh | System and Method for Interacting with Live Agents in an Automated Call Center |
US20140269490A1 (en) * | 2013-03-12 | 2014-09-18 | Vonage Network, Llc | Systems and methods of configuring a terminal adapter for use with an ip telephony system |
US20190147867A1 (en) * | 2017-11-10 | 2019-05-16 | Hyundai Motor Company | Dialogue system and method for controlling thereof |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10462619B2 (en) | 2016-06-08 | 2019-10-29 | Google Llc | Providing a personal assistant module with a selectively-traversable state machine |
US10621984B2 (en) | 2017-10-04 | 2020-04-14 | Google Llc | User-configured and customized interactive dialog application |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5357596A (en) * | 1991-11-18 | 1994-10-18 | Kabushiki Kaisha Toshiba | Speech dialogue system for facilitating improved human-computer interaction |
US5396536A (en) * | 1992-06-23 | 1995-03-07 | At&T Corp. | Automatic processing of calls with different communication modes in a telecommunications system |
US20010005382A1 (en) * | 1999-07-13 | 2001-06-28 | Inter Voice Limited Partnership | System and method for packet network media redirection |
US6389132B1 (en) * | 1999-10-13 | 2002-05-14 | Avaya Technology Corp. | Multi-tasking, web-based call center |
US20030126330A1 (en) * | 2001-12-28 | 2003-07-03 | Senaka Balasuriya | Multimodal communication method and apparatus with multimodal profile |
US20030179876A1 (en) * | 2002-01-29 | 2003-09-25 | Fox Stephen C. | Answer resource management system and method |
US20040083092A1 (en) * | 2002-09-12 | 2004-04-29 | Valles Luis Calixto | Apparatus and methods for developing conversational applications |
US20040098253A1 (en) * | 2000-11-30 | 2004-05-20 | Bruce Balentine | Method and system for preventing error amplification in natural language dialogues |
US20050004800A1 (en) * | 2003-07-03 | 2005-01-06 | Kuansan Wang | Combining use of a stepwise markup language and an object oriented development tool |
US20050105712A1 (en) * | 2003-02-11 | 2005-05-19 | Williams David R. | Machine learning |
US6985576B1 (en) * | 1999-12-02 | 2006-01-10 | Worldcom, Inc. | Method and apparatus for automatic call distribution |
US7519665B1 (en) * | 2000-03-30 | 2009-04-14 | Fujitsu Limited | Multi-channel processing control device and multi-channel processing control method |
US7546546B2 (en) * | 2005-08-24 | 2009-06-09 | International Business Machines Corporation | User defined contextual desktop folders |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100314084B1 (en) * | 1999-12-07 | 2001-11-15 | 구자홍 | Web call center system using internet web browser |
KR20020015908A (en) * | 2000-08-23 | 2002-03-02 | 전영 | Real Time Internet Call System Using Video And Audio |
WO2002073331A2 (en) | 2001-02-20 | 2002-09-19 | Semantic Edge Gmbh | Natural language context-sensitive and knowledge-based interaction environment for dynamic and flexible product, service and information search and presentation applications |
KR100679807B1 (en) * | 2001-09-29 | 2007-02-07 | 주식회사 케이티 | A Messaging Service System in PSTN/ISDN network |
JP3777337B2 (en) * | 2002-03-27 | 2006-05-24 | ドコモ・モバイルメディア関西株式会社 | Data server access control method, system thereof, management apparatus, computer program, and recording medium |
JP2004289803A (en) * | 2003-03-04 | 2004-10-14 | Omron Corp | Interactive system, dialogue control method, and interactive control program |
US7363027B2 (en) * | 2003-11-11 | 2008-04-22 | Microsoft Corporation | Sequential multimodal input |
2005
- 2005-10-18: US application US11/253,047, published as US20070115920A1 (en); status: not active, Abandoned
2006
- 2006-10-03: CN application CNA200680038585XA, published as CN101292256A (en); status: active, Pending
- 2006-10-03: JP application JP2008536601A, published as JP2009512393A (en); status: active, Pending
- 2006-10-03: EP application EP06816184A, published as EP1941435A4 (en); status: not active, Withdrawn
- 2006-10-03: KR application KR1020087009169A, published as KR101251697B1 (en); status: not active, IP Right Cessation
- 2006-10-03: WO application PCT/US2006/038740, published as WO2007047105A1 (en); status: active, Application Filing
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070198272A1 (en) * | 2006-02-20 | 2007-08-23 | Masaru Horioka | Voice response system |
US20090141871A1 (en) * | 2006-02-20 | 2009-06-04 | International Business Machines Corporation | Voice response system |
US8095371B2 (en) * | 2006-02-20 | 2012-01-10 | Nuance Communications, Inc. | Computer-implemented voice response method using a dialog state diagram to facilitate operator intervention |
US8145494B2 (en) * | 2006-02-20 | 2012-03-27 | Nuance Communications, Inc. | Voice response system |
US20100124325A1 (en) * | 2008-11-19 | 2010-05-20 | Robert Bosch Gmbh | System and Method for Interacting with Live Agents in an Automated Call Center |
US8943394B2 (en) * | 2008-11-19 | 2015-01-27 | Robert Bosch Gmbh | System and method for interacting with live agents in an automated call center |
US20140269490A1 (en) * | 2013-03-12 | 2014-09-18 | Vonage Network, Llc | Systems and methods of configuring a terminal adapter for use with an ip telephony system |
US20190147867A1 (en) * | 2017-11-10 | 2019-05-16 | Hyundai Motor Company | Dialogue system and method for controlling thereof |
US10937420B2 (en) * | 2017-11-10 | 2021-03-02 | Hyundai Motor Company | Dialogue system and method to identify service from state and input information |
Also Published As
Publication number | Publication date |
---|---|
JP2009512393A (en) | 2009-03-19 |
CN101292256A (en) | 2008-10-22 |
EP1941435A4 (en) | 2012-11-07 |
KR101251697B1 (en) | 2013-04-05 |
EP1941435A1 (en) | 2008-07-09 |
KR20080058408A (en) | 2008-06-25 |
WO2007047105A1 (en) | 2007-04-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7285949B2 (en) | Systems and methods for assisting agents via artificial intelligence | |
US7921214B2 (en) | Switching between modalities in a speech application environment extended for interactive text exchanges | |
US8239204B2 (en) | Inferring switching conditions for switching between modalities in a speech application environment extended for interactive text exchanges | |
US7801968B2 (en) | Delegated presence for unified messaging/unified communication | |
JP4550362B2 (en) | Voice-enabled user interface for voice mail system | |
US7409349B2 (en) | Servers for web enabled speech recognition | |
US8515028B2 (en) | System and method for externally mapping an Interactive Voice Response menu | |
US20080037745A1 (en) | Systems, Methods, And Media For Automated Conference Calling | |
US8155276B2 (en) | Synchronous and asynchronous brokering of IVR sessions for offline interaction and response | |
US7653547B2 (en) | Method for testing a speech server | |
US20210157989A1 (en) | Systems and methods for dialog management | |
US20070239458A1 (en) | Automatic identification of timing problems from speech data | |
US20020069060A1 (en) | Method and system for automatically managing a voice-based communications systems | |
US20160094491A1 (en) | Pattern-controlled automated messaging system | |
US20070115920A1 (en) | Dialog authoring and execution framework | |
US20040092293A1 (en) | Third-party call control type simultaneous interpretation system and method thereof | |
US20070294349A1 (en) | Performing tasks based on status information | |
CN109887483A (en) | Self-Service processing method, device, computer equipment and storage medium | |
US10984229B2 (en) | Interactive sign language response system and method | |
JP3761158B2 (en) | Telephone response support apparatus and method | |
US20060056601A1 (en) | Method and apparatus for executing tasks in voice-activated command systems | |
JPH08263401A (en) | Method and apparatus for adjustment and maintenance of data | |
EP3535752B1 (en) | System and method for parameterization of speech recognition grammar specification | |
CN101588418A (en) | Treatment processing of a plurality of streaming voice signals for determination of responsive action thereto | |
Duerr | Voice recognition in the telecommunications industry |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: MICROSOFT CORPORATION, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: RAMAKRISHNA, ANAND; REEL/FRAME: 016992/0732. Effective date: 20051012 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MICROSOFT CORPORATION; REEL/FRAME: 034543/0001. Effective date: 20141014 |