US20130077769A1 - Enhanced voicemail usage through automatic voicemail preview - Google Patents

Enhanced voicemail usage through automatic voicemail preview

Info

Publication number
US20130077769A1
Authority
US
United States
Prior art keywords
voicemail
calling party
preview
user
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/681,633
Inventor
Jon Hamaker
Keith Herold
Michael Wilson
David Notario
Tom Millet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US13/681,633
Assigned to MICROSOFT CORPORATION reassignment MICROSOFT CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HAMAKER, JON, HEROLD, KEITH, MILLETT, TOM, WILSON, MICHAEL
Publication of US20130077769A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC reassignment MICROSOFT TECHNOLOGY LICENSING, LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MICROSOFT CORPORATION
Current legal status: Abandoned

Classifications

    • G06Q50/60
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/271 Devices whereby a plurality of signals may be stored simultaneously controlled by voice recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/533 Voice mail systems
    • H04M3/53333 Message receiving aspects
    • H04M3/53358 Message preview
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/107 Computer-aided management of electronic mailing [e-mailing]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/64 Automatic arrangements for answering calls; Automatic arrangements for recording messages for absent subscribers; Arrangements for recording conversations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53 Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/533 Voice mail systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/25 Aspects of automatic or semi-automatic exchanges related to user interface aspects of the telephonic communication service
    • H04M2203/251 Aspects of automatic or semi-automatic exchanges related to user interface aspects of the telephonic communication service where a voice mode or a visual mode can be used interchangeably
    • H04M2203/253 Aspects of automatic or semi-automatic exchanges related to user interface aspects of the telephonic communication service where a voice mode or a visual mode can be used interchangeably where a visual mode is used instead of a voice mode
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/45 Aspects of automatic or semi-automatic exchanges related to voicemail messaging
    • H04M2203/4509 Unified messaging with single point of access to voicemail and other mail or messaging systems

Definitions

  • Embodiments are directed to enabling voicemail preview in a combination of text and audio formats taking advantage of information automatically extracted from data sources associated with a user and voicemail metadata.
  • elements of and interactions with the voicemail preview extend a value of the voicemail beyond simple reading of the text.
  • information in the voicemail is surfaced and made actionable using contextual data.
  • FIG. 1 is a diagram illustrating an example unified communications system, where embodiments may be implemented for enhanced voicemail usage through automatic voicemail preview;
  • FIG. 2 is a conceptual diagram illustrating a basic example system for providing enhanced voicemail preview;
  • FIG. 3 illustrates major components in providing enhanced voicemail preview according to embodiments;
  • FIG. 4 illustrates a screenshot of an example user interface for providing enhanced voicemail preview;
  • FIG. 5 is a networked environment, where a system according to embodiments may be implemented;
  • FIG. 6 is a block diagram of an example computing operating environment, where embodiments may be implemented; and
  • FIG. 7 illustrates a logic flow diagram for providing enhanced voicemail preview according to embodiments.
  • a textual preview of a voicemail may be generated and provided through email or similar media to users along with the audio version. Transcription to a textual version, as well as the addition of capabilities such as actionable information, is performed based on contextual data obtained from user-associated data stores and voicemail metadata.
  • references are made to the accompanying drawings that form a part hereof, and in which are shown by way of illustrations specific embodiments or examples. These aspects may be combined, other aspects may be utilized, and structural changes may be made without departing from the spirit or scope of the present disclosure. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims and their equivalents.
  • program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types.
  • embodiments may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and comparable computing devices.
  • Embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote memory storage devices.
  • Embodiments may be implemented as a computer-implemented process (method), a computing system, or as an article of manufacture, such as a computer program product or computer readable media.
  • the computer program product may be a computer storage medium readable by a computer system and encoding a computer program that comprises instructions for causing a computer or computing system to perform example process(es).
  • the computer-readable storage medium can for example be implemented via one or more of a volatile computer memory, a non-volatile memory, a hard drive, a flash drive, a floppy disk, or a compact disk, and comparable media.
  • the computer program product may also be a propagated signal on a carrier (e.g. a frequency or phase modulated signal) or medium readable by a computing system and encoding a computer program of instructions for executing a computer process.
  • platform may be a combination of software and hardware components for managing communication applications utilized for voicemail preview delivery. Examples of platforms include, but are not limited to, a hosted service executed over a plurality of servers, an application executed on a single server, and comparable systems.
  • server generally refers to a computing device executing one or more software programs typically in a networked environment. However, a server may also be implemented as a virtual server (software programs) executed on one or more computing devices viewed as a server on the network. More detail on these technologies and example operations is provided below.
  • voicemail preview refers to textual data derived from a voicemail marked up with actionable items and integrated into a productivity service such as an email application, an instant message application and similar communication applications.
  • the textual data may include machine transcriptions (e.g. automatic voice recognition), human transcriptions, and comparable forms that may reflect the complete voicemail or a summary of the same.
  • the data derived from voicemail may also include graphical features.
  • a unified communication system is an example of modern communication systems with a wide range of capabilities and services that can be provided to subscribers such as enhanced voicemail preview.
  • a unified communication system is a real-time communications system facilitating instant messaging, presence, audio-video conferencing, web conferencing functionality, and comparable capabilities.
  • a unified communication (“UC”) system such as the one shown in diagram 100
  • users may communicate via a variety of end devices ( 102 , 104 ), which are client devices of the UC system.
  • Each client device may be capable of executing one or more communication applications for voice communication, video communication, instant messaging, application sharing, data sharing, and the like.
  • the end devices may also facilitate traditional phone calls through an external connection such as through PBX 124 to a Public Switched Telephone Network (“PSTN”).
  • End devices may include any type of smart phone, cellular phone, any computing device executing a communication application, a smart automobile console, and advanced phone devices with additional functionality.
  • UC Network(s) 110 includes a number of servers performing different tasks.
  • UC servers 114 provide registration, presence, and routing functionalities. Routing functionality enables the system to route calls for a user to any one of the client devices assigned to the user based on default and/or user-set policies. For example, if the user is not available through a regular phone, the call may be forwarded to the user's cellular phone, and if that is not answered, a number of voicemail options may be utilized. Since the end devices can handle additional communication modes, UC servers 114 may provide access to these additional communication modes (e.g. instant messaging, video communication, etc.) through access server 112 .
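  • The routing fallback described above (regular phone, then cellular phone, then voicemail) can be sketched as follows. This is only an illustration under stated assumptions, not the system's actual routing logic: the `Device` type, its `answers` flag, and the `route_call` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Device:
    """A client device assigned to a user (hypothetical model)."""
    name: str
    answers: bool  # whether the device picks up when rung


def route_call(devices_in_policy_order: List[Device]) -> str:
    """Try each device in policy order; fall back to voicemail."""
    for device in devices_in_policy_order:
        if device.answers:
            return device.name
    # No device answered, so the call is handed to voicemail.
    return "voicemail"
```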
  • Access server 112 resides in a perimeter network and enables connectivity through UC network(s) 110 with other users in one of the additional communication modes.
  • UC servers 114 may include servers that perform combinations of the above described functionalities or specialized servers that only provide a particular functionality. For example, home servers providing presence functionality, routing servers providing routing functionality, rights management servers, and so on. Similarly, access server 112 may provide multiple functionalities such as firewall protection and connectivity, or only specific functionalities.
  • Audio/Video (A/V) conferencing server 118 provides audio and/or video conferencing capabilities by facilitating those over an internal or external network.
  • Mediation server 116 mediates signaling and media to and from other types of networks such as a PSTN or a cellular network (e.g. calls through PBX 124 or from cellular phone 122 ).
  • Voicemail server 115 may manage voicemails for subscribers of the UC system performing tasks like storage and delivery of voicemails, transcription of audio files into textual data and generation of enhanced voicemail preview emails or instant messages according to some embodiments.
  • Mediation server 116 may also act as a Session Initiation Protocol (SIP) user agent.
  • users may have one or more identities, which are not necessarily limited to phone numbers.
  • the identity may take any form depending on the integrated networks, such as a telephone number, a Session Initiation Protocol (SIP) Uniform Resource Identifier (URI), or any other identifier. While any protocol may be used in a UC system, SIP is a preferred method.
  • SIP is an application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. It can be used to create two-party, multiparty, or multicast sessions that include Internet telephone calls, multimedia distribution, and multimedia conferences. SIP is designed to be independent of the underlying transport layer.
  • SIP clients may use Transmission Control Protocol (“TCP”) to connect to SIP servers and other SIP endpoints.
  • SIP is primarily used in setting up and tearing down voice or video calls. However, it can be used in any application where session initiation is a requirement. These include event subscription and notification, terminal mobility, and so on. Voice and/or video communications are typically carried over a separate protocol, typically Real-time Transport Protocol (“RTP”).
  • While FIG. 1 has been described with specific components such as mediation server, A/V server, and similar devices, embodiments are not limited to this system of the example components and configurations.
  • a service for enhanced voicemail preview may be implemented in other systems and configurations employing fewer or additional components.
  • Such systems do not have to be enhanced communication systems integrating various communication modes.
  • Embodiments may also be implemented for voicemails delivered in traditional communication systems such as PSTN or cellular networks using the principles described herein.
  • FIG. 2 is a conceptual diagram 200 illustrating a basic example system for providing enhanced voicemail preview. While a system according to embodiments is likely to include a number of servers, client devices, and services such as those illustratively discussed in FIG. 1 , only those relevant to embodiments are shown in FIG. 2 .
  • Embodiments provide technologies supporting the integration of voicemail more thoroughly and more richly into the user's information workflow. This integration into the user's most prevalent information processing (email, instant messaging, and similar forms) is aided by the context available in the user's data store and provides additional capabilities beyond the simple reading of the message such as audio navigation, contact generation, search over voicemail, instant message behavior, and comparable ones.
  • a textual preview of the voicemail is generated by means of automatic speech recognition and delivered to the recipient through email, instant message, or similar messaging technology.
  • a speech recognizer is integrated directly with the voicemail and messaging systems, according to one embodiment. Due to this deep integration, the speech recognizer is able to leverage significant contextual information about the caller and callee to improve the recognition accuracy (fidelity). This includes, but is not limited to, the names of the parties, their respective contact lists, their organizational structures, previous communications between the parties, communications relevant to the parties, etc. As mentioned above, however, embodiments are not limited to data generated by automatic speech recognition. Actionable items and contextual information may also be provided employing human transcribed data, combination of human transcribed and machine generated data, and comparable information that is marked up in a schema decipherable by client applications.
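  • One way contextual information could raise recognition accuracy is by rescoring candidate transcriptions. The sketch below assumes the recognizer returns several scored hypotheses and simply rewards those that contain words found in the parties' context (contact names, for instance). The function name `rescore_with_context`, the scoring scale, and the `bonus` value are all assumptions made for illustration; the source does not specify how the recognizer uses context.

```python
from typing import Iterable, List, Tuple


def rescore_with_context(
    hypotheses: List[Tuple[str, float]],
    context_terms: Iterable[str],
    bonus: float = 0.1,
) -> str:
    """Return the hypothesis with the best context-adjusted score.

    hypotheses: (text, recognizer_score) pairs; higher scores are better.
    context_terms: names or words drawn from contact lists, email history, etc.
    bonus: score added per distinct context term found (arbitrary value).
    """
    terms = {t.lower() for t in context_terms}

    def adjusted(item: Tuple[str, float]) -> float:
        text, score = item
        words = set(text.lower().split())
        return score + bonus * len(words & terms)

    return max(hypotheses, key=adjusted)[0]
```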
  • voicemail server 234 in the example system receives the voicemail in audio form and further receives contextual information from sources like presence server 236 , email server 238 , and so on.
  • presence information may include location associated with the caller and words in the voicemail may be recognized with higher accuracy based on the knowledge of where the caller is.
  • Voicemail server 234 may process the text from the speech recognizer or another transcription source such that key information (e.g. names or key points) can be identified, specially rendered, and made actionable as appropriate for the user's benefit.
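  • Identifying key information in the transcript might look like the following sketch, which finds phone numbers with a simple pattern and names by matching against the user's contact list. The `Span` type, the `find_actionable_items` function, and the phone-number pattern are hypothetical and deliberately simplistic; the source does not describe the actual identification method.

```python
import re
from typing import Iterable, List, NamedTuple


class Span(NamedTuple):
    start: int   # character offset where the item begins
    end: int     # character offset just past the item
    kind: str    # "name" or "phone"
    text: str


# Simple pattern for numbers such as 555-0100 or 425-555-0100 (an assumption).
PHONE_RE = re.compile(r"\b(?:\d{3}-)?\d{3}-\d{4}\b")


def find_actionable_items(transcript: str, contact_names: Iterable[str]) -> List[Span]:
    """Locate phone numbers and known contact names in a transcript."""
    spans = [
        Span(m.start(), m.end(), "phone", m.group())
        for m in PHONE_RE.finditer(transcript)
    ]
    for name in contact_names:
        pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        spans.extend(
            Span(m.start(), m.end(), "name", m.group())
            for m in pattern.finditer(transcript)
        )
    # Sort by position so a renderer can walk the text left to right.
    return sorted(spans)
```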
  • a rendering sub-system may use further contextual information along with the conceptual highlighting provided by the speech sub-system to provide a visual representation of the voicemail for the user.
  • the rendered voicemail preview 240 may be provided to the user's ( 244 ) messaging application executed on computing device 242 to be presented by a messaging user interface.
  • client applications instead of voicemail server
  • An example implementation of a messaging application for delivering enhanced voicemail preview is email.
  • An email message may deliver the rendered voicemail preview along with an audio file of the voicemail, so that the recipient can switch back and forth between the textual data and the audio file, search easily for portions of the audio file, and request or perform actions based on the information in the voicemail.
  • While email and instant messaging are frequently referred to as example services into which a voicemail preview may be integrated, embodiments are not limited to those.
  • Other messaging systems such as SMS, RSS feeds, and comparable ones may also be employed in providing an enhanced voicemail preview experience to users.
  • FIG. 3 illustrates major components in providing enhanced voicemail preview according to embodiments in diagram 300 .
  • contextual information such as presence information 352 , contact/address book information 354 (associated with the caller and/or callee), email history information 356 , and similar data is used at various stages of generating enhanced voicemail preview.
  • the audio version of the voicemail 358 may be transcribed into a rich text form 360 (with actionable terms, highlights, and other features) using the contextual information to improve accuracy of transcription and to add the features.
  • the rich text forms 360 of the voicemail may then be further processed ( 362 ) with additional features again using the contextual information.
  • the information used at different stages may be distinct. For example, caller associated information may be used in one stage, while callee associated information may be employed for the other stage.
  • the enhanced voicemail preview may be integrated into an email message along with the audio version of the voicemail ( 358 ) and presented to the subscriber through an email user interface 364 .
  • the audio version of the voicemail may be attached to the email message or a link to its location may be provided.
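  • Packaging the preview with either an attached audio file or a link can be sketched with Python's standard `email` package. The function name `build_preview_email`, the subject line, the WAV format, and the attachment filename are assumptions for illustration only.

```python
from email.message import EmailMessage
from typing import Optional


def build_preview_email(
    sender: str,
    recipient: str,
    preview_text: str,
    audio: Optional[bytes] = None,
    audio_url: Optional[str] = None,
) -> EmailMessage:
    """Build an email carrying the preview plus the audio or a link to it."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voicemail preview"
    body = preview_text
    if audio is None and audio_url is not None:
        # No attachment: point the recipient at the stored audio instead.
        body += "\n\nListen to the voicemail: " + audio_url
    msg.set_content(body)
    if audio is not None:
        # Attach the recording itself (assumed here to be WAV data).
        msg.add_attachment(
            audio, maintype="audio", subtype="wav", filename="voicemail.wav"
        )
    return msg
```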
  • voicemail is integrated into email workflow along with presence information. Key portions of the voicemail are made actionable, and advanced navigation of the voicemail audio is provided allowing the subscriber to “jump” to any location in the voicemail audio file by clicking on the appropriate text (i.e. playback jump or audio repositioning). Thus, the voicemail's text representation is presented such that the subscriber can more easily find key information without distraction from the typically present speech recognition errors.
  • FIG. 4 illustrates a screenshot of an example user interface for providing enhanced voicemail preview.
  • the elements and configuration of the user interface on the screenshot are for illustration purposes only and do not constitute a limitation on embodiments.
  • a messaging application capable of presenting enhanced voicemail previews to users may employ any user interface with other elements and configurations.
  • Example user interface includes standard graphic and textual elements 472 such as commands, options, and other items.
  • the voicemail rendering provides information to the email user interface, which assists in navigation and playback of the voicemail audio. For example, clicking on a word in the textual voicemail rendering 476 may start playing the voicemail from the position in which that word was spoken. Highlighting a set of words may play just that segment of the voicemail audio.
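  • Click-to-jump and play-selection behavior can be sketched if each transcribed word is assumed to carry the time at which it was spoken. The `TimedWord` type and both functions are hypothetical; the source does not say how word positions are stored.

```python
from typing import List, NamedTuple, Tuple


class TimedWord(NamedTuple):
    text: str
    start: float  # seconds from the beginning of the audio
    end: float


def jump_position(words: List[TimedWord], clicked_index: int) -> float:
    """Playback position for a clicked word: where that word was spoken."""
    return words[clicked_index].start


def segment_for_selection(
    words: List[TimedWord], first_index: int, last_index: int
) -> Tuple[float, float]:
    """Audio segment (start, end) covering a highlighted run of words."""
    return words[first_index].start, words[last_index].end
```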
  • a standard audio playback user interface element 474 may also be provided.
  • Portions of the voicemail rendering are made actionable in the email user interface by integrating with presence information available to the email system. For example, right-clicking on a name brings up a list of actions that a user may want to carry out relative to the name such as adding that name to the user's contact list, contacting the named person via an instant message, placing a voice call to the named person, and similar actions.
  • the actions may be presented in a pop-up menu 480 or similar user interface elements (hovering display element, drop-down menu, and the like).
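  • A menu of actions for a recognized name could be assembled from contact and presence data roughly as follows. The function `actions_for_name`, the action labels, and the rule that instant messaging is offered only when presence shows the person online are assumptions, not details from the source.

```python
from typing import List, Set


def actions_for_name(name: str, contacts: Set[str], online: Set[str]) -> List[str]:
    """List menu actions for a name found in a voicemail preview."""
    actions = []
    if name not in contacts:
        actions.append("Add to contacts")
    if name in online:
        # Presence says the person is reachable by instant message.
        actions.append("Send instant message")
    actions.append("Call")
    return actions
```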
  • the rendered textual voicemail 476 may include graphic or color scheme based elements to provide additional information on syntax and actionable words. Furthermore, the information available in the rendered voicemail may be added to the search index of the email server. Thus, voicemails are made searchable both in the text that is rendered and, potentially, in the metadata underlying the text.
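  • Making voicemails searchable over both the rendered text and the underlying metadata can be illustrated with a small inverted index. The `VoicemailIndex` class is a hypothetical stand-in for an email server's search index, which the source does not describe.

```python
import re
from collections import defaultdict
from typing import Dict, Set


class VoicemailIndex:
    """Tiny in-memory inverted index over preview text and metadata."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, voicemail_id: str, preview_text: str, metadata: Dict[str, str]) -> None:
        # Index the rendered text and the metadata values together.
        combined = " ".join([preview_text, *metadata.values()])
        for token in re.findall(r"\w+", combined.lower()):
            self._postings[token].add(voicemail_id)

    def search(self, query: str) -> Set[str]:
        """Return ids of voicemails containing every query term."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return set()
        results = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            results &= self._postings.get(token, set())
        return results
```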
  • the previewed data is also searchable via desktop search systems, mobile device systems, and so on, not just servers.
  • the email carrying the voicemail preview may include the names of the caller and callee, as well as the caller's detailed presence information to enable the callee to take proper action based on the information in the email.
  • a user interface for a messaging application capable of presenting voicemail preview may include additional or fewer textual and graphical elements, and may employ various graphical, color, and other configuration schemes to display different functionalities and associated processes.
  • FIG. 5 is an example networked environment, where embodiments may be implemented.
  • a platform providing communication services with enhanced voicemail preview may be implemented via software executed over one or more servers 518 such as a hosted service.
  • the platform may communicate with client applications on individual computing devices such as a cellular phone 513 , a laptop computer 512 , and desktop computer 511 (client devices) through network(s) 510 .
  • voicemail may be delivered from a number of sources including traditional phone systems, enhanced communication systems, and so on, within the network(s) 510 or through external network(s) 520 to a voicemail management application/server.
  • the voicemail management application/server may receive additional information including, but not limited to, presence information, contact information, as well as voicemail metadata. This contextual information may be utilized in transcription of the voicemail into textual data and generation of advanced capabilities such as actionable data to be presented in a textual communication to a subscriber receiving the voicemail.
  • Client devices 511 - 513 are used to facilitate communications through a variety of modes between subscribers of the communication system.
  • Information associated with subscribers and facilitating enhanced voicemail preview may be stored in one or more data stores (e.g. data store 516 ), which may be managed by any one of the servers 518 or by database server 514 .
  • Network(s) 510 may comprise any topology of servers, clients, Internet service providers, and communication media.
  • a system according to embodiments may have a static or dynamic topology.
  • Network(s) 510 may include a secure network such as an enterprise network, an unsecure network such as a wireless open network, or the Internet.
  • Network(s) 510 may also coordinate communication over other networks such as PSTN or cellular networks (e.g. external network(s) 520 ).
  • Network(s) 510 provides communication between the nodes described herein.
  • network(s) 510 may include wireless media such as acoustic, RF, infrared and other wireless media.
  • FIG. 6 and the associated discussion are intended to provide a brief, general description of a suitable computing environment in which embodiments may be implemented.
  • a block diagram of an example computing operating environment for an application according to embodiments is illustrated, such as computing device 600 .
  • computing device 600 may be a voicemail management server as part of a communication system and include at least one processing unit 602 and system memory 604 .
  • Computing device 600 may also include a plurality of processing units that cooperate in executing programs.
  • the system memory 604 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two.
  • System memory 604 typically includes an operating system 605 suitable for controlling the operation of the platform, such as the WINDOWS® operating systems from MICROSOFT CORPORATION of Redmond, Wash.
  • the system memory 604 may also include one or more software applications such as program modules 606 , voicemail application 622 , and transcription module 624 .
  • Voicemail application 622 may be part of a service that facilitates communication through various modalities between client applications, servers, and other devices (e.g. voice, email, instant messaging, etc.). Transcription module 624 may transcribe audio voicemail files into textual data using contextual information as discussed previously. Transcription module 624 and voicemail application 622 may be separate applications or integral modules of a hosted service that provides enhanced communication services to client applications/devices such as voicemail preview with actionable information through email or instant messaging. This basic configuration is illustrated in FIG. 6 by those components within dashed line 608 .
  • Computing device 600 may have additional features or functionality.
  • the computing device 600 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape.
  • additional storage is illustrated in FIG. 6 by removable storage 609 and non-removable storage 610 .
  • Computer readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data.
  • System memory 604 , removable storage 609 and non-removable storage 610 are all examples of computer readable storage media.
  • Computer readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 600 . Any such computer readable storage media may be part of computing device 600 .
  • Computing device 600 may also have input device(s) 612 such as keyboard, mouse, pen, voice input device, touch input device, and comparable input devices.
  • The operations included in process 700 are for illustration purposes. Enhanced voicemail preview via email or instant messaging may be implemented by similar processes with fewer or additional steps, as well as in a different order of operations, using the principles described herein.

Abstract

A textual preview of a voicemail is generated and provided through email or similar media to users along with the audio version. Transcription to the textual version, as well as additional capabilities such as actionable terms, playback-jump, switching between text and audio versions, direct or metadata-based searchability, and enhanced response capabilities, are provided based on contextual data obtained from voicemail metadata and user-associated data stores such as a contact list and email history.

Description

    BACKGROUND
  • It is becoming common for users to have access to their voicemail in their email or instant messaging inbox. Primarily, this comes in the form of an audio attachment to an email (or instant message) where the audio contains the contents of the voicemail. The audio can be played back when desired. Processing of voicemail is often marked as a discontinuity in the information worker's typical communications workflow that is dominated by email and similar technologies.
  • Other approaches produce (either automatically or by use of humans) a transcription of the voicemail into the user's inbox. These services are typically not integrated into the user's normal data flow. Their primary value is derived from reading the transcription itself and, thus, a near-perfect transcription is important.
  • SUMMARY
  • This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to exclusively identify key features or essential features of the claimed subject matter, nor is it intended as an aid in determining the scope of the claimed subject matter.
  • Embodiments are directed to enabling voicemail preview in a combination of text and audio formats taking advantage of information automatically extracted from data sources associated with a user and voicemail metadata. In addition to back-and-forth switching capability between text and audio versions of the voicemail with playback-jump, elements of and interactions with the voicemail preview extend a value of the voicemail beyond simple reading of the text. According to some embodiments, information in the voicemail is surfaced and made actionable using contextual data.
  • These and other features and advantages will be apparent from a reading of the following detailed description and a review of the associated drawings. It is to be understood that both the foregoing general description and the following detailed description are explanatory and do not restrict aspects as claimed.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram illustrating an example unified communications system, where embodiments may be implemented for enhanced voicemail usage through automatic voicemail preview;
  • FIG. 2 is a conceptual diagram illustrating a basic example system for providing enhanced voicemail preview;
  • FIG. 3 illustrates major components in providing enhanced voicemail preview according to embodiments;
  • FIG. 4 illustrates a screenshot of an example user interface for providing enhanced voicemail preview;
  • FIG. 5 is a networked environment, where a system according to embodiments may be implemented;
  • FIG. 6 is a block diagram of an example computing operating environment, where embodiments may be implemented; and
  • FIG. 7 illustrates a logic flow diagram for providing enhanced voicemail preview according to embodiments.
  • DETAILED DESCRIPTION
  • As briefly described above, a textual preview of a voicemail may be generated and provided through email or similar media to users along with the audio version. Transcription to a textual version, as well as additional capabilities such as actionable information, are provided based on contextual data obtained from user-associated data stores and voicemail metadata. In the following detailed description, references are made to the accompanying drawings that form a part hereof, and in which are shown, by way of illustration, specific embodiments or examples. These aspects may be combined, other aspects may be utilized, and structural changes may be made without departing from the spirit or scope of the present disclosure. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims and their equivalents.
  • While the embodiments will be described in the general context of program modules that execute in conjunction with an application program that runs on an operating system on a personal computer, those skilled in the art will recognize that aspects may also be implemented in combination with other program modules.
  • Generally, program modules include routines, programs, components, data structures, and other types of structures that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that embodiments may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and comparable computing devices. Embodiments may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
  • Embodiments may be implemented as a computer-implemented process (method), a computing system, or as an article of manufacture, such as a computer program product or computer readable media. The computer program product may be a computer storage medium readable by a computer system and encoding a computer program that comprises instructions for causing a computer or computing system to perform example process(es). The computer-readable storage medium can for example be implemented via one or more of a volatile computer memory, a non-volatile memory, a hard drive, a flash drive, a floppy disk, or a compact disk, and comparable media. The computer program product may also be a propagated signal on a carrier (e.g. a frequency or phase modulated signal) or medium readable by a computing system and encoding a computer program of instructions for executing a computer process.
  • Throughout this specification, the term “platform” may be a combination of software and hardware components for managing communication applications utilized for voicemail preview delivery. Examples of platforms include, but are not limited to, a hosted service executed over a plurality of servers, an application executed on a single server, and comparable systems. The term “server” generally refers to a computing device executing one or more software programs typically in a networked environment. However, a server may also be implemented as a virtual server (software programs) executed on one or more computing devices viewed as a server on the network. More detail on these technologies and example operations is provided below.
  • Moreover, the term “Voicemail preview” as used herein refers to textual data derived from a voicemail marked up with actionable items and integrated into a productivity service such as an email application, an instant message application and similar communication applications. The textual data may include machine transcriptions (e.g. automatic voice recognition), human transcriptions, and comparable forms that may reflect the complete voicemail or a summary of the same. According to some embodiments, the data derived from voicemail may also include graphical features.
  • Referring to FIG. 1, a diagram 100 of an example unified communications system, where embodiments may be implemented for enhanced voicemail usage through automatic voicemail preview, is illustrated. A unified communication system is an example of modern communication systems with a wide range of capabilities and services that can be provided to subscribers such as enhanced voicemail preview. A unified communication system is a real-time communications system facilitating instant messaging, presence, audio-video conferencing, web conferencing functionality, and comparable capabilities.
  • In a unified communication (“UC”) system such as the one shown in diagram 100, users may communicate via a variety of end devices (102, 104), which are client devices of the UC system. Each client device may be capable of executing one or more communication applications for voice communication, video communication, instant messaging, application sharing, data sharing, and the like. In addition to their advanced functionality, the end devices may also facilitate traditional phone calls through an external connection such as through PBX 124 to a Public Switched Telephone Network (“PSTN”). End devices may include any type of smart phone, cellular phone, any computing device executing a communication application, a smart automobile console, and advanced phone devices with additional functionality.
  • UC Network(s) 110 includes a number of servers performing different tasks. For example, UC servers 114 provide registration, presence, and routing functionalities. Routing functionality enables the system to route calls for a user to any one of the client devices assigned to the user based on default and/or user-set policies. For example, if the user is not available through a regular phone, the call may be forwarded to the user's cellular phone, and if that is not answered, a number of voicemail options may be utilized. Since the end devices can handle additional communication modes, UC servers 114 may provide access to these additional communication modes (e.g. instant messaging, video communication, etc.) through access server 112. Access server 112 resides in a perimeter network and enables connectivity through UC network(s) 110 with other users in one of the additional communication modes. UC servers 114 may include servers that perform combinations of the above described functionalities or specialized servers that only provide a particular functionality, for example, home servers providing presence functionality, routing servers providing routing functionality, rights management servers, and so on. Similarly, access server 112 may provide multiple functionalities such as firewall protection and connectivity, or only specific functionalities.
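  • By way of illustration only, and not limitation, the routing fallback described above may be sketched as follows. The `Endpoint` type, its `answers` flag, and the `route_call` function are hypothetical names introduced for this sketch; an actual UC server would apply default and user-set policies that are not detailed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Endpoint:
    """Hypothetical client device assigned to a user."""
    name: str      # e.g. "regular phone" or "cellular phone"
    answers: bool  # assumption: stands in for "the call was picked up"


def route_call(endpoints: List[Endpoint]) -> str:
    """Try each endpoint in policy order; fall back to voicemail."""
    for endpoint in endpoints:
        if endpoint.answers:
            return endpoint.name
    # No device answered, so a voicemail option is used.
    return "voicemail"


# Policy order: regular phone first, then cellular phone.
policy = [Endpoint("regular phone", False), Endpoint("cellular phone", False)]
destination = route_call(policy)
```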
  • Audio/Video (A/V) conferencing server 118 provides audio and/or video conferencing capabilities by facilitating those over an internal or external network. Mediation server 116 mediates signaling and media to and from other types of networks such as a PSTN or a cellular network (e.g. calls through PBX 124 or from cellular phone 122). Voicemail server 115 may manage voicemails for subscribers of the UC system performing tasks like storage and delivery of voicemails, transcription of audio files into textual data and generation of enhanced voicemail preview emails or instant messages according to some embodiments. Mediation server 116 may also act as a Session Initiation Protocol (SIP) user agent.
  • In a UC system, users may have one or more identities, which are not necessarily limited to a phone number. The identity may take any form depending on the integrated networks, such as a telephone number, a Session Initiation Protocol (SIP) Uniform Resource Identifier (URI), or any other identifier. While any protocol may be used in a UC system, SIP is a preferred method.
  • SIP is an application-layer control (signaling) protocol for creating, modifying, and terminating sessions with one or more participants. It can be used to create two-party, multiparty, or multicast sessions that include Internet telephone calls, multimedia distribution, and multimedia conferences. SIP is designed to be independent of the underlying transport layer.
  • SIP clients may use Transmission Control Protocol (“TCP”) to connect to SIP servers and other SIP endpoints. SIP is primarily used in setting up and tearing down voice or video calls. However, it can be used in any application where session initiation is a requirement. These include event subscription and notification, terminal mobility, and so on. Voice and/or video communications are typically done over separate session protocols, such as Real-time Transport Protocol (“RTP”).
  • While the example system in FIG. 1 has been described with specific components such as mediation server, A/V server, and similar devices, embodiments are not limited to this system of the example components and configurations. A service for enhanced voicemail preview may be implemented in other systems and configurations employing fewer or additional components. Furthermore, such systems do not have to be enhanced communication systems integrating various communication modes. Embodiments may also be implemented for voicemails delivered in traditional communication systems such as PSTN or cellular networks using the principles described herein.
  • FIG. 2 is a conceptual diagram 200 illustrating a basic example system for providing enhanced voicemail preview. While a system according to embodiments is likely to include a number of servers, client devices, and services such as those illustratively discussed in FIG. 1, only those relevant to embodiments are shown in FIG. 2.
  • As discussed previously, there are many situations where listening to voicemail, even in the form of an attached recording, is not possible or is inconvenient. A text form of the voicemail is beneficial in many instances, but available solutions have the previously mentioned shortcomings inherent in transcriptions. Embodiments provide technologies supporting the integration of voicemail more thoroughly and more richly into the user's information workflow. This integration into the user's most prevalent information processing (email, instant messaging, and similar forms) is aided by the context available in the user's data store and provides additional capabilities beyond the simple reading of the message such as audio navigation, contact generation, search over voicemail, instant message behavior, and comparable ones.
  • According to some embodiments, a textual preview of the voicemail is generated by means of automatic speech recognition and delivered to the recipient through email, instant message, or similar messaging technology. A speech recognizer is integrated directly with the voicemail and messaging systems, according to one embodiment. Due to this deep integration, the speech recognizer is able to leverage significant contextual information about the caller and callee to improve the recognition accuracy (fidelity). This includes, but is not limited to, the names of the parties, their respective contact lists, their organizational structures, previous communications between the parties, communications relevant to the parties, etc. As mentioned above, however, embodiments are not limited to data generated by automatic speech recognition. Actionable items and contextual information may also be provided employing human transcribed data, a combination of human transcribed and machine generated data, and comparable information that is marked up in a schema decipherable by client applications.
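  • As a minimal sketch of how contextual information might improve fidelity, the example below post-corrects recognizer output against names drawn from a contact list. This is an assumption made for illustration: the disclosure does not specify the mechanism, and a speech recognizer may instead use context during decoding. The function name `correct_with_context` and the similarity cutoff are hypothetical; the matching uses the standard `difflib` module.

```python
import difflib
from typing import Iterable, List


def correct_with_context(tokens: List[str],
                         known_names: Iterable[str],
                         cutoff: float = 0.75) -> List[str]:
    """Replace tokens that closely resemble a known contact name.

    Assumption: tokens come from some transcription source and
    known_names come from the caller's or callee's contact list.
    """
    by_lower = {name.lower(): name for name in known_names}
    candidates = list(by_lower)
    corrected = []
    for token in tokens:
        # get_close_matches returns the best candidates above the cutoff.
        match = difflib.get_close_matches(token.lower(), candidates,
                                          n=1, cutoff=cutoff)
        corrected.append(by_lower[match[0]] if match else token)
    return corrected


recognized = ["call", "keath", "about", "the", "budget"]
fixed = correct_with_context(recognized, ["Keith", "Jon"])
```

  • A cutoff that is too low would rewrite ordinary words into names, so any practical use of such a step would need tuning against real transcriptions.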
  • Thus, voicemail server 234 in the example system receives the voicemail in audio form and further receives contextual information from sources like presence server 236, email server 238, and so on. For example, presence information may include a location associated with the caller, and words in the voicemail may be recognized with higher accuracy based on the knowledge of where the caller is. Voicemail server 234 may process the text from the speech recognizer or another transcription source such that key information (e.g. names or key points) can be identified, specially rendered, and made actionable as appropriate for the user's benefit. A rendering sub-system may use further contextual information along with the conceptual highlighting provided by the speech sub-system to provide a visual representation of the voicemail for the user. The rendered voicemail preview 240 may be provided to the user's (244) messaging application executed on computing device 242 to be presented by a messaging user interface. According to other embodiments, client applications (instead of the voicemail server) may integrate various information sources such as presence into the voicemail preview message.
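  • The identification of key information may be sketched, under stated assumptions, as a scan of the transcribed text for contact names and phone numbers. The `ActionableSpan` type, the `find_actionable` function, and the simple phone-number pattern are hypothetical and serve only to illustrate marking portions of the text for special rendering.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ActionableSpan:
    start: int  # character offset where the span begins
    end: int    # character offset just past the span
    kind: str   # "name" or "phone"
    text: str


# Assumption: a simplistic NNN-NNN-NNNN pattern, for illustration only.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def find_actionable(text: str, contact_names: Set[str]) -> List[ActionableSpan]:
    """Locate contact names and phone numbers in transcribed text."""
    spans = []
    for m in PHONE_RE.finditer(text):
        spans.append(ActionableSpan(m.start(), m.end(), "phone", m.group()))
    for name in contact_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text):
            spans.append(ActionableSpan(m.start(), m.end(), "name", m.group()))
    # Return spans in reading order so a renderer can walk the text once.
    return sorted(spans, key=lambda s: s.start)


transcript = "Hi, this is Keith. Call me back at 555-010-0199."
spans = find_actionable(transcript, {"Keith"})
```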
  • An example implementation of a messaging application for delivering enhanced voicemail preview is email. An email message may deliver the rendered voicemail preview along with an audio file of the voicemail such that switching back and forth between the textual data and the audio file is enabled, along with search capability allowing the recipient to search easily for portions of the audio file, in addition to the capability to request/perform actions based on the information in the voicemail. While email and instant messaging are frequently referred to as example services into which a voicemail preview may be integrated, embodiments are not limited to those. Other messaging systems such as SMS, RSS feeds, and comparable ones may also be employed in providing an enhanced voicemail preview experience to users.
  • FIG. 3 illustrates major components in providing enhanced voicemail preview according to embodiments in diagram 300. As discussed above, contextual information such as presence information 352, contact/address book information 354 (associated with the caller and/or callee), email history information 356, and similar data is used at various stages of generating enhanced voicemail preview.
  • For example, the audio version of the voicemail 358 may be transcribed into a rich text form 360 (with actionable terms, highlights, and other features) using the contextual information to improve accuracy of transcription and to add the features. The rich text form 360 of the voicemail may then be further processed (362) with additional features, again using the contextual information. The information used at different stages may be distinct. For example, caller associated information may be used in one stage, while callee associated information may be employed for the other stage.
  • The end product of processing 362, the enhanced voicemail preview, may be integrated into an email message along with the audio version of the voicemail (358) and presented to the subscriber through an email user interface 364. The audio version of the voicemail may be attached to the email message or a link to its location may be provided.
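  • One possible way to package a preview together with its audio, offered as a sketch rather than as the described implementation, uses the `email.message.EmailMessage` class from the Python standard library. The function name `build_preview_email`, the subject line, the addresses, and the WAV audio type are assumptions made for this example.

```python
from email.message import EmailMessage


def build_preview_email(sender: str, recipient: str, preview_text: str,
                        audio_bytes: bytes,
                        audio_filename: str = "voicemail.wav") -> EmailMessage:
    """Build an email whose body is the preview and whose attachment is the audio."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voicemail preview"  # hypothetical subject line
    # The textual preview becomes the plain-text body.
    msg.set_content(preview_text)
    # Assumption: the audio is WAV; a link to the file could be used instead.
    msg.add_attachment(audio_bytes, maintype="audio", subtype="wav",
                       filename=audio_filename)
    return msg


message = build_preview_email("voicemail@example.com", "subscriber@example.com",
                              "Hi, this is Keith.", b"RIFF")
```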
  • Through these major components and their interactions, voicemail is integrated into email workflow along with presence information. Key portions of the voicemail are made actionable, and advanced navigation of the voicemail audio is provided allowing the subscriber to “jump” to any location in the voicemail audio file by clicking on the appropriate text (i.e. playback jump or audio repositioning). Thus, the voicemail's text representation is presented such that the subscriber can more easily find key information without distraction from the typically present speech recognition errors.
  • The above discussed scenarios, example systems, applications, and configurations are for illustration purposes. Embodiments are not restricted to those examples. Other forms of transcription, configuration, communication modes, and scenarios may be used in implementing enhanced voicemail preview in a similar manner using the principles described herein.
  • FIG. 4 illustrates a screenshot of an example user interface for providing enhanced voicemail preview. The elements and configuration of the user interface on the screenshot are for illustration purposes only and do not constitute a limitation on embodiments. A messaging application capable of presenting enhanced voicemail previews to users may employ any user interface with other elements and configurations. Example user interface includes standard graphic and textual elements 472 such as commands, options, and other items.
  • The voicemail rendering provides information to the email user interface, which assists in navigation and playback of the voicemail audio. For example, clicking on a word in the textual voicemail rendering 476 may start playing the voicemail from the position in which that word was spoken. Highlighting a set of words may play just that segment of the voicemail audio. In addition to the text-audio connection, a standard audio playback user interface element 474 may also be provided.
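  • The text-audio connection described above may be sketched by assuming that each transcribed word carries the start and end time at which it was spoken. The `TimedWord` type and the two helper functions are hypothetical; the disclosure does not state how word positions are stored.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TimedWord:
    text: str
    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio


def jump_position(words: List[TimedWord], index: int) -> float:
    """Playback jump: audio position for a clicked word."""
    return words[index].start


def segment_for_selection(words: List[TimedWord],
                          first: int, last: int) -> Tuple[float, float]:
    """Audio segment covering a highlighted run of words."""
    if first > last:
        # Tolerate a selection made from right to left.
        first, last = last, first
    return words[first].start, words[last].end


words = [
    TimedWord("call", 0.0, 0.4),
    TimedWord("me", 0.4, 0.6),
    TimedWord("back", 0.6, 1.1),
    TimedWord("tomorrow", 1.3, 2.0),
]
```

  • A player would seek to the value returned by `jump_position` when a word is clicked, and would play only the returned interval when a set of words is highlighted.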
  • Portions of the voicemail rendering are made actionable in the email user interface by integrating with presence information available to the email system. For example, right-clicking on a name brings up a list of actions that a user may want to carry out relative to the name such as adding that name to the user's contact list, contacting the named person via an instant message, placing a voice call to the named person, and similar actions. The actions may be presented in a pop-up menu 480 or similar user interface elements (hovering display element, drop-down menu, and the like).
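  • By way of example only, the list of actions offered for a name might be assembled as in the sketch below. Offering instant messaging only when the person's presence is "available" is an assumption of this sketch, as are the function name `actions_for_name` and the action labels.

```python
from typing import Dict, List, Set


def actions_for_name(name: str, contacts: Set[str],
                     presence: Dict[str, str]) -> List[str]:
    """Build the menu entries shown when a name is right-clicked."""
    actions = []
    if name not in contacts:
        actions.append("Add to contacts")
    # Assumption: presence maps a name to a status string.
    if presence.get(name, "unknown") == "available":
        actions.append("Send instant message")
    actions.append("Place voice call")
    return actions


menu = actions_for_name("Keith", contacts=set(), presence={"Keith": "available"})
```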
  • The rendered textual voicemail 476 may include graphic or color scheme based elements to provide additional information on syntax and actionable words. Furthermore, the information available in the rendered voicemail may be added to the search index of the email server. Thus, voicemails are made searchable both in the text that is rendered and, potentially, in the metadata underlying the text. The previewed data is also searchable via desktop search systems, mobile device systems, and so on, not just servers. The email carrying the voicemail preview may include the names of the caller and callee, as well as the caller's detailed presence information to enable the callee to take proper action based on the information in the email.
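  • The searchability described above may be illustrated with a small inverted index over the rendered text. This is a sketch under the assumption that each voicemail has an identifier and a plain-text preview; the names `build_index` and `search` are hypothetical, and a real email server would use its own indexing service.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(previews: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each term to the set of voicemail identifiers containing it."""
    index = defaultdict(set)
    for voicemail_id, text in previews.items():
        for term in tokenize(text):
            index[term].add(voicemail_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return voicemails containing every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


previews = {
    "vm1": "Call Keith about the budget",
    "vm2": "Budget meeting moved to Friday",
}
index = build_index(previews)
```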
  • A user interface for a messaging application capable of presenting voicemail preview may include additional or fewer textual and graphical elements, and may employ various graphical, color, and other configuration schemes to display different functionalities and associated processes.
  • FIG. 5 is an example networked environment, where embodiments may be implemented. A platform providing communication services with enhanced voicemail preview may be implemented via software executed over one or more servers 518 such as a hosted service. The platform may communicate with client applications on individual computing devices such as a cellular phone 513, a laptop computer 512, and desktop computer 511 (client devices) through network(s) 510.
  • As discussed above, voicemail may be delivered from a number of sources including traditional phone systems, enhanced communication systems, and so on, within the network(s) 510 or through external network(s) 520 to a voicemail management application/server. The voicemail management application/server may receive additional information including, but not limited to, presence information, contact information, as well as voicemail metadata. This contextual information may be utilized in transcription of the voicemail into textual data and generation of advanced capabilities such as actionable data to be presented in a textual communication to a subscriber receiving the voicemail.
  • Client devices 511-513 are used to facilitate communications through a variety of modes between subscribers of the communication system. Information associated with subscribers and facilitating enhanced voicemail preview may be stored in one or more data stores (e.g. data store 516), which may be managed by any one of the servers 518 or by database server 514.
  • Network(s) 510 may comprise any topology of servers, clients, Internet service providers, and communication media. A system according to embodiments may have a static or dynamic topology. Network(s) 510 may include a secure network such as an enterprise network, an unsecure network such as a wireless open network, or the Internet. Network(s) 510 may also coordinate communication over other networks such as PSTN or cellular networks (e.g. external network(s) 520). Network(s) 510 provides communication between the nodes described herein. By way of example, and not limitation, network(s) 510 may include wireless media such as acoustic, RF, infrared and other wireless media.
  • Many other configurations of computing devices, applications, data sources, and data distribution systems may be employed to implement a voicemail preview system with advanced capabilities. Furthermore, the networked environments discussed in FIG. 5 are for illustration purposes only. Embodiments are not limited to the example applications, modules, or processes.
  • FIG. 6 and the associated discussion are intended to provide a brief, general description of a suitable computing environment in which embodiments may be implemented. With reference to FIG. 6, a block diagram of an example computing operating environment for an application according to embodiments is illustrated, such as computing device 600. In a basic configuration, computing device 600 may be a voicemail management server as part of a communication system and include at least one processing unit 602 and system memory 604. Computing device 600 may also include a plurality of processing units that cooperate in executing programs. Depending on the exact configuration and type of computing device, the system memory 604 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two. System memory 604 typically includes an operating system 605 suitable for controlling the operation of the platform, such as the WINDOWS® operating systems from MICROSOFT CORPORATION of Redmond, Wash. The system memory 604 may also include one or more software applications such as program modules 606, voicemail application 622, and transcription module 624.
  • Voicemail application 622 may be part of a service that facilitates communication through various modalities between client applications, servers, and other devices (e.g. voice, email, instant messaging, etc.). Transcription module 624 may transcribe audio voicemail files into textual data using contextual information as discussed previously. Transcription module 624 and voicemail application 622 may be separate applications or integral modules of a hosted service that provides enhanced communication services to client applications/devices such as voicemail preview with actionable information through email or instant messaging. This basic configuration is illustrated in FIG. 6 by those components within dashed line 608.
  • Computing device 600 may have additional features or functionality. For example, the computing device 600 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in FIG. 6 by removable storage 609 and non-removable storage 610. Computer readable storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, program modules, or other data. System memory 604, removable storage 609 and non-removable storage 610 are all examples of computer readable storage media. Computer readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 600. Any such computer readable storage media may be part of computing device 600. Computing device 600 may also have input device(s) 612 such as keyboard, mouse, pen, voice input device, touch input device, and comparable input devices. Output device(s) 614 such as a display, speakers, printer, and other types of output devices may also be included. These devices are well known in the art and need not be discussed at length here.
  • Computing device 600 may also contain communication connections 616 that allow the device to communicate with other devices 618, such as over a wireless network in a distributed computing environment, a satellite link, a cellular link, and comparable mechanisms. Other devices 618 may include computer device(s) that execute communication applications, email or presence servers, and comparable devices. Communication connection(s) 616 is one example of communication media. Communication media can include therein computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
  • Example embodiments also include methods. These methods can be implemented in any number of ways, including the structures described in this document. One such way is by machine operations, of devices of the type described in this document.
  • Another optional way is for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of them. These human operators need not be collocated with each other, but each can be only with a machine that performs a portion of the program.
  • FIG. 7 illustrates a logic flow diagram of process 700 for providing enhanced voicemail preview according to embodiments. Process 700 may be implemented as part of an enhanced communication system in a voicemail server.
  • Process 700 begins with operation 710, where a voicemail is received for a subscriber associated with the voicemail server. The voicemail may be stored as an audio file. At operation 720, contextual information such as presence or contact information associated with the subscriber (and their contacts), email history information, and voicemail metadata (a source of the voicemail and any additional information that may be included with the voicemail in an enhanced communication system) may be received by the voicemail server.
  • At operation 730, the audio voicemail is transcribed into textual data using the contextual information. The contextual information is not only used to improve fidelity of the transcription, but it may also be used to surface information within the voicemail in textual format. As discussed previously, embodiments are not limited to automatic voice-recognition based transcriptions or textual data. Other forms of conversion from voicemail into textual and/or graphical data may be utilized to provide enhanced voicemail preview. At operation 740, the transcribed voicemail is further processed based on the contextual information, for example by rendering some of the information actionable, adding color/graphics schemes to improve presentation, providing switch-back and search capability on the audio file, and similar actions. This enhanced voicemail preview may be integrated into an email, instant message, or similar text-based communication message. The message may also include a copy of the audio version of the voicemail or a link to a location of the audio file. At operation 750, the message including the enhanced voicemail preview may be forwarded to the subscriber.
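  • As an illustration only, operations 710 through 750 can be sketched as a small pipeline. Every name below (Voicemail, Context, transcribe, build_preview, build_message, the [call:...] markup, and the sample transcript) is a hypothetical stand-in chosen for this sketch and is not defined by the specification. The transcription step is a stub, since an embodiment may use speech recognition, human transcription, or a combination.

```python
import re
from dataclasses import dataclass, field
from email.message import EmailMessage

# All names here are hypothetical; the specification defines no concrete API.

@dataclass
class Voicemail:
    audio_path: str        # operation 710: voicemail stored as an audio file
    caller_number: str     # voicemail metadata: source of the voicemail
    subscriber_email: str  # where the preview message is delivered


@dataclass
class Context:
    # operation 720: contextual information, reduced here to a contact lookup
    contacts: dict = field(default_factory=dict)  # phone number -> contact name


PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def transcribe(voicemail, context):
    """Stub for operation 730. A real system would run a recognizer (or use a
    human transcriber) and bias it with the contextual information."""
    return "Hi, this is Dana. Call me back at 555-010-0199 about the budget."


def build_preview(text, voicemail, context):
    """Operation 740: name the caller from contacts and mark phone numbers
    as actionable using an assumed [call:...] markup."""
    caller = context.contacts.get(voicemail.caller_number, voicemail.caller_number)
    marked = PHONE_RE.sub(lambda m: f"[call:{m.group(0)}]", text)
    return f"Voicemail from {caller}:\n{marked}"


def build_message(voicemail, preview):
    """Integrate the preview and a link to the audio into an email message."""
    msg = EmailMessage()
    msg["To"] = voicemail.subscriber_email
    msg["Subject"] = "Voicemail preview"
    msg.set_content(f"{preview}\n\nAudio: {voicemail.audio_path}")
    return msg


def process_voicemail(voicemail, context):
    """Operations 730 to 750 in order; sending the message is left out."""
    text = transcribe(voicemail, context)
    preview = build_preview(text, voicemail, context)
    return build_message(voicemail, preview)
```

  • In this sketch the audio is referenced by a link; attaching a copy of the audio file to the message would be an equally valid choice under the description above.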
  • Some or all of the actionable items may be presented to the subscriber upon completion of an authorization process. Actions provided to the called subscriber based on key portions of the voicemail preview may include actions to be performed by an email application displaying the voicemail preview or actions to be performed by other applications. Furthermore, elements of the email user interface displaying the voicemail preview may be dynamically modified based on the actionable items in the voicemail preview.
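  • A minimal sketch of how actionable items might be gated on authorization and mapped to actions follows. The ActionableItem type, the requires_auth flag, and the ACTIONS_BY_KIND table are assumptions made for illustration; the specification does not prescribe how authorization is performed or which actions belong to which item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionableItem:
    kind: str            # e.g. "phone", "name", "location"
    value: str           # the text found in the voicemail preview
    requires_auth: bool  # assumption: some items are shown only after authorization


# Hypothetical mapping from item kind to actions a client application could offer.
ACTIONS_BY_KIND = {
    "phone": ["call", "send text message"],
    "name": ["view contact", "email", "schedule meeting"],
    "location": ["show map"],
}


def actions_for(items, authorized):
    """Return (item, actions) pairs that may be presented to the subscriber."""
    visible = []
    for item in items:
        if item.requires_auth and not authorized:
            continue  # withheld until the authorization process completes
        visible.append((item, ACTIONS_BY_KIND.get(item.kind, [])))
    return visible
```

  • A client could use the returned pairs to decide which user interface elements to show or modify for a given preview.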
  • The operations included in process 700 are for illustration purposes. Enhanced voicemail preview via email or instant messaging may be implemented by similar processes with fewer or additional steps, as well as in different order of operations using the principles described herein.
  • The above specification, examples and data provide a complete description of the manufacture and use of the composition of the embodiments. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims and embodiments.

Claims (21)

1.-20. (canceled)
21. A method to be executed at least in part in a computing device for providing enhanced voicemail preview, the method comprising:
receiving an audio voicemail for a user;
determining contextual information associated with the user and a calling party, wherein the contextual information associated with the calling party includes contact and presence information associated with the calling party;
generating actionable items based on the contextual information associated with the calling party;
generating a voicemail preview based on a transcription of the audio voicemail employing at least a portion of the contextual information such that an accuracy of audio-text fidelity is increased and a plurality of actionable items are provided in the voicemail preview; and
integrating the voicemail preview into a text-based communication for delivery to the user along with access to the audio voicemail.
22. The method of claim 21, further comprising:
providing one or more controls on the voicemail preview for enabling the user to one of: reply to the calling party, forward the voicemail preview to a third party, and translate the voicemail preview.
23. The method of claim 22, wherein the controls further enable the user to one of: play the audio voicemail on a computing device and play the audio voicemail on a phone associated with the user.
24. The method of claim 21, further comprising:
displaying a secondary user interface associated with the displayed voicemail preview enabling the user to one of: view calling party contact information, call the calling party, email the calling party, send a text message to the calling party, schedule a meeting with the calling party, and tag the calling party for presence alerts.
25. The method of claim 24, wherein the secondary user interface includes summary presence information associated with the calling party and enables the user to add the calling party to a contacts list of the user.
26. The method of claim 21, further comprising:
generating the voicemail preview by performing speech recognition on the received audio voicemail and employing the contextual information associated with the user and the calling party to enhance a recognition accuracy, wherein the contextual information includes at least one from a set of: voicemail metadata, presence information, contact information, organizational information, location information, prior communication information, and relevant communication information.
27. The method of claim 21, further comprising:
identifying one or more of a name, a location, and a phone number in the transcribed voicemail as actionable items.
28. The method of claim 27, further comprising:
displaying contact information and providing one or more contact actions associated with identified name;
enabling viewing of a map showing the identified location; and
providing an option to call the identified phone number.
29. The method of claim 21, further comprising:
providing the actionable items and contextual information employing one or more of human transcribed data, combination of human transcribed and machine generated data, and comparable information that is marked up in a schema decipherable by a client application.
30. The method of claim 21, further comprising:
delivering the voicemail preview to the user through one or more of an email message, an instant message, an SMS message, and an RSS feed message.
31. The method of claim 21, further comprising:
providing one or more of rich text, highlighting, and graphics in displaying the transcribed voicemail to deliver contextual information.
32. The method of claim 21, further comprising:
enabling the user to jump to a location on the audio playback of the voicemail through selection of the location on the transcribed text of the voicemail.
33. A communication server for implementing enhanced voicemail preview, the server comprising:
a memory configured to store instructions; and
a processor executing a voicemail application and a transcript module in conjunction with the instructions stored in the memory, wherein the voicemail application is configured to:
receive an audio voicemail for a user;
determine contextual information associated with the user and a calling party, wherein the contextual information associated with the calling party includes contact and presence information associated with the calling party;
generate actionable items based on the contextual information associated with the calling party;
generate a voicemail preview by performing speech recognition on the received audio voicemail and employing the contextual information associated with the user and the calling party to enhance a recognition accuracy;
integrate the voicemail preview into a text-based communication employing one or more of rich text, highlighting, and graphics in displaying the transcribed voicemail;
integrate the audio voicemail with the text-based communication; and
deliver the text-based communication to the user.
34. The server of claim 33, wherein the audio voicemail is integrated into the text-based communication through one of an attachment and a link, and access to the audio voicemail is provided through an audio playback user interface embedded into the text-based communication.
35. The server of claim 33, wherein the voicemail application enables presentation of one or more secondary user interfaces providing one or more custom actions for each actionable item on the voicemail preview upon selection of each actionable item.
36. The server of claim 33, wherein the voicemail application is further configured to enable addition of information available in the transcribed voicemail to a search index of a server managing the text-based communication such that contents of the voicemail are searchable through the server managing the text-based communication.
37. The server of claim 33, wherein the voicemail application is further configured to enable addition of information available in the transcribed voicemail to a search index of one of a client application and a client desktop such that contents of the voicemail are searchable through the client application and the client desktop.
38. A computer-readable memory device with instructions stored thereon for providing enhanced voicemail preview, the instructions comprising:
receiving an audio voicemail for a user;
determining contextual information associated with the user and a calling party, wherein the contextual information associated with the calling party includes contact and presence information associated with the calling party;
generating actionable items based on the contextual information associated with the calling party;
generating a voicemail preview based on a transcription of the audio voicemail employing at least a portion of the contextual information such that an accuracy of audio-text fidelity is increased and a plurality of actionable items are provided in the voicemail preview;
integrating the voicemail preview into a text-based communication for delivery to the user along with access to the audio voicemail;
enabling presentation of one or more controls on the voicemail preview for enabling the user to one of: reply to the calling party, forward the voicemail preview to a third party, and translate the voicemail preview; and
enabling display of a secondary user interface associated with the displayed voicemail preview enabling the user to one of: view calling party contact information, call the calling party, email the calling party, send a text message to the calling party, schedule a meeting with the calling party, and tag the calling party for presence alerts.
39. The computer-readable medium of claim 38, wherein the instructions further comprise:
enabling dynamic modification of elements of a user interface displaying the voicemail preview based on the actionable items in the voicemail preview.
40. The computer-readable medium of claim 38, wherein the instructions further comprise:
identifying one or more of a name, a location, and a phone number in the transcribed voicemail as actionable items;
displaying contact information and providing one or more contact actions associated with identified name;
enabling viewing of a map showing the identified location; and
providing an option to call the identified phone number.
US13/681,633 2009-01-09 2012-11-20 Enhanced voicemail usage through automatic voicemail preview Abandoned US20130077769A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/681,633 US20130077769A1 (en) 2009-01-09 2012-11-20 Enhanced voicemail usage through automatic voicemail preview

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/351,681 US8345832B2 (en) 2009-01-09 2009-01-09 Enhanced voicemail usage through automatic voicemail preview
US13/681,633 US20130077769A1 (en) 2009-01-09 2012-11-20 Enhanced voicemail usage through automatic voicemail preview

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US12/351,681 Continuation US8345832B2 (en) 2009-01-09 2009-01-09 Enhanced voicemail usage through automatic voicemail preview

Publications (1)

Publication Number Publication Date
US20130077769A1 (en) 2013-03-28

Family

ID=42317047

Family Applications (2)

Application Number Title Priority Date Filing Date
US12/351,681 Active 2030-11-07 US8345832B2 (en) 2009-01-09 2009-01-09 Enhanced voicemail usage through automatic voicemail preview
US13/681,633 Abandoned US20130077769A1 (en) 2009-01-09 2012-11-20 Enhanced voicemail usage through automatic voicemail preview

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US12/351,681 Active 2030-11-07 US8345832B2 (en) 2009-01-09 2009-01-09 Enhanced voicemail usage through automatic voicemail preview

Country Status (11)

Country Link
US (2) US8345832B2 (en)
EP (1) EP2377092A4 (en)
JP (1) JP5362034B2 (en)
KR (1) KR101691239B1 (en)
CN (1) CN102272789B (en)
AU (1) AU2009336048B2 (en)
BR (1) BRPI0922394A2 (en)
CA (1) CA2743586C (en)
RU (1) RU2520355C2 (en)
TW (1) TWI497315B (en)
WO (1) WO2010080261A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120209925A1 (en) * 2011-02-11 2012-08-16 Acer Incorporated Intelligent data management methods and systems, and computer program products thereof
US9992343B2 (en) 2015-04-14 2018-06-05 Microsoft Technology Licensing, Llc Text translation of an audio recording during recording capture
US10771629B2 (en) * 2017-02-06 2020-09-08 babyTel Inc. System and method for transforming a voicemail into a communication session

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9710819B2 (en) * 2003-05-05 2017-07-18 Interactions Llc Real-time transcription system utilizing divided audio chunks
US9020107B2 (en) 2006-11-14 2015-04-28 Nuance Communications, Inc. Performing actions for users based on spoken information
US8254535B1 (en) 2006-11-14 2012-08-28 Nuance Communications, Inc. Performing actions for users based on spoken information
US8290126B2 (en) * 2008-12-19 2012-10-16 CenturyLink Intellectual Property LLC System and method for a visual voicemail interface
US8290124B2 (en) * 2008-12-19 2012-10-16 At&T Mobility Ii Llc Conference call replay
US8537980B2 (en) * 2009-03-27 2013-09-17 Verizon Patent And Licensing Inc. Conversation support
US8509398B2 (en) * 2009-04-02 2013-08-13 Microsoft Corporation Voice scratchpad
WO2011008978A1 (en) * 2009-07-15 2011-01-20 Google Inc. Commands directed at displayed text
US9183834B2 (en) * 2009-07-22 2015-11-10 Cisco Technology, Inc. Speech recognition tuning tool
US20150156154A1 (en) * 2010-03-04 2015-06-04 Google Inc. Storage and retrieval of electronic messages using linked resources
US8417223B1 (en) 2010-08-24 2013-04-09 Google Inc. Advanced voicemail features without carrier voicemail support
CN106027383A (en) * 2010-09-30 2016-10-12 联想(北京)有限公司 Portable electronic device and content release method for portable electronic device
US9119041B2 (en) * 2010-11-15 2015-08-25 At&T Intellectual Property I, L.P. Personal media storage and retrieval for visual voice mail
US8913722B2 (en) 2011-05-05 2014-12-16 Nuance Communications, Inc. Voicemail preview and editing system
JP5799621B2 (en) * 2011-07-11 2015-10-28 ソニー株式会社 Information processing apparatus, information processing method, and program
US9009041B2 (en) 2011-07-26 2015-04-14 Nuance Communications, Inc. Systems and methods for improving the accuracy of a transcription using auxiliary data such as personal data
US10388294B1 (en) * 2012-06-20 2019-08-20 Amazon Technologies, Inc. Speech-based and group-based content synchronization
CN103973542B (en) * 2013-02-01 2017-06-13 腾讯科技(深圳)有限公司 A kind of voice information processing method and device
US8537983B1 (en) 2013-03-08 2013-09-17 Noble Systems Corporation Multi-component viewing tool for contact center agents
US11094320B1 (en) * 2014-12-22 2021-08-17 Amazon Technologies, Inc. Dialog visualization
US10412029B2 (en) 2015-12-11 2019-09-10 Microsoft Technology Licensing, Llc Providing rich preview of communication in communication summary
DE102017103533A1 (en) 2017-02-21 2018-08-23 Grundig Business Systems Gmbh Method and device for text-based preview of the content of audio files

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020196910A1 (en) * 2001-03-20 2002-12-26 Steve Horvath Method and apparatus for extracting voiced telephone numbers and email addresses from voice mail messages
US20030128820A1 (en) * 1999-12-08 2003-07-10 Julia Hirschberg System and method for gisting, browsing and searching voicemail using automatic speech recognition
US20050243979A1 (en) * 2004-04-30 2005-11-03 Starbuck Bryan T Integrated messaging user interface with message-based logging
US20070174388A1 (en) * 2006-01-20 2007-07-26 Williams Michael G Integrated voice mail and email system
US20100056113A1 (en) * 2008-08-26 2010-03-04 At&T Mobility Ii Llc Location-Aware Voicemail
US20100158213A1 (en) * 2008-12-19 2010-06-24 At&T Mobile Ii, Llc Sysetms and Methods for Intelligent Call Transcription
US8121263B2 (en) * 2006-07-21 2012-02-21 Google Inc. Method and system for integrating voicemail and electronic messaging
US8526580B2 (en) * 2006-08-31 2013-09-03 Broadcom Corporation System and method for voicemail organization

Family Cites Families (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7386452B1 (en) * 2000-01-27 2008-06-10 International Business Machines Corporation Automated detection of spoken numbers in voice messages
US6507643B1 (en) * 2000-03-16 2003-01-14 Breveon Incorporated Speech recognition system and method for converting voice mail messages to electronic mail messages
US6775651B1 (en) * 2000-05-26 2004-08-10 International Business Machines Corporation Method of transcribing text from computer voice mail
TW561386B (en) * 2000-10-17 2003-11-11 Compal Electronics Inc Voice e-mail forwarding/receiving method and apparatus thereof
US7039585B2 (en) * 2001-04-10 2006-05-02 International Business Machines Corporation Method and system for searching recorded speech and retrieving relevant segments
JP2003032388A (en) * 2001-07-12 2003-01-31 Denso Corp Communication terminal and processing system
US6782086B2 (en) * 2001-08-02 2004-08-24 Intel Corporation Caller ID lookup
US7007085B1 (en) * 2001-09-28 2006-02-28 Bellsouth Intellectual Property Corporation Message log for wireline, voice mail, email, fax, pager, instant messages and chat
ITFI20010199A1 (en) * 2001-10-22 2003-04-22 Riccardo Vieri SYSTEM AND METHOD TO TRANSFORM TEXTUAL COMMUNICATIONS INTO VOICE AND SEND THEM WITH AN INTERNET CONNECTION TO ANY TELEPHONE SYSTEM
JP2003198763A (en) * 2001-12-27 2003-07-11 Nef:Kk System for transferring e-mail of automatic answering telephone message
US7224981B2 (en) * 2002-06-20 2007-05-29 Intel Corporation Speech recognition of mobile devices
US7085554B2 (en) * 2003-01-24 2006-08-01 Common Voices Llc Subscriber migration system
US8229086B2 (en) * 2003-04-01 2012-07-24 Silent Communication Ltd Apparatus, system and method for providing silently selectable audible communication
GB2421879A (en) * 2003-04-22 2006-07-05 Spinvox Ltd Converting voicemail to text message for transmission to a mobile telephone
US8638910B2 (en) * 2003-07-14 2014-01-28 Cisco Technology, Inc. Integration of enterprise voicemail in mobile systems
US7317788B2 (en) * 2004-01-23 2008-01-08 Siemens Communications, Inc. Method and system for providing a voice mail message
US7912186B2 (en) * 2004-10-20 2011-03-22 Microsoft Corporation Selectable state machine user interface system
US7623633B2 (en) * 2005-04-28 2009-11-24 Cisco Technology, Inc. System and method for providing presence information to voicemail users
CN1710892A (en) * 2005-07-27 2005-12-21 北京立通无限科技有限公司 Speech-sound mail transfering system and method
TW200707239A (en) * 2005-08-01 2007-02-16 Chao-Hsin Lo E-mail assisted and text-to-sound system
US8498624B2 (en) * 2005-12-05 2013-07-30 At&T Intellectual Property I, L.P. Method and apparatus for managing voicemail messages
US7693267B2 (en) * 2005-12-30 2010-04-06 Microsoft Corporation Personalized user specific grammars
EP1858005A1 (en) * 2006-05-19 2007-11-21 Texthelp Systems Limited Streaming speech with synchronized highlighting generated by a server
US8385513B2 (en) * 2006-05-31 2013-02-26 Microsoft Corporation Processing a received voicemail message
US8054961B2 (en) * 2006-09-29 2011-11-08 Siemens Enterprise Communications, Inc. MeetMe assistant
US8064576B2 (en) 2007-02-21 2011-11-22 Avaya Inc. Voicemail filtering and transcription
US8107598B2 (en) 2007-02-21 2012-01-31 Avaya Inc. Voicemail filtering and transcription
RU2324296C1 (en) * 2007-03-26 2008-05-10 Закрытое акционерное общество "Ай-Ти Мобайл" Method for message exchanging and devices for implementation of this method
US9036794B2 (en) * 2007-04-25 2015-05-19 Alcatel Lucent Messaging system and method for providing information to a user device
US20080273675A1 (en) * 2007-05-03 2008-11-06 James Siminoff Systems And Methods For Displaying Voicemail Transcriptions
US20100279660A1 (en) * 2007-09-28 2010-11-04 Sony Ericsson Mobile Communications Ab System and method for visual voicemail
US8542804B2 (en) * 2008-02-08 2013-09-24 Voxer Ip Llc Voice and text mail application for communication devices
US20090307090A1 (en) * 2008-06-05 2009-12-10 Embarq Holdings Company, Llc System and Method for Inserting Advertisements in Voicemail
US20100111270A1 (en) * 2008-10-31 2010-05-06 Vonage Holdings Corp. Method and apparatus for voicemail management
US8406386B2 (en) * 2008-12-15 2013-03-26 Verizon Patent And Licensing Inc. Voice-to-text translation for visual voicemail
US8972506B2 (en) * 2008-12-15 2015-03-03 Verizon Patent And Licensing Inc. Conversation mapping
US8204486B2 (en) * 2008-12-19 2012-06-19 Cox Communications, Inc. Dynamic messaging routing and audio-to-text linking for visual voicemail

Also Published As

Publication number Publication date
TWI497315B (en) 2015-08-21
RU2011128410A (en) 2013-05-10
WO2010080261A2 (en) 2010-07-15
CA2743586C (en) 2017-03-07
US8345832B2 (en) 2013-01-01
CN102272789B (en) 2013-08-21
TW201027356A (en) 2010-07-16
JP2012514938A (en) 2012-06-28
EP2377092A2 (en) 2011-10-19
AU2009336048B2 (en) 2014-06-05
RU2520355C2 (en) 2014-06-20
CN102272789A (en) 2011-12-07
WO2010080261A3 (en) 2010-09-02
US20100177877A1 (en) 2010-07-15
KR101691239B1 (en) 2016-12-29
JP5362034B2 (en) 2013-12-11
CA2743586A1 (en) 2010-07-15
AU2009336048A1 (en) 2011-07-07
BRPI0922394A2 (en) 2016-01-05
EP2377092A4 (en) 2014-01-29
KR20110117072A (en) 2011-10-26

Similar Documents

Publication Publication Date Title
US8345832B2 (en) Enhanced voicemail usage through automatic voicemail preview
US9258143B2 (en) Contextual summary of recent communications method and apparatus
US8583642B2 (en) Aggregated subscriber profile based on static and dynamic information
US8654958B2 (en) Managing call forwarding profiles
US8526969B2 (en) Nearby contact alert based on location and context
US8781094B2 (en) Contextual call routing by calling party specified information through called party specified form
US8442189B2 (en) Unified communications appliance
US8666052B2 (en) Universal phone number for contacting group members
US8275102B2 (en) Call routing and prioritization based on location context
US7801968B2 (en) Delegated presence for unified messaging/unified communication
US10966063B2 (en) Message management methods and systems
US8117201B2 (en) Pre-populated and administrator defined groups in contacts lists
SG174563A1 (en) Multimodal conversation park and retrieval
US7849206B2 (en) Service for policy rule specification evaluation and enforcement on multiple communication modes
US8490119B2 (en) Communication interface for non-communication applications

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HAMAKER, JON;WILSON, MICHAEL;MILLETT, TOM;AND OTHERS;SIGNING DATES FROM 20121212 TO 20130102;REEL/FRAME:029935/0651

AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HAMAKER, JON;WILSON, MICHAEL;MILLETT, TOM;AND OTHERS;SIGNING DATES FROM 20121212 TO 20130102;REEL/FRAME:029962/0229

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0541

Effective date: 20141014