WO2014098661A1 - A server and a communication apparatus for videoconferencing - Google Patents


Publication number
WO2014098661A1
Authority
WO
WIPO (PCT)
Prior art keywords
video conference
communication apparatus
voiceprint
server
user
Prior art date
Application number
PCT/SE2012/051418
Other languages
French (fr)
Inventor
Ryoji Kato
Shingo Murakami
Takeshi Matsumura
Toshikane Oda
Original Assignee
Telefonaktiebolaget L M Ericsson (Publ)
Priority date
Filing date
Publication date
Application filed by Telefonaktiebolaget L M Ericsson (Publ)
Priority to PCT/SE2012/051418
Publication of WO2014098661A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1822 Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission

Definitions

  • the present invention relates to a server and a communication apparatus for videoconferencing.
  • a conventional video conference service requires a user to perform some operations to
  • server for providing a video conference service includes a detection unit
  • a first communication apparatus for example, PC
  • a second communication apparatus for example, a mobile phone
  • an identification unit configured to
  • a video conference control unit configured to request at least one of the identified candidate communication apparatuses to participate in the video conference.
  • a communication apparatus for use in a video conference is provided.
  • the apparatus includes: a microphone; a communication unit configured to receive a request to participate in a video conference from a server that provides a video conference service; an obtaining unit configured to obtain a voiceprint from voice collected from the microphone; and a presentation unit configured to present a user interface for receiving an acceptance to the request when the obtained voiceprint matches a reference voiceprint.
  • FIG. 1 illustrates an exemplary environment of some embodiments.
  • FIG. 2 illustrates an exemplary block diagram of a video conference server according to some embodiments.
  • FIG. 3 illustrates an exemplary block diagram of a PC according to some embodiments.
  • FIGs. 4-6 illustrate exemplary operations during a video conference service according to some embodiments.
  • FIG. 1 illustrates an exemplary environment according to some embodiments of the present invention.
  • the environment includes a network 100 that is managed by a network operator which can provide a video
  • the network 100 is an IMS (IP Multimedia Subsystem) network, but the present invention is applicable to any other networks which can provide a video conference service.
  • the network 100 includes a video conference server 101, a voice recognition server 102, and a personal network server 103.
  • the video conference server 101 is an application server which provides a video conference service, in cooperation with other IMS components such as an HSS (Home Subscriber Server) , an S-CSCF (Serving Call Session Control Function) , and an MRF (Multimedia Resource Function) .
  • the voice recognition server 102 is an application server which provides a voiceprint matching service.
  • the personal network server 103 is an IMS enabler which manages information of IMS
  • a use case described below involves three users 111, 121, and 131.
  • the user 111 owns a PC 112 in his/her office (Office A) .
  • the user 121 owns a PC 122 in his/her office (Office B) .
  • the user 131 owns PCs 133 and 134 in his/her home and a PC 135 in his/her office (Office C) .
  • the user 131 also has a mobile phone 132 on him/her.
  • the user 111 and the user 121 are conducting a video conference with each other using the PCs 112 and 122.
  • This video conference is provided by the video conference server 101.
  • the lines between the video conference server 101 and the PCs 112 and 122 indicate video conference sessions. At this time, the user 131 does not participate in the video conference.
  • a participant of the video conference (for example, the user 111) is required to invite the user 131 to the video conference and ask the user 131 to make a presentation. Then, the user 111 makes a voice call to the mobile phone 132 via the video conference server 101 using the PC 112. When the user 131 answers the voice call, a voice session is established between the mobile phone 132 and the video conference server 101 and the user 111 can talk with the user 131.
  • the user 131 needs to participate in the ongoing video conference using a PC 133 (that stores a presentation material) .
  • a conventional video conference service requires the user 131 to perform some operations to participate in the video conference, such as activation of a video conference application on the PC 133, input of ID and passcode of the video conference, etc.
  • the mobile phone 132 receives from the video conference server 101 information that allows the PC 133 to participate in the ongoing video conference, and then transfers the information to the PC 133 and instructs the PC 133 to participate in the ongoing video conference via a local wireless link such as WLAN, Bluetooth, etc.
  • this approach requires a local wireless link between the mobile phone 132 and the PC 133, which cannot always be expected.
  • some embodiments employ a server side approach.
  • the video conference server 101 controls the PC 133 such that the user 131 can easily participate in the ongoing video conference.
  • Fig. 2 illustrates an exemplary configuration of the video conference server 101 to achieve the server side approach.
  • the video conference server 101 includes a CPU 201, a memory 202, a network interface 203, and a video conference controller 204.
  • the CPU 201 controls overall operations of the video conference server 101.
  • the memory 202 stores computer programs and data used for operations of the video conference server 101.
  • the network interface 203 is used to communicate with IMS terminals such as the mobile phone 132, the PCs 112, 122, and 133-135 in Fig. 1, and with other components in the network 100.
  • Fig. 3 illustrates an exemplary configuration of the communication apparatus 300 to achieve the server side approach.
  • Each of the PCs 133-135 may have the same configuration as the communication apparatus 300.
  • the PCs 112 and 122 need not be the communication apparatus 300 as long as they can participate in a video conference provided by the video conference server 101.
  • a PC is used as the communication apparatus 300 in Fig. 3, other apparatuses such as a smartphone, a mobile phone, and so on can be used as the communication apparatus 300.
  • the communication apparatus 300 includes a CPU 301, a memory 302, a network interface 303, a speaker 304, a display 305, a microphone 306, a video
  • the CPU 301 controls overall operations of the communication apparatus 300.
  • the memory 302 stores computer programs and data used for operations of the communication apparatus 300.
  • the network interface 303 is used to communicate with the network 100.
  • the network interface 303 includes a SIM (Subscriber Identity Module) which has unique
  • the speaker 304, the display 305, and the microphone 306 may be well-known components.
  • the video conference client 307 controls a video conference provided by the video conference server 101. The operations of the video conference client 307 will be described in detail below.
  • the video conference client 307 may be embodied in a form of software and be installed to the communication apparatus 300.
  • the voiceprint generator 308 generates a voiceprint from voice data.
  • the input device 309 receives input from a user of the communication apparatus and is embodied by a mouse, a keyboard, a keypad, a touch screen, and so on.
  • Fig. 4 illustrates an example of overall operations of the network side approach according to some embodiments.
  • the CPU included in each device executes computer programs stored in memory of each device to process these operations.
  • Each of the video conference controller 204 and the video conference client 307 may include separate units for each
  • step S401 when a video conference starts between the user 111 and the user 121, the video conference controller 204 establishes video conference sessions with the PCs 112 and 122.
  • step S402 the video conference controller 204 receives a voice call request from the PC 112 to the mobile phone 132.
  • step S403 the video conference controller 204
  • once the video conference controller 204 detects that a voice session is established with the mobile phone 132, the video conference controller 204 identifies one or more terminals associated with the mobile phone 132. The identified terminals are used as candidates that are possibly used for the user 131 to participate in the ongoing video conference. In this use case, since the PCs 133, 134, and 135 are associated with the mobile phone 132, the video conference controller 204 identifies these PCs as candidates.
  • the user 131 registers one or more candidates (in this use case, the PCs 133-135) to the network 100 such that the candidates are associated with the mobile phone 132 before the ongoing video conference starts.
  • the user 131 can register candidates when the user 131 subscribes to a video conference service provided by the network 100 and update the candidates anytime after the registration. This registration can be performed in different ways.
  • the user 131 may register and update candidates manually through a management console provided by the network 100, etc.
  • the mobile phone 132 more specifically, identification of the mobile phone 132 and candidates (more
  • identification of the candidates may be associated with each other via an IMS account of the user 131.
  • Any components in the network 100 can manage this association.
  • the video conference server 101 may manage this association in its local storage (for example, the memory 202) and identify the candidates with reference to the local storage.
  • an HSS in the network 100 may manage this association, and the video conference server 101 may identify the candidates with reference to the HSS.
  • step S405 the video conference controller 204 remotely activates the video conference client 307 in each candidate and requests it to make an invitation to the ongoing video conference between the user 111 and the user 121. Since a candidate can be powered off when the user 131 answers the voice call in step S403, the video conference controller 204 may repeat step S405 for a certain duration (for example, while the voice session in step S403 is ongoing or while the video conference in step S401 is ongoing). This repetition makes it possible for the video conference client 307 in a candidate to be activated after the user 131 powers on the candidate.
  • step S406 the video conference client 307 in each candidate presents an invitation so that the user 131 can easily participate in the ongoing video conference using the candidate.
  • the video conference client 307 presents a pop-up window on the display 305 which allows the user 131 to
  • step S407 the user 131 selects a candidate (the PC 133) from the candidates that the user 131 associated with the mobile phone 132 and the video conference client 307 in the PC 133 receives a user input to accept the invitation via the input device 309.
  • step S408 the video conference client 307 in the PC 133 sends the acceptance to the video conference server 101.
  • step S409 the video conference controller 204 remotely deactivates the video conference client 307 in candidates which do not send acceptance (in this use case, the PCs 134 and 135) .
  • step S410 the video conference controller 204 establishes a video conference session between the PC 133 and the video conference server 101 so that the user 131 can participate in the ongoing video
  • step S410 two sessions have been
  • voice from the user 131 is simultaneously picked up with both of the mobile phone 132 and the PC 133 and sent to the video conference server 101 as two
  • the video conference controller 204 may prioritize one of these audio streams. In some embodiments, the video conference controller 204 selects one of these audio streams depending on quality (for example, BER, a volume of voice, etc. ) and sends the selected one to the
  • the video conference controller 204 may discard or lower the volume of the other audio stream which was not selected. Furthermore, the video conference controller 204 may transfer an audio stream from the PCs 112 and 122 to either of the mobile phone 132 or the PC 133.
  • Fig. 5 illustrates another example of overall operations of the network side approach according to some embodiments.
  • the CPU included in each device executes computer programs stored in memory of each device to process these operations.
  • steps S501 to S510 in Fig. 5 are performed instead of steps S405 to S409 in Fig. 4.
  • the PC 135 displays in step S406 a pop-up window which enables a person to
  • step S501 the video conference controller 204 remotely activates the video conference client 307 in each candidate and instructs it to send back a voiceprint.
  • Step S501 may be repeated for a certain duration, similar to step S405.
  • step S502 when the video conference client 307 is activated, the voiceprint generator 308 in each candidate collects sound around the candidate using the microphone 306, extracts voice from the sound, and generates a voiceprint.
  • step S503 the video conference client 307 sends the generated voiceprint to the video
  • the video conference client 307 may send the voiceprint to the video conference server 101 only when the user 131 instructs to do so. If user 131 does not accept transmission of a voiceprint or the voiceprint generator 308 cannot extract voice (for example, in a case that there is not a person around the candidate) , the video conference client 307 may send an error message to the video conference server 101.
  • step S504 the video conference controller 204 narrows down the candidates based on the received voiceprints. Since the user 131 is talking with the user 111 via the mobile phone 132, the PCs 133 and 134, which are near the user 131, are assumed to receive voice of the user 131. Thus, the voiceprint sent to the video conference server 101 from the PCs 133 and 134 in step S503 should include a voiceprint of the user 131. The user 131 may make a speech on purpose so that the candidate can receive voice of the user 131.
  • the video conference controller 204 determines whether the voiceprint sent from each candidate matches a voiceprint of the user 131.
  • the voiceprint of the user 131 which is used as a reference voiceprint in comparison, can be obtained in different ways.
  • the video conference controller 204 generates a voiceprint based on audio data received from the mobile phone 132 over the voice session established in step S403. In some other embodiments, the video conference controller 204 retrieves a voiceprint of the user 131 from an internal storage (for example, the memory 202) or an external server (for example, the personal network server 103), to which the user 131 previously registered his/her voiceprint associated with the mobile phone 132 or his/her IMS account.
  • the user 131 may register his/her voiceprint when the user 131 subscribes to a video
  • the video conference controller 204 maintains this candidate.
  • the video conference controller 204 eliminates this candidate. In this case, since a voiceprint from the PC 135, which is not near the user 131, does not match a voiceprint of the user 131, the video conference controller 204 eliminates the PC 135 from the candidates identified in step S404. In other words, the video conference controller 204 narrows down the identified candidates to the PCs 133 and 134.
  • the video conference controller 204 may compare audio stream from the mobile phone 132 with audio stream from each of the candidates (the PCs 133 and 134) to ensure that the user 131 is near the candidates.
  • the video conference controller 204 may establish a voice session between the voice recognition server 102 and each candidate, fork the audio stream from the mobile phone 132, and route it to the voice recognition server 102.
  • the voice recognition server 102 compares the audio stream from the mobile phone 132 with the audio stream from each candidate and returns the result to the video conference server 101. In accordance with the result from the voice recognition server 102, the video conference controller 204 may further narrow down the candidates.
  • step S505 the video conference controller 204 remotely requests the video conference client 307 in each of the narrowed down candidates (the PCs 133 and 134) to present an invitation.
  • step S506 the video conference controller 204 remotely deactivates the video conference client 307 in a terminal which has been eliminated from the candidates (the PC 135) .
  • Steps S507 to S510 are similar to steps S406 to S409
  • a voiceprint is used to narrow down the candidates.
  • These embodiments can suppress unnecessary audio streams from terminals to the video conference server 101 and preserve privacy of persons near the candidates.
  • FIG. 6 illustrates yet another example of overall operations of the network side approach
  • the CPU included in each device executes computer programs stored in memory of each device to process these operations.
  • steps S601 to S607 in Fig. 6 are performed instead of steps S405 to S409 in Fig. 4.
  • the operations in Fig. 6 can reduce a risk that a person who has no relation with the video conference
  • step S601 the video conference controller 204 remotely activates the video conference client 307 in each candidate.
  • step S602 when the video conference client 307 is activated, the voiceprint generator 308 in each candidate collects sound around the candidate using the microphone 306, extracts voice from the sound, and generates a voiceprint.
  • the video conference client 307 compares the generated voiceprint with a reference voiceprint that is a voiceprint of the user 131.
  • the video conference client 307 can obtain a voiceprint of the user 131 in different ways.
  • the video conference client 307 receives a voiceprint of the user 131 from the video conference server 101 along with the activation request in step S601.
  • the video conference client 307 requests the user 131 to register his/her voiceprint to the terminal of the user 131 before starting a video conference service (for example, when the user 131 installs the video conference client 307 to the PCs 133-135) and retrieves this voiceprint of the user 131 from a local storage (for example, the memory 302) .
  • the video conference client 307 (in this use case, those in the PCs 133 and 134) presents an invitation in step S604, similar to step S406.
  • the video conference client 307 (in this use case, that in the PC 135) does not present an invitation and may send an error message to the video conference server 101.
  • Steps S605 to S607 are similar to steps S407 to S409, respectively, and thus description thereof is omitted. According to the embodiments described above, the candidates do not send a voiceprint and an audio stream to the video conference server 101 before the video conference session is established.
  • embodiments can suppress unnecessary audio streams from terminals to the network 100 and preserve privacy of persons near the candidates.
  • a location of the candidate may be compared with a location of the mobile phone 132.
  • a location of the candidate may be
  • a location of the mobile phone 132 may be identified by location information obtained by a GPS (not shown) in the mobile phone 132, or by a coverage area (cell) in which the mobile phone 132 is currently located.
  • the user 131 can use a conventional terminal (that is, a terminal which has only a voice call function) as the mobile phone 132.
  • the user 131 can easily (for example, a single click on the PC 133) participate in the ongoing video conference
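
The audio-stream prioritization mentioned in the list above (voice of the user 131 arriving as two streams, one from the mobile phone 132 and one from the PC 133) could be sketched as follows. This is only an illustration: the text names BER and voice volume as example quality measures but does not say how they are combined, so the ordering used here (lowest BER first, higher volume as a tie-breaker) and the names `AudioStream` and `select_stream` are assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AudioStream:
    """Hypothetical description of one incoming audio stream."""
    source: str            # e.g. "mobile-phone-132" or "pc-133"
    bit_error_rate: float  # lower is better
    volume: float          # higher is better (arbitrary units)


def select_stream(streams: List[AudioStream]) -> Optional[AudioStream]:
    """Pick one stream to forward to the other participants.

    Assumed policy: prefer the lowest BER; break ties by higher volume.
    """
    if not streams:
        return None
    return min(streams, key=lambda s: (s.bit_error_rate, -s.volume))


phone = AudioStream("mobile-phone-132", bit_error_rate=1e-3, volume=0.6)
pc = AudioStream("pc-133", bit_error_rate=1e-5, volume=0.5)
selected = select_stream([phone, pc])
```

The stream that is not selected could then be discarded or attenuated, as the text describes.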

Abstract

A server for providing a video conference service is provided. The server includes a detection unit configured to detect establishment of a voice session between a first communication apparatus (for example, a PC) that is participating in a video conference and a second communication apparatus (for example, a mobile phone) that is not participating in the video conference; an identification unit configured to identify one or more candidate communication apparatuses (for example, PCs) associated with the second communication apparatus; and a video conference control unit configured to request at least one of the identified candidate communication apparatuses to participate in the video conference.

Description

DESCRIPTION
A SERVER AND A COMMUNICATION APPARATUS FOR VIDEOCONFERENCING
TECHNICAL FIELD
[0001] The present invention relates to a server and a communication apparatus for videoconferencing.
BACKGROUND
[0002] A conventional video conference service requires a user to perform some operations to participate in a video conference, such as activation of a video conference application on a terminal, input of an ID and a passcode of the video conference, etc. In a case that a user who is talking on a mobile phone with another person wants to participate in a video conference using a PC, such operations can be cumbersome since the user is busy talking on the mobile phone. It is desirable that a user can participate in a video conference without cumbersome operations.
SUMMARY
[0003] According to a first aspect of the invention, a server for providing a video conference service is provided. The server includes a detection unit configured to detect establishment of a voice session between a first communication apparatus (for example, a PC) that is participating in a video conference and a second communication apparatus (for example, a mobile phone) that is not participating in the video conference; an identification unit configured to identify one or more candidate communication apparatuses (for example, PCs) associated with the second communication apparatus; and a video conference control unit configured to request at least one of the identified candidate communication apparatuses to participate in the video conference.
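As a rough illustration of how the three units of the first aspect fit together, the following minimal sketch models detection, identification, and the participation request in one class. The class name `ConferenceServer`, the `lookup_candidates` callable, and the string identifiers are hypothetical and are not taken from the application; a real server would use IMS signalling rather than in-memory lists.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ConferenceServer:
    """Hypothetical in-memory model of the server of the first aspect."""
    # Assumed lookup: identity of the second apparatus -> candidate apparatuses.
    lookup_candidates: Callable[[str], List[str]]
    participants: Set[str] = field(default_factory=set)
    requests_sent: List[str] = field(default_factory=list)

    def detect(self, first: str, second: str) -> bool:
        """Detection unit: the first apparatus is in the conference, the second is not."""
        return first in self.participants and second not in self.participants

    def on_voice_session_established(self, first: str, second: str) -> List[str]:
        """Run identification and the participation request once detection succeeds."""
        if not self.detect(first, second):
            return []
        candidates = self.lookup_candidates(second)  # identification unit
        for candidate in candidates:                 # video conference control unit
            self.requests_sent.append(candidate)     # stands in for a real request
        return candidates


registry = {"phone-132": ["pc-133", "pc-134", "pc-135"]}
server = ConferenceServer(
    lookup_candidates=lambda phone_id: registry.get(phone_id, []),
    participants={"pc-112", "pc-122"},
)
invited = server.on_voice_session_established("pc-112", "phone-132")
```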
[0004] According to a second aspect of the invention, a communication apparatus for use in a video conference is provided. The apparatus includes: a microphone; a communication unit configured to receive a request to participate in a video conference from a server that provides a video conference service; an obtaining unit configured to obtain a voiceprint from voice collected from the microphone; and a presentation unit configured to present a user interface for receiving an acceptance to the request when the obtained voiceprint matches a reference voiceprint.
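The condition in the second aspect (present the acceptance user interface only when the obtained voiceprint matches the reference voiceprint) might be sketched as below. The application does not specify how a voiceprint is represented or compared; this sketch assumes, purely for illustration, that a voiceprint is a fixed-length feature vector and that a match means cosine similarity at or above a threshold. The function names and the threshold value 0.8 are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_present_invitation(
    obtained: Sequence[float],
    reference: Sequence[float],
    threshold: float = 0.8,  # assumed value, not from the application
) -> bool:
    """Return True when the obtained voiceprint is close enough to the reference."""
    if len(obtained) != len(reference):
        return False
    return cosine_similarity(obtained, reference) >= threshold


reference_print = [0.9, 0.1, 0.4]
near_user = [0.88, 0.12, 0.41]   # voice captured near the intended user
someone_else = [0.1, 0.9, 0.0]   # voice of another person
```

With these sample vectors, `should_present_invitation(near_user, reference_print)` holds while the call with `someone_else` does not, so only the first apparatus would show the invitation.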
[0005] Further features of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
BRIEF DESCRIPTION OF DRAWINGS
[0006] Fig. 1 illustrates an exemplary environment of some embodiments.
[0007] Fig. 2 illustrates an exemplary block diagram of a video conference server according to some embodiments.
[0008] Fig. 3 illustrates an exemplary block diagram of a PC according to some embodiments.
[0009] Figs. 4-6 illustrate exemplary operations during a video conference service according to some embodiments.
DETAILED DESCRIPTION
[0010] Embodiments of the present invention will now be described with reference to the attached drawings. Each embodiment described below will be helpful in understanding a variety of concepts from the generic to the more specific. It should be noted that the technical scope of the present invention is defined by claims, and is not limited by each embodiment described below. In addition, not all combinations of the features described in the embodiments are always indispensable for the present invention.
[0011] Fig. 1 illustrates an exemplary environment according to some embodiments of the present invention. The environment includes a network 100 that is managed by a network operator which can provide a video conference service. It is assumed herein that the network 100 is an IMS (IP Multimedia Subsystem) network, but the present invention is applicable to any other networks which can provide a video conference service. The network 100 includes a video conference server 101, a voice recognition server 102, and a personal network server 103. The video conference server 101 is an application server which provides a video conference service, in cooperation with other IMS components such as an HSS (Home Subscriber Server), an S-CSCF (Serving Call Session Control Function), and an MRF (Multimedia Resource Function). The voice recognition server 102 is an application server which provides a voiceprint matching service. The personal network server 103 is an IMS enabler which manages information of IMS terminals.
[0012] A use case described below involves three users 111, 121, and 131. The user 111 owns a PC 112 in his/her office (Office A). The user 121 owns a PC 122 in his/her office (Office B). The user 131 owns PCs 133 and 134 in his/her home and a PC 135 in his/her office (Office C). The user 131 also has a mobile phone 132 on him/her. In the use case illustrated in Fig. 1, the user 111 and the user 121 are conducting a video conference with each other using the PCs 112 and 122. This video conference is provided by the video conference server 101. The lines between the video conference server 101 and the PCs 112 and 122 indicate video conference sessions. At this time, the user 131 does not participate in the video conference.
[0013] During the video conference, a participant of the video conference (for example, the user 111) is required to invite the user 131 to the video conference and ask the user 131 to make a presentation. Then, the user 111 makes a voice call to the mobile phone 132 via the video conference server 101 using the PC 112. When the user 131 answers the voice call, a voice session is established between the mobile phone 132 and the video conference server 101 and the user 111 can talk with the user 131. In order to make a presentation, the user 131 needs to participate in the ongoing video conference using a PC 133 (that stores a presentation material). A conventional video conference service requires the user 131 to perform some operations to participate in the video conference, such as activation of a video conference application on the PC 133, input of ID and passcode of the video conference, etc. However, since the user 131 is busy with a talk with the user 111 on the mobile phone 132, it is desirable that the number of operations to participate in the video conference should be reduced.
[0014] One possible approach to reduce the number of operations is a terminal side approach. In the terminal side approach, the mobile phone 132 receives from the video conference server 101 information that allows the PC 133 to participate in the ongoing video conference, and then transfers the information to the PC 133 and instructs the PC 133 to participate in the ongoing video conference via a local wireless link such as WLAN, Bluetooth, etc. However, this approach requires a local wireless link between the mobile phone 132 and the PC 133, which cannot always be expected. Thus, some embodiments employ a server side approach. In the server side approach, the video conference server 101 controls the PC 133 such that the user 131 can easily participate in the ongoing video conference.
[0015] Fig. 2 illustrates an exemplary configuration of the video conference server 101 to achieve the server side approach. The video conference server 101 includes a CPU 201, a memory 202, a network interface 203, and a video conference controller 204. The CPU 201 controls overall operations of the video conference server 101. The memory 202 stores computer programs and data used for operations of the video conference server 101. The network interface 203 is used to communicate with IMS terminals such as the mobile phone 132, the PCs 112, 122, and 133-135 in Fig. 1, and with other components in the network 100. The video conference controller 204 controls a video conference between IMS terminals. The operations of the video conference controller 204 will be described in detail below.
[0016] Fig. 3 illustrates an exemplary configuration of the communication apparatus 300 to achieve the server side approach. Each of the PCs 133-135 may have the same configuration as the communication apparatus 300. Note that the PCs 112 and 122 need not be the communication apparatus 300 as long as they can participate in a video conference provided by the video conference server 101. Although a PC is used as the communication apparatus 300 in Fig. 3, other apparatuses such as a smartphone, a mobile phone, and so on can be used as the communication apparatus 300.
[0017] The communication apparatus 300 includes a CPU 301, a memory 302, a network interface 303, a speaker 304, a display 305, a microphone 306, a video conference client 307, a voiceprint generator 308, and an input device 309. The CPU 301 controls overall operations of the communication apparatus 300. The memory 302 stores computer programs and data used for operations of the communication apparatus 300. The network interface 303 is used to communicate with the network 100. The network interface 303 includes a SIM (Subscriber Identity Module) which has unique identification information such as IMSI (International Mobile Subscriber Identity). The speaker 304, the display 305, and the microphone 306 may be well-known components. The video conference client 307 controls a video conference provided by the video conference server 101. The operations of the video conference client 307 will be described in detail below. The video conference client 307 may be embodied in a form of software and be installed to the communication apparatus 300. The voiceprint generator 308 generates a voiceprint from voice data. The input device 309 receives input from a user of the communication apparatus and is embodied by a mouse, a keyboard, a keypad, a touch screen, and so on.
[0018] Fig. 4 illustrates an example of overall operations of the network side approach according to some embodiments. The CPU included in each device executes computer programs stored in memory of each device to process these operations. Each of the video conference controller 204 and the video conference client 307 may include separate units for each operation described below.
[0019] In step S401, when a video conference starts between the user 111 and the user 121, the video conference controller 204 establishes video conference sessions with the PCs 112 and 122. In step S402, the video conference controller 204 receives a voice call request from the PC 112 to the mobile phone 132. In step S403, the video conference controller 204 establishes a voice session with the mobile phone 132, in cooperation with an S-CSCF in the network 100.
[0020] In step S404, once the video conference controller 204 detects that a voice session is established with the mobile phone 132, the video conference controller 204 identifies one or more terminals associated with the mobile phone 132. The identified terminals are used as candidates that the user 131 may use to participate in the ongoing video conference. In this use case, since the PCs 133, 134, and 135 are associated with the mobile phone 132, the video conference controller 204 identifies these PCs as candidates.
[0021] The user 131 registers one or more candidates (in this use case, the PCs 133-135) to the network 100 such that the candidates are associated with the mobile phone 132 before the ongoing video conference starts. For example, the user 131 can register candidates when the user 131 subscribes to a video conference service provided by the network 100 and can update the candidates at any time after the registration. This registration can be performed in different ways. The user 131 may register and update candidates manually through a management console provided by the network 100, etc. The mobile phone 132 (more specifically, identification of the mobile phone 132) and candidates (more specifically, identification of the candidates) may be associated with each other via an IMS account of the user 131. Any component in the network 100 can manage this association. For example, the video conference server 101 may manage this association in its local storage (for example, the memory 202) and identify the candidates with reference to the local storage. Alternatively, an HSS in the network 100 may manage this association and the video conference server 101 may identify the candidates with reference to the HSS.
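The association described in paragraphs [0020] and [0021] can be pictured as a small lookup structure. The following Python sketch is illustrative only and is not taken from the application: the class name `CandidateRegistry`, its methods, and the identifiers such as `phone-132` are all hypothetical, and an in-memory dictionary stands in for the memory 202 or an HSS.

```python
from collections import defaultdict


class CandidateRegistry:
    """Hypothetical store linking a phone and candidate terminals via an IMS account."""

    def __init__(self):
        self._phone_to_account = {}           # phone identity -> IMS account
        self._candidates = defaultdict(set)   # IMS account -> candidate terminal identities

    def register_phone(self, account, phone_id):
        self._phone_to_account[phone_id] = account

    def register_candidate(self, account, terminal_id):
        self._candidates[account].add(terminal_id)

    def remove_candidate(self, account, terminal_id):
        # The user may update the candidates at any time after registration.
        self._candidates[account].discard(terminal_id)

    def candidates_for_phone(self, phone_id):
        """Return the candidates associated with the phone (identification step)."""
        account = self._phone_to_account.get(phone_id)
        if account is None:
            return []
        return sorted(self._candidates.get(account, ()))
```

With the PCs 133-135 registered under the same account as the mobile phone 132, a lookup by the phone identity would return all three terminals as candidates.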
[0022] In step S405, the video conference controller 204 remotely activates the video conference client 307 in each candidate and requests it to make an invitation to the ongoing video conference between the user 111 and the user 121. Since a candidate may be powered off when the user 131 answers the voice call in step S403, the video conference controller 204 may repeat step S405 for a certain duration (for example, while the voice session in step S403 is ongoing or while the video conference in step S401 is ongoing). This repetition makes it possible for the video conference client 307 in a candidate to be activated after the user 131 powers on the candidate.
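The repetition of step S405 can be sketched as a retry loop that keeps trying unreachable candidates while a session is still ongoing. This is a minimal sketch under stated assumptions: the function `activate_until_reached`, its parameters, and the retry interval are hypothetical, and the callables `try_activate` and `session_ongoing` stand in for signalling that the application does not specify.

```python
import time


def activate_until_reached(candidates, try_activate, session_ongoing,
                           interval_s=5.0, sleep=time.sleep):
    """Repeat activation for candidates that did not respond yet.

    try_activate(terminal) -> bool   (hypothetical: True once the client is activated)
    session_ongoing() -> bool        (hypothetical: voice or video session still up)
    """
    pending = set(candidates)
    activated = set()
    while pending and session_ongoing():
        for terminal in sorted(pending):
            if try_activate(terminal):
                activated.add(terminal)
        pending -= activated
        if pending:
            # Wait before retrying terminals that may still be powered off.
            sleep(interval_s)
    return activated
```

The loop ends either when every candidate has been activated or when the session that justifies the invitation is over, which mirrors the "certain duration" mentioned in the paragraph.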
[0023] In step S406, the video conference client 307 in each candidate presents an invitation so that the user 131 can easily participate in the ongoing video conference using the candidate. In an example, the video conference client 307 presents a pop-up window on the display 305 which allows the user 131 to participate in the ongoing video conference using a desired terminal (in this use case, the PC 133) simply by clicking an "OK" button or pressing an "Enter" key.
[0024] In step S407, the user 131 selects a candidate (the PC 133) from the candidates that the user 131 associated with the mobile phone 132, and the video conference client 307 in the PC 133 receives a user input to accept the invitation via the input device 309. In step S408, the video conference client 307 in the PC 133 sends the acceptance to the video conference server 101.
[0025] In step S409, the video conference controller 204 remotely deactivates the video conference client 307 in candidates which do not send acceptance (in this use case, the PCs 134 and 135). When the video conference client 307 is deactivated, the pop-up window for inviting the user 131 to the video conference is closed. This operation can reduce a risk that a person who has no relation with the video conference participates in the video conference.
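The handling of acceptances in steps S408 and S409 amounts to splitting the candidates into those to join and those whose client is to be deactivated. The short Python sketch below illustrates that split; the function name `split_on_acceptance` and the terminal identifiers are hypothetical.

```python
def split_on_acceptance(candidates, accepted):
    """Return (terminals to join, terminals whose client should be deactivated)."""
    accepted_set = set(accepted)
    to_join = [c for c in candidates if c in accepted_set]
    # Candidates that sent no acceptance get their pop-up window closed.
    to_deactivate = [c for c in candidates if c not in accepted_set]
    return to_join, to_deactivate
```

In the use case of Fig. 4, an acceptance from the PC 133 would leave the PCs 134 and 135 in the list to be deactivated.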
[0026] In step S410, the video conference controller 204 establishes a video conference session between the PC 133 and the video conference server 101 so that the user 131 can participate in the ongoing video conference using the PC 133.
[0027] After step S410, two sessions have been established for the user 131, that is, a voice session between the mobile phone 132 and the video conference server 101, and a video conference session between the PC 133 and the video conference server 101. Thus, voice from the user 131 is simultaneously picked up by both the mobile phone 132 and the PC 133 and sent to the video conference server 101 as two duplicate audio streams. If these audio streams are not controlled, they can cause sound problems such as audio feedback and acoustic echo. In order to suppress these sound problems, the video conference controller 204 may prioritize one of these audio streams. In some embodiments, the video conference controller 204 selects one of these audio streams depending on quality (for example, bit error rate (BER), voice volume, etc.) and sends the selected one to the participants of the video conference (in this case, the PCs 112 and 122). The video conference controller 204 may discard or lower the volume of the other audio stream which was not selected. Furthermore, the video conference controller 204 may transfer an audio stream from the PCs 112 and 122 to either the mobile phone 132 or the PC 133.
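The prioritization of duplicate audio streams can be illustrated with a simple selection rule. The sketch below is an assumption-laden example, not the method of the application: the `AudioStream` fields, the ordering rule (lower BER first, then higher voice level), and the gain values are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AudioStream:
    source: str            # hypothetical identifier of the sending terminal
    bit_error_rate: float  # lower is better
    voice_level: float     # normalized 0.0..1.0, higher is better


def select_primary(streams):
    """Pick one stream by quality: lowest BER, ties broken by higher voice level."""
    return min(streams, key=lambda s: (s.bit_error_rate, -s.voice_level))


def mix_gains(streams, attenuated_gain=0.0):
    """Gain per source: 1.0 for the selected stream, attenuated for the others.

    attenuated_gain=0.0 corresponds to discarding the other stream; a small
    positive value corresponds to lowering its volume.
    """
    primary = select_primary(streams)
    return {s.source: (1.0 if s is primary else attenuated_gain) for s in streams}
```

A real system would likely measure quality continuously and add hysteresis so that the selected stream does not flip back and forth; that is outside the scope of this sketch.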
[0028] Fig. 5 illustrates another example of overall operations of the network side approach according to some embodiments. The CPU included in each device executes computer programs stored in memory of each device to process these operations. In this example, steps S501 to S510 in Fig. 5 are performed instead of steps S405 to S409 in Fig. 4.
[0029] In the method in Fig. 4, even if the user 131 is not near the PC 135, the PC 135 displays in step S406 a pop-up window which enables a person to participate in the video conference without inputting information specific to the video conference. Thus, there is a risk that a person who has no relation with the video conference participates in the video conference. The operations in Fig. 5 can reduce such a risk.
[0030] In step S501, the video conference controller 204 remotely activates the video conference client 307 in each candidate and instructs it to send back a voiceprint. Step S501 may be repeated for a certain duration, similar to step S405. In step S502, when the video conference client 307 is activated, the voiceprint generator 308 in each candidate collects sound around the candidate using the microphone 306, extracts voice from the sound, and generates a voiceprint. In step S503, the video conference client 307 sends the generated voiceprint to the video conference server 101. The video conference client 307 may send the voiceprint to the video conference server 101 only when the user 131 instructs it to do so. If the user 131 does not accept transmission of a voiceprint or the voiceprint generator 308 cannot extract voice (for example, in a case that there is no person around the candidate), the video conference client 307 may send an error message to the video conference server 101.
[0031] In step S504, the video conference controller 204 narrows down the candidates based on the received voiceprints. Since the user 131 is talking with the user 111 via the mobile phone 132, the PCs 133 and 134, which are near the user 131, are assumed to receive voice of the user 131. Thus, the voiceprint sent to the video conference server 101 from the PCs 133 and 134 in step S503 should include a voiceprint of the user 131. The user 131 may make a speech on purpose so that the candidate can receive voice of the user 131.
[0032] The video conference controller 204 determines whether the voiceprint sent from each candidate matches a voiceprint of the user 131. The voiceprint of the user 131, which is used as a reference voiceprint in comparison, can be obtained in different ways. In some embodiments, the video conference controller 204 generates a voiceprint based on audio data received from the mobile phone 132 over the voice session established in step S403. In some other embodiments, the video conference controller 204 retrieves a voiceprint of the user 131 from an internal storage (for example, the memory 202) or an external server (for example, the personal network server 103), to which the user 131 previously registered his/her voiceprint associated with the mobile phone 132 or his/her IMS account. The user 131 may register his/her voiceprint when the user 131 subscribes to a video conference service provided by the video conference server 101.
[0033] When it is determined that a voiceprint sent from a candidate matches a voiceprint of the user 131, the video conference controller 204 maintains this candidate. When it is determined that a voiceprint sent from a candidate does not match a voiceprint of the user 131, the video conference controller 204 eliminates this candidate. In this case, since a voiceprint from the PC 135, which is not near the user 131, does not match a voiceprint of the user 131, the video conference controller 204 eliminates the PC 135 from the candidates identified in step S404. In other words, the video conference controller 204 narrows down the identified candidates to the PCs 133 and 134.
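The narrowing described in paragraphs [0032] and [0033] can be sketched as a filter over the received voiceprints. The application does not say how a voiceprint is represented or compared, so the example below makes loud assumptions: a voiceprint is modelled as a list of floats, matching is done by cosine similarity against a hypothetical threshold of 0.8, and `None` stands for a candidate that sent an error message instead of a voiceprint.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def narrow_candidates(reference, received, threshold=0.8):
    """Keep only candidates whose voiceprint matches the reference voiceprint.

    received maps a terminal identity to its voiceprint, or to None when the
    terminal reported an error (for example, no voice could be extracted).
    """
    kept = []
    for terminal, voiceprint in received.items():
        if voiceprint is None:
            continue  # no voiceprint available: eliminate this candidate
        if cosine_similarity(reference, voiceprint) >= threshold:
            kept.append(terminal)
    return sorted(kept)
```

With voiceprints from the PCs 133 and 134 close to the reference and no usable voiceprint from the PC 135, the function would return only the first two terminals, matching the outcome described above.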
[0034] After narrowing down the candidates, the video conference controller 204 may compare the audio stream from the mobile phone 132 with the audio stream from each of the candidates (the PCs 133 and 134) to ensure that the user 131 is near the candidates. In some embodiments, the video conference controller 204 may establish a voice session between the voice recognition server 102 and each candidate, fork the audio stream from the mobile phone 132, and route it to the voice recognition server 102. The voice recognition server 102 compares the audio stream from the mobile phone 132 with the audio stream from each candidate and returns the result to the video conference server 101. In accordance with the result from the voice recognition server 102, the video conference controller 204 may further narrow down the candidates.
[0035] In step S505, the video conference controller 204 remotely requests the video conference client 307 in each of the narrowed down candidates (the PCs 133 and 134) to present an invitation. In step S506, the video conference controller 204 remotely deactivates the video conference client 307 in a terminal which has been eliminated from the candidates (the PC 135). Steps S507 to S510 are similar to steps S406 to S409, respectively, and thus description thereof is omitted. According to the method in Fig. 5, a voiceprint is used to narrow down the candidates. These embodiments can suppress unnecessary audio streams from terminals to the video conference server 101 and preserve privacy of persons near the candidates.
[0036] Fig. 6 illustrates yet another example of overall operations of the network side approach according to some embodiments. The CPU included in each device executes computer programs stored in memory of each device to process these operations. In this example, steps S601 to S607 in Fig. 6 are performed instead of steps S405 to S409 in Fig. 4. The operations in Fig. 6 can reduce a risk that a person who has no relation with the video conference participates in the video conference.
[0037] In step S601, the video conference controller 204 remotely activates the video conference client 307 in each candidate. In step S602, when the video conference client 307 is activated, the voiceprint generator 308 in each candidate collects sound around the candidate using the microphone 306, extracts voice from the sound, and generates a voiceprint.
[0038] In step S603, the video conference client 307 compares the generated voiceprint with a reference voiceprint that is a voiceprint of the user 131. The video conference client 307 can obtain a voiceprint of the user 131 in different ways. In some embodiments, the video conference client 307 receives a voiceprint of the user 131 from the video conference server 101 along with the activation request in step S601. In some other embodiments, the video conference client 307 requests the user 131 to register his/her voiceprint on the terminal of the user 131 before starting a video conference service (for example, when the user 131 installs the video conference client 307 on the PCs 133-135) and retrieves this voiceprint of the user 131 from a local storage (for example, the memory 302).
[0039] When it is determined that the generated voiceprint matches a voiceprint of the user 131, the video conference client 307 (in this use case, those in the PCs 133 and 134) presents an invitation in step S604, similar to step S406. When it is determined that the generated voiceprint does not match a voiceprint of the user 131, the video conference client 307 (in this use case, that in the PC 135) does not present an invitation and may send an error message to the video conference server 101.
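The client-side decision of steps S603 and S604 can be sketched as a small function that returns the action the client takes. Everything named here is hypothetical: the function `handle_activation`, the action strings, and the `matches` callable, which stands in for whatever voiceprint comparison the client uses.

```python
def handle_activation(generated_voiceprint, reference_voiceprint, matches):
    """Decide locally whether to present the invitation.

    matches(generated, reference) -> bool is a caller-supplied comparison.
    Returns (action, detail); no voiceprint leaves the terminal in either case.
    """
    if generated_voiceprint is None:
        # No voice could be extracted from the sound around the terminal.
        return ("send_error", "no voice detected")
    if matches(generated_voiceprint, reference_voiceprint):
        return ("present_invitation", None)
    return ("send_error", "voiceprint mismatch")
```

Because the comparison runs on the terminal, only the outcome (an acceptance or an optional error message) needs to reach the server.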
[0040] Steps S605 to S607 are similar to steps S407 to S409, respectively, and thus description thereof is omitted. According to the embodiments described above, the candidates do not send a voiceprint and an audio stream to the video conference server 101 before the video conference session is established. These embodiments can suppress unnecessary audio streams from terminals to the network 100 and preserve privacy of persons near the candidates.
[0041] In addition to or instead of the comparison of voiceprints in S504 or S603, a location of each candidate may be compared with a location of the mobile phone 132. A location of the candidate may be identified by the personal network server 103 that keeps track of user's context, by information registered by the user 131 (for example, the user registers the location of the PC 133 to be associated with an address of his/her home), or by a coverage area (cell) in which the candidate is currently located. A location of the mobile phone 132 may be identified by location information obtained by a GPS (not shown) in the mobile phone 132, or by a coverage area (cell) in which the mobile phone 132 is currently located.
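The location comparison in paragraph [0041] can be sketched with two of the location sources it mentions: coordinates and a coverage area (cell). The example below is a sketch under stated assumptions; the dictionary keys `latlon` and `cell_id`, the 100 m radius, and the preference for coordinates over cell identity are illustrative choices, not details from the application. Distance is computed with the standard haversine formula.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def locations_match(phone, candidate, max_distance_m=100.0):
    """Compare locations using coordinates when both have them, else cell identity."""
    if phone.get("latlon") and candidate.get("latlon"):
        return haversine_m(*phone["latlon"], *candidate["latlon"]) <= max_distance_m
    if phone.get("cell_id") is not None and candidate.get("cell_id") is not None:
        return phone["cell_id"] == candidate["cell_id"]
    return False  # not enough information to claim a match
```

Returning `False` when no common location source exists is a conservative choice: a candidate is kept only when there is positive evidence that it is near the mobile phone.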
[0042] The network side approach described above has some advantages as follows. No local wireless link is necessary between the mobile phone 132 and the candidates (the PCs 133-135). The user 131 can use a conventional terminal (that is, a terminal which has only a voice call function) as the mobile phone 132. The user 131 can easily (for example, with a single click on the PC 133) participate in the ongoing video conference.
[0043] While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims

1. A server (101) for providing a video conference service, comprising:
a detection unit (204) configured to detect establishment of a voice session between a first communication apparatus (112) that is participating in a video conference and a second communication apparatus (132) that is not participating in the video conference;
an identification unit (204) configured to identify one or more candidate communication apparatuses (133-135) associated with the second communication apparatus; and
a video conference control unit (204) configured to request at least one (133, 134) of the identified candidate communication apparatuses to participate in the video conference.
2. The server according to claim 1, further comprising
a voiceprint determination unit (204) configured to determine whether a voiceprint obtained from voice collected by a microphone of each of the identified candidate communication apparatuses matches a voiceprint of a user of the second communication apparatus, wherein
the video conference control unit is configured to request a candidate communication apparatus whose voiceprint matches the voiceprint of the user to participate in the video conference.
3. The server according to claim 1 or 2, further comprising
a location determination unit (204) configured to determine whether a location of each of the identified candidate communication apparatuses matches a location of the second communication apparatus, wherein
the video conference control unit is configured to request a candidate communication apparatus whose location matches the location of the second communication apparatus to participate in the video conference.
4. The server according to any one of claims 1-3, wherein
the video conference control unit is further configured to establish a video conference session between the first communication apparatus and a candidate communication apparatus which sends an acceptance in response to the request, such that the candidate communication apparatus can participate in the video conference.
5. The server according to claim 4, wherein
the video conference control unit is further configured to prioritize either of audio from the second communication apparatus or audio from the candidate communication apparatus that is participating in the video conference.
6. The server according to any one of claims 1-5, wherein
the server is in an IMS network (100);
the server further comprises a storage unit (202) configured to store an IMS account of the second communication apparatus and identification information of one or more candidate communication apparatus such that the IMS account is associated with the identification information of the one or more candidate communication apparatus; and
the identification unit is configured to identify one or more candidate communication apparatuses with reference to the storage unit.
7. A communication apparatus (300) for use in a video conference, comprising:
a microphone (306);
a communication unit (303) configured to receive a request to participate in a video conference from a server (101) that provides a video conference service;
an obtaining unit (308) configured to obtain a voiceprint from voice collected from the microphone; and
a presentation unit (307) configured to present a user interface for receiving an acceptance to the request when the obtained voiceprint matches a reference voiceprint.
8. The communication apparatus according to claim 7, wherein
the communication unit is further configured to send the obtained voiceprint to the server and receive a response indicating whether the obtained voiceprint matches the reference voiceprint.
9. The communication apparatus according to claim 7, further comprising
a comparison unit (307) configured to compare the obtained voiceprint with the reference voiceprint.
PCT/SE2012/051418 2012-12-18 2012-12-18 A server and a communication apparatus for videoconferencing WO2014098661A1 (en)

Publications (1)

Publication Number Publication Date
WO2014098661A1 true WO2014098661A1 (en) 2014-06-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12818835

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12818835

Country of ref document: EP

Kind code of ref document: A1