WO2001019066A1 - Method, system and software product for transmitting speech on internet - Google Patents

Info

Publication number
WO2001019066A1
Authority
WO
WIPO (PCT)
Prior art keywords
software
speech
file
internet
data
Application number
PCT/FI2000/000759
Other languages
French (fr)
Inventor
Paavo Eskelinen
Original Assignee
Voxlab Oy
Application filed by Voxlab Oy
Priority to AU70042/00A
Publication of WO2001019066A1

Classifications

    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H04L12/6418 Hybrid transport
    • H04L67/04 Protocols specially adapted for terminals or networks with limited capabilities; specially adapted for terminal portability
    • H04L67/34 Network arrangements or protocols for supporting network services or applications involving the movement of software or configuration parameters
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H04M7/0072 Speech codec negotiation
    • H04M7/1245 Arrangements for interconnection between switching centres for working between exchanges having different types of switching equipment, e.g. power-driven and step by step or decimal and non-decimal where the types of switching equipment comprises PSTN/ISDN equipment and switching equipment of networks other than PSTN/ISDN, e.g. Internet Protocol networks where a network other than PSTN/ISDN interconnects two PSTN/ISDN networks
    • H04L2012/6472 Internet
    • H04L2012/6481 Speech, voice
    • H04L69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • H04M1/2535 Telephone sets using digital voice transmission adapted for voice communication over an Internet Protocol [IP] network

Definitions

  • the invention relates to a method for transmitting speech on the Internet, particularly data containing compressed speech.
  • a call can be implemented using the Internet.
  • there is no circuit-switched connection of fixed data transfer capacity reserved for the call but speech is transferred in packets using a packet-switched data transfer connection available through the Internet.
  • One way of minimizing the delays is to minimize the amount of data to be transferred.
  • PCM (Pulse Code Modulation)
  • Efficient speech codecs have been developed for Internet calls, as well as for mobile networks, to allow the amount of data to be transferred to be minimized.
  • the term 'speech codec' already indicates that the codec is responsible for coding and decoding speech.
  • the specification H.323 of the ITU defines a speech codec functioning at the rate of 5.3 kbit/s.
  • the audio quality of speech produced by this codec is not as good as the quality of speech produced using the PCM.
  • the manufacturers are therefore continuously developing new, more efficient codecs producing audio of increasingly higher quality.
  • the codecs are in fact usually different compression and decompression methods used for packing digitized speech.
  • the method in question is used for transmitting speech on the Internet and it comprises: compressing data containing digitized speech into a file at a transmission end by using compression software.
  • the method further comprises: arranging into the file decompression software for decompressing the compression employed; dividing the file into data packets; transmitting the packets to a recipient over the Internet; decompressing the received compressed data at the reception end by using the received decompression software.
  • the invention further relates to a system for transferring speech on the Internet.
  • the system comprises: compression software at transmission end equipment for compressing data containing digitized speech into a file and for arranging into the file decompression software to be used for decompressing the compression employed; packet transmission software at the transmission end equipment for dividing the file into data packets and for transmitting the packets to the recipient over the Internet; packet transmission software at a reception end for receiving the packets and for assembling the transmitted file from the packets; the reception end being arranged to decompress the received compressed data by using the received decompression software.
  • the invention still further relates to a computer software product for transmitting speech on the Internet, the product comprising software stored into a software storage means and readable into a computer.
  • the software carries out the method steps of: compressing data containing digitized speech into a file at a transmission end by using compression software; arranging into the file decompression software decompressing the compression employed, the recipient using the software at the reception end to decompress the received compressed data; dividing the file into data packets; transmitting the packets to the recipient over the Internet.
  • the invention is based on the idea that the recipient of a call does not need to worry about obtaining the software for decompressing compressed data packets, but the software is delivered into the recipient's equipment together with the speech packets.
  • the method and system of the invention provide many advantages.
  • the invention allows speech to be transmitted on the Internet flexibly by using diverse speech codecs, without the drawbacks involved in the prior art.
  • the user does not necessarily need to know the technical details, but s/he may be content that the speech quality and the costs meet his/her requirements.
  • the recipient of speech coded into packets does not need any decoding system to be able to listen to the speech, only a standard computer provided with standard user interfaces and sound cards, or even just an ordinary telephone.
  • Figures 1A and 1B illustrate different ways of setting up an Internet call
  • Figure 2 illustrates an example of the content of the data packets to be transmitted
  • Figure 3A illustrates an example of a structure of transmission end equipment
  • Figure 3B illustrates an example of a structure of reception end equipment
  • Figure 4 is a flow chart illustrating a method for transmitting speech on the Internet.
  • FIG. 1A illustrates the first method.
  • a user 100 typically has a computer 104 which includes a microphone, a sound card and the software needed to convert speech 102 into data. The conversion is made using a speech codec, which can also be used for compressing data. Speech data is transmitted in packets over the Internet 106 to a recipient 112, using for example TCP/IP as the data transfer protocol.
  • the user's 100 computer 104 and the recipient's 112 computer 108 must be connected to the Internet 106 over an interface 140.
  • the interface 140 to the Internet 106 may be implemented in any prior art manner.
  • a typical way is to use the modem connected to the computer 104 for setting up a public switched connection through a telephone exchange to the server of the Internet service provider. Other alternatives are to use a fixed connection, a cable provided by a cable television network or a wireless radio connection.
  • a characteristic of the first method is that the speech is coded and converted into packet format in the user's equipment, prior to the Internet interface 140.
  • the server delivers the packets to the recipient's 112 computer 108 which also comprises a speech codec that converts the packets back to speech 110 which is played to the recipient 112 using the sound card and loudspeakers of the computer 108.
  • the above example provides only one alternative for implementing an Internet call; current technology already makes other solutions possible as well.
  • FIG. 1B illustrates another method for implementing an Internet call.
  • the user 100 has a standard analog or digital telephone 114 at his/her disposal.
  • the speech codec of an ordinary analog telephone is located in a switching centre where the speech signal is supplied in an analog form to the speech codec which converts it into a digital form.
  • An ISDN telephone comprises a built-in speech codec, therefore the signal supplied to the switching centre is in a digital form.
  • So-called PBX exchanges i.e. private exchanges of companies, may comprise a speech codec, in which case the connection from the PBX to the switching centre is digital, or the connection may be analog, in which case the speech codec is located in the switching centre.
  • By calling a specific number with the telephone 114, or by selecting a specific network identifier, the call is connected to a server 118 of an operator providing Internet services. The user 100 is then given the dial tone again and s/he may select the number of the person 112 s/he wishes to call.
  • the speech codec can thus be located for example in a fixed network switching centre or at the service provider's server.
  • the compression software employed and the software for packet-switched transmission are preferably located at the operator's server.
  • Compressed packets are then delivered over the Internet 106 to the recipient 112 of the call. Speech travels on an interface 152 in an uncoded form, on an interface 154 it travels either coded or uncoded, depending on the location of the speech codec, and on an interface 156 coded packets are transmitted.
  • each user may have personal preferences or financial reasons, for example, for preferring a specific speech codec.
  • the recipient of the call must have the same speech codec as the party initiating the call at his/her disposal, or at least its decoding portion, otherwise the speech coding cannot be decoded at the reception end.
  • the speech codec is then transmitted to the recipient by packing it already at the transmitter 100 end into a file comprising compressed speech data.
  • the file is transmitted in a packet-switched format, i.e. it is divided into data packets, over the Internet to the recipient.
  • the computer 108 of the recipient 112 then only releases the speech codec (or just a speech decoder) from the packets, installs it and starts to decompress speech from the file consisting of the packets.
  • Another alternative is to run the speech codec only when a speech file is to be listened to.
  • FIG. 2 illustrates an example of the content of the data packets to be transmitted.
  • the rectangular areas depict the packets.
  • details required by the data transfer protocol employed are not shown, but only the payload to be transferred in the packets.
  • the upper part of the example illustrates the transmission of three packets.
  • Two of the packets comprise only compressed speech data 200C, 200B, one packet comprising both compressed speech data 200A and decompression software 202.
  • the lower example shown in the Figure is otherwise the same as the upper one, except for the decompression software 202A, 202B, which is now longer and therefore requires the transfer capacity of two packets. In practice there are usually more packets than those shown in the Figure.
  • Figure 4 is a flow diagram illustrating the method of the invention.
  • the execution of the method starts at block 400.
  • data containing digitized speech is compressed at the transmission end into a file by using compression software.
  • decompression software for decompressing the compression used is added into the file that comprises the compressed data.
  • the file is divided into data packets, for example as shown in Figure 2, decompression software decompressing the compression employed being placed at least in one packet containing compressed data, or into some other packet.
  • the packets are transmitted to the recipient over the Internet.
  • the compressed data is decompressed using the received decompression software.
  • the decompression software installed at the reception end is not made as a permanent part of the reception end equipment, but the software is only carried out in the reception end processor when a received file containing speech is to be listened to.
  • the decompression software is run in the random access memory of the recipient's equipment for decompressing data packets or a compiled data file, the speech data obtained being then played to the recipient. When the data has been listened to, the decompression software is not left permanently installed into the recipient's equipment. Another option is to store a decompression software file at the reception end.
  • Compressed speech data can also be stored as a file at the reception end. In that case the recipient in a way records the call, i.e. s/he may listen again at least to the calling party's portion.
  • the decompression software and the compressed speech data are usually both stored into the same file at the reception end, which makes it easy to arrange the message to be played.
  • software compressing speech data is arranged into at least one packet. This allows also the recipient of the call to use the compression used by the caller, either during the same call or later.
  • Voice mail is unidirectional communication that does not take place in real-time.
  • Figure 3A shows the transmission end equipment and Figure 3B the reception end equipment.
  • a computer 104, 108 comprises a display 300, keyboard 302, mouse 304, sound card 314, at least one loudspeaker 306 connected to the sound card, a microphone 308 connected to the sound card, a device providing access to the Internet, e.g. a modem 310, a mass memory device, such as a hard disk 312, and a central processing unit 320.
  • the central processing unit 320 is used for carrying out the operating system and the application software. From the point of view of the invention, the most important application software is the one providing an interface that allows the sound card 314 to be used. This software can be used for playing for example .wav files and for recording .wav files.
  • the computer comprises telecommunications means, together with the software involved, such as the modem 310 and packet transmission software 324, that can be used for establishing a connection for example through the public switched telephone network to the server of the Internet service provider.
  • the Windows environment, for example, includes a specific sound reproduction service, MCI, that provides an API (Application Programmer's Interface) speech codec for use.
  • the API is always the same, irrespective of the sound card, because there are a plural number of different drivers in the Windows environment for different sound cards.
  • the MCI alone cannot, however, be used as a speech codec because it is not efficient enough: when voice is to be created, the amount of data may even double.
  • the software comprises compression software 326 at least at the transmission end for packing data comprising voice.
  • the transmission end must comprise decompression software 202 which is arranged into the packets together with the speech data 200.
  • the speech data 200 and the decompression software 202 decompressing the speech data are received from the network through the telecommunications means 310; the decompression software then decompresses the packets, plays them with a player 308, the sound being created in the sound card 314 and transmitted to the loudspeaker 306 which reproduces the sound.
  • voice mail may be created by recording speech with the microphone 308 and the sound card 314, in which case the compression software 326 packs the sound into a file.
  • the file is converted into packets using packet transmission software 324 and transmitted into the network using the telecommunications means 310.
  • the file is provided with the decompression software 202.
  • the invention requires an efficient speech codec that can be accommodated into a small space and implemented for example as a Java class, which allows the amount of data needed for transmitting the codec to be minimized.
  • the computer 104 must then naturally comprise a support for the Java.
  • the Java is not, however, the only technology available, but the speech codec can also be implemented by applying other prior art means.
  • the speech codec may also be located in a telephone answering machine, in a device of the Nokia Communicator-type, in a PDA (Personal Digital Assistant), etc.
  • the telephone connection can be established using an analog or a digital telephone connection, as described above, or over the radio path, using a mobile telephone, for example, or over a cable television network, a wireless subscriber connection (Wireless Local Loop), etc.

Abstract

The invention relates to a method, system and software product for transmitting speech on the Internet. The method comprises: (402) compressing data containing digitized speech into a file at the transmission end by means of compression software; (404) arranging into the file decompression software for decompressing the compression employed; (406) dividing the file into data packets; (408) transmitting the packets to the recipient over the Internet; (410) decompressing at the reception end the received compressed data by using the received decompression software.
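Steps 402, 404 and 410 can be illustrated with a minimal sketch. It is not the applicant's implementation: zlib stands in for a real speech codec, and the file layout (a 4-byte length prefix, then the decoder, then the compressed data), the names build_file and open_file, and the decoder entry point decode are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical decoder shipped inside the file. zlib stands in for a speech
# codec; the entry point name `decode` is an assumption of this sketch.
DECODER_SOURCE = (
    b"import zlib\n"
    b"def decode(data):\n"
    b"    return zlib.decompress(data)\n"
)


def build_file(speech: bytes) -> bytes:
    """Steps 402 and 404: compress the speech and add the decoder to the file."""
    compressed = zlib.compress(speech)
    # A 4-byte big-endian length prefix marks where the decoder ends.
    return struct.pack(">I", len(DECODER_SOURCE)) + DECODER_SOURCE + compressed


def open_file(blob: bytes) -> bytes:
    """Step 410: decompress the data using the decoder received in the file."""
    (decoder_len,) = struct.unpack(">I", blob[:4])
    decoder_source = blob[4:4 + decoder_len]
    compressed = blob[4 + decoder_len:]
    namespace = {}
    # WARNING: running received code is unsafe without sandboxing or signing.
    exec(decoder_source.decode("utf-8"), namespace)
    return namespace["decode"](compressed)


if __name__ == "__main__":
    speech = bytes(range(256)) * 32  # placeholder for digitized speech samples
    assert open_file(build_file(speech)) == speech
```

The decoder is loaded into a temporary namespace and discarded afterwards, so nothing is installed permanently at the reception end.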

Description

METHOD, SYSTEM AND SOFTWARE PRODUCT FOR TRANSMITTING SPEECH ON INTERNET
FIELD OF THE INVENTION
The invention relates to a method for transmitting speech on the Internet, particularly data containing compressed speech.
BACKGROUND OF THE INVENTION
In addition to the public switched telephone network, a call can be implemented using the Internet. In such case there is no circuit-switched connection of fixed data transfer capacity reserved for the call, but speech is transferred in packets using a packet-switched data transfer connection available through the Internet. Significant problems arise from delays that may occur in the data transfer. One way of minimizing the delays is to minimize the amount of data to be transferred.
In telephones connected to the public switched telephone network, speech is coded using PCM (Pulse Code Modulation). PCM is a three-phase process where speech is first sampled at a rate which is twice the highest frequency, i.e. 2 x 4000 Hz = 8000 Hz. The samples are quantized to 256 separate levels. Finally each quantized sample is coded to provide an 8-bit code word, whereby 8000 samples x 8 bits/sample = 64 000 bit/s, i.e. the transfer rate required by a standard telephone connection is 64 kbit/s. In practice the coding only means that the samples are digitized.
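The PCM arithmetic above can be checked with a few lines of code. The function name pcm_bit_rate is introduced here for illustration only.

```python
def pcm_bit_rate(highest_frequency_hz: int, quantization_levels: int) -> int:
    """Return the PCM bit rate in bit/s for the given bandwidth and level count."""
    sample_rate_hz = 2 * highest_frequency_hz  # sample at twice the highest frequency
    bits_per_sample = (quantization_levels - 1).bit_length()  # 256 levels -> 8 bits
    return sample_rate_hz * bits_per_sample


if __name__ == "__main__":
    # 2 x 4000 Hz = 8000 samples/s, 8 bits each
    print(pcm_bit_rate(4000, 256))  # prints 64000
```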
Efficient speech codecs have been developed for Internet calls, as well as for mobile networks, to allow the amount of data to be transferred to be minimized. The term 'speech codec' already indicates that the codec is responsible for coding and decoding speech. The specification H.323 of the ITU (International Telecommunication Union), for example, defines a speech codec functioning at the rate of 5.3 kbit/s. However, the audio quality of speech produced by this codec is not as good as the quality of speech produced using PCM. The manufacturers are therefore continuously developing new, more efficient codecs producing audio of increasingly higher quality. The codecs are in fact usually different compression and decompression methods used for packing digitized speech.
The use of proprietary speech codecs involves major problems related to the incompatibility of the speech codecs of different manufacturers; in other words, data compressed with one speech codec cannot be decoded with another speech codec. The reason for this is that the speech codecs of different manufacturers employ different voice compression algorithms, which are usually implemented by software. A prior art solution to the latter problem is that the recipient fetches a speech codec, or at least software for decompressing the compression employed, from the manufacturer's WWW (World Wide Web) pages. This solution involves some drawbacks. The software must be fetched before a connection is activated, but how can the recipient predict an incoming call and, above all, the speech codec that has been used in the call setup? Another problem is that the user usually must pay for the software, unless the software in question is what is known as freeware or shareware, for which payment is usually expected as well.
BRIEF DESCRIPTION OF THE INVENTION
It is therefore an object of the present invention to provide a method and an equipment implementing the method which allow the above problems to be solved. This is achieved with the method described below. The method in question is used for transmitting speech on the Internet and it comprises: compressing data containing digitized speech into a file at a transmission end by using compression software. The method further comprises: arranging into the file decompression software for decompressing the compression employed; dividing the file into data packets; transmitting the packets to a recipient over the Internet; decompressing the received compressed data at the reception end by using the received decompression software.
The invention further relates to a system for transferring speech on the Internet. The system comprises: compression software at transmission end equipment for compressing data containing digitized speech into a file and for arranging into the file decompression software to be used for decompressing the compression employed; packet transmission software at the transmission end equipment for dividing the file into data packets and for transmitting the packets to the recipient over the Internet; packet transmission software at a reception end for receiving the packets and for assembling the transmitted file from the packets; the reception end being arranged to decompress the received compressed data by using the received decompression software.
The invention still further relates to a computer software product for transmitting speech on the Internet, the product comprising software stored into a software storage means and readable into a computer.
The software carries out the method steps of: compressing data containing digitized speech into a file at a transmission end by using compression software; arranging into the file decompression software decompressing the compression employed, the recipient using the software at the reception end to decompress the received compressed data; dividing the file into data packets; transmitting the packets to the recipient over the Internet. The preferred embodiments of the invention are disclosed in the dependent claims.
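The role of the packet transmission software at both ends (dividing the file into packets and assembling it again at the reception end) might be sketched as follows. The payload size, the use of sequence numbers and the function names are assumptions; the source does not specify a packet format, and a transport such as TCP would already deliver data in order.

```python
import random

PAYLOAD_SIZE = 512  # assumed payload size in bytes; not specified in the source


def to_packets(file_bytes: bytes, size: int = PAYLOAD_SIZE) -> list:
    """Divide the file into (sequence number, payload) packets."""
    return [
        (seq, file_bytes[offset:offset + size])
        for seq, offset in enumerate(range(0, len(file_bytes), size))
    ]


def from_packets(packets: list) -> bytes:
    """Assemble the transmitted file from packets received in any order."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(payload for _, payload in ordered)


if __name__ == "__main__":
    file_bytes = bytes(1300)      # placeholder for decoder plus compressed speech
    packets = to_packets(file_bytes)
    random.shuffle(packets)       # simulate out-of-order arrival
    assert from_packets(packets) == file_bytes
```

Numbering the packets lets the reception end rebuild the file even if the transport does not preserve order.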
The invention is based on the idea that the recipient of a call does not need to worry about obtaining the software for decompressing compressed data packets, but the software is delivered into the recipient's equipment together with the speech packets.
The method and system of the invention provide many advantages. The invention allows speech to be transmitted on the Internet flexibly by using diverse speech codecs, without the drawbacks involved in the prior art. The user does not necessarily need to know the technical details, but s/he may be content that the speech quality and the costs meet his/her requirements. The recipient of speech coded into packets does not need any decoding system to be able to listen to the speech, only a standard computer provided with standard user interfaces and sound cards, or even just an ordinary telephone.
BRIEF DESCRIPTION OF THE DRAWINGS
In the following the invention will be described in greater detail in connection with preferred embodiments and with reference to the accompanying drawings, in which
Figures 1A and 1B illustrate different ways of setting up an Internet call;
Figure 2 illustrates an example of the content of the data packets to be transmitted;
Figure 3A illustrates an example of a structure of transmission end equipment;
Figure 3B illustrates an example of a structure of reception end equipment;
Figure 4 is a flow chart illustrating a method for transmitting speech on the Internet.
DETAILED DESCRIPTION OF THE INVENTION
There are different ways of transmitting speech on the Internet. In the following, two currently used methods will be described by way of example, the invention not being, however, restricted to them.
Figure 1A illustrates the first method. A user 100 typically has a computer 104 which includes a microphone, a sound card and the software needed to convert speech 102 into data. The conversion is made using a speech codec, which can also be used for compressing data. Speech data is transmitted in packets over the Internet 106 to a recipient 112, using for example TCP/IP as the data transfer protocol. In other words, the user's 100 computer 104 and the recipient's 112 computer 108 must be connected to the Internet 106 over an interface 140. The interface 140 to the Internet 106 may be implemented in any prior art manner. A typical way is to use the modem connected to the computer 104 for setting up a public switched connection through a telephone exchange to the server of the Internet service provider. Other alternatives are to use a fixed connection, a cable provided by a cable television network or a wireless radio connection.
A characteristic of the first method is that the speech is coded and converted into packet format in the user's equipment, prior to the Internet interface 140. The server delivers the packets to the recipient's 112 computer 108 which also comprises a speech codec that converts the packets back to speech 110 which is played to the recipient 112 using the sound card and loudspeakers of the computer 108. The above example provides only one alternative for implementing an Internet call; current technology already makes other solutions possible as well.
Figure 1B illustrates another method for implementing an Internet call. The user 100 has a standard analog or digital telephone 114 at his/her disposal. The speech codec of an ordinary analog telephone is located in a switching centre where the speech signal is supplied in an analog form to the speech codec which converts it into a digital form. An ISDN telephone comprises a built-in speech codec, therefore the signal supplied to the switching centre is in a digital form. So-called PBX exchanges, i.e. private exchanges of companies, may comprise a speech codec, in which case the connection from the PBX to the switching centre is digital, or the connection may be analog, in which case the speech codec is located in the switching centre. By calling a specific number with the telephone 114, or by selecting a specific network identifier, the call is connected to a server 118 of an operator providing Internet services. The user 100 is then given the dial tone again and s/he may select the number of the person 112 s/he wishes to call.
The speech codec can thus be located for example in a fixed network switching centre or at the service provider's server. The compression software employed and the software for packet-switched transmission, however, are preferably located at the operator's server. Compressed packets are then delivered over the Internet 106 to the recipient 112 of the call. Speech travels on an interface 152 in an uncoded form, on an interface 154 it travels either coded or uncoded, depending on the location of the speech codec, and on an interface 156 coded packets are transmitted.
It is naturally also possible that the speech codec and the software for packet-switched transmission are in the user's 100 telephone 114, the packets being thus formed already in the telephone 114. In that case the local connection of the user 100 must be digital. Consequently, packets do not need to be formed at the server 118 of the operator providing Internet services, but they only need to be transmitted to the Internet.
The systems illustrated by way of example in Figures 1A and 1B can also be combined. With equipment integrated into the computer 104, the user can call another user over the Internet 106, the other user having a connection through the server 120 of the service provider to his/her telephone 124 over the public switched telephone network 122, as shown in Figure 1B.
It can be anticipated that in the future telephone services will be charged according to the number of bits transferred, instead of the time the line is occupied, as in current billing. An interesting issue from the user's point of view will therefore be the efficiency of speech coding: the more efficient the speech coding is, the less transfer capacity is needed for transferring the call, and the lower the call charge will be. It is possible that in the future the party making the call may select the speech codec s/he wishes to use. For important calls the user selects a speech codec that ensures good audio quality but requires a large amount of transfer capacity. For less important calls the user selects a speech codec that ensures understandable speech. In practice, current speech codecs of fixed network telephones are chips, in mobile telephones they are provided by an optimized digital signal processor with the related software, and in Internet calls by a standard computer processor, the speech codec being implemented by software alone, without any special circuits.
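The relationship between codec bit rate and a volume-based charge is simple arithmetic. The following Python sketch assumes a constant-bit-rate codec and ignores packet headers; the two bit rates and the price per megabit are illustrative assumptions, not values taken from this description.

```python
def call_bits(bit_rate_bps: int, duration_s: int) -> int:
    """Payload bits produced by a constant-bit-rate codec (packet headers ignored)."""
    return bit_rate_bps * duration_s


def call_charge(bit_rate_bps: int, duration_s: int, price_per_megabit: float) -> float:
    """Charge under a hypothetical tariff billed per transferred megabit."""
    return call_bits(bit_rate_bps, duration_s) / 1_000_000 * price_per_megabit


if __name__ == "__main__":
    duration = 180  # a three-minute call, in seconds
    # Both bit rates and the tariff of 0.01 per megabit are assumed values.
    for label, rate in (("high quality", 64_000), ("understandable", 8_000)):
        print(label, call_bits(rate, duration), "bits,",
              "charge", call_charge(rate, duration, 0.01))
```

With these assumed figures a three-minute call needs 11.52 Mbit at 64 kbit/s and 1.44 Mbit at 8 kbit/s, so the lower-rate codec would cost one eighth as much under per-bit billing.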
Consequently, there may be any number of different speech codecs available in the future. Each user may have personal preferences or financial reasons, for example, for preferring a specific speech codec. However, the recipient of the call must have the same speech codec as the party initiating the call at his/her disposal, or at least its decoding portion, otherwise the speech coding cannot be decoded at the reception end.
There are currently different standardization bodies, such as the ETSI and the ITU, which set standards for example for accepted speech codecs. A problem that arises is therefore whether a speech codec used by a user is accepted by the standardization body, in which case the speech codecs used by users support it. If the speech codec is not accepted, it is not necessarily supported either, which makes it unsuitable for large scale use. Or, if the user wishes to use a specific speech codec, s/he must make sure in advance that the recipient of the call also has the speech codec in question at his/her disposal.
In Internet calls the speech codec is then transmitted to the recipient by packing it already at the transmitter 100 end into a file comprising compressed speech data. The file is transmitted in a packet-switched format, i.e. it is divided into data packets, over the Internet to the recipient. The computer 108 of the recipient 112 then only extracts the speech codec (or just a speech decoder) from the packets, installs it and starts to decompress speech from the file consisting of the packets. Another alternative is to run the speech codec only when a speech file is to be listened to.
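The idea of carrying the decoder inside the speech file can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described here: zlib stands in for a real speech codec, the decoder travels as Python source text, and the 4-byte length prefix and the names DECODER_SOURCE, build_file and play_file are hypothetical. Executing received code is only reasonable when the sender is trusted.

```python
import zlib

# Hypothetical decoder shipped with every file; zlib stands in for a speech codec.
DECODER_SOURCE = (
    b"import zlib\n"
    b"def decode(data):\n"
    b"    return zlib.decompress(data)\n"
)


def build_file(speech: bytes) -> bytes:
    """Transmission end: compress the speech and place the decoder in the same file."""
    compressed = zlib.compress(speech)
    prefix = len(DECODER_SOURCE).to_bytes(4, "big")  # assumed length prefix
    return prefix + DECODER_SOURCE + compressed


def play_file(file_bytes: bytes) -> bytes:
    """Reception end: extract the decoder, run it in memory, return the speech."""
    decoder_len = int.from_bytes(file_bytes[:4], "big")
    source = file_bytes[4:4 + decoder_len]
    compressed = file_bytes[4 + decoder_len:]
    namespace = {}
    exec(source.decode("utf-8"), namespace)  # decoder lives only in this dictionary
    return namespace["decode"](compressed)
```

Because the decoder is executed from a dictionary in memory and then discarded, the sketch corresponds to the alternative of running the codec only when a speech file is listened to.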
Figure 2 illustrates an example of the content of the data packets to be transmitted. The rectangular areas depict the packets. For the sake of clarity, details required by the data transfer protocol employed are not shown, but only the payload to be transferred in the packets.
The upper part of the example illustrates the transmission of three packets. Two of the packets comprise only compressed speech data 200C, 200B, one packet comprising both compressed speech data 200A and decompression software 202. The lower example shown in the Figure is otherwise the same as the upper one, except for the decompression software 202A, 202B, which is now longer and therefore requires the transfer capacity of two packets. In practice there are usually more packets than those shown in the Figure.
Figure 4 is a flow diagram illustrating the method of the invention.
The execution of the method starts at block 400. In block 402 data containing digitized speech is compressed at the transmission end into a file by using compression software. Next, in block 404, decompression software for decompressing the compression used is added into the file that comprises the compressed data.
In block 406 the file is divided into data packets, for example as shown in Figure 2, decompression software decompressing the compression employed being placed at least in one packet containing compressed data, or into some other packet. In block 408 the packets are transmitted to the recipient over the Internet. At the reception end in block 410, the compressed data is decompressed using the received decompression software.
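The division and reassembly steps of blocks 406 to 410 can be illustrated with a short Python sketch that treats the file as opaque bytes. The sequence number attached to each payload is an assumption added for the example, since protocol details are deliberately left out of Figure 2, and the helper names and the placement of the decoder at the start of the file are hypothetical.

```python
from typing import List, Tuple

Packet = Tuple[int, bytes]  # (sequence number, payload)


def divide(file_bytes: bytes, payload_size: int) -> List[Packet]:
    """Block 406: cut the file into numbered payloads of at most payload_size bytes."""
    return [
        (seq, file_bytes[start:start + payload_size])
        for seq, start in enumerate(range(0, len(file_bytes), payload_size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Reception end: restore the file even if packets arrive out of order."""
    return b"".join(payload for _, payload in sorted(packets))


def packets_with_decoder(decoder_len: int, payload_size: int) -> List[int]:
    """Sequence numbers of the packets that carry decoder bytes, assuming the
    decoder occupies the first decoder_len bytes of the file."""
    last = (decoder_len - 1) // payload_size
    return list(range(last + 1))
```

With an assumed 40-byte payload, a 30-byte decoder shares the first packet with speech data, as in the upper part of Figure 2, while a 60-byte decoder spills into a second packet, as in the lower part.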
In an embodiment, the decompression software installed at the reception end is not made as a permanent part of the reception end equipment, but the software is only carried out in the reception end processor when a received file containing speech is to be listened to. This provides the advantage that the listening of a speech message or a call does not cause permanent changes to the software in the recipient's equipment. The decompression software is run in the random access memory of the recipient's equipment for decompressing data packets or a compiled data file, the speech data obtained being then played to the recipient. When the data has been listened to, the decompression software is not left permanently installed into the recipient's equipment. Another option is to store a decompression software file at the reception end. This may provide an advantage in that a user using the same compression software would not need to supply the decompression software any more to the user in question. Compressed speech data can also be stored as a file at the reception end. In that case the recipient in a way records the call, i.e. s/he may listen again at least to the calling party's portion. The decompression software and the compressed speech data are usually both stored into the same file at the reception end, which makes it easy to arrange the message to be played. In a preferred embodiment, software compressing speech data is arranged into at least one packet. This allows also the recipient of the call to use the compression used by the caller, either during the same call or later.
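The option of keeping the decompression software at the reception end can be sketched as a small lookup table in Python. Identifying a decoder by its SHA-256 digest, as well as the class and function names, are assumptions made for this illustration; the in-memory dictionary stands in for a stored decoder file.

```python
import hashlib
from typing import Dict, Optional


class DecoderStore:
    """Keeps received decoders so that later senders need not resend them."""

    def __init__(self) -> None:
        self._decoders: Dict[str, bytes] = {}

    def remember(self, decoder: bytes) -> str:
        """Store a decoder and return the digest that identifies it."""
        digest = hashlib.sha256(decoder).hexdigest()
        self._decoders[digest] = decoder
        return digest

    def lookup(self, digest: str) -> Optional[bytes]:
        """Return the stored decoder, or None if it has never been received."""
        return self._decoders.get(digest)


def must_send_decoder(store: DecoderStore, digest: str) -> bool:
    """True when the recipient lacks the decoder and it has to travel with the file."""
    return store.lookup(digest) is None
```

A recipient that never calls remember behaves like the first embodiment, in which the decoder is discarded after playback and has to accompany every file.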
Although the examples illustrate the implementing of an Internet call, the system can also be used for carrying out other speech transmission applications, such as voice mail. Voice mail is unidirectional communication that does not take place in real-time.
With reference to Figures 3A and 3B, an example of the equipment needed in the example according to Figure 1A will be described. Figure 3A shows the transmission end equipment and Figure 3B the reception end equipment.
A computer 104, 108 comprises a display 300, keyboard 302, mouse 304, sound card 314, at least one loudspeaker 306 connected to the sound card, a microphone 308 connected to the sound card, a device providing access to the Internet, e.g. a modem 310, a mass memory device, such as a hard disk 312, and a central processing unit 320. The central processing unit 320 is used for running the operating system and the application software. From the point of view of the invention, the most important application software is the one providing an interface that allows the sound card 314 to be used. This software can be used, for example, for playing and recording .wav files. In addition, the computer comprises telecommunications means, together with the software involved, such as the modem 310 and packet transmission software 324, that can be used for establishing a connection for example through the public switched telephone network to the server of the Internet service provider. Today's standard computers and their operating systems comprise the described elements. The Windows environment, for example, includes a specific sound reproduction service, MCI, that provides an API (Application Programming Interface) speech codec for use. The API is always the same, irrespective of the sound card, because the Windows environment has several different drivers for different sound cards. The MCI alone cannot, however, be used as a speech codec because it is not efficient enough: when voice is to be created, the amount of data may even double.
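To give a sense of why uncompressed .wav audio is costly to transmit, the following Python sketch writes silence as a .wav file in memory with the standard wave module and computes the bit rate of the raw audio. The 8 kHz sampling rate, 16-bit samples and single channel are assumptions chosen for the example, not parameters stated in this description.

```python
import io
import wave

SAMPLE_RATE = 8000   # samples per second (assumed)
SAMPLE_WIDTH = 2     # bytes per sample, i.e. 16-bit PCM (assumed)


def silent_wav(seconds: int) -> bytes:
    """Build an uncompressed mono .wav file containing silence."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(SAMPLE_RATE * SAMPLE_WIDTH * seconds))
    return buffer.getvalue()


def pcm_bit_rate(wav_bytes: bytes) -> int:
    """Bits per second of the raw audio stored in a .wav file."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate() * wav.getsampwidth() * 8 * wav.getnchannels()
```

Under these assumptions the raw audio runs at 128 kbit/s, which is the kind of data volume an efficient speech codec is meant to reduce before transmission.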
Furthermore, in accordance with the invention, the software comprises compression software 326 at least at the transmission end for packing data comprising voice. The transmission end must comprise decompression software 202 which is arranged into the packets together with the speech data 200. In a voice mail application, for example, the speech data 200 and the decompression software 202 decompressing the speech data are received from the network through the telecommunications means 310; the decompression software then decompresses the packets, plays them with a player 308, the sound being created in the sound card 314 and transmitted to the loudspeaker 306 which reproduces the sound. Correspondingly, voice mail may be created by recording speech with the microphone 308 and the sound card 314, in which case the compression software 326 packs the sound into a file. The file is converted into packets using packet transmission software 324 and transmitted into the network using the telecommunications means 310. For the recipient, the file is provided with the decompression software 202.
The invention requires an efficient speech codec that can be accommodated into a small space and implemented for example as a Java class, which allows the amount of data needed for transmitting the codec to be minimized. The computer 104 must then naturally comprise support for Java. Java is not, however, the only technology available; the speech codec can also be implemented by applying other prior art means.
It is to be noted that although the example describes the invention in connection with an Internet call, the invention is not restricted thereto, but it can be utilized in principle in connection with any technology platform employing packet-switched traffic for implementing a call. Consequently, the speech codec may also be located in a telephone answering machine, in a device of the Nokia Communicator-type, in a PDA (Personal Digital Assistant), etc. Similarly, the telephone connection can be established using an analog or a digital telephone connection, as described above, or over the radio path, using a mobile telephone, for example, or over a cable television network, a wireless subscriber connection (Wireless Local Loop), etc.
Although the invention is described above with reference to an example according to the accompanying drawings, it is apparent that the invention is not restricted to it, but may vary in many ways within the inventive idea disclosed in the claims.

Claims

1. A method for transmitting speech on the Internet, the method comprising
(402) compressing data containing digitized speech into a file at a transmission end by using compression software; characterized by
(404) arranging into the file decompression software for decompressing the compression employed;
(406) dividing the file into data packets;
(408) transmitting the packets to the recipient over the Internet;
(410) decompressing the received, compressed data at a reception end by using the received decompression software.
2. A method according to claim 1, characterized in that the decompression software is not installed at the reception end as a permanent part of the reception end equipment software, but it is only carried out in the reception end processor when a received file containing speech is to be listened to.
3. A method according to claim 1, characterized in that the decompression software is stored as a separate file at the reception end.
4. A method according to claim 1, characterized in that the compressed speech data is stored into a file at the reception end.
5. A method according to claim 1, characterized in that at least one packet is provided with compression software for compressing speech data.
6. A method according to claim 1, characterized in that the transmission of speech implements an Internet call.
7. A method according to claim 1, characterized in that the transmission of speech implements a voice mail system.
8. A system for transmitting speech on the Internet, characterized in that the system comprises
compression software (326) at transmission end equipment for compressing data containing digitized speech into a file and for arranging into the file decompression software (202) to be used for decompressing the compression employed;
packet transmission software (324) at the transmission end equipment for dividing the file into data packets and for transmitting the packets to the recipient over the Internet (106);
packet transmission software (324) at the reception end for receiving the packets and for assembling the transmitted file from the packets;
the reception end being arranged to decompress the received compressed data by using the received decompression software (202).
9. A system according to claim 8, characterized in that the decompression software (202) is not installed as a permanent part of the software in the reception end equipment, but it is only carried out in the reception end processor when a received file containing speech is to be listened to.
10. A system according to claim 8, characterized in that the decompression software (202) is stored as a separate file at the reception end.
11. A system according to claim 8, characterized in that the compressed speech data is stored into a file at the reception end.
12. A system according to claim 8, characterized in that at least one packet is provided with compression software for compressing speech data.
13. A system according to claim 8, characterized in that the system implements an Internet call.
14. A system according to claim 8, characterized in that the system implements a voice mail system.
15. A computer software product for transmitting speech on the Internet, the product comprising software stored into a software storage means and readable into a computer, characterized in that the software carries out the method steps of
(402) compressing data containing digitized speech into a file at a transmission end by using compression software;
(404) arranging into the file decompression software for decompressing the compression employed, the recipient using the software at the reception end to decompress the received compressed data;
(406) dividing the file into data packets;
(408) transmitting the packets to the recipient over the Internet.
PCT/FI2000/000759 1999-09-08 2000-09-07 Method, system and software product for transmitting speech on internet WO2001019066A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU70042/00A AU7004200A (en) 1999-09-08 2000-09-07 Method, system and software product for transmitting speech on internet

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FI19991914 1999-09-08
FI991914A FI19991914A (en) 1999-09-08 1999-09-08 Procedure, systems and computer software for transmitting voice over the Internet

Publications (1)

Publication Number Publication Date
WO2001019066A1 (en) 2001-03-15

Family

ID=8555255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/FI2000/000759 WO2001019066A1 (en) 1999-09-08 2000-09-07 Method, system and software product for transmitting speech on internet

Country Status (3)

Country Link
AU (1) AU7004200A (en)
FI (1) FI19991914A (en)
WO (1) WO2001019066A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0600682A1 (en) * 1992-12-03 1994-06-08 AT&T Corp. Image communication technique
US5606599A (en) * 1994-06-24 1997-02-25 Intel Corporation Method and apparatus for automatically converting from an analog voice mode to a simultaneous voice and data mode for a multi-modal call over a telephone line
US5673392A (en) * 1994-04-26 1997-09-30 Murata Mfg. Co., Ltd. Method of executing communication program in modem apparatus
WO1998040969A2 (en) * 1997-03-14 1998-09-17 J.Stream, Inc. Text file compression system
WO1999022557A2 (en) * 1997-10-30 1999-05-14 Nokia Mobile Phones Limited Subnetwork dependent convergence protocol for a mobile radio network
EP0929173A2 (en) * 1998-01-09 1999-07-14 Siemens Information & Communication Networks, Inc. Universal voice/fax/modem line over compressed media

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7400678B2 (en) 2001-01-04 2008-07-15 Fast Search & Transfer Asa Methods in transmission and searching of video information
US7936815B2 (en) 2001-01-04 2011-05-03 Microsoft International Holdings B.V. Methods in transmission and searching of video information

Also Published As

Publication number Publication date
AU7004200A (en) 2001-04-10
FI19991914A (en) 2001-03-08

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ CZ DE DE DK DK DM DZ EE EE ES FI FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP