WO2000067417A1 - Robust coding for the transmission of audio or video signals - Google Patents

Info

Publication number
WO2000067417A1
Authority
WO
WIPO (PCT)
Prior art keywords
bits
group
digital codes
packet
significance
Prior art date
Application number
PCT/GB2000/001649
Other languages
French (fr)
Inventor
Mark Brian Sandler
Geith Mark Benjamin Leslie
Original Assignee
Insonify Limited
Priority date
Filing date
Publication date
Application filed by Insonify Limited
Priority to EP00925502A (EP1177651A1)
Priority to AU44223/00A (AU4422300A)
Publication of WO2000067417A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0083 Formatting with frames or packets; Protocol or part of protocol for error control
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L2001/0098 Unequal error protection

Definitions

  • the present invention relates to the transfer of digital data.
  • the Internet and other digital media are being used more and more for the transmission of data.
  • the transmitted data falls into two categories. These categories comprise data that must be received, allowing for error correction, exactly as it was transmitted, and data that, on reception, need only correspond with the transmitted data within certain tolerances, i.e. loss-tolerant signals.
  • the first category includes document files, financial information and program code, for example Java applets.
  • the second category comprises data that is primarily intended to be rendered perceptible to the human senses, for example photographic images, music and speech.
  • the present invention is concerned with the transmission of data in the second category.
  • a method of transmitting a loss-tolerant signal comprising selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and transmitting the selected bits together.
  • an apparatus for transmitting a loss-tolerant signal comprising selection means for selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and means for transmitting the selected bits together.
  • the selected bits have the significance of the most significant 1 bit in the group of digital codes.
  • a further bit may be selected from each digital code, the selected further bits all having the same, lower significance, and transmitting the selected further bits together.
  • the transmitted signal increases in fidelity as the number of times bits are selected increases.
  • the selected further bits are transmitted after said selected bits. This ordering is not essential, but the selections should be made from an unbroken series of bit significances.
  • the digital codes each comprise a sign bit and a plurality of magnitude bits, with the sign bits being accorded a significance equivalent to that of the most significant magnitude bits.
  • the step of selecting bits having the same significance may be repeated for different significance levels, said significance levels being selected in dependence on the bandwidth of a channel through which the bits are to be transmitted and being in an unbroken sequence.
  • a single file containing, for example a piece of music can be used to provide the piece of music to a remote listener with different degrees of fidelity simply by reading and transmitting the appropriate sub-set of bits from the file.
  • a first subset of the bits, having significances in a first upper range, may be selected and transmitted in a first packet with a second subset of the bits, having significances in a second lower range, being selected and transmitted in a second packet, the packets including the same destination address.
  • the first packets can be sent with, for example, approximately 25% of the data required for full fidelity to provide a preview to a user who can then request the second packets.
  • the data from the second packets is added to the data from the first packets at the receiver to increase the fidelity of the signal presented to the user.
  • the second packets may also include the data sent in the first packets.
  • the first and second packets are preferably distinguished by respective quality of service codes in accordance with a quality of service routing protocol of a router in a path to the destination identified by the destination address. Consequently, the quality of service routing protocol is more likely to discard less significant data, so that gaps in the received signal are less likely to occur; instead there will be short-duration reductions in fidelity which a listener, for example in the case of audio signals, may not even notice.
  • the bits are transmitted in packets and each packet comprises bits selected from a respective first group of digital codes and bits selected from a respective second group of digital codes, the second group representing an earlier part of said time-varying signal than the first group. More preferably, a greater portion of the packet is given over to the bits selected from the first group than to the bits selected from the second group.
  • the packets may comprise three or more sections generally of decreasing size and containing successively older data.
  • the transmitted bits are compressed.
  • a method according to the present invention may comprise re-ordering the bits of said digital codes so as to group the bits thereof by significance before selecting bits for transmission.
  • a method according to the present invention includes transforming time domain samples into frequency domain coefficients, wherein said digital codes comprise frequency domain coefficients.
  • the term "frequency domain coefficients" includes quasi-frequency domain coefficients such as those produced by wavelet packet transforms, as well as the coefficients produced by modified discrete cosine transformations (MDCT) and the like. Wavelet packet transforms produce "scale" coefficients. However, these coefficients are dependent on the spectral content of the transformed signal.
  • each coefficient produced by the wavelet packet transform should be the result of the same number of filtering and decimation steps, such as a symmetrically branching tree arrangement of filter and decimator functional units.
  • Each node having the same depth will have the same number of branches but the number of branches may differ between nodes at different depths.
  • the bits to be transmitted may be compressed during a re-ordering process.
  • the re-ordering comprises arranging the coefficients in a representation of a two-dimensional matrix having a separate column for each time slot and separate row for each frequency subband. More preferably, the rows are ordered by frequency subband. However, the rows may be ordered in whatever manner produces the best compression, which will be dependent on the spectral content of the original signal. Still more preferably, the re-ordering comprises for each significance level of the coefficients in each column in row order, replacing runs of zeros terminating at an edge row with a termination marker, e.g.
  • the re-ordering comprises, for each significance level of the coefficients in each column in row order, replacing runs of zeros through coefficients, having a most significant 1 bit in a yet unhandled significance level, and terminating before an edge row with a run-length code.
  • the run-length code may comprise a prefix defining a range and a suffix defining a position within the range defined by the prefix.
  • bits in significance levels above that containing the most significant 1 bit among the coefficients are discarded during re-ordering.
  • a method of receiving a loss-tolerant signal comprising receiving bits of a first group of digital codes, re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes, and generating a time-varying signal using said second group of codes.
  • an apparatus for receiving a loss-tolerant signal comprising: receiving means for receiving bits of a first group of digital codes, the more significant bits of said codes preceding the less significant bits; means for re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes; and means for generating a time-varying signal using said second group of codes.
  • the receiving method and apparatus should be adapted to meet the requirements of the various signal forms produced in accordance with the transmission aspect of the present invention.
  • the codes of the second group are padded with zeros in positions for which bits were not received, such that the digital codes of the second group have the same number of bits as the digital codes of the first group.
  • the method according to the present invention comprises receiving a first subset of the bits of the digital codes of the first group, having significances in a first upper range, in a first packet and receiving a second subset of the bits of the digital codes of the first group, having significances in a second lower range, in a second packet, and appending the bits of the second subset to those of the first subset before re-ordering to produce the second group of digital codes.
  • a method comprises receiving a plurality of packets, each packet comprises a first section followed by a second section, the first section comprising re-ordered bits from one group of digital codes and the second section comprising re-ordered bits from another group of digital codes, wherein the bits in the second section represent an earlier part of said time-varying signal than those in the first section.
  • such a method comprises the steps of:- receiving a first packet; re-ordering the bits in the first section for reproducing said time-varying signal; receiving a second packet; determining that an intervening packet has been lost; re-ordering the bits of the second section of the second packet for reproducing said time-varying signal; and thereafter re-ordering the bits of the first section of the second packet for reproducing said time-varying signal.
  • the re-ordered bits are preferably represented in the form of a two-dimensional matrix having a column for each time slot and a row for each frequency subband represented by the coefficients, and re-ordering comprises, for each significance level of the coefficients of each time slot, replacing a predetermined termination marker, if present, with a run of zeros terminating at an edge row.
  • re-ordering preferably comprises, for each significance level of each time slot, determining the presence of a run-length code and, if a run-length code is present, replacing it with a run of zeros having a length determined by the run-length code. More preferably, the run-length code comprises a prefix defining a range and a suffix defining a position within the range defined by the prefix.
  • Padding of untransmitted more significant bit positions of the first group digital codes may be required.
  • a significance code may need to be obtained from the received signal.
  • the present invention may be embodied in an audio playback device, which may be portable, including memory means for storing bits received by the receiving means, wherein the re-ordering means is configured to re-order bits stored in the memory for generating a playback audio signal.
  • a method of operating a network routing node for routing a signal packet in which more significant bits precede less significant bits comprising determining the bandwidth available in a path away from the node and, if the bandwidth is below a threshold value, truncating the data in said packet in dependence on the determined bandwidth.
  • the method may be applied in a network having a TDMA wireless link to a terminal apparatus, wherein the bandwidth varies with the number of slots available in each frame for transmissions to the terminal apparatus.
  • Transmitters and receivers according to the present invention may take many forms.
  • a voice over IP (VOIP) terminal can be made by combining the transmitting and receiving parts in the same apparatus.
  • Figure 1 shows a transmitter and a receiver connected by a network
  • Figure 2 illustrates a transmitter according to the present invention
  • Figure 3 illustrates the process of interleaving the bits of a group of digital codes
  • Figure 4 illustrates a receiver according to the present invention
  • Figure 5 illustrates a wireless link to a terminal apparatus
  • Figure 6 is a flowchart illustrating the operation of a network node
  • Figure 7 is a flowchart illustrating the operation of a receiver according to the present invention.
  • Figure 8 illustrates a transmitter according to the present invention
  • Figure 9 is a flowchart illustrating the operation of a transmitter according to the present invention.
  • Figure 10 illustrates the format of a data part of a packet according to the present invention
  • Figure 11 is a flowchart illustrating the operation of a receiver according to the present invention.
  • Figure 12 illustrates a receiver according to the present invention
  • Figure 13 illustrates a wavelet packet transform operation
  • Figure 14 illustrates the output of a wavelet packet transform operation
  • Figure 15 is a flowchart illustrating the operation of a transmitter according to the present invention
  • Figure 16 is a flowchart illustrating the operation of a receiver according to the present invention
  • Figure 17 illustrates a receiver according to the present invention
  • Figure 18 is a flowchart illustrating the operation of a receiver according to the present invention
  • Figure 19 is a flowchart illustrating the operation of a transmitter according to the present invention
  • Figure 20 is a simplified view of the data for one time-slot illustrating the compression effected by the method illustrated in Figure 21;
  • Figure 21 is a flowchart illustrating the operation of a receiver according to the present invention.
  • Figure 22 shows a portable audio playback device according to the present invention.
  • a transmission system comprises a transmitter 1, a receiver 2 and a transmission medium 3.
  • the transmission medium 3 may be characterised as having a bandwidth (B) determined by the minimum bandwidth of the signal path, e.g. bottleneck 4, and a notional switch 5 that directs signal portions into oblivion 6.
  • the switch 5 may operate stochastically or according to a pattern, e.g. EMC from vehicle ignition systems.
  • the transmitter 1 comprises a computer provided with a source of audio signals 7, e.g. a magnetic tape recording or a microphone, an analogue-to-digital converter 8 and a hardware network interface (not shown).
  • the computer also supports an interleaver process 9, a transmission process 10, a UDP socket 11 and a request processing process or processes (not shown).
  • the analogue-to-digital converter 8 digitises the output of the audio signal source 7 (Figure 3(a)) as signed 4-bit numbers (Figure 3(b)). (4-bit numbers are used here in the interests of clarity and it will be appreciated that a larger number of bits, e.g. 10 or 16, would be desirable in many circumstances.)
  • the interleaver process 9 reads the samples from the analogue-to-digital converter 8.
  • the interleaver process 9 processes the samples in groups of 32.
  • the bits of the samples in a complete group are interleaved and then added to a file 12, created for the current audio signal, or supplied to the transmission process 10.
  • a characteristic of interleaved sample groups is that their tails can be lopped without making the remaining data unusable. Referring to Figure 3(d), it can be seen that the form of the original waveform is largely retained even though the last 32 bits of a sample group have been lost. This characteristic is even more marked in the case of samples comprising larger numbers of bits.
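  • The interleaving and tail-lopping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each sample is held as a sign bit plus four magnitude bits (giving 160 bits per group of 32), and that the sign plane is sent first, followed by the magnitude planes from most to least significant. The function names are hypothetical.

```python
GROUP_SIZE = 32      # samples per group, as in the text
MAGNITUDE_BITS = 4   # magnitude bits per sample, as in the text


def interleave(samples):
    """Order the bits of one sample group by significance.

    Assumed layout: 32 sign bits, then 32 bits per magnitude plane,
    most significant plane first (160 bits in total).
    """
    if len(samples) != GROUP_SIZE:
        raise ValueError("expected a complete group of samples")
    limit = (1 << MAGNITUDE_BITS) - 1
    if any(abs(s) > limit for s in samples):
        raise ValueError("sample magnitude does not fit in 4 bits")
    bits = [1 if s < 0 else 0 for s in samples]          # sign plane
    for level in range(MAGNITUDE_BITS - 1, -1, -1):      # MSB plane first
        bits.extend((abs(s) >> level) & 1 for s in samples)
    return bits


def deinterleave(bits):
    """Rebuild samples from a possibly truncated group, zero-filling the tail."""
    total = GROUP_SIZE * (MAGNITUDE_BITS + 1)
    padded = list(bits[:total]) + [0] * max(0, total - len(bits))
    samples = []
    for i in range(GROUP_SIZE):
        magnitude = 0
        for plane in range(MAGNITUDE_BITS):              # plane 0 is the MSB
            magnitude = (magnitude << 1) | padded[GROUP_SIZE * (plane + 1) + i]
        samples.append(-magnitude if padded[i] else magnitude)
    return samples


# Lopping the last 32 bits removes only the least significant plane,
# so every reconstructed sample is within 1 of the original.
original = [((i * 7) % 31) - 15 for i in range(GROUP_SIZE)]
lopped = deinterleave(interleave(original)[:128])
```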
  • the transmission process 10 sends interleaved sample data from the interleaver process 9 or a file 12 to a UDP socket for transmission to a requesting receiver 2. To achieve this the transmission process 10 is also provided with a channel class and a destination address. The operation of the transmission process 10 will be described in more detail below.
  • the receiver 2 comprises a computer provided with a digital-to-analogue converter 20, a loudspeaker 21 and a hardware network interface (not shown).
  • the computer also supports a UDP socket 16, a reception process 17, a de-interleaver process 18 and a buffering process 19.
  • UDP datagrams from the transmitter 1 are received by the UDP socket 16 and their data contents passed to the de-interleaver process 18 by the reception process 17.
  • the de-interleaver process 18 de-interleaves the received data and passes the de-interleaved sample groups to the buffer process 19.
  • the buffer process 19 ensures that, subject to the datagrams arriving at a rate greater than some system-dependent threshold rate, the samples are presented to the digital-to-analogue converter 20 at a constant rate and the output of the digital-to-analogue converter 20, and hence the output of the loudspeaker, faithfully reproduces the frequency components of the original signal.
  • the ability of the system to tolerate the truncation of interleaved sample groups means that the contents of datagrams can be tailored according to the bandwidth (B) of the signal path between the transmitter 1 and the receiver 2.
  • a user of the receiver 2 may wish to hear a piece of music stored in a file 12 at the transmitter 1.
  • the user therefore instructs a process (not shown) running at the receiver 2 to send a request to the transmitter 1.
  • This request includes the identity of the file 12 and a channel class.
  • the channel class may be determined on the basis of the type of connection between the receiver 2 and the intervening network, e.g. dial-up, ISDN, WAP, and/or a locally determined actually achieved bitrate.
  • the request is received by a receiver process (not shown) which starts the transmission process 10 with the file and channel class as parameters.
  • the transmission process 10 starts to read groups of interleaved samples from the file 12.
  • if the channel belongs to a first, highest bandwidth class, the transmission process 10 includes all the bits of a respective interleaved sample group in each transmitted UDP datagram. However, if the channel belongs to a second, lower bandwidth class, the transmission process 10 omits the final 32 bits of each interleaved sample group when generating the UDP datagram data parts. Similarly, if the channel belongs to a third, worse class, the final 64 bits of each interleaved sample group are omitted. Thus, it can be seen that a single data file can produce useable signals at different bit-rates very simply with little, if any, processing overhead.
  • when each datagram is received, the reception process 17 sends its data part to the de-interleaver process 18. If the number of data bits in the datagram is less than 160, the reception process 17 pads the end of the received data with zeros to make it up to 160 bits in total.
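  • A minimal sketch of the channel-class truncation at the transmitter and the zero padding at the receiver is given below. The numbering of the classes as 1, 2 and 3 and the function names are assumptions made for illustration; the 160-bit group size and the 32-bit and 64-bit omissions are taken from the text.

```python
FULL_GROUP_BITS = 160

# Bits omitted from the tail of each interleaved group, per channel class.
# The class numbers are hypothetical labels for the three classes in the text.
OMITTED_BITS = {1: 0, 2: 32, 3: 64}


def datagram_payload(group_bits, channel_class):
    """Return the leading part of a group that is sent for the given class."""
    if len(group_bits) != FULL_GROUP_BITS:
        raise ValueError("expected a complete 160-bit interleaved group")
    keep = FULL_GROUP_BITS - OMITTED_BITS[channel_class]
    return list(group_bits[:keep])


def pad_received(data_bits):
    """Pad a received data part with zeros up to 160 bits."""
    if len(data_bits) > FULL_GROUP_BITS:
        raise ValueError("data part longer than one sample group")
    return list(data_bits) + [0] * (FULL_GROUP_BITS - len(data_bits))
```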
  • a mobile telephone network includes a serving GPRS support node (SGSN) 25 and a mobile station 26.
  • GPRS (General Packet Radio Service) allocates time slots in a GSM frame for data transmission dynamically on the basis of traffic levels.
  • a GPRS transmission will be allocated eight time slots per frame, if there is little traffic in the cell containing the source/destination mobile station 26 but will be allocated successively fewer time slots per frame as traffic in the cell increases.
  • the SGSN 25 is largely conventional save that it is configured, e.g. by programming a component computer, to identify UDP datagrams of the type produced by the transmitter 1 shown in Figure 2. To achieve this, the seventh bits of the "type of service" portions of the UDP headers are set to one by the UDP socket 11 and the SGSN 25 looks for these. Referring to Figure 6, when the SGSN 25 detects a UDP datagram of the subject type (step s1), it determines the number of slots per frame allocated to data communication with the destination mobile station 26 (step s2). If the number of slots per frame is less than 2 (step s3), the SGSN 25 removes a predetermined portion of the tail of each UDP datagram (step s4) before sending it on to the mobile station 26.
  • one or more further thresholds may be used to trigger truncation of packets by differing amounts.
  • threshold or thresholds triggering truncation of the UDP packets will depend on the packet sizes and the data capacity of each time slot.
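  • The truncation decision made by the routing node can be sketched as below. The rule table, the byte counts and the function name are hypothetical; only the threshold of 2 slots per frame comes from the text, and detection of the marked "type of service" bit is reduced to a boolean argument.

```python
def forward_payload(payload, slots_per_frame, marked, rules=((2, 8), (4, 4))):
    """Return the data part to forward to the mobile station.

    `rules` pairs a slot threshold with the number of tail bytes to remove
    when fewer slots than that threshold are available. The values are
    illustrative assumptions, not taken from the text.
    """
    if not marked:
        return payload                      # not a datagram of the subject type
    for threshold, tail_bytes in sorted(rules):
        if slots_per_frame < threshold:
            return payload[:max(0, len(payload) - tail_bytes)]
    return payload                          # enough bandwidth: forward intact
```

  • Because the most significant bits come first in each data part, removing the tail lowers fidelity without making the remaining data unusable.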
  • a "quality of service" routing protocol has been proposed for TCP/IP transmissions.
  • an overloaded router in a TCP/IP network will discard new incoming packets simply because it has run out of buffer space.
  • a "quality of service" approach marks each datagram with a service quality code and routers use this code to determine which packets to discard when they become overloaded. For instance, a router may store a table containing the service quality code for all of the datagrams in its input buffer. When the buffer is full and a new datagram arrives, the router reads its service quality code and then searches the table for the most recent datagram having a lower service quality code. If a datagram having a lower service quality code is located, it is discarded and the new datagram is stored in the router's input buffer. However, if no datagrams with a lower service quality code are found, the newly incoming datagram is discarded.
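  • The discard policy described above can be sketched as a small buffer class. The class and method names are hypothetical, and a higher code is assumed to mean higher service quality.

```python
class QosRouterBuffer:
    """Input buffer that prefers to drop datagrams with lower quality codes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = []  # (quality_code, payload) tuples, oldest first

    def enqueue(self, quality_code, payload):
        """Store a datagram; return whichever datagram was discarded, or None."""
        new = (quality_code, payload)
        if len(self.queue) < self.capacity:
            self.queue.append(new)
            return None
        # Buffer full: search from the most recent entry backwards for a
        # datagram with a lower service quality code than the new arrival.
        for index in range(len(self.queue) - 1, -1, -1):
            if self.queue[index][0] < quality_code:
                dropped = self.queue.pop(index)
                self.queue.append(new)
                return dropped
        return new  # nothing less important is buffered: drop the newcomer
```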
  • a modified form of the transmitter 1 makes use of a "quality of service" routing protocol to advantage.
  • the operation of the transmitter 1 is substantially as described in the case of the first preferred embodiment except for the operation of the transmission process 10.
  • the transmission process 10 treats each interleaved sample group as four sub-groups.
  • the first sub-group comprises the sign bits and the most significant bits and the second, third and fourth sub-groups comprise respectively the second, third and fourth most significant bits.
  • the bits of the first sub-group are sent to the UDP socket 11 for transmission in a first datagram marked with a high service quality code.
  • the second, third and fourth sub-groups are sent to the UDP socket separately for transmission in respective UDP datagrams with successively lower service quality codes.
  • if the datagrams are routed through one or more overloaded routers, the datagrams with the most important information, i.e. the sign bits and the most significant magnitude bits, are routed in preference to the packets containing bits of lesser significance.
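  • A sketch of how one 160-bit interleaved sample group might be split into four datagrams with successively lower service quality codes follows. The sub-group boundaries assume 32 sign bits followed by four 32-bit magnitude planes; the dictionary fields and the numeric quality codes are hypothetical.

```python
# Assumed bit ranges of the four sub-groups within a 160-bit group:
# sign bits plus most significant bits, then one plane each.
SUBGROUP_BOUNDS = ((0, 64), (64, 96), (96, 128), (128, 160))


def split_into_datagrams(group_bits, group_number, top_quality=3):
    """Return one datagram description per sub-group, most important first."""
    if len(group_bits) != 160:
        raise ValueError("expected a complete 160-bit interleaved group")
    datagrams = []
    for index, (start, end) in enumerate(SUBGROUP_BOUNDS):
        datagrams.append({
            "group": group_number,           # lets the receiver match sub-groups
            "subgroup": index,
            "quality": top_quality - index,  # lower code for less significant bits
            "bits": list(group_bits[start:end]),
        })
    return datagrams
```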
  • the receiver 2 is modified for the present embodiment so that the reception process 17 will attempt to reconstruct the original interleaved sample groups and then pass the reconstructions to the de-interleaver process 18.
  • step s11 UDP datagram
  • step s12 the contents, if any, of a 160-bit buffer are output to the de-interleaver process 18 (step s13) and the buffer is set to contain all zeros (step s14).
  • step s15 the data from the datagram is stored in the first 64 bits of the buffer. If the next received datagram relates to a preceding sample group (step s16), it is discarded (step s17); otherwise the data from the new packet is stored in the appropriate place in the buffer (step s18), i.e.
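  • The buffer handling outlined above might be sketched as follows. The text does not fully specify how a datagram identifies its sample group or sub-group, so the `group` and `subgroup` arguments, the buffer offsets and the handling of a group whose first datagram was lost are all assumptions.

```python
BLOCK_BITS = 160
OFFSETS = (0, 64, 96, 128)  # assumed start of each sub-group in the buffer


class GroupReassembler:
    """Collects the sub-groups of one sample group in a zero-filled buffer."""

    def __init__(self):
        self.buffer = None
        self.current_group = None

    def receive(self, group, subgroup, bits):
        """Store one datagram; return the previous group's block if flushed."""
        flushed = None
        starts_new_group = subgroup == 0 and (
            self.current_group is None or group > self.current_group)
        if starts_new_group:
            flushed = self.buffer               # output to the de-interleaver
            self.buffer = [0] * BLOCK_BITS      # missing bits stay zero
            self.current_group = group
        elif group != self.current_group:
            return None                         # stale or unmatched: discard
        start = OFFSETS[subgroup]
        self.buffer[start:start + len(bits)] = bits
        return flushed
```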
  • the performance of this embodiment can be improved by adding a buffer for temporarily storing datagrams before they are passed to the reception process 17.
  • the buffer provides time for out of sequence datagrams to be received and then fed to the reception process 17 in their correct position.
  • the scalability of the signal format can be employed in a modification to the present embodiment.
  • the quality of service code is omitted or the same for all packets.
  • a user of the receiver 2 can request a preview of an audio file from the transmitter 1.
  • the transmitter 1 responds by sending the most significant data, e.g. 25% of the whole, in a first set of datagrams.
  • the user can then play the received file and decide whether to download the full quality version. If the user decides to download the full quality version, the omitted data is sent in a second stream of datagrams.
  • a transmitter 1 comprises a memory 31, such as a hard disk drive, in which a plurality of audio data files 32, 33, 34 are stored.
  • Each of the audio data files 32, 33, 34 comprises groups of interleaved samples, such as those produced by the interleaver process 9 shown in Figure 2.
  • the data in the files 32, 33, 34 can be selectively read by a transmission process 35 in response to a request therefor from a receiver 2 which includes the file's name or some other identifier.
  • the transmission process 35 combines data from four sequential sample groups to produce one datagram and sends one datagram for every sample group. It will be appreciated that the first three datagrams will be padded with runs of zeros because there will be insufficient sample groups to provide real data.
  • the transmission process 35 reads all 160 bits of the nth sample group (step s21). Then the transmission process 35 reads and appends the first 96 bits from the (n-1)th sample group (step s22), followed by the first 64 bits of the (n-2)th and (n-3)th sample groups (steps s23 and s24). The result is a datagram data part as shown in Figure 10. This data is then sent to a UDP socket 36 for transmission to the receiver 2 (step s25).
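  • The assembly of one datagram data part from four sequential sample groups can be sketched as below. The section sizes follow the sentence above as written; they are kept in one constant because the exact size of the oldest section should be checked against Figure 10. The function name is hypothetical.

```python
# Leading bits taken from groups n, n-1, n-2 and n-3 respectively.
SECTION_BITS = (160, 96, 64, 64)


def build_datagram(groups, n):
    """Concatenate the current group with shortened copies of older groups.

    `groups` is a list of 160-bit interleaved sample groups. Sections for
    groups that do not exist yet (n - age < 0) are filled with zeros.
    """
    payload = []
    for age, size in enumerate(SECTION_BITS):
        index = n - age
        if index >= 0:
            payload.extend(groups[index][:size])
        else:
            payload.extend([0] * size)
    return payload
```

  • Because each section starts with the most significant bits of its group, an older group can still be reproduced at reduced fidelity from a later datagram if its own datagram is lost.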
  • the receiver 2 is structurally similar to that shown in Figure 4. However, it differs in the operation of the reception process 17.
  • when the reception process 17 receives (step s31) a datagram conveying the content of a file 32, 33, 34, it determines whether the datagram has a sequence position before the last received datagram (step s32). If this is the case, the newly received datagram is discarded (step s33).
  • the reception process 17 outputs bits 320 to 351 of the data part of the datagram and 96 zeros to the de-interleaver process 18 (step s35).
  • the reception process 17 outputs bits 256 to 319 of the data part of the datagram and 96 zeros to the de-interleaver process 18 (step s37). Then if the datagram has a sequence position indicating that one datagram has been sent to oblivion 6 (step s38), the reception process 17 outputs bits 160 to 255 of the data part of the datagram and 64 zeros to the de-interleaver process 18 (step s39).
  • the reception process 17 outputs bits 0 to 159 of the datagram's data part to the de-interleaver process 18 (step s40).
  • when the de-interleaver process 18 receives a block of 160 bits, it carries out the inverse of the process performed by the interleaver process 9 in Figure 2 to reconstruct the original sample groups, as well as possible on the basis of the received datagrams, and sends the reconstructed samples to the loudspeaker 21 via the buffer process 19 and the digital-to-analogue converter 20.
  • the performance of this embodiment can be improved by adding a buffer for temporarily storing datagrams before they are passed to the reception process 17.
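  • The recovery of lost sample groups from the older sections of a later datagram might be sketched as follows. The section bit ranges are those quoted in the receiver description above, every output block is padded to the 160 bits the de-interleaver expects, and treating a repeated sequence number as a discard is an assumption. The function name is hypothetical.

```python
BLOCK_BITS = 160

# Bit ranges of the sections for groups n, n-1, n-2 and n-3, newest first,
# as quoted in the receiver description.
SECTIONS = ((0, 160), (160, 256), (256, 320), (320, 352))


def recover_blocks(payload, sequence, last_sequence):
    """Return the 160-bit blocks to pass to the de-interleaver, oldest first."""
    if sequence <= last_sequence:
        return []  # out of sequence (or repeated): discard the datagram
    # Number of intervening datagrams lost, limited by the sections carried.
    lost = min(sequence - last_sequence - 1, len(SECTIONS) - 1)
    blocks = []
    for age in range(lost, -1, -1):
        start, end = SECTIONS[age]
        bits = list(payload[start:end])
        blocks.append(bits + [0] * (BLOCK_BITS - len(bits)))
    return blocks
```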
  • the buffer provides time for out of sequence datagrams to be received and then fed to the reception process 17 in their correct position.
  • error correction coding and/or compression techniques may be employed.
  • the present invention has been illustrated in the first to fourth embodiments in the context of systems which transmit time domain samples of a time-varying signal.
  • such signals may instead be sent as coefficients produced by a time to frequency domain transformation, e.g. a wavelet packet transform or a discrete cosine transform such as the Modified Discrete Cosine Transform (MDCT).
  • the transmitter 1 comprises a computer provided with a source of audio signals 107, e.g. a magnetic tape recording or a microphone, an analogue-to-digital converter 108 and a hardware network interface (not shown).
  • the computer also supports a wavelet packet transformer process 109, an encoder process 110, a transmission process 111, a UDP socket 112 and a request processing process or processes (not shown).
  • the analogue-to-digital converter 108 digitises the output of the audio signal source 107 to produce signed 16-bit data.
  • the wavelet packet transformer process 109 reads the samples from the analogue-to-digital converter 108 and outputs frequency domain coefficients which are encoded by the encoder process 110.
  • the encoded coefficients are stored in a file 113 and then transmitted by the transmission process 111 and the UDP socket 112.
  • the wavelet packet transformer process 109 implements a seven layer (only three shown) digital filter tree structure.
  • Each layer of the tree comprises a high-pass filter 121 and a low-pass filter 122 for each branch and a down-sample by 2 decimator 123 for decimating the output of each filter.
  • the tree functions as an array of slightly overlapping bandpass filters with each filter in the last layer outputting samples for signal components in a substantially discrete frequency band.
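  • A symmetric filter-and-decimate tree of this kind can be sketched with the two-tap Haar filter pair. The choice of Haar filters is an assumption made for brevity, since the text does not name the filters, and the tree depth is left as a parameter. The bands are returned in the natural order of the tree, which is not strictly increasing frequency order.

```python
import math


def split(signal):
    """One node of the tree: low-pass and high-pass filter, then decimate by 2."""
    s = math.sqrt(0.5)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high


def wavelet_packet(signal, depth):
    """Apply `depth` layers of splitting to every branch of the tree.

    Returns 2**depth subbands, each with len(signal) / 2**depth coefficients,
    so every coefficient results from the same number of filtering steps.
    """
    if len(signal) % (1 << depth) != 0:
        raise ValueError("signal length must be a multiple of 2**depth")
    bands = [list(signal)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = split(band)
            next_bands.extend((low, high))
        bands = next_bands
    return bands


# 1024 input samples and 6 layers give 64 subbands of 16 time slots each.
matrix = wavelet_packet([math.sin(0.05 * t) for t in range(1024)], 6)
```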
  • the result of the filtering is a set of samples that can be illustrated as a matrix having sixteen samples along a time axis and 64 samples along a frequency axis. An 8 by 8 section of such a matrix is shown in Figure.
  • the data in the "matrix" is taken by the encoder process 110 as its input.
  • the purpose of the encoder process 110 is to produce an output in which information about significant bits precedes information about bits of lesser significance. This could be achieved by a modified interleaving process as shown in Figure 15.
  • the encoder process scans the matrix to determine the largest value (step s101) and stores the integer part (n) of the base 2 logarithm of this value as a 5-bit code (step s102).
  • the encoder process appends the sign bit for each sample in the matrix to the stored value n (step s103).
  • the encoder process appends the value of the bit with the current significance to the previously stored values (step s104).
  • the stored data is sent to the file 113 (step s107).
  • This stored data constitutes one sample group and a plurality of such groups will represent a piece of music, for example, and be stored in a file together.
  • the data in the file 113 can be selectively read by a transmission process 111 in response to a request therefor from a receiver 2 which includes the file's name or some other identifier.
  • the transmission process 111 combines data from four sequential sample groups to produce one datagram and sends one datagram for every sample group. It will be appreciated that the first three datagrams will be padded with runs of zeros because there will be insufficient sample groups to provide real data.
  • the transmission process 111 reads and stores all of the bits of the nth sample group (step s122). The transmission process 111 determines how many bits this is by multiplying the value (m) represented by the first five bits plus one (to account for the sign bits) by 1024 and adding 5 (step s121).
  • the transmission process 111 calculates the integer part of ((m/2) + 1) x 1024 + 5 for the (n-1)th sample group (step s123) and reads and appends that number of bits of the (n-1)th sample group (step s124), headed by the (n-1)th sample group's first five bits.
  • the transmission process 111 calculates the integer part of ((m/4) + 1) x 1024 + 5 for the (n-2)th sample group (step s125) and reads and appends that number of bits of the (n-2)th sample group (step s126), headed by the (n-2)th sample group's first five bits.
  • the transmission process 111 calculates the integer part of ((m/4) + 1) x 1024 + 5 for the (n-3)th sample group (step s127) and reads and appends that number of bits of the (n-3)th sample group (step s128), headed by the (n-3)th sample group's first five bits.
  • step s129 The result is a string of bits similar to the datagram data part shown in Figure 10 but with the potential for variation in the lengths of the subsections.
  • This data is then compressed (step s129) using a convenient conventional technique and sent to the UDP socket 112 for transmission to the receiver 2 (step s130).
  • the receiver 2 in this example, comprises a UDP socket 116, a reception process 117, a decoder 118, an inverse wavelet packet transformer 119, a digital-to-analogue converter 120 and a loudspeaker 121.
  • the data in received datagrams is passed by the UDP socket 116 to the reception process 117.
  • the reception process 117 de-compresses the data and compensates for lost or out of sequence datagrams.
  • when the reception process 117 receives (step s130) a datagram conveying the content of the file 113 (Figure 15), it determines whether the datagram has a sequence position before the last received datagram (step s131). If this is the case, the newly received datagram is discarded (step s132). Otherwise, it decompresses the data in the datagram (step s133).
  • the reception process 117 determines the boundaries between data from different sample groups within the data.
  • the reception process reads the first 5 bits of the data, giving value m, and calculates (m + 1) x 1024 + 5 to get the zero-based position a1 of the first bit of the second section of the data (step s134).
  • the reception process then reads five bits starting from bit a1, giving value m1, and calculates ((m1/2) + 1) x 1024 + 5 to get the position a2 of the first bit of the third section (step s135), and reads five bits from bit a2 inclusive, giving value m2, and calculates ((m2/2) + 1) x 1024 + 5 to get the position a3 of the first bit of the fourth section (step s136).
  • If the datagram has a sequence position indicating that three or more datagrams have been sent to oblivion 6 by switch 5 (Figure 1) (step s137), the reception process 117 outputs bits a3 to the end of the data to the decoder process 118 (step s138). Otherwise, if the datagram has a sequence position indicating that two datagrams have been sent to oblivion 6 (step s139), the reception process 117 outputs bits a2 to a3-1 of the data part of the datagram to the decoder process 118 (step s140).
  • the reception process 117 outputs bits a1 to a2-1 of the data part of the datagram to the decoder process 118 (step s142).
  • the reception process 117 outputs bits 0 to a1-1 of the datagram's data part to the decoder process 118 (step s143).
  • the data compression function is performed by the encoder process 110 rather than the transmission process 111 and the transformer process 109 is a MDCT transformer process giving an output matrix of coefficients comprising two columns in the time direction and 512 rows in the frequency direction.
  • Sig_coeffs contains pointers to coefficients which have been found to be "significant" and is initially empty.
  • the Pending list contains pointers to coefficients found to be significant in the present iteration.
  • Ts_ptr comprises a respective pointer to the next coefficient to be checked for significance in each time slot.
  • the encoder process 110 determines and outputs resolution definitions, one per Bark scale band, which communicate the minimum resolutions required to ensure that quantisation noise is imperceptible (step s201). The encoder process 110 then finds the largest coefficient (step s202). If c_max is 0 (step s203), which indicates a silent frame, the process outputs 00000 (step s206) and terminates. However, if c_max is not 0, the process determines the code for the most significant bit position (N) which has a 1 in the binary representation of this number (step s204) using: where c_max is the largest coefficient value. The coefficients are floating point numbers.
  • If N is less than 1 (step s205), the process moves to step s206. If, however, N is not less than 1 (step s205), five bits representing the new value of N are output (step s207) and the samples are then processed on a time slot by time slot basis (steps s216 and s217). First it is determined whether there are any unmasked newly significant coefficients beyond the current Ts_ptr position in the current time slot (step s208). A newly significant coefficient is masked if its most significant 1 bit is below the resolution level set for the Bark band containing it. Such masked bits are treated at this point as zeros.
  • a 1 is output (step s209) and the length (R) of the run of insignificant coefficients from the Ts_ptr position to the significant coefficient is determined and run-length encoded (step s210).
  • the run-length code (step s211) and the sign bit of the significant coefficient are then output (step s212), and a pointer to the coefficient is added to the Pending list (step s213).
  • a zero is output (step s214) and the Ts_ptr pointer for the present time slot is changed to point to the first extant insignificant coefficient (step s215). It is then determined whether the last time slot has been processed (step s216). If not, the process moves on to the next time slot (step s217).
  • the Sig_coeffs list is looped through (step s220). While the Sig_coeffs list is being looped through, the Nth bit of the coefficient pointed to by the next list member is output (step s219), if the resolution data indicates that it has a material effect on the perception of the signal by a listener (step s218). After looping through the Sig_coeffs list has been completed, the contents of the Pending list are transferred to the Sig_coeffs list (step s221) and N is decremented by 1 (step s222).
  • step s223 flow returns to step s208 otherwise the process terminates.
  • the processing is finished when there is no more space available in the output datagram or datagram section, where packet loss recovery is being employed.
  • the size of the datagram data part is determined by the required bit rate, which may be established in the design of the system or dynamically during operation to reflect different fidelity requirements.
  • bits representing a particular significance level for a particular time slot terminate with a zero after the sign bit of the highest frequency significant bit. This removes long runs of zeros that would otherwise occur with audio signals, which tend to contain relatively little energy at higher frequencies. However, long runs of zeros can frequently occur before a significant coefficient or between significant coefficients.
  • these "preceding" and "interposed" zero runs are encoded using a run-length code.
  • bits which do not have a role in creating the listener's perception of the transmitted signal are not transmitted. Instead, bits defining the necessary resolutions for masked components in the bands of the Bark scale are transmitted.
  • the result of the encoding process 110 is a file or bitstream comprising a header and a plurality of data blocks.
  • the header comprises the resolution definition data and the 5-bit significance code (N).
  • the data blocks comprise the meaningful data from respective significance levels and within each block the data concerning coefficients becoming significant at the associated level precede data refining the values of coefficients that became significant at higher levels. It should also be noted that the number of Bark bands, i.e. 24, is much lower than the number of coefficients, i.e. 512, for each time slot.
  • each run-length code comprises a prefix and a suffix.
  • the prefix defines a range of values and the suffix the position within the range.
  • the prefixes and suffix lengths are as follows:-
  • the value represented by a code value is Prefix followed by Suffix value - R m ⁇ .
  • the data decompression function is again performed by the decoder process 118 rather than the reception process 117.
  • the decoder process 118 reconstructs the original coefficients.
  • Sig_coeffs contains pointers to coefficients which have been found to be "significant" and is initially empty.
  • the Pending list contains pointers to coefficients found to be significant in the present iteration.
  • Ts_ptr comprises pointers to the next coefficient to be processed in each time slot.
  • the resolution definitions are obtained from the first 96 bits of the datagram data (step s301) and the value N is obtained from the next five bits of the data part of the datagram (step s302). If N is zero (step s303), the coefficients are output to the transformer process 119 (step s304). If N is not zero (step s303) and there are more than 2 unprocessed datagram data bits (step s305), N is used to set a threshold at 2^(N-9) (step s306) and the incoming data is then processed in respect of the coefficients in time slot order (steps s308 and s321).
  • step s306 it is determined whether there are any unprocessed datagram data bits left (step s307). If the result at step s307 is yes, the processing of the next time slot is performed, otherwise the process moves to step s304.
  • processing starts from the coefficient pointed to by the relevant Ts_ptr member (s309) and the datagram data is tested (step s310) to determine whether the next bit is a 1 and that there are more than two bits left. If the answer at step s310 is no, the process moves on to the next time slot after resetting the Ts_ptr member for the current time slot (step s320). However, if the answer at step s310 is yes, it is determined whether the next bit is 0 (step s312). If so, it is determined whether there are more than 2 bits left in the data part of the datagram (step s313).
  • step s313 the run-length code prefix defaults initially to the prefix for the lowest range
  • step s311 If the answer at step s311 is no, it is determined, on the basis of the run-length code prefix, whether there are sufficient bits left in the datagram data to complete the run-length code (step s314). If there are not, the process returns to step s310 otherwise the number of coefficients indicated by the run-length code are skipped (step s315) and the next bit of the datagram data is read as the sign of the next coefficient (step s316) and the magnitude of the coefficient is set to the value of the threshold (step s317). The current coefficient is then added to the Pending list (step s318).
  • the members of Sig_coeffs are processed. For each member of Sig_coeffs (steps s322 and s326), the next bit of the datagram data is added as the Nth bit to the current coefficient (step s324), if the resolution definition data does not indicate that the value of the current coefficient is irrelevant (step s323).
  • N is decremented (step s326) and, if all the data in the received datagram has not been processed (step s327), the members of the Pending list are transferred to the Sig_coeffs list (step s328) and the process returns to step s305. If, however, all the data in the received datagram has been processed at step s327, the process moves to step s304 to output the coefficients.
  • a feature of the above-described decoder process 118 is that the input bitstream may be truncated at any point within the blocks of coefficient data without causing a failure.
  • the decoder does not actually need to be aware of the number of significance levels represented in the received signal.
  • the encoder can be operated to produce the number of bits required to fill the main or only data part in the present datagram and therefore data for as many significance levels as possible is transmitted, maximising the fidelity within the current bandwidth constraints.
  • the "old" data parts can be formed by truncating the data from a previous frame at a particular bit position rather than at a particular boundary between data for different significance levels.
  • a portable audio playback device 200 comprises a control circuit 201 in the form of microcomputer circuitry, a serial communication interface 202, a large flash ROM memory 203, a keypad 204 for controlling the operation of the device, an audio module 205 including a digital-to-analogue converter and a variable-gain amplifier, and a jack socket 206 for connecting the device to an earpiece.
  • the control circuit 201 includes an embedded version of the Linux operating system which includes an ftp daemon. By connecting the device to a personal computer (not shown), a user can transfer files or selected parts thereof to the memory 203 using the ftp protocol.
  • the transferred files are preferably in a format of the types produced by the encoder processes in the embodiments described above. Consequently, the user can trade fidelity, i.e. number of significance levels, against duration. Thus, the user may choose between a few high fidelity recordings or many lower fidelity recordings.
  • the control circuit 201 is also provided with a program for reading data from the memory 203 and decoding and, if necessary, transforming it.
  • the resultant time domain digital data is then sent to the audio module 205 for output as an analogue signal via the jack socket 206.
  • the present invention has been described solely in the context of single channel signals, such as monophonic audio.
  • the present invention can be applied to multichannel signals, e.g. stereophonic audio.
  • a multichannel signal may be sent with each channel being carried by a separate stream of packets, the packets for each channel being interleaved with each other.
  • each packet contain data from all of the channels with the data grouped according to significance so that the most significant data from all of the channels is grouped together and the next most significant data from all of the channels is grouped together and so on.
  • stereophonic signals may be sent as sum and difference signals. This also maintains compatibility with monophonic receiving or reproducing apparatus.
  • the left and right channel signals are added together to produce the sum signal, and subtracted one from the other to produce the difference signal. When there is correlation between the channels, the difference signal has a much smaller amplitude than the sum.
  • the sum and difference signal generation may be carried out either on the raw time-domain signals before any time-frequency transformation or on the transformed frequency domain versions of the signals.
  • the latter is more efficient and it is preferred because the correlation is generally greater in the frequency domain.
  • the sum and difference signals are then encoded, or encoded and compressed, by one of the methods above.
  • the total bitrate available may be divided up between the sum and difference signals in a fixed way, so that each receives a fixed proportion of the total bitrate, regardless of the signal characteristics.
  • the proportion allocated to the sum will always be higher, but the best compromise would need to be determined by experiment.
  • the encoding/compression for each frame is carried out simultaneously on the sum and difference signals.
  • Encoding stops when the sum of the bits used for the two parts is equal to the number dictated by the current bitrate. However, the initial threshold is the same for both, and will normally be dictated by the sum signal, since it will in general be greater. The result of this process is that sum and difference will be specified with equal precision but that the number of bits used for the two will vary according to how similar or different the two channels are.
  • each packet could contain sum and difference significance blocks interleaved or the sum and difference signals could be transmitted in separate packets. In both cases the packet loss recovery system described above can be employed.
  • the sum packets could be transmitted with a high service quality code and the difference packets with low service quality code. Then at times of high network traffic, a mono signal only would be available, but at times of low traffic, the full stereo signal would be provided.
  • An alternative approach takes advantage of two phenomena related to the perception of the stereophonic 'image'. At low frequencies, channels may be amalgamated without affecting the stereo image and, at high frequencies, the perception of the stereo image depends more on the temporal envelope of the signal than on the fine structure.
  • the subbands for the n channels may be replaced by a single set of subbands equal to the average values of the coefficients in those subbands.
  • a certain frequency typically 2-3 kHz
  • the separate subband coefficients are used to generate the initial significance information.
  • the subbands are averaged as above, and these averaged subbands are used to generate joint refinement information for all channels. In this way, the higher-frequency subbands are conveyed with different envelopes, but with the same fine structure.
  • the encoding or encoding and compression is then carried out using: (a) joint low-frequency subbands, (b) separate mid-frequency subbands, (c) separate high-frequency significance information (i.e. most significant bits) and (d) joint high-frequency refinement information (i.e. bits other than most significant).
  • time-frequency domain transformation may be adapted to the nature of the input signal and the transmission path.
  • the optimum division of the packet will depend on the packet loss characteristics of the channel through which packets are to be sent.
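
The sum and difference arrangement described above for stereophonic signals can be illustrated with a minimal sketch. It assumes integer samples; the function names `to_sum_difference` and `from_sum_difference` and the sample values are hypothetical and do not come from the specification.

```python
def to_sum_difference(left, right):
    """Form the sum and difference signals from left and right channels."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]
    return total, diff


def from_sum_difference(total, diff):
    """Recover left and right; s + d = 2*left and s - d = 2*right are always even."""
    left = [(s + d) // 2 for s, d in zip(total, diff)]
    right = [(s - d) // 2 for s, d in zip(total, diff)]
    return left, right


# Two strongly correlated channels (illustrative values only).
left = [100, -50, 30, 8]
right = [98, -52, 33, 8]
total, diff = to_sum_difference(left, right)

# Number of magnitude bit planes each signal would need.
sum_planes = max(abs(v) for v in total).bit_length()
diff_planes = max(abs(v) for v in diff).bit_length()
```

With a shared initial threshold set by the sum signal, the difference signal in this example has no significant bits until the two lowest planes are reached, so it would consume few bits of the frame budget.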

Abstract

In a data transfer system, the bits of a frame of a loss-tolerant digital signal, such as one representing speech or music, are grouped according to their significance. These groups may be transferred together or separately. The less significant bits need not be transferred, at the cost of some loss of fidelity, but the result is a scalable system which enables easy selection of points in a bandwidth-fidelity space for the transfer of data. In preferred forms, the transferred data is compressed before transfer. In order to overcome loss of packets in network communication, each packet contains data from a plurality of frames, preferably with older frames being represented with lower fidelity than more recent frames.

Description

ROBUST CODING FOR THE TRANSMISSION OF AUDIO OR VIDEO SIGNALS
Field of the Invention
The present invention relates to the transfer of digital data.
Background to the Invention
The Internet and other digital media are being used more and more for the transmission of data. The transmitted data falls into two categories. These categories comprise data that must be received, allowing for error correction, exactly as it was transmitted and data that, on reception, need only correspond with the transmitted data within certain tolerances, i.e. loss-tolerant signals. The first category includes document files, financial information and program code, for example Java applets. The second category comprises data that is primarily intended to be rendered perceptible to the human senses, for example photographic images, music and speech.
The present invention is concerned with the transmission of data in the second category.
Summary of the Invention
According to the present invention, there is provided a method of transmitting a loss-tolerant signal, the method comprising selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and transmitting the selected bits together.
According to the present invention, there is also provided an apparatus for transmitting a loss-tolerant signal, the apparatus comprising selection means for selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and means for transmitting the selected bits together.
Preferably, the selected bits have the significance of the most significant 1 bit in the group of digital codes. A further bit may be selected from each digital code, the selected further bits all having the same, lower significance, and transmitting the selected further bits together. Thus, the transmitted signal increases in fidelity as the number of times bits are selected increases. Conveniently, the selected further bits are transmitted after said selected bits. However, this is not essential but the selections should be made from an unbroken series of bit significances.
In the case of bipolar signals, it is preferable that the digital codes each comprise a sign bit and a plurality of magnitude bits, with the sign bits being accorded a significance equivalent to that of the most significant magnitude bits.
The step of selecting bits having the same significance may be repeated for different significance levels, said significance levels being selected in dependence on the bandwidth of a channel through which the bits are to be transmitted and being in an unbroken sequence. Thus, a single file containing, for example a piece of music, can be used to provide the piece of music to a remote listener with different degrees of fidelity simply by reading and transmitting the appropriate sub-set of bits from the file.
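The grouping of bits by significance described in the preceding paragraphs can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the codes are taken to be signed integers, the number of magnitude planes is taken from the most significant 1 bit in the group, and the names `to_bit_planes` and `select_planes` are hypothetical.

```python
def to_bit_planes(samples):
    """Group the bits of signed integer codes by significance.

    Returns (n, planes): n is the number of magnitude planes, set by the most
    significant 1 bit in the group; planes[0] holds the sign bits and
    planes[1:] the magnitude bits, most significant plane first.
    """
    n = max(abs(s) for s in samples).bit_length()
    planes = [[1 if s < 0 else 0 for s in samples]]
    for level in range(n - 1, -1, -1):
        planes.append([(abs(s) >> level) & 1 for s in samples])
    return n, planes


def select_planes(planes, count):
    """Keep the sign plane and the `count` most significant magnitude planes."""
    return planes[: 1 + count]


samples = [5, -3, 12, 0, -9, 7]
n, planes = to_bit_planes(samples)
coarse = select_planes(planes, 2)  # unbroken series: signs, bit 3, bit 2
```

Reading fewer planes from the same stored data gives a lower-fidelity version of the same signal, which is the scalability property the text relies on.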
A first subset of the bits, having significances in a first upper range, may be selected and transmitted in a first packet with a second subset of the bits, having significances in a second lower range, being selected and transmitted in a second packet, the packets including the same destination address. Thus, the first packets can be sent with, for example, approximately 25% of the data required for full fidelity to provide a preview to a user who can then request the second packets. The data from the second packets is added to the data from the first packets at the receiver to increase the fidelity of the signal presented to the user. In order to make the receiver more simple, the second packets may also include the data sent in the first packets.
In a network having a quality of service routing protocol, the first and second packets are preferably distinguished by respective quality of service codes in accordance with a quality of service routing protocol of a router in a path to the destination identified by the destination address. Consequently, the quality of service routing protocol is more likely to discard less significant data so that gaps in the received signal are less likely to occur, instead there will be short-duration reductions in fidelity which a listener, for example in the case of audio signals, may not even notice.
Preferably, the bits are transmitted in packets and each packet comprises bits selected from a respective first group of digital codes and bits selected from a respective second group of digital codes, the second group representing an earlier part of said time-varying signal than the first group. More preferably, a greater portion of the packet is given over to the bits selected from the first group than to the bits selected from the second group. The packets may comprise three or more sections generally of decreasing size and containing successively older data.
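A packet carrying sections of decreasing size for successively older groups might be assembled as in the sketch below. It assumes each group has already been encoded as a significance-ordered bit string, so that truncation discards only the least significant data. The fixed section budgets, the zero padding of absent groups and the name `build_packet` are illustrative assumptions.

```python
def build_packet(frames, n, budgets=(16, 8, 4)):
    """Assemble the data part of the packet sent for group n.

    frames  : list of significance-ordered bit strings, one per group
    budgets : bits allotted to group n, n-1, n-2, ... (decreasing)
    Each section is the leading part of a group's bit string, cut to its
    budget and padded with zeros where there is not enough real data.
    """
    sections = []
    for age, budget in enumerate(budgets):
        index = n - age
        stream = frames[index] if index >= 0 else ""
        sections.append(stream[:budget].ljust(budget, "0"))
    return "".join(sections)


frames = ["1" * 20, "10" * 10, "110" * 7]
first_packet = build_packet(frames, 0)   # older sections are all padding
third_packet = build_packet(frames, 2)   # newest group first, then older ones
```

Because the most significant bits come first in each bit string, the older sections are coarse but usable copies of earlier groups.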
Preferably, the transmitted bits are compressed.
A method according to the present invention may comprise re-ordering the bits of said digital codes so as to group the bits thereof by significance before selecting bits for transmission.
Preferably, for audio signals in particular, a method according to the present invention includes transforming time domain samples into frequency domain coefficients, wherein said digital codes comprise frequency domain coefficients. In the context of the present specification, "frequency domain coefficients" includes quasi-frequency domain coefficients such as produced by wavelet packet transforms as well as the coefficients produced by modified discrete cosine transformations (MDCT) and the like. Wavelet packet transforms produce "scale" coefficients. However, these coefficients are dependent on the spectral content of the transformed signal.
In the case of wavelet packet transforms, 64 coefficients in each of 16 time slots from 1024 time domain samples has been found to be effective for music. Each coefficient produced by the wavelet packet transform should be the result of the same number of filtering and decimation steps, such as a symmetrically branching tree arrangement of filter and decimator functional units. Each node having the same depth will have the same number of branches but the number of branches may differ between nodes at different depths.
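A symmetrically branching filter-and-decimate tree of this kind can be sketched with Haar filters. The choice of Haar filters is an assumption made only to keep the example short (the specification does not name the filters), and the functions `split` and `wavelet_packet` are hypothetical. Six levels of two-way splitting turn 1024 samples into 64 bands of 16 samples each.

```python
import math


def split(signal):
    """One stage: Haar low-pass and high-pass filtering, each down-sampled by 2."""
    r = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / r for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / r for i in range(0, len(signal), 2)]
    return low, high


def wavelet_packet(signal, depth):
    """Apply the split to every branch at every level (full binary tree)."""
    bands = [list(signal)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands  # bands are in tree order, not strictly in frequency order


frame = [math.sin(2 * math.pi * 5 * t / 1024) for t in range(1024)]
matrix = wavelet_packet(frame, depth=6)  # 64 rows (bands) x 16 columns (time slots)
```

Every coefficient here passes through the same number of filtering and decimation steps, as the paragraph above requires.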
In the case of MDCT, 512 coefficients in each of 2 time slots has been found to be effective for music. However, other configurations may be advantageous for other signals, e.g. electrocardiograph signals.
The bits to be transmitted may be compressed during a re-ordering process.
Preferably, the re-ordering comprises arranging the coefficients in a representation of a two-dimensional matrix having a separate column for each time slot and separate row for each frequency subband. More preferably, the rows are ordered by frequency subband. However, the rows may be ordered in whatever manner produces the best compression, which will be dependent on the spectral content of the original signal. Still more preferably, the re-ordering comprises for each significance level of the coefficients in each column in row order, replacing runs of zeros terminating at an edge row with a termination marker, e.g. 0, and/or the re-ordering comprises for each significance level of the coefficients in each column in row order, replacing runs of zeros through coefficients, having a most significant 1 bit in a yet unhandled significance level, and terminating before an edge row with a run-length code. If used, the run-length code may comprise a prefix defining a range and suffix defining a position within the range defined by the prefix.
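The termination marker and run-length code can be illustrated for one column at one significance level. The sketch assumes integer coefficients, skips coefficients that are already significant at a higher level, and uses a hypothetical prefix/suffix code, since the specification's own table of prefixes is not reproduced here. All function names are illustrative.

```python
def encode_run(r):
    """Hypothetical code: a unary prefix of k ones and a zero selects the range
    [2**k - 1, 2**(k + 1) - 2]; a k-bit suffix gives the position within it."""
    k = (r + 1).bit_length() - 1
    offset = r - (2 ** k - 1)
    suffix = format(offset, "b").zfill(k) if k else ""
    return "1" * k + "0" + suffix


def decode_run(bits, pos):
    """Inverse of encode_run; returns (run length, position after the code)."""
    k = 0
    while bits[pos] == "1":
        k += 1
        pos += 1
    pos += 1  # skip the 0 that ends the prefix
    offset = int(bits[pos:pos + k], 2) if k else 0
    return 2 ** k - 1 + offset, pos + k


def significance_pass(column, level):
    """Encode the coefficients whose most significant 1 bit is at `level`."""
    out, run = [], 0
    for c in column:
        msb = abs(c).bit_length() - 1  # -1 for a zero coefficient
        if msb > level:
            continue  # already significant at a higher level
        if msb == level:
            # flag 1, run of insignificant coefficients, then the sign bit
            out.append("1" + encode_run(run) + ("1" if c < 0 else "0"))
            run = 0
        else:
            run += 1
    out.append("0")  # termination marker replaces the trailing run of zeros
    return "".join(out)


def decode_pass(bits, column, level):
    """Set newly significant coefficients in `column` to +/- 2**level."""
    pos, idx = 0, 0
    while bits[pos] == "1":
        run, pos = decode_run(bits, pos + 1)
        while run > 0 or column[idx] != 0:
            if column[idx] == 0:
                run -= 1
            idx += 1
        column[idx] = (-1 if bits[pos] == "1" else 1) * (1 << level)
        pos += 1
        idx += 1
    return column


sparse = [0] * 6 + [5] + [0] * 9
sparse_bits = significance_pass(sparse, 2)
pair_bits = significance_pass([0, 0, 5, 0, -6, 0, 0, 0], 2)
```

In the sparse example, sixteen coefficient positions and one sign bit are conveyed in eight bits, and the decoder recovers each newly significant coefficient at the threshold magnitude for that level.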
Preferably, bits in significance levels above that containing the most significant 1 bit among the coefficients are discarded during re-ordering.
According to the present invention, there is further provided a method of receiving a loss-tolerant signal, the method comprising receiving bits of a first group of digital codes, re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes, and generating a time-varying signal using said second group of codes.
According to the present invention, there is yet further provided an apparatus for receiving a loss-tolerant signal, the apparatus comprising: receiving means for receiving bits of a first group of digital codes, the more significant bits of said codes preceding the less significant bits; means for re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes; and means for generating a time-varying signal using said second group of codes.
The receiving method and apparatus should be adapted to meet the requirements of the various signal forms produced in accordance with the transmission aspect of the present invention.
Preferably, the codes of the second group are padded with zeros in positions for which bits were not received, such that the digital codes of the second group have the same number of bits as the digital codes of the first group.
In an embodiment, the method according to the present invention comprises receiving a first subset of the bits of the digital codes of the first group, having significances in a first upper range, in a first packet and receiving a second subset of the bits of the digital codes of the first group, having significances in a second lower range, in a second packet, and appending the bits of the second subset to those of the first subset before re-ordering to produce the second group of digital codes.
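At the receiver, the zero padding and the appending of a second subset of bits can be sketched as below. The plane layout (sign plane first, then magnitude planes from the most significant down) and the name `planes_to_codes` are assumptions for illustration only.

```python
def planes_to_codes(planes, total_magnitude_bits):
    """Rebuild signed codes from received bit planes.

    planes[0] is the sign plane; planes[1:] are magnitude planes, most
    significant first. Planes that were not received are treated as zeros,
    so every rebuilt code has the full number of bits.
    """
    signs, magnitude_planes = planes[0], planes[1:]
    codes = []
    for i, sign in enumerate(signs):
        magnitude = 0
        for depth in range(total_magnitude_bits):
            bit = magnitude_planes[depth][i] if depth < len(magnitude_planes) else 0
            magnitude = (magnitude << 1) | bit
        codes.append(-magnitude if sign else magnitude)
    return codes


# First packet: sign plane and the two most significant magnitude planes.
first_packet = [[0, 1, 0, 0, 1, 0], [0, 0, 1, 0, 1, 0], [1, 0, 1, 0, 0, 1]]
# Second packet: the two remaining, less significant planes.
second_packet = [[0, 1, 0, 0, 0, 1], [1, 1, 0, 0, 1, 1]]

preview = planes_to_codes(first_packet, 4)
full = planes_to_codes(first_packet + second_packet, 4)
```

The preview values are the full values with their two lowest bits cleared; appending the second packet's planes before re-ordering restores the original codes.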
In another embodiment a method according to the present invention comprises receiving a plurality of packets, each packet comprises a first section followed by a second section, the first section comprising re-ordered bits from one group of digital codes and the second section comprising re-ordered bits from another group of digital codes, wherein the bits in the second section represent an earlier part of said time-varying signal than those in the first section. Preferably, such a method comprises the steps of:- receiving a first packet; re-ordering the bits in the first section for reproducing said time-varying signal; receiving a second packet; determining that an intervening packet has been lost; re-ordering the bits of the second section of the second packet for reproducing said time-varying signal; and thereafter re-ordering the bits of the first section of the second packet for reproducing said time-varying signal.
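The recovery steps listed above can be sketched as follows. The packet representation, a sequence number paired with a list of sections in which index 0 is the newest group, and the name `recover_frames` are assumptions; late packets are simply discarded.

```python
def recover_frames(packets):
    """Return (group index, data) pairs in playback order.

    packets: iterable of (sequence, sections); sections[0] carries the group
    with the packet's own sequence number, sections[1] the previous group at
    reduced fidelity, and so on.
    """
    played = []
    last = None
    for seq, sections in packets:
        if last is not None and seq <= last:
            continue  # late or duplicate packet: discard
        missing = 0 if last is None else seq - last - 1
        # Older sections stand in for the lost packets, oldest first.
        for age in range(min(missing, len(sections) - 1), 0, -1):
            played.append((seq - age, sections[age]))
        played.append((seq, sections[0]))
        last = seq
    return played


received = [
    (0, ["F0", ""]),
    (1, ["F1", "f0"]),
    # packet 2 is lost in the network
    (3, ["F3", "f2"]),
    (1, ["F1", "f0"]),  # arrives late and is ignored
    (4, ["F4", "f3"]),
]
playback = recover_frames(received)
```

Here the lower-fidelity copy `f2` carried in packet 3 fills the gap left by the lost packet 2, so playback continues with a brief drop in fidelity rather than a gap.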
Where the transmitted bits are from compressed, re-ordered frequency domain coefficients, the re-ordered bits are preferably represented in the form of a two-dimensional matrix having a column for each time slot and a row for each frequency subband represented by the coefficients and re-ordering comprises for each significance level the coefficients of each time slot, replacing a predetermined termination marker, if present, with a run of zeros terminating at an edge row.
Where the transmitted bits are from compressed, re-ordered frequency domain coefficients, re-ordering preferably comprises for each significance level of each time slot, determining the presence of a run-length code and, if a run-length code is present, replacing it with a run of zeros having a length determined by the run-length code. More preferably, the run-length code comprises a prefix defining a range and suffix defining a position within the range defined by the prefix.
Padding of untransmitted more significant bit positions of the first group digital codes may be required. In this case, a significance code may need to be obtained from the received signal.
The present invention may be embodied in an audio playback device, which may be portable, including memory means for storing bits received by the receiving means, wherein the re-ordering means is configured to re-order bits stored in the memory for generating a playback audio signal.
According to the present invention, there is yet further provided a method of operating a network routing node for routing a signal packet in which more significant bits precede less significant bits, the method comprising determining the bandwidth available in a path away from the node and, if the bandwidth is below a threshold value, truncating the data in said packet in dependence on the determined bandwidth. The method may be applied in a network having a TDMA wireless link to a terminal apparatus, wherein the bandwidth varies with the number of slots available in each frame for transmissions to the terminal apparatus.
Transmitters and receivers according to the present invention may take many forms. For instance, a voice over IP (VOIP) terminal can be made by combining the transmitting and receiving parts in the same apparatus.
Brief Description of the Drawings
Figure 1 shows a transmitter and a receiver connected by a network;
Figure 2 illustrates a transmitter according to the present invention;
Figure 3 illustrates the process of interleaving the bits of a group of digital codes;
Figure 4 illustrates a receiver according to the present invention;
Figure 5 illustrates a wireless link to a terminal apparatus;
Figure 6 is a flowchart illustrating the operation of a network node;
Figure 7 is a flowchart illustrating the operation of a receiver according to the present invention;
Figure 8 illustrates a transmitter according to the present invention;
Figure 9 is a flowchart illustrating the operation of a transmitter according to the present invention;
Figure 10 illustrates the format of a data part of a packet according to the present invention;
Figure 11 is a flowchart illustrating the operation of a receiver according to the present invention;
Figure 12 illustrates a receiver according to the present invention;
Figure 13 illustrates a wavelet packet transform operation;
Figure 14 illustrates the output of a wavelet packet transform operation;
Figure 15 is a flowchart illustrating the operation of a transmitter according to the present invention;
Figure 16 is a flowchart illustrating the operation of a receiver according to the present invention;
Figure 17 illustrates a receiver according to the present invention;
Figure 18 is a flowchart illustrating the operation of a receiver according to the present invention;
Figure 19 is a flowchart illustrating the operation of a transmitter according to the present invention;
Figure 20 is a simplified view of the data for one time-slot illustrating the compression effected by the method illustrated in Figure 21;
Figure 21 is a flowchart illustrating the operation of a receiver according to the present invention; and
Figure 22 shows a portable audio playback device according to the present invention.
Preferred Embodiments of the Invention
Preferred embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings.
Referring to Figure 1, in the general case, a transmission system comprises a transmitter 1, a receiver 2 and a transmission medium 3. The transmission medium 3 may be characterised as having a bandwidth (B) determined by the minimum bandwidth of the signal path, e.g. bottleneck 4, and a notional switch 5 that directs signal portions into oblivion 6. The switch 5 may operate stochastically or according to a pattern, e.g. EMC from vehicle ignition systems. In the context of the Internet packet loss has been found to follow a Pareto distribution in which the loss of a packet increases the likelihood of the next packet being lost. The present invention in its packet loss handling aspect has been found to be particularly good at dealing with packet loss according to a Pareto distribution. In either case, it can generally be assumed that the operation of the switch 5 is independent of any data signals being transmitted from the transmitter 1 to the receiver 2.
First Preferred Embodiment
Referring to Figure 2, in this example, the transmitter 1 comprises a computer provided with a source of audio signals 7, e.g. a magnetic tape recording or a microphone, an analogue-to-digital converter 8 and a hardware network interface (not shown). The computer also supports an interleaver process 9, a transmission process 10, a UDP socket 11 and a request processing process or processes (not shown).
The analogue-to-digital converter 8 digitises the output of the audio signal source 7 (Figure 3(a)) as signed 4-bit numbers (Figure 3(b)). (Four-bit numbers are used here in the interests of clarity and it will be appreciated that a larger number of bits, e.g. 10 or 16, would be desirable in many circumstances.) The interleaver process 9 reads the samples from the analogue-to-digital converter 8.
The interleaver process 9 processes the samples in groups of 32. The bits of the samples in a complete group are interleaved and then added to a file 12, created for the current audio signal, or supplied to the transmission process 10.
In the interleaver process 9, the sign bits and the most significant bits of the samples are first read and output. Then the second most significant bits of the samples are read and output and then the third most significant bits of the samples are read and output. Finally, the least significant bits of the samples are read and output. The result in this case is an interleaved bitstream such as that shown in Figure 3(c) which is either stored in the file 12 for later transmission or supplied to the transmission process 10.
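The interleaving order described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes sign-magnitude samples with four magnitude bits (which gives the 160 bits per group of 32 samples used later in the text), it assumes that all sign bits are written before the most significant magnitude bits, and the name interleave is hypothetical.

```python
def interleave(samples, magnitude_bits=4):
    """Return the interleaved bit list for one group of signed samples.

    Assumed layout: all sign bits, then one bit plane per significance
    level, from the most significant plane down to the least significant.
    """
    bits = [1 if s < 0 else 0 for s in samples]        # sign bits first
    for plane in range(magnitude_bits - 1, -1, -1):    # MSB plane first
        bits.extend((abs(s) >> plane) & 1 for s in samples)
    return bits


group = [5, -3, 0, 15] * 8      # 32 illustrative samples
stream = interleave(group)      # 32 sign bits + 4 planes of 32 bits
```

With this layout the last 32 bits of the stream hold only the least significant bits, which is why the tail of a group can be dropped at modest cost.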
A characteristic of interleaved sample groups is that their tails can be lopped without making the remaining data unusable. Referring to Figure 3(d), it can be seen that the form of the original waveform is largely retained even though the last 32 bits of a sample group have been lost. This characteristic is even more marked in the case of samples comprising larger numbers of bits.
The transmission process 10 sends interleaved sample data from the interleaver process 9 or a file 12 to a UDP socket for transmission to a requesting receiver 2. To achieve this the transmission process 10 is also provided with a channel class and a destination address. The operation of the transmission process 10 will be described in more detail below.
Referring to Figure 4, in this example, the receiver 2 comprises a computer provided with a digital-to-analogue converter 20, a loudspeaker 21 and a hardware network interface (not shown). The computer also supports a UDP socket 16, a reception process 17, a de-interleaver process 18 and a buffer process 19. UDP datagrams from the transmitter 1 are received by the UDP socket 16 and their data contents passed to the de-interleaver process 18 by the reception process 17. The de-interleaver process 18 de-interleaves the received data and passes the de-interleaved sample groups to the buffer process 19. The buffer process 19 ensures that, subject to the datagrams arriving at a rate greater than some system-dependent threshold rate, the samples are presented to the digital-to-analogue converter 20 at a constant rate and the output of the digital-to-analogue converter 20, and hence the output of the loudspeaker 21, faithfully reproduces the frequency components of the original signal.
The ability of the system to tolerate the truncation of interleaved sample groups means that the contents of datagrams can be tailored according to the bandwidth (B) of the signal path between the transmitter 1 and the receiver 2.
A user of the receiver 2 may wish to hear a piece of music stored in a file 12 at the transmitter 1. The user therefore instructs a process (not shown) running at the receiver 2 to send a request to the transmitter 1. This request includes the identity of the file 12 and a channel class. The channel class may be determined on the basis of the type of connection between the receiver 2 and the intervening network, e.g. dial-up, ISDN, WAP, and/or a locally determined actually achieved bitrate. At the transmitter 1, the request is received by a receiver process (not shown) which starts the transmission process 10 with the file and channel class as parameters. The transmission process 10 starts to read groups of interleaved samples from the file 12.
If the channel class indicates a high-bandwidth channel, the transmission process 10 includes all the bits of a respective interleaved sample group in each transmitted UDP datagram. However, if the channel belongs to a second lower bandwidth class, the transmission process 10 omits the final 32 bits of each interleaved sample group when generating the UDP datagram data parts. Similarly, if the channel belongs to a third, worse class, the final 64 bits of each interleaved sample group are omitted. Thus, it can be seen that a single data file can produce useable signals at different bit-rates very simply with little, if any, processing overhead.
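The class-dependent truncation can be illustrated with a short sketch. The class numbers 1 to 3 and the names below are assumptions made for the example; only the omitted tail lengths (0, 32 and 64 bits of a 160-bit group) come from the description.

```python
GROUP_BITS = 160
# Hypothetical class numbering: 1 = high bandwidth, 2 = lower, 3 = worst.
OMITTED_TAIL_BITS = {1: 0, 2: 32, 3: 64}


def datagram_data(interleaved_group, channel_class):
    """Return the leading bits of an interleaved group for a channel class."""
    keep = GROUP_BITS - OMITTED_TAIL_BITS[channel_class]
    return interleaved_group[:keep]


group = [1] * GROUP_BITS
sizes = [len(datagram_data(group, c)) for c in (1, 2, 3)]
```

Because the selection is a plain slice of the stored group, the same file serves every class without re-encoding.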
Referring again to Figure 4, when each datagram is received, the reception process 17 sends its data part to the de-interleaver process 18. If the number of data bits in the datagram is less than 160, the reception process 17 pads the end of the received data with zeros to make it up to 160 bits in total.
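A hedged sketch of the padding and de-interleaving follows. It assumes the same layout as the transmitter-side description (32 sign bits followed by four magnitude bit planes, most significant first); the function names are hypothetical.

```python
def pad_group(bits, group_bits=160):
    """Append zeros so that a truncated group regains its full length."""
    return list(bits) + [0] * (group_bits - len(bits))


def deinterleave(bits, n_samples=32, magnitude_bits=4):
    """Rebuild signed samples from sign bits followed by bit planes."""
    samples = []
    for i in range(n_samples):
        magnitude = 0
        for plane in range(magnitude_bits):            # MSB plane first
            magnitude = (magnitude << 1) | bits[n_samples * (1 + plane) + i]
        samples.append(-magnitude if bits[i] else magnitude)
    return samples


# A group in which every sample is 5 (binary 0101), received without its
# least significant plane: 32 sign bits, then planes 0, 1, 0.
received = [0] * 64 + [1] * 32 + [0] * 32
restored = deinterleave(pad_group(received))
```

Each sample comes back as 4 rather than 5, illustrating the graceful loss of precision that truncation causes.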
Second Preferred Embodiment
Referring to Figure 5, a mobile telephone network includes a serving GPRS support node (SGSN) 25 and a mobile station 26. GPRS (General Packet Radio Service) allocates time slots in a GSM frame for data transmission dynamically on the basis of traffic levels. Thus, a GPRS transmission will be allocated eight time slots per frame, if there is little traffic in the cell containing the source/destination mobile station 26 but will be allocated successively fewer time slots per frame as traffic in the cell increases.
The SGSN 25 is largely conventional save that it is configured, e.g. by programming a component computer, to identify UDP datagrams of the type produced by the transmitter 1 shown in Figure 2. To achieve this, the seventh bits of the "type of service" portions of the UDP headers are set to one by the UDP socket 11 and the SGSN 25 looks for these. Referring to Figure 6, when the SGSN 25 detects a UDP datagram of the subject type (step s1), it determines the number of slots per frame allocated to data communication with the destination mobile station 26 (step s2). If the number of slots per frame is less than 2 (step s3), the SGSN 25 removes a predetermined portion of the tail of each UDP datagram (step s4) before sending it on to the mobile station 26.
If necessary, one or more further thresholds may be used to trigger truncation of packets by differing amounts.
It will be appreciated that the threshold or thresholds triggering truncation of the UDP packets will depend on the packet sizes and the data capacity of each time slot.
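The slot-dependent truncation at the node might be sketched as below. The amounts removed are placeholders, since the description only calls them "predetermined"; the function name and the (threshold, bytes) table are likewise assumptions.

```python
def forward_payload(payload, slots_per_frame, thresholds=((2, 8),)):
    """Return the payload to forward, with its tail removed if slots are scarce.

    thresholds holds (slot threshold, bytes to remove) pairs; a pair applies
    when fewer slots than its threshold are allocated, and the largest
    applicable cut is used.
    """
    cut = 0
    for min_slots, tail_bytes in thresholds:
        if slots_per_frame < min_slots:
            cut = max(cut, tail_bytes)
    return payload[:len(payload) - cut]


packet = bytes(20)                        # a 160-bit data part
short = forward_payload(packet, 1)        # below the threshold of 2 slots
full = forward_payload(packet, 2)         # threshold not triggered
```

Passing several pairs, e.g. ((4, 4), (2, 8)), gives the multi-threshold variant mentioned above.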
Third Preferred Embodiment
A "quality of service" routing protocol has been proposed for TCP/IP transmissions. Presently, an overloaded router in a TCP/IP network will discard new incoming packets simply because it has run out of buffer space.
A "quality of service" approach marks each datagram with a service quality code and routers use this code to determine which packets to discard when they become overloaded. For instance, a router may store a table containing the service quality code for all of the datagrams in its input buffer. When the buffer is full and a new datagram arrives, the router reads its service quality code and then searches the table for the most recent datagram having a lower service quality code. If a datagram having a lower service quality code is located, it is discarded and the new datagram is stored in the router's input buffer. However, if no datagrams with a lower service quality code are found, the newly incoming datagram is discarded.
This discarding of packets is represented in Figure 1 by the switch 5.
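The discard policy just described can be sketched as follows. The sketch assumes that a larger number means a higher service quality and models the input buffer as a list in arrival order; both choices, and the name admit, are illustrative.

```python
def admit(buffer, new, capacity):
    """Try to store a (service_code, payload) datagram in the input buffer.

    buffer is a list in arrival order and is modified in place. Returns the
    datagram that was discarded, or None if nothing was discarded.
    """
    if len(buffer) < capacity:
        buffer.append(new)
        return None
    for i in range(len(buffer) - 1, -1, -1):   # most recent datagram first
        if buffer[i][0] < new[0]:
            dropped = buffer.pop(i)
            buffer.append(new)
            return dropped
    return new                                 # nothing of lower quality


queue = [(3, "a"), (1, "b"), (1, "c")]
first_drop = admit(queue, (2, "d"), capacity=3)
second_drop = admit(queue, (1, "e"), capacity=3)
```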
Referring again to Figure 2, a modified form of the transmitter 1 makes use of a "quality of service" routing protocol to advantage. In this case, the operation of the transmitter 1 is substantially as described in the case of the first preferred embodiment except for the operation of the transmission process 10. The transmission process 10 treats each interleaved sample group as four sub-groups. The first sub-group comprises the sign bits and the most significant bits and the second, third and fourth sub-groups comprise respectively the second, third and fourth most significant bits.
The bits of the first sub-group are sent to the UDP socket 11 for transmission in a first datagram marked with a high service quality code. The second, third and fourth sub-groups are sent to the UDP socket 11 separately for transmission in respective UDP datagrams with successively lower service quality codes.
Consequently, if the datagrams are routed through one or more overloaded routers, the datagrams with the most important information, i.e. the sign bits and the most significant magnitude bits, are routed in preference to the packets containing bits of lesser significance.
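A minimal sketch of the split into four datagrams is given below. The sub-group boundaries follow from the 160-bit group layout (64, 32, 32 and 32 bits); the numeric service quality codes, the dictionary keys and the function name are assumptions for illustration.

```python
SUB_GROUP_BOUNDS = ((0, 64), (64, 96), (96, 128), (128, 160))


def split_into_datagrams(interleaved_group, top_code=3):
    """Split one interleaved group into four datagram descriptions.

    The first sub-group receives the highest service quality code and each
    later sub-group a successively lower one.
    """
    datagrams = []
    for index, (start, end) in enumerate(SUB_GROUP_BOUNDS):
        datagrams.append({
            "service_code": top_code - index,
            "sub_group": index,
            "bits": interleaved_group[start:end],
        })
    return datagrams


parts = split_into_datagrams(list(range(160)))
```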
Referring again to Figure 4, the receiver 2 is modified for the present embodiment so that the reception process 17 will attempt to reconstruct the original interleaved sample groups and then pass the reconstructions to the de-interleaver process 18.
Referring to Figure 7, when the reception process 17 receives a UDP datagram (step s11) containing a first sub-group (step s12), the contents, if any, of a 160-bit buffer are output to the de-interleaver process 18 (step s13) and the buffer is set to contain all zeros (step s14). Once the buffer has been reset, the data from the datagram is stored in the first 64 bits of the buffer (step s15). If the next received datagram relates to a preceding sample group (step s16), it is discarded (step s17); otherwise the data from the new packet is stored in the appropriate place in the buffer (step s18), i.e. second sub-group data in bits 64 to 95, third sub-group data in bits 96 to 127 and fourth sub-group data in bits 128 to 159. Thus, as long as the datagrams containing the first sub-group data are received, a meaningful output can be produced by the loudspeaker, and loss of datagrams will tend to result not in annoying gaps in the output audio but in less annoying temporary drops in audio quality.
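The buffer handling of Figure 7 might look like the sketch below. It assumes that each datagram carries a sample group number and a sub-group index (0 to 3), which the description implies but does not specify, and it additionally discards sub-groups whose first sub-group never arrived; the class and method names are hypothetical.

```python
OFFSETS = {0: 0, 1: 64, 2: 96, 3: 128}    # buffer position of each sub-group


class Reassembler:
    def __init__(self):
        self.buffer = None     # 160-bit buffer for the current sample group
        self.group = -1        # number of the current sample group

    def receive(self, group, sub_group, data):
        """Store one datagram; return a finished group when a new one starts."""
        finished = None
        if sub_group == 0 and group > self.group:
            finished = self.buffer           # output the previous contents
            self.buffer = [0] * 160          # reset to all zeros
            self.group = group
        elif group != self.group:
            return None                      # stale or orphaned: discard
        start = OFFSETS[sub_group]
        self.buffer[start:start + len(data)] = data
        return finished


r = Reassembler()
r.receive(0, 0, [1] * 64)
r.receive(0, 2, [1] * 32)                    # second sub-group was lost
out = r.receive(1, 0, [0] * 64)              # next group flushes group 0
late = r.receive(0, 1, [1] * 32)             # arrives too late
```

The flushed group keeps zeros where the lost sub-group would have been, so it still de-interleaves to a coarser version of the samples.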
The performance of this embodiment can be improved by adding a buffer for temporarily storing datagrams before they are passed to the reception process 17. The buffer provides time for out of sequence datagrams to be received and then fed to the reception process 17 in their correct position.
The scalability of the signal format can be employed in a modification to the present embodiment. In the modified form, the quality of service code is omitted or is the same for all packets. A user of the receiver 2 can request a preview of an audio file from the transmitter 1. The transmitter 1 responds by sending the most significant data, e.g. 25% of the whole, in a first set of datagrams. The user can then play the received file and decide whether to download the full quality version. If the user decides to download the full quality version, the omitted data is sent in a second stream of datagrams.
When the second stream of datagrams has been received, the data parts from the two streams of datagrams are merged and stored for later play back.
Fourth Preferred Embodiment
Referring to Figure 8, a transmitter 1 comprises a memory 31, such as a hard disk drive, in which a plurality of audio data files 32, 33, 34 are stored. Each of the audio data files 32, 33, 34 comprises groups of interleaved samples, such as those produced by the interleaver process 9 shown in Figure 2.
The data in the files 32, 33, 34 can be selectively read by a transmission process 35 in response to a request therefor from a receiver 2 which includes the file's name or some other identifier. The transmission process 35 combines data from four sequential sample groups to produce one datagram and sends one datagram for every sample group. It will be appreciated that the first three datagrams will be padded with runs of zeros because there will be insufficient sample groups to provide real data.
The generation of the data part of an nth datagram will now be described with reference to Figure 9.
The transmission process 35 reads all 160 bits of the nth sample group (step s21). Then the transmission process 35 reads and appends the first 96 bits from the (n-1)th sample group (step s22), followed by the first 64 bits of the (n-2)th and (n-3)th sample groups (steps s23 and s24). The result is a datagram data part as shown in Figure 10. This data is then sent to a UDP socket 36 for transmission to the receiver 2 (step s25).
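The assembly of one data part from four consecutive sample groups can be sketched as follows. The section lengths (160, 96, 64 and 64 bits) come from the steps above; the function name and the use of an all-zero group for missing predecessors are assumptions consistent with the padding mentioned earlier.

```python
SECTION_BITS = (160, 96, 64, 64)     # groups n, n-1, n-2 and n-3
ZERO_GROUP = [0] * 160


def datagram_data_part(groups, n):
    """Build the data part of the nth datagram from interleaved groups."""
    part = []
    for age, length in enumerate(SECTION_BITS):
        source = groups[n - age] if n - age >= 0 else ZERO_GROUP
        part.extend(source[:length])
    return part


# Each group is filled with its own number so that the origin of every
# section is visible; real groups would of course contain bits.
groups = [[k] * 160 for k in range(1, 6)]
part = datagram_data_part(groups, 4)
first = datagram_data_part(groups, 0)
```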
In this example, the receiver 2 is structurally similar to that shown in Figure 4. However, it differs in the operation of the reception process 17.
Referring to Figure 11, when the reception process 17 receives (step s31) a datagram conveying the content of a file 32, 33, 34, it determines whether the datagram has a sequence position before the last received datagram (step s32). If this is the case, the newly received datagram is discarded (step s33).
If the datagram has a sequence position indicating that three or more datagrams have been sent to oblivion 6 by switch 5 (Figure 1) (step s34), the reception process 17 outputs bits 320 to 383 of the data part of the datagram and 96 zeros to the de-interleaver process 18 (step s35).
Then if the datagram has a sequence position indicating that two datagrams have been sent to oblivion 6 (step s36), the reception process 17 outputs bits 256 to 319 of the data part of the datagram and 96 zeros to the de-interleaver process 18 (step s37). Then if the datagram has a sequence position indicating that one datagram has been sent to oblivion 6 (step s38), the reception process 17 outputs bits 160 to 255 of the data part of the datagram and 64 zeros to the de-interleaver process 18 (step s39).
Finally and in any event, the reception process 17 outputs bits 0 to 159 of the datagram's data part to the de-interleaver process 18 (step s40).
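A hedged sketch of this recovery logic follows. It reads the oldest section as 64 bits (bits 320 to 383), matching the 64 bits appended for the (n-3)th group at the transmitter, and it treats the tests of Figure 11 as cumulative, so that a loss of three datagrams also triggers the two-lost and one-lost outputs. The names are hypothetical.

```python
# Bit ranges of the redundant sections, keyed by how many groups back they are.
SECTIONS = {1: (160, 256), 2: (256, 320), 3: (320, 384)}


def blocks_for_deinterleaver(data, lost):
    """Return the 160-bit blocks to de-interleave, oldest group first.

    data is the 384-bit data part as a list of bits; lost is the number of
    datagrams missing immediately before this one.
    """
    blocks = []
    for age in (3, 2, 1):
        if lost >= age:
            start, end = SECTIONS[age]
            section = data[start:end]
            blocks.append(section + [0] * (160 - len(section)))
    blocks.append(data[0:160])                # the current group, always
    return blocks


data = [1] * 160 + [1] * 96 + [0] * 64 + [1] * 64
none_lost = blocks_for_deinterleaver(data, 0)
one_lost = blocks_for_deinterleaver(data, 1)
many_lost = blocks_for_deinterleaver(data, 5)
```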
Whenever the de-interleaver process 18 receives a block of 160 bits, it carries out the inverse of the process performed by the interleaver process 9 in Figure 2 to reconstruct the original sample groups, as well as possible on the basis of the received datagrams, and sends the reconstructed samples to the loudspeaker 21 via the buffer process 19 and the digital-to-analogue converter 20.
It can be seen that when small numbers of datagrams are lost, the missing samples can be recreated, with some loss of fidelity, from data in the next received datagram.
Again, the performance of this embodiment can be improved by adding a buffer for temporarily storing datagrams before they are passed to the reception process 17. The buffer provides time for out of sequence datagrams to be received and then fed to the reception process 17 in their correct position.
In modifications of the foregoing first to fourth preferred embodiments, error correction coding and/or compression techniques may be employed.
The present invention has been illustrated in the first to fourth embodiments in the context of systems which transmit time domain samples of a time-varying signal.
However, it is preferable that such signals be sent as coefficients produced by a time to frequency domain transformation, e.g. a wavelet packet transform or a discrete cosine transform such as the Modified Discrete Cosine Transform (MDCT).
Fifth Preferred Embodiment
Referring to Figure 12, in this example, the transmitter 1 comprises a computer provided with a source of audio signals 107, e.g. a magnetic tape recording or a microphone, an analogue-to-digital converter 108 and a hardware network interface (not shown). The computer also supports a wavelet packet transformer process 109, an encoder process 110, a transmission process 111, a UDP socket 112 and a request processing process or processes (not shown).
The analogue-to-digital converter 108 digitises the output of the audio signal source 107 to produce signed 16-bit data. The wavelet packet transformer process 109 reads the samples from the analogue-to-digital converter 108 and outputs frequency domain coefficients which are encoded by the encoder process 110. The encoded coefficients are stored in a file 113 and then transmitted by the transmission process 111 and the UDP socket 112.
Referring to Figure 13, the wavelet packet transformer process 109 implements a seven layer (only three shown) digital filter tree structure. Each layer of the tree comprises a high-pass filter 121 and a low-pass filter 122 for each branch and a down-sample by 2 decimator 123 for decimating the output of each filter. The tree functions as an array of slightly overlapping bandpass filters with each filter in the last layer outputting samples for signal components in a substantially discrete frequency band. The result of the filtering is a set of samples that can be illustrated as a matrix having sixteen samples along a time axis and 64 samples along a frequency axis. An 8 by 8 section of such a matrix is shown in Figure 14.
The data in the "matrix" is taken by the encoder process 110 as its input. The purpose of the encoder process 110 is to produce an output in which information about significant bits precedes information about bits of lesser significance. This could be achieved by a modified interleaving process as shown in Figure 15.
Referring to Figure 15, the encoder process scans the matrix to determine the largest value (step s101) and stores the integer part (n) of the base 2 logarithm of this value as a 5-bit code (step s102). First, starting with the nth most significant bit, the encoder process appends the sign bit for each sample in the matrix to the stored value n (step s103). Then from the nth significant bit to the least significant bit (steps s105 and s106), scanning in the frequency direction in time order, the encoder process appends the value of the bit with the current significance to the previously stored values (step s104).
When the scanning is complete, the stored data is sent to the file 113 (step s107). This stored data constitutes one sample group and a plurality of such groups will represent a piece of music, for example, and be stored in a file together.
The data in the file 113 can be selectively read by a transmission process 111 in response to a request therefor from a receiver 2 which includes the file's name or some other identifier. The transmission process 111 combines data from four sequential sample groups to produce one datagram and sends one datagram for every sample group. It will be appreciated that the first three datagrams will be padded with runs of zeros because there will be insufficient sample groups to provide real data.
The generation of the data part of an nth datagram will now be described with reference to Figure 16.
The transmission process 111 reads and stores all of the bits of the nth sample group (step s122). The transmission process 111 determines how many bits this is by multiplying the value (m) represented by the first five bits plus one (to account for the sign bits) by 1024 and adding 5 (step s121).
Then the transmission process 111 calculates the integer part of ((m/2) + 1) x 1024 + 5 for the (n-1)th sample group (step s123) and reads and appends that number of bits of the (n-1)th sample group (step s124), headed by the (n-1)th sample group's first five bits.
Then the transmission process 111 calculates the integer part of ((m/4) + 1) x 1024 + 5 for the (n-2)th sample group (step s125) and reads and appends that number of bits of the (n-2)th sample group (step s126), headed by the (n-2)th sample group's first five bits. Finally, the transmission process 111 calculates the integer part of ((m/4) + 1) x 1024 + 5 for the (n-3)th sample group (step s127) and reads and appends that number of bits of the (n-3)th sample group (step s128), headed by the (n-3)th sample group's first five bits.
The result is a string of bits similar to the datagram data part shown in Figure 10 but with the potential for variation in the lengths of the subsections. This data is then compressed (step s129) using a convenient conventional technique and sent to the UDP socket 112 for transmission to the receiver 2 (step s130).
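The section lengths used when assembling the data part can be worked through in a short sketch. It assumes that m is read from each group's own 5-bit code and that every stored group begins with that code; the helper names are hypothetical, and the conventional compression step is omitted.

```python
PLANE_BITS = 1024            # one bit per coefficient in the 16 x 64 matrix
HEADER_BITS = 5              # the significance code at the head of a group
DIVISORS = (1, 2, 4, 4)      # applied to m for groups n, n-1, n-2 and n-3


def section_length(m, divisor):
    """Integer part of ((m / divisor) + 1) x 1024 + 5."""
    return (m * PLANE_BITS) // divisor + PLANE_BITS + HEADER_BITS


def read_m(bits):
    """Value of the 5-bit code at the head of a stored sample group."""
    return int("".join(str(b) for b in bits[:HEADER_BITS]), 2)


def build_data_part(groups):
    """Concatenate sections of groups n, n-1, n-2 and n-3 (given in that order)."""
    part = []
    for bits, divisor in zip(groups, DIVISORS):
        part.extend(bits[:section_length(read_m(bits), divisor)])
    return part


# A stored group with m = 2: the code 00010, then sign bits and two planes.
group = [0, 0, 0, 1, 0] + [1] * (3 * PLANE_BITS)
part = build_data_part([group] * 4)
```

Because each section starts with its group's own code, a receiver can walk the boundaries by reading five bits, computing the section length with the matching divisor, and skipping ahead.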
Referring to Figure 17, the receiver 2, in this example, comprises a UDP socket 116, a reception process 117, a decoder 118, an inverse wavelet packet transformer 119, a digital-to-analogue converter 120 and a loudspeaker 121.
The data in received datagrams is passed by the UDP socket 116 to the reception process 117. The reception process 117 de-compresses the data and compensates for lost or out of sequence datagrams.
Referring to Figure 18, when the reception process 117 receives (step s130) a datagram conveying the content of the file 113 (Figure 15), it determines whether the datagram has a sequence position before the last received datagram (step s131). If this is the case, the newly received datagram is discarded (step s132). Otherwise, it decompresses the data in the datagram (step s133).
The reception process 117 then determines the boundaries between data from different sample groups within the data. The reception process reads the first 5 bits of the data, giving value m, and calculates (m + 1) x 1024 + 5 to get the zero-based position a1 of the first bit of the second section of the data (step s134). The reception process then reads five bits starting from bit a1, giving value m1, and calculates ((m1/2) + 1) x 1024 + 5 to get the position a2 of the first bit of the third section (step s135) and reads five bits from bit a2 inclusive, giving value m2, and calculates ((m2/2) + 1) x 1024 + 5 to get the position a3 of the first bit of the fourth section (step s136). If the datagram has a sequence position indicating that three or more datagrams have been sent to oblivion 6 by switch 5 (Figure 1) (step s137), the reception process 117 outputs bits a3 to the end of the data to the decoder process 118 (step s138). Otherwise, if the datagram has a sequence position indicating that two datagrams have been sent to oblivion 6 (step s139), the reception process 117 outputs bits a2 to a3-1 of the data part of the datagram to the decoder process 118 (step s140). Otherwise, if the datagram has a sequence position indicating that one datagram has been sent to oblivion 6 (step s141), the reception process 117 outputs bits a1 to a2-1 of the data part of the datagram to the decoder process 118 (step s142).
Finally and in any event, the reception process 117 outputs bits 0 to a1-1 of the datagram's data part to the decoder process 118 (step s143).
Whenever the decoder process 118 receives a block of bits, it extends it with zeros to a length of 16 x 1024 + 1024 = 17408 bits, after removal of the first 5 bits. Where m is the value of the first 5 bits, (16 - m) x 1024 zeros are added to the front of the data and (m x 1024) - (initial data length - 5) zeros are added to the end of the data. The resultant bitstream is de-interleaved to produce 1024 16-bit sample values. These values are supplied to the inverse wavelet packet transformer process 119 which reconstructs the original time-domain samples and these are then converted into an analogue signal by the digital-to-analogue converter 120 for output by the loudspeaker 121.
Sixth Preferred Embodiment
Referring again to Figure 12, in another example, the data compression function is performed by the encoder process 110 rather than the transmission process 111 and the transformer process 109 is an MDCT transformer process giving an output matrix of coefficients comprising two columns in the time direction and 512 rows in the frequency direction.
The operation of the encoder process 110 for one frame will now be described with reference to Figure 19. At step s200, three lists Sig_coeffs, Pending and Ts_ptr are initialised. Sig_coeffs contains pointers to coefficients which have been found to be "significant" and is initially empty. The Pending list contains pointers to coefficients found to be significant in the present iteration. Ts_ptr comprises a respective pointer to the next coefficient to be checked for significance in each time slot.
When the lists have been initialised, the encoder process 110 determines and outputs resolution definitions, one per Bark scale band, which communicate the minimum resolutions required to ensure that quantisation noise is imperceptible (step s201). The encoder process 110 then finds the largest coefficient, cmax (step s202). If cmax is 0 (step s203), which indicates a silent frame, the process outputs 00000 (step s206) and terminates. However, if cmax is not 0, the process determines the code for the most significant bit position (N) which has a 1 in the binary representation of this number (step s204) using:
Figure imgf000023_0001
where cmιx is the largest coefficient value.. The coefficients are floating point numbers.
If N less than 1 (step s205), the process moves to step s206. If however N is not less than 1 (step s205), five bits representing the new value of N are output (step s207) and the samples are then processed on a time slot by time slot basis (steps s216 and s217). First it is determined whether there are any unmasked newly significant coefficients beyond the current Ts_ptr position in the current time slot (step s208). A newly significant coefficient is masked if its most significant 1 bit is below the resolution level set for the Bark band containing it. Such masked bits are treated at this point as zeros. If there are any unmasked newly significant coefficients at step s208, a 1 is output (step s209) and the length (R) of the run of insignificant coefficients from the Ts_ptr position to the significant coefficient is determined and run-length encoded (step. s210). The run-length code (step s211) and the sign bit of the significant coefficient are then output (step s212 , and a pointer to the coefficient is added to the Pending list (s213). If at step s208, the current time slot has no more newly significant members, a zero is output (step s214) and the Ts_ptr pointer for the present time slot is changed to point to the first extant insignificant coefficient (step s215). It is then determined whether the last time slot has been processed (step s216). If not, the process moves on to the next time slot (step s217).
If all of the time slots have been processed, the Sig_coeffs list is looped through (step s220). While the Sig_coeffs list is being looped through, the Nth bit of the coefficient pointer to by the next list member is output (step s219), if the resolution data indicates that it has a material effect on the perception of the signal by a listener (step s218). After looping through the Sig_coeffs list has been completed, the contents of the Pending list are transferred to the Sig_coeffs list (step s221) and N is decremented by 1 (step s222). Then, if the processing has not finished, (step s223), flow returns to step s208 otherwise the process terminates. The processing is finished when there is no more space available in the output datagram or datagram section, where packet loss recovery is being employed. The size of the datagram data part is determined by the required bit rate which may established in the design of the system or dynamically during operation to reflect different fidelity requirements.
From the foregoing, it will be apparent that the bits representing a particular significance level for a particular time slot terminate with a zero after the sign bit of the highest frequency significant bit. This removes long runs of zeros than would otherwise occur with audio signals which tend to contain relatively little energy at higher frequencies. However, long runs of zeros can frequently occur before a significant coefficient or between significant coefficients. In the present embodiment, these "preceding" and "interposed" zero runs are encoded using a run-length code. Furthermore, bits which do not have a role in creating the listener's perception of the transmitted signal are not transmitted. Instead, bits defining the necessary resolutions for masked components in the bands of the Bark scale are transmitted. The result of the encoding process 110 is a file or bitstream comprising a header and a plurality of data blocks. The header comprises the resolution definition data and the 5-bit significance code (N). The data blocks comprise the meaningful data from respective significance levels and within each block the data concerning coefficients becoming significant at the associated level precede data refining the values of coefficients that became significant at higher levels. It should also be noted that the number of Bark bands, i.e. 24, is much lower than the number of coefficients, i.e. 512, for each time slot.
Referring to Figure 20, in which sixteen illustrative 8-bit samples of one time slot are written vertically with the most significant bit uppermost and in which frequency increases from left to right, it can be seen that applying the above-described encoding process eliminates the transmission of runs of zeros at the "tail" (area A) of each time slot, and that run-length coding is used for runs of zeros preceding newly significant coefficients.
In the present example, each run-length code comprises a prefix and a suffix. The prefix defines a range of values and the suffix the position within the range. In the present case, the prefixes and suffix lengths are as follows:-
[Table: run-length code prefixes and suffix lengths; original image imgf000025_0001 not reproduced]
The code for a value is the prefix followed by a suffix equal to the value minus Rmin, the minimum of the range defined by the prefix. Thus, 5 = 01 100 and 330 = 0000 011100001.
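The prefix and suffix scheme can be sketched in Python as follows. The table of prefixes and suffix lengths is not reproduced in this text, so the `RANGES` table below is hypothetical: only the rows for prefixes `01` and `0000` are constrained by the two worked examples (5 and 330), and the two middle rows are assumptions chosen so that the ranges are contiguous. The function names are illustrative as well.

```python
# Hypothetical range table: (prefix, suffix length in bits, minimum of range).
# Only the first and last rows are consistent with the worked examples in the
# text; the middle rows are assumptions made for illustration.
RANGES = [
    ("01", 3, 1),      # runs 1..8, consistent with 5 -> 01 100
    ("001", 5, 9),     # assumed: runs 9..40
    ("0001", 6, 41),   # assumed: runs 41..104
    ("0000", 9, 105),  # runs 105..616, consistent with 330 -> 0000 011100001
]


def encode_run(length):
    """Return the prefix followed by (length - range minimum) as a suffix."""
    for prefix, nbits, r_min in RANGES:
        if r_min <= length < r_min + (1 << nbits):
            return prefix + format(length - r_min, "0{}b".format(nbits))
    raise ValueError("run length outside the table")


def decode_run(bits):
    """Return (run length, number of bits consumed) for a code at the start of bits."""
    for prefix, nbits, r_min in RANGES:
        if bits.startswith(prefix):
            suffix = bits[len(prefix):len(prefix) + nbits]
            if len(suffix) < nbits:
                raise ValueError("not enough bits to complete the code")
            return r_min + int(suffix, 2), len(prefix) + nbits
    raise ValueError("no matching prefix")
```

Because the suffix length is known as soon as the prefix has been read, a decoder can check whether enough bits remain to complete the code before consuming any of them.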
Referring again to Figure 17, in the present example, the data decompression function is again performed by the decoder process 118 rather than the reception process 117. Referring to Figure 21, the decoder process 118 reconstructs the original coefficients.
At step s300, three lists Sig_coeffs, Pending and Ts_ptr are initialised and the output coefficients are each set to "0000 0000 0000 0000". Sig_coeffs contains pointers to coefficients which have been found to be "significant" and is initially empty. The Pending list contains pointers to coefficients found to be significant in the present iteration. Ts_ptr comprises pointers to the next coefficient to be processed in each time slot.
When the lists have been initialised, the resolution definitions are obtained from the first 96 bits of the datagram data (step s301) and the value N is obtained from the next five bits of the data part of the datagram (step s302). If N is zero (step s303), the coefficients are output to the transformer process 119 (step s304). If N is not zero (step s303) and there are more than 2 unprocessed datagram data bits (step s305), N is used to set a threshold at 2^(N-1) (step s306) and the incoming data is then processed in respect of the coefficients in time slot order (steps s308 and s321). When the threshold has been set (step s306), it is determined whether there are any unprocessed datagram data bits left (step s307). If the result at step s307 is yes, the processing of the next time slot is performed, otherwise the process moves to step s304.
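The header read at steps s301 and s302 can be sketched as below. This is an illustration only: the text gives 96 bits of resolution definitions and 24 Bark bands, so 4 bits per band is an inference rather than a stated figure, and the MSB-first bit order, the representation of the data as a string of "0" and "1" characters, and the name `parse_header` are all assumptions.

```python
BARK_BANDS = 24      # number of Bark bands given in the text
BITS_PER_BAND = 4    # assumed: 96 header bits / 24 bands
N_BITS = 5           # width of the significance code N given in the text


def parse_header(bits):
    """Split a bit string into (resolution values, N, remaining data bits)."""
    header_len = BARK_BANDS * BITS_PER_BAND + N_BITS
    if len(bits) < header_len:
        raise ValueError("datagram too short to contain the header")
    resolutions = [
        int(bits[i * BITS_PER_BAND:(i + 1) * BITS_PER_BAND], 2)
        for i in range(BARK_BANDS)
    ]
    start = BARK_BANDS * BITS_PER_BAND
    n = int(bits[start:start + N_BITS], 2)
    return resolutions, n, bits[header_len:]
```

A caller would then branch on `n`: zero means there is no coefficient data to process, and any other value selects the first significance level to decode.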
For each time slot, processing starts from the coefficient pointed to by the relevant Ts_ptr member (s309) and the datagram data is tested (step s310) to determine whether the next bit is a 1 and that there are more than two bits left. If the answer at step s310 is no, the process moves on to the next time slot after resetting the Ts_ptr member for the current time slot (step s320). However, if the answer at step s310 is yes, it is determined whether the next bit is 0 (step s312). If so, it is determined whether there are more than 2 bits left in the data part of the datagram (step s313). If there are less than two bits left, the process returns to step s310, otherwise the next run-length code prefix is selected (step s313) (the run-length code prefix defaults initially to the prefix for the lowest range) and the process returns to step s312.
If the answer at step s311 is no, it is determined, on the basis of the run-length code prefix, whether there are sufficient bits left in the datagram data to complete the run-length code (step s314). If there are not, the process returns to step s310 otherwise the number of coefficients indicated by the run-length code are skipped (step s315) and the next bit of the datagram data is read as the sign of the next coefficient (step s316) and the magnitude of the coefficient is set to the value of the threshold (step s317). The current coefficient is then added to the Pending list (step s318).
If it is determined that all of the time slots have been processed for the present threshold at step s320, the members of Sig_coeffs are processed. For each member of Sig_coeffs (steps s322 and s326), the next bit of the datagram data is added as the Nth bit to the current coefficient (step s324), if the resolution definition data does not indicate that the value of the current coefficient is irrelevant (step s323).
When all of the members of Sig_coeffs have been processed, N is decremented (step s326) and, if all the data in the received datagram has not been processed (step s327), the members of the Pending list are transferred to the Sig_coeffs list (step s328) and the process returns to step s305. If, however, all the data in the received datagram has been processed at step s327, the process moves to step s304 to output the coefficients.
A feature of the above-described decoder process 118 is that the input bitstream may be truncated at any point within the blocks of coefficient data without causing a failure. Thus, the decoder does not actually need to be aware of the number of significance levels represented in the received signal. This also means that the encoder can be operated to produce the number of bits required to fill the main or only data part in the present datagram and therefore data for as many significance levels as possible is transmitted, maximising the fidelity within the current bandwidth constraints. In the case of the multipart datagrams, the "old" data parts can be formed by truncating the data from a previous frame at a particular bit position rather than at a particular boundary between data for different significance levels.
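The truncation tolerance described above can be illustrated with a deliberately simplified Python sketch. It is not the process of Figure 21: run-length codes, termination markers, time slots and resolution data are all omitted, and each insignificant coefficient simply costs one bit per level. What it retains is the ordering of the stream, in which each significance level carries newly significant coefficients (flag plus sign) followed by refinement bits for coefficients found at higher levels. The names `encode`, `decode` and `_BitReader` are illustrative.

```python
def encode(coeffs, n_planes):
    """Emit bits level by level, most significant level first."""
    out = []
    sig = []                                  # significant at a higher level
    insignificant = list(range(len(coeffs)))
    for plane in range(n_planes - 1, -1, -1):
        mask = 1 << plane
        pending, still = [], []
        for i in insignificant:               # significance pass
            if abs(coeffs[i]) & mask:
                out.append("1")
                out.append("1" if coeffs[i] < 0 else "0")   # sign bit
                pending.append(i)
            else:
                out.append("0")
                still.append(i)
        for i in sig:                         # refinement pass
            out.append("1" if abs(coeffs[i]) & mask else "0")
        sig += pending
        insignificant = still
    return "".join(out)


class _BitReader:
    def __init__(self, bits):
        self.bits, self.pos = bits, 0

    def read(self):
        if self.pos >= len(self.bits):
            raise EOFError
        bit = self.bits[self.pos] == "1"
        self.pos += 1
        return bit


def decode(bits, count, n_planes):
    """Mirror of encode(); stops cleanly wherever the bit string ends."""
    mags, neg = [0] * count, [False] * count
    sig, insignificant = [], list(range(count))
    reader = _BitReader(bits)
    try:
        for plane in range(n_planes - 1, -1, -1):
            mask = 1 << plane
            pending, still = [], []
            for i in insignificant:
                if reader.read():
                    neg[i] = reader.read()    # sign is read before magnitude is set
                    mags[i] = mask
                    pending.append(i)
                else:
                    still.append(i)
            for i in sig:
                if reader.read():
                    mags[i] |= mask
            sig += pending
            insignificant = still
    except EOFError:
        pass                                  # truncated input: keep what was decoded
    return [-m if n else m for m, n in zip(mags, neg)]
```

In this sketch every additional received bit can only add a correct magnitude bit to some coefficient, so the reconstruction error never grows as more of the stream arrives.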
Seventh Preferred Embodiment
Referring to Figure 22, a portable audio playback device 200 comprises a control circuit 201 in the form of microcomputer circuitry, a serial communication interface 202, a large flash ROM memory 203, a keypad 204 for controlling the operation of the device, an audio module 205 including a digital-to-analogue converter and a variable-gain amplifier, and a jack socket 206 for connecting the device to an earpiece.
The control circuit 201 includes an embedded version of the Linux operating system which includes an ftp daemon. By connecting the device to a personal computer (not shown), a user can transfer files or selected parts thereof to the memory 203 using the ftp protocol.
The transferred files are preferably in a format of the types produced by the encoder processes in the embodiments described above. Consequently, the user can trade fidelity, i.e. number of significance levels, against duration. Thus, the user may choose between a few high fidelity recordings or many lower fidelity recordings.
The control circuit 201 is also provided with a program for reading data from the memory 203 and decoding and, if necessary, transforming it. The resultant time domain digital data is then sent to the audio module 205 for output as an analogue signal via the jack socket 206.
Preferred Multichannel and Stereophonic Embodiments
In the foregoing, the present invention has been described solely in the context of single channel signals, such as monophonic audio. However, the present invention can be applied to multichannel signals, e.g. stereophonic audio. A multichannel signal may be sent with each channel being carried by a separate stream of packets, the packets for each channel being interleaved with each other. However, it is preferred that each packet contain data from all of the channels with the data grouped according to significance so that the most significant data from all of the channels is grouped together and the next most significant data from all of the channels is grouped together and so on.
In order to exploit the correlation between the left and right channels, stereophonic signals may be sent as sum and difference signals. This also maintains compatibility with monophonic receiving or reproducing apparatus. The left and right channel signals are added together to produce the sum signal, and subtracted one from the other to produce the difference signal. When there is correlation between the channels, the difference signal has a much smaller amplitude than the sum.
The sum and difference signal generation may be carried out either on the raw time-domain signals before any time-frequency transformation or on the transformed frequency domain versions of the signals. The latter is more efficient and it is preferred because the correlation is generally greater in the frequency domain.
The sum and difference signals are then encoded, or encoded and compressed, by one of the methods above.
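The sum and difference generation described above can be sketched in Python as follows. The direction of the subtraction (left minus right), the absence of any scaling, and the function names are assumptions; the text does not specify them. Integer values are used so that the round trip is exact.

```python
def to_sum_diff(left, right):
    """Form sum and difference signals from left and right channel values."""
    total = [l + r for l, r in zip(left, right)]
    diff = [l - r for l, r in zip(left, right)]   # assumed: left minus right
    return total, diff


def from_sum_diff(total, diff):
    """Recover left and right; total + diff = 2*left and total - diff = 2*right."""
    left = [(s + d) // 2 for s, d in zip(total, diff)]
    right = [(s - d) // 2 for s, d in zip(total, diff)]
    return left, right
```

For two closely correlated channels such as `[100, -50, 7]` and `[98, -52, 7]`, the difference values are small compared with the sum values, which is the property the coding relies on. A monophonic receiver can use the sum signal alone.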
In one approach, the total bitrate available may be divided up between the sum and difference signals in a fixed way, so that each receives a fixed proportion of the total bitrate, regardless of the signal characteristics. The proportion allocated to the sum will always be higher, but the best compromise would need to be determined by experiment.
Alternatively, the encoding/compression for each frame is carried out simultaneously on the sum and difference signals. Encoding stops when the sum of the bits used for the two parts is equal to the number dictated by the current bitrate. However, the initial threshold is the same for both, and will normally be dictated by the sum signal, since it will in general be greater. The result of this process is that sum and difference will be specified with equal precision but that the number of bits used for the two will vary according to how similar or different the two channels are.
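One way to picture this shared bit budget is the Python sketch below. It assumes that each signal has already been encoded into one bit string per significance level, starting from the same initial threshold, and that within each level the sum block is written before the difference block; that ordering, the block representation and the name `interleave_to_budget` are illustrative assumptions rather than details from the text.

```python
def interleave_to_budget(sum_blocks, diff_blocks, budget):
    """Interleave per-level blocks until exactly `budget` bits have been used.

    Returns the output bit string and the number of bits taken from each signal.
    """
    out = []
    used = {"sum": 0, "diff": 0}
    remaining = budget
    for sum_block, diff_block in zip(sum_blocks, diff_blocks):
        for name, block in (("sum", sum_block), ("diff", diff_block)):
            take = block[:remaining]      # truncate the block if the budget runs out
            out.append(take)
            used[name] += len(take)
            remaining -= len(take)
            if remaining == 0:
                return "".join(out), used
    return "".join(out), used
```

Both signals reach the same significance level when the budget is exhausted, while the split of bits between them depends on how much data each level of each signal happens to need.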
The two parts can then be transmitted over a network in a number of ways, e.g. each packet could contain sum and difference significance blocks interleaved or the sum and difference signals could be transmitted in separate packets. In both cases the packet loss recovery system described above can be employed.
In a network with quality of service guarantees, the sum packets could be transmitted with a high service quality code and the difference packets with a low service quality code. Then at times of high network traffic, a mono signal only would be available, but at times of low traffic, the full stereo signal would be provided.
An alternative approach takes advantage of two phenomena related to the perception of the stereophonic 'image'. At low frequencies, channels may be amalgamated without affecting the stereo image and, at high frequencies, the perception of the stereo image depends more on the temporal envelope of the signal than on the fine structure.
These are exploited as follows.
Below a certain frequency (a few hundred Hz) the subbands for the n channels may be replaced by a single set of subbands equal to the average values of the coefficients in those subbands. Above a certain frequency (typically 2-3 kHz), the separate subband coefficients are used to generate the initial significance information. The subbands are averaged as above, and these averaged subbands are used to generate joint refinement information for all channels. In this way, the higher-frequency subbands are conveyed with different envelopes, but with the same fine structure. The encoding or encoding and compression is then carried out using: (a) joint low-frequency subbands, (b) separate mid-frequency subbands, (c) separate high-frequency significance information (i.e. most significant bits) and (d) joint high-frequency refinement information (i.e. bits other than most significant).
In order to maintain the scalability of the bitstream, the coding for items (b) and (c) above is carried out in an interleaved fashion.
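The division into items (a) to (d) can be sketched in Python as below. The cutoff values of 300 Hz and 2500 Hz are hypothetical picks from the ranges quoted in the text ("a few hundred Hz" and "2-3 kHz"), each subband is reduced to a single coefficient per channel for brevity, and the name `plan_subbands` and the tuple labels are illustrative.

```python
LOW_CUTOFF_HZ = 300.0     # assumed value for "a few hundred Hz"
HIGH_CUTOFF_HZ = 2500.0   # assumed value within "2-3 kHz"


def plan_subbands(channels, centre_hz):
    """Decide, per subband, which data is coded jointly and which separately.

    channels: one list of subband coefficients per channel.
    centre_hz: centre frequency of each subband.
    """
    plan = []
    for band, freq in enumerate(centre_hz):
        values = [channel[band] for channel in channels]
        average = sum(values) / len(values)
        if freq < LOW_CUTOFF_HZ:
            # (a) a single averaged subband replaces the per-channel subbands
            plan.append(("joint", average))
        elif freq <= HIGH_CUTOFF_HZ:
            # (b) each channel keeps its own coefficients
            plan.append(("separate", values))
        else:
            # (c) per-channel significance, (d) refinement from the average
            plan.append(("separate_significance_joint_refinement", values, average))
    return plan
```

The high-frequency case keeps the per-channel values only for deciding when each coefficient becomes significant, which preserves the separate envelopes, while the shared average supplies the fine structure.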
It will be appreciated that many modifications may be made to the embodiments described above. In particular, the time-frequency domain transformation may be adapted to the nature of the input signal and the transmission path. In the case of the multisection packets as shown in Figure 10, the optimum division of the packet will depend on the packet loss characteristics of the channel through which packets are to be sent.
In the foregoing, embodiments of the present invention have been described with reference to UDP in TCP/IP networks. It is to be understood that the present invention is not limited in its application to any one form of digital communications network and may be applied to TCP in TCP/IP networks and other forms of network, e.g. ATM networks.

Claims
1. A method of transmitting a loss-tolerant signal, the method comprising selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and transmitting the selected bits together.
2. A method according to claim 1, comprising selecting a further bit from each digital code, the selected further bits all having the same, lower significance, and transmitting the selected further bits together.
3. A method according to claim 2, wherein said selected further bits are transmitted after said selected bits.
4. A method according to claim 1, 2 or 3, wherein the digital codes each comprise a sign bit and a plurality of magnitude bits, the sign bits being accorded a significance equivalent to that of the most significant magnitude bits.
5. A method according to any preceding claim, wherein the step of selecting bits having the same significance is repeated for different significance levels, said significance levels being selected in dependence on the bandwidth of a channel through which the bits are to be transmitted and being in an unbroken sequence.
6. A method according to any one of claims 1 to 4, wherein a first subset of the bits, having significances in a first upper range, is selected and transmitted in a first packet and a second subset of the bits, having significances in a second lower range, is selected and transmitted in a second packet, the packets including the same destination address and being distinguished by respective quality of service codes in accordance with a quality of service routing protocol of a router in a path to the destination identified by the destination address.
7. A method according to any preceding claim, wherein the bits are transmitted in packets and each packet comprises bits selected from a respective first group of digital codes and bits selected from a respective second group of digital codes, the second group representing an earlier part of said time-varying signal than the first group.
8. A method according to any preceding claim, including compressing the transmitted bits.
9. A method according to any preceding claim, comprising re-ordering the bits of said digital codes so as to group the bits thereof by significance before selecting bits for transmission.
10. A method according to any one of claims 1 to 7, including transforming time domain samples into frequency domain coefficients, wherein said digital codes comprise frequency domain coefficients.
11. A method according to claim 10, wherein the time domain samples are transformed into frequency domain coefficients by a wavelet packet transform.
12. A method according to claim 10, wherein the time domain samples are transformed into frequency domain coefficients by a modified discrete cosine transform.
13. A method according to any one of claims 10, 11 or 12, wherein the selected bits are compressed.
14. A method according to claim 13, comprising re-ordering the bits of said digital codes so as to group the bits thereof by significance before selecting bits for transmission.
15. A method according to claim 14, wherein the bits are compressed during reordering.
16. A method according to claim 15, wherein re-ordering comprises arranging the coefficients in a representation of a two-dimensional matrix having a separate column for each time slot and separate row for each frequency subband.
17. A method according to claim 16, wherein the rows are ordered by frequency subband.
18. A method according to claim 16 or 17, wherein re-ordering comprises for each significance level of the coefficients in each column in row order, replacing runs of zeros terminating at an edge row with a termination marker.
19. A method according to claim 18, wherein the termination marker is a 0.
20. A method according to any one of claims 16 to 18, wherein re-ordering comprises for each significance level of the coefficients in each column in row order, replacing runs of zeros through coefficients, having a most significant 1 bit in a yet unhandled significance level, and terminating before an edge row with a run- length code.
21. A method according to claim 20, wherein the run-length code comprises a prefix defining a range and suffix defining a position within the range defined by the suffix.
22. A method according to any one of claims 16 to 21, wherein bits in significance levels above that containing the most significant 1 bit among the coefficients are discarded during re-ordering.
23. An apparatus for transmitting a loss-tolerant signal, the apparatus comprising selection means for selecting a bit from each digital code in a group of digital codes representing a time-varying signal at a plurality of instants, the selected bits all having the same significance, and means for transmitting the selected bits together.
24. An apparatus according to claim 23, wherein the selection means is configured for selecting a further bit from each digital code, the selected further bits all having the same, lower significance, and the means for transmitting is configured for transmitting the selected further bits together.
25. An apparatus according to claim 24, wherein said selected further bits are transmitted after said selected bits.
26. An apparatus according to claim 23, 24 or 25, wherein the digital codes each comprise a sign bit and a plurality of magnitude bits, the sign bits being accorded a significance equivalent to that of the most significant magnitude bits.
27. An apparatus according to any one of claims 23 to 26, including means for determining a channel quality level for a channel through which the selected bits are to be transmitted, wherein the means for selecting bits is configured for repeatedly selecting bits having the same significance for different significance levels, the significance levels being selected in dependence on the channel quality level determined by the means for determining a channel quality and being in an unbroken sequence.
28. An apparatus according to any one of claims 23 to 26, wherein the selection means is configured for selecting a first subset of the bits, having significances in a first upper range, and for selecting a second subset of the bits, having significances in a second lower range, and the transmitting means is configured for transmitting the first subset in a first packet and the second subset in a second packet, the packets including the same destination address and optionally being distinguished by respective quality of service codes in accordance with a quality of service routing protocol of a router in a path to the destination identified by the destination address.
29. An apparatus according to any one of claims 23 to 28, wherein the transmitting means is configured to transmit the bits in packets and each packet comprises bits selected by the selection means from a respective first group of digital codes followed by bits selected by the selection means from a respective second group of digital codes, the second group representing an earlier part of said time-varying signal than the first group.
30. An apparatus according to any one of claims 23 to 29, including means for compressing the bits to be transmitted by the transmitting means.
31. An apparatus according to any one of claims 23 to 30, comprising re-ordering means for re-ordering the bits of said digital codes so as to group the bits thereof by significance for selection by the selection means for transmission.
32. An apparatus according to any one of claims 23 to 29, including transformer means for transforming time domain samples into frequency domain coefficients, wherein said digital codes comprise frequency domain coefficients.
33. An apparatus according to claim 32, wherein the transformer performs a wavelet packet transform.
34. An apparatus according to claim 32, wherein the transformer performs a modified discrete cosine transform.
35. An apparatus according to any one of claims 32, 33 or 34, including compression means for compressing the selected bits.
36. An apparatus according to claim 35, comprising re-ordering means for re-ordering the bits of said digital codes so as to group the bits thereof by significance.
37. An apparatus according to claim 36, wherein compression means comprises the re-ordering means.
38. An apparatus according to claim 37, wherein the re-ordering means is configured to arrange the coefficients in a representation of a two-dimensional matrix having a separate column for each time slot and separate row for each frequency subband.
39. An apparatus according to claim 38, wherein the rows are ordered by frequency subband.
40. An apparatus according to claim 38 or 39, wherein the re-ordering means is configured such that for each significance level of the coefficients in each column in row order, runs of zeros terminating at an edge row are replaced with a termination marker.
41. An apparatus according to claim 40, wherein the termination marker is a 0.
42. An apparatus according to any one of claims 38 to 41, wherein the re-ordering means is configured such that for each significance level of the coefficients in each column in row order, runs of zeros through coefficients, having a most significant 1 bit in a yet unhandled significance level, and terminating before an edge row are replaced with a run-length code.
43. An apparatus according to claim 42, wherein the run-length code comprises a prefix defining a range and suffix defining a position within the range defined by the suffix.
44. An apparatus according to any one of claims 38 to 43, wherein re-ordering means is configured such that bits in significance levels above that containing the most significant 1 bit among the coefficients are discarded during re-ordering.
45. A method of receiving a loss-tolerant signal, the method comprising receiving bits of a first group of digital codes, re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes, and generating a time-varying signal using said second group of codes.
46. A method according to claim 45, wherein the digital codes of the second group each comprise a sign bit position and a plurality of magnitude bit positions, the sign bit position being accorded a significance equivalent to that of the most significant magnitude bit position.
47. A method according to claim 45 or 46, comprising padding the codes of the second group with zeros in positions for which bits were not received, such that the digital codes of the second group have the same number of bits as the digital codes of the first group.
48. A method according to claim 45 or 46, comprising receiving a first subset of the bits of the digital codes of the first group, having significances in a first upper range, in a first packet and receiving a second subset of the bits of the digital codes of the first group, having significances in a second lower range, in a second packet, and appending the bits of the second subset to those of the first subset before reordering to produce the second group of digital codes.
49. A method according to any one of claims 45 to 48, comprising receiving a plurality of packets, each packet comprises a first section followed by a second section, the first section comprising re-ordered bits from one group of digital codes and the second section comprising re-ordered bits from another group of digital codes, wherein the bits in the second section represent an earlier part of said time-varying signal than those in the first section.
50. A method according to claim 49, comprising the steps of:- receiving a first packet; re-ordering the bits in the first section for reproducing said time-varying signal; receiving a second packet; determining that an intervening packet has been lost; re-ordering the bits of the second section of the second packet for reproducing said time-varying signal; and thereafter re-ordering the bits of the first section of the second packet for reproducing said time-varying signal.
51. A method according to any one of claims 45 to 50, including decompressing the received data.
52. A method according to any one of claims 45 to 50, including performing a frequency domain to time domain transform on the second group of digital codes.
53. A method according to claim 52, wherein the frequency domain to time domain transform is an inverse wavelet packet transform.
54. A method according to claim 52, wherein the frequency domain to time domain transform is an inverse modified discrete cosine transform.
55. A method according to any one of claims 52 to 54, wherein the re-ordered bits are decompressed.
56. A method according to claim 55, wherein the re-ordered bits are decompressed during re-ordering.
57. A method according to claim 56, wherein the re-ordered bits are represented in the form of a two-dimensional matrix having a column for each time slot and a row for each frequency subband represented by the coefficients and re-ordering comprises for each significance level the coefficients of each time slot, replacing a predetermined termination marker, if present, with a run of zeros terminating at an edge row.
58. A method according to claim 57, wherein the termination marker is a 0.
59. A method according to claim 56, 57 or 58, wherein re-ordering comprises for each significance level of each time slot, determining the presence of a run-length code and, if a run-length code is present, replacing it with a run of zeros having a length determined by the run-length code.
60. A method according to claim 59, wherein the run-length code comprises a prefix defining a range and suffix defining a position within the range defined by the suffix.
61. A method according to any one of claims 56 to 60, including padding the more significant bit positions of the digital codes of the second group with zeros in dependence on a received significance code, wherein the significance code indicates the most significant bit position for which the bits of digital codes of the first group were transmitted.
62. An apparatus for receiving a loss-tolerant signal, the apparatus comprising: receiving means for receiving bits of a first group of digital codes, the more significant bits of said codes preceding the less significant bits; means for re-ordering the bits to produce a second group of digital codes, each member of which comprises at least the most significant bit or bits of a corresponding member of said first group of digital codes; and means for generating a time-varying signal using said second group of codes.
63. An apparatus according to claim 62, configured for receiving signals in which the digital codes of the second group each comprise a sign bit position and a plurality of magnitude bit positions, wherein the sign bit position is accorded a significance equivalent to that of the most significant magnitude bit position.
64. An apparatus according to claim 62 or 63, including means for padding the codes of the second group with zeros in positions for which bits were not received, such that the digital codes of the second group have the same number of bits as the digital codes of the first group.
65. An apparatus according to claim 62 or 63, comprising means appending the bits in a second packet received by the receiving means to those in a first packet received by the receiving means and outputting the result to the means for reordering to produce the second group of digital codes, wherein the bits in the first packet and the bits in the second packet relate to the same digital codes and the bits in the first packet have significances in a first upper range and the bits in the second packet have significances in a second lower range.
66. An apparatus according to any one of claims 62 to 65, comprising means for determining the loss of a packet and means for selecting a portion of a next received packet to replace a lost packet, wherein each packet comprises a first section followed by a second section, the first section comprising re-ordered bits from one group of digital codes and the second section representing an earlier part of said time-varying signal than the first section.
67. An apparatus according to any one of claims 62 to 66, including means for decompressing the received data.
68. An apparatus according to any one of claims 62 to 66, including transform processing means for performing a frequency domain to time domain transform on the second group of digital codes.
69. An apparatus according to claim 68, wherein the transform processing means performs an inverse wavelet packet transform.
70. An apparatus according to claim 68, wherein the transform processing means performs an inverse modified discrete cosine transform.
71. An apparatus according to any one of claims 68, 69 or 70, including means for decompressing the re-ordered bits.
72. An apparatus according to claim 71, wherein the re-ordered bits are represented in the form of a two-dimensional matrix having a column for each time slot and a row for each frequency subband represented by the coefficients and the means for decompressing the re-ordered bits comprises said means for re-ordering.
73. An apparatus according to claim 72, wherein said means for re-ordering is configured such that for each significance level of each time slot, it replaces a predetermined termination marker, if present, with a run of zeros terminating at the highest frequency coefficient.
74. An apparatus according to claim 73, wherein the termination marker is a 0.
75. An apparatus according to claim 72, 73 or 74, wherein said means for re-ordering is configured such that for each significance level of each time slot, it determines the presence of a run-length code and, if a run-length code is present, replaces it with a run of zeros having a length determined by the run-length code.
76. An apparatus according to claim 75, wherein the run-length code comprises a prefix defining a range and suffix defining a position within the range defined by the suffix.
77. An apparatus according to any one of claims 72 to 76, including means for padding the more significant bit positions of the digital codes of the second group with zeros in dependence on a received significance code, wherein the significance code indicates the most significant bit position for which the bits of digital codes of the first group were transmitted.
78. An audio playback device according to any one of claims 62 to 77, including memory means for storing bits received by the receiving means, wherein the reordering means is configured to re-order bits stored in the memory for generating a playback audio signal.
79. A portable device according to claim 78.
80. A method of operating a network routing node for routing a signal packet in which more significant bits precede less significant bits, the method comprising determining the bandwidth available in a path away from the node and, if the bandwidth is below a threshold value, truncating the data in said packet in dependence on the determined bandwidth.
81. A method of operating a node in a network having a TDMA wireless link to a terminal apparatus, the method being in accordance with claim 80 and wherein the bandwidth varies with the number of slots available in each frame for transmissions to the terminal apparatus.
82. A network node configured for operation according to claim 80 or 81.
PCT/GB2000/001649 1999-05-01 2000-04-28 Robust coding for the transmission of audio or video signals WO2000067417A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP00925502A EP1177651A1 (en) 1999-05-01 2000-04-28 Robust coding for the transmission of audio or video signals
AU44223/00A AU4422300A (en) 1999-05-01 2000-04-28 Robust coding for the transmission of audio or video signals

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9910002A GB9910002D0 (en) 1999-05-01 1999-05-01 Audio signal encoders and decoders
GB9910002.6 1999-05-01

Publications (1)

Publication Number Publication Date
WO2000067417A1 (en) 2000-11-09

Family

ID=10852571

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2000/001649 WO2000067417A1 (en) 1999-05-01 2000-04-28 Robust coding for the transmission of audio or video signals

Country Status (4)

Country Link
EP (1) EP1177651A1 (en)
AU (1) AU4422300A (en)
GB (1) GB9910002D0 (en)
WO (1) WO2000067417A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2139458A (en) * 1983-04-18 1984-11-07 British Broadcasting Corp Error correction in data transmission or processing
WO1993015502A1 (en) * 1992-01-28 1993-08-05 Qualcomm Incorporated Method and system for the arrangement of vocoder data for the masking of transmission channel induced errors
US5255343A (en) * 1992-06-26 1993-10-19 Northern Telecom Limited Method for detecting and masking bad frames in coded speech signals
US5487061A (en) * 1994-06-27 1996-01-23 Loral Fairchild Corporation System and method for providing multiple loss and service priorities
EP0869622A2 (en) * 1997-04-02 1998-10-07 Samsung Electronics Co., Ltd. Scalable audio coding/decoding method and apparatus

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002052240A1 (en) * 2000-12-22 2002-07-04 Telefonaktiebolaget Lm Ericsson (Publ) Method and a communication apparatus in a communication system
US7444281B2 (en) 2000-12-22 2008-10-28 Telefonaktiebolaget Lm Ericsson (Publ) Method and communication apparatus generation packets after sample rate conversion of speech stream

Also Published As

Publication number Publication date
AU4422300A (en) 2000-11-17
EP1177651A1 (en) 2002-02-06
GB9910002D0 (en) 1999-06-30

Similar Documents

Publication Publication Date Title
EP0936772B1 (en) Unequal error protection for perceptual audio coders
US7069208B2 (en) System and method for concealment of data loss in digital audio transmission
JP4004707B2 (en) Techniques for multirate coding of signals containing information
US6122338A (en) Audio encoding transmission system
EP1946517B1 (en) Audio data packet format and decoding method thereof and method for correcting mobile communication terminal codec setup error and mobile communication terminal performing same
KR101699548B1 (en) Encoder, decoder and method for encoding and decoding
JP2000324183A (en) Communication device and method
US20020078241A1 (en) Method of accelerating media transfer
KR20090001370A (en) Method of setting configuration of codec and codec using the same
US20040024592A1 (en) Audio data processing apparatus and audio data distributing apparatus
EP0919988B1 (en) Speech playback speed change using wavelet coding
US20050240414A1 (en) Data processing system, data processing method, data processing device, and data processing program
US8023585B2 (en) Apparatus and method for transmitting or receiving data
JPH11511308A (en) Digital transmission system
JP3379610B2 (en) Encoding and decoding apparatus and method using channel masking characteristic for bit allocation
CN1398055A (en) Wireless audio transmission system and method
KR100706968B1 (en) Audio data packet generation apparatus and decoding method thereof
EP1177651A1 (en) Robust coding for the transmission of audio or video signals
JP4077037B2 (en) Method and apparatus for mapping between cellular bitstream and wired waveform
CN1157853C (en) Transmitting device for transmitting a digital information signal alternately in encoded form and non-encoded form
US20010056343A1 (en) Sound signal encoding apparatus and method
CN100339903C (en) Method and apparatus for transmitting audio and non-audio information with error correction
CN115691521A (en) Audio signal coding and decoding method and device
CN115691514A (en) Coding and decoding method and device for multi-channel signal
Becker et al. Influence of the BER on the Intelligibility of the Received DAB Signal

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 2004-01-01)
WWE WIPO information: entry into national phase

Ref document number: 2000925502

Country of ref document: EP

WWE WIPO information: entry into national phase

Ref document number: 09959612

Country of ref document: US

WWP WIPO information: published in national office

Ref document number: 2000925502

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: JP

WWW WIPO information: withdrawn in national office

Ref document number: 2000925502

Country of ref document: EP