CN103999154A - Apparatus and method for audio encoding - Google Patents

Apparatus and method for audio encoding

Info

Publication number
CN103999154A
Authority
CN
China
Prior art keywords
bandwidth
energy
sound signal
coding
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201280061303.3A
Other languages
Chinese (zh)
Other versions
CN103999154B (en)
Inventor
Holly L. Francois (霍利·L·弗朗索瓦)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Google Technology Holdings LLC
Original Assignee
Motorola Mobility LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Mobility LLC
Publication of CN103999154A
Application granted
Publication of CN103999154B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0204 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using subband decomposition

Abstract

A method (600) and apparatus (100) provide for encoding an audio signal. A bit rate value (141) is received (605). A set of energy thresholds (371), of a plurality of sets of thresholds, is selected (810) based on the bit rate value. The energy thresholds of each set of energy thresholds correspond on a one-to-one basis with a set of sub-bands of a received audio signal (615). The energy of each sub-band of the set of sub-bands is determined (620). A highest frequency sub-band that has an energy exceeding the corresponding threshold is determined (625). A selected bandwidth of the audio signal is encoded (630). The selected bandwidth includes only those frequencies of the audio signal that are in the highest frequency sub-band that has an energy exceeding the corresponding threshold, and the lower frequencies of the audio signal that are above a high-pass cut-off frequency.

Description

Apparatus and method for audio coding
Technical field
The present invention relates in general to audio coding and decoding.
Background
Over the past twenty years, microprocessor speeds have increased by several orders of magnitude and digital signal processors (DSPs) have become ubiquitous. It has become feasible and attractive to move from analog communication to digital communication. Digital communication offers the major advantages of using bandwidth more efficiently and of allowing error-correction techniques to be used. By using digital technology, one can therefore send more information, and send it more reliably, through a given allocated spectrum space. Digital communication can use radio links (wireless) or physical network media (for example, optical fiber or copper networks).
Digital communication can be used for different types of communication such as voice, audio, image, video, or telemetry. A digital communication system includes a sending device and a receiving device. In a system capable of two-way communication, each device has both sending and receiving circuits. In a digital sending or receiving device there are multiple processing stages through which the signal and the resulting data pass, between the stage at which the signal is received at an input (for example, a microphone, camera, or sensor) and the stage at which a digitized version of the signal is used to modulate a carrier and is transmitted. After (1) the signal is received at the input and digitized, (2) some initial noise filtering may be applied, followed by (3) source encoding and, finally, (4) channel encoding. In the receiving device the process runs in reverse order: channel decoding, then source decoding, then conversion to analog. The invention described in the following pages can be considered to fall primarily in the source encoding stage.
The main goal of source encoding is to reduce the bit rate while maintaining perceived quality as far as possible. Different standards have been developed for different types of media.
Brief description of the drawings
The features of the invention believed to be novel are set forth in the appended claims. The invention itself, however, both as to organization and method of operation, together with its objects and advantages, may be best understood by reference to the following detailed description, which describes certain exemplary embodiments that include the concepts of the invention. The description is intended to be read in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of a communication device according to certain embodiments.
Fig. 2 is a block diagram of an audio coding function of the communication device according to certain embodiments.
Fig. 3 is a block diagram of a sub-band spectral analysis function of the audio coding function according to certain embodiments.
Fig. 4 is a timing diagram illustrating some exemplary signals in the communication device according to certain embodiments.
Fig. 5 illustrates an enlarged portion of the timing diagram of Fig. 4 according to certain embodiments.
Figs. 6-9 are flow charts illustrating the operation of the audio coding function according to various embodiments.
Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some elements in the figures may be exaggerated relative to other elements to help improve understanding of the embodiments of the invention.
Detailed description
While this invention is susceptible of embodiment in many different forms, specific embodiments are shown in the drawings and will be described herein in detail, with the understanding that the present disclosure is to be considered an example of the principles of the invention and is not intended to limit the invention to the specific embodiments shown and described. In the description below, like reference numerals are used to describe the same, similar, or corresponding parts in the several views of the drawings.
In this document, relational terms such as first and second, top and bottom, and the like may be used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between those entities or actions. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element preceded by "comprises ... a" does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises the element.
Reference throughout this document to "one embodiment," "certain embodiments," "an embodiment," or similar terms means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Thus, appearances of such phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments without limitation.
The term "or" as used herein is to be interpreted as inclusive, meaning any one or any combination. Therefore, "A, B or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition occurs only when a combination of elements, functions, steps, or acts is in some way inherently mutually exclusive.
The embodiments described herein relate to encoding signals. The signals can be speech or other audio, such as music, that is converted to digital information and communicated by wire or wirelessly.
Turning now to the drawings, in which like reference numerals indicate like parts, Fig. 1 is a block diagram of a wireless electronic communication device 100 according to certain embodiments. The wireless electronic communication device 100 represents many kinds of wireless communication devices, such as mobile cellular telephones, mobile personal communication devices, cellular base stations, and personal computers equipped with wireless communication functions. According to certain embodiments, the wireless electronic communication device 100 includes a radio system 199, a human interface system 120, and a radio frequency (RF) antenna 108.
The human interface system 120 is a system that includes a processing system and the electronic components that support it, such as external input/output circuits and power control circuits, as well as electronic components that interface with the user, such as a microphone 102, a display/touch keyboard 104, and a speaker 106. The processing system includes a central processing unit (CPU) and memory. The CPU processes software instructions stored in the memory that relate primarily to the human interface aspects of the mobile communication device 100, such as presenting information (lists, menus, graphics, etc.) on the display/keyboard 104 and detecting human entries on the touch surface of the display/keyboard 104. These functions are shown as a set of human interface applications (HIA) 130. The HIA 130 can also receive speech audio from the microphone 102 through an analog-to-digital (A/D) converter 125, then perform speech recognition on the speech and respond to commands made by speech. The HIA 130 can also send sounds, such as ring tones, to the speaker 106 through a digital-to-analog (D/A) converter 135. The human interface system 120 may include other human interface devices not shown in Fig. 1, such as haptic devices and a camera.
The radio system 199 is a system that includes a processing system and the electronic components that support it, such as external input/output circuits and power control circuits, as well as electronic components that interface to the antenna, such as RF amplifiers. The processing system includes a central processing unit (CPU) and memory. The CPU processes software instructions stored in the memory that relate primarily to the radio interface aspects of the mobile communication device 100, such as transmitting digitized signals that have been encoded into data packets (shown as transmitter system 170) and receiving data packets that are decoded into digitized signals (shown as receiver system 140). Except for the antenna 108 and certain radio frequency interface portions of the receiver system 140 and transmitter system 170 (not explicitly shown in Fig. 1), the wireless electronic communication device 100 would also represent many wired communication devices, such as cable nodes. Some of the embodiments that follow are personal communication devices.
The receiver system 140 is coupled to the antenna 108. The antenna 108 intercepts radio frequency (RF) signals that may include a channel carrying a digitally encoded signal. The intercepted signal is coupled to the receiver system 140, which decodes the signal and, in these embodiments, couples the recovered digital signal to the human interface system 120, which converts it to an analog signal to drive the speaker. In other embodiments, the recovered digital signal may be used to present an image or video on the display of the human interface system 120. The transmitter system 170 accepts a digitized signal 126 from the human interface system 120. The digitized signal 126 may be, for example, a digitized speech signal, a digitized music signal, a digitized image signal, or a digitized video signal, and it may be coupled from the receiver system 140, stored in the wireless electronic communication device 100, or sourced from an electronic device (not shown) coupled to the electronic communication device 100. The digitized signal is a signal that has been sampled at a periodic digitizing sampling rate. The sampling rate may be, for example, 8 kHz, 16 kHz, 32 kHz, 48 kHz, or another rate that need not be a multiple of 8 kHz. It will be appreciated that the bandwidth of the sampled signal may be less than one half of the sampling rate. For example, in certain embodiments, a signal with a 12 kHz bandwidth may be sampled at a sampling rate of 48 kHz. The transmitter system 170 analyzes the digitized signal 126 and encodes it into digital packets that are transmitted by the antenna 108 on an RF channel.
The transmitter system 170 includes an audio coding function 181 that periodically analyzes samples of the digitized signal 126 and encodes them into bandwidth-efficient code words 182. The code words 182 are generated at a bit rate determined by frequency analysis of the digitized signal 126 and by a bit rate value 141 that is received in a message from a network device and coupled from the receiver system 140 to the audio coding function 181. In certain embodiments, the bit rate value 141 received from the network may define an allowed bit rate that transmissions from the device 100 to the network must not exceed; it is typically determined by the network operator or a network device based on the current network traffic load. In certain embodiments, the bit rate value may define an allowed bit rate that the device 100 must meet on average, while instantaneous values stay within some tolerance (for example, no more than 10% above the average). An example of this type of bit rate value is one that restricts the transmission bit rate used by the device 100 according to a payment structure. In certain embodiments, the bit rate value 141 may be coupled from the human interface system 120 rather than from the receiver system 140. A packet generator 187 uses the code words 182 to form packets that are coupled to an RF transmitter 190 for amplification and are then radiated by the antenna 108.
Referring to Fig. 2, a block diagram of the audio coding function 181 is shown according to certain embodiments. The audio coding function 181 includes a converter 205, a sub-band spectral analysis function 210, a voting logic function 215, and an audio encoding function 220. The converter 205 may not be used in some embodiments. The converter 205 converts the digitized signal 126 into a converted signal 206 that provides values at a constant periodic rate regardless of the sampling rate of the digitized signal 126. For example, digitized signals 126 with different sampling rates, such as 8 kHz, 12 kHz, and 16 kHz, can be converted to a converted signal 206 with a 48 kHz periodic rate. The conversion can be performed by standard techniques, such as one of many interpolation techniques. In certain embodiments, the sampling rate of the digitized signal 126 may be constant, making the converter 205 unnecessary. In these embodiments, the digitized signal 126 may be coupled directly to the sub-band spectral analysis function 210 and the audio encoding function 220. In certain embodiments, the digitized signal 126 may be coupled directly to the sub-band spectral analysis function 210 and the audio encoding function 220, with the conversion performed within one or both of those functions.
The sub-band spectral analysis function 210 analyzes the energy in each of an ordered set of sub-bands and couples sub-band energy results 211 to the voting logic function 215, which determines, based on the sub-band energy results 211 and the bit rate value 141, one of a plurality of protocols, each having a specific bandwidth, with which the code words 182 are encoded. The determined protocol 216 (also identified as the selected bandwidth or the selected protocol) is coupled to the audio encoding function 220; it depends on the sub-band energy results 211 and on the bit rate value 141 coupled to the sub-band spectral analysis function 210, and it evolves over time. The audio encoding function 220 uses the selected bandwidth 216 to encode the digitized audio signal 126 and generate the code words 182, thereby minimizing coding resources and reducing the average bandwidth needed to transmit the audio signal. It will be appreciated that the low-frequency (high-pass) cut-off values of the protocols are numerically close enough that the ordering of the upper cut-off frequencies is the same as the ordering of the protocol bandwidths; that is, a higher bandwidth is associated with a higher upper cut-off frequency.
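The rate conversion attributed to converter 205 can be illustrated with a minimal sketch. It assumes plain linear interpolation, which is only one of the many interpolation techniques the text alludes to and is not band-limited; the function name `resample_linear` and its parameters are hypothetical and do not come from the patent.

```python
def resample_linear(samples, rate_in, rate_out):
    """Resample a sequence by linear interpolation between input samples.

    Illustrative only: a practical converter would normally use a
    band-limited (filtered) interpolator rather than straight lines.
    """
    if len(samples) < 2:
        return list(samples)
    # Number of output samples spanning the same time interval as the input.
    n_out = int((len(samples) - 1) * rate_out / rate_in) + 1
    out = []
    for n in range(n_out):
        pos = n * rate_in / rate_out          # position in input-sample units
        i = min(int(pos), len(samples) - 2)   # left neighbour index
        frac = pos - i                        # distance past the left neighbour
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out


# Example: bring an 8 kHz signal up to the common 48 kHz rate.
converted = resample_linear([0.0, 6.0], 8000, 48000)
```

With a common output rate, the later analysis stages can use fixed frame sizes and fixed sub-band edges regardless of the input sampling rate.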
Referring to Figs. 3-5, according to certain embodiments, Fig. 3 shows a block diagram of the sub-band spectral analysis function 210, and Figs. 4 and 5 show timing diagrams of some exemplary signals. The sub-band spectral analysis function 210 includes a sub-frame fast Fourier transform (FFT) function 305, an energy analysis function 308, a set of N band split functions 310-325, a set of N corresponding smoothing filters 330-345, and a set of N corresponding threshold-with-hysteresis functions 350-365. The digitized signal 126 or the converted signal 206 is coupled to the sub-frame FFT function 305, which performs fast Fourier transforms at a rate that is a multiple, for example 4, of the frame rate of the digitized signal 126 or converted signal 206. For example, 160 values of the digitized signal 126 or converted signal 206 may be included in each frame or sub-frame. Conventional techniques (for example, tapered overlap) may be used to window the frames or sub-frames and to perform the FFT. The set of values generated by the FFT for each frame or sub-frame is coupled to the energy analysis function 308, which converts each set of FFT values into a corresponding set of energy spectral distribution values in a conventional manner (for example, using the square of the absolute value of each FFT value). The energy spectral distribution for a series of frames or sub-frames, like the sets of FFT values, is a frequency-based distribution generated periodically at the frame or sub-frame rate. In one example, the quantity N used to identify the band splits 310-325, smoothing filters 330-345, and thresholds 350-365 is 4.
In Fig. 4, an example of the digitized audio signal 126 or converted signal 206 is shown as audio plot 405. Because the digital values (for example, digitized voltage samples) are relatively close together in the plot, audio plot 405 appears continuous. Below audio plot 405 is plot 410, which represents the sound spectrum. Each vertical line comprises many gray-scale values (pixels or dots) representing the energy density of one frame at frequencies between 0 and 24 kHz. The peak frequency having a non-zero energy value is approximated by plot 411. In about half of the regions of plot 410, the maximum energy density of each frame occurs well below the peak frequency. One such example is region 413 of plot 410, which is shown in the expanded view of Fig. 5. Other regions, such as region 412 of plot 410, have more evenly distributed energy.
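The per-frame processing attributed to the FFT function 305 and energy analysis function 308 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a Hann window (the text only says conventional windowing is used) and computes a direct DFT with the standard library for self-containment, where a real system would use an FFT. The name `energy_spectrum` is hypothetical.

```python
import cmath
import math


def energy_spectrum(frame):
    """Return |X[k]|^2 for bins k = 0 .. N/2 of one windowed frame."""
    n = len(frame)
    # Periodic Hann window (an assumed choice of taper).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    x = [s * w for s, w in zip(frame, window)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Direct DFT of bin k; O(N^2) overall, acceptable for a 160-sample sketch.
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2)  # square of the absolute value
    return spectrum


# Example: a 160-sample frame at 48 kHz gives a bin spacing of 48000 / 160 = 300 Hz.
frame = [math.sin(2.0 * math.pi * 3000.0 * i / 48000.0) for i in range(160)]
energies = energy_spectrum(frame)
```

Each element of the result corresponds to one frequency bin, so a sequence of such lists, one per frame, is the kind of data that plot 410 visualizes as gray-scale columns.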
The energy spectrum is coupled to the band split functions 310-325, which determine the total amount of energy in each sub-band. For the example used herein, the sub-band ranges are 0-7 kHz for band split #1 310, 7-8 kHz for band split #2 315, 8-12 kHz for band split #3 320, and 12-20 kHz for band split #4 (not shown in Fig. 3). The example frequency ranges of band splits #1 to #4 are identified as frequency sub-bands 415-418 in Fig. 4. It will be appreciated that, for the embodiments represented by this example, the set of sub-bands is a set of non-overlapping sub-bands covering the frequency range of 0 to 24 kHz. In other embodiments, the set of sub-bands may not fill the entire bandwidth of 0 to 24 kHz; there may be gaps between sub-bands. In certain embodiments, sub-bands may overlap. The outputs of the band split functions 310-325 are coupled to the smoothing filters 330-345, which remove rapidly varying high-frequency effects that would otherwise cause the outputs of the threshold-with-hysteresis functions 350-365 to change too quickly. The outputs of the smoothing filters 330-345 are coupled to the threshold-with-hysteresis functions 350-365. Each threshold-with-hysteresis function 350-365 is also coupled to a threshold signal 371 from a bias table 370. The threshold signal includes the hysteresis and bias for each threshold-with-hysteresis function 350-365, as determined by the bit rate value 141.
The bit rate value 141 is one of M values, each of which is used to set the levels of the N threshold-with-hysteresis functions 350-365, which serve as a factor in selecting one of N protocols for encoding the signal 126, 206. In certain embodiments, each protocol encodes a different bandwidth of the signal 126, 206. In the example used herein, M is 3 and the three values are identified as low, medium, and high. For each threshold-with-hysteresis function 350-365, the bit rate value 141 selects one of M thresholds. Thus, each of the M possible bit rate values selects a set of N thresholds corresponding to the sub-bands. Each threshold-with-hysteresis function 350-365 generates an output value that is part of signal 211. The output value is in a first state (true) when the input has exceeded the threshold for longer than a first hysteresis duration, and in a second state (false) when the input has been below the threshold for longer than a second hysteresis duration. The hysteresis values may be the same for all sub-bands and may be fixed. In certain embodiments, the first and second hysteresis values of the threshold-with-hysteresis functions 350-365 may be 2N different values; in certain embodiments, the N first and second hysteresis values may be selected by the bit rate value 141 from a set of M values. In the example described herein, the first hysteresis is 0, and the second hysteresis does not differ among the threshold-with-hysteresis functions 350-365 and does not change in response to the bit rate value 141. (The thresholds, however, do change in response to the bit rate value 141.)
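The chain described above (band split, smoothing, threshold with hysteresis) can be sketched as below. This is an illustration under assumptions, not the patented design: hysteresis is counted in frames, smoothing is a one-pole filter, and every name (`band_energies`, `smooth`, `ThresholdWithHysteresis`) and default value is hypothetical.

```python
def band_energies(spectrum, bin_hz, edges_hz):
    """Sum the energy of the bins whose centre frequency lies in [lo, hi)."""
    return [
        sum(e for k, e in enumerate(spectrum) if lo <= k * bin_hz < hi)
        for lo, hi in edges_hz
    ]


def smooth(previous, current, alpha=0.8):
    """One-pole smoothing filter (assumed form; the text names no filter type)."""
    return alpha * previous + (1.0 - alpha) * current


class ThresholdWithHysteresis:
    """Comparator whose output changes only after the input has disagreed
    with the current state for longer than a hysteresis duration (in frames)."""

    def __init__(self, threshold, rise_frames=0, fall_frames=10):
        self.threshold = threshold
        self.rise_frames = rise_frames  # first hysteresis (false -> true)
        self.fall_frames = fall_frames  # second hysteresis (true -> false)
        self.state = False
        self._count = 0  # consecutive frames disagreeing with the state

    def update(self, energy):
        wants_true = energy > self.threshold
        if wants_true == self.state:
            self._count = 0
            return self.state
        self._count += 1
        needed = self.rise_frames if wants_true else self.fall_frames
        if self._count > needed:
            self.state = wants_true
            self._count = 0
        return self.state


# Example: zero first hysteresis, three-frame second hysteresis.
detector = ThresholdWithHysteresis(threshold=10.0, rise_frames=0, fall_frames=3)
flags = [detector.update(e) for e in [20.0, 5.0, 5.0, 5.0, 5.0, 20.0]]
```

In the example, the output rises on the first frame above the threshold but falls only on the fourth consecutive frame below it, which mirrors the asymmetric behaviour the text describes for a first hysteresis of 0 and a non-zero second hysteresis.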
Referring back to Fig. 2, the output signal 211 from the sub-band spectral analysis function 210 is coupled to the voting logic function 215. The voting logic function 215 analyzes the signal 211 and selects a coding protocol based on the highest-frequency sub-band, of the N sub-bands, whose value in the output signal 211 indicates the first state. For the purposes of this selection, the sub-bands below that frequency are assumed also to be in the first state. The selected coding protocol encodes a bandwidth of the signal 126, 206 that includes those frequencies of the audio signal (digitized signal 126 or converted signal 206) up to and including the highest-frequency sub-band whose energy exceeds the corresponding threshold, together with the lower-frequency components of the audio signal that are above the high-pass cut-off frequency of the coding protocol selected for the audio encoding function 220. In certain embodiments, all low-frequency components of the audio signal above the high-pass cut-off frequency are included in the bandwidth of the selected coding protocol. In certain embodiments, it may be necessary or desirable to apply high-pass or band-pass filtering to the input signal 126 before the sub-band spectral analysis 210 and/or the audio encoding 220, but this does not significantly affect the processing steps or processing logic. In the example described herein, the selected coding protocol is one whose selected bandwidth is nominally one of 7 kHz, 8 kHz, 12 kHz, and 20 kHz, although these may in practice correspond, respectively, to bandwidths that start between 10 Hz and 500 Hz and extend up to 7 kHz, 8 kHz, 12 kHz, or 20 kHz. Other ways of identifying the selected coding protocol can evidently be used; just two examples are the coding bit rate and an indexed protocol value (for example, 1 to 4).
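The selection rule of the voting logic function 215 can be sketched in a few lines. The nominal bandwidths come from the example in the text; the function name `select_bandwidth` and the fallback to the lowest bandwidth when no sub-band is in the first state are assumptions, since the text does not say what happens in that case.

```python
NOMINAL_BANDWIDTHS_KHZ = (7, 8, 12, 20)  # one coding protocol per sub-band


def select_bandwidth(flags, bandwidths=NOMINAL_BANDWIDTHS_KHZ):
    """Pick the bandwidth of the highest-frequency sub-band whose flag is true.

    `flags` is ordered from the lowest to the highest sub-band. Sub-bands
    below the highest true one are implicitly treated as true as well.
    """
    highest = 0  # assumed fallback: lowest bandwidth if nothing is true
    for index, flag in enumerate(flags):
        if flag:
            highest = index
    return bandwidths[highest]


# Example: only the two lowest sub-bands exceed their thresholds.
chosen = select_bandwidth([True, True, False, False])
```

Because only the highest true flag matters, a false flag in a middle sub-band does not prevent a wider protocol from being chosen.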
Referring to Table 1, a set of thresholds is shown according to certain embodiments. This set can be used in the example described above and can be included in the bias table 370 (Fig. 3). In this example, the maximum value of a threshold is 100, and the total energy of the signal 126, 206 has a value of 100.
Table 1
It will be appreciated that when the energy density is uniform, the total energy in each sub-band, from the lowest to the highest sub-band, will be 35, 5, 20, and 40 respectively. When the bit rate value 141 is low and the energy density is uniform, the corresponding outputs of the threshold-with-hysteresis functions 350-365, from lowest to highest, will be true, false, false, and false, because the only threshold exceeded is the one for 0-7 kHz. Since the highest sub-band whose output is true is the 0-7 kHz sub-band, the selected bandwidth is 7 kHz. When the energy density is uniform and the bit rate is high, the corresponding outputs of the threshold-with-hysteresis functions 350-365, from lowest to highest, will be true, true, false, and true. Since the highest sub-band whose output is true is the 12-20 kHz sub-band, the voting logic function 215 selects the protocol that provides a 20 kHz bandwidth.
Below plots 405 and 410 in Fig. 4, three plots 420, 425, and 430 are shown. For a set of thresholds similar to Table 1, and with the input signal 126, 206 being the signal shown as plot 405 of Fig. 5, these plots show the output 216 of the voting logic function 215 versus time for the three values (low, medium, high) of the bit rate value 141. Plot 420 is generated when the bit rate value is low, plot 425 when it is medium, and plot 430 when it is high. It can be seen that plot 420 has the lowest bandwidth value (7 kHz) for a higher proportion of the time than plots 425 and 430, and that plot 430 has the highest bandwidth value for a higher proportion of the time than plots 420 and 425. This difference can easily be enlarged or reduced by modifying the thresholds appropriately. The effect of the second hysteresis is evident in region 460 of the plots, which shows slow changes from higher to lower bandwidths, whereas the zero value of the first hysteresis results in rapid changes from the lowest to the highest bandwidth, which is evident in region 450 of the plots. The benefit of the filtering performed by the smoothing filters 330-345 is evident from the fact that there are very few occurrences of the output 216 (in the illustrated examples 420-430) holding a value for less than about 10 frames (energy density lines).
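The uniform-energy example above can be checked numerically. The sub-band edges and the total energy of 100 follow the text. The threshold values below are hypothetical placeholders chosen only to reproduce the outcomes the text states for the low and high settings; they are not the Table 1 values, which are not reproduced here.

```python
SUB_BANDS_KHZ = [(0, 7), (7, 8), (8, 12), (12, 20)]
BANDWIDTHS_KHZ = [7, 8, 12, 20]
TOTAL_ENERGY = 100.0

# Hypothetical threshold sets, NOT the patent's Table 1.
THRESHOLDS = {
    "low": [30.0, 10.0, 30.0, 50.0],
    "high": [20.0, 4.0, 25.0, 35.0],
}


def uniform_energies():
    """Energy per sub-band when energy density is flat from 0 to 20 kHz."""
    span = SUB_BANDS_KHZ[-1][1] - SUB_BANDS_KHZ[0][0]
    return [TOTAL_ENERGY * (hi - lo) / span for lo, hi in SUB_BANDS_KHZ]


def select_for(bit_rate_value, energies):
    """Return (flags, bandwidth in kHz) for one bit rate setting."""
    flags = [e > t for e, t in zip(energies, THRESHOLDS[bit_rate_value])]
    # Highest sub-band whose threshold is exceeded; lowest if none is.
    highest = max((i for i, f in enumerate(flags) if f), default=0)
    return flags, BANDWIDTHS_KHZ[highest]


energies = uniform_energies()               # 35, 5, 20 and 40
low_flags, low_bw = select_for("low", energies)
high_flags, high_bw = select_for("high", energies)
```

Hysteresis is ignored here because the input is constant, so the comparator outputs settle to the plain threshold comparison.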
In certain embodiments, if there is a maximum allowed transmit data rate that would be exceeded by using any of the selectable bandwidths, the transmitter system 170 may include logic that prevents protocols with such bandwidths from being used, by restricting the bandwidth selection to lower-bandwidth protocols that always keep the transmitted data rate below the maximum allowed transmit data rate. This additional restriction may be incorporated into the voting logic function 215 based on an indication received in a protocol message received by the receiver system 140. For example, the indication may be used to select among several different tables of values, some of which have thresholds chosen to exclude the use of the highest bandwidths, or the indication may drive logic that changes the selected bandwidth to a lower bandwidth if the selected bandwidth would result in an excessive transmit data rate.
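The second option described above, lowering the selected bandwidth when it would cause an excessive data rate, can be sketched as follows. The per-protocol data rates are invented for illustration, the sketch assumes that data rate increases with bandwidth, and the name `cap_bandwidth` is hypothetical.

```python
# Hypothetical transmit data rates (kbit/s) per nominal bandwidth (kHz).
RATE_BY_BANDWIDTH = {7: 12.0, 8: 16.0, 12: 24.0, 20: 32.0}


def cap_bandwidth(selected_khz, max_rate, rate_by_bandwidth=RATE_BY_BANDWIDTH):
    """Return the widest bandwidth not above `selected_khz` whose data rate
    does not exceed `max_rate`."""
    candidates = [
        bw
        for bw, rate in rate_by_bandwidth.items()
        if bw <= selected_khz and rate <= max_rate
    ]
    if not candidates:
        raise ValueError("no protocol satisfies the maximum data rate")
    return max(candidates)


# Example: the voting logic chose 20 kHz, but the limit only allows 12 kHz.
capped = cap_bandwidth(20, max_rate=25.0)
```

Applying the cap after the voting step leaves the threshold tables untouched, whereas the first option in the text builds the restriction into the tables themselves.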
It should be understood that, by having the flexibility to define the set of thresholds (and, in some embodiments, the corresponding hysteresis values) that is selected by the bit-rate value, the average transmitted bit rate can be reduced according to channel conditions while maintaining audio quality better than when conventional techniques are used to impose a bit-rate constraint on the system. In some embodiments, as the bandwidth of the input signal varies over time, the audio bandwidth of the coding protocol is intended to match the bandwidth of the input signal as closely as possible. That is, the thresholds are determined empirically so that the audio bandwidth of the successively selected coding protocols tracks the varying bandwidth of the input signal over the duration of the input signal. The input signals used for this purpose are one or more audio sequences typical of those expected to be encoded. Such a configuration is suited to medium channel bit rates (the so-called mid bit-rate setting). For example, in some embodiments, when the channel bit rate available to the coding protocol is limited and reducing the bandwidth of the input signal produces better-sounding synthesized audio, the sub-band spectral analysis function 210 may be biased to favor lower audio bandwidth coding protocols; this is the so-called low bit-rate setting. In some embodiments, when a higher channel bit rate is available to the coding protocol, the sub-band spectral analysis function 210 may be biased to favor high audio bandwidth coding protocols; this is the so-called high bit-rate setting. In some embodiments, a change in the bit-rate value during the audio signal changes which set of thresholds is selected from the available sets, which provides faster changes of the average channel bit rate, provided they remain within the limits of the coding protocols in use. This allows better control of the aggregate bandwidth used by several devices that share bandwidth.
"Favoring" lower audio bandwidth coding protocols means that the thresholds are set empirically so that, by default, the output is encoded using a low audio bandwidth coding protocol, switching only for limited periods of time to a higher bandwidth coding protocol whose channel bit rate is similar to that of the low audio bandwidth coding protocol (for example, within 10% in some embodiments; the similarity tolerance may be up to 50% in other embodiments). This switch occurs when the energy in the higher subbands is large enough that the perceptual advantage of encoding a higher audio bandwidth outweighs the degradation caused by reducing the number of coding bits allocated to the audio signal in the lower audio bandwidth. A low audio bandwidth coding protocol encodes a bandwidth that includes the lowest audio subband and may include one or more higher subbands up to and including a particular higher audio subband (but not the highest subband). The low audio bandwidth is determined based on the type of input signal expected to be encoded, and may be determined by a theoretical method (for example, precision) or an empirical method (for example, expert listening or mean opinion score (MOS) tests), or may be the lowest coding protocol bandwidth available in the system at a particular time.
"Favoring" high audio bandwidth means that the thresholds are set empirically so that the output is encoded using a high audio bandwidth coding protocol, switching to a lower bandwidth coding protocol only for periods in which the high-frequency energy, that is, the energy corresponding to the upper subbands of the input signal, is imperceptible, for example to a typical listener. A high audio bandwidth coding protocol encodes a bandwidth that includes the highest audio subband and may include one or more lower subbands down to and including a particular lower audio subband. The high audio bandwidth is determined based on the type of input signal expected to be encoded, and may be determined by a theoretical method (for example, precision) or an empirical method (for example, expert listening or mean opinion score (MOS) tests), or may be the highest coding protocol bandwidth available in the system at a particular time. The empirically determined threshold settings for the mid, low, and high bit rates described above may be used in a single embodiment, in the form of corresponding tables like the one shown in Table 1 (but with empirically determined values). The first and second hysteresis values for the mid, low, and high bit rates may also be determined empirically in a single embodiment. The first and second hysteresis values may be the same for each of the transitions in the mid, low, and high bit rates.
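The idea of keeping one threshold table per bit-rate setting can be sketched as a simple lookup. Everything concrete here is an assumption made for illustration: the number of subbands (four), the numeric thresholds, and the names `THRESHOLD_SETS` and `select_threshold_set` are hypothetical, and the specification states that real values are determined empirically.

```python
from typing import Dict, List

# Hypothetical energy thresholds for N = 4 subbands, lowest subband first.
# The numbers are invented; only their relative ordering carries meaning.
THRESHOLD_SETS: Dict[str, List[float]] = {
    # High thresholds on the upper subbands favor low audio bandwidth.
    "low": [1.0, 8.0, 16.0, 32.0],
    # Intermediate thresholds, intended to track the input bandwidth.
    "mid": [1.0, 4.0, 4.0, 4.0],
    # Low thresholds on the upper subbands favor high audio bandwidth.
    "high": [1.0, 0.5, 0.5, 0.5],
}


def select_threshold_set(bit_rate_value: str) -> List[float]:
    """Return the threshold set associated with a bit-rate value."""
    if bit_rate_value not in THRESHOLD_SETS:
        raise ValueError(f"unknown bit-rate value: {bit_rate_value!r}")
    return THRESHOLD_SETS[bit_rate_value]
```

Because the lookup is cheap, calling it again whenever the bit-rate value changes during the audio signal switches the active threshold set from that point on.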
Referring to Figure 6, some steps of a method 600 for encoding an audio signal are shown, in accordance with some embodiments. The encoding may be performed in a personal communication device such as a cellular telephone or a networked tablet, in a remote sensing device, or in a fixed communication device. The steps need not be performed in the order shown. At step 605, a bit-rate value is received. The bit-rate value is one of a set of M bit-rate values. The bit-rate values may have identifiers. When M is 3, non-limiting examples of such identifiers are low, mid, and high, or index values (first, second, and so on). At step 610, a set of energy thresholds is selected based on the bit-rate value. The set of energy thresholds is one of a plurality of sets of N energy thresholds. The energy thresholds of each set correspond on a one-to-one basis with a set of subbands of the audio signal. (Thus, there are also N subbands of the audio signal.) At step 615, the audio signal is received. At step 620, the energy of each subband of the set of N subbands is determined. At step 625, the highest-frequency subband having energy that exceeds the corresponding threshold is determined. At step 630, a selected bandwidth of the audio signal is encoded. The selected bandwidth includes only those frequencies of the audio signal that lie in the highest-frequency subband having energy exceeding the corresponding threshold, and substantially all lower frequencies of the audio signal. It should be understood that steps 605-610 may be performed before, after, or approximately simultaneously with steps 615-620. The relationship between the steps described here and the functional blocks described with reference to Figure 2 is as follows: steps 615 and 620 may be performed by the sub-band spectral analysis function 210; steps 605, 610, and 625 may be performed by the threshold logic function 215; and step 630 may be performed by the audio encoding function 220.
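Steps 620 through 630 can be sketched in a few lines. This is a simplified illustration, not the patented implementation: it assumes the signal has already been split into per-subband sample frames by some filter bank, that energy is the sum of squared samples, and that the lowest subband is chosen when no subband exceeds its threshold. All function names and numeric values are hypothetical.

```python
from typing import Sequence, Tuple


def subband_energy(samples: Sequence[float]) -> float:
    """Step 620 (assumed form): energy as the sum of squared samples."""
    return sum(x * x for x in samples)


def highest_active_subband(energies: Sequence[float],
                           thresholds: Sequence[float]) -> int:
    """Step 625: index of the highest subband whose energy exceeds its threshold.

    Assumption: returns 0 (the lowest subband) when no subband qualifies.
    """
    for index in range(len(energies) - 1, -1, -1):
        if energies[index] > thresholds[index]:
            return index
    return 0


def selected_bandwidth_hz(band_edges_hz: Sequence[float],
                          top_index: int) -> Tuple[float, float]:
    """Bandwidth handed to the encoder in step 630: from the bottom of the
    lowest subband to the upper edge of the chosen subband."""
    return band_edges_hz[0], band_edges_hz[top_index + 1]


# Usage with invented values: four subbands, one short frame per subband.
edges = [50.0, 4000.0, 8000.0, 16000.0, 20000.0]
frames = [[1.0, -1.0], [0.5, 0.5], [0.1, 0.1], [0.0, 0.0]]
energies = [subband_energy(f) for f in frames]
top = highest_active_subband(energies, [0.1, 0.1, 0.1, 0.1])
bandwidth = selected_bandwidth_hz(edges, top)
```

Scanning from the top subband downward stops at the first qualifying subband, so energy in a lower subband never narrows the selected bandwidth.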
Referring to Figures 7-9, some further steps of the method 600 for encoding an audio signal are shown, in accordance with some embodiments. At step 705 (Figure 7), the selected bandwidth is limited to a maximum bandwidth that does not cause the transmitted data rate to exceed the maximum allowed transmitted data rate. At step 805 (Figure 8), a set of hysteresis values is selected based on the bit-rate value. The hysteresis values correspond to the subbands of the audio signal. The hysteresis values include at least one of a hysteresis delay for changing from a lower selected bandwidth to a higher selected bandwidth and a hysteresis delay for changing from a higher selected bandwidth to a lower selected bandwidth. At step 905 (Figure 9), at least the steps of determining the energies 620, determining the highest-frequency subband 625, and encoding 630 are performed on corresponding periodic bases, in response to one or more events. The events may be interrupts or counts of other events. In some embodiments, the steps may be performed using a common period. In some embodiments, the periodic bases may not all be the same. For example, the step of determining the energies 620 may be performed at a higher rate than the step of determining the highest-frequency subband 625. This has the effect of adding delay to some bandwidth decisions. In addition, step 615, receiving the audio signal, is typically performed on a periodic basis (for example, the digitized audio sample rate) that is faster than the periodic basis (for example, the audio frame rate) on which the sub-band spectral analysis function 210 determines the energy of each subband.
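The hysteresis delays of step 805 can be illustrated with a small state machine. This sketch makes several simplifying assumptions that are not in the specification: delays are counted in frames, a single pair of delays is used instead of per-subband values, and the class and attribute names are hypothetical.

```python
class HysteresisSelector:
    """Holds the current bandwidth index until a change has been requested
    for enough consecutive frames (assumed unit of delay)."""

    def __init__(self, up_delay: int, down_delay: int, initial: int = 0):
        self.up_delay = up_delay      # frames required before widening
        self.down_delay = down_delay  # frames required before narrowing
        self.current = initial
        self._direction = 0           # +1 widening, -1 narrowing, 0 none
        self._count = 0               # consecutive frames in that direction

    def update(self, candidate: int) -> int:
        """Feed one per-frame candidate index; return the index to use."""
        direction = (candidate > self.current) - (candidate < self.current)
        if direction == 0:
            self._direction, self._count = 0, 0
            return self.current
        if direction != self._direction:
            # Direction changed, so restart the consecutive-frame count.
            self._direction, self._count = direction, 0
        self._count += 1
        needed = self.up_delay if direction > 0 else self.down_delay
        if self._count >= needed:
            self.current = candidate
            self._direction, self._count = 0, 0
        return self.current


selector = HysteresisSelector(up_delay=2, down_delay=3, initial=1)
outputs = [selector.update(c) for c in [2, 2, 1, 0, 0, 0]]
```

In this run the selector widens after two consecutive requests and narrows only after three, which suppresses rapid toggling between coding protocols.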
The processes illustrated in this document, for example (but not limited to) the method steps described in Figures 6-9, may be performed using programmed instructions contained on a computer-readable medium that can be read by a processor of a CPU. The computer-readable medium may be any tangible medium capable of storing instructions to be executed by a microprocessor. The medium may be one of, or include one or more of, a CD disc, a DVD disc, a magnetic or optical disc, a tape, and removable or non-removable silicon-based memory. The programmed instructions may also be carried in the form of packetized or non-packetized wired or wireless transmission signals.
In the foregoing specification, specific embodiments of the present invention have been described. However, one of ordinary skill in the art will appreciate that various modifications and changes can be made without departing from the scope of the invention as set forth in the claims below. As one example, in some embodiments certain method steps may be performed in an order different from that described, and the functions described within the functional blocks may be arranged differently (for example, the bias table 370 and the threshold-with-hysteresis blocks 350-365 may be part of the voting logic function 215 rather than the sub-band spectral analysis function 210). As another example, any particular organization and access technique known to those of ordinary skill in the art may be used for tables such as the bias table 370. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention. The benefits, advantages, and solutions to problems, and any element or elements that may cause any benefit, advantage, or solution to occur or become more pronounced, are not to be construed as critical, required, or essential features or elements of any or all of the claims. The invention is defined solely by the appended claims, including any amendments made during the pendency of this application and all equivalents of those claims as issued.

Claims (10)

1. A method for encoding an audio signal, comprising:
receiving a bit-rate value;
selecting a set of energy thresholds based on the bit-rate value, wherein the set of energy thresholds is one of a plurality of sets of energy thresholds, and wherein the energy thresholds of each set of energy thresholds correspond on a one-to-one basis with a set of subbands of the audio signal;
receiving the audio signal;
determining the energy of each subband of the set of subbands;
determining the highest-frequency subband having energy that exceeds the corresponding threshold;
determining a selected bandwidth of the audio signal, the selected bandwidth including only those frequencies of the audio signal that lie in the highest-frequency subband having energy exceeding the corresponding threshold, and all lower frequencies of the audio signal above a high-pass cutoff frequency; and
encoding the selected bandwidth.
2. The method according to claim 1, further comprising: limiting the selected bandwidth to a bandwidth that does not cause the transmitted data rate to exceed a maximum allowed transmitted data rate.
3. The method according to claim 1, further comprising: selecting a set of hysteresis values based on the bit-rate value, the hysteresis values corresponding to the set of subbands of the audio signal, wherein the hysteresis values include at least one of a hysteresis delay for changing from a lower selected bandwidth to a higher selected bandwidth and a hysteresis delay for changing from a higher selected bandwidth to a lower selected bandwidth.
4. The method according to claim 1, further comprising: during the encoding of the audio signal, performing the steps of determining the energies, determining the highest-frequency subband, and encoding on corresponding periodic bases.
5. The method according to claim 1, wherein the thresholds of two or more sets of energy thresholds are such that two or more of the following conditions exist: lower audio bandwidth coding protocols are favored, the audio bandwidth of the selected coding protocol tracks the varying bandwidth of the input signal, and high audio bandwidth coding protocols are favored.
6. The method according to claim 1, wherein a change in the bit-rate value during the audio signal changes which set of thresholds is selected from the plurality of sets.
7. An apparatus for encoding an audio signal, comprising:
a receiver for receiving a bit-rate value; and
a processing system for:
selecting a set of energy thresholds based on the bit-rate value, wherein the set of energy thresholds is one of a plurality of sets of energy thresholds, and wherein the energy thresholds of each set of energy thresholds correspond on a one-to-one basis with a set of subbands of the audio signal;
receiving the audio signal;
determining the energy of each subband of the set of subbands;
determining the highest-frequency subband having energy that exceeds the corresponding threshold;
determining a selected bandwidth of the audio signal, the selected bandwidth including only those frequencies of the audio signal that lie in the highest-frequency subband having energy exceeding the corresponding threshold, and all lower frequencies of the audio signal above a high-pass cutoff frequency; and
encoding the selected bandwidth.
8. The apparatus according to claim 7, further comprising: limiting the selected bandwidth to a bandwidth that does not cause the transmitted data rate to exceed a maximum allowed transmitted data rate.
9. The apparatus according to claim 7, further comprising: selecting a set of hysteresis values based on the bit-rate value, the hysteresis values corresponding to the set of subbands of the audio signal, wherein the hysteresis values include at least one of a hysteresis delay for changing from a lower selected bandwidth to a higher selected bandwidth and a hysteresis delay for changing from a higher selected bandwidth to a lower selected bandwidth.
10. The apparatus according to claim 7, further comprising: during the encoding of the audio signal, performing the steps of determining the energies, determining the highest-frequency subband, and encoding on corresponding periodic bases.
CN201280061303.3A 2011-12-12 2012-12-03 Apparatus and method for audio encoding Active CN103999154B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US13/316,895 US8666753B2 (en) 2011-12-12 2011-12-12 Apparatus and method for audio encoding
US13/316,895 2011-12-12
PCT/US2012/067532 WO2013090039A1 (en) 2011-12-12 2012-12-03 Apparatus and method for audio encoding

Publications (2)

Publication Number Publication Date
CN103999154A true CN103999154A (en) 2014-08-20
CN103999154B CN103999154B (en) 2015-07-15

Family

ID=47358302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201280061303.3A Active CN103999154B (en) 2011-12-12 2012-12-03 Apparatus and method for audio encoding

Country Status (7)

Country Link
US (1) US8666753B2 (en)
EP (1) EP2791936A1 (en)
JP (1) JP5775227B2 (en)
KR (1) KR101454581B1 (en)
CN (1) CN103999154B (en)
CA (1) CA2859013C (en)
WO (1) WO2013090039A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107408392A (en) * 2015-04-05 2017-11-28 高通股份有限公司 Audio bandwidth selects
CN108172239A (en) * 2013-09-26 2018-06-15 华为技术有限公司 The method and device of bandspreading
CN110024029A (en) * 2016-11-30 2019-07-16 微软技术许可有限责任公司 Audio Signal Processing

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6556473B2 (en) * 2015-03-12 2019-08-07 株式会社東芝 Transmission device, voice recognition system, transmission method, and program
KR20180040716A (en) 2015-09-04 2018-04-20 삼성전자주식회사 Signal processing method and apparatus for improving sound quality
US11037581B2 (en) 2016-06-24 2021-06-15 Samsung Electronics Co., Ltd. Signal processing method and device adaptive to noise environment and terminal device employing same
WO2018086972A1 (en) * 2016-11-08 2018-05-17 Koninklijke Philips N.V. Method for wireless data transmission range extension
CN112530444B (en) 2019-09-18 2023-10-03 华为技术有限公司 Audio coding method and device
CN112599140A (en) * 2020-12-23 2021-04-02 北京百瑞互联技术有限公司 Method, device and storage medium for optimizing speech coding rate and operand

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5285498A (en) * 1992-03-02 1994-02-08 At&T Bell Laboratories Method and apparatus for coding audio signals based on perceptual model
CN1659785A (en) * 2002-05-31 2005-08-24 沃伊斯亚吉公司 Method and system for multi-rate lattice vector quantization of a signal
CN1748443A (en) * 2003-03-04 2006-03-15 诺基亚有限公司 Support of a multichannel audio extension
EP1703493A2 (en) * 1994-08-10 2006-09-20 Qualcomm Incorporated Method and apparatus for selecting an encoding rate in a variable rate vocoder
CN1860526A (en) * 2003-09-29 2006-11-08 皇家飞利浦电子股份有限公司 Encoding audio signals
US20100324708A1 (en) * 2007-11-27 2010-12-23 Nokia Corporation encoder

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5115240A (en) 1989-09-26 1992-05-19 Sony Corporation Method and apparatus for encoding voice signals divided into a plurality of frequency bands
IT1281001B1 (en) * 1995-10-27 1998-02-11 Cselt Centro Studi Lab Telecom PROCEDURE AND EQUIPMENT FOR CODING, HANDLING AND DECODING AUDIO SIGNALS.
US6091723A (en) * 1997-10-22 2000-07-18 Lucent Technologies, Inc. Sorting networks having improved layouts
JP2006018023A (en) 2004-07-01 2006-01-19 Fujitsu Ltd Audio signal coding device, and coding program
CN101512639B (en) 2006-09-13 2012-03-14 艾利森电话股份有限公司 Method and equipment for voice/audio transmitter and receiver

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108172239A (en) * 2013-09-26 2018-06-15 华为技术有限公司 The method and device of bandspreading
CN107408392A (en) * 2015-04-05 2017-11-28 高通股份有限公司 Audio bandwidth selects
CN110024029A (en) * 2016-11-30 2019-07-16 微软技术许可有限责任公司 Audio Signal Processing
CN110024029B (en) * 2016-11-30 2023-08-25 微软技术许可有限责任公司 audio signal processing

Also Published As

Publication number Publication date
CA2859013A1 (en) 2013-06-20
CA2859013C (en) 2016-01-26
US20130151260A1 (en) 2013-06-13
WO2013090039A1 (en) 2013-06-20
KR101454581B1 (en) 2014-10-28
EP2791936A1 (en) 2014-10-22
KR20140085596A (en) 2014-07-07
JP5775227B2 (en) 2015-09-09
US8666753B2 (en) 2014-03-04
JP2015505991A (en) 2015-02-26
CN103999154B (en) 2015-07-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20160406

Address after: California, USA

Patentee after: Google Technology Holdings LLC

Address before: Illinois, USA

Patentee before: Motorola Mobility, Inc.