WO2006025798A1 - A method and system for monitoring of acoustic signals - Google Patents


Info

Publication number
WO2006025798A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
segment
section
frequency
acoustic
Prior art date
Application number
PCT/SG2005/000294
Other languages
French (fr)
Inventor
John Robert Potter
Eric Delory
Teong Beng Koay
Mandar A. Chitre
Sheldon Ruiz
Original Assignee
National University Of Singapore
Priority date
Filing date
Publication date
Application filed by National University Of Singapore filed Critical National University Of Singapore
Publication of WO2006025798A1 publication Critical patent/WO2006025798A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04BTRANSMISSION
    • H04B11/00Transmission systems employing sonic, ultrasonic or infrasonic waves
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01HMEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
    • G01H3/00Measuring characteristics of vibrations by using a detector in a fluid
    • G01H3/04Frequency
    • G01H3/08Analysing frequencies present in complex vibrations, e.g. comparing harmonics present

Definitions

  • the invention relates broadly to a method of audible monitoring of acoustic signals outside human hearing range and to a real-time system for audible monitoring of acoustic signals outside human hearing range.
  • the prior art includes ultrasound detectors and bandwidth shifters, which detect the presence of ultrasound, but do not provide a very representative sense of what the original 'sounds like'. Either there is no bandwidth compression at all (as for heterodyning) or the output is significantly distorted (as for frequency dividers).
  • the first successful ultrasound detectors were of the superheterodyne type. In this approach, a narrow band of frequencies (e.g., 10kHz wide) from the microphone signal is amplified, converted to a higher frequency for additional amplification, converted again to an audible frequency, and finally used to drive a loudspeaker or headphones. This does not compress the bandwidth, it simply translates a small part of the ultrasound band into the human hearing band.
  • a major drawback to superheterodyne detectors is that they miss signals occurring outside the frequency band to which they happen to be tuned. As a result, several ways have been devised to produce a broadband response, thereby allowing the user to listen "at all frequencies at once".
  • An early envelope-detector design produced a signal that outlined the shape of each ultrasound pulse by following the peaks of their individual cycles.
  • a drawback in this case is that while the intensity and repetition rate of the input are captured, frequency information is lost. Longer sounds, like those produced by bats, do not produce clear output, and a continuous input signal is not registered at all. This type of detector is no longer widely used, and because it currently offers little economic advantage over superheterodyne circuitry, is found only in the very cheapest instruments.
  • Time-expansion methods are limited by the fact that the total information that can be expressed in a signal is inherently limited by the product of the time and frequency range involved (the time-bandwidth product). Since ultrasound detectors work in real-time with a narrow spectral bandwidth, this product is reduced compared with the original signal, so information potential must be discarded. In choosing a particular real-time detection method, the observer must choose what information is required and what can be dispensed with.
  • a third major area of development in frequency and bandwidth compression is in the correction of speech from divers at depth breathing Helium gas mixtures.
  • Helium mixtures have physiological advantages for deep divers, but under pressure Helium significantly increases the resonant frequencies of a diver's vocal tract, so that the diver's speech is very high pitched and difficult for untrained listeners to understand.
  • Many Helium decoders have been designed for commercial production. The requirement is for real-time compression, so as not to miss any important speech; time-expansion is therefore inappropriate.
  • Modern digital approaches use algorithms that are very effective, but that rely on the structure of speech and exploit limitations in the time-frequency content.
  • An example of such an approach is the use of high order linear prediction to deconvolve and scale the vocal components.
  • a method of audible monitoring of acoustic signals outside human hearing range comprising the steps of dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section being representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
  • Each section in said intermediate signal may be derived using frequency domain compression of said segment.
  • Said frequency domain compression may comprise obtaining a series of complex frequency estimates in the frequency domain for each segment.
  • the method may further comprise replacing the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates.
  • Each single estimate may be derived such that information corresponding to the amplitude and phase is preserved.
  • Each single estimate may comprise a centroid for the corresponding m sequential complex frequency estimates.
  • Deriving each single estimate may comprise weighting the m sequential estimates according to their proximity to said each single estimate.
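The frequency-domain compression described above may be sketched as follows (a minimal NumPy illustration; the function name and the toy parameters are assumptions, not taken from the patent):

```python
import numpy as np

def compress_segment_freq(segment, m):
    """Bandwidth-compress one segment by grouping frequency estimates.

    Sketch of the claimed steps: transform the segment to the frequency
    domain, replace each consecutive set of m complex frequency estimates
    with their centroid (arithmetic mean in the complex plane, preserving
    amplitude and phase), and transform back.  The shorter result, played
    at 1/m of the input sampling rate, occupies the same wall-clock time
    as the original segment with every frequency divided by m.
    """
    n = (len(segment) // m) * m            # trim so the groups divide evenly
    spec = np.fft.rfft(segment[:n])
    k = (len(spec) // m) * m
    centroids = spec[:k].reshape(-1, m).mean(axis=1)
    return np.fft.irfft(centroids)

# Toy usage: a ~96 kHz tone sampled at 480 kHz, compressed by m = 4,
# reappears near 24 kHz when the output is played back at 120 kHz.
fs = 480_000
n = 4096
f0 = 820 * fs / n                          # an exact DFT bin, avoiding leakage
seg = np.sin(2 * np.pi * f0 * np.arange(n) / fs)
out = compress_segment_freq(seg, 4)
```

The centroid (complex mean) is only one possible representative of each group, as the later passages note; any reduction preserving amplitude and phase in some sense would fit the claim.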
  • Each section in said intermediate signal may be derived using time domain compression of said segment.
  • Said time domain compression may comprise forming each section by truncating the corresponding segment in the time domain and concatenating the sections.
  • f_hi is the upper end of an original bandwidth in the acoustic signal
  • f_max is the maximum audible frequency.
  • the method may further comprise down sampling the intermediate signal based on the compression factor to derive the audible output signal.
  • Each section in said intermediate signal may be derived using time domain expansion of said segment.
  • Each section in said intermediate signal may comprise a repetition of said segment according to a repetition factor, and the sections are concatenated.
  • The intermediate signal may be up-sampled based on the repetition factor to derive the audible output signal.
  • Said concatenation of the sections may comprise applying a smoothing function to each section.
  • the smoothing function may comprise a cosine bell window.
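The time-domain compression, the repetition-based expansion and the cosine-bell smoothing described above may be sketched together as follows (a minimal NumPy illustration; function names, segment lengths and factors are illustrative assumptions):

```python
import numpy as np

def cosine_bell(n):
    # Cosine bell (Hann-style) smoothing window, applied to each section
    # to avoid clicks at the concatenation boundaries.
    return 0.5 * (1 - np.cos(2 * np.pi * np.arange(n) / (n - 1)))

def compress_time(signal, seg_len, C):
    """Keep the first seg_len // C samples of each seg_len-sample segment.

    The concatenated result has 1/C as many samples; reinterpreted
    ("down-sampled") at fs/C it lasts as long as the input with every
    frequency divided by C.
    """
    keep = seg_len // C
    win = cosine_bell(keep)
    sections = [signal[i:i + keep] * win
                for i in range(0, len(signal) - seg_len + 1, seg_len)]
    return np.concatenate(sections)

def expand_time(signal, seg_len, R):
    """Repeat each segment R times (the infrasound expansion path).

    Played back at R times the input sampling rate, the output lasts as
    long as the input with every frequency multiplied by R.
    """
    win = cosine_bell(seg_len)
    sections = [np.tile(signal[i:i + seg_len] * win, R)
                for i in range(0, len(signal) - seg_len + 1, seg_len)]
    return np.concatenate(sections)
```

A production version would retain the highest-energy section of each segment rather than always the first samples, as the threshold-search bullet below describes.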
  • Dividing the acoustic signal may comprise searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion.
  • the method may comprise a method of compressing and/or expanding signal bandwidth for real-time audible monitoring of acoustic signals outside human hearing range.
  • a system for real time audible monitoring of acoustic signals outside human hearing range comprising: an analog to digital converter or sampler for sampling the acoustic signal; a processor for dividing said sampled acoustic signal into segments of a selected duration and for deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and a digital to analog converter for deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
  • Each section in said intermediate signal may be derived by the processor using frequency domain compression of said segment.
  • Said frequency domain compression may comprise obtaining a series of complex frequency estimates in the frequency domain for each segment.
  • the processor may replace the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates. Each single estimate may be derived by the processor such that information corresponding to the amplitude and phase is preserved.
  • Each single estimate may comprise a centroid for the corresponding m sequential complex frequency estimates.
  • Deriving each single estimate by the processor may comprise weighting the m sequential estimates according to the proximity to said each single estimate.
  • Each section in said intermediate signal may be derived by the processor using time domain compression of said segment.
  • Said time domain compression may comprise forming each section by truncating the corresponding segment in the time domain and concatenating the sections.
  • Said segment may be truncated according to a compression factor (C):
  • the sampler may down-sample the intermediate signal based on the compression factor to derive the audible output signal.
  • Each section in said intermediate signal may be derived by the processor using time domain expansion of said segment.
  • Each section in said intermediate signal may comprise a repetition of said segment according to a repetition factor, and the sections are concatenated.
  • the sampler may up-sample the intermediate signal based on the repetition factor to derive the audible output signal.
  • Said concatenation of the sections may comprise applying a smoothing function to each section by the processor.
  • the smoothing function may comprise a cosine bell window.
  • Dividing the acoustic signal may comprise searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion.
  • the device may be implemented as a portable device.
  • the portable device may be incorporated into a diving harness.
  • a data storage medium having stored thereon code means for instructing a computer to execute a method of audible monitoring of acoustic signals outside human hearing range, the method comprising the steps of dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
  • FIG. 1 is a block diagram of the hardware modules according to an example embodiment of the present invention.
  • Figure 2 is a block diagram of the processing means according to an example embodiment of the present invention.
  • Figure 3 is a front view of the hardware, installed and ready for use, according to an example embodiment of the present invention.
  • Figure 4 is a schematic diagram of the bandwidth compression algorithm according to an example embodiment.
  • Figure 5 is a schematic diagram of the bandwidth expansion algorithm according to an example embodiment.
  • Figure 6 is a graph of input-output cross-correlation for variations in parameter values according to an example embodiment.
  • Figure 7 is a flow chart of a method for real-time audible monitoring of acoustic signals according to an example embodiment.
  • Figure 8 is a process diagram according to an example embodiment of the present invention.
  • Figure 9 is a flow diagram according to an example embodiment of the present invention.
  • Figure 10 is a flow diagram according to an example embodiment of the present invention.
  • Figures 11 to 21 are circuit diagrams of an implementation according to an example embodiment of the present invention.
  • Ultrasound monitoring is a way to detect electrical discharge and high-pitched structural noise produced by a system, and can be useful for monitoring one or more conditions of a system.
  • UM can be used to detect arcing in electrical circuits and in electrical cabling, and leaks in boilers, pressure vessels and vacuum vessels. Being able to identify 'signatures' from these ultrasounds may permit correlation of the signature to a system fault (or behaviour), rather than just detecting the fault, which could greatly increase productivity. Similarly, it may also be useful to monitor other inaudible sounds, for example infrasound, in real time.
  • One embodiment of the present invention includes a real-time processing engine that may translate at least an inaudible signal to an audible signal while preserving the 'signature' of the original. For example, a diver could swim along with a dolphin and interact with the dolphin based on the ultrasonic 'communication'.
  • An example embodiment of the present invention is now described with reference to Figure 1.
  • An input device or transducer (generally depicted as 100) provides an input signal 102 representative of an acoustic signal.
  • the input signal is provided as input to a processing unit 104.
  • the processing unit 104 processes the input signal 102 and provides an output signal 106 including an audible representation of the detected input acoustic signal.
  • the output signal 106 is provided to an output device or transducer (generally depicted as 108).
  • the output device 108 generates an audible output acoustic signal representation of the detected input acoustic signal.
  • input device 100 may comprise a hydrophone 110, a broadband microphone 112, a vibration sensor 114 or a microphone nozzle 116.
  • input device 100 has a broadband response to at least infrasonic, audible and ultrasonic acoustics.
  • input device 100 may be limited to a specific range within infrasonic, audible and ultrasound.
  • a hydrophone 110 may be applicable.
  • An example hydrophone may have a sensitivity of -186±3dB re 1V/μPa and a directivity of 360°±2dB in the radial axis.
  • the input response might for example be 10-480kHz for such a transducer.
  • Processing unit 104 may for example comprise an electronic circuit, a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP) or a computer. Generally the choice will depend on the application. In a miniature portable application for example, a DSP might be chosen that is capable of processing 4800MIPS.
  • FIG. 2 shows one embodiment of the processing unit 104 in block diagram form.
  • a connector 200 provides connections to external components for example an input device 100 and an output device 108. The signals from the connector 200 are then provided internally of a casing 204, to components within.
  • the input signal 102 is provided first to an input signal-conditioning module 206 for conditioning.
  • the conditioned analogue signal 208 is then provided to an Analogue to Digital Converter (ADC) module 210 for conversion to the digital domain.
  • the ADC 210 could provide 14-bit input sampling with one input channel.
  • the ADC 210 generates a digital input signal 212 which is provided (via a series of interrupts and Enhanced Direct Memory Access (EDMA) transfers) to a DSP 214 for processing to generate a digital output 216.
  • the DSP 214 may include storage, for example Random Access Memory (RAM) and/or Read Only Memory (ROM).
  • the storage could be used to store and execute software, for example, any of the algorithms described herein implemented as software. Alternatively such software could be provided on an external data storage medium and interfaced with the DSP 214.
  • the digital output 216 is provided to a Digital to Analogue Converter (DAC) module 218 for conversion to the analogue domain.
  • the DAC 218 generates an analogue output signal 220 which is provided to an output-conditioning module 222 for conditioning.
  • the conditioned analogue output signal 106 is then conveyed externally of the casing 204 through the connector 200 to the output device 108, which then generates the audible sound, visible indication or vibration representing the original acoustic signal.
  • the various circuits within the casing 204 require a power source.
  • a battery 226 or other energy storage could be provided.
  • an AC supply could be converted to DC voltage, by way of example only.
  • the battery 226 is connected via a DC charging bus 228 to a battery-charging module 230.
  • the battery-charging module 230 may receive an AC or DC supply 232 from charging inlet 202 via connector 200.
  • the battery 226 may be selected depending on application, for example a Lithium-ion 7.2V battery pack.
  • the output supply from the battery 226 is supplied via a DC output bus 234 to a low drop-out regulator (LDO) 236, which in turn supplies digital circuits via a digital low voltage DC bus 238 and analogue circuits via an analogue low voltage DC bus 240.
  • the battery might range in voltage from 6-12V and have an expected life of 19 hours (based on 2W consumption in detection mode and 50mW in standby mode).
  • DSP module 214 is programmed to provide an appropriate representation of the input depending on the application. For example a method is described later for a compression algorithm which provides an audible representation of a broad band of inaudible frequencies.
  • output device 108 may comprise a bone conduction piezo-electric transducer 118, an earphone/headphone 120, a speaker 122 or a visible indicator (not shown). Generally the choice will depend on the application. One skilled in the art will appreciate that certain forms of transducer will be appropriate in a given application. In one embodiment output device 108 has a response of between 1 and 20kHz for a 44ksps processed signal. In a marine or underwater application, for example, a bone conduction piezo-electric transducer 118 may be applicable.
  • Underwater application: Referring to Figure 3, an example apparatus is shown for implementing an embodiment of the present invention in a marine or underwater application.
  • Hydrophone system 300 (including pre-amplifier and separate battery) is attached to a diver's BCD 304 or similar.
  • the system 300 can be configured to allow a pair of hydrophones 302 to be held in a rigid relationship for directional mode operation.
  • a processing unit 304 forming part of the system 300 may be switched to directional mode.
  • the processing unit 304 may be placed in the pocket of the BCD 306, on a custom harness or similar and may have an LED signal strength display for directional information.
  • Underwater earphone 308 can be mounted on the diver's mask strap next to the bony region immediately in front of the ear to provide bone conduction hearing. It may be desirable in non-intrusive monitoring (such as monitoring the behavioural ultrasound communications between marine mammals with minimum disturbance) to minimise reflection disturbance to the animal. Practically such a system should be sensitive enough to detect animals before the animals detect the human, who is operating the equipment.
  • the received signal power (P_R) is given by
  • P_T is transmitted signal power
  • SPG is the signal processing gain (for example the physiology of the animal might provide some detection advantage for reflections, if it exists)
  • TS is the Target Strength
  • the remaining terms are the Spreading Losses and absorptions associated with the outward and inward signal paths.
  • the Target Strength of a diver might be around 0 dB with proper wet suit materials. Assuming the mammal is able to reconstruct the received signal in a near-perfect way, the Spreading Loss would behave approximately as 20 Log 10 (range) for the first 40 m or so, and then 10 Log 10 (range) from 40 m onwards.
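The piecewise spreading-loss model described above (approximately spherical spreading out to 40 m, cylindrical beyond) can be written as a small helper. The exact equation is not reproduced in the text, so this is only a sketch of the standard form the passage implies:

```python
import math

def spreading_loss_db(r):
    """One-way spreading loss in dB for range r in metres (r >= 1).

    Spherical spreading (20 log10 r) out to ~40 m, then cylindrical
    spreading (10 log10) from 40 m onwards, matched at the transition.
    """
    if r <= 40:
        return 20 * math.log10(r)
    return 20 * math.log10(40) + 10 * math.log10(r / 40)
```

For example, the loss at 40 m is about 32 dB, and moving out to 400 m adds only another 10 dB under the cylindrical regime, which is why a quiet, listening-only diver can plausibly detect an echolocating animal before the animal's reflections off the diver become usable.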
  • the diver equipped with an embodiment of the present invention may have the advantage of detecting the bio-acoustic signal of the target animal before the animal detects the diver. This removes the diver from being a possible acoustic disturbance as long as he/she stays far enough away.
  • FIGS 11 to 21 show circuit diagrams of an implementation of an example embodiment of the invention.
  • Figure 11 shows the CPLD connections, external switches and indicators of the example embodiment.
  • Figure 12 shows the Peripheral expansion connectors of the example embodiment.
  • Figure 13 shows the Power and ground pin connections of DSP core of the example embodiment.
  • Figure 14 shows the SDRAM connections to the DSP core of the example embodiment.
  • Figure 15 shows the Boot Flash connections to the DSP core of the example embodiment.
  • Figure 16 shows the JTAG emulation port connections of the example embodiment.
  • Figure 17 shows the McBSP connections from DSP for ADC board of the example embodiment.
  • Figure 18 shows the buffering of crystal clocks for CPLD of the example embodiment.
  • Figure 19 shows the Connections of DSP to ADC and the codec used as DAC of the example embodiment.
  • Figure 20 shows the Clock connections to the DSP core of the example embodiment.
  • Figure 21 shows the on board DC/DC converter of the example embodiment.
  • a TMS320C6416 processor running at 600MHz is utilised.
  • the processor has a Flash that is connected to EMIFB CE2 as a boot Flash, a THS14F01 ADC configured at 500ksps connected to EMIFA, and an AIC23 codec connected to McBSP as output.
  • a built in power supply module steps down and regulates the Li-Ion supply voltage (7.2V) to ±5V, 5V and 3.3V.
  • An underwater headphone system is driven by the AIC23 codec output while the input signal is obtained from an ITC1042 hydrophone that is signal conditioned by high-impedance voltage follower in the analogue board.
  • the THS14F01 ADC digitises the data from the analogue board output.
  • DSP module 214 provides the processing from a wide band ultrasonic and/or infrasonic signal to an audible signal which retains the essential characteristics or signature of the original signal in real-time.
  • DSP 214 is programmed according to a method described below. A compression algorithm is used for ultrasound and an expansion algorithm is used for infrasound.
  • Compression/expansion may either be in the frequency domain or the time domain.
  • Such programs or software including an algorithm can either be stored on internal storage of the DSP 214 or provided on an external data storage medium.
  • Referring to Figure 7, an example embodiment is shown in a flow chart for a method of audible monitoring of acoustic signals outside human hearing range.
  • the acoustic signal is divided into segments of a selected duration.
  • an intermediate signal comprising a concatenation of sections is derived, each section being representative of one corresponding segment and having a different duration than the corresponding segment.
  • an audible output signal is derived from the intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
  • Process step 800 includes two hardware interrupt (HWI) driven Enhanced Direct Memory Access (EDMA) transfers.
  • Process step 802 includes a software interrupt (SWI) that will initiate a process to consolidate the data into the i/p buffer.
  • Process step 804 includes a software interrupt (SWI) that will initiate a process to consolidate the data from the o/p buffer.
  • Process step 806 relates to the main compression algorithm, which repeatedly polls the data in the i/p buffer and places the result in the o/p buffer after compression.
  • Process step 808 runs on power up including initialisations and enabling the interrupts.
  • Referring to Figure 9, an example embodiment of the main compression algorithm is shown.
  • the sampled data is first transferred to a pair of ping-pong buffers, and the data is unpacked to 16-bit, separated into channel data and placed in two input buffers (i/p buffer left channel and right channel) waiting to be processed.
  • In step 902 the availability of the data in the i/p buffer is checked.
  • In step 904, if there is enough data, the input signal is compressed, and in step 908 the result is placed into the o/p buffer (one each for the left and right channels).
  • In step 904 two intermediate buffers are used: one for the windowing process (in step 904) and the other for interpolation and low-pass filtering (in step 906).
  • In step 910 the data in the o/p buffers is then repackaged and transferred into separate pairs of ping-pong buffers, and then sent to the DAC for playback.
  • FIG. 10 shows an example embodiment of the time domain compression process in real-time.
  • In step 1000 a check is done for any data to be discarded (e.g. a time domain algorithm that retains a window of data, discards a proportion of subsequent data, and then reduces the sampling rate).
  • In step 1001 as many samples as possible are discarded if there is something to discard.
  • In step 1002 compression only starts if the o/p buffer will not be overflowed by the processed data.
  • In step 1004 compression starts whenever the i/p buffer contains enough samples. The rest of the steps are skipped if there is either not enough data left for processing or processing the data would overflow the o/p buffer.
  • In step 1006 the data section with high energy is located (because most likely there will be signal above ambient noise) and the initial retained window is placed starting there.
  • In step 1008 the end of the retained window is adjusted to a near-zero value with a gradient matching the start gradient of the previous section.
  • In step 1010 a smoothing window is applied to the retained window.
  • In step 1012 the signal is transferred to the o/p buffer, the number of samples to be discarded is calculated and the discard counter is updated. The steps are repeated and the respective samples to be discarded are then removed at the beginning (step 1000). Any higher priority process (such as the HWI and SWI processes that transport data between buffers and input/output peripherals) can pre-empt the compression process to ensure no data is missed and to ensure continuous playback. The compression process continues after any higher priority processes are completed.
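The polling loop of steps 1000-1012 might be sketched as follows. This is a simplified, single-threaded stand-in: plain deques replace the interrupt-fed ping-pong DMA buffers, and the pre-emption by HWI/SWI processes is omitted:

```python
from collections import deque
import numpy as np

def compression_loop(ip_buffer, op_buffer, op_capacity, seg_len, C, threshold=0.0):
    """One polling pass over the steps above, on deques of samples."""
    retained = seg_len // C
    while len(ip_buffer) >= seg_len:                       # step 1004
        if len(op_buffer) + retained > op_capacity:        # step 1002
            break                                          # would overflow o/p
        seg = np.array([ip_buffer.popleft() for _ in range(seg_len)])
        # Step 1006: place the retained window on the highest-energy section,
        # falling back to the segment start if nothing exceeds the threshold.
        energy = np.convolve(seg ** 2, np.ones(retained), mode="valid")
        start = int(np.argmax(energy)) if energy.max() > threshold else 0
        window = seg[start:start + retained]
        # Steps 1008-1010 (simplified): taper the window edges so the
        # retained sections concatenate without clicks.
        window = window * np.hanning(retained)
        op_buffer.extend(window)                           # step 1012
```

On the DSP the same logic runs as the lowest-priority task, so the buffer-transport interrupts can always pre-empt it.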
  • the constraining assumption in an example embodiment is that each time sample is potentially as important as any other. Similarly, in the frequency domain, each frequency sample is as important as any other. Since one objective is to reduce the frequency bandwidth, not the temporal duration, one alternative is to operate the method in the frequency domain. Movement between the frequency and temporal domains by means of the forward and inverse Fourier Transforms is possible, since these are orthogonal. Use of the Discrete Windowed Fourier Transform (DWFT) and its inverse is appropriate in an example embodiment, since the input series is finite.
  • the example embodiment incorporates restrictions imposed by Shannon's sampling theorem and the Nyquist frequency limit. The inaudible time series is divided into segments, then each segment is compressed and the segments are concatenated to resemble a new but representative time series in the audio band.
  • the compression is performed in the frequency domain. Initially the signal is segmented, each segment of the time series is transformed into the frequency domain, the frequency band is compressed, and transformed back to the time domain. This results in play back with the new (reduced) sampling rate.
  • the signal may be divided into short discrete sections; Discrete Windowed Fourier Transforms (DWFT) and the inverse-DWFT can be used in the example embodiment.
  • complex frequency estimates are grouped and each group is replaced with a value that is representative of both the amplitude and phase information of the particular group.
  • the number of frequency estimates in each group may be determined by the desired frequency compression ratio. Since nothing is assumed about the signal in one embodiment, all the frequency estimates are treated with equal priority.
  • the frequency estimates are grouped evenly across the frequency band.
  • the result is that the high-resolution frequency structures are removed and the gross signature of the signal is retained.
  • This is achieved in the example embodiment by convolving the frequency estimates with a rectangular window, followed by reducing the sampling frequency accordingly.
  • the convolution process effectively low pass filters the frequency estimates, where the length of the window is related to the compression ratio desired.
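The claimed equivalence between grouping estimates and convolving with a rectangular window followed by sub-sampling is easy to check numerically (toy data; NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 4
estimates = rng.normal(size=64) + 1j * rng.normal(size=64)  # toy complex frequency estimates

# Grouped centroids: arithmetic mean of each consecutive set of m estimates.
centroids = estimates.reshape(-1, m).mean(axis=1)

# The same values via convolution with a length-m rectangular ('top-hat')
# window of weight 1/m, then sub-sampling by factor m.
smoothed = np.convolve(estimates, np.ones(m) / m, mode="full")
subsampled = smoothed[m - 1::m][:len(centroids)]
# centroids and subsampled agree element for element
```

The convolution view makes the low-pass interpretation explicit: the rectangular window smooths the spectrum before the sub-sampling discards the now-redundant resolution.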
  • An example series might include a zero-mean, detrended sequence of 2N data points in time, regularly sampled at an input sampling frequency of f_i, with zero energy at frequencies above f_i/2 Hz.
  • the input sampled signal has a duration of 2N/f_i seconds.
  • N complex frequency estimates are obtained.
  • One objective is to reduce bandwidth, so correspondingly the number of frequency estimates may be reduced or they may be placed closer together in the frequency domain. Placing them closer together increases the frequency resolution, which is equivalent to extending the time in the temporal domain (time dilation), which is less desirable (i.e. not real-time monitoring). Therefore the N estimates are replaced with n, where n < N and the desired compression factor is n/(n+N), in an example embodiment.
  • the assumption is that no particular frequency estimate is more important than any other.
  • the result is that frequency estimates are culled evenly across the bandwidth. Every sequential set of m of the N frequency estimates is replaced with a single one that best represents the m originals in some sense according to this assumption.
  • the centroid of each set of m complex numbers in the complex plane is used, which is mathematically equivalent to the arithmetic mean of these numbers. The centroid preserves both amplitude and phase information. This is mathematically equivalent to convolving the original N frequency estimates with a 'top-hat' square window filter of width m and then regularly sub-sampling by factor m.
  • the desired bandwidth compression can then be obtained by dividing the frequency of each retained estimate by m (equivalent to dividing the sampling frequency by m in the time domain) and applying the inverse DWFT to obtain a time series that can be played at an f_i/m sampling rate in the same time as the original series occupied.
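A worked numeric example (the figures are illustrative assumptions, not taken from the patent) shows how the rates relate:

```python
import math

# Assumed numbers: an input sampling frequency f_i of 480 kHz, with
# m = 24 estimates grouped per centroid, gives a 20 kHz playback rate.
f_i = 480_000
m = 24
f_out = f_i / m
assert f_out == 20_000

# A 2N-point segment yields N estimates, reduced to N/m; played back at
# f_i/m it occupies the same wall-clock time as the original segment,
# which is what keeps the monitoring real-time.
N = 4096
original_duration = 2 * N / f_i            # seconds of input per segment
compressed_duration = (2 * N / m) / (f_i / m)
assert math.isclose(original_duration, compressed_duration)
```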
  • This example embodiment is effective at low-pass filtering in frequency space, so that highly-structured detail in frequency modulations is removed while the overall form of the frequency structure of the signal is retained.
  • the time-bandwidth product is reduced by a factor m, resulting in discarding of information.
  • the approach of regularly (in frequency space) replacing each consecutive set of m estimates with a single estimate in this example embodiment satisfies the underlying assumptions about the information distribution. It will be appreciated that the centroid is not the only possible choice for determining the single estimate.
  • Other methods of generating a single complex number to represent the set of m originals may be used in different embodiments which preserve, in some sense, the amplitude and phase.
  • the example embodiment involves the convolution of the frequency estimates with a square 'top-hat' window.
  • the time domain algorithm may have the advantage of reduced computational load. As a result, it reduces signal processing power requirement; hence making it more suitable for a real-time, battery-operated system. It has been recognised by the inventors that the theoretically optimal solution of a low pass filter in the frequency domain corresponds to multiplication by a square 'top-hat' window in the time domain.
  • the acoustic signal is initially divided into a series of segments as for the frequency domain compression embodiments described above. In one embodiment, a section in each segment is retained in the time domain and the remaining portion of the segment is truncated. This is repeated throughout the entire time series. The ratio between the retained section and the discarded portions is related to the compression ratio.
  • the intermediate signal may be audibly replayed in the same time as the original signal occupied, thereby operating in real-time.
  • an example series 400 is shown in the time domain sampled at frequency fs over a period of one second, yielding fs samples or segments e.g. 401. For each segment e.g. 401, a section of data (t) is retained and the following section (T) is discarded according to the compression factor C (i.e. T = (C − 1)·t, so that a fraction 1/C of each segment is retained).
  • the retained windows (t) are concatenated yielding fs/C samples e.g. 402 lasting 1/C seconds at fs sampling rate.
  • the operation of compressing one block of data is implemented on multiple sequential blocks of data and the blocks are concatenated.
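A minimal sketch of this retain/discard compression, under the assumptions above (keep t seconds out of every C·t seconds, concatenate, replay at fs/C). The function name and parameters are illustrative, not the patent's:

```python
import numpy as np

def compress_time_domain(x, fs, C, t_keep):
    """Keep t_keep seconds out of every C * t_keep seconds and concatenate.

    Playing the result back at fs / C then occupies the original duration,
    with all frequencies divided by C (time-domain bandwidth compression).
    """
    keep = int(round(t_keep * fs))       # samples in retention window t
    seg = keep * C                       # segment length t + T, T = (C - 1) t
    pieces = [x[i:i + keep] for i in range(0, len(x) - seg + 1, seg)]
    y = np.concatenate(pieces)           # ~len(x) / C samples retained
    return y, fs / C                     # replay rate for real-time output
```

For C = 10 and a 10 ms retention window, one second of input yields one tenth of the samples, which last one second again when replayed at fs/10.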
  • the length of retention window (t) preferably is at least as long as required to capture the lowest frequency of interest after bandwidth compression, and not so long as to risk undersampling the temporal modulation structure of the signal.
  • the retention window length is therefore in the region of a few milliseconds to tens of milliseconds in an example embodiment. It may also be selected interactively by the user. If left with coarse transitions, the concatenation process may produce spurious noise due to discontinuities between the edges of subsequent sections that are joined together. This defect or artifact may be corrected.
  • one method in an example embodiment is to apply a smoothing window to each segment and introduce overlaps when concatenating them.
  • Another example method is to allow some flexibility when defining the retention sections in the time series to minimise the level difference. Such methods can significantly reduce spurious noise.
  • a smoothing window can be applied that tapers the data to zero at each end of the retention window.
  • the selected smoothing windows can be made to overlap or 'feather' the exiting and entry points together. This is analogous to the windowing in Welch's spectral estimation procedure for DWFT.
  • a cosine bell smoothing window is used, given by
  • This retention window has the property that
  • the smoothing window may be applied to the signal amplitude, rather than power. If incoherent frequency content is assumed in an example embodiment, the smoothing window is applied to the power, and therefore its square root to the amplitude.
  • the length of this smoothing window should be shorter than t/2, i.e. half the length of the sub-selected data block. As an example, k might vary between 10% and 25% of t.
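A sketch of a cosine bell taper of this kind. The exact equation is not reproduced above, so the standard cosine bell form is assumed here: unity gain in the middle of the retention window and half-cosine tapers of k samples at each end, with k shorter than half the window.

```python
import numpy as np

def cosine_bell_taper(length, k):
    """Cosine bell retention window (standard form, assumed): tapers the
    data to zero at each end of the retention window over k samples."""
    assert 0 < k < length // 2
    w = np.ones(length)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(k) / k))  # rises 0 -> ~1
    w[:k] = ramp
    w[-k:] = ramp[::-1]
    return w
```

Overlapping the falling taper of one window with the rising taper of the next 'feathers' the joins, as in Welch's procedure.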
  • the choice of crossover window size may be operator-controlled in a real-time system embodiment.
  • In another alternative, when the algorithm jumps forward, discarding T points, the start point of the retention window is provisionally selected. The start point is then stepped through the data until the signal value becomes negative. The start point is then stepped again through the data until the first positive value is encountered. This is used as the final selected starting point for the retention window.
  • the ending point for the retention window is then set to the starting point + 1 and the same stepping procedure adopted to find the first positive value after a negative value.
  • the end point is then stepped back one point.
  • This alternative can ensure that both the starting and ending points of retention windows have near zero values with positive gradients.
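The stepping rule above can be sketched as follows (the function name is an assumption; the same routine serves for both start- and end-point selection, with the end point then stepped back one sample as described):

```python
def positive_going_start(x, start):
    """From a provisional start point, step forward until the signal value
    becomes negative, then step until the first positive value: a near-zero
    sample with a positive gradient, suitable as a retention-window edge."""
    i, n = start, len(x)
    while i < n and x[i] >= 0:   # step until the signal goes negative
        i += 1
    while i < n and x[i] < 0:    # then step until it turns positive again
        i += 1
    return i                     # first positive value after a negative run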
  • when the retention windows are concatenated, there is therefore minimal step discontinuity, and reasonable gradient matching (i.e. at least the gradients are of the same sign) may be obtained.
  • This can minimise concatenation artifacts.
  • This alternative may be most efficient when the data has zero mean and the low frequency components that carry the absolute value of the signal up and down at the slowest rates are relatively insignificant. Physically realisable acoustic signals typically have a mean close to zero over a sufficiently long time period.
  • a non-zero mean can be considered as a non-zero amplitude at zero frequency.
  • the input signal is high-pass filtered at 50 × m Hz in an example embodiment, so that after compression by a factor m the lowest frequency content will be at 50 Hz, the lower limit of human hearing response.
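The cutoff rule is simple arithmetic; the filter sketched below is a generic first-order high-pass standing in for whatever filter an implementation would use (the patent does not specify one here), so treat it as an assumption:

```python
import numpy as np

def highpass_cutoff(m, f_low=50.0):
    """Pre-filter cutoff so that after compression by m the lowest
    content sits at f_low (50 Hz, the lower edge of human hearing)."""
    return f_low * m

def one_pole_highpass(x, fs, fc):
    """First-order (one-pole) high-pass filter; illustrative only."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * np.pi * fc)
    alpha = rc / (rc + dt)
    y = np.empty_like(x, dtype=float)
    y[0] = x[0]
    for i in range(1, len(x)):
        y[i] = alpha * (y[i - 1] + x[i] - x[i - 1])
    return y
```

For m = 10 the cutoff is 500 Hz, and any DC or near-DC content (non-zero mean) is rejected before compression.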
  • a cosine bell smoothing retention window may still be applied if required, but the width need not be larger than a few percent of the total retained window size.
  • One feature of the adaptive algorithms described above is that the retention windows are not of regular size. This results in slight fluctuations in actual compression rates.
  • the buffer state may be periodically checked to ensure that the algorithm is neither running out of input data nor overflowing the buffer and minor adjustments to the targeted compression rate may be made to compensate.
  • the fluctuations in achieved compression rate on the other hand have the advantage of avoiding undesirable coherent beating of the retention window repeat period and highly-regular signals that might otherwise result in odd behaviours and perhaps even missing entire signal streams.
  • An equivalent inverse matching algorithm is provided to perform bandwidth expansion for infrasound.
  • the input stream 500 is divided up into segments 502 in the time domain.
  • a one second sequence yields fs samples or segments e.g. 502.
  • the segments e.g. 502 are each copied repeatedly and groups of the copied segments are concatenated into an intermediate signal 504. For example, each segment may be repeated m times.
  • the intermediate signal 504 is then upsampled by a factor m (i.e. equal to the repetition factor) to provide a higher- bandwidth audible signal 506 occupying the same time as the input signal 500.
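The expansion steps above can be sketched as follows; this is an illustrative reading (function name and parameters assumed) in which returning the higher rate m·fs plays the role of the upsampling, so the output occupies the same wall-clock time with frequencies multiplied by m:

```python
import numpy as np

def expand_time_domain(x, fs, m, seg_len):
    """Bandwidth expansion for infrasound: copy each segment of seg_len
    samples m times, concatenate, and replay at m * fs so the output
    occupies the same time as the input with frequencies scaled up by m."""
    nseg = len(x) // seg_len
    segs = x[:nseg * seg_len].reshape(nseg, seg_len)  # divide into segments
    y = np.repeat(segs, m, axis=0).reshape(-1)        # m copies per segment
    return y, fs * m                                  # replay rate
```

As with compression, smoothing windows and zero-crossing alignment would be applied at the segment joins in practice.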
  • a cosine-smoothing window and zero crossing and gradient matching can be applied as described for the compression algorithm embodiment above.
  • a suitable high-pass filter, e.g. with poles at 50/m Hz, may be applied at the input stream.
  • the time domain sample-compress-concatenate process with regular intervals may miss pulses.
  • an example embodiment searches for regions within the time series that contain an energy level above a threshold and defines the position of the retention window there. Regions with high energy content are therefore candidates for compression. This not only minimises the chances of missing short pulses, but also reduces the computational effort when the input signal is relatively 'quiet'.
  • An example application is detecting and bandwidth conversion of echolocation-like clicks or pulses. Choosing which samples contain valuable information in the form of pulses can be achieved by performing an envelope detection (via a Hilbert transform in digital space or by rectifying and low-pass filtering in the analogue domain) and thresholding, with an adaptive level for example.
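The analogue-style variant (rectify, low-pass, threshold) can be sketched directly; the moving-average window length and the median-based adaptive threshold below are illustrative assumptions, not the patent's values:

```python
import numpy as np

def pulse_regions(x, fs, win_s=250e-6, thresh_ratio=4.0):
    """Flag candidate retention regions by envelope detection: rectify,
    low-pass with a short moving average, and compare against an adaptive
    threshold (a multiple of the median envelope level)."""
    w = max(1, int(win_s * fs))
    env = np.convolve(np.abs(x), np.ones(w) / w, mode="same")  # envelope
    return env > thresh_ratio * np.median(env)  # True where energy is high
```

A short, strong click embedded in quiet background noise is flagged while the background is not, so retention windows can be placed on the pulses.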
  • dolphin echolocation clicks are usually no more than 200 μs long and, in the wild, generally repeat at no more than 200 Hz (associated with a minimum range of 3.75 m).
  • retention windows of length 250 μs may be appropriate without risk of losing important information.
  • a 10:1 bandwidth compression can then be achieved by discarding the following 2.25 ms of data without significant risk of missing the next pulse.
Parameters
  • The assumption of no prior knowledge of the inaudible signal when compressing the bandwidth in an example embodiment does not preclude tuning the compression parameters for the algorithm to work optimally and safely.
  • the compression parameters preferably produce audible signals with energy levels that are detectable by the human ear without damaging it.
  • Such parameters might include the compression factor (C), gain factor (G), the length of the retention section (tk), and the length of the smoothing window (tc).
  • the compression factor determines the original bandwidth that the user wants to map into human hearing range. As the compression factor determines how much high-resolution frequency structure is to be discarded, the compression factor may be kept to a minimum. Hence the compression in one embodiment maps the highest frequency of interest into the higher end of the user's audible frequency range.
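Using the formula given later for the compression factor (C = fhi/fmax), this choice is one line of arithmetic; the 20 kHz default for fmax is an assumption about the upper end of the listener's audible range:

```python
def compression_factor(f_hi, f_max=20000.0):
    """C = f_hi / f_max: map the highest frequency of interest onto the
    higher end of the user's audible frequency range, keeping C minimal."""
    return f_hi / f_max
```

For example, a 200 kHz band of interest maps into hearing range with C = 10.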
  • the length of the output audio block may be at least equal to the human hearing integration time, τ. This can ensure that the user can integrate the acoustic power over the period and gain enough energy to sense the sound even when the acoustic level is smaller than the detection threshold.
  • the retention window length, tk, may be kept as small as possible to avoid loss of original transient patterns.
  • the length of the smoothing window, tc, may be kept smaller than or equal to tk, to ensure that the main energy content of the compressed signal is from the retention window.
  • the gain setting is preferably large enough so that the energy content of the compressed signal produced from the weakest detectable (by hardware) ultrasound is at least the minimum energy detectable by the human ear over the integration time.
  • the input pulse width is smaller than the hearing integration time even after compression.
  • the smallest acoustic signal of interest is preferably amplified to the minimum human hearing threshold. Therefore the gain can be written as,
  • Pa is the minimum audible sound level for the human ear; Plo is the minimum audio sound level that the hardware can detect; E0 is the minimum audible energy in the hearing integration time; and Δt is the minimum detectable ultrasound pulse width.
  • the performance statistic used was a 2-dimensional zero-lag correlation of the spectrograms of the original and compressed signals, which is normalised by the energy content of both signals as shown below
  • RAB is the correlation coefficient.
  • A, B are the spectrogram matrices (absolute values) of the original and compressed signals respectively; m, n are the column and row sizes of the matrices.
  • Ā, B̄ are the means of the spectrogram matrices A and B respectively.
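The symbols listed above describe the standard normalised 2-D zero-lag correlation coefficient (the corr2 form); the sketch below is a hedged reconstruction of that statistic, not necessarily the patent's exact formula:

```python
import numpy as np

def spectrogram_correlation(A, B):
    """Normalised zero-lag 2-D correlation of two equal-size spectrogram
    magnitude matrices: subtract each mean, then divide the cross-product
    sum by the geometric mean of the two energy (sum-of-squares) terms."""
    a = A - A.mean()
    b = B - B.mean()
    return float((a * b).sum() / np.sqrt((a ** 2).sum() * (b ** 2).sum()))
```

Identical spectrograms score 1.0, and the statistic is invariant to overall gain and offset, which suits comparing original and compressed signals.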
  • the compression parameters were maintained at constant values of retention window length of 10 ms, smoothing window length of 0.5 ms, and compression factor of 10 times, except for the particular parameter under investigation (one at a time).
  • the gain is kept constant throughout the tests.
  • the tests provided an idea of how an embodiment of the system behaves at different values of the parameters but do not signify the absolute performance of the algorithm. This is because the human brain can be efficient at picking up audio patterns in highly noisy environments. For example, the human ear may still be able to audibly recognise the presence of the acoustic pattern in compressed signals with a normalised cross-correlation value of less than 0.2.
  • FIG. 6 illustrates the performance of embodiments of the system according to variations in the parameters.
  • the dotted lines 600, 602, 604 are the respective linear fits to the curves. It is observed that reducing both the compression factor 602 and the length of the retention window 600 yields better performance than increasing them. This may be due to larger values of both parameters resulting in larger data sections being discarded each time, thus introducing more mismatch between the original and compressed signals. It is noted that the fit to the smoothing window length 604 tests ends at around 1 ms. This may be because the algorithm requires a minimum number of samples in the retention window to perform the compression and apply the smoothing windows. On the other hand, the larger the length of the smoothing window 604, the more overlap of the time series is included and the fewer discontinuities exist in the compressed signal. This reduces the overall mismatch between the original and compressed signals. Therefore it may be desirable to keep the retention window 600 and compression factor 602 as small as possible but use a larger smoothing window 604.
  • One application of embodiments of the present invention is to provide support for bio-acoustic research (such as birds, mammals, insects etc.) by potentially reducing the analysis time and opening up new research possibilities.
  • non-audio band bio-acoustic studies involve recording the sound, and processing the sound 'off-line' in order to analyse its detailed signature.
  • Embodiments of the present invention may enable the researcher to detect and 'classify' ultrasound in-situ and in real-time by 'hearing'. This may open up possibilities for the researcher to correlate the behaviours of the animals with their ultrasound acoustic behaviours in real-time, and perhaps interact with them.
  • Vocal communications of human beings are based not only on the linguistic information but also on the emotion of the sentence. These 'emotional' hints also exist in animal communication, whether as a gesture or a vocal (or rather acoustic) presentation.
  • Embodiments of the present invention may allow the user to pick up and experience such cues when studying the ultrasound bio-acoustics of animals.
  • embodiments of the present invention may allow ultrasound within an underwater area to be detected without introducing significant acoustic disturbance.
  • Embodiments of the present invention may also be useful in diagnosing faults and characteristics of, for example, high-voltage power lines, structural defects, ice cracking, electronic device noise sources, imminent failure of electronic circuit components, detection of marine mammal echolocation and infrasound produced by animals.
  • infrasound applications include air conditioning in buildings that may cause ill health, baleen whale and large terrestrial mammal (such as elephant) communication, seismic events on land and in the sea and distant traffic detection.

Abstract

A method and system for real-time audible monitoring of acoustic signals. The method comprises dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and sampling the intermediate signal to derive an audible output signal such that portions of the audible output signal sampled from respective sections of the intermediate signal have the selected duration.

Description

A METHOD AND SYSTEM FOR MONITORING OF ACOUSTIC SIGNALS
FIELD OF INVENTION
The invention relates broadly to a method of audible monitoring of acoustic signals outside human hearing range and to a real-time system for audible monitoring of acoustic signals outside human hearing range.
BACKGROUND OF THE INVENTION
There has been a demand to create devices that translate "ultrasound" (i.e. sounds at frequencies higher than can be heard by the human ear) or "infrasound" (i.e. sounds at frequencies lower than can be heard by the human ear) to within the range of human hearing for many years.
The prior art includes ultrasound detectors and bandwidth shifters, which detect the presence of ultrasound, but do not provide a very representative sense of what the original 'sounds like'. Either there is no bandwidth compression at all (as for heterodyning) or the output is significantly distorted (as for frequency dividers). The first successful ultrasound detectors were of the superheterodyne type. In this approach, a narrow band of frequencies (e.g., 10kHz wide) from the microphone signal is amplified, converted to a higher frequency for additional amplification, converted again to an audible frequency, and finally used to drive a loudspeaker or headphones. This does not compress the bandwidth, it simply translates a small part of the ultrasound band into the human hearing band.
The majority of commercial detectors use superheterodyne technology. Superheterodyne design offers a quick, sensitive, and reliable way to measure frequencies.
A major drawback to superheterodyne detectors is that they miss signals occurring outside the frequency band to which they happen to be tuned. As a result, several ways have been devised to produce a broadband response, thereby allowing the user to listen "at all frequencies at once". An early envelope-detector design produced a signal that outlined the shape of each ultrasound pulse by following the peaks of their individual cycles. A drawback in this case is that while the intensity and repetition rate of the input are captured, frequency information is lost. Longer sounds, like those produced by bats, do not produce clear output, and a continuous input signal is not registered at all. This type of detector is no longer widely used, and because it currently offers little economic advantage over superheterodyne circuitry, is found only in the very cheapest instruments.
These designs gave way to the frequency divider. In this approach, incoming waveform cycles are counted as the signal voltage swings across the zero, or baseline level. A new waveform is then generated at some ratio of the original frequency, and is given the amplitude characteristics of the input signal. The new waveform emerges as a lower-frequency, audible version of the original sound that retains both its frequency and amplitude patterns. This approach has two significant drawbacks. First, the new waveform tends to track the frequency of the most intense energy component of the input signal and the instrument's performance may therefore be disturbed by the presence of multiple harmonics. Second, some pulses are of such short duration that they cannot be effectively transformed to a lower frequency. In a different approach, called automatic frequency scanning, the detector "searches" for signals through continuous frequency-response changes. This automatic tuning process sweeps repeatedly across the entire frequency-band of interest, stopping only when sounds are detected. This approach saves the investigator the tedium of persistently turning the detector's tuning knob by hand, but does not overcome the inherent problem of missing signals that might be occurring simultaneously at different frequencies.
Time-expansion methods are limited by the fact that the total information that can be expressed in a signal is inherently limited by the product of the time and frequency range involved (the time-bandwidth product). Since ultrasound detectors work in real-time with a narrow spectral bandwidth, this product is reduced compared with the original signal, so information potential must be discarded. In choosing a particular real-time detection method, the observer must choose what information is required and what can be dispensed with.
A radically different approach to this general problem, however, is to extend the time-scale of the recorded signal in order to maintain the time-bandwidth product and thereby represent the entire signal in full detail. Since about 1960, instrumentation recorders have been used in this manner. This technique is well established and can give excellent results if used by informed operators. However, because of the delay that necessarily occurs between recording and playback, it is only suited to signal analysis and not to signal detection. Additionally, the playback takes much longer than the length of the original recording, so that this cannot be applied in a real-time application.
A third major area of development in frequency and bandwidth compression is in the correction of speech from divers at depth breathing Helium gas mixtures. Helium mixtures have physiological advantages for deep divers, but under pressure Helium significantly increases the resonant frequencies of a diver's vocal tract, so that the diver's speech is very high pitched and is difficult for untrained listeners to understand. Many Helium decoders have been designed for commercial production. The requirement for real-time compression is not to miss any important speech, so time-expansion is inappropriate. Modern digital approaches use algorithms that are very effective, but that rely on the structure of speech and exploit limitations in the time-frequency content. An example of such an approach is the use of high order linear prediction to deconvolve and scale the vocal components.
SUMMARY OF INVENTION
According to one aspect of the present invention there is provided a method of audible monitoring of acoustic signals outside human hearing range, the method comprising the steps of dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section being representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
Each section in said intermediate signal may be derived using frequency domain compression of said segment. Each said frequency domain compression may comprise obtaining a series of complex frequency estimates in the frequency domain for each segment.
The method may further comprise replacing the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates. Each single estimate may be derived such that information corresponding to the amplitude and phase is preserved. Each single estimate may comprise a centroid for the corresponding m sequential complex frequency estimates.
Deriving each single estimate may comprise weighting the m sequential estimates according to the proximity to said each single estimate. Each section in said intermediate signal may be derived using time domain compression of said segment.
Said time domain compression may comprise forming each section by truncating the corresponding segment in the time domain and concatenating the sections.
Said segment may be truncated according to a compression factor (C): C = fhi/fmax, where fhi is the upper end of an original bandwidth in the acoustic signal, and fmax is the maximum audible frequency.
The method may further comprise down sampling the intermediate signal based on the compression factor to derive the audible output signal.
Each section in said intermediate signal may be derived using time domain expansion of said segment.
Each section in said intermediate signal may comprise a repetition of said segment according to a repetition factor and the sections are concatenated. The intermediate signal may be up-sampled based on the repetition factor to derive the audible output signal.
Said concatenation of the sections may comprise applying a smoothing function to each section.
The smoothing function may comprise a cosine bell window. Dividing the acoustic signal may comprise searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion.
The method may comprise a method of compressing and/or expanding signal bandwidth for real-time audible monitoring of acoustic signals outside human hearing range.
According to a second aspect of the present invention there is provided a system for real-time audible monitoring of acoustic signals outside human hearing range, the system comprising: an analog to digital converter or sampler for sampling the acoustic signal; a processor for dividing said sampled acoustic signal into segments of a selected duration and for deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and a digital to analog converter for deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
Each section in said intermediate signal may be derived by the processor using frequency domain compression of said segment. Said frequency domain compression may comprise obtaining a series of complex frequency estimates in the frequency domain for each segment.
The processor may replace the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates. Each single estimate may be derived by the processor such that information corresponding to the amplitude and phase is preserved.
Each single estimate may comprise a centroid for the corresponding m sequential complex frequency estimates.
Deriving each single estimate by the processor may comprise weighting the m sequential estimates according to the proximity to said each single estimate.
Each section in said intermediate signal may be derived by the processor using time domain compression of said segment.
Said time domain compression may comprise forming each section by truncating the corresponding segment in the time domain and concatenating the sections. Said segment may be truncated according to a compression factor (C):
C = fhi/fmax, where fhi is the upper end of an original bandwidth in the acoustic signal, and fmax is the maximum audible frequency. The sampler may down-sample the intermediate signal based on the compression factor to derive the audible output signal.
Each section in said intermediate signal may be derived by the processor using time domain expansion of said segment.
Each section in said intermediate signal may comprise a repetition of said segment according to a repetition factor and the sections are concatenated. The sampler may up-sample the intermediate signal based on the repetition factor to derive the audible output signal.
Said concatenation of the sections may comprise applying a smoothing function to each section by the processor. The smoothing function may comprise a cosine bell window.
Dividing the acoustic signal may comprise searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion. The device may be implemented as a portable device. The portable device may be incorporated into a diving harness.
According to a third aspect of the present invention there is provided a data storage medium having stored thereon code means for instructing a computer to execute a method of audible monitoring of acoustic signals outside human hearing range, the method comprising the steps of dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
BRIEF DESCRIPTION OF THE DRAWINGS
One preferred form of the present invention will now be described with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of the hardware modules according to an example embodiment of the present invention;
Figure 2 is a block diagram of the processing means according to an example embodiment of the present invention;
Figure 3 is a front view of the hardware, installed and ready for use according to an example embodiment of the present invention;
Figure 4 is a schematic diagram of the bandwidth compression algorithm according to an example embodiment;
Figure 5 is a schematic diagram of the bandwidth expansion algorithm according to an example embodiment;
Figure 6 is a graph of input-output cross-correlation for variations in parameter values according to an example embodiment;
Figure 7 is a flow chart of a method for real-time audible monitoring of acoustic signals according to an example embodiment;
Figure 8 is a process diagram according to an example embodiment of the present invention;
Figure 9 is a flow diagram according to an example embodiment of the present invention;
Figure 10 is a flow diagram according to an example embodiment of the present invention; and
Figures 11 to 21 are circuit diagrams of an implementation according to an example embodiment of the present invention.
DETAILED DESCRIPTION OF THE EMBODIMENTS
Ultrasound monitoring (UM) is a way to detect electrical discharge and high-pitched structural noise produced by a system, and can be useful for monitoring one or more conditions of a system. For example, UM can be used to detect arcing in electrical circuits and in electrical cabling, and leaks in boilers, and pressure and vacuum vessels. Being able to identify 'signatures' from these ultrasounds may permit correlation of the signature to a system fault (or behaviour), rather than just detecting the fault, which could greatly increase productivity. Similarly it may also be useful in real-time to monitor other inaudible sounds, for example infrasound.
One embodiment of the present invention includes a real-time processing engine that may translate at least an inaudible signal to an audible signal while preserving the 'signature' of the original. For example, a diver could swim along with a dolphin and interact with the dolphin based on the ultrasonic 'communication'. An example embodiment of the present invention is now described with reference to Figure 1. An input device or transducer (generally depicted as 100) provides an input signal 102 representative of an acoustic signal. The input signal is provided as input to a processing unit 104. The processing unit 104 processes the input signal 102 and provides an output signal 106 including an audible representation of the detected input acoustic signal. The output signal 106 is provided to an output device or transducer (generally depicted as 108). The output device 108 generates an audible output acoustic signal representative of the detected input acoustic signal. Referring to Figure 1, input device 100 may comprise a hydrophone 110, a broadband microphone 112, a vibration sensor 114 or a microphone nozzle 116.
Generally the choice will depend on the application. In one embodiment input device 100 has a broadband response to at least infrasonic, audible and ultrasonic acoustics. Alternatively input device 100 may be limited to a specific range within infrasonic, audible and ultrasound. For example in marine or underwater conditions a hydrophone 110 may be applicable. An example hydrophone may have a sensitivity of -186±3dB re 1V/μPa and a directivity of 360°±2dB in the radial axis and 270°±2dB along the pointing direction. The input response might for example be 10–480 kHz for such a transducer.
Processing unit 104 may for example comprise an electronic circuit, a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP) or a computer. Generally the choice will depend on the application. In a miniature portable application, for example, a DSP might be chosen that is capable of processing 4800 MIPS.
Figure 2 shows one embodiment of the processing unit 104 in block diagram form. A connector 200 provides connections to external components, for example an input device 100 and an output device 108. The signals from the connector 200 are then provided internally of a casing 204 to the components within. The input signal 102 is provided first to an input signal-conditioning module 206 for conditioning. The conditioned analogue signal 208 is then provided to an Analogue to Digital Converter (ADC) module 210 for conversion to the digital domain. For example the ADC 210 could provide 14-bit input sampling with one input channel. The ADC 210 generates a digital input signal 212 which is provided (via a series of interrupts and Enhanced Direct Memory Access (EDMA) transfers) to a DSP 214 for processing to generate a digital output 216.
The DSP 214 may include storage, for example Random Access Memory (RAM) and/or Read Only Memory (ROM). The storage could be used to store and execute software, for example, any of the algorithms described herein implemented as software. Alternatively such software could be provided on an external data storage medium and interfaced with the DSP 214.
The digital output 216 is provided to a Digital to Analogue Converter (DAC) module 218 for conversion to the analogue domain. The DAC 218 generates an analogue output signal 220 which is provided to an output-conditioning module 222 for conditioning. The conditioned analogue output signal 106 is then conveyed externally of the casing 204 through the connector 200 to the output device 108, which then generates the audible sound, visible indication or vibration representing the original acoustic signal.
The various circuits within the casing 204 require a power source. In a portable embodiment, a battery 226 or other energy storage could be provided. In a static embodiment an AC supply could be converted to DC voltage, by way of example only.
In one embodiment the battery 226 is connected via a DC charging bus 228 to a battery-charging module 230. In turn the battery-charging module 230 may receive an AC or DC supply 232 from charging inlet 202 via connector 200. The battery 226 may be selected depending on the application, for example a Lithium-ion 7.2V battery pack. The output supply from the battery 226 is supplied via a DC output bus 234 to a low drop-out regulator (LDO) 236, which in turn supplies digital circuits via a digital low voltage DC bus 238 and analogue circuits via an analogue low voltage DC bus 240. For example the battery might range in voltage from 6-12V and have an expected life of 19 hours (based on 2W consumption in detection mode and 50mW in standby mode).
In one embodiment, DSP module 214 is programmed to provide an appropriate representation of the input depending on the application. For example a method is described later for a compression algorithm which provides an audible representation of a broad band of inaudible frequencies.
Referring to Figure 1, output device 108 may comprise a bone conduction piezo-electric transducer 118, an earphone/headphone 120, a speaker 122 or a visible indicator (not shown). Generally the choice will depend on the application. One skilled in the art will appreciate that certain forms of transducer will be appropriate in a given application. In one embodiment output device 108 has a response of between 1 and 20 kHz for a 44ksps processed signal. In a marine or underwater application, for example, a bone conduction piezo-electric transducer 118 may be applicable.
Underwater application
Referring to Figure 3, an example apparatus is shown for implementing an embodiment of the present invention in a marine or underwater application. In such a case it may be useful to detect acoustic signals from the high audio band (10 kHz) up to an ultrasound band limited to approximately 200 kHz. This covers the frequency ranges of most marine animal generated ultrasound signals. Hydrophone system 300 (including pre-amplifier and separate battery) is attached to a diver's BCD 304 or similar. The system 300 can be configurable to allow a pair of hydrophones 302 to be held in a rigid relationship for directional mode operation.
In the case of an underwater environment, the diver can hold a pair of hydrophones (not shown) in a rigid relationship to each other and determine the rough direction of the signal source. The two hydrophones might then be mounted some 200mm apart on a rigid rod, for example. When the rod is held horizontally and across the diver's body, a processing unit 304 forming part of the system 300 may be switched to directional mode. The processing unit 304 may be placed in the pocket of the BCD 306, on a custom harness or similar, and may have an LED signal strength display for directional information. There may be facilities to provide for atmospheric rather than underwater usage by replacing the hydrophones and the underwater earphone with a high-bandwidth microphone and a conventional headphone. Underwater earphone 308 can be mounted on the diver's mask strap next to the bony region immediately in front of the ear to provide bone conduction hearing. It may be desirable in non-intrusive monitoring (such as monitoring the behavioural ultrasound communications between marine mammals with minimum disturbance) to minimise reflection disturbance to the animal. Practically, such a system should be sensitive enough to detect animals before the animals detect the human who is operating the equipment. The received signal power (PR) is given by
PR = PT + SPG + TS - (SLO + SLI)
where PT is the transmitted signal power, SPG is the signal processing gain (for example the physiology of the animal might provide some detection advantage for reflections, if it exists), TS is the Target Strength, and SLO and SLI are the Spreading Losses and absorptions associated with the outward and inward signal paths respectively. The received power at the hydrophone will be
PR = PT - SLO
so that the hydrophone receives a signal with a power greater than the back-scattered signal as received by the mammal by a factor
PR,excess = SLI - (SPG + TS)
which gives the hydrophone a detection advantage over the marine mammal providing
SLI > (SPG + TS)
The Target Strength of a diver might be around 0 dB with proper wet suit materials. Assuming the mammal is able to reconstruct the received signal in a near-perfect way, the Spreading Loss would behave approximately as 20 Log10(range) for the first 40 m or so, and then 10 Log10(range) from 40 m onwards.
Therefore, a diver equipped with an embodiment of the present invention may have the advantage of detecting the bio-acoustic signal of the target animal before the animal detects the diver. The diver thus ceases to be a possible acoustic disturbance as long as he/she stays sufficiently far away.
Using the receive power equation above, and a spreading law of 20 Log10(range) for the first 40 m or so and then 15 Log10(range) from 40 m onwards (assuming no ability to coherently process multiple arrivals), absorption at 100 kHz is approximately 40 dB at 1 km. At this range a power level of approximately 90 dB re 1 μPa is likely. Even allowing listening over a wide bandwidth of 10-200 kHz, a total ambient noise power of more than 70 dB is unlikely over this bandwidth, giving a workable detection signal to noise ratio of 20 dB. This allows the diver to monitor the ultrasound without any risk of approaching so close as to acoustically disturb the animal.
Figures 11 to 21 show circuit diagrams of an implementation of an example embodiment of the invention.
Figure 11 shows the CPLD connections, external switches and indicators of the example embodiment. Figure 12 shows the peripheral expansion connectors of the example embodiment.
Figure 13 shows the Power and ground pin connections of DSP core of the example embodiment.
Figure 14 shows the SDRAM connections to the DSP core of the example embodiment. Figure 15 shows the Boot Flash connections to the DSP core of the example embodiment.
Figure 16 shows the JTAG emulation port connections of the example embodiment.
Figure 17 shows the McBSP connections from DSP for ADC board of the example embodiment.
Figure 18 shows the buffering of crystal clocks for CPLD of the example embodiment.
Figure 19 shows the Connections of DSP to ADC and the codec used as DAC of the example embodiment. Figure 20 shows the Clock connections to the DSP core of the example embodiment.
Figure 21 shows the on board DC/DC converter of the example embodiment.
In one embodiment, a TMS320C6416 processor running at 600 MHz is utilised. The processor has a Flash memory connected to EMIFB CE2 as a boot Flash, a THS14F01 ADC configured at 500ksps connected to EMIFA, and an AIC23 codec connected to McBSP as output. A built-in power supply module steps down and regulates the Li-Ion supply voltage (7.2V) to ±5V, 5V and 3.3V. An underwater headphone system is driven by the AIC23 codec output, while the input signal is obtained from an ITC1042 hydrophone that is signal-conditioned by a high-impedance voltage follower in the analogue board. The THS14F01 ADC digitises the data from the analogue board output.
Processing Algorithms
In Figure 2, DSP module 214 provides the processing from a wide-band ultrasonic and/or infrasonic signal to an audible signal which retains the essential characteristics or signature of the original signal in real-time. In an example embodiment DSP 214 is programmed according to a method described below. A compression algorithm is used for ultrasound and an expansion algorithm is used for infrasound.
Compression/expansion may be performed either in the frequency domain or the time domain. Such programs or software including an algorithm can either be stored on internal storage of the DSP 214 or, as noted above, provided on an external data storage medium.
Referring to Figure 7, an example embodiment is shown in a flow chart for a method of audible monitoring of acoustic signals outside human hearing range. At step 700 the acoustic signal is divided into segments of a selected duration. At step 702, an intermediate signal comprising a concatenation of sections is derived, each section being representative of one corresponding segment and having a different duration than the corresponding segment. At step 704, an audible output signal is derived from the intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
In a further example embodiment shown in Figure 8 an example algorithm is shown implemented as software in a DSP. The processes run simultaneously and their priority is shown in descending order from top to bottom. Process step 800 includes two hardware interrupt (HWI) driven Enhanced Direct Memory Access (EDMA) channels that automatically transfer data from an analogue to digital converter (ADC) to the pair of input ping-pong buffers and send the data within the output ping-pong buffers to a digital to analogue converter (DAC) respectively. The channels are set up to be triggered by the availability of data from the ADC (FIFO half full) and by depletion of the codec's output data register. The interrupt sub-routine performs some housekeeping and data repackaging at the same time. Process step 802 includes a software interrupt (SWI) that will initiate a process to consolidate the data into the i/p buffer. Process step 804 includes a software interrupt (SWI) that will initiate a process to consolidate the data from the o/p buffer. Process step 806 relates to the main compression algorithm, which repeatedly polls the data in the i/p buffer and places the result in the o/p buffer after compression. Process step 808 runs on power up, performing initialisations and enabling the interrupts.
Referring to Figure 9, an example embodiment of the main compression algorithm is shown. In step 900 the sampled data is first transferred to a pair of ping-pong buffers, and the data is unpacked to 16-bit, separated channel data and placed in two input buffers (i/p buffer left channel and right channel) waiting to be processed. In step 902 the availability of the data in the i/p buffer is checked. In step 904, if there is enough data, the input signal is compressed, and in step 908 the result is placed into the o/p buffer (one each for the left and right channels). During the compression two intermediate buffers are used, one for the windowing process (in step 904) and the other for interpolation and low-pass filtering (in step 906). In step 910 the data in the o/p buffers is then repackaged, transferred into separate pairs of ping-pong buffers, and sent to the DAC for playback.
Figure 10 shows an example embodiment of the time domain compression process in real-time. In step 1000 a check is done for any data to be discarded (e.g. a time domain algorithm that retains a window of data, discards a proportion of subsequent data, and then reduces the sampling rate). In step 1001 as many samples as possible are discarded if there is something to discard. In step 1002 compression only starts if the o/p buffer will not be overflowed by the processed data. In step 1004 compression starts whenever the i/p buffer contains enough samples. The rest of the steps are skipped if there is either not enough data left for processing or if processing the data would overflow the o/p buffer. In step 1006 the data section with high energy is located (because most likely there will be signal above ambient noise there) and the initial retained window is placed starting there. In step 1008 the end of the retained window is adjusted to a near-zero value with a gradient matching the start gradient of the previous section. In step 1010 a smoothing window is applied to the retained window. In step 1012 the signal is transferred to the o/p buffer, the number of samples to be discarded is calculated and the discard counter is updated. The steps are repeated and the respective samples to be discarded are then removed at the beginning (step 1000). Any higher priority process (such as the HWI and SWI that transport data between buffers and input/output peripherals) can pre-empt the compression process to ensure no data is missed and to ensure continuous playback. The compression process is continued after any higher priority processes are completed.
Example Bandwidth Compression Algorithm
The constraining assumption in an example embodiment (unknown information distribution in time and frequency spaces) means that each time sample is potentially as important as any other. Similarly, in the frequency domain, each frequency sample is as important as any other. Since one objective is to reduce the frequency bandwidth, not the temporal duration, one alternative is to operate the method in the frequency domain. Movement between the frequency and temporal domains by means of the forward and inverse Fourier Transforms is possible, since these are orthogonal. Use of the Discrete Windowed Fourier Transform (DWFT) and its inverse is appropriate in an example embodiment, since the input series is finite. The example embodiment incorporates restrictions imposed by Shannon's sampling theorem and the Nyquist frequency limit. The inaudible time series is divided into segments, then each segment is compressed and the segments are concatenated to form a new but representative time series in the audio band.
Frequency Domain
In one example embodiment, the compression is performed in the frequency domain. Initially the signal is segmented, each segment of the time series is transformed into the frequency domain, the frequency band is compressed, and the result is transformed back to the time domain. This allows play back at the new (reduced) sampling rate. As the signal may be divided into short discrete sections, Discrete Windowed Fourier Transforms (DWFT) and inverse-DWFT can be used in the example embodiment. In order to compress the frequency band, complex frequency estimates are grouped and each group is replaced with a value that is representative of both the amplitude and phase information of the particular group. The number of frequency estimates in each group may be determined by the desired frequency compression ratio. Since nothing is assumed about the signal in one embodiment, all the frequency estimates are treated with equal priority. Therefore, the frequency estimates are grouped evenly across the frequency band. The result is that the high-resolution frequency structures are removed and the gross signature of the signal is retained. This is achieved in the example embodiment by convolving the frequency estimates with a rectangular window, followed by reducing the sampling frequency accordingly. The convolution process effectively low pass filters the frequency estimates, where the length of the window is related to the compression ratio desired. An example series might include a zero-mean, detrended sequence of 2N data points in time, regularly sampled at an input sampling frequency of fi, with zero energy at frequencies above fi/2 Hz. The input sampled signal has a duration of 2N/fi seconds. In the frequency domain, after applying a DWFT, N complex frequency estimates are obtained. One objective is to reduce bandwidth, so correspondingly the number of frequency estimates may be reduced or they may be placed closer together in the frequency domain.
Placing them closer together increases the frequency resolution, which is equivalent to extending the time in the temporal domain (time dilation), which is less desirable (i.e. not real-time monitoring). Therefore the N estimates are replaced with n, where n<N and the desired compression factor is N/n, in an example embodiment.
In this example embodiment the assumption is that no particular frequency estimate is more important than any other. The result is that frequency estimates are culled evenly across the bandwidth. Every sequential set of m of the N frequency estimates is replaced with a single one that best represents the m originals in some sense consistent with this assumption. In one embodiment, the centroid of each set of m complex numbers in the complex plane is used, which is mathematically equivalent to the arithmetic mean of these numbers. The centroid preserves both amplitude and phase information. This is mathematically equivalent to convolving the original N frequency estimates with a 'top-hat' square window filter of width m and then regularly sub-sampling by factor m. The desired bandwidth compression can then be obtained by dividing the frequency of each retained estimate by m (equivalent to dividing the sampling frequency by m in the time domain) and applying the inverse DWFT to obtain a time series that can be played at an fi/m sampling rate in the same time as the original series occupied.
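The grouping-by-centroid operation described above can be sketched as follows. This is a minimal illustration using NumPy; the function name is our own, and a plain FFT stands in for the DWFT of a single segment:

```python
import numpy as np

def compress_frequency_domain(x, m):
    """Bandwidth-compress one zero-mean segment by factor m.

    Each consecutive group of m complex frequency estimates is replaced
    by its centroid (arithmetic mean), preserving both amplitude and
    phase in an average sense.  Playing the inverse transform at fs/m
    restores the original segment duration.
    """
    n = len(x) - (len(x) % (2 * m))        # trim so estimates group evenly
    spectrum = np.fft.rfft(x[:n])          # one-sided complex estimates
    usable = (len(spectrum) // m) * m
    # Centroid of each set of m estimates = arithmetic mean in the plane.
    grouped = spectrum[:usable].reshape(-1, m).mean(axis=1)
    return np.fft.irfft(grouped)           # ~n/m samples, played at fs/m
```

For example, a 100 kHz tone sampled at 500 kHz and compressed with m = 10 emerges as a tone near 10 kHz when the output is played at 50 kHz, which is the bandwidth-compression behaviour described above.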
This example embodiment is effective at low-pass filtering in frequency space, so that highly-structured detail in frequency modulations is removed while the overall form of the frequency structure of the signal is retained. The time-bandwidth product is reduced by a factor m, resulting in the discarding of information. The approach of regularly (in frequency space) replacing each consecutive set of m estimates with a single estimate in this example embodiment satisfies the underlying assumptions about the information distribution. It will be appreciated that the centroid is not the only possible choice for determining the single estimate. Other methods of generating a single complex number to represent the set of m originals, which preserve in some sense the amplitude and phase, may be used in different embodiments. The example embodiment involves the convolution of the frequency estimates with a square 'top-hat' window. From Fourier theory, convolving with a square window in one space is equivalent to multiplying by a sinc function in the transformed space, i.e. in this case the time domain. This process may therefore introduce undesirable spurious oscillations into the time-based data. One option for avoiding such spurious oscillations in an example embodiment is to adapt the simple centroid estimator in the frequency domain to provide more nearly 'square' selections in the time domain. For example, this might be achieved by weighting the m complex estimates according to how near they are to the frequency position of the single replacement estimate. The weighting might take on the appearance of a Gaussian or Hanning window, for example, and should lead to more optimal characteristics in the time domain.
Time Domain
The time domain algorithm may have the advantage of reduced computational load. As a result, it reduces the signal processing power requirement, making it more suitable for a real-time, battery-operated system. It has been recognised by the inventors that the theoretically optimal solution of a low pass filter in the frequency domain corresponds to multiplication by a square 'top-hat' window in the time domain. The acoustic signal is initially divided into a series of segments, as for the frequency domain compression embodiments described above. In one embodiment, a section in each segment is retained in the time domain and the remaining portion of the segment is truncated. This is repeated throughout the entire time series. The ratio between the retained section and the discarded portion is related to the compression ratio.
By retaining a section of each segment and truncating in between the retained sections it is possible to achieve a desired compression in the example embodiment, and by reducing the sampling frequency the intermediate signal may be audibly replayed in the same time as the original signal occupied, thereby operating in real-time.
Referring to Figure 4, an example series 400 is shown in the time domain, sampled at frequency fs over a period of one second, yielding fs samples divided into segments e.g. 401. For each segment e.g. 401, a section of data (t) is retained and the following section (T) is discarded according to
T=t(C-1)
where C is the desired compression factor.
The retained windows (t) are concatenated, yielding fs/C samples e.g. 402 lasting 1/C seconds at the fs sampling rate. The sampling frequency is then decreased to fs_new = fs/C to yield one second of audible output data 404 at compression factor C. One advantage of recognising the time-domain equivalent algorithm is that the forward and inverse DWFT are not required, reducing computational effort.
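The retain-and-discard scheme of Figure 4 might be sketched as follows. This is an illustrative simplification assuming integer window lengths and compression factors; the function name is ours:

```python
import numpy as np

def compress_time_domain(x, fs, t_keep, C):
    """Divide x into segments of t_keep*C samples, retain the first
    t_keep samples of each (discarding the following T = t_keep*(C-1)
    samples), and concatenate.  Played back at fs/C, the result
    occupies the same time as the original, i.e. real-time operation.
    """
    seg = t_keep * C
    n_seg = len(x) // seg
    kept = x[:n_seg * seg].reshape(n_seg, seg)[:, :t_keep]
    return kept.reshape(-1), fs // C
```

One second of input at fs samples thus becomes fs/C samples which, replayed at the reduced rate fs/C, again last exactly one second.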
In a real-time system embodiment, the operation of compressing one block of data is implemented on multiple sequential blocks of data and the blocks are concatenated. The length of the retention window (t) preferably is at least as long as required to capture the lowest frequency of interest after bandwidth compression, and not so long as to risk undersampling the temporal modulation structure of the signal. The retention window length is therefore in the region of a few milliseconds to tens of milliseconds in an example embodiment. It may also be selected interactively by the user. If left with coarse transitions, the concatenation process may produce spurious noise due to discontinuity between the edges of subsequent sections that are joined together. This defect or artifact may be corrected. For example, one method in an example embodiment is to apply a smoothing window to each segment and introduce overlaps when concatenating them. Another example method is to allow some flexibility when defining the retention sections in the time series to minimise the level difference. Such methods can significantly reduce spurious noise.
To suppress artifacts, a smoothing window can be applied that tapers the data to zero at each end of the retention window. The selected smoothing windows can be made to overlap or 'feather' the exiting and entry points together. This is analogous to the windowing in Welch's spectral estimation procedure for the DWFT. In one embodiment, in order to preserve the power of the signal (so as not to generate artificially quiet or loud points at window crossover) a cosine bell smoothing window is used, given by

wk = (1 + cos[kπ/t]) / 2

where wk are the window weights and k varies between 1 and t, the length of the retention window. This smoothing window has the property that

wk + wt-k = (1 + cos[kπ/t])/2 + (1 + cos[(t-k)π/t])/2
= (2 + cos[kπ/t] + cos[π]cos[kπ/t] + sin[π]sin[kπ/t])/2
= 1
so that if applied to the signal power it will not distort the incoherently summed power. If the signal were a single coherent frequency (such as a sine wave), the smoothing window may be applied to the signal amplitude rather than the power. If incoherent frequency content is assumed, as in an example embodiment, the smoothing window is applied to the power, i.e. its square root is applied to the amplitude. The length of this smoothing window should be shorter than t/2, i.e. half the length of the sub-selected data block. As an example k might vary between 10-25% of t. The choice of crossover window size may be operator-controlled in a real-time system embodiment. In another alternative, when the algorithm jumps forward, discarding T points, the start point of the retention window is provisionally selected. The start point is then stepped through the data until the signal value becomes negative. The start point is then stepped again through the data until the first positive value is encountered. This is used as the final selected starting point for the retention window.
The ending point for the retention window is then set to the starting point + 1 and the same stepping procedure adopted to find the first positive value after a negative value. The end point is then stepped back one point. This alternative can ensure that both the starting and ending points of retention windows have near-zero values with positive gradients. When the retention windows are concatenated, there is therefore minimal step discontinuity, and reasonable gradient matching, i.e. at least gradients of the same sign, may be obtained. This can minimise concatenation artifacts. This alternative may be most efficient when the data has zero mean and the low frequency components that carry the absolute value of the signal up and down at the slowest rates are relatively insignificant. Physically realisable acoustic signals typically have close to zero mean over a sufficiently long time period. A non-zero mean can be considered as a non-zero amplitude at zero frequency. To remove low frequencies that might disturb the zero-crossing stepping process, the input signal is high-pass filtered at 50 × m Hz in an example embodiment, so that after compression by a factor m the lowest frequency content will be at 50 Hz, the lower limit of human hearing response.
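The zero-crossing boundary selection described above might be sketched as follows. This is an illustrative interpretation: all names are ours, and we step the end-point search from the nominal retention length after the start, which we take to be the intent of the procedure:

```python
import numpy as np

def next_rising_zero_crossing(x, start):
    """Step forward from `start` until the signal value becomes
    negative, then return the index of the first positive value after
    that, i.e. a near-zero sample with positive gradient."""
    i = start
    while i < len(x) and x[i] >= 0:   # step until the value turns negative
        i += 1
    while i < len(x) and x[i] <= 0:   # then until it turns positive again
        i += 1
    return i                          # may be len(x) if none is found

def select_retention_window(x, provisional_start, t_keep):
    """Adjust a provisional retention window so that both its start and
    its end sit near rising zero crossings, minimising step
    discontinuities when successive windows are concatenated."""
    start = next_rising_zero_crossing(x, provisional_start)
    end = next_rising_zero_crossing(x, start + t_keep)
    return start, end - 1             # step the end back one point
```

Concatenated windows chosen this way start and finish near zero with positive gradients, so the joins carry minimal step discontinuity.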
This filtering can ensure that compressed windows of 20 ms or longer have both positive and negative values and a zero crossing point with positive gradient. A cosine bell smoothing window may still be applied if required, but its width need not be larger than a few percent of the total retained window size. One feature of the adaptive algorithms described above is that the retention windows are not of regular size. This results in slight fluctuations in the actual compression rate. The buffer state may be periodically checked to ensure that the algorithm is neither running out of input data nor overflowing the buffer, and minor adjustments to the targeted compression rate may be made to compensate. The fluctuations in achieved compression rate, on the other hand, have the advantage of avoiding undesirable coherent beating between the retention window repeat period and highly-regular signals, which might otherwise result in odd behaviours and perhaps even missing entire signal streams.
Bandwidth Expansion algorithm
An equivalent inverse matching algorithm is provided to perform bandwidth expansion for infrasound. Referring to Figure 5, to mimic an inverse process as closely as possible, the input stream 500 is divided up into segments 502 in the time domain. For example in Figure 5, if the data is at sampling frequency fi then a one second sequence yields fi samples divided into segments e.g. 502. The segments e.g. 502 are each copied repeatedly and the groups of copied segments are concatenated into an intermediate signal 504. For example, if each window is repeated m times at frequency fi, this results in m·fi samples in the intermediate signal 504. Similar issues of concatenation artifacts may be encountered, with more joining points in any given time interval compared to the compression embodiment described above. The intermediate signal 504 is then upsampled by a factor m (i.e. equal to the repetition factor) to provide a higher-bandwidth audible signal 506 occupying the same time as the input signal 500. For example in Figure 5 the sampling frequency is increased to fo = m·fi, which results in a 1 second output stream.
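The repeat-and-resample expansion of Figure 5 can be sketched as follows (an illustrative simplification; the function name is ours):

```python
import numpy as np

def expand_time_domain(x, fs, seg_len, m):
    """Repeat each seg_len-sample segment m times in succession and
    concatenate; playing the result at m*fs restores real time while
    raising every frequency component by the factor m."""
    n_seg = len(x) // seg_len
    segments = x[:n_seg * seg_len].reshape(n_seg, seg_len)
    y = np.repeat(segments, m, axis=0).reshape(-1)   # each row m times
    return y, fs * m
```

One second of infrasound at fi samples thus becomes m·fi samples which, replayed at m·fi, again occupy exactly one second.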
A cosine-smoothing window and zero crossing and gradient matching can be applied as described for the compression algorithm embodiment above. A suitable high-pass filter (e.g. with poles at 50/m Hz at the input stream) may be applied to the data for the zero-crossing and gradient-matching step.
Click or pulse detection
When the input signal contains sparse pulses, the time domain sample-compress-concatenate process with a regular interval may miss pulses. In order to avoid this, an example embodiment searches for regions within the time series that contain an energy level above a threshold and defines the position of the retention window there. Regions with high energy content are therefore candidates for compression. This not only minimises the chances of missing short pulses, but also reduces the computation effort when the input signal is relatively 'quiet'. An example application is the detection and bandwidth conversion of echolocation-like clicks or pulses. Choosing which samples contain valuable information in the form of pulses can be achieved by performing an envelope detection (via a Hilbert transform in digital space, or by rectifying and low-pass filtering in the analogue domain) and thresholding, with an adaptive level for example.
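The digital-domain envelope detection and thresholding mentioned above might be sketched as follows. The FFT-based analytic-signal construction stands in for the Hilbert transform, and the function names and fixed threshold are our own illustrative choices (the specification suggests the level may be adaptive):

```python
import numpy as np

def envelope(x):
    """Envelope via the analytic signal: zero the negative frequencies,
    double the positive ones, inverse-transform, take the magnitude."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        h[n // 2] = 1.0               # Nyquist bin for even-length input
    return np.abs(np.fft.ifft(X * h))

def detect_pulses(x, threshold):
    """Indices where the envelope exceeds the threshold -- candidate
    positions at which to place retention windows."""
    return np.flatnonzero(envelope(x) > threshold)
```

Retention windows are then placed only where `detect_pulses` reports energy, so short clicks are not lost between regularly spaced windows.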
Depending on the type of sonar or echolocation pulse being sought, one skilled in the art may select an appropriate retention window and compression rate. For example, dolphin echolocation clicks are usually no more than 200 μs long and, in the wild, generally repeat at no more than 200 Hz (associated with a minimum range of 3.75m). Thus retention windows of length 250 μs may be appropriate without risk of losing important information. A 10:1 bandwidth compression can then be achieved by discarding the following 2.25 ms of data without significant risk of missing the next pulse.
Parameters
The assumption of no prior knowledge of the inaudible signal when compressing the bandwidth in an example embodiment does not preclude tuning the compression parameters so that the algorithm works optimally and safely. For example, the compression parameters preferably produce audible signals with energy levels that are detectable by the human ear without damaging it. Such parameters might include the compression factor (C), gain factor (G), the length of the retention section (tk), and the length of the smoothing window (tc). In the following, examples of the tuning of the parameters in example embodiments will be described. The compression factor determines the original bandwidth that the user wants to map into the human hearing range. As the compression factor determines how much high-resolution frequency structure is to be discarded, the compression factor may be kept to a minimum. Hence the compression in one embodiment maps the highest frequency of interest into the higher end of the user's audible frequency range. Thus,

C = fhi / fmax
C = fhi / fmax

where
fhi is the upper end of the original bandwidth
fmax is the maximum frequency the user can hear

When determining the lengths of the retention window and the smoothing window, the length of the output audio block may be at least equal to the human hearing integration time, τ. This ensures that the user can integrate the acoustic power over that period and gain enough energy to sense the sound even when the acoustic level is below the detection threshold. The retention window length, tk, may be kept as small as possible to avoid loss of the original transient patterns. The length of the smoothing window, tc, may be kept smaller than or equal to tk, to ensure that the main energy content of the compressed signal comes from the retention window. Hence,
(tk + tc) C ≥ τ
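A worked numerical check of the two relations above, using the illustrative settings that appear later in this section (10 ms retention window, 0.5 ms smoothing window, 10:1 compression). The hearing integration time τ of roughly 100 ms and the 12 kHz upper limit of the user's hearing are assumed values, not figures from the specification:

```python
# Constraint 1: C = fhi / fmax  (keep C minimal to preserve frequency structure)
# Constraint 2: (tk + tc) * C >= tau, with tc <= tk
f_hi = 120e3          # upper end of the original bandwidth (Hz)
f_max = 12e3          # assumed maximum frequency the user can hear (Hz)
C = f_hi / f_max      # compression factor, here 10.0

tau = 0.1             # assumed human hearing integration time (s)
t_k = 10e-3           # retention window length (s)
t_c = 0.5e-3          # smoothing window length (s)

assert t_c <= t_k                  # smoothing window no longer than retention window
assert (t_k + t_c) * C >= tau      # output audio block covers the integration time
```

With these values the output audio block lasts (10 ms + 0.5 ms) × 10 = 105 ms, just above the assumed integration time, which is consistent with keeping tk as small as possible.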
The gain setting is preferably large enough that the energy content of the compressed signal produced from the weakest ultrasound detectable by the hardware is at least the minimum energy detectable by the human ear over the integration time. On the other hand, there may be occasions when the input pulse width is smaller than the hearing integration time even after compression. The smallest acoustic signal of interest is preferably amplified to the minimum human hearing threshold. Therefore the gain can be written as,
G = max( Pa / Pio , E0 / (Pio · Δt) )
where,
Pa is the minimum audible sound level for the human ear
Pio is the minimum sound level that the hardware can detect
E0 is the minimum audible energy over the hearing integration time
Δt is the minimum detectable ultrasound pulse width

An ultrasound sweep was used to evaluate the algorithm's performance at different parameter settings in different example embodiments. The performance statistic used was a two-dimensional zero-lag correlation of the spectrograms of the original and compressed signals, normalised by the energy content of both signals, as shown below:
RAB = ΣmΣn (Amn − Ā)(Bmn − B̄) / √[ ΣmΣn (Amn − Ā)² · ΣmΣn (Bmn − B̄)² ]

where
RAB is the correlation coefficient
A, B are the spectrogram matrices (absolute values) of the original and compressed signals respectively
m, n index the columns and rows of the matrices
Ā, B̄ are the means of the spectrogram matrices A and B respectively
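The normalised zero-lag spectrogram correlation just defined can be sketched directly in NumPy. This is a generic implementation of the statistic, not code from the patent:

```python
import numpy as np

def spectrogram_correlation(A, B):
    """Normalised 2-D zero-lag correlation R_AB of two equal-size magnitude spectrograms."""
    dA = A - A.mean()                 # subtract the matrix mean A-bar
    dB = B - B.mean()                 # subtract the matrix mean B-bar
    return float((dA * dB).sum() / np.sqrt((dA ** 2).sum() * (dB ** 2).sum()))
```

Identical spectrograms give a coefficient of 1.0; the statistic is invariant to overall gain and offset, so it measures pattern similarity rather than absolute level.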
A single 12-second frequency sweep from 80 kHz to 120 kHz at a signal-to-noise ratio of 55 dB was used. The compression parameters were held constant at a retention window length of 10 ms, a smoothing window length of 0.5 ms, and a compression factor of 10, except for the particular parameter under investigation (one at a time). The gain was kept constant throughout the tests. The tests give an idea of how an embodiment of the system behaves at different parameter values but do not signify the absolute performance of the algorithm, because the human brain can be efficient at picking up audio patterns in highly noisy environments. For example, the human ear may still audibly recognise the presence of an acoustic pattern in compressed signals with a normalised cross-correlation of less than 0.2. Fig. 6 illustrates the performance of embodiments of the system as the parameters are varied. The dotted lines 600, 602, 604 are the respective linear fits to the curves. It is observed that reducing both the compression factor 602 and the retention window length 600 yields better performance than increasing them. This may be because larger values of both parameters result in a larger data section being discarded each time, introducing more mismatch between the original and compressed signals. It is noted that the fit to the smoothing window length 604 tests ends at around 1 ms. This may be because the algorithm requires a minimum number of samples in the retention window to perform the compression and apply the smoothing windows. On the other hand, the larger the smoothing window 604, the more overlap of the time series is included and the less discontinuity exists in the compressed signal. This reduces the overall mismatch between the original and compressed signals.
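For concreteness, the truncate-and-concatenate operation whose parameters are varied in these tests might be sketched as below. The cosine bell smoothing at the section edges follows the smoothing window described in this specification, while the function name and the return of a reduced playback rate are illustrative assumptions:

```python
import numpy as np

def compress_time_domain(x, fs, C=10, t_k=10e-3, t_c=0.5e-3):
    """Keep the first t_k of every t_k * C of signal; smooth each joint with a cosine bell ramp."""
    keep = int(t_k * fs)               # samples retained per segment
    step = int(t_k * C * fs)           # segment length: (C - 1) * t_k is discarded
    fade = int(t_c * fs)               # smoothing (cosine bell) ramp length
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(fade) / fade))
    sections = []
    for start in range(0, len(x) - keep, step):
        s = x[start:start + keep].astype(float)
        s[:fade] *= ramp               # fade in at the leading edge
        s[-fade:] *= ramp[::-1]        # fade out at the trailing edge
        sections.append(s)
    # Playing the concatenation back at fs / C restores real-time pacing and
    # divides all frequencies by the compression factor C.
    return np.concatenate(sections), fs / C
```

With the default settings, each 100 ms segment contributes its first 10 ms, and the smoothed joins reduce the discontinuities that would otherwise add broadband artefacts to the compressed signal.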
Therefore it may be desirable to keep the retention window 600 and the compression factor 602 as small as possible, but to use a larger smoothing window 604.

Applications
One application of embodiments of the present invention is to support bio-acoustic research (on birds, mammals, insects, etc.) by potentially reducing analysis time and opening up new research possibilities. Traditionally, non-audio-band bio-acoustic studies involve recording the sound and processing it 'off-line' in order to analyse its detailed signature. Embodiments of the present invention, on the other hand, may enable the researcher to detect and 'classify' ultrasound in situ and in real time by 'hearing'. This may open up possibilities for the researcher to correlate the behaviour of animals with their ultrasonic acoustic behaviour in real time, and perhaps to interact with them.
Vocal communication between human beings carries not only linguistic information but also the emotion of the sentence. Such 'emotional' hints also exist in animal communication, whether as a gesture or as a vocal (or rather acoustic) presentation. Embodiments of the present invention may allow the user to pick up and experience such cues when studying the ultrasonic bio-acoustics of animals. For underwater ultrasound applications, embodiments of the present invention may allow ultrasound within an underwater area to be detected without introducing significant acoustic disturbance.
Embodiments of the present invention may also be useful in diagnosing faults and characteristics of, for example, high-voltage power lines, structural defects, ice cracking, electronic device noise sources and imminent failure of electronic circuit components, and in detecting marine mammal echolocation and infrasound produced by animals.
Examples of infrasound applications include air conditioning in buildings that may cause ill health, baleen whale and large terrestrial mammal (such as elephant) communication, seismic events on land and at sea, and the detection of distant traffic. To those skilled in the art to which the invention relates, many changes in construction and widely differing embodiments and applications of the invention will suggest themselves without departing from the scope of the invention as defined in the appended claims. The disclosures and the descriptions herein are purely illustrative and are not intended to be in any sense limiting.

Claims

CLAIMS:
1. A method of audible monitoring of acoustic signals outside human hearing range, the method comprising the steps of: dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section being representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
2. A method as claimed in claim 1 , wherein each section in said intermediate signal is derived using frequency domain compression of said segment.
3. A method as claimed in claim 2, wherein said frequency domain compression comprises obtaining a series of complex frequency estimates in the frequency domain for each segment.
4. A method as claimed in claim 3, further comprising replacing the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates.
5. A method as claimed in claim 4, wherein each single estimate is derived such that information corresponding to the amplitude and phase is preserved.
6. A method as claimed in claim 5, wherein each single estimate comprises a centroid for the corresponding m sequential complex frequency estimates.
7. A method as claimed in any one of claims 4 to 6, wherein deriving each single estimate comprises weighting the m sequential estimates according to their proximity to said each single estimate.

8. A method as claimed in claim 1, wherein each section in said intermediate signal is derived using time domain compression of said segment.
9. A method as claimed in claim 8 wherein said time domain compression comprises forming each section by truncating the corresponding segment in the time domain and concatenating the sections.
10. A method as claimed in claim 9, wherein said segment is truncated according to a compression factor (C): C = fhi/fmax, where fhi is the upper end of an original bandwidth in the acoustic signal, and fmax is the maximum audible frequency.
11. A method as claimed in claims 9 or 10, further comprising down sampling the intermediate signal based on the compression factor to derive the audible output signal.
12. A method as claimed in claim 1, wherein each section in said intermediate signal is derived using time domain expansion of said segment.
13. A method as claimed in claim 12, wherein each section in said intermediate signal comprises a repetition of said segment according to a repetition factor and the sections are concatenated.
14. A method as claimed in claim 13, wherein the intermediate signal is up-sampled based on the repetition factor to derive the audible output signal.
15. A method as claimed in claims 9 or 13, wherein said concatenation of the sections comprises applying a smoothing function to each section.
16. A method as claimed in claim 15, wherein the smoothing function comprises a cosine bell window.

17. A method as claimed in any one of claims 8 to 11, wherein dividing the acoustic signal comprises searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion.
18. A method as claimed in any one of the preceding claims, wherein said method comprises a method of compressing and/or expanding signal bandwidth for real-time audible monitoring of acoustic signals outside human hearing range.
19. A system for real-time audible monitoring of acoustic signals outside human hearing range, the system comprising: an analog to digital converter or sampler for sampling the acoustic signal; a processor for dividing said sampled acoustic signal into segments of a selected duration and for deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and a digital to analog converter for deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
20. A system as claimed in claim 19, wherein each section in said intermediate signal is derived by the processor using frequency domain compression of said segment.
21. A system as claimed in claim 20, wherein said frequency domain compression comprises obtaining a series of complex frequency estimates in the frequency domain for each segment.
22. A system as claimed in claim 21, wherein the processor replaces the series of complex frequency estimates with one or more single estimates, each single estimate being derived from m sequential ones of the series of complex frequency estimates.

23. A system as claimed in claim 22, wherein each single estimate is derived by the processor such that information corresponding to the amplitude and phase is preserved.
24. A system as claimed in claim 23, wherein each single estimate comprises a centroid for the corresponding m sequential complex frequency estimates.
25. A system as claimed in any one of claims 22 to 24, wherein deriving each single estimate by the processor comprises weighting the m sequential estimates according to the proximity to said each single estimate.
26. A system as claimed in claim 19, wherein each section in said intermediate signal is derived by the processor using time domain compression of said segment.
27. A system as claimed in claim 26, wherein said time domain compression comprises forming each section by truncating the corresponding segment in the time domain and concatenating the sections.
28. A system as claimed in claim 27 wherein said segment is truncated according to a compression factor (C):
C = fhi/fmax, where fhi is the upper end of an original bandwidth in the acoustic signal, and fmax is the maximum audible frequency.
29. A system as claimed in claims 27 or 28, wherein the sampler down samples the intermediate signal based on the compression factor to derive the audible output signal.
30. A system as claimed in claim 19, wherein each section in said intermediate signal is derived by the processor using time domain expansion of said segment.
31. A system as claimed in claim 30, wherein each section in said intermediate signal comprises a repetition of said segment according to a repetition factor and the sections are concatenated.

32. A system as claimed in claim 31, wherein the sampler up-samples the intermediate signal based on the repetition factor to derive the audible output signal.
33. A system as claimed in claims 27 or 31 , wherein said concatenation of the sections comprises applying a smoothing function to each section by the processor.
34. A system as claimed in claim 33, wherein the smoothing function comprises a cosine bell window.
35. A system as claimed in any one of claims 26 to 29, wherein dividing the acoustic signal comprises searching for a portion of said acoustic signal above a selected power threshold and defining a retention portion of one of the segments of the acoustic signal at said portion.
37. A system as claimed in any one of claims 19 to 36, wherein the device is implemented as a portable device.
38. A system as claimed in claim 37, wherein the portable device is incorporated into a diving harness.
39. A data storage medium having stored thereon code means for instructing a computer to execute a method of audible monitoring of acoustic signals outside human hearing range, the method comprising the steps of: dividing said acoustic signal into segments of a selected duration; deriving an intermediate signal comprising a concatenation of sections, each section representative of one corresponding segment and having a different duration than the corresponding segment; and deriving an audible output signal from said intermediate signal such that portions of the audible output signal derived from respective sections of the intermediate signal have the selected duration.
PCT/SG2005/000294 2004-08-30 2005-08-30 A method and system for monitoring of acoustic signals WO2006025798A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US60515804P 2004-08-30 2004-08-30
US60/605,158 2004-08-30
US60553304P 2004-08-31 2004-08-31
US60/605,533 2004-08-31

Publications (1)

Publication Number Publication Date
WO2006025798A1 true WO2006025798A1 (en) 2006-03-09


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4792145A (en) * 1985-11-05 1988-12-20 Sound Enhancement Systems, Inc. Electronic stethoscope system and method
US5012452A (en) * 1972-05-01 1991-04-30 The United States Of America As Represented By The Secretary Of The Navy Pulse transformation sonar


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KOAY T B ET AL, OCEANS, 9 November 2004 (2004-11-09), Retrieved from the Internet <URL:http://www.oceans-technoocean2004.com> *
SUKITTANON S ET AL: "Non stationary signal classification using joint frequency analysis", IEEE ICASSP, 6 April 2003 (2003-04-06) *


