US20110213476A1 - Method and Device for Processing Audio Data, Corresponding Computer Program, and Corresponding Computer-Readable Storage Medium

Info

Publication number
US20110213476A1
Authority
US
United States
Prior art keywords
parameters
user
analysis
audio
conversion module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/036,690
Inventor
Gunnar Eisenberg
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Publication of US20110213476A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H: ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 7/00: Instruments in which the tones are synthesised from a data store, e.g. computer organs
    • G10H 2240/00: Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G10H 2240/171: Transmission of musical instrument data, control or status information; Transmission, remote access or control of music data for electrophonic musical instruments
    • G10H 2240/281: Protocol or standard connector for transmission of analog or digital data to or from an electrophonic musical instrument
    • G10H 2240/311: MIDI transmission
    • G10H 2250/00: Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H 2250/311: Neural networks for electrophonic musical instruments or musical processing, e.g. for musical recognition or control, automatic composition or improvisation
    • G10H 2250/471: General musical sound synthesis principles, i.e. sound category-independent synthesis methods


Abstract

In a method, device, computer program, and computer-readable storage medium for processing audio data, which may be implemented in particular in the field of audio processing, M user parameters are entered into a conversion module, the M user parameters are mapped onto N technical parameters by means of artificial intelligence in the conversion module, the N technical parameters are delivered to audio equipment, audio data is processed in the audio equipment with the N technical parameters into an output signal, and the output signal is delivered from the audio equipment.

Description

  • This application claims priority under 35 U.S.C. §119 to German Patent Application No. DE 10 2010 009745.4, filed on 1 Mar. 2010, which is incorporated herein by reference for all purposes.
  • BACKGROUND
  • 1. Field of the Application
  • The present application relates to a method and a device for processing audio data, as well as, to a corresponding computer program and a corresponding computer-readable storage medium, which may be implemented, in particular, in the field of audio processing.
  • 2. Description of Related Art
  • Known recording studio equipment, such as synthesizers and audio effect units, has user interfaces that are designed individually for each piece of equipment. At such user interfaces, the parameters of the algorithms used for audio processing are directly accessible as technical parameters (frequencies, amplitudes, spectra, durations, factors, addends, etc.). This established concept has the disadvantage that control demands a high degree of technical understanding from the user, who is confronted with a multitude of technical parameters (usually in the range of 50 to 150) whose effect is often predictable only with in-depth technical knowledge. It should be noted that recording studio equipment is very frequently operated by musicians, not only by technicians. Moreover, because each user interface is designed individually, the user has to become acquainted anew with the controls of every piece of equipment, which can be very tedious and time-consuming.
  • One special field of recording studio technology is so-called resynthesis. In resynthesis, an input signal (e.g. a sound or noise) is reduced in an analysis step, via a mathematical transformation rule, to a weighted sum of base functions. In a subsequent resynthesis step, the original signal can be reassembled from this weighted sum. Manipulating the analysis results allows individual aspects of the signal to be modified specifically, which is what makes resynthesis useful.
  • As base functions, simple sine waves of different frequencies can be chosen, for instance, whose amplitudes may then be manipulated so as to amplify or attenuate individual frequencies.
  • As base functions, it is also possible to use, for instance, simple grains or wavelets of different extent and structure, so as to amplify or attenuate individual characteristics of the signal in the frequency and time domains.
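  • By way of illustration only (the following sketch is not part of the patent), the analysis/resynthesis principle can be demonstrated with an FFT as the transformation rule; NumPy, the sample rate, and the test signal are assumptions made for this example:

```python
import numpy as np

def analyze(signal: np.ndarray) -> np.ndarray:
    """Analysis step: the complex FFT coefficients are the weights
    of the sinusoidal base functions."""
    return np.fft.rfft(signal)

def resynthesize(weights: np.ndarray, n_samples: int) -> np.ndarray:
    """Resynthesis step: reassemble the signal from the weighted sum."""
    return np.fft.irfft(weights, n=n_samples)

sr = 8000                                  # assumed sample rate in Hz
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

w = analyze(x)
w[np.argmax(np.abs(w))] *= 0.1             # manipulate one analysis result:
y = resynthesize(w, len(x))                # the 440 Hz partial is now attenuated
```

Attenuating a single weight before resynthesis is exactly the kind of targeted modification of individual signal aspects described above.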
  • In recording studio technology, these methods are suited for filtering, but also for equalizers or noise suppression. Existing techniques for resynthesizing sounds are based either on filter banks or on FFT or wavelet transforms. Standard techniques in this respect are vocoders, phase vocoders, and sine models, each with or without a transient/noise component.
  • Inherent to all of these resynthesis techniques is the problem that, once a sound has been analyzed, a multitude of parameters (e.g. about 100 to 9000) are available as the time-variant signals required for resynthesis. Such a multitude of parameters can hardly be edited manually, so most resynthesis systems are closed systems. This is also one of the reasons why algorithms studied extensively in research are rarely put into practice.
  • The following known systems deal with the above-mentioned fields of recording studio technology.
  • The program Live® by Ableton® is a music sequencer with integrated synthesizers and effect units. To keep the user interface simple, eight macro parameters mapping prominent technical parameters can be assigned to each piece of audio equipment. Associating individual parameters into macro parameters is possible only to a limited extent, and in particular the entire parameter conversion is done purely manually. The resynthesis functionality in the program is realized as a black box, so the user has no possibility of intervention.
  • The systems Kore 1® and Kore 2® by Native Instruments® are synthesizers and effect units whose technical parameters can likewise be controlled by eight macro parameters. For this purpose, the internal technical parameters may be associated manually via any type of network. Again, such systems offer no automation, and there is no possibility of resynthesis.
  • The program Alchemy® by Camel Audio® is a synthesizer and effect unit whose technical parameters can in principle be managed by macro parameters, much like in the Kore® systems by Native Instruments®. With its extensive resynthesis options, it is indeed possible to edit the technical analysis/resynthesis parameters created during the resynthesis process, but only manually and directly as technical parameters.
  • The program Spectral Delay® by Native Instruments® is an effect unit that performs resynthesis by FFT. During the resynthesis process, 6144 technical analysis/resynthesis parameters are created as spectral data, which can be edited via a graphical user interface. However, each parameter must be processed individually and purely manually.
  • The Neuron® synthesizer by Hartman Music® allows sounds to be resynthesized by means of neural networks. Here, the neural networks serve as the transformation rule by which the sounds are stored. The individual parameters required for resynthesis are presented directly in the user interface, so the system can indeed be operated by neural network specialists, but hardly by the average musician. The system has no macro parameters or automation to help the user control the core technology.
  • Thus, in the processing of audio data as implemented, for instance, in recording studio technology, the problem very frequently arises that a user is confronted with a multitude of parameters that are not directly comprehensible without specific technical knowledge. Often it is the sheer number of parameters that prevents the user from working efficiently and purposefully.
  • Although great strides have been made in the area of processing audio data, many shortcomings remain.
  • DESCRIPTION OF THE DRAWINGS
  • FIG. 1 shows the principle of parameter conversion according to the invention.
  • FIG. 2 shows a resynthesis device based on parameter conversion of FIG. 1.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
  • According to the preferred embodiment of the present application, the mapping of M user parameters onto N technical parameters is achieved by means of artificial intelligence. For this purpose, a conversion module based on artificial intelligence is provided between the user interface, hereafter also called the user module, and the audio equipment itself. This makes it possible to present the user with a clearly arranged number of parameters.
  • In the user module, any type of parameter can be used as a user parameter, specifically technical parameters and/or musical parameters (tone pitches, loudness levels, tone colors, note values, harmonies, transpositions, etc.) and/or subjective parameters (sad/cheerful, languid/vivid, classical/progressive, etc.). Moreover, the user interface may preferably offer only musical and/or subjective parameters to choose from, so that control is clearly simplified for less technically inclined users. For operation, M user parameters are then selected, and these M user parameters are transformed by parameter conversion into the N technical parameters.
  • Since in principle any kind of parameter may be chosen in the user module, it is also possible to choose exotic parameters that do not come from the field of music or recording studio technology. Examples could be parameters from biology or color values from an RGB color space. Thus, in the user module, a plant could be represented via certain biological parameters, or a color via RGB parameters that a synesthete would assign to a certain sound. Which specific sound from the audio equipment is eventually assigned to these parameters can be taught to the artificial intelligence of the conversion module.
  • The present application allows any type of user parameter to be used, in any number, for controlling any equipment. Preferably, a few meaningful parameters (about 10 to 20) are chosen, so that the user is not overwhelmed by too many technical parameters (about 50 to 150). Therefore, according to a preferred embodiment of the present application, M < N.
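  • Purely as an illustration of such a mapping (not taken from the patent), the conversion module could be realized as a small feed-forward network that maps M = 10 user parameters onto N = 100 technical parameters; all dimensions and the untrained random weights below are assumptions:

```python
import numpy as np

M, N, HIDDEN = 10, 100, 64                 # assumed sizes, chosen so that M < N

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (HIDDEN, M))     # in practice these weights come from training
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.1, (N, HIDDEN))
b2 = np.zeros(N)

def convert(user_params: np.ndarray) -> np.ndarray:
    """Map an M-vector of user parameters onto an N-vector of technical parameters."""
    hidden = np.tanh(W1 @ user_params + b1)
    return W2 @ hidden + b2

technical_params = convert(rng.uniform(0.0, 1.0, M))   # e.g. ten slider positions
assert technical_params.shape == (N,)
```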
  • The M parameters of the user module can be specified by the manufacturer of the synthesizer or effect equipment. If the equipment to be connected to the parameter conversion is also specified, then the training of the artificial intelligence can be entirely factory-set, so that the user does not have to be confronted therewith. If the equipment to be connected to the parameter conversion is to be chosen freely, then the artificial intelligence has to be trained for each piece of equipment at user level.
  • However, this process may be automated, so that the user does not need professional knowledge of the internal training procedure. For this purpose, the N-dimensional space formed by the N technical parameters is scanned, with each point in the space corresponding to one parameter set. The sound generated in the audio equipment by each parameter set is then assigned, by a method of sound classification, to a sound class, which in turn is fixedly associated with a set of M user parameters specified at the factory. During training of the artificial intelligence, this set of M user parameters can then be associated with the matching parameter set of the N technical parameters.
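  • Organized as code, this automated collection of training data might look roughly as follows; `render_sound`, `classify_sound`, and `CLASS_TO_USER_PARAMS` are hypothetical stand-ins for the audio equipment, the sound-classification method, and the factory-specified association, and all sizes are kept artificially small:

```python
import itertools
import numpy as np

M, N = 4, 3                                # tiny dimensions for illustration only
GRID = np.linspace(0.0, 1.0, 5)            # coarse scan of each technical parameter
rng = np.random.default_rng(0)

def render_sound(tech_params: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the audio equipment."""
    t = np.linspace(0.0, 1.0, 400)
    freq = 100.0 + 400.0 * tech_params[0]
    return (0.2 + tech_params[1]) * np.sin(2 * np.pi * freq * t)

def classify_sound(sound: np.ndarray) -> int:
    """Hypothetical stand-in for the sound classification (crude energy bucketing)."""
    return int(np.clip(np.mean(sound ** 2) * 10.0, 0, 9))

# hypothetical factory table: each sound class is fixed to a set of M user parameters
CLASS_TO_USER_PARAMS = {c: rng.uniform(0.0, 1.0, M) for c in range(10)}

training_pairs = []                        # (M user params, N technical params)
for point in itertools.product(GRID, repeat=N):
    tech = np.array(point)                 # one point in the N-dimensional space
    sound_class = classify_sound(render_sound(tech))
    training_pairs.append((CLASS_TO_USER_PARAMS[sound_class], tech))
# training_pairs now provides known input/output vectors for training the AI.
```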
  • In principle, the M parameters of the user module can also be chosen and named by the user himself as to number and type; however, the artificial intelligence then has to be retrained with the newly defined parameters.
  • It would also be possible to envisage a user interface whose parameters are fundamentally defined by the manufacturer, but in which any parameter can be displayed or masked by the user. The artificial intelligence could then be trained by the manufacturer, while the user would still be able to configure the user interface largely himself without having to retrain it.
  • In particular, the present application also allows user modules to be provided uniformly for different equipment, since the artificial intelligence implemented according to the present application enables standardized parameter conversion. In other words, the user parameters could be the same for all of the equipment used, so that, for instance, a single user module may serve all available synthesizers. The same applies, for instance, to effect units and other recording studio equipment. According to the present application, a conversion module based on artificial intelligence is then used for standardized parameter conversion.
  • In the case of resynthesis, the conversion module is provided between the analysis module and a resynthesis module, so that user parameters and time-variant analysis parameters are input into the conversion module. The resynthesis process can thereby be influenced easily by a few M user parameters (e.g. about 10 to 20) in a user module: the artificial intelligence combines the M user parameters with the K time-variant analysis parameters and transforms them into N resynthesis parameters. The present application thus allows existing resynthesis algorithms to be controlled with a few parameters that in principle may be chosen freely.
  • The same applies to synthesizing entirely new sounds, without a known target signal being reproduced, i.e. resynthesized. As an analysis signal (input signal), a guitar sound could be used, for instance, from which the analysis module determines K time-variant analysis parameters. The conversion module can then, making use of artificial intelligence, perform an appropriate synthesis from the K analysis parameters and the M user parameters. In this way, the original guitar sound can, for instance, be transformed until it turns into a mix of piano and flute.
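  • For the resynthesis case, this combination could be sketched frame by frame as follows (illustrative only; the random matrix stands in for a trained conversion network, and all dimensions are assumptions):

```python
import numpy as np

M, K, N = 12, 1024, 2048                   # assumed sizes; K and N depend on the algorithm
rng = np.random.default_rng(1)
W = rng.normal(0.0, 0.01, (N, M + K))      # stand-in for a trained conversion network

def convert_frame(user_params: np.ndarray, analysis_frame: np.ndarray) -> np.ndarray:
    """Combine M user params with one frame of K analysis params
    into N resynthesis params."""
    joint = np.concatenate([user_params, analysis_frame])   # input vector of size M + K
    return W @ joint                                        # N resynthesis parameters

user = rng.uniform(0.0, 1.0, M)            # e.g. sliders such as 'more piano / more flute'
frames = rng.normal(0.0, 1.0, (100, K))    # K time-variant analysis params per frame
resynth = np.stack([convert_frame(user, f) for f in frames])   # shape (100, N)
```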
  • In practice, artificial intelligence (AI) systems accept one or more input parameters and deliver one or more output parameters in response. Input and output are generally in the form of vectors. Every AI system has to be trained before meaningful use: for a set of input vectors, the respectively correct output vectors must be known. The exact training algorithm depends on the structure of the AI system. Upon successful training, the AI system is basically capable of generating correct output vectors even for unknown input vectors.
  • The following techniques are used for realizing AI systems.
  • Symbolic AI
      • In a descriptive language (e.g. predicate logic or propositional logic), known properties of the system are described with binding rules.
      • During training, the rules are transformed manually or via a logic programming language, such as Prolog, so that explicit propositions regarding the treatment of the input data are created.
  • Statistical AI
      • Instead of the binding rules of a descriptive language, statistical models (e.g. Gaussian mixture model, hidden Markov model, k nearest neighbor) are used.
      • The discrete logical values of a descriptive language are replaced by probabilities, which are determined in the training phase by observing the statistical properties of the input vectors.
  • Neural AI
      • As a model of biological neurons, artificial neurons are built from simple mathematical operators and associated into very large networks.
      • The treatment of the input parameters is encoded in the connection strengths between the individual neurons.
      • The standard structures used here are feed-forward, Hopfield, and winner-takes-all networks, which are mainly trained via the backpropagation method; a minimal sketch follows this list.
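  • The following minimal NumPy sketch illustrates the last of these techniques, a small feed-forward network trained via backpropagation on known input/output vector pairs; the toy data and all sizes are assumptions and do not come from the patent:

```python
import numpy as np

rng = np.random.default_rng(2)
M, H, N = 4, 16, 3                         # assumed sizes: M inputs, N outputs

W1 = rng.normal(0.0, 0.5, (H, M)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, (N, H)); b2 = np.zeros(N)

def forward(x: np.ndarray):
    h = np.tanh(W1 @ x + b1)               # hidden layer
    return h, W2 @ h + b2                  # hidden activations and network output

# toy training set of known (input vector, correct output vector) pairs
X = rng.uniform(0.0, 1.0, (50, M))
Y = rng.uniform(0.0, 1.0, (50, N))

lr = 0.05
for _ in range(500):                       # plain gradient-descent backpropagation
    for x, y_true in zip(X, Y):
        h, y = forward(x)
        err = y - y_true                   # output-layer error
        dW2 = np.outer(err, h)             # gradients via the chain rule
        dh = (W2.T @ err) * (1.0 - h ** 2) # propagate error through the tanh layer
        dW1 = np.outer(dh, x)
        W2 -= lr * dW2; b2 -= lr * err
        W1 -= lr * dW1; b1 -= lr * dh
```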
  • Modular systems for synthesizers and effect equipment, such as Reaktor®, SynthMaker®, or Tassman®, can be simplified significantly with respect to their control, in that each individually created piece of audio equipment is standardized by the parameter conversion of the invention. The same applies to the control data in sequencing programs such as Logic®, Cubase®, or Live®.
  • Any type of sound can be reduced to models by the resynthesis-assisted AI of the present invention and then edited and transformed via uniform, simple user parameters. This gives musicians access to the field of complex mathematical transformations, because the control is similar to that of known samplers such as Kontakt® or Logic EXS24®, while the resulting sounds of the invention largely exceed those of known samplers.
  • Hereafter, the invention will be explained in more detail with reference to the figures, using various sample embodiments.
  • With reference to FIG. 1, the principle of parameter conversion on which the invention is based will be described. FIG. 1 shows, by way of illustration, a piece of recording studio equipment, or part thereof, composed of three modules. The three modules can be realized as separate pieces of hardware or within one piece of hardware in which they are logically separated from each other.
  • A user module (user interface) 10 provides a user with a selection of user parameters, from which the user selects M parameters. These M user parameters are then supplied to a conversion module 11, which maps them by means of artificial intelligence onto N technical parameters. These N technical parameters, whose number is, according to a preferred embodiment of the invention, notably greater than the number of M user parameters, are entered into audio equipment 12. The audio equipment processes audio data and/or audio control data with the N technical parameters into an audio signal 13 and outputs it.
  • The audio data may already be stored in the audio equipment 12. It is also possible for audio control data, such as MIDI data, to be entered into the audio equipment 12 from one or more pieces of external equipment (not shown), such as MIDI keyboards, in order to manipulate the audio data stored therein. Furthermore, the audio data, or part of it, may be entered into the audio equipment 12 from one or more pieces of external equipment (not shown), such as other synthesizers. The so-called external equipment may be contained inside the audio equipment itself and realized as logically separate modules, as for instance in keyboard workstations, or it may be realized as stand-alone hardware devices separate from the audio equipment. The audio equipment may, for instance, be a stand-alone rack synthesizer or a software plug-in.
  • With reference to FIG. 2, a resynthesis device will be described that is based on the principle of parameter conversion according to the present application. The resynthesis device may, for instance, be part of a piece of recording studio equipment or be embodied as stand-alone equipment.
  • The resynthesis device has an analysis module 14, into which an input signal 15 is entered. This input signal may be single-channel (mono), dual-channel (stereo), or multi-channel (e.g. Dolby Surround®, DTS®). The input signal 15 is analyzed by the analysis module 14 in order to determine K time-variant analysis parameters from it. For instance, the input signal is subjected to a specific transformation, resulting in the K time-variant analysis parameters. These K time-variant analysis parameters are entered into the conversion module 11 in addition to the M user parameters from the user module 10. The conversion module 11 then maps the M user parameters and the K analysis parameters by means of artificial intelligence onto N technical parameters, which in this particular case of resynthesis may also be called resynthesis parameters. These N resynthesis parameters are then used in a resynthesis module 16 to generate an output signal 17.
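  • Read as software, the signal flow of FIG. 2 might be wired roughly as in the following sketch; all class names are illustrative assumptions, a short-time FFT stands in for the unspecified transformation, and a random matrix stands in for the trained conversion AI:

```python
import numpy as np

class AnalysisModule:                      # analysis module 14
    """Determines K time-variant analysis parameters, here via a short-time FFT."""
    def __init__(self, frame: int = 256):
        self.frame = frame
    def analyze(self, signal: np.ndarray) -> np.ndarray:
        n = len(signal) // self.frame
        frames = signal[: n * self.frame].reshape(n, self.frame)
        return np.abs(np.fft.rfft(frames, axis=1))      # shape (n_frames, K)

class ConversionModule:                    # conversion module 11
    """Maps M user params plus K analysis params onto N resynthesis params."""
    def __init__(self, M: int, K: int, N: int):
        self.W = np.random.default_rng(3).normal(0.0, 0.01, (N, M + K))
    def convert(self, user: np.ndarray, analysis: np.ndarray) -> np.ndarray:
        return np.array([self.W @ np.concatenate([user, a]) for a in analysis])

class ResynthesisModule:                   # resynthesis module 16
    """Generates the output signal from N resynthesis params (inverse FFT here)."""
    def resynthesize(self, params: np.ndarray) -> np.ndarray:
        return np.fft.irfft(params, axis=1).ravel()

sr = 8000
x = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)        # input signal 15
analysis = AnalysisModule()
K = analysis.frame // 2 + 1                              # rfft bins per frame
conv = ConversionModule(M=10, K=K, N=K)                  # N = K so the irfft applies
user_params = np.full(10, 0.5)                           # from user module 10
y = ResynthesisModule().resynthesize(
        conv.convert(user_params, analysis.analyze(x)))  # output signal 17
```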
  • In the two sample embodiments described, for parameter conversion and for resynthesis, one audio signal 13 or 17, respectively, is output. This output signal may be single-channel (mono), dual-channel (stereo), or multi-channel (e.g. Dolby Surround®, DTS®).
  • It will be appreciated that the method of the present application, including one or more of its steps, may be carried out by a data processing system having a microprocessor, memory, and a storage means, with a computer program loaded into the storage means, wherein at least the mapping of the M user parameters onto the N technical parameters is carried out by the computer program. In addition, the steps and procedures of the present application may be performed manually or automatically in response to selected criteria.
  • Furthermore, the method of the present application may be utilized in the form of a computer-readable storage medium on which a computer program is stored that enables a data processing system, such as the data processing system described above, to carry out the method.
  • It is apparent that an invention with significant advantages has been described and illustrated. The particular embodiments disclosed above are illustrative only, as the invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. It is therefore evident that the particular embodiments disclosed above may be altered or modified, and all such variations are considered within the scope and spirit of the invention. Accordingly, the protection sought herein is as set forth in the description. Although the present application is shown in a limited number of forms, it is not limited to just these forms, but is amenable to various changes and modifications without departing from the spirit thereof.

Claims (14)

1. A method for processing audio data, comprising:
inputting M user parameters as an input signal into a conversion module;
mapping the M user parameters onto N technical parameters by means of artificial intelligence in the conversion module;
delivering the N technical parameters to audio equipment;
processing audio data in the audio equipment with the N technical parameters into an audio output signal; and
delivering the audio output signal from the audio equipment.
2. The method according to claim 1, wherein M<N.
3. The method according to claim 1, wherein the audio data is entered into the audio equipment.
4. The method according to claim 1, further comprising:
an analysis module;
wherein the input signal is entered into the analysis module, the analysis module determines K analysis parameters from the input signal, and the K analysis parameters are entered into the conversion module.
5. The method according to claim 4, wherein the conversion module maps the M user parameters and the K analysis parameters onto the N technical parameters.
6. The method according to claim 5, wherein the N technical parameters are synthesis parameters, and the audio equipment performs a synthesis.
7. The method according to claim 5, wherein the analysis module performs a transformation of the input signal, resulting in K analysis parameters, the conversion module transforms the K analysis parameters based on the M user parameters into N resynthesis parameters, and the audio equipment generates the audio output signal based on the N resynthesis parameters.
8. The method according to claim 1, wherein the conversion module is trained in an automated process.
9. The method according to claim 1, further comprising:
a data processing system having a microprocessor, memory, and a storage means;
a computer program loaded into the storage means;
wherein at least the mapping the M user parameters onto N technical parameters is carried out by the computer program.
10. A device for processing audio data, comprising:
a user module for providing user parameters from which a user may choose;
a conversion module for receiving M user parameters from the user module and for mapping the M user parameters by means of artificial intelligence onto N technical parameters; and
audio equipment for receiving the N technical parameters from the conversion module, for processing audio data with the N technical parameters into an output signal and for delivering the output signal.
11. The device according to claim 10, wherein M<N.
12. The device according to claim 10, further comprising:
one or more pieces of external equipment for providing the audio equipment with the audio data.
13. The device according to claim 10, further comprising:
an analysis module for determining from an input signal K analysis parameters and for inputting the K analysis parameters into the conversion module.
14. The device according to claim 10, wherein the conversion module is based on algorithms from at least one of symbolic artificial intelligence, neural artificial intelligence, and statistical artificial intelligence.
US13/036,690 2010-03-01 2011-02-28 Method and Device for Processing Audio Data, Corresponding Computer Program, and Corresponding Computer-Readable Storage Medium Abandoned US20110213476A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102010009745A DE102010009745A1 (en) 2010-03-01 2010-03-01 Method and device for processing audio data
DE102010009745.4 2010-03-01

Publications (1)

Publication Number Publication Date
US20110213476A1 (en) 2011-09-01

Family

ID=44501992

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/036,690 Abandoned US20110213476A1 (en) 2010-03-01 2011-02-28 Method and Device for Processing Audio Data, Corresponding Computer Program, and Corresponding Computer-Readable Storage Medium

Country Status (2)

Country Link
US (1) US20110213476A1 (en)
DE (1) DE102010009745A1 (en)

Patent Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5740260A (en) * 1995-05-22 1998-04-14 Presonus L.L.P. Midi to analog sound processor interface
US20050248476A1 (en) * 1997-11-07 2005-11-10 Microsoft Corporation Digital audio signal filtering mechanism and method
US20050248474A1 (en) * 1997-11-07 2005-11-10 Microsoft Corporation GUI for digital audio signal filtering mechanism
US7257452B2 (en) * 1997-11-07 2007-08-14 Microsoft Corporation Gui for digital audio signal filtering mechanism
US6236966B1 (en) * 1998-04-14 2001-05-22 Michael K. Fleming System and method for production of audio control parameters using a learning machine
US7212640B2 (en) * 1999-11-29 2007-05-01 Bizjak Karl M Variable attack and release system and method
US20070025566A1 (en) * 2000-09-08 2007-02-01 Reams Robert W System and method for processing audio data
US7301093B2 (en) * 2002-02-27 2007-11-27 Neil D. Sater System and method that facilitates customizing media
US7138575B2 (en) * 2002-07-29 2006-11-21 Accentus Llc System and method for musical sonification of data
US20080021851A1 (en) * 2002-10-03 2008-01-24 Music Intelligence Solutions Music intelligence universe server
US20060147068A1 (en) * 2002-12-30 2006-07-06 Aarts Ronaldus M Audio reproduction apparatus, feedback system and method
US8086448B1 (en) * 2003-06-24 2011-12-27 Creative Technology Ltd Dynamic modification of a high-order perceptual attribute of an audio signal
US20090157575A1 (en) * 2004-11-23 2009-06-18 Koninklijke Philips Electronics, N.V. Device and a method to process audio data , a computer program element and computer-readable medium
US7555715B2 (en) * 2005-10-25 2009-06-30 Sonic Solutions Methods and systems for use in maintaining media data quality upon conversion to a different data format
US7580839B2 (en) * 2006-01-19 2009-08-25 Kabushiki Kaisha Toshiba Apparatus and method for voice conversion using attribute information
US20070288410A1 (en) * 2006-06-12 2007-12-13 Benjamin Tomkins System and method of using genetic programming and neural network technologies to enhance spectral data
US7842874B2 (en) * 2006-06-15 2010-11-30 Massachusetts Institute Of Technology Creating music by concatenative synthesis
US20100183161A1 (en) * 2007-07-06 2010-07-22 Phonak Ag Method and arrangement for training hearing system users
US20100220879A1 (en) * 2007-10-16 2010-09-02 Phonak Ag Hearing system and method for operating a hearing system
US20090182736A1 (en) * 2008-01-16 2009-07-16 Kausik Ghatak Mood based music recommendation method and system
US20110191101A1 (en) * 2008-08-05 2011-08-04 Christian Uhle Apparatus and Method for Processing an Audio Signal for Speech Enhancement Using a Feature Extraction
US20100138220A1 (en) * 2008-11-28 2010-06-03 Fujitsu Limited Computer-readable medium for recording audio signal processing estimating program and audio signal processing estimating device
US20100180224A1 (en) * 2009-01-15 2010-07-15 Open Labs Universal music production system with added user functionality
US20100179674A1 (en) * 2009-01-15 2010-07-15 Open Labs Universal music production system with multiple modes of operation

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130103173A1 (en) * 2010-06-25 2013-04-25 Université De Lorraine Digital Audio Synthesizer
US9170983B2 (en) * 2010-06-25 2015-10-27 Inria Institut National De Recherche En Informatique Et En Automatique Digital audio synthesizer
US20140379333A1 (en) * 2013-02-19 2014-12-25 Max Sound Corporation Waveform resynthesis
CN111145723A (en) * 2019-12-31 2020-05-12 广州酷狗计算机科技有限公司 Method, device, equipment and storage medium for converting audio

Also Published As

Publication number Publication date
DE102010009745A1 (en) 2011-09-01

Similar Documents

Publication Publication Date Title
Cheuk et al. nnaudio: An on-the-fly gpu audio to spectrogram conversion toolbox using 1d convolutional neural networks
JP7243052B2 (en) Audio extraction device, audio playback device, audio extraction method, audio playback method, machine learning method and program
US10564923B2 (en) Method, system and artificial neural network
Canadas-Quesada et al. Percussive/harmonic sound separation by non-negative matrix factorization with smoothness/sparseness constraints
Barchiesi et al. Reverse engineering of a mix
Ramírez et al. Differentiable signal processing with black-box audio effects
Garcia Growing sound synthesizers using evolutionary methods
Miron et al. Monaural score-informed source separation for classical music using convolutional neural networks
Macret et al. Automatic design of sound synthesizers as pure data patches using coevolutionary mixed-typed cartesian genetic programming
CN109979428B (en) Audio generation method and device, storage medium and electronic equipment
US20110213476A1 (en) Method and Device for Processing Audio Data, Corresponding Computer Program, and Corresponding Computer-Readable Storage Medium
Martínez-Ramírez et al. Automatic music mixing with deep learning and out-of-domain data
Hoffman et al. Feature-Based Synthesis: Mapping Acoustic and Perceptual Features onto Synthesis Parameters.
Masuda et al. Quality-diversity for Synthesizer Sound Matching
Yee-King Automatic sound synthesizer programming: techniques and applications
Rodriguez-Serrano et al. A score-informed shift-invariant extension of complex matrix factorization for improving the separation of overlapped partials in music recordings
Macret et al. Automatic calibration of modified fm synthesis to harmonic sounds using genetic algorithms
Gounaropoulos et al. Synthesising timbres and timbre-changes from adjectives/adverbs
Loiseau et al. A model you can hear: Audio identification with playable prototypes
García Automatic generation of sound synthesis techniques
Gabrielli et al. A multi-stage algorithm for acoustic physical model parameters estimation
Tachibana et al. Comparative evaluations of various harmonic/percussive sound separation algorithms based on anisotropic continuity of spectrogram
Pereira et al. Musikverb: A harmonically adaptive audio reverberation
CN114667563A (en) Modal reverberation effect of acoustic space
Caetano et al. Interactive Control of Evolution Applied to Sound Synthesis.

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION