WO2001054060A2 - A method and apparatus of using a neural network to train a neural network - Google Patents

A method and apparatus of using a neural network to train a neural network

Info

Publication number
WO2001054060A2
WO2001054060A2 (PCT/US2001/002426)
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
radon
function
model
training
Prior art date
Application number
PCT/US2001/002426
Other languages
French (fr)
Other versions
WO2001054060A3 (en)
Inventor
Hawley K. Rising, III
Original Assignee
Sony Electronics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Electronics, Inc.
Priority to AU2001231139A1
Publication of WO2001054060A2
Publication of WO2001054060A3


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Definitions

  • First, segments of geodesics are taken on the half-sphere. If a pattern of finite segments is allowed, then one possible arrangement is to allow that each segment is incident on points only along the geodesic on which it lies, that each segment is incident on the same number of points, and that there is a segment centered at each sample point.
  • If the length of each segment is k, and there is a sample centered at each x_i,j, then there are k segments incident on the trial sample point x_0.
  • These k segments comprise k^2 samples, counting repetition, so that an "average" over these segments would require a factor of 1/k^2.
  • The rest of the analysis proceeds as with the half-sphere analysis, except that there is a different average value calculation and a different wavelet condition. The average is replaced with a weighted average.
  • Each local set along a geodesic on the half-sphere will be referred to as a segment, and it will be assumed that each segment contains k samples. Furthermore, it is assumed that the segments are centered at samples spaced one sample apart, so that along a given geodesic, the segment centered at x_i contains x_j if, and only if, the distance between the two samples is less than (k-1)/2.
  • Figure 3b is an illustration of a leaf showing one embodiment of the overlapping segments of the geodesic of a half-sphere.
  • Point 310 represents a point x_0 on a half-sphere.
  • Segments 312, 314, 316, 318 and 320 overlap along a geodesic covering point x_0.
  • Segments 312-320 form a leaf.
  • The weighted average u(f) needs to be expressed as a function of the Radon transform of f, not f itself. See Bolker. If the incidence structure of the points and segments is uniform, this is no problem, because then every point ends up with k segments incident on it, and the weighting formula may be defined on the Radon transform segments by defining a distance d between the segment and x_0 to be the distance from x_0 to the center of the segment.
  • This leads to an average over all segments, weighted by distance and divided by a factor of k^2 for the overlap.
  • The same exercise may be done using different packing formulas, which amount to specifying the connectivity between points in the model of the visual system.
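A rough sketch of this segment bookkeeping follows (not from the patent: it assumes a single discretized geodesic of n samples, an odd segment length k, and an illustrative 1/(1+d) distance weighting, which the patent does not specify):

```python
import numpy as np

n, k = 64, 5                       # samples along one closed geodesic; odd k
f = np.random.rand(n)

# Local Radon transform: one segment of k samples centered at every sample point.
half = k // 2
seg = np.array([sum(f[(c + j) % n] for j in range(-half, half + 1)) / k
                for c in range(n)])

# Exactly k segments are incident on a given point x0 (those whose centers lie
# within half of it); average them, weighted by the distance d from x0 to each
# segment's center.
x0 = 10
d = np.array([min(abs(c - x0), n - abs(c - x0)) for c in range(n)])
incident = d <= half
w = 1.0 / (1.0 + d[incident])      # hypothetical distance weighting
u = (w * seg[incident]).sum() / w.sum()
print(incident.sum(), u)           # k incident segments; their weighted average
```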
  • Figure 4 is an illustration of one embodiment of the mapping of half-sphere geodesics to a plane in a continuum.
  • Half-sphere 400 has geodesics 401a,b,c. The geodesics 401a,b,c are mapped 411a,b,c to plane 420, which includes a grid 430.
  • The shape shown for the grid 430 is a function of the kind of connectivity the continuum has.
  • A specific example will be used to illustrate the use of partial backprojections. On the surface of a half-sphere with the same geodesics through point x_0, a large number of objects are desired to be formed by taking pairs of geodesics through point x_0.
  • The correlations being formed are, specifically, junctions and end-stopping cells of a particular variety.
  • The correlations may be made more like end-stopping cells by taking half arcs joined at the point x_0. Since the two cases are conceptually identical, the latter formulation will be taken.
  • The correlation may be built from the structures generated by a grid of half-spheres.
  • The constructs are parameterized as follows: at each point x, sets are parameterized as g(θ, φ, x), where θ is the angle of the first half-geodesic.
  • This representation is correct in the following way. A set of geodesics through points is assumed at the start. The values on these geodesics are given by the Radon transform in the usual sense. If sets of these structures, characterized by fixed angle θ, are added, a different average value formula is obtained, but the backprojection is of the same general form. Consequently, the result of the transformation may be inverted in a single step.
  • Sets of geodesics may be taken from the set of leaves in G_x0, the set of all segments intersecting x_0. Because any set of these segments contains copies of x_0, and because, by rotational symmetry, all rotations of such sets may be taken as a basis at each point, the same construct may be generated in forming an inverse.
  • Such constructs are referred to as "partial backprojections".
  • Partial backprojections are important for two reasons.
  • The first reason is that there are important examples of these sets that correspond to human visual system constructs.
  • The second reason is that the entire algorithmic formula comes into play. Supposing a localized Radon transform of a color space: when inverting the color space, the backprojection may be adjusted, or the backprojection may be filtered, to render no dispersion in the point spread function. The net effect is that edge information has been extracted at the expense of level set information, and the level set information has been replaced with a new value. This is identical to a gray world assumption in the retinex or similar algorithms.
  • Figure 5 is an illustration of one embodiment of building dimension.
  • Two points 501, 502 form a line 511.
  • Two lines 511, 512 form a plane 521.
  • Two planes 521, 522 form a volume 531.
  • Figure 6 shows a diagrammatic representation of a machine in the exemplary form of a computer system 600.
  • The machine may comprise a network router, a network switch, a network bridge, a Personal Digital Assistant (PDA), or any machine capable of executing a sequence of instructions.
  • The computer system 600 includes a processor 602, a main memory 604 and a static memory 606, which communicate with each other via a bus 608.
  • The computer system 600 may further include a video display unit 610 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 600 also includes an alphanumeric input device 612 (e.g., a keyboard), a cursor control device 614 (e.g., a mouse), a disk drive unit 616, a signal generation device 620 (e.g., a speaker) and a network interface device 622.
  • The disk drive unit 616 includes a computer-readable medium 624 on which is stored a set of instructions (i.e., software) 626 embodying any one, or all, of the methodologies described above.
  • The software 626 is also shown to reside, completely or at least partially, within the main memory 604 and/or within the processor 602.
  • The software 626 may further be transmitted or received via the network interface device 622.
  • The term "computer-readable medium" shall be taken to include any medium that is capable of storing or encoding a sequence of instructions for execution by the computer and that causes the computer to perform any one of the methodologies of the present invention.
  • The term "computer-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic disks, and carrier wave signals.
  • Figure 7 is a flow diagram of one embodiment of a method of designing a set of wavelet bases to fit a particular problem.
  • A neural network of arbitrary complexity is constructed using a discrete and finite Radon transform.
  • The central equation for performing the Radon transform may include the Gindikin equation or the Bolker equation referenced above.
  • Construction of the neural network will include backprojecting the Radon transform to a point and subtracting a global average function of the point.
  • The global average function is dependent on the transform geometry and may be varied by varying the interconnect structure of the neural network, as described above.
  • The Radon transform may be weighted to a desired template function. The neural network may be built to have a particular geometry so that, given a particular point, the size and shape of the line segments chosen for the Radon transform and its backprojection form the particular geometry.
  • The geometry may be any geometry, such as, for example, spherical or hyperbolic.
  • The Radon transform is dual to the network because the neural network performs the Radon transform and inverts the Radon transform.
  • An input wavelet prototype designed to fit a particular problem is fed through the neural network and its backpropagation to produce an output.
  • The wavelet prototype may be a function which is close to the desired shape, if that is known. The wavelet is used to train the neural network to be specific to a certain set of images.
  • The input function of the neural network is modified using the output.
  • The input function is modified by subtracting the difference between the output of the neural net, which is the inverse of the Radon transform, and the input of the neural network, which is the original data, such as, for example, an image.
  • The difference between the output and the input is used by the neural network to modify the input function by moving the input function in the opposite direction from the difference between the output and the input. The result is then a "mother wavelet" from which a wavelet basis local to that point may be formed.
  • In this way, the neural network will produce wavelets that are capable of processing the images with little loss. The training of the neural network may continue until the difference between the output and the input reaches a predetermined value, which may be an error value for the neural network.
  • At that point, the training will cease so that the neural network is not overtrained. This method of constructing wavelets is optimized for massively parallel image processing and distribution. It optimizes around the image or template being processed, and does not require that the exact characteristics of the template be known analytically. The method of constructing wavelets also works for any dimension, and can work on data that comes from experiment, when a template is not known, by using the template as a block design.
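The convergence loop in the preceding passages can be rendered schematically as follows. This is a sketch, not the patent's implementation: `radon` and `backproject` stand in for the network and its dual, and the step size `eta` is an assumption.

```python
import numpy as np

def train_mother_wavelet(f0, radon, backproject, eta=0.1, tol=1e-6, max_iter=10000):
    """Feed the prototype through the network and its dual, subtract the output
    from the input, and move the input opposite the difference until they agree."""
    f = f0.copy()
    for _ in range(max_iter):
        out = backproject(radon(f))     # network followed by its backprojection
        diff = out - f
        if np.max(np.abs(diff)) < tol:
            break                       # converged: the wavelet condition holds
        f -= eta * diff
    return f                            # the resulting "mother wavelet"

# Toy check: a dual pair whose only distortion is adding the average (pure blur).
proto = np.random.rand(32)              # arbitrary wavelet prototype
w = train_mother_wavelet(proto, radon=lambda g: g,
                         backproject=lambda g: g + g.mean())
print(abs(w.mean()) < 1e-3)             # zero average, i.e. a wavelet
```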
  • The method is adaptive in parallel, and could be used to generate wavelet bases tuned to very specific templates, such as, for example, to measure differences.
  • The method also allows wavelets to be built specifically for image analysis functions, and allows "picture-centric" wavelet bases.
  • Picture-centric wavelet bases include wavelet bases that are specific to a certain type of image.
  • For example, the wavelet bases may be constructed for images of houses, which have a large number of parallel and horizontal lines.
  • The wavelet basis may also be constructed to be an edge detector, as described above.
  • The method of constructing wavelets generalizes to many dimensions, and may be used to compress multi-dimensional data. The method, with another dimension, may be appropriate to spatio-temporal data such as, for example, video.
  • The method of constructing wavelets models the human visual system, and could be important to computer vision tasks.
  • Figure 8 is a block diagram of one embodiment of system 800 for designing a set of wavelet bases.
  • System 800 includes a designing module 810 coupled to a feeder 820.
  • The designing module 810 designs an input wavelet prototype to fit a particular problem, and this input wavelet prototype is fed through a neural network 840 and its backpropagation to produce an output. The wavelet prototype may be a function which is close to the desired shape, if that is known.
  • The wavelet is used to train the neural network 840 to be specific to a certain set of images.
  • The neural network 840 is of arbitrary complexity and is constructed by a neural network constructor 830 using a discrete and finite Radon transform. The central equation for performing the Radon transform may include the Gindikin equation or the Bolker equation as referred to above.
  • Construction of the neural network 840 will include backprojecting the Radon transform to a point and subtracting a global average function of the point. The global average function is dependent on the transform geometry and may be varied by varying the interconnect structure of the neural network 840, as described above.
  • The neural network 840 may be built to have a particular geometry so that, given a particular point, the size and shape of the line segments chosen for the Radon transform and its backprojection form the particular geometry.
  • The geometry may be any geometry, such as, for example, spherical or hyperbolic.
  • The Radon transform is dual to the network 840 because the neural network 840 performs the Radon transform and inverts the Radon transform.
  • The neural network 840 is also coupled to a modifier module 850 that modifies an input function of the neural network 840 using the output.
  • The input function is modified by subtracting the difference between the output of the neural network 840, which is the inverse of the Radon transform, and the input of the neural network 840, which is the original data, such as, for example, an image.
  • The difference between the output and the input is used by the neural network 840 to modify the input function by moving the input function in the opposite direction from the difference between the output and the input. The result is then a "mother wavelet" from which a wavelet basis local to that point may be formed.
  • The neural network 840 will produce wavelets that are capable of processing the images with little loss.
  • The training of the neural network 840 may continue until the difference between the output and the input reaches a predetermined value, which may be an error value for the neural network 840.
  • At that point, the training will cease so that the neural network 840 is not overtrained.
  • A neural network having a specific geometry is constructed using a discrete and finite Radon transform. The construction of the neural network is based on an analysis of the geometry of the desired network. The specific geometry chosen may depend on the simplicity of the encoding, the simplicity of the decoding, the natural geometry suggested by the subject matter to be compressed, and/or the natural geometries suggested by the network architecture.
  • The data to be compressed is fed through the network to produce a transformed data stream.
  • Data is passed through a neural network that produces the Radon transform of the data. Passing it through the MLP stage produces the backprojection of the Radon transform.
  • R: Radon transform.
  • R*: backprojection of the Radon transform.
  • The whole system performs R*R on an input. Output is compared to input, and weights are set for the input stage such that output minus input equals zero.
  • The resulting input stage is a wavelet transform. Data passed through the input process is wavelet transformed by this stage. That constitutes the "transformed stream".
  • The transformed data stream is thresholded. Thresholding the data stream may include thresholding the data based on predetermined criteria.
  • The predetermined criteria may include the quality of features to be preserved, such as, for example, outlines, or a criterion such as a desired compression ratio. The thresholding process may also include removing components of the data stream above a predetermined maximum frequency.
  • Frequencies that, for example, would normally not be seen by the human eye may be removed to reduce the amount of data to be compressed.
  • The thresholded data stream is entropy encoded to compress the data stream. The thresholded data stream may be divided into a plurality of data streams if the compressed data is to be stored in a distributed mode.
  • The compressed stream may also be zero-tree encoded or bitplane encoded. This produces the compressed stream. Whether the transformed stream should be thresholded and/or zero-tree or bitplane encoded depends on the geometric design of the Radon transform.
  • The inverse is the inverse of the entropy and bitplane encoding, plus the neural net expressing R*R.
  • The entropy and bitplane or zero-tree encoding is inverted (standard) and passed through R*R, which produces the original, decoded data.
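Schematically, the encode path just described chains three stages. The sketch below uses hypothetical helpers: a real system would substitute the trained wavelet stage for `wavelet_transform` and a true entropy coder for the run-length stand-in.

```python
import numpy as np

def threshold(stream, cut):
    """Drop coefficients below a predetermined criterion."""
    return np.where(np.abs(stream) >= cut, stream, 0.0)

def rle_encode(stream):
    """Stand-in entropy encoder: run-length code the (mostly zero) coefficients."""
    out, run = [], 0
    for v in stream.ravel():
        if v == 0.0:
            run += 1
        else:
            out.append((run, float(v)))
            run = 0
    out.append((run, None))
    return out

def compress(data, wavelet_transform, cut=0.1):
    # encode: wavelet transform -> threshold -> entropy encode;
    # decoding inverts the coder, then the R*R network recovers the data
    return rle_encode(threshold(wavelet_transform(data), cut))

print(len(compress(np.random.rand(64), wavelet_transform=lambda d: d - d.mean())))
```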
  • The wavelet used to transform data is designed by the shape of the oriented filters and the geometry of the neural network.
  • The wavelets may be generated to fit extraordinary forms of compression demands, or specific material.
  • The method of compression also provides a method of cleaning the data while compressing it.
  • This is accomplished by using threshold functions which are soft (i.e., graduated rather than binary) for compression geometries that have multiple resolutions.
  • This embodiment allows storage of the compressed data in a form which could be used for texture detection, and for some curvature and three-dimensional information, without decompressing. The partial backprojections may be done by the use of correlation, such as the correlation of neighboring data, and this allows image compression which is compatible with feature detection and query by content.
  • The method of compression allows a very general, but very analytical, method for designing image compression.
  • The method allows image compression which minimizes concentration of activity on a network, training of specialized wavelet compression methods for special data, and the creation of compression methods consistent with image querying.
  • Figure 10 is a block diagram of one embodiment of system 1000 for compressing images.
  • System 1000 includes a data repository 1010 coupled to a feeder 1020.
  • Data repository 1010 contains data to be fed through a neural network 1040 by the feeder 1020.
  • The neural network 1040 has a specific geometry and is constructed by a neural network constructor 1030 using a finite and discrete Radon transform.
  • The construction of the neural network 1040 is based on an analysis of the geometry of the desired network. The specific geometry chosen may depend on the simplicity of the encoding, the simplicity of the decoding, the natural geometry suggested by the subject matter to be compressed, and/or the natural geometries suggested by the network architecture.
  • The data is fed through the neural network 1040, and the neural network 1040 produces a transformed data stream. The transformed data stream moves through a thresholder 1050, which thresholds the data stream. Thresholding the data stream may include thresholding the data based on predetermined criteria.
  • The predetermined criteria may include the quality of features to be preserved, such as, for example, outlines, or a criterion such as a desired compression ratio.
  • The thresholding process may also include removing components of the data stream above a predetermined maximum frequency.
  • Frequencies that, for example, would normally not be seen by the human eye may be removed to reduce the amount of data to be compressed.
  • A fixed input signal feeder 1060 feeds a fixed input signal through the neural network 1040 to generate a decoding calculation of an average value.
  • The average value will be used to invert the Radon transform to recover the transformed data. Referring back to Figure 1, the feedback connections eliminate the average, which causes blurring.
  • This is a function only of the geometry of the network. A signal may be input that is fixed and constant over the network inputs. This produces the blur part of the output. If the blur part of the output is fed back to the weights on the network, it can be used to tune the weights so that the output and input match, tuning the network.
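A sketch of this constant-input trick follows (schematic: `net` stands for the combined Radon/backprojection operator, and subtracting the measured blur is one reading of "tuning the weights"):

```python
import numpy as np

def estimate_blur(network, shape):
    """The blur depends only on the geometry, so the response to a fixed,
    constant input isolates the blur part of the output."""
    return network(np.ones(shape))

net = lambda g: 3 * g + g.sum()            # toy R*R-style operator: copy + blur
data = np.random.rand(16)
cleaned = net(data) - estimate_blur(net, data.shape) * data.mean()
print(np.allclose(cleaned, 3 * (data - data.mean())))   # blur component removed
```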
  • An entropy encoder 1070 is coupled to the thresholder 1050, and the entropy encoder 1070 encodes the thresholded data stream coming out of the thresholder 1050. This compresses the data stream.
  • The thresholded data stream may be divided into a plurality of data streams if the compressed data is to be stored in a distributed mode.
  • Figure 11 is a flow diagram of one embodiment of a method of reconstructing audio/video/image data from higher moment data.
  • A finite Radon transform is performed on the higher moment data. At processing block 1102, an average function is generated to allow inversion of the Radon transform in one step.
  • The average function may be calculated based only on the geometry and used for multiple reconstructions.
  • The Radon transform at each point is correlated. When a Radon transform of a dimension one higher than the original transform is created by taking correlations at each point of the transformed data, a partial backprojection is produced.
  • A resultant set of duplications is calculated using the correlation process in order to generate a new average function.
  • The sum is taken of the partial backprojections of the Radon transform at each point. The new average function for each point is subtracted from the sum of the partial backprojections at that point (1106). The inverse at each step is representative of the Gindikin formula.
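Written out, the reconstruction step just described has the schematic form below, with $P_i$ denoting the partial backprojections and $u_{\mathrm{new}}$ the new average function (this notation is assumed here, not the patent's):

$$f(x) \;=\; \sum_i P_i\hat{f}(x) \;-\; u_{\mathrm{new}}(x)$$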
  • The general form for discrete Radon transforms is explicitly given in new cases, specifically for the case in which balanced resolved block designs are not present.
  • The solution is not a relaxation method. The solution is consistent with moments generated in image analysis.
  • The solution takes geometry into account, significantly generalizing the moment method of describing image data.
  • The method disclosed, when executed in parallel, is potentially faster, since it requires only a single step.
  • The average function may be calculated based only on the geometry and used for multiple reconstructions.
  • The solution can also model many different experimental designs and correlation statistics.
  • The method can be trained for geometries with no closed form by backprojecting a constant function.
  • Figure 12 is a block diagram of one embodiment of system 1200 for reconstructing audio/video/image data from higher moment data.
  • System 1200 includes a higher moment data repository 1210 coupled to a Radon transform module 1220. The higher moment data repository contains higher moment data that is used by the Radon transform module 1220 to perform a finite Radon transform 1230.
  • An average function generator 1240 generates an average function to allow inversion of the Radon transform 1230 in one step 1250. The average function may be calculated based only on the geometry and used for multiple reconstructions.
  • A correlation module 1260 correlates the Radon transform 1230 at each point.
  • When a Radon transform 1230 of a dimension one higher than the original transform is created by taking correlations at each point of the transformed data, a partial backprojection is produced.
  • A summing module 1280 sums the partial backprojections of the Radon transform 1230 at each point.
  • A subtracting module 1290 is coupled to both the calculator 1270 and the summing module 1280. The subtracting module 1290 subtracts the new average function for each point from the sum of the partial backprojections at that point. The inverse at each step is representative of the Gindikin formula.
  • The solution is not a relaxation method.
  • The solution is consistent with moments generated in image analysis. Also, the solution takes geometry into account, significantly generalizing the moment method of describing image data.
  • Figure 13 is a flow diagram of one embodiment of a method of using a neural network to train a neural network.
  • A model for a desired function is created as a multi-dimensional function.
  • A decision is made as to whether to model it in a single stage or not.
  • To determine whether to model the function as a single stage or not, it is determined whether the created model fits a simple finite geometry model. There is always a geometry that will fit a particular application. If that geometry is better expressed as being of higher dimension than 2, then the model will use multiple stages.
  • The desired function is fed through the Radon transform to generate weights at processing block 1304. These weights are then used to train a multilayer perceptron of the neural network, as seen at processing block 1305.
  • The constructive proof is used to program neural networks for more than simple model problems.
  • New neural networks can be created which can model arbitrary functions with simple inversion formulas, making their programming easier.
  • This method allows single-pass training of a neural network once the geometry of the training network is specified. It also allows the interpolation of neurons in the hidden layer to add specificity. This is not currently done with backpropagation. In addition, it allows simplification of neural network functionality by analytic techniques from geometry and combinatorics.
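A hypothetical sketch of the single-pass idea follows. The toy row-sum "Radon transform" is illustrative only; a real network would use line segments matching the chosen geometry.

```python
import numpy as np

def single_pass_train(desired_f, radon):
    """Feed the desired function through the Radon transform; the transform
    values become the MLP's hidden-layer weights in one pass, no iteration."""
    return radon(desired_f)

toy_radon = lambda f: f.sum(axis=1)        # one "line" per row of the grid
weights = single_pass_train(np.random.rand(8, 8), toy_radon)
# Adding a hidden neuron = adding one more line/segment and computing its single
# weight, without retraining the rest -- unlike backpropagation.
print(weights.shape)
```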
  • Figure 14 is a block diagram of one embodiment of system 1400 for using a neural network to train a neural network.
  • System 1400 includes a model generator 1410 coupled to a decision module 1420.
  • The model generator 1410 creates a model for a desired function as a multi-dimensional function.
  • The decision module 1420 determines whether the created model fits a simple geometry model. There is always a geometry that will fit a particular application. If that geometry is better expressed as being of higher dimension than 2, then the model will use multiple stages.

Abstract

A method and apparatus of training a neural network. The method and apparatus include creating a model for a desired function as a multi-dimensional function (1301), determining if the created model fits a simple finite geometry model (1302), and generating a Radon transform to fit the simple finite geometry model (1303). The desired function is fed through the Radon transform to generate weights (1304). A multilayer perceptron of the neural network is trained using the weights (1305).

Description

A METHOD AND APPARATUS OF USING A NEURAL NETWORK TO TRAIN A NEURAL NETWORK
FIELD OF THE INVENTION
The present invention relates generally to image compression. More particularly, the present invention relates to training a neural network.
BACKGROUND OF THE INVENTION
Wavelet transforms are widely used in analysis, where they are known as "multiresolution analysis", and in image and audio compression, where they are used as a pyramid coding method for lossy compression. The wavelets used are generally from a very small set of analytically designed wavelets, such as Daubechies wavelets, or quadrature mirror filters ("QMF"). For some applications, designing specific wavelets with special coding properties would be beneficial.
Presently, there are no methods to construct a neural network which performs the finite or discrete Radon transform on the desired geometry so as to satisfy the connectedness of the desired neural network.
SUMMARY OF THE INVENTION
A method and apparatus of training a neural network are described. The method and apparatus include creating a model for a desired function as a multi-dimensional function, determining if the created model fits a simple finite geometry model, and generating a Radon transform to fit the simple finite geometry model. The desired function is fed through the Radon transform to generate weights. A multilayer perceptron of the neural network is trained using the weights.
BRIEF DESCRIPTION OF THE DRAWINGS
Features and advantages of the present invention will be apparent to one skilled in the art in light of the following detailed description, in which:
Figure 1 is a diagram of one embodiment of a multilayer perceptron;
Figures 2a and 2b are illustrations of a unit square and a torus;
Figure 3a illustrates one embodiment of geodesics on a sphere;
Figure 3b is an illustration of a leaf showing one embodiment of the overlapping segments of the geodesic of a half-sphere;
Figure 4 is an illustration of one embodiment of the mapping of half-sphere geodesics to a plane in a continuum;
Figure 5 is an illustration of one embodiment of building dimension;
Figure 6 is a block diagram of one embodiment of a computer system;
Figure 7 is a flow diagram of one embodiment of a method of designing a set of wavelet bases;
Figure 8 is a block diagram of one embodiment of a system for designing a set of wavelet bases;
Figure 9 is a flow diagram of one embodiment of a method of compressing images;
Figure 10 is a block diagram of one embodiment of a system for compressing images;
Figure 11 is a flow diagram of one embodiment of a method of reconstructing audio/video/image data from higher moment data;
Figure 12 is a block diagram of one embodiment of a system for reconstructing audio/video/image data from higher moment data;
Figure 13 is a flow diagram of one embodiment of a method of using a neural network to train a neural network; and
Figure 14 is a block diagram of one embodiment of a system for using a neural network to train a neural network.
DETAILED DESCRIPTION
A method and an apparatus of creating a wavelet basis are described. In the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
Some portions of the detailed descriptions that follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise, as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The present invention also relates to apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description below. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein.
Wavelet transforms convert a signal into a series of wavelets. Wavelets are convenient for data transformation because they are finite in nature and contain frequency information. Since most actual waves have a finite duration and their frequencies change abruptly, wavelet transforms are better approximations for actual waveforms than other transforms, such as the Fourier transform. Signals processed by wavelet transforms are stored more efficiently than those processed by Fourier transforms.
Imagery may be created by inverting one or more layers of function in neural networks. Such a reversal of visual system processing may take place in stages or all at once. Finite Radon transforms on certain geometries are used to accomplish the reversal of the visual system processing. A dual system is created to certain feed-forward network models of visual processing, and its application to visual processing and to non-image processing applications is shown.
For purposes of explanation, in the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention can be practiced without these details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the present invention.
In image signal processing, an underlying assumption for any model of the visual system is that it must deal with recall of imagery as well as comparison, classification and identification. Images are recalled in some form such as mental imagery, dream sequences, and other forms of recall, more or less vivid, depending on the individual. A basic postulate of visual imagery is that imagery comes from the creation, or recreation, of visual signals in the areas of the brain which process incoming images.
One approach to modeling visual systems is to assume that the processes of the visual system would have to be inverted in order to produce imagery within the visual system as a form of recall. Both the inversion process and the estimation of the visual system may be examined by looking at the inversion of the Radon transform. This is because the forward transformations, which in many cases occur in the visual system, resemble Radon transforms. Thus, the process of extracting information from the visual stream is modeled using the Radon transform and its dual backprojection. In Radon transforms, instead of assigning a value to each point on a plane, each line on a plane is assigned a value by adding up the values of all the points along the line (i.e., taking the integral of the points along the line). To obtain the backprojection of the Radon transform, the value of a particular point on a plane is calculated using the values of all the lines that go through the particular point.
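To make the line-sum picture concrete, here is a minimal numpy sketch (not from the patent; the parameterization of lines in $Z_p^2$ by slope and intercept, and the prime $p = 7$, are assumptions) of a finite Radon transform and its backprojection:

```python
import numpy as np

p = 7  # a prime; Z_p^2 then carries p + 1 line directions with p parallel lines each

def radon_zp2(f):
    """Finite Radon transform: assign each line the sum of f over its points."""
    Rf = np.zeros((p + 1, p))              # Rf[direction, intercept]
    for b in range(p):
        for m in range(p):                 # lines y = m*x + b (mod p)
            Rf[m, b] = sum(f[x, (m * x + b) % p] for x in range(p))
        Rf[p, b] = f[b, :].sum()           # vertical lines x = b
    return Rf

def backproject(Rf):
    """Backprojection: assign each point the sum of Rf over the lines through it."""
    g = np.zeros((p, p))
    for x in range(p):
        for y in range(p):
            g[x, y] = sum(Rf[m, (y - m * x) % p] for m in range(p)) + Rf[p, x]
    return g

f = np.random.rand(p, p)
g = backproject(radon_zp2(f))
# Backprojection is a *blurred* copy of f: every other point shares exactly one
# line with (x, y), so here R*R f = p*f + sum(f).
print(np.allclose(g, p * f + f.sum()))     # True
```

The constant `f.sum()` term is exactly the blur that the average-subtraction in the equations below removes.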
A neural network ("neural net") is a collection of nodes and weighted connections between the nodes. The nodes are configured in layers. At each node, all of the inputs into the node are summed, a non-linear function is applied to the sum of the inputs, and the result is transmitted to the next layer of the neural network. A neural network may be used to build a Radon transform.
Multilayer perceptrons ("MLP") are a frequent tool in artificial neural networks. MLPs have been used to model some brain functions. It has been shown that an MLP with one hidden layer may be used to model any continuous function. Thus, given a layer that models a function of sufficient dimension, the ability to form Radon transform inverses implies the ability to form all continuous functions.
An MLP with one hidden layer is capable of representing a continuous function. It can be shown that the function it represents is the function that results from backprojecting whatever function is represented at the hidden layer. In order to build a neural network (MLP) for a particular function, a training method (usually backpropagation) is used to set the weights at the hidden layer. If the function is discrete, then it should be set to the Radon transform of the desired function, with a sharpening filter imposed to get rid of the blurring from the average. If there is no average, then there is no blurring and no sharpening is needed.
In the context of human vision, input is put through a different kind of neural network, particularly one that performs a finite or discrete Radon transform. If this network is set to create the Radon transform of the desired function, then it can be used to set the weights needed by the MLP. So this neural network (afferent on the hidden layer of the MLP) trains the MLP. This is quicker than backpropagation, and unlike traditional techniques such as backpropagation, it allows the calculation of additional weights to add neurons to the MLP hidden layer.
Figure 1 is a diagram of a multilayer perceptron. Multilayer perceptron 100 includes an input 101, a hidden layer 102, an afferent link to a dual network 103, and an output 104. An input 101 is received by MLP 100 and processed through hidden layer 102. Hidden layer 102 includes nodes 102a-f. The nodes 102a-f are shown for illustration purposes; a hidden layer 102 may include greater or fewer nodes depending on the design of the MLP.
Neural networks of arbitrary complexity may be constructed using discrete and finite Radon transforms. A discrete and finite Radon transform involves taking the values of line segments instead of lines. Thus, the values of all the line segments on a plane are taken for the discrete Radon transform, and the backprojection of the Radon transform is accomplished using line segments through a particular point.
Generally, a backprojection is not the inverse of a Radon transform because there is some blurring. Thus, typically a filter is used to make the inverse sharper. However, if the function is transferred to a new function on points so that the backprojection of a Radon transform is the Radon transform's inverse, there is no blurring. The transformation of the function that causes the backprojection to be the inverse is a wavelet transformation because it satisfies "the wavelet condition" (that the average value of the function is zero).
The central equation for constructing the neural networks, the Gindikin or Bolker equation, involves backprojecting the Radon transform and subtracting a global (to the point in question) average function. The nature of the average function to be subtracted is dependent on the transform geometry, and can be varied by varying the interconnect structure of the neural network.
The transform is dual to the network. Thus, the transform may be weighted to a desired template function. Hidden layer 102 is represented as a Radon backprojection. Thus, input 101 is the stored sum of the values of line segments going through the point. At hidden layer 102, a function representing a Radon transform is performed on the input 101.
Thus, if the input 101 is represented by $x$, the output 104 is represented by $o = \sigma\left(\sum_i x_i w_i\right)$, where $\sigma$ is the Radon transform. As illustrated, hidden layer 102 receives input 101 and afferent inputs 103a-f. Afferent inputs 103a-f being transmitted to hidden layer 102 represent the back propagation of the MLP 100. Thus, if MLP 100 represents a Radon transform, afferent inputs 103a-f are the inversions of the Radon transform. The back propagation is used to adjust the weights of the function at hidden layer 102 so that the inversion 103a-f is equal to the input 101.
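Read as a computation, the hidden-layer formula above is the usual MLP forward pass. A minimal sketch, assuming `tanh` as the non-linear function and the six hidden nodes 102a-f of Figure 1 (the sizes are arbitrary):

```python
import numpy as np

def mlp_forward(x, W_hidden, w_out, sigma=np.tanh):
    """Each hidden node sums its weighted inputs and applies a non-linearity;
    the hidden results are then summed with output weights."""
    h = sigma(W_hidden @ x)        # o_j = sigma(sum_i x_i * w_ij) at each node
    return w_out @ h               # weighted sum producing output 104

rng = np.random.default_rng(0)
x = rng.normal(size=16)                    # input 101
W_hidden = rng.normal(size=(6, 16))        # weights into nodes 102a-f
w_out = rng.normal(size=6)
print(mlp_forward(x, W_hidden, w_out))
```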
The sum of the inputs received at each of nodes 102a-f is processed by applying a function, such as a Radon transform backprojection.
The afferent inputs are received through afferent link 103 to a dual network (not shown). The afferent inputs are inversions of the Radon transforms. The results of hidden layer 102 processing are summed using a weighting to produce output 104.
After the network is prepared, the wavelet prototype is fed through the network and its back propagation. The wavelet prototype is generally a function which is close to the desired shape, if that is known, although it is arbitrary.
The output is then used to modify the input function by subtracting the output from the input function to obtain a difference, and moving the input function in the opposite direction from the difference. The process converges to zero difference between the input and the output, which satisfies the wavelet condition. The resulting function is then a "mother wavelet" from which a wavelet basis local to that point may be formed.
In constructing discrete Radon transform inverses, the inverse process on different geometries and for different structures is examined. One inverse, based on cosets of $Z_p^2$, has the form

$$f(x) = a(f) + \sum_{g \in G_x} \left( \hat{f}(g) - a(f) \right) \qquad \text{(Equation 1)}$$

where $Z_p$ is the ring with $p$ elements, with addition being addition modulo $p$, and multiplication likewise. This is standard notation. The superscript 2 indicates that this is the ring of ordered pairs of two members of this ring, with addition and multiplication done componentwise. It is the ring of pairs $(a, b)$ where $a$ and $b$ are in the ring $Z_p$. This is known to one skilled in the art. In Equation 1, the sum ($\sum$) is taken over the incidence set $G_x$ of lines (the group) which intersect $x$, and the average, represented by $a(f)$, is taken over the whole group. See F. Matus and J. Flusser, "Image Representations via a Finite Radon Transform," IEEE Trans. PAMI, v. 15, no. 10, 1993, pp. 996-1006. The form implies that, for a function with zero average, the backprojection is the inverse. If the cosets of $Z_p^2$ are plotted, the plot is essentially a discretization of the closed geodesics on a torus.
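Continuing the numpy sketch above, the zero-average claim of Equation 1 can be checked directly (the 1/p normalization is an artifact of summing rather than averaging over lines):

```python
f0 = f - f.mean()                      # impose the wavelet condition: zero average
g0 = backproject(radon_zp2(f0))
print(np.allclose(g0 / p, f0))         # backprojection now inverts (up to scale p)
```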
Figures 2a and 2b are illustrations of a unit square and a torus. Unit square 200 includes sides 205a-d. As seen in Figure 2b, torus 210 is formed by joining opposite sides of a unit square 200, such as, for example, sides 205a and 205c. This operation is an isometry, so that the size and shape of a volume element does not change from that for $R^2$. Consequently, the geodesics on the torus map to straight lines in $R^2$, and they pack tightly, forming uniform approximations to the averages on that surface. For example, geodesics 201b and 202b on torus 210 map to straight lines 201a and 202a on unit square 200.
While the same is not true for a half-sphere, an interesting version of the inversion formula may be formed for the half-sphere which will lead to finding a reasonable formula for the human visual system.
Figure 3a illustrates one embodurient of geodesies on a sphei e Spheie 300 mcludes geodesies 30 lα and 301 b, for example On the sphere, geodesies are "great circles", meaiung that foi S", any copy of S" ' shares the same center and radius as the sphere itself. . n antipodal map, which takes x to its opposmg pomt on the other side of the sphere, may be denoted by A(x). However, an mvertable transform on a sphere, usmg integration along geo Iesics, can not be obtamed because the geodesies through pomt x are identical to those through A(x)
If the transform is restricted to functions tor which / (Λ) =/ (A(x)), then the transform is essentiaUy restricted to a half-sphere transform. By equatuig the pomts x and A(x), another essentially property is found. Given two pomts on the half-sphere, there is exactly one geodesic which passes through them.
An inversion over geodesics on the sphere may be obtained as follows. Assuming that for each point $x_0$ a half-sphere may be used, k+1 geodesics 301a,b through $x_0$ are chosen, each divided into k sections, including sections 302a,b, for example. On each section 302a,b of geodesic $g_i$, a representative $x_{ij}$ is chosen. A discrete Radon transform is calculated by taking the average over the k sections using the formula

$\hat{f}(g_i) = \frac{1}{k} \sum_{j=1}^{k} f(x_{ij})$   (Equation 2).

To keep notation simple, by rearranging the indices for example, the sample on each geodesic containing $x_0$ is indexed $x_{i1}$, and the sample picked for this segment on all geodesics is $x_0$ itself. Constraints on segmenting the geodesics ensure that this is reasonable as k gets large.
The backprojection at point $x_0$ is defined to be the sum of the values on the geodesics through $x_0$,

$Sf(x_0) = \sum_i \hat{f}(g_i)$   (Equation 3)

The sum may be rearranged to be

$Sf(x_0) = \sum_{i=0}^{k} \frac{1}{k} \sum_{j=1}^{k} f(x_{ij}) = \frac{1}{k}\Big(\sum_{i=0}^{k} \sum_{j=2}^{k} f(x_{ij}) + f(x_0)\Big) + \frac{1}{k}\sum_{i=1}^{k} f(x_0)$   (Equation 4)

In Equation 4, the first term in the expression contains one copy of each of the L samples taken. Denoting the average value over all the samples as $\bar{f}$, since $x_0$ is chosen to be the sample for each segment in which it falls, the equation becomes

$f(x_0) = \bar{f} + \sum_i (\hat{f}(g_i) - \bar{f})$   (Equation 5)
With some adjustments in the samples taken, as the size of the sample set grows, $\bar{f}$ approaches the average value of the function over the half-sphere and $\hat{f}(g_i)$ approaches the usual definition of the Radon transform. Matus and Flusser found the same expression in the case of the group $Z_p^2$, where their analysis performs the double fibration alluded to by Bolker. See E. D. Bolker, "The Finite Radon Transform," Integral Geometry, AMS Contemporary Mathematics, vol. 63, 1984, pp. 27-50; F. Matus and J. Flusser, "Image Representations via a Finite Radon Transform," IEEE Trans. PAMI, vol. 15, no. 10, 1993, pp. 996-1006.

Equation 5 is a limit value for the formula given by Bolker in the case of sets in which there are certain block design constraints. The constraints are satisfied above by noting that, given two points on the half-sphere, there is exactly one geodesic passing through them, and by the use of the index L, guaranteeing that there are equal numbers of geodesics through each point in the discretization formula. Specifically, using Bolker's notation, $\alpha = k + 1$ and $\beta = 1$, so that the formula reads

$f(x) = \frac{1}{\alpha - \beta} SRf(x) - \frac{\beta L}{\alpha - \beta} u(f) = \frac{1}{k} SRf(x) - \frac{L}{k} u(f)$   (Equation 6)

In Vvedenskaya and Gindikin's formula, the term $\beta$ does not show specifically because it is arranged by geometry. See N. D. Vvedenskaya and S. G. Gindikin, "Discrete Radon Transform and Image Reconstruction," Mathematical Problems in Tomography, AMS, 1990, pp. 141-188. The term $\beta$ does allow, however, the building of interesting biologically plausible structures.
In order to create a scenario resembling the finite transforms encountered in brain processing, a set of discrete transforms needs to be woven together into a sheet. This is done by using the formula for the half-sphere (Equation 5) and acknowledging the finiteness of each geodesic set.

First, segments of geodesics are taken on the half-sphere. If a pattern of finite segments is allowed, then one possible arrangement is to allow that each segment is incident on points only along the geodesic on which it lies, that each segment is incident on the same number of points, and that there is a segment centered at each sample point.

If the number of samples in each segment is k, and there is a sample centered at each $x_{ij}$, then there are k segments incident on the trial sample point $x_0$. These k segments comprise $k^2$ samples, counting repetition, so that an "average" over these segments would require a factor of $1/k^2$. The rest of the analysis proceeds as with the half-sphere analysis, except that there is a different average value calculation and a different wavelet condition. The average is replaced with a weighted average.

Each local set along a geodesic on the half-sphere will be referred to as a segment, and it will be assumed that each segment contains k samples. Furthermore, it is assumed that the segments are centered at samples spaced one sample apart, so that along a given geodesic, the segment centered at $x_j$ contains $x_0$ if, and only if, the distance between the two samples is less than $(k+1)/2$.

For each distance $0 < d < (k+1)/2$, there are two segments whose centers are this far from $x_0$, so that there are a total of k-1 segments which overlap $x_0$ but are not centered there. Thus, there are k segments which contain $x_0$ along any geodesic. Because each segment contains k samples, there are a total of $k^2$ values summed by summing up the segments along one geodesic overlapping $x_0$. Each set of segments along one geodesic covering the point $x_0$ will be referred to as a "leaf".
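The multiplicity count behind the leaf construction can be verified with a short sketch. The following illustrative Python check (sample positions indexed by signed distance from $x_0$, with k odd so the segment centers line up symmetrically) tallies how many of the k segments containing $x_0$ cover each sample along one geodesic:

```python
k = 7                                          # samples per segment (odd, for symmetric centers)
half = (k - 1) // 2
centers = range(-half, half + 1)               # centers within (k+1)/2 of x0 (at position 0)
weight = {}
for c in centers:                              # segment centered at c covers c-half .. c+half
    for s in range(c - half, c + half + 1):
        weight[s] = weight.get(s, 0) + 1

assert len(centers) == k                       # k segments contain x0 along this geodesic
assert weight[0] == k                          # x0 itself lies in all k of them
assert sum(weight.values()) == k * k           # k^2 values are summed over the leaf
assert all(weight[d] == k - abs(d) for d in weight)  # multiplicity k - d at distance d
```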
Figure 3b is an illustration of a leaf showing one embodiment of the overlapping segments of the geodesic of a half-sphere. Point 310 represents a point $x_0$ on a half-sphere. Segments 312, 314, 316, 318 and 320 overlap along a geodesic covering point $x_0$. Segments 312-320 form a leaf.
Proceeding with the previous construction, without making adjustments for the number of overlaps or samples, the Radon transform for a segment of length k centered at sample $x_0$ will be defined as

$\hat{f}(g_i) = \sum_j f(x_{ij})$   (Equation 7)

and the backprojection or adjoint transform will be defined

$Sf(x_0) = \sum_i \hat{f}(g_i)$   (Equation 8)

to be the sum over the set of all segments of all leaves at $x_0$. Written in terms of samples, for each leaf, the sum is

$S_\ell f(g_i) = \sum_j (k - d_j) f(x_j)$   (Equation 9)

where $d_j$ is the distance from $x_0$ to the sample $x_j$. As before, assuming k+1 geodesics on the half-sphere intersecting $x_0$, the equation becomes

$Sf(x_0) = \sum_{i=0}^{k} \sum_j (k - d_j) f(x_{ij})$   (Equation 10).
The sum, as before, is manipulated to expose the inverse formula:

$Sf(x_0) = \sum_{i=0}^{k} \sum_j (k - d_j) f(x_{ij}) = \Big(\sum_{i=0}^{k} \sum_{j \neq 0} (k - d_j) f(x_{ij}) + k f(x_0)\Big) + k^2 f(x_0)$   (Equation 11).

The term inside the parentheses in Equation 11 has $(k+1)(k^2 - k) + k = k^3$ samples, indicating that if the Radon transform were defined with a factor accounting for the $k^2$ samples occurring on each leaf, and the average were defined to be the sum on the right, with a weighting factor of $1/k^3$, to account for the samples on each leaf, the inverse formula would be

$f(x_0) = u(f) + \sum_{i=0}^{k} (\hat{f}(g_i) - u(f))$   (Equation 12)

The weighted average u(f) needs to be expressed as a function of the Radon transform of f, not f itself. See Bolker. If the incidence structure of the points and segments is uniform, this is no problem, because then every point ends up with k segments incident on it, and the weighting formula may be defined on the Radon transform segments by defining a distance d between the segment and $x_0$ to be the distance from $x_0$ to the center of the segment.

For the spherical model, this leads to an average over all segments, weighted by distance and divided by a factor of $k^3$ for the overlap. The same exercise may be done using different packing formulas, which amount to specifying the connectivity between points in the model of the visual system.
Figure 4 is an illustration of one embodiment of the mapping of half-sphere geodesics to a plane in a continuum. Half-sphere 400 has geodesics 401a,b,c. The geodesics 401a,b,c are mapped 411a,b,c to plane 420 including a grid 430. The shape shown for the grid 430 is a function of the kind of connectivity the continuum has.

By using orthographic projection, a domain of points is obtained at which oriented filters are represented by Radon set transforms. Consequently, the form for an inversion of a uniformly packed domain of finite segment Radon transforms has been found. As with the other cases examined, if a functional input exists which adheres to the wavelet condition, modified, in this case, to accommodate the weighted average rather than a uniform measure, the inverse can be obtained directly from the backprojection.

Partial Backprojections
A specific example will be used to illustrate the use of partial backprojections. On the surface of a half-sphere with the same geodesics through point $x_0$, a large number of objects are desired to be formed by taking pairs of geodesics through point $x_0$. In neural terms, correlations are forming, specifically junctions and end-stopping cells of a particular variety.

The correlations may be made more like end-stopping cells by taking half arcs joined at the point $x_0$. Since the two cases are conceptually identical, the latter formulation will be taken. The correlation may be built from the structures generated by a grid of half-spheres. The constructs are parameterized as follows. At each point x, sets are parameterized to be $g(\vartheta, \varphi, x)$, where $\vartheta$ is the angle of the first half-geodesic, and $\varphi$ is the angle from the first to the second. The Radon transform from the set of points to the set of $g(\vartheta, \varphi, x)$ may be denoted by

$\hat{f}(g) = \sum_j f(x_{\vartheta j}) + f(x_0) + \sum_j f(x_{\varphi j})$   (Equation 13)

which is nothing more than the sum up one arc to $x_0$, plus the sum up the other arc.

Movement between representations is possible, using the formulation given for the discrete Radon transform above, by noting that if two such structures are taken, with $\vartheta_2 = \vartheta_1 + \pi/2$, and summed, another Radon transform may be defined on pairs of geodesics. This duplicates the value $f(x_0)$.
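A rough flat-grid sketch of Equation 13 may make the construction concrete. The code below is an illustrative assumption, not the patent's half-sphere construction: half arcs are walked outward from $x_0$ on a raster grid, the set transform sums one half arc, the center value, and the second half arc, and taking $\vartheta_2 = \vartheta_1 + \pi/2$ with $\varphi = \pi$ gives a pair of full geodesics whose sum duplicates $f(x_0)$.

```python
import numpy as np

def half_arc_sum(f, x0, angle, n_steps):
    # Sum f along a half arc leaving x0 at the given angle (x0 excluded).
    total = 0.0
    r, c = x0
    for step in range(1, n_steps + 1):
        rr = int(round(r + step * np.sin(angle)))
        cc = int(round(c + step * np.cos(angle)))
        if 0 <= rr < f.shape[0] and 0 <= cc < f.shape[1]:
            total += f[rr, cc]
    return total

def set_transform(f, x0, theta, phi, n_steps=4):
    # Equation 13: one half arc, plus f(x0), plus the second half arc.
    return (half_arc_sum(f, x0, theta, n_steps)
            + f[x0]
            + half_arc_sum(f, x0, theta + phi, n_steps))

f = np.arange(64, dtype=float).reshape(8, 8)
x0 = (4, 4)
v1 = set_transform(f, x0, 0.0, np.pi)        # phi = pi gives a full line through x0
v2 = set_transform(f, x0, np.pi / 2, np.pi)  # perpendicular line, theta2 = theta1 + pi/2
pair = v1 + v2                               # a transform on a pair of lines; f[x0] is duplicated
```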
This representation is correct in the following way. A set of geodesics through points is assumed at the start. The values on these geodesics are given by the Radon transform in the usual sense. If sets of these structures, characterized by fixed angle $\vartheta$, are added, a different average value formula is obtained, but the backprojection is of the same general form. Consequently, the result of the transformation may be inverted in a single step.

Generalizing this model, sets of geodesics may be taken from the set of leaves in $G_{x_0}$, the set of all segments intersecting $x_0$. Because any set of these segments contains copies of $x_0$, and because, by rotational symmetry, all rotations of such sets may be taken as a basis at each point, the same construct may be generated in forming an inverse. Such constructs are referred to as "partial backprojections".

Partial backprojections are important for two reasons. The first reason is that there are important examples of these sets that correspond to human visual system constructs. The set just mentioned, for example, corresponds well to the angular cells among the hypercomplex cells - they respond to angles.

Thus, it is shown that, with some adjustments for a thresholding process that occurs in forming such unions (e.g., throwing out high frequency wavelets), the output of such cells is reversible, and can be reversed in one backprojection step. This is an interesting point since feedback afferent on the early stages of the human visual system comes from stages that are separated by more than one step.

Also, for the visual system, in a space in which the input functions do not have an average value equal to zero, the entire algorithmic formula comes into play. Supposing a localized Radon transform of a color space, when inverting the color space, the backprojection may be adjusted or the backprojection may be filtered to render no dispersion in the point spread function. The net effect is that edge information has been extracted at the expense of level set information, and the level set information has been replaced with a new value. This is identical to a gray world assumption in the retinex or similar algorithms.

The second reason that this transformation is important is because, being a grouping of the elements of the Radon transform (i.e., lines) into sets in an incidence structure, it represents another geometry for the Radon transform, which may be defined as the sum of the line values in the sets. This is just the definition given to the partial backprojection. Consequently, the sets that define a partial backprojection have been used to form a new Radon transform. Bolker has shown that if these sets are spreads, then the transform so generated will be constant on the range of the Radon transform of the points. Bolker uses this in local form to build a cohomology-like sequence of transforms.

There is nothing preventing the taking of arbitrary sets of geodesics except tractability, however, and the one chosen is particularly practical. Because the sets chosen give response to a correlation of multiple (e.g., two in this example) orientations, they are defined by a pair of lines, and therefore have the dimension of a plane.

Figure 5 is an illustration of one embodiment of building dimension. Two points 501, 502 form a line 511. Two lines 511, 512 form a plane 521. Two planes 521, 522 form a volume 531.

Evidently, this transformation has resulted in increasing the dimension by one. This is evident from the fact that two angles and a two-dimensional position must be specified for each new segment set.

It may be noted that none of the derivation done for samples is affected by what dimension of sphere is being worked on, although one could add a factor of k for each dimension to satisfy intuition (the geodesics on $S^n$ are copies of $S^{n-1}$ sharing the property of common center with $S^n$). Consequently, Radon transform sequences may be built which build geometries of arbitrary dimension in this fashion.
Figure 6 shows a diagrammatic representation of a machine in the exemplary form of a computer system 600 within which a set of instructions, for causing the machine to perform any one of the methodologies discussed above, may be executed. In alternative embodiments, the machine may comprise a network router, a network switch, a network bridge, a Personal Digital Assistant (PDA), a cellular telephone, a web appliance or any machine capable of executing a sequence of instructions that specify actions to be taken by that machine.

The computer system 600 includes a processor 602, a main memory 604 and a static memory 606, which communicate with each other via a bus 608. The computer system 600 may further include a video display unit 610 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 600 also includes an alphanumeric input device 612 (e.g., a keyboard), a cursor control device 614 (e.g., a mouse), a disk drive unit 616, a signal generation device 620 (e.g., a speaker) and a network interface device 622.

The disk drive unit 616 includes a computer-readable medium 624 on which is stored a set of instructions (i.e., software) 626 embodying any one, or all, of the methodologies described above. The software 626 is also shown to reside, completely or at least partially, within the main memory 604 and/or within the processor 602. The software 626 may further be transmitted or received via the network interface device 622. For the purposes of this specification, the term "computer-readable medium" shall be taken to include any medium that is capable of storing or encoding a sequence of instructions for execution by the computer and that causes the computer to perform any one of the methodologies of the present invention. The term "computer-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic disks, and carrier wave signals.
Figure 7 is a flow diagram of one embodiment of a method of designing a set of wavelet bases to fit a particular problem.

At processing block 701, a neural network of arbitrary complexity is constructed using a discrete and finite Radon transform. The central equation for doing the Radon transform may include the Gindikin equation or the Bolker equation referenced above.

Construction of the neural network will include back projecting the Radon transform to a point and subtracting a global average function of the point. The global average function is dependent on the transform geometry and may be varied by varying the interconnect structure of the neural network, as described above.

The Radon transform may be weighted to a desired template function. The neural network may be built to have a particular geometry so that, given a particular point, the size and shape of the line segments chosen for the Radon transform and its back projection form the particular geometry. The geometry may be any geometry such as, for example, spherical or hyperbolic, etc.

The Radon transform is dual to the network because the neural network performs the Radon transform and inverts the Radon transform.

At processing block 702, an input wavelet prototype designed to fit a particular problem is fed through the neural network and its back propagation to produce an output. The wavelet prototype may be a function which is close to the desired shape, if that is known. The wavelet is used to train the neural network to be specific to a certain set of images.

At processing block 703, the input function of the neural network is modified using the output. The input function is modified by subtracting the difference between the output of the neural net, which is the inverse of the Radon transform, and the input of the neural network, which is the original data, such as, for example, an image. The difference between the output and the input is used by the neural network to modify the input function by moving the input function in the opposite direction from the difference between the output and the input. This is then a "mother wavelet" from which a wavelet basis local to that point may be formed.
This process converges to zero difference between input and output, which satisfies the wavelet condition. Thus, the neural network will produce wavelets that are capable of processing the images with little loss. The training of the neural network may continue until the difference between the output and the input reaches a predetermined value, which may be an error value for the neural network. Once the predetermined value is reached, the training will cease so that the neural network is not overtrained. This method of constructing wavelets is optimized for massively parallel image processing and distribution. It optimizes around the image or template being processed, and does not require that the exact characteristics of the template be known analytically. The method of constructing wavelets also works for any dimension, and can work on data that comes from experiment, when a template is not known, by using the template as a block design.
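The training iteration described above can be sketched with a toy linear stand-in for the network. In the code below, everything is an illustrative assumption: R is a random incidence matrix, R*R is normalized so that a fixed point exists, and the step size and stopping threshold play the role of the predetermined error value that prevents overtraining.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 32
R = (rng.random((n, n)) < 0.2).astype(float)   # hypothetical Radon incidence matrix
A = R.T @ R
A = A / np.linalg.norm(A, 2)                   # normalize so R*R has a fixed point

f = rng.standard_normal(n)                     # wavelet prototype (arbitrary start)
step, tol = 0.5, 1e-5                          # illustrative step size and error value
for _ in range(5000):
    output = A @ f                             # network plus its back propagation (R*R)
    diff = f - output                          # subtract the output from the input
    if np.linalg.norm(diff) < tol:             # stop here so the network is not overtrained
        break
    f = f - step * diff                        # move opposite the difference
# f now approximates a fixed point of R*R: a candidate "mother wavelet".
```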
The method is adaptive in parallel, and could be used to generate wavelet bases tuned to very specific templates, such as, for example, to measure differences. The method also allows wavelets to be built for image analysis functions specifically, and allows "picture-centric" wavelet bases. Picture-centric wavelet bases include wavelet bases that are specific to a certain type of image. For example, the wavelet bases may be constructed for images of houses, which have a large number of parallel and horizontal lines. The wavelet basis may also be constructed to be an edge detector, as described above.

The method of constructing wavelets generalizes to many dimensions, and may be used to compress multi-dimensional data. The method, with another dimension, may be appropriate to spatio-temporal data such as, for example, video. The method of constructing wavelets models the human visual system, and could be important to computer vision tasks.
Figure 8 is a block diagram of one embodiment of system 800 for designing a set of wavelet bases. System 800 includes a designing module 810 coupled to a feeder 820. The designing module 810 designs an input wavelet prototype to fit a particular problem, and this input wavelet prototype is fed through a neural network 840 and its backpropagation to produce an output. The wavelet prototype may be a function which is close to the desired shape, if that is known. The wavelet is used to train the neural network 840 to be specific to a certain set of images.

The neural network 840 is of arbitrary complexity and is constructed by a neural network constructor 830 using a discrete and finite Radon transform. The central equation for doing the Radon transform may include the Gindikin equation or the Bolker equation as referred to above.

Construction of the neural network 840 will include back projecting the Radon transform to a point and subtracting a global average function of the point. The global average function is dependent on the transform geometry and may be varied by varying the interconnect structure of the neural network 840, as described above.

The Radon transform may be weighted to a desired template function. The neural network 840 may be built to have a particular geometry so that, given a particular point, the size and shape of the line segments chosen for the Radon transform and its back projection form the particular geometry. The geometry may be any geometry such as, for example, spherical or hyperbolic, etc.

The Radon transform is dual to the network 840 because the neural network 840 performs the Radon transform and inverts the Radon transform.

The neural network 840 is also coupled to a modifier module 850 that modifies an input function of the neural network 840 using the output. The input function is modified by subtracting the difference between the output of the neural network 840, which is the inverse of the Radon transform, and the input of the neural network 840, which is the original data, such as, for example, an image. The difference between the output and the input is used by the neural network 840 to modify the input function by moving the input function in the opposite direction from the difference between the output and the input. This is then a "mother wavelet" from which a wavelet basis local to that point may be formed.

This process converges to zero difference between input and output, which satisfies the wavelet condition. Thus, the neural network 840 will produce wavelets that are capable of processing the images with little loss. The training of the neural network 840 may continue until the difference between the output and the input reaches a predetermined value, which may be an error value for the neural network 840. Once the predetermined value is reached, the training will cease so that the neural network 840 is not overtrained.
Figure 9 is a flow diagram of one embodiment of a method of compressing images.

At processing block 901, a neural network having a specific geometry is constructed using a discrete and finite Radon transform. The construction of the neural network is based on an analysis of the geometry of the desired network. The specific geometry chosen may depend on the simplicity of the encoding, the simplicity of the decoding, the natural geometry suggested by the subject matter to be compressed, and/or the natural geometries suggested by the network architecture.

At processing block 902, the data to be compressed is fed through the network to produce a transform data stream. Data is passed through a neural network that produces the Radon transform of the data. Passing it through the MLP stage produces the backprojection of the Radon transform. If the Radon transform is designated R, and the backprojection is designated R*, then the whole system performs R*R on an input. Output is compared to input, and weights are set for the input stage such that output minus input equals zero. The resulting input stage is a wavelet transform. Data passed through the input process is wavelet transformed by this stage; that constitutes the "transformed stream". By training the input stage to result in no blurring from the neural net, the input stage becomes a wavelet transform. Passing data through this stage results in a transformed (by the wavelet transform) stream.

At processing block 903, the transform data stream is thresholded. Thresholding the data stream may include thresholding the data based on predetermined criteria. The predetermined criteria may include the quality of features to be preserved, such as, for example, outlines, or a criterion such as a desired compression ratio. The thresholding process may also include removing components of the data stream above a predetermined maximum frequency. Thus, frequencies that, for example, would normally not be seen by the human eye may be removed to reduce the amount of data to be compressed.

At processing block 904, a fixed input signal is fed back through the neural network to generate a decoding calculation of an average value. The average value will be used to invert the Radon transform to recover the transformed data. Referring back to Figure 1, the feedback connections eliminate the average, which causes blurring. This is a function of the geometry of the network. A signal may be input that is fixed and constant over the network inputs. This produces the blur part of the output. If the blur part of the output is fed back to the weights on the network, this can be used to tune the weights to make the output and input match, tuning the network.
At processing block 905, the thresholded data stream is entropy encoded to compress the data stream. The thresholded data stream may be divided into a plurality of data streams if compressed data is to be stored in a distributed mode. In alternative embodiments, the compressed stream may also be zero-tree encoded or bitplane encoded. This produces the compressed stream. Whether the transformed stream should be thresholded and/or zero-tree or bitplane encoded depends on the geometric design of the Radon transform. The inverse is the inverse of the entropy and bitplane encoding plus the neural net expressing R*R. To decompress, the entropy and bitplane or zero-tree encoding is inverted (standard) and the result is passed through R*R, which produces the original, decoded data.
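A schematic pipeline for blocks 902-905 might look like the sketch below. It is an illustration only: W stands in for the trained wavelet-transform input stage, the hard threshold stands in for block 903, and a simple run-length code stands in for the entropy, zero-tree, or bitplane coder of block 905.

```python
import numpy as np

def threshold(stream, cutoff):
    # Block 903: hard threshold; coefficients below the cutoff are zeroed.
    return np.where(np.abs(stream) >= cutoff, stream, 0.0)

def run_length_encode(stream):
    # Block 905 stand-in: (value, run) pairs; a real coder would be an
    # entropy, zero-tree, or bitplane encoder.
    encoded, prev, run = [], None, 0
    for v in stream:
        if prev is not None and v == prev:
            run += 1
        else:
            if prev is not None:
                encoded.append((prev, run))
            prev, run = v, 1
    if prev is not None:
        encoded.append((prev, run))
    return encoded

rng = np.random.default_rng(3)
W = rng.standard_normal((16, 16)) / 4.0   # stand-in for the trained wavelet input stage
data = rng.standard_normal(16)            # block 902: data to be compressed
transformed = W @ data                    # the "transformed stream"
compressed = run_length_encode(threshold(transformed, cutoff=0.5))
```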
In the method of compression described, the wavelet used to transform data is designed by the shape of the oriented filters and the geometry of the neural network. Thus, the wavelets may be generated to fit extraordinary forms of compression demands, or specific material.

The method of compression also provides a method of cleaning the data while compressing it. In one embodiment, this is accomplished by using threshold functions which are soft (i.e., graduated rather than binary) for compression geometries that have multiple resolutions.

Since the geometry of the input, the geometry of the output, the configuration of the oriented filters, and the dimension of the compression are explicit, one embodiment of the method of compression allows extra control over compression optimization. By using partial backprojections, this embodiment allows storage of the compressed data in a form which could be used for texture detection, and for some curvature and three-dimensional information, without decompressing. The partial backprojections may be done by the use of correlation, such as the correlation of neighboring data, and allow image compression which is compatible with feature detection and query by content.

The method of compression allows a very general, but very analytical, method for designing image compression. The method allows image compression which minimizes concentration of activity on a network, training of specialized wavelet compression methods for special data, and the creation of compression methods consistent with image querying.

Figure 10 is a block diagram of one embodiment of system 1000 for compressing images. System 1000 includes a data repository 1010 coupled to a feeder 1020. Data repository 1010 contains data to be fed through a neural network 1040 by the feeder 1020. The neural network 1040 has a specific geometry and is constructed by a neural network constructor 1030 by using a finite and discrete Radon transform.

The construction of the neural network 1040 is based on an analysis of the geometry of the desired network. The specific geometry chosen may depend on the simplicity of the encoding, the simplicity of the decoding, the natural geometry suggested by the subject matter to be compressed, and/or the natural geometries suggested by the network architecture.

The data is fed through the neural network 1040, and the neural network 1040 produces a transformed data stream. The transformed data stream moves through a thresholder 1050 which thresholds the data stream. Thresholding the data stream may include thresholding the data based on predetermined criteria. The predetermined criteria may include the quality of features to be preserved, such as, for example, outlines, or a criterion such as a desired compression ratio. The thresholding process may also include removing components of the data stream above a predetermined maximum frequency. Thus, frequencies that, for example, would normally not be seen by the human eye may be removed to reduce the amount of data to be compressed.

A fixed input signal feeder 1060 feeds a fixed input signal through the neural network 1040 to generate a decoding calculation of an average value. The average value will be used to invert the Radon transform to recover the transformed data. Referring back to Figure 1, the feedback connections eliminate the average, which causes blurring. This is a function only of the geometry of the network. A signal may be input that is fixed and constant over the network inputs. This produces the blur part of the output. If the blur part of the output is fed back to the weights on the network, this can be used to tune the weights to make the output and input match, tuning the network.

An entropy encoder 1070 is coupled to the thresholder 1050, and the entropy encoder 1070 encodes the thresholded data stream coming out of the thresholder 1050. This compresses the data stream. The thresholded data stream may be divided into a plurality of data streams if compressed data is to be stored in a distributed mode.
Figure 11 is a flow diagram of one embodiment of a method of reconstructing audio/video/image data from higher moment data.

At processing block 1101, a finite Radon transform is performed on higher moment data. At processing block 1102, an average function is generated to allow inversion of the Radon transform in one step. The average function may be calculated based only on the geometry and used for multiple reconstructions. At processing block 1103, the Radon transform at each point is correlated. When a Radon transform of a dimension one higher than the original transform is created by taking correlations at each point of the transformed data, a partial backprojection is produced.

At processing block 1104, a resultant set of duplications is calculated using the correlation process in order to generate a new average function. At block 1105, the sum is taken of the partial backprojections of the Radon transform at each point. The new average function for each point is subtracted from the sum of the partial backprojections at that point (block 1106). The inverse at each step is representative of the Gindikin formula.
In one embodiment, the general form for discrete Radon transforms is explicitly given in new cases, specifically for the case in which balanced resolved block designs are not present. The solution is not a relaxation method. The solution is consistent with moments generated in image analysis. Also, the solution takes geometry into account, significantly generalizing the moment method of describing image data.

The method disclosed, when executed in parallel, is potentially faster, since it requires only a single step. Also, the average function may be calculated based only on the geometry and used for multiple reconstructions. The solution can also model many different experimental designs and correlation statistics. In addition, the method can be trained for geometries with no closed form by backprojecting a constant function.

Figure 12 is a block diagram of one embodiment of system 1200 for reconstructing audio/video/image data from higher moment data. System 1200 includes a higher moment data repository 1210 coupled to a Radon transform module 1220. The higher moment data repository contains higher moment data that is used by the Radon transform module 1220 to perform a finite Radon transform 1230.

An average function generator 1240 generates an average function to allow inversion of the Radon transform 1230 in one step 1250. The average function may be calculated based only on the geometry and used for multiple reconstructions.

A correlation module 1260 correlates the Radon transform 1230 at each point. When a Radon transform 1230 of a dimension one higher than the original transform is created by taking correlations at each point of the transformed data, a partial backprojection is produced.

A calculator 1270 coupled to the correlation module 1260 calculates a resultant set of duplications using the correlation process to generate a new average function.

A summing module 1280 sums partial backprojections of the Radon transform 1230 at each point. A subtracting module 1290 is coupled to both the calculator 1270 and the summing module 1280. The subtracting module 1290 subtracts the new average function for each point from the sum of the partial backprojections at that point. The inverse at each step is representative of the Gindikin formula.

In one embodiment, the general form for discrete Radon transforms is explicitly given in new cases, specifically for the case in which balanced resolved block designs are not present. The solution is not a relaxation method. The solution is consistent with moments generated in image analysis. Also, the solution takes geometry into account, significantly generalizing the moment method of describing image data.
Figure 13 is a flow diagram of one embodiment of a method of using a neural network to train a neural network.

At processing block 1301, a model for a desired function is created as a multi-dimensional function. A decision is made as to whether to model it in a single stage or not. In one embodiment, at processing block 1302, to determine whether to model the function as a single stage or not, it is determined whether the created model fits a simple finite geometry model. There is always a geometry that will fit a particular application. If that geometry is better expressed as being of higher dimension than 2, then the model will use multiple stages.
At processing block 1303, a Radon transform is generated to fit the simple finite geometry model. The desired function is fed through the Radon transform to generate weights at processing block 1304. These weights are then used to train a multilayer perceptron of the neural network, as seen at processing block 1305. In this method, the constructive proof is used to program neural networks for more than simple model problems. Now, neural networks can be created which can model arbitrary functions with simple inversion formulas, making their programming easier.
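A heavily hedged sketch of blocks 1303-1305 follows. The incidence model, the mapping of projections onto hidden-layer weights, and the output layer are all illustrative assumptions rather than the patent's exact construction; the point is only that the weights come from a single pass of the desired function through a Radon transform, with no iterative backpropagation.

```python
import numpy as np

rng = np.random.default_rng(4)
n_points, n_lines = 25, 30
# Block 1303: a Radon transform fitted to an assumed finite line geometry.
lines = (rng.random((n_lines, n_points)) < 0.3).astype(float)

desired = rng.standard_normal(n_points)        # the desired function, sampled
projections = lines @ desired                  # block 1304: transform output used as weights

# Block 1305: a multilayer perceptron whose hidden layer is set from the projections.
hidden_w = lines * projections[:, None]        # each hidden unit scaled by its projection
output_w = np.ones(n_lines) / n_lines          # uniform readout, for illustration

def mlp(x):
    return output_w @ np.tanh(hidden_w @ x)
```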
This method allows single pass training of a neural network once the geometry of the training network is specified. It also allows the interpolation of neurons in the hidden layer to add specificity. This is not currently done with backpropagation. In addition, it allows simplification of neural network functionality by analytic techniques from geometry and combinatorics.

Furthermore, the present invention presents a new, possibly simpler way to program neural networks. This may allow more networks to be built with the same geometry in less time, by giving different input specifications to the training network. It also presents a way to add nodes to networks without rebuilding or retraining the network. Currently, if the size of a multilayer perceptron is misestimated, the process requires going through the entire training cycle again. With this method of training, only angular projections are added. These can be calculated to interpolate the existing neurons.

Figure 14 is a block diagram of one embodiment of system 1400 for using a neural network to train a neural network. System 1400 includes a model generator 1410 coupled to a decision module 1420. The model generator 1410 creates a model for a desired function as a multi-dimensional function. In order to determine whether to model the function as a single stage or not, the decision module 1420 determines if the created model fits a simple geometry model or not. There is always a geometry that will fit a particular application. If that geometry is better expressed as being of higher dimension than 2, then the model will use multiple stages.

A Radon transform generator 1430 generates a Radon transform 1450 to fit the simple geometry model, and a feeder 1440 feeds the desired function through the Radon transform 1450 to generate weights. A training module 1460 trains a multilayer perceptron of the neural network using the weights. The specific arrangements and methods herein are merely illustrative of the principles of this invention. Numerous modifications in form and detail may be made by those skilled in the art without departing from the true spirit and scope of the invention.

Claims

What is claimed is:

1. A method of training a neural network, the method comprising: creating a model for a desired function as a multi-dimensional function (1301); determining if the created model fits a simple finite geometry model (1302); generating a Radon transform to fit the simple finite geometry model (1303); feeding the desired function through the Radon transform to generate weights (1304); and training a multilayer perceptron of the neural network using the weights (1305).

2. The method of claim 1 wherein the neural network is a first neural network and the Radon transform is a second neural network so that the first neural network is trained by the second neural network.
3. The method of claim 2 wherein the first neural network and the second neural network are dual to each other.
4. A system for training a neural network, the system comprising: means for creating a model for a desired function as a multi-dimensional function (1410); means for determining if the created model fits a simple finite geometry model (1420); means for generating a Radon transform to fit the simple finite geometry model (1430); means for feeding the desired function through the Radon transform to generate weights (1440); and means for training a multilayer perceptron of the neural network using the weights (1460).

5. A computer readable medium comprising instructions which, when executed on a processor, perform a method of training a neural network, the method comprising: creating a model for a desired function as a multi-dimensional function (1301); determining if the created model fits a simple finite geometry model (1302); generating a Radon transform to fit the simple finite geometry model (1303); feeding the desired function through the Radon transform to generate weights (1304); and training a multilayer perceptron of the neural network using the weights (1305).

6. An apparatus for training a first neural network, the apparatus comprising: a model generator to create a model for a desired function as a multi-dimensional function (1410); a decision module to determine if the created model fits a simple finite geometry model, the decision module coupled to the model generator (1420); a Radon transform generator to generate a Radon transform to fit the simple finite geometry model, the Radon transform generator coupled to the decision module (1430); a feeder to feed the desired function through the Radon transform to generate weights, the feeder coupled to the decision module (1440); and a training module to train a multilayer perceptron of the first neural network using the weights, the training module coupled to the Radon transform generator (1460).

7. The apparatus of claim 6 wherein the Radon transform comprises a second neural network such that the second neural network is used to train the first neural network.

8. The apparatus of claim 7 wherein the first neural network and the second neural network are dual to each other.
PCT/US2001/002426 2000-01-24 2001-01-24 A method and apparatus of using a neural network to train a neural network WO2001054060A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001231139A AU2001231139A1 (en) 2000-01-24 2001-01-24 A method and apparatus of using a neural network to train a neural network

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US17806000P 2000-01-24 2000-01-24
US60/178,060 2000-01-24
US09/767,279 US6976012B1 (en) 2000-01-24 2001-01-22 Method and apparatus of using a neural network to train a neural network
US09/767,279 2001-01-22

Publications (2)

Publication Number Publication Date
WO2001054060A2 true WO2001054060A2 (en) 2001-07-26
WO2001054060A3 WO2001054060A3 (en) 2002-05-02

Family

ID=26873917

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/002426 WO2001054060A2 (en) 2000-01-24 2001-01-24 A method and apparatus of using a neural network to train a neural network

Country Status (3)

Country Link
US (2) US6976012B1 (en)
AU (1) AU2001231139A1 (en)
WO (1) WO2001054060A2 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8374974B2 (en) * 2003-01-06 2013-02-12 Halliburton Energy Services, Inc. Neural network training data selection using memory reduced cluster analysis for field model development
US7321809B2 (en) * 2003-12-30 2008-01-22 The Boeing Company Methods and systems for analyzing engine unbalance conditions
US8065244B2 (en) * 2007-03-14 2011-11-22 Halliburton Energy Services, Inc. Neural-network based surrogate model construction methods and applications thereof
US7775107B2 (en) * 2007-10-03 2010-08-17 Hamilton Sundstrand Corporation Measuring rotor imbalance via blade clearance sensors
US7814038B1 (en) 2007-12-06 2010-10-12 Dominic John Repici Feedback-tolerant method and device producing weight-adjustment factors for pre-synaptic neurons in artificial neural networks
US20090276385A1 (en) * 2008-04-30 2009-11-05 Stanley Hill Artificial-Neural-Networks Training Artificial-Neural-Networks
NO2310880T3 (en) * 2008-08-06 2017-12-30
US9514388B2 (en) * 2008-08-12 2016-12-06 Halliburton Energy Services, Inc. Systems and methods employing cooperative optimization-based dimensionality reduction
JP2011107648A (en) * 2009-11-20 2011-06-02 Fujifilm Corp Lens unit
US9230208B2 (en) 2013-12-18 2016-01-05 International Business Machines Corporation Haptic-based artificial neural network training
US10275705B2 (en) 2014-08-08 2019-04-30 Vicarious Fpc, Inc. Systems and methods for generating data explanations for neural networks and related systems
US10417525B2 (en) * 2014-09-22 2019-09-17 Samsung Electronics Co., Ltd. Object recognition with reduced neural network weight precision
US11221990B2 (en) 2015-04-03 2022-01-11 The Mitre Corporation Ultra-high compression of images based on deep learning
US11775850B2 (en) 2016-01-27 2023-10-03 Microsoft Technology Licensing, Llc Artificial intelligence engine having various algorithms to build different concepts contained within a same AI model
US11841789B2 (en) 2016-01-27 2023-12-12 Microsoft Technology Licensing, Llc Visual aids for debugging
US11868896B2 (en) 2016-01-27 2024-01-09 Microsoft Technology Licensing, Llc Interface for working with simulations on premises
US10671938B2 (en) 2016-01-27 2020-06-02 Bonsai AI, Inc. Artificial intelligence engine configured to work with a pedagogical programming language to train one or more trained artificial intelligence models
US11120299B2 (en) 2016-01-27 2021-09-14 Microsoft Technology Licensing, Llc Installation and operation of different processes of an AI engine adapted to different configurations of hardware located on-premises and in hybrid environments
KR20170118520A (en) 2016-04-15 2017-10-25 삼성전자주식회사 Interface neural network
EP3472713A4 (en) * 2016-06-21 2020-02-26 Vicarious FPC, Inc. Systems and methods for generating data explanations for neural networks and related systems
US10623775B1 (en) * 2016-11-04 2020-04-14 Twitter, Inc. End-to-end video and image compression
US11461383B2 (en) * 2017-09-25 2022-10-04 Equifax Inc. Dual deep learning architecture for machine-learning systems
US10726335B2 (en) 2017-10-26 2020-07-28 Uber Technologies, Inc. Generating compressed representation neural networks having high degree of accuracy
EP3847803A4 (en) 2018-09-05 2022-06-15 Vicarious FPC, Inc. Method and system for machine concept understanding
US11429406B1 (en) 2021-03-08 2022-08-30 Bank Of America Corporation System for implementing auto didactic content generation using reinforcement learning

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5311600A (en) * 1992-09-29 1994-05-10 The Board Of Trustees Of The Leland Stanford Junior University Method of edge detection in optical images using neural network classifier

Family Cites Families (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4995088A (en) * 1987-04-09 1991-02-19 Trustees Of The University Of Pennsylvania Super resolution
IL95934A (en) 1989-10-16 1994-03-15 Hughes Aircraft Co Fast image decoder
CA2031765C (en) * 1989-12-08 1996-02-20 Masahide Nomura Method and system for performing control conforming with characteristics of controlled system
US5218529A (en) * 1990-07-30 1993-06-08 University Of Georgia Research Foundation, Inc. Neural network system and methods for analysis of organic materials and structures using spectral data
US5175678A (en) * 1990-08-15 1992-12-29 Elsag International B.V. Method and procedure for neural control of dynamic processes
US5101270A (en) * 1990-12-13 1992-03-31 The Johns Hopkins University Method and apparatus for radon transformation and angular correlation in optical processors
US5377305A (en) * 1991-10-01 1994-12-27 Lockheed Sanders, Inc. Outer product neural network
US5414623A (en) * 1992-05-08 1995-05-09 Iowa State University Research Foundation Optoelectronic system for implementation of iterative computer tomography algorithms
US5953452A (en) * 1992-11-05 1999-09-14 The Johns Hopkins University Optical-digital method and processor for pattern recognition
US5329478A (en) * 1992-11-25 1994-07-12 Kirk David B Circuit and method for estimating gradients
US5446776A (en) * 1993-08-02 1995-08-29 General Electric Company Tomography with generation of radon data on polar grid points
JPH07175876A (en) * 1993-10-12 1995-07-14 At & T Corp Method and apparatus for control of feedback of process using neural network
US5400255A (en) 1994-02-14 1995-03-21 General Electric Company Reconstruction of images from cone beam data
US5481269A (en) * 1994-05-27 1996-01-02 Westinghouse Electric Corp. General frame wavelet classifier
TW284869B (en) 1994-05-27 1996-09-01 Hitachi Ltd
US5677609A (en) * 1994-07-28 1997-10-14 National Semiconductor Corporation Intelligent servomechanism controller
US5687364A (en) * 1994-09-16 1997-11-11 Xerox Corporation Method for learning to infer the topical content of documents based upon their lexical content
US5504792A (en) 1994-12-27 1996-04-02 General Electric Company Method and system for masking cone beam projection data generated from either a region of interest helical scan or a helical scan
JPH08273247A (en) * 1995-03-27 1996-10-18 Sony Corp Recorder
IL117951A (en) * 1995-09-06 1999-09-22 3T True Temperature Technologi Method and apparatus for true temperature determination
US5870502A (en) * 1996-04-08 1999-02-09 The Trustees Of Columbia University In The City Of New York System and method for a multiresolution transform of digital image information
US5784481A (en) 1996-06-25 1998-07-21 General Electric Company CT cone beam image reconstruction with circle and line scan path
US6208982B1 (en) 1996-11-18 2001-03-27 Lockheed Martin Energy Research Corporation Method and apparatus for solving complex and computationally intensive inverse problems in real-time
US5953388A (en) 1997-08-18 1999-09-14 George Mason University Method and apparatus for processing data from a tomographic imaging system
US5960055A (en) * 1997-12-19 1999-09-28 Siemens Corporate Research, Inc. Fast cone beam image reconstruction using a detector weight list
US6009142A (en) 1998-03-31 1999-12-28 Siemens Corporate Research, Inc. Practical cone beam image reconstruction using local regions-of-interest
US6560586B1 (en) 1998-10-30 2003-05-06 Alcatel Multiresolution learning paradigm and signal prediction
SE9902550D0 (en) 1999-07-02 1999-07-02 Astra Ab New crystalline forms
US6876779B2 (en) 2000-01-24 2005-04-05 Sony Corporation Method and apparatus of reconstructing audio/video/image data from higher moment data
US6424737B1 (en) 2000-01-24 2002-07-23 Sony Corporation Method and apparatus of compressing images using localized radon transforms

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5311600A (en) * 1992-09-29 1994-05-10 The Board Of Trustees Of The Leland Stanford Junior University Method of edge detection in optical images using neural network classifier

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CARROLL S M ET AL: "Construction of neural nets using the radon transform" IJCNN: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (CAT. NO.89CH2765-6), WASHINGTON, DC, USA, 18-22 JUNE 1989, pages 607-611 vol.1, XP010087709 1989, New York, NY, USA, IEEE TAB Neural Network Committee, USA *
ELSHERIF H ET AL: "On modifying the weights in a modular recurrent connectionist system" NEURAL NETWORKS, 1994. IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE., 1994 IEEE INTERNATIONAL CONFERENCE ON ORLANDO, FL, USA 27 JUNE-2 JULY 1994, NEW YORK, NY, USA,IEEE, 27 June 1994 (1994-06-27), pages 535-539, XP010127265 ISBN: 0-7803-1901-X *
MEIR R ET AL: "Stochastic approximation by neural networks using the Radon and wavelet transforms" NEURAL NETWORKS FOR SIGNAL PROCESSING. PROCEEDINGS OF THE IEEE SIGNAL PROCESSING SOCIETY WORKSHOP, XX, XX, 1998, pages 224-233, XP002167255 *
SACHA J P ET AL: "On the synthesis and complexity of feedforward networks" 1994 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS. IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE (CAT. NO.94CH3429-8), PROCEEDINGS OF 1994 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS (ICNN'94), ORLANDO, FL, USA, 27 JUNE-2 JULY 1994, pages 2185-2190 vol.4, XP002191291 1994, New York, NY, USA, IEEE, USA ISBN: 0-7803-1901-X *

Also Published As

Publication number Publication date
WO2001054060A3 (en) 2002-05-02
AU2001231139A1 (en) 2001-07-31
US20050114280A1 (en) 2005-05-26
US6976012B1 (en) 2005-12-13

Similar Documents

Publication Publication Date Title
WO2001054060A2 (en) A method and apparatus of using a neural network to train a neural network
US6898583B1 (en) Method and apparatus of creating application-specific, non-uniform wavelet transforms
Wei et al. An advanced deep residual dense network (DRDN) approach for image super-resolution
Johnston et al. Computationally efficient neural image compression
CN112001914A (en) Depth image completion method and device
CN112541864A (en) Image restoration method based on multi-scale generation type confrontation network model
US6424737B1 (en) Method and apparatus of compressing images using localized radon transforms
CN110188667B (en) Face rectification method based on three-party confrontation generation network
Zhao et al. CREAM: CNN-REgularized ADMM framework for compressive-sensed image reconstruction
CN113658040A (en) Face super-resolution method based on prior information and attention fusion mechanism
CN112233012A (en) Face generation system and method
Du et al. Blind image denoising via dynamic dual learning
Yang et al. DMAT: Deformable medial axis transform for animated mesh approximation
US6876779B2 (en) Method and apparatus of reconstructing audio/video/image data from higher moment data
Knop et al. Generative models with kernel distance in data space
CN116095183A (en) Data compression method and related equipment
Zhu et al. Learning knowledge representation with meta knowledge distillation for single image super-resolution
WO2001054059A2 (en) A method and apparatus of creating application-specific, non-uniform wavelet transforms
Wu et al. Semantic image inpainting based on generative adversarial networks
Liu et al. Multiscale image denoising algorithm based on UNet3+
KR20240004777A (en) Online training of computer vision task models in the compressed domain.
Mageed How Satellite Imaging and Deep Learning Are Influenced by Tensor Decompositions: A Review
Vasilyev et al. Classification via compressed latent space
KR20230162061A (en) Multi-rate neural networks for computer vision tasks in the compressed domain.
CN117222997A (en) Compressed domain multi-rate computer vision task neural network

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP