WO2007072256A2 - Apparatus and method for classifying data - Google Patents

Apparatus and method for classifying data

Info

Publication number
WO2007072256A2
Authority
WO
WIPO (PCT)
Prior art keywords
data
features
updating
data set
class
Prior art date
Application number
PCT/IB2006/054529
Other languages
French (fr)
Other versions
WO2007072256A3 (en
Inventor
Simona E. Grigorescu
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Publication of WO2007072256A2
Publication of WO2007072256A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2115 Selection of the most significant subset of features by evaluating different subsets according to an optimisation criterion, e.g. class separability, forward selection or backward elimination

Definitions

  • the present invention relates to an apparatus and method for classifying data, and relates particularly, but not exclusively, to an apparatus and method for adaptively classifying 3D medical image model data.
  • CAD Computer aided detection and diagnosis systems
  • CT computer tomography
  • MR magnetic resonance
  • Such systems generally take 3D model data obtained by scanning a patient, and output a list of locations of suspicious structures which can then be investigated further by the radiologist.
  • Figure 1 represents a set of examples of lesions and false alarms for which two features f1 and f2 are computed. Prior to the CAD system going on-line, these examples are plotted in Figure 1, and define a first set of points 2 corresponding to false alarms and a second set of points 4 corresponding to lesions. The CAD system uses this representation to define a boundary 6 between the two sets of points 2, 4. After the CAD system goes on-line, every new data example is classified on the basis of its position relative to this boundary 6.
  • the system can update the definition of the boundary 6 on the basis of each new example of lesion or false alarm it encounters.
  • the CAD system updates the boundary 6 between the lesions and the false alarms, and any data example subsequently received is classified according to its position relative to the updated boundary 6.
  • the arrangement described above suffers from the drawback that the features needed to discriminate between lesions and false alarms depend upon the scanning protocol used.
  • the data received from a hospital using one scanning protocol may be easy to classify such as the data shown in Figures 1 and 2
  • the same data from a different hospital using a different scanning protocol could be as shown in Figure 3.
  • an updating method for updating a method of classifying medical image data comprising:
  • each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes;
  • this provides the advantage of enabling the classification of data to be optimized for the particular data values available. This in turn minimizes the extent to which classification based on results obtained from one location becomes unreliable when applied to results obtained from another location.
  • the method may further comprise selecting a plurality of said features on the basis of said correspondence.
  • the method may be offered to customers via the Internet.
  • the method may further comprise generating at least one said feature of said first data sets.
  • the method may further comprise determining the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
  • the method may further comprise determining the Mahalanobis distance between at least one pair of said regions.
  • a classifying method for classifying medical image data comprising: (i) receiving a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
  • a medical imaging method comprising:
  • each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features
  • an updating apparatus for updating a method of classifying medical image data comprising:
  • each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes;
  • At least one selecting device for selecting, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
  • At least one said selecting device may be adapted to select a plurality of said features on the basis of said correspondence.
  • the apparatus may further comprise at least one generating device for generating at least one said feature of said first data sets.
  • At least one said determining device may be adapted to determine the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
  • At least one said determining device may be adapted to determine the Mahalanobis distance between at least one pair of said regions.
  • a classifying apparatus for classifying medical image data comprising:
  • each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features
  • At least one second receiving device for receiving at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and (iv) at least one allocating device for allocating data classes to at least one second data set on the basis of at least one said feature.
  • a medical imaging apparatus comprising:
  • At least one imaging device for generating at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features;
  • an updating data structure for use by a computer system for updating a method of classifying medical image data, the updating data structure comprising:
  • (i) first computer code executable to determine, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and (ii) second computer code executable to select, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
  • the second computer code may be executable to select a plurality of said features on the basis of said correspondence.
  • the updating data structure may further comprise third computer code executable to generate at least one said feature of said first data sets.
  • the first computer code may be executable to determine the extent to which values of features of first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
  • the first computer code may be executable to determine the Mahalanobis distance between at least one pair of said regions.
  • a classifying data structure for use by a computer system for classifying medical image data, the classifying data structure comprising:
  • each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features
  • (iii) fifth computer code executable to receive at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and (iv) sixth computer code executable to allocate a respective data class to at least one second data set on the basis of at least one said feature.
  • a medical imaging data structure for use by a computer system for medical imaging, the medical imaging data structure comprising: seventh computer code executable to generate at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features; and a classifying data structure as defined above.
  • a computer readable medium carrying a data structure as defined above stored thereon.
  • Figs. 1 to 3 illustrate the principle of operation of an existing CAD system
  • Fig. 4 is a schematic representation of a medical imaging apparatus embodying the present invention.
  • Fig. 5 is a flowchart explaining the operation of a method embodying the present invention of updating the apparatus of Fig. 4.
  • a medical imaging apparatus 2 has a platform 4 for supporting a patient 6, and a plurality of opposed pairs of x-ray sources 8 and detectors 10 arranged around a generally circular frame 12 through which the platform 4 passes.
  • the x-ray sources 8 and detectors 10 are controlled by means of a processor 14, which also controls a motor (not shown) for moving the platform 4 in the direction of arrow A relative to the frame 12.
  • the processor 14 also receives input signals from the X-ray detectors 10 and forms a 3-D model of the part of the patient 6 under investigation, and enables an image to be shown on a display 16. This aspect of the operation of the apparatus 2 will be familiar to persons skilled in the art and will therefore not be explained in greater detail herein.
  • the patient's organ of interest is selected by means of one or more methods which will be familiar to persons skilled in the art.
  • voxels having a certain property are chosen for further processing, in order to identify suspicious regions (i.e. suspected lesions) in step S10.
  • Objects are then generated by adding neighboring voxels satisfying certain properties. For example, for polyp detection in CT images, first, the air in the colon is segmented by means of thresholding the CT data. Further, the voxels surrounding the segmented air are selected. These voxels represent the colon lumen. Next, those voxels in the colon lumen having a shape index close to 1 are retained for further analysis. Finally, objects are generated starting from the selected voxels by adding neighboring voxels that belong to the colon lumen and have a shape index of 0.75 or greater.
  • at step S20, for the objects generated at the previous step, features describing object size, shape, and/or texture are computed.
  • in the case of polyp detection, for the objects generated in the previous step one can determine the minimum ellipsoid enclosing the segmented object and use the lengths of its principal diameters as features.
  • other examples of features are statistics (i.e. average, standard deviation, minimum, maximum) computed over the grey values in an object.
  • a tuning step of the present invention can be either launched by a user, or carried out remotely on a user's behalf, for example via the Internet.
  • data from one or more patients is loaded, and objects having a series of features are generated, as described above.
  • Each object is then labeled by the human expert (such as a radiologist) as a true lesion or a faulty detection. These objects are referred to throughout this embodiment as the "example set".
  • feature selection is first carried out at step S30A.
  • the feature selection procedure works as follows: for all objects labeled by the human expert as true lesions, the distribution of the values of each feature is computed, and the features are then sorted according to how similar their distribution is to a Gaussian distribution. This similarity can be measured by means of statistical tests which will be familiar to persons skilled in the art, such as the Kolmogorov-Smirnov test.
  • An empty set of "good features" is created, and the most Gaussian feature is selected and used to compute the Mahalanobis distance between the group of objects labeled as true lesions and the other group of objects. If the Mahalanobis distance is larger than a certain threshold, for example larger than 0.01, the feature is added to the set of "good features" and removed from the original pool of sorted features. Otherwise, it is simply discarded from the pool of sorted features.
  • the most Gaussian feature from the pool of sorted features is then selected and, together with the features from the "good feature" set, is used to compute a new value for the Mahalanobis distance. If the difference between this new value and the previous one is greater than a certain threshold, for example 0.01, the feature is added to the set of "good features" and removed from the original pool of sorted features. Otherwise, it is simply discarded from the pool of sorted features. This process is then continued with each feature at the top of the sorted list until the list is empty or until a prescribed number of features has been added to the "good feature" set. As a result, only the selected "good features" are computed at step S20 for subsequent data acquisition, until the tuning process is next carried out. In this way, unnecessary feature acquisition and/or data processing is avoided.
  • a classifier has a set of internal parameters that are used together with object feature values in the classification phase (step S40) to decide the class to which the object belongs.
  • these internal parameter values are updated based on the values of the "good feature" set for the "example set".
  • a linear classifier linearly combines the feature values of an object, using a set of weights, into a single quantity. This quantity is compared to a certain threshold to decide the class to which the object belongs. The values of these weights and that of the threshold are computed during the training phase S30B based on the "example set" feature values by means of mathematical formulas that are familiar to the person skilled in the art.
  • the apparatus 2 could be set for a number of scanning protocols and selected features together with the associated trained classifier stored for each such protocol as a system preset. This would enable the user to choose, from knowledge of the scanning protocol, which of the available presets best suits the data being classified.
  • the tuning process of Figure 5 can be carried out either within a medical imaging apparatus, or remotely on behalf of a user of the medical imaging apparatus, for example as an updating service provided via the Internet.

Abstract

An apparatus (2) for classifying medical image data is disclosed. The apparatus includes a processor (14) for receiving first data sets belonging to more than one class and including a plurality of features. The classes to which the first data sets belong are known and the processor determines the feature which best discriminates between the known classes of the first data sets. Subsequently received second data sets are then classified on the basis of the same feature.

Description

Apparatus and method for classifying data
The present invention relates to an apparatus and method for classifying data, and relates particularly, but not exclusively, to an apparatus and method for adaptively classifying 3D medical image model data.
Computer aided detection and diagnosis systems (CAD) have been developed to assist radiologists in quickly and reliably detecting unhealthy structures in the human body from computer tomography (CT) or magnetic resonance (MR) scan data. Such systems generally take 3D model data obtained by scanning a patient, and output a list of locations of suspicious structures which can then be investigated further by the radiologist.
The operation of an existing CAD system of this type is explained with reference to Figures 1 to 3. Figure 1 represents a set of examples of lesions and false alarms for which two features f1 and f2 are computed. Prior to the CAD system going on-line, these examples are plotted in Figure 1, and define a first set of points 2 corresponding to false alarms and a second set of points 4 corresponding to lesions. The CAD system uses this representation to define a boundary 6 between the two sets of points 2, 4. After the CAD system goes on-line, every new data example is classified on the basis of its position relative to this boundary 6.
After going on-line, the system can update the definition of the boundary 6 on the basis of each new example of lesion or false alarm it encounters. Referring to Figure 2, if a new known lesion 8 is introduced, the CAD system updates the boundary 6 between the lesions and the false alarms, and any data example subsequently received is classified according to its position relative to the updated boundary 6.
However, the arrangement described above suffers from the drawback that the features needed to discriminate between lesions and false alarms depend upon the scanning protocol used. For example, although the data received from a hospital using one scanning protocol may be easy to classify, such as the data shown in Figures 1 and 2, the same data from a different hospital using a different scanning protocol could be as shown in Figure 3. In that case, it may be difficult to define a boundary between the two classes of data points on the basis of the position of points in the space defined by features f1 and f2. This results in many faulty classifications, which may cause the CAD system to perform well at one clinical site and poorly at a different one.
Preferred embodiments of the present invention seek to improve the reliability of classification of such data. According to an aspect of the present invention, there is provided an updating method for updating a method of classifying medical image data, the updating method comprising:
(i) determining, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) selecting, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
By obtaining data having a plurality of parameters and selecting a parameter to form the basis of subsequent classification based on correspondence between that parameter and membership of the known data classes into which subsequent data is to be divided, this provides the advantage of enabling the classification of data to be optimized for the particular data values available. This in turn minimizes the extent to which classification based on results obtained from one location becomes unreliable when applied to results obtained from another location.
The method may further comprise selecting a plurality of said features on the basis of said correspondence. This provides the advantage of enabling an apparatus for carrying out the method to be provided with pre-settings suitable, for example, for medical imaging data obtained according to different scanning protocols.
The method may be offered to customers via the Internet.
The method may further comprise generating at least one said feature of said first data sets.
The method may further comprise determining the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class. The method may further comprise determining the Mahalanobis distance between at least one pair of said regions.
According to another aspect of the present invention, there is provided a classifying method for classifying medical image data, the classifying method comprising:
(i) receiving a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) updating the method by means of an updating method as defined above;
(iii) receiving at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) allocating a respective data class to at least one said second data set on the basis of at least one said feature.
According to a further aspect of the present invention, there is provided a medical imaging method comprising:
(i) generating a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) generating at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iii) classifying the or each said second data set by means of a classifying method as defined above.
According to a further aspect of the present invention, there is provided an updating apparatus for updating a method of classifying medical image data, the updating apparatus comprising:
(i) at least one determining device for determining, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) at least one selecting device for selecting, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features. At least one said selecting device may be adapted to select a plurality of said features on the basis of said correspondence.
The apparatus may further comprise at least one generating device for generating at least one said feature of said first data sets. At least one said determining device may be adapted to determine the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
At least one said determining device may be adapted to determine the Mahalanobis distance between at least one pair of said regions.
According to a further aspect of the present invention, there is provided a classifying apparatus for classifying medical image data, the classifying apparatus comprising:
(i) at least one receiving device for receiving a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) an updating apparatus as defined above;
(iii) at least one second receiving device for receiving at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) at least one allocating device for allocating data classes to at least one second data set on the basis of at least one said feature.
According to a further aspect of the present invention, there is provided a medical imaging apparatus comprising:
(i) at least one imaging device for generating at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features; and
(ii) a classifying apparatus as defined above.
According to a further aspect of the present invention, there is provided an updating data structure for use by a computer system for updating a method of classifying medical image data, the updating data structure comprising:
(i) first computer code executable to determine, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) second computer code executable to select, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
The second computer code may be executable to select a plurality of said features on the basis of said correspondence.
The updating data structure may further comprise third computer code executable to generate at least one said feature of said first data sets. The first computer code may be executable to determine the extent to which values of features of first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
The first computer code may be executable to determine the Mahalanobis distance between at least one pair of said regions.
According to a further aspect of the present invention, there is provided a classifying data structure for use by a computer system for classifying medical image data, the classifying data structure comprising:
(i) fourth computer code executable to receive a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) an updating data structure as defined above;
(iii) fifth computer code executable to receive at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) sixth computer code executable to allocate a respective data class to at least one second data set on the basis of at least one said feature.
According to a further aspect of the present invention, there is provided a medical imaging data structure for use by a computer system for medical imaging, the medical imaging data structure comprising: seventh computer code executable to generate at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features; and a classifying data structure as defined above.
According to a further aspect of the present invention, there is provided a computer readable medium carrying a data structure as defined above stored thereon.
A preferred embodiment of the invention will now be described, by way of example only, and not in any limitative sense, with reference to the accompanying drawings, in which:
Figs. 1 to 3 illustrate the principle of operation of an existing CAD system;
Fig. 4 is a schematic representation of a medical imaging apparatus embodying the present invention; and
Fig. 5 is a flowchart explaining the operation of a method embodying the present invention of updating the apparatus of Fig. 4.
Referring to Figure 4, a medical imaging apparatus 2 has a platform 4 for supporting a patient 6, and a plurality of opposed pairs of x-ray sources 8 and detectors 10 arranged around a generally circular frame 12 through which the platform 4 passes. The x-ray sources 8 and detectors 10 are controlled by means of a processor 14, which also controls a motor (not shown) for moving the platform 4 in the direction of arrow A relative to the frame 12. The processor 14 also receives input signals from the X-ray detectors 10 and forms a 3-D model of the part of the patient 6 under investigation, and enables an image to be shown on a display 16. This aspect of the operation of the apparatus 2 will be familiar to persons skilled in the art and will therefore not be explained in greater detail herein.
The operation of the present invention will now be described with reference to Figure 5.
Once the image data is available, the patient's organ of interest is selected by means of one or more methods which will be familiar to persons skilled in the art. Of the data relating to the organ of interest, only voxels having a certain property are chosen for further processing, in order to identify suspicious regions (i.e. suspected lesions) in step S10. Objects are then generated by adding neighboring voxels satisfying certain properties. For example, for polyp detection in CT images, first, the air in the colon is segmented by means of thresholding the CT data. Further, the voxels surrounding the segmented air are selected. These voxels represent the colon lumen. Next, those voxels in the colon lumen having a shape index close to 1 are retained for further analysis. Finally, objects are generated starting from the selected voxels by adding neighboring voxels that belong to the colon lumen and have a shape index of 0.75 or greater.
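By way of illustration only, the object-generation part of step S10 for the polyp example can be sketched as region growing over the lumen voxels. The sketch is not taken from the disclosure: the function name grow_objects, the dictionary and set representation of the voxels, the 6-connected neighbourhood, and the seed cut-off of 0.9 (one possible reading of "close to 1") are all assumptions. Only the growth threshold of 0.75 comes from the text.

```python
from collections import deque

# Hypothetical helper: names, data layout and seed_min are assumptions.
def grow_objects(shape_index, lumen, seed_min=0.9, grow_min=0.75):
    """Group lumen voxels into candidate objects.

    shape_index: dict mapping (x, y, z) -> shape index of that voxel.
    lumen: set of (x, y, z) voxels belonging to the colon lumen.
    """
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    # Seeds: lumen voxels whose shape index is "close to 1" (assumed >= seed_min).
    seeds = [v for v in sorted(lumen) if shape_index.get(v, 0.0) >= seed_min]
    visited = set()
    objects = []
    for seed in seeds:
        if seed in visited:
            continue  # already absorbed into an earlier object
        visited.add(seed)
        queue = deque([seed])
        voxels = []
        while queue:
            x, y, z = queue.popleft()
            voxels.append((x, y, z))
            for dx, dy, dz in steps:
                n = (x + dx, y + dy, z + dz)
                if n in visited or n not in lumen:
                    continue
                # Grow only through lumen voxels with shape index >= 0.75.
                if shape_index.get(n, 0.0) >= grow_min:
                    visited.add(n)
                    queue.append(n)
        objects.append(sorted(voxels))
    return objects

# Toy data along one axis (values are made up for illustration).
demo_lumen = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (5, 0, 0)}
demo_index = {(0, 0, 0): 0.95, (1, 0, 0): 0.8, (2, 0, 0): 0.5,
              (3, 0, 0): 0.97, (5, 0, 0): 0.8}
demo_objects = grow_objects(demo_index, demo_lumen)
```

In the toy data, the voxel at (5, 0, 0) passes the growth threshold but is never reached from a seed, so it does not form an object.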
At step S20, for the objects generated at the previous step, features describing object size, shape, and/or texture are computed. As an example, in the case of polyp detection, for the objects generated in the previous step one can determine the minimum ellipsoid enclosing the segmented object and use the lengths of its principal diameters as features. Other examples of features are statistics (i.e. average, standard deviation, minimum, maximum) computed over the grey values in an object.
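The grey-value statistics mentioned above might be computed along the following lines. This is a minimal sketch: the function name grey_value_features and the dictionary keys are assumptions, the population form of the standard deviation is an arbitrary choice, and the enclosing-ellipsoid features are not sketched here.

```python
import statistics

# Hypothetical helper: name and output keys are assumptions.
def grey_value_features(grey_values):
    """Return average, standard deviation, minimum and maximum
    of the grey values of the voxels in one object."""
    values = list(grey_values)
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
    }

demo_features = grey_value_features([10, 20, 30])
```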
A tuning step of the present invention can be either launched by a user, or carried out remotely on a user's behalf, for example via the Internet. Prior to the tuning step S30, data from one or more patients is loaded, and objects having a series of features are generated, as described above. Each object is then labeled by the human expert (such as a radiologist) as a true lesion or a faulty detection. These objects are referred to throughout this embodiment as the "example set". In the tuning step, feature selection is first carried out at step S30A. The feature selection procedure works as follows: for all objects labeled by the human expert as true lesions, the distribution of the values of each feature is computed, and the features are then sorted according to how similar their distribution is to a Gaussian distribution. This similarity can be measured by means of statistical tests which will be familiar to persons skilled in the art, such as the Kolmogorov-Smirnov test.
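One possible way of ranking features by similarity to a Gaussian distribution is sketched below. It computes a one-sample Kolmogorov-Smirnov statistic against a normal distribution whose mean and standard deviation are estimated from the same sample, which is a simplification, and sorts features by that statistic. The function names and the dictionary input format are assumptions, and each feature is assumed to take at least two distinct values.

```python
import math
import statistics

# Hypothetical helpers: names and input layout are assumptions.
def ks_distance_to_gaussian(values):
    """Largest gap between the empirical CDF of the sample and the CDF of
    a normal distribution fitted to that sample (smaller = more Gaussian)."""
    xs = sorted(values)
    n = len(xs)
    mu = statistics.fmean(xs)
    sigma = statistics.stdev(xs)  # assumes the feature is not constant
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
        # Compare against the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def sort_by_gaussianity(lesion_features):
    """lesion_features: dict feature name -> values over the true lesions.
    Returns the feature names, most Gaussian first."""
    return sorted(lesion_features,
                  key=lambda name: ks_distance_to_gaussian(lesion_features[name]))

# Made-up values: one roughly bell-shaped feature, one heavily skewed.
demo_order = sort_by_gaussianity({
    "skewed": [1, 1, 1, 1, 1, 1, 1, 1, 1, 100],
    "bell": [-1.6, -1.0, -0.6, -0.3, 0.0, 0.3, 0.6, 1.0, 1.6],
})
```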
An empty set of "good features" is created, and the most Gaussian feature is selected and used to compute the Mahalanobis distance between the group of objects labeled as true lesions and the other group of objects. If the Mahalanobis distance is larger than a certain threshold, for example larger than 0.01, the feature is added to the set of "good features" and removed from the original pool of sorted features. Otherwise, it is simply discarded from the pool of sorted features.
The most Gaussian feature remaining in the pool of sorted features is then selected and, together with the features from the "good feature" set, is used to compute a new value for the Mahalanobis distance. If the difference between this new value and the previous one is greater than a certain threshold, for example 0.01, the feature is added to the set of "good features" and removed from the original pool of sorted features. Otherwise, it is simply discarded from the pool of sorted features. This process is then repeated with the feature at the top of the sorted list until the list is empty or until a prescribed number of features has been added to the "good feature" set. As a result, only the selected "good features" are computed at step S20 for subsequent data acquisition, until the tuning process is next carried out. In this way, unnecessary feature acquisition and/or data processing is avoided.
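The greedy selection described above could be sketched as follows. Several details are assumptions of this sketch rather than statements of the description: the Mahalanobis distance is taken between the two group means using a pooled covariance matrix, the "previous" distance is taken to be that of the current "good feature" set (zero while the set is empty), and a candidate that makes the covariance matrix singular is simply discarded. All function names are hypothetical.

```python
import math


def _mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def _scatter(rows, mu):
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [r[k] - mu[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s


def _solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def mahalanobis(lesions, others, features):
    """Distance between the means of two groups of objects (lists of dicts
    feature name -> value), using a pooled covariance (an assumption)."""
    a = [[o[f] for f in features] for o in lesions]
    b = [[o[f] for f in features] for o in others]
    dof = len(a) + len(b) - 2
    if dof <= 0:
        raise ValueError("not enough labeled objects")
    mu_a, mu_b = _mean(a), _mean(b)
    sa, sb = _scatter(a, mu_a), _scatter(b, mu_b)
    d = len(features)
    pooled = [[(sa[i][j] + sb[i][j]) / dof for j in range(d)] for i in range(d)]
    diff = [mu_a[i] - mu_b[i] for i in range(d)]
    sol = _solve(pooled, diff)
    return math.sqrt(max(0.0, sum(diff[i] * sol[i] for i in range(d))))


def select_good_features(lesions, others, sorted_features, threshold=0.01,
                         max_features=None):
    """Walk the features from most to least Gaussian and keep those that
    increase the Mahalanobis distance by more than `threshold`."""
    good, previous = [], 0.0
    for feature in sorted_features:
        if max_features is not None and len(good) >= max_features:
            break
        try:
            dist = mahalanobis(lesions, others, good + [feature])
        except ValueError:
            continue  # degenerate candidate: discarded (assumption)
        if dist - previous > threshold:
            good.append(feature)
            previous = dist
    return good
```

With this arrangement the first feature is accepted exactly when its own distance exceeds the threshold, which matches the treatment of the first feature in the description.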
Once generated, the "good feature" set selected in step S30A is then used at step S30B for supervised training of the classifier that is used at step S40. Generally speaking, a classifier has a set of internal parameters that are used together with object feature values in the classification phase (step S40) to decide to which class that object belongs. During step S30B, these internal parameter values are updated based on the values of the "good feature" set for the "example set". As an example, a linear classifier linearly combines the feature values of an object, using a set of weights, into a single quantity. This quantity is compared to a certain threshold to decide the class to which the object belongs. The values of these weights and of the threshold are computed during the training phase S30B based on the "example set" feature values by means of mathematical formulas that are familiar to the person skilled in the art.
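A minimal sketch of such a linear classifier is given below. The description does not say which training formulas are used; this sketch assumes a simple nearest-class-mean rule, in which the weights are the difference of the two class means and the threshold is the projection of their midpoint. The function names are hypothetical.

```python
def train_linear_classifier(lesions, others):
    """Compute (weights, threshold) from the example set.

    lesions, others: lists of feature vectors (values of the "good features")
    for objects labeled as true lesions and as faulty detections.
    Nearest-class-mean training is an assumption of this sketch.
    """
    mu_l = [sum(col) / len(lesions) for col in zip(*lesions)]
    mu_o = [sum(col) / len(others) for col in zip(*others)]
    weights = [a - b for a, b in zip(mu_l, mu_o)]
    # Threshold: the score of the midpoint between the two class means.
    threshold = sum(w * (a + b) / 2.0 for w, a, b in zip(weights, mu_l, mu_o))
    return weights, threshold


def classify(weights, threshold, features):
    """Step S40: combine the feature values linearly and compare with the threshold."""
    score = sum(w * x for w, x in zip(weights, features))
    return "true lesion" if score > threshold else "faulty detection"
```

For instance, training on lesions `[[10, 2], [12, 1]]` and faulty detections `[[2, 1], [4, 2]]` places the decision boundary midway between the class means along the first feature, since the second feature has the same mean in both classes and receives a zero weight.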
It will be appreciated by persons skilled in the art that the above embodiment has been described by way of example only, and not in any limitative sense, and that various alterations and modifications are possible without departure from the scope of the invention as defined by the appended claims. For example, the apparatus 2 could be set for a number of scanning protocols, with the selected features and the associated trained classifier stored for each such protocol as a system preset. This would enable the user to choose, from knowledge of the scanning protocol, which of the available presets best suits the data being classified. Also, the tuning process of Figure 5 can be carried out either within a medical imaging apparatus, or remotely on behalf of a user of the medical imaging apparatus, for example as an updating service provided via the Internet.
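The per-protocol presets mentioned above could, purely as an illustration, be held in a small lookup structure. The `Preset` fields, the function names, and the protocol label used in the test are hypothetical; the sketch assumes a linear classifier whose trained state is a set of weights and a threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """Selected features plus trained classifier parameters for one protocol."""
    features: tuple
    weights: tuple
    threshold: float


_PRESETS = {}


def store_preset(protocol, features, weights, threshold):
    """Save the result of a tuning run under a scanning protocol name."""
    _PRESETS[protocol] = Preset(tuple(features), tuple(weights), float(threshold))


def load_preset(protocol):
    """Return the preset chosen by the user for the given scanning protocol."""
    try:
        return _PRESETS[protocol]
    except KeyError:
        raise KeyError(f"no preset stored for protocol {protocol!r}") from None
```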

Claims

1. An updating method for updating a method of classifying medical image data, the updating method comprising:
(i) determining, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) selecting, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
2. An updating method according to claim 1, further comprising selecting a plurality of said features on the basis of said correspondence.
3. An updating method according to claim 1, wherein the method is offered to customers via the Internet.
4. An updating method according to claim 1, further comprising generating at least one said feature of said first data sets.
5. An updating method according to claim 1, further comprising determining the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
6. An updating method according to claim 1, further comprising determining the Mahalanobis distance between at least one pair of said regions.
7. A classifying method for classifying medical image data, the classifying method comprising:
(i) receiving a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) updating the method by means of an updating method according to claim 1;
(iii) receiving at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) allocating a respective data class to at least one said second data set on the basis of at least one said feature.
8. A medical imaging method comprising:
(i) generating a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) generating at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iii) classifying the or each said second data set by means of a classifying method according to claim 7.
9. An updating apparatus for updating a method of classifying medical image data, the updating apparatus comprising:
(i) at least one determining device for determining, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) at least one selecting device for selecting, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
10. An updating apparatus according to claim 9, wherein at least one said selecting device is adapted to select a plurality of said features on the basis of said correspondence.
11. An updating apparatus according to claim 9, further comprising at least one generating device for generating at least one said feature of said first data sets.
12. An updating apparatus according to claim 9, wherein at least one said determining device is adapted to determine the extent to which values of features of said first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
13. An updating apparatus according to claim 9, wherein at least one said determining device is adapted to determine the Mahalanobis distance between at least one pair of said regions.
14. A classifying apparatus for classifying medical image data, the classifying apparatus comprising:
(i) at least one receiving device for receiving a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) an updating apparatus according to claim 9;
(iii) at least one second receiving device for receiving at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) at least one allocating device for allocating data classes to at least one second data set on the basis of at least one said feature.
15. A medical imaging apparatus comprising:
(i) at least one imaging device for generating at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features; and
(ii) a classifying apparatus according to claim 14.
16. An updating data structure for use by a computer system for updating a method of classifying medical image data, the updating data structure comprising:
(i) first computer code executable to determine, for a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features, correspondence between values of a plurality of said features and membership of said known data classes; and
(ii) second computer code executable to select, on the basis of said correspondence, at least one said feature to form the basis of subsequent allocation of data classes to at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features.
17. An updating data structure according to claim 16, wherein the second computer code is executable to select a plurality of said features on the basis of said correspondence.
18. An updating data structure according to claim 16, further comprising third computer code executable to generate at least one said feature of said first data sets.
19. An updating data structure according to claim 16, wherein the first computer code is executable to determine the extent to which values of features of first data sets can be represented as a plurality of separated regions, wherein each said region corresponds to a respective known data class.
20. An updating data structure according to claim 16, wherein the first computer code is executable to determine the Mahalanobis distance between at least one pair of said regions.
21. A classifying data structure for use by a computer system for classifying medical image data, the classifying data structure comprising:
(i) fourth computer code executable to receive a plurality of first data sets, wherein each said first data set represents at least part of an entity belonging to a respective known data class and has a plurality of features;
(ii) an updating data structure according to claim 16;
(iii) fifth computer code executable to receive at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of said features; and
(iv) sixth computer code executable to allocate a respective data class to at least one second data set on the basis of at least one said feature.
22. A medical imaging data structure for use by a computer system for medical imaging, the medical imaging data structure comprising: seventh computer code executable to generate at least one second data set, wherein the or each said second data set represents at least part of an entity belonging to a respective unknown data class and has a plurality of features; and a classifying data structure according to claim 21.
23. A computer readable medium carrying a data structure according to claim 16 stored thereon.
PCT/IB2006/054529 2005-12-23 2006-11-30 Apparatus and method for classifying data WO2007072256A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05112885 2005-12-23
EP05112885.8 2005-12-23

Publications (2)

Publication Number Publication Date
WO2007072256A2 (en) 2007-06-28
WO2007072256A3 (en) 2007-10-25

Family

ID=38189043

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/054529 WO2007072256A2 (en) 2005-12-23 2006-11-30 Apparatus and method for classifying data

Country Status (1)

Country Link
WO (1) WO2007072256A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030215119A1 (en) * 2002-05-15 2003-11-20 Renuka Uppaluri Computer aided diagnosis from multiple energy images
EP1398721A2 (en) * 2002-09-13 2004-03-17 GE Medical Systems Global Technology Company LLC Computer assisted analysis of tomographic mammography data
US20040148266A1 (en) * 2003-01-29 2004-07-29 Forman George Henry Feature selection method and apparatus
US20050049913A1 (en) * 2003-07-11 2005-03-03 Huddleston David E. Method and apparatus for automated feature selection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WEBB A.R.: "statistical pattern recognition - second edition" 2002, JOHN WILEY AND SONS , XP002444702 *sections 9.1 and 9.2* *

Also Published As

Publication number Publication date
WO2007072256A3 (en) 2007-10-25

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06832022

Country of ref document: EP

Kind code of ref document: A2