US8971614B2 - Extracting object edges from images - Google Patents
- Publication number
- US8971614B2 (application US13/894,276)
- Authority
- US
- United States
- Prior art keywords
- image
- edge
- images
- human observer
- visual indicator
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/143—Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
-
- G06T7/0085—
-
- G06T7/0087—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
Definitions
- This disclosure relates to extracting object edges from images.
- Detecting object contours can be a key step to object recognition. See Biederman, I. (1987), “Recognition-by-components: A theory of human image understanding”, Psychological Review, 94(2), 115-147, doi:10.1037/0033-295X.94.2.115; Biederman, I., & Ju, G. (1988), “Surface versus edge-based determinants of visual recognition”, Cognitive Psychology, 20(1), 38-64, doi:10.1016/0010-0285(88)90024-2; DeCarlo, D.
- a computation in visual cortex may be the extraction of object contours, where the first stage of processing is commonly attributed to V1 simple cells.
- the brain's ability to finely discriminate edges from non-edges therefore may depend on information encoded by local oriented cell populations.
- Raising thresholds or applying an expansive output nonlinearity can sharpen tuning curves to an arbitrary degree, but may not be an effective strategy from an edge-detection perspective because the underlying linear filtering operation may not be able to distinguish properly aligned low contrast edges from misaligned high contrast ones (or a multitude of contrast non-edge structures). This weakness may not be remedied by output thresholding.
- edge/contour detection algorithms may exploit the Gestalt principle of “good continuation” or related principles to improve detection performance. See Choe, Y., & Miikkulainen, R. (1998), “Self-organization and segmentation in a laterally connected orientation map of spiking neurons”, Neurocomputing, 21(1-3), 139-158, doi:10.1016/S0925-2312(98)00040-X; Elder, J. H., & Zucker, S. W. (1998), “Local scale control for edge detection and blur estimation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(7), 699-716, doi:10.1109/34.689301; Grossberg, S., & Williamson, J. R.
- Measurements needed for contour extraction may lie in a butterfly-shaped “association field” centered on a reference edge that reflects contour continuity principles, see Field, D. J., Hayes, A., & Hess, R. F. (1993), “Contour integration by the human visual system: evidence for a local ‘association field’”, Vision Research, 33(2), 173-193, with an inhibitory region orthogonal to the edge, see FIG. 1; Geisler, W. S., Perry, J. S., Super, B. J., & Gallogly, D. P.
- Identifying a set of image measurements that are most useful for contour extraction can be a crucial step, but may leave open the question as to how those measurements should be algorithmically combined to detect contours in natural images.
- a priori (e.g. geometric) models of edge/contour structure can provide important insights, but may face challenges, such as including the multiscale structure of natural object boundaries, lighting inhomogeneities, partial occlusions, disappearing local contrast, and optical effects such as blur from limited depth of field. All of these complexities, and others known and unknown, may in principle be treated as noise sources that randomly perturb filter values in the vicinity of a candidate edge, suggesting that a probabilistic, population-based approach to edge detection may be most appropriate. See Dollar, P., Tu, Z., & Belongie, S.
- Bayesian inference has had successes in explaining behavior in sensory and motor tasks. See Fiser, J., Berkes, P., Orbán, G., & Lengyel, M. (2010), “Statistically optimal perception and learning: from behavior to neural representations”, Trends in Cognitive Sciences, 14(3), 119-130, doi:10.1016/j.tics.2010.01.003; Kording, K. P., & Wolpert, D. M.
- Non-transitory, tangible, computer-readable storage media may contain a program of instructions that cause a computer system running the program of instructions to elicit from a human observer ground truth data useful in automatically detecting one or more features in images.
- the elicitation may include presenting to a human observer an image that has a visual indicator in it, the visual indicator having a location and orientation with respect to the image; asking the human observer to judge whether a particular image feature is present in the image at the location and orientation indicated by the visual indicator; receiving input from the human observer indicative of whether the particular image feature is present at the location and orientation indicated by the visual indicator; storing the input received from the human observer as part of the human-labeled ground truth data; and repeating the process described above one or more times in connection with a visual indicator that has a different location or orientation with respect to the image or that uses a different image.
- the stored human-labeled ground truth data may have a content that is useful in automatically detecting one or more features in other images.
- the asking may include asking the human observer to rate their degree of certainty that the particular image feature is present in the image at the location and orientation indicated by the visual indicator.
- the receiving may include receiving input from the human observer indicative of their degree of certainty that the particular image feature is present in the image at the location and orientation indicated by the visual indicator.
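- The presenting, asking, receiving, and storing steps above can be sketched as a simple loop. This is only an illustration under stated assumptions: the names `Trial`, `Label`, `collect_labels`, and `ask_observer`, and the 1-to-5 certainty scale, are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trial:
    image_id: str           # which image is presented
    x: float                # indicator location (pixels)
    y: float
    orientation_deg: float  # indicator orientation


@dataclass
class Label:
    trial: Trial
    certainty: int  # assumed scale: 1 = surely absent ... 5 = surely present


def collect_labels(trials: List[Trial],
                   ask_observer: Callable[[Trial], int]) -> List[Label]:
    """Present each indicator, ask for a certainty rating, store the response."""
    ground_truth = []
    for trial in trials:
        rating = ask_observer(trial)  # presenting + asking + receiving
        if not 1 <= rating <= 5:
            raise ValueError("rating outside the assumed 1-5 scale")
        ground_truth.append(Label(trial, rating))  # storing
    return ground_truth
```

- In an interactive setting, `ask_observer` would display the image with its indicator and block until the observer responds; here it is any callable, which keeps the loop testable.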
- the visual indicator may indicate a size of the feature.
- the visual indicator may indicate a direction of motion of the particular image feature during a sequence of video images.
- Multiple image sequences with their associated visual indicators may be classified into multiple classifications. Each classification may only have image sequences with the same numeric, measured, image characteristic.
- the presenting, asking, receiving, and storing steps may be performed in connection with a predetermined number of image sequences and their visual indicators from each of the classifications.
- the numeric, measured, image characteristics may collectively form a systematic sampling of a value range of a numeric characteristic or a combination of numeric characteristics.
- Multiple images paired with visual indicators may be classified into multiple classifications, each classification only having images with the same numeric, measured, image characteristic.
- the presenting, asking, receiving, and storing steps may be performed in connection with a predetermined number of images and their visual indicators from each of the classifications.
- the particular feature may be a curve, junction, or a compound feature consisting of any combination of the following: edges, curves, and junctions in a specified spatial relationship.
- the common, numeric, measured, image characteristics may collectively form a systematic sampling of a value range of a numeric characteristic or a combination of numeric characteristics.
- At least one numeric, measured, image characteristic may have a numeric range.
- the numeric range may be sufficiently narrow as to result in substantial decorrelation of other numeric, measured, image characteristics.
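- One way to picture the classification and sampling described above is to bin image/indicator pairs by a measured numeric characteristic and draw a fixed number from each narrow bin. The sketch below is a hedged illustration: the function name `stratified_sample`, the bin centers, and the half-width are hypothetical choices, not values taken from the disclosure.

```python
import random
from collections import defaultdict


def stratified_sample(items, bin_centers, half_width, per_bin, seed=0):
    """Group (item_id, value) pairs into narrow bins and sample from each.

    An item joins the first bin whose center lies within `half_width`
    of its measured value; items outside every bin are ignored.
    Returns {bin_center: [item_id, ...]} with at most `per_bin` ids each.
    """
    bins = defaultdict(list)
    for item_id, value in items:
        for center in bin_centers:
            if abs(value - center) <= half_width:
                bins[center].append(item_id)
                break
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return {c: rng.sample(bins[c], min(per_bin, len(bins[c])))
            for c in bin_centers}
```

- Keeping `half_width` small corresponds to the narrow numeric range mentioned above; the number of bins and their spacing determine how systematically the value range is covered.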
- one or more other numeric, measured image characteristics of the images may be determined within each of the classifications that the human observer classified similarly during the receiving step, that are substantially statistically independent of each other, and that provide substantial information relevant to the presence of the particular image feature in the other images.
- the particular image feature may be an edge.
- the classification scheme is modified dynamically in the course of collecting human observer responses to concentrate human labeling effort within those classifications or to develop new classifications, where data is most needed to accurately estimate feature probability based on the collected human responses up to that point.
- the visual indicator may indicate a region with defined ends.
- the asking the human observer may include asking the human observer to specify whether an edge is present in the image that enters the indicated region at one end, remains within the indicated region over the entire length of the region, and exits the region at the opposite end.
- a ground truth data acquisition system may include a computer data processing system that includes at least one computer data hardware processor and other computer hardware components that, collectively, elicit from a human observer ground truth data in connection with one or more images that is useful in automatically detecting one or more features in other images.
- the system may implement any of the steps and functions that have been described herein.
- FIG. 1 illustrates tangential vs. orthogonal regions surrounding a candidate edge.
- FIG. 2A illustrates an oriented linear filter kernel.
- FIG. 2B illustrates a log pdf of filter responses measured at all locations and orientations in an image database.
- FIG. 2C illustrates example image patches at 3 linear response levels measured at a reference location indicated by the red box. The marked red pixel within the red box indicates the bright side of the edge.
- FIG. 2D illustrates a probability of an edge for a given linear response.
- FIG. 3B illustrates plots of 5 parameters used to fit Poisson-smoothed likelihoods as a function of reference filter contrast.
- FIG. 3C illustrates examples of on-edge likelihood functions generated from a parametric model at a range of reference filter values, with Poisson-smoothed data shown superimposed in thin black lines for 5 cases for which labeled data was actually collected.
- FIG. 4B illustrates weighted average ranks over contrast levels for all neighboring filters, inverted so tall columns indicate more information.
- FIG. 4C illustrates position and orientation of most informative filters in the orthogonal region shown relative to a reference location.
- FIG. 5 illustrates distribution of mean absolute pairwise correlation (MAPC) scores for ~1.3 million 6-wise combinations of the most informative filters.
- FIGS. 6A-6D illustrate orientation and position tuning of the local edge probability (LEP) calculated for each of ~3,400 filter sets tested.
- FIGS. 7A-7C illustrate a set of 6 neighboring filters finally chosen for local edge probability computation.
- FIGS. 8A-8C illustrate linear response vs. local edge probability.
- FIG. 9 illustrates local edge probability computation at two locations with same linear score but very different LEPs.
- FIG. 10 illustrates results of applying the probabilistic edge detection algorithm to natural images.
- FIG. 11 is an example of a ground truth data acquisition system.
- FIG. 12 is an example of a non-transitory, tangible, computer-readable storage media containing a program of instructions.
- Bayes rule may be used to calculate edge probability at a given location/orientation in an image based on a surrounding filter population. Beginning with a set of ~100 filters, a subset may be culled out that are maximally informative about edges, and minimally correlated to allow factorization of the joint on- and off-edge likelihood functions.
- Features of this approach may include an efficient method for ground-truth edge labeling by humans, with an emphasis on achieving class-conditional independence of filters in the vicinity of an edge.
- the resulting population-based edge detector may have zero parameters, may calculate edge probability based on a sum of surrounding filter influences, may be much more sharply tuned than underlying linear filters, and may effectively capture fine-scale edge structure in natural scenes.
- An approach to edge detection may be taken that depends on class conditional independence (CCI) within a chosen filter set (that is, independence of the filter responses both when an edge is present and when one is absent). If/when the CCI assumption is satisfied, (see Jacobs, R. A., 1995, Methods for combining experts' probability assessments. Neural Computation, 7, 867-888) the on- and off-edge likelihood functions, consolidated in the denominator of Equation 2 below, can be factored into products of single-filter likelihood functions, and then rewritten in terms of a sum of log likelihood (LL) ratio terms, as illustrated by Equation 3 below.
- each LL ratio term can be expressed and visualized as a function of a single filter value r i , making explicit the information that a filter at one location in an image carries about the presence of an edge at another; and (3) the positive and negative evidence from filters surrounding an edge (captured by these LL ratios) can be combined linearly in the overall edge probability calculation, and is thus a very simple calculation.
- $$P(\text{edge} \mid r_1, r_2, \ldots, r_N) = \frac{P(r_1, r_2, \ldots, r_N \mid \text{edge})\, P(\text{edge})}{P(r_1, r_2, \ldots, r_N \mid \text{edge})\, P(\text{edge}) + P(r_1, r_2, \ldots, r_N \mid \overline{\text{edge}})\, P(\overline{\text{edge}})}$$
- Equation 3. Under the assumption of class-conditional independence among the N filters, the likelihoods in Equation 3 can be factored and rewritten in terms of a sum of N log-likelihood ratio terms, which in turn functions as the argument to a sigmoid function:
- the overall edge probability computation can be expressed as a sum of influences from a set of surrounding filters that is then run through a sigmoid function.
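- The sum-then-sigmoid form can be checked numerically. Under class-conditional independence, Bayes rule gives P(edge | r) = 1 / (1 + exp(-z)), where z is the log prior odds plus the sum of the per-filter log-likelihood ratios. The sketch below is an illustration only: the function name `edge_probability` and its arguments are hypothetical, and the likelihood values are assumed to have been looked up elsewhere.

```python
import math


def edge_probability(on_likelihoods, off_likelihoods, prior_edge):
    """Combine per-filter likelihoods into an edge probability.

    on_likelihoods[i]  = P(r_i | edge)      for filter i
    off_likelihoods[i] = P(r_i | not edge)  for filter i
    prior_edge         = P(edge), strictly between 0 and 1
    """
    # Start from the log prior odds.
    log_odds = math.log(prior_edge / (1.0 - prior_edge))
    # Each filter contributes one additive log-likelihood-ratio term.
    for p_on, p_off in zip(on_likelihoods, off_likelihoods):
        log_odds += math.log(p_on / p_off)
    # The sigmoid converts summed evidence back into a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))
```

- For two filters with on-edge likelihoods 0.6 and 0.2, off-edge likelihoods 0.1 and 0.4, and a prior of 0.1, the direct Bayes calculation is 0.012 / (0.012 + 0.036) = 0.25, which the function reproduces.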
- a modified version of Bayes rule conditioned on the value of a reference filter r ref , evaluated at the location/orientation where the edge probability is being calculated, may be used in the Results section to reduce higher-order statistical dependencies among the other contributing filters (see text for details and references):
- RGB images were converted to the following three independent components using the method of Hyvarinen, A. (1999), “Fast and robust fixed-point algorithms for independent component analysis”, IEEE Transactions on Neural Networks, 10(3), 626-634, doi:10.1109/72.761722, trained on a random sample of 1.5 million pixels from the Berkeley Segmentation database:
- the components O1, O2, and O3 roughly corresponded to red-green, blue-yellow, and luminance channels, respectively. In this paper we used only O3, the luminance channel.
- Chernoff information may be used as a measure of distance between on-edge and off-edge likelihood distributions for a given filter (Konishi, S., Yuille, A. L., Coughlan, J. M., & Zhu, S. C. (2003), “Statistical Edge Detection: Learning and Evaluating Edge Cues”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(1), 57-74, doi:http://doi.ieeecomputersociety.org/10.1109/TPAMI.2003.1159946):
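- For two discrete distributions p and q over the same bins, the standard definition of Chernoff information is C(p, q) = -min over λ in [0, 1] of log Σ p_i^λ q_i^(1-λ). The sketch below evaluates this by a simple grid search over λ; the function name, the grid resolution, and the use of histograms as inputs are assumptions made for illustration, not details from the disclosure.

```python
import math


def chernoff_information(p, q, steps=1000):
    """Chernoff information (in nats) between two histograms p and q.

    p and q are sequences of bin probabilities over the same bins,
    each summing to 1. The minimum over lambda is approximated on a
    uniform grid of `steps + 1` points in [0, 1].
    """
    best = 0.0
    for k in range(steps + 1):
        lam = k / steps
        total = sum((pi ** lam) * (qi ** (1.0 - lam))
                    for pi, qi in zip(p, q) if pi > 0 and qi > 0)
        if total == 0.0:
            return math.inf  # the two distributions share no support
        best = max(best, -math.log(total))
    return best
```

- Identical distributions give a value of 0, and the value grows as the on-edge and off-edge histograms overlap less, which is why it can serve as a ranking score for filter informativeness.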
- kernel density estimation may be used, where each instance of a filter's value was spread along the x-axis of its likelihood function with a smoothing “kernel”.
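- A minimal sketch of kernel density estimation follows. The Gaussian kernel, the bandwidth value, and the function name `kernel_density` are assumptions chosen for illustration; the disclosure does not specify them here.

```python
import math


def kernel_density(samples, grid, bandwidth):
    """Estimate a density on `grid` by spreading each sample with a kernel.

    Every sample contributes a Gaussian bump of standard deviation
    `bandwidth` centered on its value; the bumps are averaged.
    """
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                       for s in samples)
            for x in grid]
```

- A wider bandwidth gives smoother likelihood functions at the cost of blurring fine structure, so the choice depends on how many labeled samples fall in each condition.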
- 450 images were used from the COREL database including a variety of indoor and outdoor scenes. Only the luminance channel was used (see Methods). Luminance images were convolved at a single scale with an oriented spatial difference operator as shown in FIG. 2A .
- the 5×2 pixel filter was applied at 16 orientations in even steps of 22.5 degrees.
- the center of rotation was the center of the shaded pixel.
- Filter responses were rectified at 0 and normalized to lie in the range r i ∈ [0,1].
- pixel values off the grid were determined by bilinear interpolation.
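- The sampling and rectification steps above can be sketched as follows. Bilinear interpolation is standard; the normalization by an assumed maximum raw response (`max_raw`) and all function names are hypothetical, since the text does not say how the [0,1] range was obtained.

```python
import math

# 16 orientations in even 22.5-degree steps, as stated in the text.
ORIENTATIONS_DEG = [k * 22.5 for k in range(16)]


def bilinear(image, x, y):
    """Sample `image` (a list of rows) at fractional column x and row y.

    Coordinates must lie inside the image; neighbors are clamped at the
    right and bottom borders.
    """
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def rectify_and_normalize(raw, max_raw):
    """Rectify at 0, then scale by an assumed maximum raw response."""
    return min(max(raw, 0.0), max_raw) / max_raw
```

- A rotated filter would call `bilinear` once per kernel tap after rotating the tap offsets about the filter's reference pixel.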
- the pdf of the filter's response at all locations and orientations in the database is shown in FIG. 2B .
- the filter had a mean response of 0.012 (out of a maximum of 1), and a roughly exponential fall off over most of the range so that, for example, the probability of measuring a filter value near 0.6 was 100,000 times lower than the probability of measuring a value near 0.
- FIGS. 2A-2D illustrate a linear filter, its statistics, and its use in ground truth labeling.
- FIG. 2A illustrates an oriented linear filter kernel. Convolution results were rectified at 0 to obtain the filter response r i . The pixel that denotes the location of the filter is marked by red shading.
- FIG. 2B illustrates a log pdf of filter responses measured at all locations and orientations in the database.
- FIG. 2C illustrates example image patches at 3 linear response levels measured at the reference location (red rectangle).
- edge probability was measured at the reference location conditioned on r ref , the filter value computed at the reference location itself (i.e., in the red box).
- edges were scored and scores were averaged within each bin. The result is plotted in FIG. 2D (red data) along with a sigmoidal fit (black solid curve).
- a general case is considered in which multiple filters surrounding a reference location would be used, in addition to r ref , to calculate the edge probability at the reference location (Equation 3).
- Multiple strategies, described in the following, were used to narrow down the large population of filters surrounding a reference location to a subset that is as CCI as possible.
- Because a goal was to include only the most informative filters in the chosen filter set, but also to avoid measuring the informativeness of large numbers of filters that would later be rejected for failing to meet the CCI criteria, the steps taken to minimize filter dependencies and maximize filter informativeness were interleaved so as to reduce overall computational effort.
- Neighboring filter responses in natural images may exhibit higher-order correlations that stem from the fact that nearby points in the world are often part of the same texture and/or subject to the same illumination or contrast conditions. These regional effects may induce a particular type of higher-order dependency between nearby filters, in which a strong response in one filter predicts a higher response variance in other filters (Karklin, Y., & Lewicki, M. (2003), “Learning higher-order structures in natural images”, Network: Computation in Neural Systems, 14(3), 483-499, doi:10.1088/0954-898X/14/3/306; Parra, L., Spence, C., & Sajda, P.
- a secondary advantage of “slicing up” and separately collecting the likelihood functions at a range of r ref values, beyond its effect of de-correlating surrounding filters, is that the approach greatly increases the amount of true on-edge data from which all the on-edge likelihood functions are constructed.
- a parametric representation allowed the evaluation of filter likelihood functions at arbitrary reference filter contrasts, i.e. not limited to the 5 discrete values of r ref at which labeled data was actually collected.
- Example fits of the on-edge and off-edge distributions at 2 contrast levels are shown in FIG. 3A for the filter rotated 45° relative to the reference location.
- the parameters of the fits were plotted at the 5 reference contrast levels for which labeled data was collected.
- a piecewise cubic Hermite interpolating polynomial was then fit through each of the 5 parameter plots for both the on-edge and off-edge distributions for each filter. Plots of the 5 spline-fit functions are shown in FIG. 3B for a filter rotated 22.5° relative to the reference location.
- the spline fits allowed the parameters of the on-edge and off-edge likelihood distributions to be generated for any value of reference filter contrast ( FIG. 3C ).
- FIGS. 3A-3C illustrate modeling likelihood functions of neighboring filters.
- FIG. 3C illustrates examples of on-edge likelihood functions generated from the parametric model at a range of reference filter values, with the smoothed data shown superimposed in thin black lines for the 5 cases for which labeled data was actually collected (red curves).
- Kernel-smoothed likelihood functions are shown for a neighboring filter as thin lines in FIG. 3A (black curves for on-edge, magenta for off-edge), along with parametric fits shown as superimposed thick lines and dots.
- FIG. 4 illustrates selecting informative filters.
- FIG. 4B illustrates weighted average ranks over contrast levels for all neighboring filters, inverted so tall columns indicate more information. Top 30% of the 112 filters are marked in red.
- FIG. 4C illustrates position and orientation of the most informative filters in the orthogonal region, shown relative to the reference location.
- the low MAPC score for the filter set that would ultimately be chosen is marked with a red triangle, while the average MAPC value over all 6-wise filter combinations is marked with a green square.
- 6×6 pairwise correlation matrices for the two marked cases are shown as insets. For the average case, a single representative filter set was chosen.
- the 3,362 filter sets with correlation scores in the lowest 0.25% of the MAPC distribution (lower red tail, including the red triangle case) were set aside for performance testing on labeled natural edges.
- FIG. 5 illustrates distribution of mean absolute pairwise correlation (MAPC) scores for ~1.3 million 6-wise combinations of the most informative filters.
- 6×6 pairwise correlation matrices are shown at upper right for 2 cases: the red triangle corresponds to a filter set with one of the lowest correlation scores; this set was eventually used in the edge detection algorithm; the green square shows a case with an average MAPC score. The least inter-correlated 0.25% of filter sets (left tail of distribution, shaded red) were carried forward for further processing.
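- A MAPC score for one candidate filter set can be sketched as the mean of the absolute Pearson correlations over all filter pairs. The function names and the layout of `responses` (one list of responses per filter, measured at the same image locations) are assumptions for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mapc(responses):
    """Mean absolute pairwise correlation across a set of filters.

    responses[i] holds filter i's responses at a common list of
    image locations; a 6-filter set yields 15 pairs.
    """
    pairs = list(combinations(range(len(responses)), 2))
    return sum(abs(pearson(responses[i], responses[j]))
               for i, j in pairs) / len(pairs)
```

- As a consistency check on the counts quoted above, and assuming the top 30% of 112 filters is rounded to 34 filters, `math.comb(34, 6)` equals 1,344,904, which matches "~1.3 million", and 0.25% of that is about 3,362.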
- the ~3,000 image patches in the database that had been labeled as containing edges were presented to each edge detector at 16 orientations (at the reference position) and 7 positions (at the reference orientation), and tuning curves were generated. Examples of tuning curves for the filter set that would eventually be chosen are shown in FIG. 6 at 5 levels of contrast.
- Full width at half maximum (FWHM) scores were extracted from each of the ~3,000 tuning curves for each filter set, and the scores were averaged and histogrammed (FIG. 6C-D).
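- A hedged sketch of extracting a FWHM score from one sampled tuning curve is shown below. It assumes a single-peaked curve, measures the half maximum from a zero baseline, uses linear interpolation between samples, and ignores the wrap-around of orientation; none of these choices is stated in the disclosure, and the function name is hypothetical.

```python
def fwhm(xs, ys):
    """Full width at half maximum of a single-peaked sampled curve.

    xs must be ascending; ys are the curve values at xs. If the curve
    never drops below half maximum on one side, the end of the sampled
    range is used for that side.
    """
    peak = max(range(len(ys)), key=lambda i: ys[i])
    half = ys[peak] / 2.0

    left = xs[0]
    for i in range(peak, 0, -1):  # walk left from the peak
        if ys[i - 1] < half <= ys[i]:
            t = (half - ys[i - 1]) / (ys[i] - ys[i - 1])
            left = xs[i - 1] + t * (xs[i] - xs[i - 1])
            break

    right = xs[-1]
    for i in range(peak, len(ys) - 1):  # walk right from the peak
        if ys[i + 1] < half <= ys[i]:
            t = (ys[i] - half) / (ys[i] - ys[i + 1])
            right = xs[i] + t * (xs[i + 1] - xs[i])
            break

    return right - left
```

- Averaging this score over all labeled edges for a filter set gives one entry in a histogram of tuning widths.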
- FIGS. 6A-6D illustrate orientation and position tuning of the local edge probability (LEP) calculated for each of the ~3,400 filter sets tested.
- FIG. 6A illustrates example orientation tuning curves for the chosen filter set at 5 values of r ref . Averages for each reference value are shown as thick colored lines. The inset shows the response at the preferred orientation at 5 different levels of contrast.
- FIG. 6B illustrates a histogram of orientation tuning widths: for each tested filter set, tuning curves were generated for each of the ~3,000 human-labeled edges in the database, full width at half maximum (FWHM) values were calculated for each tuning curve, the results were averaged, and the average tuning width for that filter set was entered into the histogram. The orientation tuning score of the chosen filter set is marked by a red triangle.
- FIG. 6C illustrates positional tuning curves covering 3 pixels above and below the reference position.
- FIG. 6D illustrates distribution of average FWHM values for the positional tuning curves. Tuning score for the chosen filter set is again marked by a red triangle, and the tuning for a linear filter at the reference location is marked by a green square.
- FIGS. 7A-C illustrate the set of 6 neighboring filters finally chosen for the local edge probability computation.
- FIG. 7C illustrates likelihood ratios (i.e. ratio of red and blue curves in B) for each filter.
- the final filter set is depicted in FIG. 7A , along with its on-edge and off-edge likelihood functions ( FIG. 7B ) and likelihood ratios ( FIG. 7C ) conditioned on a reference filter value of 0.3.
- Likelihood functions and ratios at higher and lower values of r ref were similar in form, but were pushed towards higher or lower ends of the r i range, respectively.
- FIGS. 8A-C illustrate linear response vs. local edge probability.
- FIG. 8B illustrates image patches corresponding to marked examples in A, shown with their corresponding LEP scores. Note the much higher LEP scores, and edge probability, in top vs. bottom row.
- Red line in A the scatter plot in A with LEP scores over the 80th percentile
- red line segments in the left panel the scatter plot in A
- blue line in A the locations below the 20th percentile
- Red lines are generally well aligned with object edges whereas most blue lines are misplaced or misoriented.
- FIG. 8A shows a scatter plot of linear reference responses vs. calculated edge probability at all positions and orientations in the image shown in FIG. 8C .
- image patches were identified at the 10th (blue dots) and 90th (red dots) percentiles of the LEP range in 5 evenly spaced bins along the r ref axis.
- the image patches are shown in FIG. 8B , along with their corresponding LEP scores.
- FIG. 8A were presumptive “bad edges” and were labeled with blue line segments in FIG. 8C (right frame).
- the upper cutoff of 0.38 on the linear axis was chosen because at that linear score, the edge probability reached 50% ( FIG. 8A ), so that the visual distinction between “good” and “bad” edges within any given linear bin above that value would necessarily begin to fade.
- FIG. 8B red-labeled edges were much more likely to be properly aligned and positioned relative to actual object edges, whereas blue edges were typically misaligned by a pixel or two, and/or misoriented.
- FIGS. 9A-9C illustrate local edge probability computation at 2 locations with same linear score but very different LEPs.
- FIG. 9B illustrates log likelihood ratio curves, and values marked with red and blue symbols for the 6 neighboring filters applied to the upper and lower image patches, respectively.
- FIG. 9C illustrates log likelihood ratios shown as bar heights. Resulting LEP values are shown above and below the image patches in A.
- FIG. 10 illustrates results of applying the algorithm to natural images. Maximum value of local edge probability across all orientations is shown at each pixel as the grey level. PbCanny results were generated with scale parameter of 1.
- the local edge probability was computed at every pixel position and orientation in the luminance channel of entire images, and the maximum LEP value over all orientations was displayed as each pixel's greyscale value (scaled between 0 and 255, with darker pixels indicating higher edge probability).
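- The display step can be sketched as a per-pixel maximum over orientations followed by an inverted greyscale mapping. The linear mapping 255 × (1 - LEP) and the function name are assumptions; the text only states that values were scaled between 0 and 255 with darker pixels indicating higher probability.

```python
def lep_to_grey(lep_by_orientation):
    """Collapse per-orientation LEP maps into one greyscale image.

    lep_by_orientation[o][row][col] is the local edge probability in
    [0, 1] for orientation o. Returns grey[row][col] in 0..255, where
    darker (smaller) values mean higher edge probability.
    """
    rows = len(lep_by_orientation[0])
    cols = len(lep_by_orientation[0][0])
    grey = []
    for r in range(rows):
        grey.append([
            round(255 * (1.0 - max(m[r][c] for m in lep_by_orientation)))
            for c in range(cols)
        ])
    return grey
```

- With 16 orientations, `lep_by_orientation` would hold 16 maps of identical size, one per orientation of the reference filter.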
- the overall edge detection algorithm is referred to as rm* (for “-r-esponse based on -m-ultiple *riented filters”).
- Example images are shown in FIG. 10 , in comparison to a graded Canny-like algorithm (PbCanny) developed at UC Berkeley (Martin, D. R., Fowlkes, C. C., & Malik, J.
- Human labelers may also reject certain classes of strong edges based on a perceived lack of importance, for example the stripes produced by window shades, or the large number of ‘uninteresting’ strong edges contained within textures. Given that unimportant or uninteresting strong edges may constitute a large fraction of all strong edges in natural images, these labeling choices can dilute edge probability at the strong-response end of the on-edge likelihood distribution. Taken together, these effects can produce a substantial rearrangement of the probability density in the on-edge distribution along the detector's response axis, mostly pushing towards the low end of the response range. The greater overlap of the on- and off-edge distributions that this rearrangement causes leads to the appearance of reduced edge detection performance in a precision-recall analysis. Presumably for reasons such as these, benchmarking scores, and the rankings they generate on specific images, often seem visually unintuitive.
- the probabilistic approach to edge detection described here can likely be adapted to other types of visual features.
- the constraint that the probability calculation be expressible in terms of sums of positive and negative interactions among nearby cells, tied to the CCI assumption, means that the process outlined here, whether applied to edges or other features, may only be a first stage in a multi-stage process. Nonetheless, the ability to break a complex natural feature-extraction process into a first quasi-linear stage where cue independence roughly holds, followed by additional processing stages where bona fide nonlinear interactions can occur, has the advantage of modularity, and seems likely to simplify the overall computational scheme.
- Image data may be loaded into a ground truth data acquisition system.
- the image data may be representative of an image and may have originated from any of several sources, such as a camera, document scanner, medical imaging device, satellite imager, and/or another type of spatial sensor array.
- the image data may be digitized versions of hand drawn pictures, or artificial images such as produced by computer graphics programs or computer-based graphical design tools.
- the image data may be arrays of data that have a spatial interpretation and include meaningful regions separated by boundaries.
- the image may be an individual, still image or part of a series of images that form a video. Other images in the video may be processed in the same way.
- the image data may comprise values at multiple locations in the image, such as at each pixel in the image.
- the image data may contain one or more channels of information, such as RGB (color) images or multispectral images.
- the algorithms that are discussed below refer to edge-detection in a single channel extracted from the image, such as the intensity channel, or some other channel which could be a raw channel (e.g. the red value in an RGB image) or a combination or transformation of original image channels, such as a red-green opponent channel or hue or saturation.
- FIG. 11 is an example of a ground truth data acquisition system 1101 .
- the ground truth data acquisition system 1101 may include a data processing system 1103 containing one or more computer data hardware processors 1105 and other computer hardware components 1107 , such as one or more tangible memories (e.g., random access memories (RAMs), read-only memories (ROMs), and/or programmable read only memories (PROMS)), tangible storage devices (e.g., hard disk drives, CD/DVD drives, and/or flash memories), system buses, video processing components, network communication components, input/output ports, and/or user interface devices (e.g., keyboards, pointing devices, displays, microphones, sound reproduction systems, and/or touch screens).
- The data processing system 1103 may be configured to perform one, all, or any combination of the functions and processes that have been described above and/or in the claims below.
- The data processing system 1103 may be a desktop computer or a portable computer, such as a laptop computer, a notebook computer, a tablet computer, a PDA, a smartphone, or part of a larger system, such as a vehicle, appliance, and/or telephone.
- The data processing system may include one or more computers at the same or different locations. When at different locations, the computers may be configured to communicate with one another through a wired and/or wireless network communication system.
- The data processing system 1103 may include software (e.g., one or more operating systems, device drivers, application programs, and/or communication programs).
- The software includes programming instructions and may include associated data and libraries.
- The programming instructions are configured to implement one or more algorithms that implement one or more of the functions of the data processing system 1103, as recited herein.
- The description of each function that is performed by the data processing system 1103 also constitutes a description of the algorithm(s) that performs that function.
- The software may be stored on or in one or more non-transitory, tangible storage devices, such as one or more hard disk drives, CDs, DVDs, and/or flash memories.
- The software may be in source code and/or object code format.
- Associated data may be stored in any type of volatile and/or non-volatile memory.
- The software may be loaded into a non-transitory memory and executed by one or more processors.
- FIG. 12 is an example of a non-transitory, tangible, computer-readable storage medium 1201 containing a program of instructions 1203.
- The program of instructions may cause the data processing system 1103 to perform the functions of the data processing system 1103 as described above, when loaded in and run by the data processing system 1103.
- The visual indicator used in FIG. 2 to identify an oriented edge was a red rectangle.
- Alternatives include lines of different colors, thicknesses, or levels of transparency.
- An alternative to solid lines could be blinking lines or moving (marquee) dashed lines.
- The pixels corresponding to the indicated area might also be modulated, such as by modulating their brightness.
- The length of the indicated edge could have been indicated by arrows perpendicular to the edge pointing to the ends of the putative edge location, or by other types of length markers. Shapes other than a rectangle could have been used, such as a flared rectangle, wider at the ends than in the middle, or a bulbous rectangle, wider in the middle than at the ends.
- The ends of the indicated area could also have been omitted.
- Visual indicators for other types of features would be tailored to the shape of the feature.
- For example, a curved box could be used for human labeling of curves, or an L-shaped area could be indicated for judging the presence of an L-junction.
- The more precise the indicator, that is, the less tolerance it allows for changes in position, orientation, width, length, etc., the more accurate the labeling data will generally be.
- Indicators could also include a visual cue indicating feature type in case multiple feature types that have the same general shape and size are being labeled simultaneously. For example, a different indicator appearance may be used to distinguish object edges vs. shadow edges, or sharp edges vs. blurry edges, or the boundaries of animals vs. their backgrounds vs. all other types of objects and their backgrounds.
- The response scheme for human observers could be binary (“edge present” vs. “edge absent”), or have any number of graded values indicating different levels of perceived edge strength or edge probability. Human responses could be given verbally, or through keypresses, or by clicking on an on-screen response panel.
- Images could be presented with unlimited time for labeling, or could be speeded, so that labels must be entered within a certain time window.
- The scheme used to classify images based on numerical measurements prior to presenting them to human observers can vary. Classification can be based on a single measurement or on multiple image measurements in combination, such as a combination of nearby filter values. Alternatively, the measurement(s) can be arbitrary functions of one or more filter values or other more general computations on images.
- The classification scheme need not use explicit value ranges for classification based on numeric variables, but may instead choose a sample of images whose numeric classification values fit some distributional criterion, such as approximating a uniform distribution over the space of classification variable values.
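One way to read the distributional criterion described above is as a stratified draw: partition the range of the classification variable into equal-width bins and take up to the same number of images from each bin, so that the selected sample is spread roughly evenly over the value range even when the raw values are heavily skewed. The following Python sketch illustrates that idea. The function name `sample_uniform_over_bins`, the equal-width binning, and the parameters `num_bins`, `per_bin`, and `seed` are all hypothetical choices made for this example, not details taken from the disclosure.

```python
import random


def sample_uniform_over_bins(values, num_bins, per_bin, seed=0):
    """Return indices of items chosen so their values spread evenly over bins.

    Hypothetical helper: equal-width binning is one assumed way to approximate
    a uniform distribution over the classification variable.
    """
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 if all values are identical.
    width = (hi - lo) / num_bins or 1.0
    bins = [[] for _ in range(num_bins)]
    for index, value in enumerate(values):
        # Clamp so that the maximum value lands in the last bin.
        bin_index = min(int((value - lo) / width), num_bins - 1)
        bins[bin_index].append(index)
    chosen = []
    for members in bins:
        # Draw at most `per_bin` items from each bin, in random order.
        rng.shuffle(members)
        chosen.extend(members[:per_bin])
    return sorted(chosen)


# Skewed measurements: most values are small, only two are large.
measurements = [0.01, 0.02, 0.03, 0.04, 0.05, 0.5, 0.95, 1.0]
selected = sample_uniform_over_bins(measurements, num_bins=2, per_bin=2)
```

In this example the lower bin holds six candidates and the upper bin holds two, yet each contributes two items to the sample, so the rare high-valued images are not swamped by the common low-valued ones.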
- Relational terms such as “first” and “second” and the like may be used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between them.
- The terms “comprises,” “comprising,” and any other variation thereof when used in connection with a list of elements in the specification or claims are intended to indicate that the list is not exclusive and that other elements may be included.
- An element preceded by an “a” or an “an” does not, without further constraints, preclude the existence of additional elements of the identical type.
Image Database and Extraction of the Luminance Channel
where Pon=P(r|edge), Poff=P(r|
Poisson Smoothing
Score given | Interpretation | Assigned edge probability |
---|---|---|
1 | Certainly no edge | 0 |
2 | Probably no edge | 0.25 |
3 | Can't tell - around 50/50 | 0.5 |
4 | Probably an edge | 0.75 |
5 | Certainly an edge | 1 |
Table 1: Labeling system used to score edges at the reference location, with the corresponding interpretation and assigned edge probability.
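The score-to-probability mapping in Table 1 can be written as a small lookup. The Python sketch below encodes the five scores and their assigned edge probabilities as given in the table. The function names and the averaging of several labels in `mean_edge_probability` are assumptions added for illustration; the table itself only defines the per-score probabilities.

```python
# Assigned edge probability for each score, as listed in Table 1.
SCORE_TO_EDGE_PROBABILITY = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


def edge_probability(score):
    """Return the assigned edge probability for a single 1-5 score."""
    if score not in SCORE_TO_EDGE_PROBABILITY:
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    return SCORE_TO_EDGE_PROBABILITY[score]


def mean_edge_probability(scores):
    """Average the assigned probabilities of several labels.

    Pooling labels by a simple mean is an assumption made for this sketch.
    """
    return sum(edge_probability(s) for s in scores) / len(scores)


single = edge_probability(4)
pooled = mean_edge_probability([1, 5, 3])
```

The mapping is linear in the score, so it is equivalent to `(score - 1) / 4`; the explicit dictionary is used here only to mirror the table row by row.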
where N=6 and ρ(r_i, r_j) is the correlation between two filters i and j over all pixel locations and orientations in the image database. The distribution of MAPC values is shown in
Claims (16)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/894,276 US8971614B2 (en) | 2012-05-14 | 2013-05-14 | Extracting object edges from images |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261646514P | 2012-05-14 | 2012-05-14 | |
US13/894,276 US8971614B2 (en) | 2012-05-14 | 2013-05-14 | Extracting object edges from images |
Publications (2)
Publication Number | Publication Date |
---|---|
US20130301910A1 US20130301910A1 (en) | 2013-11-14 |
US8971614B2 true US8971614B2 (en) | 2015-03-03 |
Family
ID=49548657
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/894,276 Expired - Fee Related US8971614B2 (en) | 2012-05-14 | 2013-05-14 | Extracting object edges from images |
Country Status (1)
Country | Link |
---|---|
US (1) | US8971614B2 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101318812B1 (en) * | 2011-07-18 | 2013-10-17 | 한양대학교 산학협력단 | Filtering method for detecting orientation of edge from image and image recognition method using thereof |
US9036888B2 (en) * | 2012-04-30 | 2015-05-19 | General Electric Company | Systems and methods for performing quality review scoring of biomarkers and image analysis methods for biological tissue |
WO2018209057A1 (en) * | 2017-05-11 | 2018-11-15 | The Research Foundation For The State University Of New York | System and method associated with predicting segmentation quality of objects in analysis of copious image data |
CN109544577B (en) * | 2018-11-27 | 2022-10-14 | 辽宁工程技术大学 | Improved straight line extraction method based on edge point grouping |
CN111062957B (en) * | 2019-10-28 | 2024-02-09 | 广西科技大学鹿山学院 | Non-classical receptive field contour detection method |
US11335037B2 (en) * | 2020-02-04 | 2022-05-17 | Adobe Inc. | Smart painting tools |
CN111461139B (en) * | 2020-03-27 | 2023-04-07 | 武汉工程大学 | Multi-target visual saliency layered detection method in complex scene |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5774591A (en) * | 1995-12-15 | 1998-06-30 | Xerox Corporation | Apparatus and method for recognizing facial expressions and facial gestures in a sequence of images |
US5802220A (en) * | 1995-12-15 | 1998-09-01 | Xerox Corporation | Apparatus and method for tracking facial motion through a sequence of images |
US6919892B1 (en) * | 2002-08-14 | 2005-07-19 | Avaworks, Incorporated | Photo realistic talking head creation system and method |
US7027054B1 (en) * | 2002-08-14 | 2006-04-11 | Avaworks, Incorporated | Do-it-yourself photo realistic talking head creation system and method |
US20090285456A1 (en) * | 2008-05-19 | 2009-11-19 | Hankyu Moon | Method and system for measuring human response to visual stimulus based on changes in facial expression |
US20100191124A1 (en) * | 2007-04-17 | 2010-07-29 | Prokoski Francine J | System and method for using three dimensional infrared imaging to provide psychological profiles of individuals |
US7958063B2 (en) * | 2004-11-11 | 2011-06-07 | Trustees Of Columbia University In The City Of New York | Methods and systems for identifying and localizing objects based on features of the objects that are mapped to a vector |
US8488863B2 (en) * | 2008-11-06 | 2013-07-16 | Los Alamos National Security, Llc | Combinational pixel-by-pixel and object-level classifying, segmenting, and agglomerating in performing quantitative image analysis that distinguishes between healthy non-cancerous and cancerous cell nuclei and delineates nuclear, cytoplasm, and stromal material objects from stained biological tissue materials |
Non-Patent Citations (6)
Title |
---|
Dollar, P. et al. 2006. Supervised Learning of Edges and Object Boundaries. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition-vol. 2, 1964-1971. IEEE Computer Society, 2006. http://portal.acm.org/citation.cfm?id=1153171.1153683. |
Geisler, W.S. et al. 2001. Edge Co-occurrence in Natural Images Predicts Contour Grouping Performance. Vision Research 41, No. 6 (Mar. 2001): 711-724. |
Konishi, S. et al. 2003. A Statistical Approach to Multi-scale Edge Detection. Image and Vision Computing 21, No. 1 (Jan. 10, 2003): 37-48. doi:10.1016/S0262-8856(02)00131-2. |
Martin, D.R. et al. 2004. Learning to Detect Natural Image Boundaries Using Local Brightness, Color, and Texture Cues. IEEE Trans. Pattern Anal. Mach. Intell. 26, No. 5 (2004): 530-549. |
Rana Ayman El Kaliouby, Mind-reading machines: automated inference of complex mental states, Jul. 2005, University of Cambridge, Cambridge, United Kingdom, pp. 185. * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10387770B2 (en) | 2015-06-10 | 2019-08-20 | Samsung Electronics Co., Ltd. | Spiking neural network with reduced memory access and reduced in-network bandwidth consumption |
CN107767387A (en) * | 2017-11-09 | 2018-03-06 | 广西科技大学 | Profile testing method based on the global modulation of changeable reception field yardstick |
CN108090492A (en) * | 2017-11-09 | 2018-05-29 | 广西科技大学 | The profile testing method inhibited based on scale clue |
CN108090492B (en) * | 2017-11-09 | 2019-12-31 | 广西科技大学 | Contour detection method based on scale clue suppression |
CN107767387B (en) * | 2017-11-09 | 2020-05-05 | 广西科技大学 | Contour detection method based on variable receptive field scale global modulation |
Also Published As
Publication number | Publication date |
---|---|
US20130301910A1 (en) | 2013-11-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8971614B2 (en) | Extracting object edges from images | |
Garcia-Lamont et al. | Segmentation of images by color features: A survey | |
US7406200B1 (en) | Method and system for finding structures in multi-dimensional spaces using image-guided clustering | |
Shang et al. | Fuzzy-rough feature selection aided support vector machines for mars image classification | |
Tso et al. | A contextual classification scheme based on MRF model with improved parameter estimation and multiscale fuzzy line process | |
US8428354B2 (en) | Image segmentation by hierarchial agglomeration of polygons using ecological statistics | |
Huang et al. | Identification of group-housed pigs based on Gabor and Local Binary Pattern features | |
Situ et al. | Malignant melanoma detection by bag-of-features classification | |
Fathi et al. | General rotation-invariant local binary patterns operator with application to blood vessel detection in retinal images | |
Su | A filter-based post-processing technique for improving homogeneity of pixel-wise classification data | |
García et al. | Supervised texture classification by integration of multiple texture methods and evaluation windows | |
Amelio et al. | An evolutionary approach for image segmentation | |
Shahkolaei et al. | Blind quality assessment metric and degradation classification for degraded document images | |
Bionda et al. | Deep autoencoders for anomaly detection in textured images using CW-SSIM | |
Da Xu et al. | Bayesian nonparametric image segmentation using a generalized Swendsen-Wang algorithm | |
Cariou et al. | Unsupervised texture segmentation/classification using 2-D autoregressive modeling and the stochastic expectation-maximization algorithm | |
Pohudina et al. | Method for identifying and counting objects | |
Sebastian et al. | Significant full reference image segmentation evaluation: a survey in remote sensing field | |
Wo et al. | A saliency detection model using aggregation degree of color and texture | |
CN113096079B (en) | Image analysis system and construction method thereof | |
Ramachandra et al. | Computing local edge probability in natural scenes from a population of oriented simple cells | |
Torrent et al. | A boosting approach for the simultaneous detection and segmentation of generic objects | |
Chamorro-Martínez et al. | A fuzzy approach for modelling visual texture properties | |
Lin et al. | A learning-based framework for supervised and unsupervised image segmentation evaluation | |
Martínez-Jiménez et al. | Perception-based fuzzy partitions for visual texture modeling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: UNIVERSITY OF SOUTHERN CALIFORNIA, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MEL, BARTLETT W.;RAMACHANDRA, CHAITHANYA A.;REEL/FRAME:031376/0454 Effective date: 20131009 |
| | AS | Assignment | Owner name: NATIONAL INSTITUTES OF HEALTH (NIH), U.S. DEPT. OF Free format text: CONFIRMATORY LICENSE;ASSIGNOR:UNIVERSITY OF SOUTHERN CALIFORNIA;REEL/FRAME:034714/0298 Effective date: 20141217 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: MICROENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: MICROENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20190303 |
| | PRDP | Patent reinstated due to the acceptance of a late maintenance fee | Effective date: 20200131 |
| | FEPP | Fee payment procedure | Free format text: SURCHARGE, PETITION TO ACCEPT PYMT AFTER EXP, UNINTENTIONAL (ORIGINAL EVENT CODE: M3558); ENTITY STATUS OF PATENT OWNER: MICROENTITY Free format text: ENTITY STATUS SET TO MICRO (ORIGINAL EVENT CODE: MICR); ENTITY STATUS OF PATENT OWNER: MICROENTITY Free format text: PETITION RELATED TO MAINTENANCE FEES FILED (ORIGINAL EVENT CODE: PMFP); ENTITY STATUS OF PATENT OWNER: MICROENTITY Free format text: PETITION RELATED TO MAINTENANCE FEES GRANTED (ORIGINAL EVENT CODE: PMFG); ENTITY STATUS OF PATENT OWNER: MICROENTITY |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, MICRO ENTITY (ORIGINAL EVENT CODE: M3551); ENTITY STATUS OF PATENT OWNER: MICROENTITY Year of fee payment: 4 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: MICROENTITY |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: MICROENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20230303 |