US20060142981A1 - Statistical modeling and performance characterization of a real-time dual camera surveillance system - Google Patents

Statistical modeling and performance characterization of a real-time dual camera surveillance system

Info

Publication number
US20060142981A1
US20060142981A1 (application US11/360,800, US36080006A)
Authority
US
United States
Prior art keywords
camera
module
space
applying
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/360,800
Inventor
Michael Greiffenhagen
Visvanathan Ramesh
Dorin Comaniciu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US11/360,800 (US20060142981A1)
Publication of US20060142981A1
Priority to US11/484,994 (published as US20070019073A1)
Legal status: Abandoned

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/695 Control of camera direction for changing a field of view, e.g. pan, tilt or based on tracking of objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/77 Determining position or orientation of objects or cameras using statistical methods
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person
    • G06T2207/30232 Surveillance
    • G06T2207/30241 Trajectory
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00 Burglar, theft or intruder alarms
    • G08B13/18 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189 Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194 Passive radiation detection systems using image scanning and comparing systems
    • G08B13/196 Passive radiation detection systems using image scanning and comparing systems using television cameras
    • G08B13/19602 Image analysis to detect motion of the intruder, e.g. by frame subtraction
    • G08B13/19604 Image analysis to detect motion of the intruder involving reference image or background adaptation with time to compensate for changing conditions, e.g. reference image update on detection of light level change
    • G08B13/19608 Tracking movement of a target, e.g. by detecting an object predefined as a target, using target direction and or velocity to predict its new position
    • G08B13/19617 Surveillance camera constructional details
    • G08B13/19626 Surveillance camera constructional details: optical details, e.g. lenses, mirrors or multiple lenses
    • G08B13/19628 Optical details of wide angled cameras and camera groups, e.g. omni-directional cameras, fish eye, single units having multiple cameras achieving a wide angle view
    • G08B13/19639 Details of the system layout
    • G08B13/19641 Multiple cameras having overlapping views on a single scene
    • G08B13/19643 Multiple cameras having overlapping views on a single scene wherein the cameras play different roles, e.g. different resolution, different camera type, master-slave camera
    • G08B13/19647 Systems specially adapted for intrusion detection in or around a vehicle

Definitions

  • the system preferably includes five functional modules: calibration, illumination-invariant measure computation at each pixel, indexing functions to select sectors of interest for hypothesis generation, statistical estimation of person parameters (e.g., foot location estimation), and foveal camera control parameter estimation.
  • a sensor 100, for example an omnidirectional camera, records a scene 105, which preferably is recorded as a color image; the scene 105 is sent to input 110 as R̂(x,y), Ĝ(x,y), B̂(x,y).
  • the sensor is also subject to sensor noise 106 which will become part of the input 110 .
  • the input 110 is transformed 115 (T: R³ → R²), typically to compute an illumination-invariant measure r̂_c(x,y), ĝ_c(x,y) 120.
  • the statistical model for the distribution of the invariant measure is influenced by the sensor noise model and the transformation T(.).
  • the invariant measure mean B_o(x,y) = (r_b(x,y), g_b(x,y)) and covariance matrix Σ_Bo(x,y) are computed at each pixel (x,y) from several samples of R(x,y), G(x,y), B(x,y) for the reference image 121 of the static scene.
  • the d̂²(x,y) image 130 is obtained by computing the Mahalanobis distance 125 between the current image data values r̂_c(x,y), ĝ_c(x,y) and the reference image data B_o(x,y).
  • This distance image is used as input to two indexing functions P1() 135 and P2() 140.
  • P1() 135 discards radial lines θ by choosing hysteresis thresholding parameters 136 that satisfy a given combination of probability of false alarm and misdetection values, passing the results 137 to P2() 140.
  • P2() 140 discards segments along the radial lines in the same manner, by choosing hysteresis thresholding parameters 138.
  • the result is a set of regions with a high probability of significant change 141.
  • the method employs a full-blown statistical estimation technique 145 that uses the 3D model information 146, camera geometry information 147, and priors 148 (including object shape and 3D location) to estimate the number of objects and their positions 150.
  • the method preferably estimates the control parameters 155 for the foveal camera based on the location estimates and uncertainties. Accordingly, the foveal camera is directed by the control parameters and hysteresis thresholding parameters, for example, a misdetection threshold.
  • a background adaptation module 111 generalizes the system to cover outdoor and hybrid illumination situations (indoor plus outdoor illumination) as well as slowly varying changes in the static background scene; for this, the present invention incorporates a scheme described in "Adaptive background mixture models for real-time tracking", Chris Stauffer and W. E. L. Grimson (Proceedings of the CVPR conference, 1999), incorporated herein by reference. It can be shown qualitatively that the statistics for background pixels can be approximated by a Gamma distribution, and these statistics are stable within a given time window. In the present invention the background adaptation module is fused with the system without changing the entire analysis and algorithm.
  • the result of the Stauffer-Grimson approach is re-mapped pixelwise in block 112, following the transform described below, to obtain a d̂² value 130 (see eqn. 7) for each pixel. The resulting new distance image can be used as input to the indexing function 135.
  • the output of the background adaptation module 111 is also used to update the static background statistics, as shown in block 121 .
  • the distribution of the new distance measurement at each pixel is also chi-square distributed; the only difference is that the degrees of freedom rise from two to three.
  • the analysis remains the same, and the thresholds are derived as described below. This is an illustration of how different modules can be fused into an existing framework without changing the statistical analysis. After reading the present invention, formulation of these additional modules will be within the purview of one of ordinary skill in the art.
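  • By way of illustration only, the following Python sketch shows how such per-pixel thresholds could be derived from the chi-square model; the target false-alarm probability and the use of SciPy are assumptions of the sketch, not values or tools prescribed by the present disclosure.

```python
# Minimal sketch: per-pixel change thresholds from the chi-square model.
# Under the static background model, d^2 is approximately chi-square with
# 2 degrees of freedom; fusing the background adaptation module raises the
# degrees of freedom to 3.  The false-alarm probability is a placeholder.
from scipy.stats import chi2

def pixel_threshold(p_false_alarm=1e-3, dof=2):
    """Threshold T such that P(d^2 > T | background) = p_false_alarm."""
    return chi2.ppf(1.0 - p_false_alarm, df=dof)

print(pixel_threshold(dof=2))  # static background model
print(pixel_threshold(dof=3))  # with the background adaptation module fused in
```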
  • let α and β be the foveal camera 210 control parameters for the tilt and pan angles, respectively, and let D_p be the projected real-world distance between the foveal camera 210 and the person 220.
  • This step is the module that takes as input the current color image (R̂(x,y), Ĝ(x,y), B̂(x,y)), normalizes it to obtain (r̂_c(x,y), ĝ_c(x,y)), and compares it with the background statistical model (B_o(x,y), Σ_Bo(x,y)) to produce an illumination-invariant measure image d̂²(x,y).
  • This section illustrates the derivation of the distribution of d̂²(x,y) given that the input image measurements R̂, Ĝ, and B̂ are Gaussian with means R, G, B and identical standard deviation σ.
  • the illumination prior assumption 116 is that the scene contains multiple light sources with the same spectral distribution but no constraint on individual intensities.
  • the method employs a shadow invariant representation of the color data.
  • the invariant representation is according to G. Wyszecki and W. S. Stiles, "Color Science: Concepts and Methods, Quantitative Data and Formulae," John Wiley & Sons, 1982, incorporated herein by reference. Accordingly, let S ≡ R+G+B.
  • the second-order statistics σ_r̂r̂², σ_ĝĝ², and σ_r̂ĝ² are determined offline for an entire OmniCam 205 frame, e.g., for each point or pixel on the image plane 207, and they vary spatially. Note that in the normalized space the covariance matrix for each pixel is different: bright regions in the covariance image correspond to regions with high variance in the normalized image, and these regions correspond to dark regions in RGB space.
  • a method according to the present invention calculates the test statistic, i.e., the Mahalanobis distance d², that provides a normalized distance measure of how far a current pixel is from the background.
  • let μ̂_b be the vector of the mean r_b and mean g_b at a certain background position (the mean b_b is redundant due to normalization), and let μ̂_c be the corresponding vector of the current image pixel.
  • under the background hypothesis, d̂² is approximately χ²-distributed with two degrees of freedom.
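  • A minimal NumPy sketch of this illumination-invariant measure is given below; the array shapes and variable names are assumptions made for illustration, and the per-pixel background statistics are taken as given (see the calibration discussion above).

```python
# Illustrative computation of the illumination-invariant measure d^2(x, y):
# normalize RGB to (r, g) = (R/S, G/S) with S = R + G + B, then take the
# Mahalanobis distance to the per-pixel background mean and covariance.
import numpy as np

def normalized_rg(img):
    """img: HxWx3 float RGB -> HxWx2 normalized chromaticity (r, g)."""
    s = img.sum(axis=2, keepdims=True)
    s[s == 0] = 1.0                      # guard against division by zero
    return img[..., :2] / s

def mahalanobis_d2(img, bg_mean, bg_cov):
    """
    img:     HxWx3 current color frame
    bg_mean: HxWx2 per-pixel mean of (r_b, g_b) for the reference scene
    bg_cov:  HxWx2x2 per-pixel covariance of the normalized measure
    returns: HxW image of d^2 (approximately chi-square under background)
    """
    diff = normalized_rg(img) - bg_mean                  # HxWx2
    cov_inv = np.linalg.inv(bg_cov)                      # per-pixel 2x2 inverse
    return np.einsum("hwi,hwij,hwj->hw", diff, cov_inv, diff)
```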
  • the method identifies sectored segments in the image that potentially contain people of interest.
  • the method defines two index functions P1() and P2() that are applied sequentially as shown in FIG. 1.
  • P1() and P2() are projection operations. For instance, define d̂²(R, θ) as the change detection measure image in polar coordinates with coordinate system origin at the omni-image center p(x_c, y_c).
  • P1() is chosen to be the projection along radial lines to obtain M̂_θ, the test statistic that can be used to identify changes along a given direction θ.
  • This test statistic is justified by the fact that the object projection is approximated by a line-set (approximated as an ellipse) whose major axis passes through the omni-image center, with a given length distribution that is a function of the radial foot position coordinates of the person in the omni-image.
  • This section derives the expressions for the probabilities of false alarm and misdetection at this step as a function of the input distributions for d̂²(R, θ), the prior distribution for the expected fraction of the pixels along a given radial line belonging to the object, and the noncentrality parameter of d̂²(R, θ) in object locations.
  • let r_m be the total number of pixels along a radial line L_θ through (x_c, y_c), and let k be the expected number of object pixels along this line.
  • the distribution of k can be derived from the projection model and the 3D prior models for person height, size, and position described previously.
  • the distribution of the cumulative measure is: Background: M̂_θ ~ χ²_2rm(0), i.e., a central chi-square with 2r_m degrees of freedom. (8)
  • the method can solve for an upper threshold T_u similarly by evaluating the distribution under the object hypothesis above. Note that k is a function of H_p, R_f, and c. Therefore, the illustrative method would need to know the distributions of H_p, R_f, and c to solve for T_u. Rather than make assumptions about the distribution of the non-centrality parameter c, the method uses a LUT T_u(x_m) generated by simulations instead.
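  • The following sketch illustrates one way this threshold selection could be simulated; the person priors (the ranges for k and for the noncentrality c) are placeholders chosen for illustration and are not the priors of the present disclosure.

```python
# Hedged sketch of the P1() threshold choice.  Under the background
# hypothesis the cumulative measure M_theta over a radial line of r_m pixels
# is central chi-square with 2*r_m degrees of freedom (eqn. 8), which gives
# a lower threshold for a target false-alarm rate.  The upper threshold T_u
# is tabulated by simulation because the noncentrality depends on k, which
# in turn depends on the person priors; the priors below are placeholders.
import numpy as np
from scipy.stats import chi2, ncx2

def lower_threshold(r_m, p_false_alarm=1e-3):
    """T_l such that P(M_theta > T_l | background) = p_false_alarm."""
    return chi2.ppf(1.0 - p_false_alarm, df=2 * r_m)

def upper_threshold_by_simulation(r_m, p_miss=0.01, n_sim=20000, seed=0):
    """Simulate M_theta under the object hypothesis and return the p_miss
    quantile as T_u; k and the per-pixel noncentrality c use assumed priors."""
    rng = np.random.default_rng(seed)
    k = rng.integers(5, max(6, r_m // 2), size=n_sim)   # object pixels on the line
    c = rng.uniform(4.0, 25.0, size=n_sim)              # noncentrality per pixel
    m = ncx2.rvs(df=2 * r_m, nc=k * c, random_state=rng)
    return np.quantile(m, p_miss)
```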
  • the derivation of the distribution of the test statistic and the choice of the thresholds are directly analogous to the step above.
  • the illustrative method has now derived the distributions of the d̂² image measurements and has narrowed the hypotheses for person locations and attributes.
  • the method performs a Bayes estimation of person locations and attributes. This step uses the likelihood model L(d̂² | hypothesis) that accounts for the preceding processing steps.
  • the present embodiment uses the fact that the probability of occlusion of a person is small to assert that the probability of a sector containing multiple people is small.
  • the center angle θ_f of a given sector would in this instance provide the estimate of the major axis of the ellipse corresponding to the person. It is then sufficient to estimate the foot location of the person along the radial line corresponding to θ_f.
  • the center angle θ_f of the sector defines the estimate for the angular component of the foot position.
  • the illustrative method approximates θ̂_f as normally distributed with unknown mean θ_f and variance σ_f. The θ_f's are estimated as the center positions of the angular sectors given by P1().
  • the standard deviation of a given estimate can be determined by assuming that the width of the angular sector gives the 99% confidence interval. Alternatively, this estimation can be obtained through sampling techniques.
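  • As a small numeric illustration (assuming a normal estimate and SciPy's quantile function), the conversion from sector width to standard deviation could look like this; the 0.1 rad width is an arbitrary example value.

```python
# If the angular sector width w is treated as a 99% confidence interval of a
# normal estimate, the implied standard deviation is w / (2 * z), where z is
# the 99.5th standard-normal percentile (about 2.576).
from scipy.stats import norm

def sigma_from_sector_width(width_rad, confidence=0.99):
    z = norm.ppf(0.5 + confidence / 2.0)
    return width_rad / (2.0 * z)

print(sigma_from_sector_width(0.1))   # a 0.1 rad sector -> sigma of about 0.019 rad
```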
  • an estimate of the uncertainty in the foot position r_f is made.
  • the method provides pdf's up to the latest step in the algorithm. At this point it is affordable to simulate the distribution of r_f and generate σ_r̂f² via perturbation analysis, since only a few estimates with known distributions are involved in a few operations.
  • the method can approximate r̂_f as Gaussian distributed with unknown mean r_f and variance σ_r̂f².
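  • A hedged sketch of such a perturbation analysis follows; the mapping foot_radius() is a stand-in invented for illustration (the disclosure's formulas 1 through 4 and the omni-camera geometry would be used in practice), and the means and standard deviations are arbitrary example values.

```python
# Perturbation (Monte-Carlo) estimate of the variance of r_f: draw the few
# input estimates from their Gaussian distributions, push them through the
# estimation function, and take the empirical mean and variance of r_f.
import numpy as np

def foot_radius(r_image_px, mirror_height, person_height):
    """Placeholder mapping from measured quantities to the foot radius r_f."""
    return r_image_px * mirror_height / (mirror_height - person_height)

def perturbation_stats(means, sigmas, n=10000, seed=0):
    rng = np.random.default_rng(seed)
    samples = [rng.normal(m, s, size=n) for m, s in zip(means, sigmas)]
    r_f = foot_radius(*samples)
    return r_f.mean(), r_f.var(ddof=1)

mean_rf, var_rf = perturbation_stats(means=(120.0, 2.8, 1.75),
                                     sigmas=(1.5, 0.01, 0.08))
```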
  • the method can apply formulas 1 through 4 above to estimate the 3D distances R_p and D_p, and the foveal camera control parameters tilt α, pan β, and zoom factor z.
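  • Since formulas 1 through 4 are not reproduced here, the following is only a schematic reconstruction of how pan and tilt follow from an estimated ground-plane foot position; the symbols and the simple planar geometry are assumptions of the sketch.

```python
# Schematic pan/tilt computation: pan (beta) is the azimuth of the person
# relative to the foveal camera, and tilt (alpha) points the optical axis at
# the head, i.e. at the person's height above the foot position.
import math

def pan_tilt(foot_xy, cam_xy, cam_height, person_height):
    """foot_xy, cam_xy: ground-plane coordinates in meters."""
    dx, dy = foot_xy[0] - cam_xy[0], foot_xy[1] - cam_xy[1]
    d_p = math.hypot(dx, dy)                               # projected distance D_p
    pan = math.atan2(dy, dx)                               # beta
    tilt = math.atan2(cam_height - person_height, d_p)     # alpha, downward positive
    return pan, tilt, d_p
```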
  • FIGS. 4 and 5 illustrate how uncertainties in the 3D radial distance R_p influence the foveal camera control parameters.
  • r̂_m, r̂_p, Ĥ_o, Ĥ_p, R̂_h, Ĥ_f, and D̂_c are Gaussian random variables with true unknown means r_m, r_p, H_o, H_p, R_h, H_f, and D_c, and variances σ_r̂m², σ_r̂p², σ_Ĥo², σ_Ĥp², σ_R̂h², σ_Ĥf², and σ_D̂c², respectively (all estimated in the calibration phase).
  • the method derives the horizontal and vertical angles of view for the foveal camera, which map directly to the zoom parameter z.
  • FIGS. 4 and 5 show the geometric relationships for the vertical case. The following equation provides the vertical angle of view.
  • the factor f_v solves ∫_0^(f_v/2) N(0,1) dx ≥ (x_z/2)%, given a user-specified confidence percentile x_z that the head is displayed in the foveal frame. Similar derivations apply for the horizontal case.
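  • A hedged numeric sketch of this adaptive zoom choice follows; the head offset, its standard deviation, and the distance are arbitrary example values, and the exact handling of the one-sided versus two-sided margin is an assumption of the sketch.

```python
# The half angle of view is widened by (f_v / 2) standard deviations of the
# predicted head offset so that the head stays inside the foveal frame with
# user-specified confidence x_z (in percent); f_v = 2 * Phi^-1(0.5 + x_z/200).
import math
from scipy.stats import norm

def vertical_angle_of_view(head_offset, sigma_offset, distance, x_z=99.0):
    """Return the vertical angle of view (radians) for confidence x_z percent."""
    f_v = 2.0 * norm.ppf(0.5 + (x_z / 2.0) / 100.0)
    margin = head_offset + (f_v / 2.0) * sigma_offset
    return 2.0 * math.atan2(margin, distance)

delta_v = vertical_angle_of_view(head_offset=0.15, sigma_offset=0.05,
                                 distance=4.0, x_z=99.0)
# delta_v then maps, via the lens calibration, to the zoom parameter z.
```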
  • the method verifies the correctness of the theoretical expressions and approximations through extensive simulations; only plots validating the expressions for illumination normalization (eqn. 5) and for the foveal camera control parameters (eqns. 13, 14) are shown. This validation assumes correctness of the underlying statistical models. Validation of the models on real data is discussed below.
  • the correctness of the models is verified by comparing ground truth values against module estimates for mean and variance of the running system.
  • the following is an illustration of an embodiment of the present invention: eight positions P1-P8 are marked, having different radial distances and pan angles. Positions and test persons were chosen to cover a range of positions, illumination, and contrast.
  • the table for the final foveal camera control parameters is for one person. Ground truth values for the means were taken by measuring the tilt angle α and pan angle β by hand, and are compared against the corresponding means of the system measurements estimated from 100 trials per position and person. The variances calculated from the system estimates for the pan and tilt angles are compared against the average of the corresponding variance estimates calculated based on the analysis.
  • the present invention is reliable in terms of detection and zooming over long-term experiments within the operational limits denoted by the outer line of the upper right contour plot.
  • the setup of the system influences precision globally and locally.
  • Preferred directions of low uncertainties can be used to adapt the system to user defined accuracy constraints in certain areas of the room.
  • a system for monitoring in and around an automobile uses an omni-directional sensor (a standard camera plus a mirror assembly) to obtain a global view of the surroundings within and outside the automobile.
  • the omni-camera video is used for detection and tracking of objects within and around the automobile.
  • the concept is an extension of the methods described above with respect to tracking objects within a room.
  • the system can be used to improve safety and security.
  • the video analysis system can include multiple modules.
  • a calibration module where the center of the Omni-camera image is used with height information of the ceiling of the automobile to translate image coordinates to ground plane coordinates. Where a CAD model of the automobile is available, the image coordinates can be mapped to a 3D point on the interior of the automobile using this calibration step (if the automobile is not occupied).
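  • The following is a hedged sketch of such an image-to-ground-plane mapping; the linear radius-to-zenith-angle model is a placeholder (a real system would use the mirror profile or a calibration table), and all numeric parameters are assumptions.

```python
# Map an omni-image pixel to ground-plane coordinates: take the azimuth about
# the image center, convert the pixel radius to a zenith angle through a
# calibrated curve (linear here for simplicity), and intersect the ray with
# the ground plane using the known ceiling (mirror) height.
import math

def image_to_ground(px, py, center, max_radius_px, ceiling_height,
                    max_zenith_deg=80.0):
    dx, dy = px - center[0], py - center[1]
    azimuth = math.atan2(dy, dx)
    radius = min(math.hypot(dx, dy), max_radius_px)
    zenith = math.radians(max_zenith_deg) * radius / max_radius_px
    ground_dist = ceiling_height * math.tan(zenith)
    return ground_dist * math.cos(azimuth), ground_dist * math.sin(azimuth)
```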
  • a change detection module that compares a reference map (a reference image plus the variation around the reference image) to the currently observed image to determine a pixel-based change detection measure. This is done by transforming the color video stream into a normalized color space (to deal with illumination variation). The change detection measure is used to index into a set of possible hypotheses for object positions and locations.
  • Yet another example includes a background update module for varying background conditions (e.g. gain control change, illumination changes).
  • a grouping module that takes the change detection measure along with a geometric model of the environment and the objects to identify likely object locations.
  • the method provides the areas in the image corresponding to the windows, and models people as upright cylinders when they are outside of the automobile. In the interior of the automobile, people can be modeled by generalized cylinders.
  • Still another module is an object tracking module that takes location information over time to predict object locations in the subsequent time step and to re-estimate their new locations.
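  • One common realization of such a predict/re-estimate loop, offered here only as an illustration and not mandated by the present disclosure, is a constant-velocity Kalman filter on the ground-plane position; the noise levels below are placeholders.

```python
# Constant-velocity Kalman filter: predict the next ground-plane position,
# then re-estimate it when a new measurement z = (x, y) arrives.
import numpy as np

class ConstantVelocityTracker:
    def __init__(self, x0, y0, dt=0.1, q=0.5, r=0.2):
        self.x = np.array([x0, y0, 0.0, 0.0])            # state [x, y, vx, vy]
        self.P = np.eye(4)                               # state covariance
        self.F = np.eye(4)
        self.F[0, 2] = self.F[1, 3] = dt                 # constant-velocity motion
        self.H = np.zeros((2, 4))
        self.H[0, 0] = self.H[1, 1] = 1.0                # we observe position only
        self.Q = q * np.eye(4)                           # process noise
        self.R = r * np.eye(2)                           # measurement noise

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                                # predicted position

    def update(self, z):
        y = np.asarray(z, dtype=float) - self.H @ self.x # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                                # re-estimated position
```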
  • the visualization is presented on a color liquid crystal display (LCD) panel mounted with the rear-view mirror.
  • the visualization module presents a geometrically warped view of the omni-cam video.
  • Other modules are contemplated by the present invention, including, for example, a module that determines an approaching object's potential threat, e.g., an object approaching at a higher rate of speed or from a particular direction.
  • the OmniCam is a catadioptric system that includes two parts: a parabolic mirror; and a standard CCD camera looking into it.
  • the invention is useful as a sensor for use in driver assistance. It is also useful for monitoring the surroundings when the automobile is stationary and for recording videos in the event that a person approaches the automobile and attempts to get unauthorized access.
  • the omni-camera system can be used in conjunction with a pan-tilt camera to enable the capture of a zoomed-in image of the persons involved.
  • a security system integrating vision, global positioning system (GPS) and mobile phone, can transmit the time, location and the face image of the person to a central security agency.
  • GPS global positioning system
  • the ability to present the panoramic view of the surroundings provides a method to alert the driver to potential danger in the surrounding area by visually emphasizing the region in the panoramic view.
  • due to the mounting position of the Omni-camera, looking up into a parabolic mirror located on the ceiling of the automobile (preferably centered), parts of the surroundings that are invisible to the driver are visible in the Omni-view.
  • the driver blind spot area is significantly reduced.
  • By evaluating the panoramic view it is possible to trigger warnings, e.g., if other cars enter a driver's blind spot. If automobile status information (speed, steering wheel position, predicted track) is combined with panoramic video processing it is possible to alert a driver to impending dangers or potential accidents.
  • the present invention contemplates a system and method for tracking an object.
  • the invention can be employed in varying circumstances, for example, video conferencing, distance learning, and security stations where a user can define an area of interest, thereby replacing traditional systems employing banks of monitors.
  • the present invention also contemplates an application wherein the system is used in conjunction with a data-log for recording time and location together with images of persons present.
  • the system can associate an image with recorded information upon the occurrence of an event, e.g., a person sits at a computer terminal within an area defined for surveillance.
  • the data-log portion of the system is preferably performed by a computer, where the computer records, for example, the time, location, and identity of the subject, as well as an accompanying image.
  • the present invention is not limited to the above applications; rather, the invention can be implemented in any situation where object detection, tracking, and zooming are needed.

Abstract

The present invention relates to a method for visually detecting and tracking an object through a space. The method chooses modules for restricting a search function within the space to regions with a high probability of significant change, the search function operating on images supplied by a camera. The method also derives statistical models for errors, including quantifying an indexing step performed by an indexing module, and tuning system parameters. Further, the method applies a likelihood model for candidate hypothesis evaluation and object parameter estimation for locating the object.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to computer vision systems, more particularly to a system having computationally efficient real-time object detection, tracking, and zooming capabilities.
  • 2. Description of Prior Art
  • Recent advancements in processing and sensing performances are facilitating increased development of real-time video surveillance and monitoring systems.
  • The development of computer vision systems that meet application-specific computational and accuracy needs is important to the deployment of real-life computer vision systems. Such a computer vision system has not yet been realized.
  • Past works have addressed methodological issues and have demonstrated performance analysis of components and systems. However, it is still an art to engineer systems that meet given application needs in terms of computational speed and accuracy. The trend in the art is to emphasize statistical learning methods, more particularly Bayesian methods for solving computer vision problems. However, there still exists the problem of choosing the right statistical likelihood model and the right priors to suit the needs of an application. Moreover, it is still computationally difficult to satisfy real-time application needs.
  • Sequential decomposition of the total task into manageable sub-tasks (with reasonable computational complexity) and the introduction of pruning thresholds is one method to solve the problem. Yet, this introduces additional problems because of the difficulty in approximating the probability distributions of observables at the final step of the system so that Bayesian inference is plausible. This approach to perceptual Bayesian inference is described, for example, in V. Ramesh et al., "Computer Vision Performance Characterization," RADIUS: Image Understanding for Imagery Intelligence, edited by O. Firschein and T. Strat, Morgan Kaufmann Publishers, San Francisco, 1997, incorporated herein by reference, and W. Mann and T. Binford, "Probabilities for Bayesian Networks in Vision," Proceedings of the ARPA IU Workshop, 1994, Vol. 1, pp. 633-643. The work by Ramesh et al. places an emphasis on performance characterization of a system, while Mann and Binford attempted Bayesian inference (using Bayesian networks) for visual recognition. The idea of gradual pruning of candidate hypotheses to tame the computational complexity of the estimation/classification problem has been presented by Y. Amit and D. Geman, "A computational model for visual selection," Neural Computation, 1999. However, none of the works identify how the sub-tasks (e.g., feature extraction steps) can be chosen automatically given an application context.
  • Therefore, a need exists for a method and apparatus for a computationally efficient, real-time camera surveillance system with defined computational and accuracy constraints.
  • SUMMARY OF THE INVENTION
  • The present invention relates to computer vision systems, more particularly to a system having computationally efficient real-time detection and zooming capabilities.
  • According to an embodiment of the present invention, by choosing system modules and analyzing the influence of various tuning parameters on the system, a method according to the present invention performs proper statistical inference, automatically sets control parameters, and quantifies the limits of a dual-camera real-time video surveillance system. The present invention provides a continuous high-resolution zoomed-in image of a person's head at any location in a monitored area. Preferably, omni-directional camera video is used to detect people and to precisely control a high-resolution foveal camera, which has pan, tilt and zoom capabilities. The pan and tilt parameters of the foveal camera and their uncertainties are shown to be functions of the underlying geometry, lighting conditions, background color/contrast, and the relative position of the person with respect to both cameras, as well as sensor noise and calibration errors. The uncertainty in the estimates is used to adaptively estimate the zoom parameter that guarantees, with a user-specified probability, that the detected person's face is contained and zoomed within the image.
  • The present invention includes a method for selecting intermediate transforms (components of the system), as well as processing various parameters in the system, to perform statistical inference, automatically set the control parameters, and quantify the limits of a dual-camera real-time video surveillance system.
  • Another embodiment of the present invention relates to a method for visually locating and tracking an object through a space. The method chooses modules for restricting a search function within the space to regions with a high probability of significant change, the search function operating on images supplied by a camera. The method also derives statistical models for errors, including quantifying an indexing step performed by an indexing module, and tuning system parameters. Further, the method applies a likelihood model for candidate hypothesis evaluation and object parameter estimation for locating the object.
  • The step of choosing the plurality of modules further includes applying a calibration module for determining a static scene, applying an illumination-invariant module for tracking image transformation, and applying the indexing module for selecting regions of interest for hypothesis generation. Further, the method can apply a statistical estimation module for estimating a number of objects and their positions and apply a foveal camera control module for estimating control parameters of a foveal camera based on location estimates and uncertainties.
  • Additional modules can be applied by the method, for example, a background adaptation module for detecting and tracking the object in dynamically varying illumination situations.
  • Each module is application specific, based on prior distributions for imposing restrictions on a search function. The prior distributions include, for example: an object geometry model; a camera geometry model; a camera error model; and an illumination model.
  • According to an embodiment of the present invention the camera is an omnicamera. Further, the object is tracked using a foveal camera.
  • The method derives statistical models a number of times to achieve a given probability of misdetection and false alarm rate. The method also validates a theoretical model for the monitored space to determine correctness and closeness to reality. The indexing module selects regions with a high probability of significant change, motivated by two-dimensional image priors induced by prior distributions in the space, where the space is three dimensional.
  • The method of applying a likelihood model includes estimating an uncertainty of the object's parameters for predicting a system's performance and for automating control of the system.
  • In an alternative embodiment the method can be employed in an automobile wherein the space includes an interior compartment of the automobile and/or the exterior of the automobile.
  • In yet another embodiment of the present invention, a computer program product is presented. The program product includes computer program code stored on a computer readable storage medium for detecting and tracking objects through a space. The computer program product includes computer readable program code for causing a computer to choose modules for restricting search functions within a context to regions with a high probability of significant change within the space. The computer program product also includes computer readable program code for causing a computer to derive statistical models for errors, including quantifying an indexing step, and tuning system parameters. Further included is computer readable program code for causing a computer to apply a likelihood model for candidate hypothesis evaluation and object parameter estimation within the space.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Preferred embodiments of the present invention will be described below in more detail with reference to the accompanying drawings:
  • FIG. 1 is a block diagram showing a method for tracking an object through a space according to one embodiment of the present invention;
  • FIG. 2 is an illustration of a system of cameras for tracking a person according to one embodiment of the present invention;
  • FIG. 3 is an illustration of an omni-image including the geometric relationships between elements of the system while tracking a person according to one embodiment of the present invention;
  • FIG. 4 is an illustration of how uncertainties in three dimensional radial distances-influence foveal camera control parameters; and
  • FIG. 5 is an illustration of the geometric relationship between a foveal camera and a person.
  • Throughout the diagrams, like labels in different figures denote like or corresponding elements or relationships. Further, the drawings are not to scale.
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • The present invention solves the problems existing in the prior art described above, based on the following methods.
  • System Configuration choice: According to one embodiment of the present invention, modules are chosen for an optical surveillance system by use of context, in other words, application-specific prior distributions for modules. These modules can include, for example, object geometry, camera geometry, error models and illumination models. Real-time constraints are imposed by pruning or indexing functions that restrict the search space for hypotheses. The choice of the pruning functions is derived from the application context and prior knowledge. A proper indexing function is one that simplifies computation of the probability of a false hypothesis or the probability of missing a true hypothesis as a function of the tuning constraints.
  • Statistical Modeling and Performance Characterization: According to an aspect of the present invention, the derivation of statistical models for errors at various stages in the chosen vision system configuration assists in quantifying the indexing step. The parameters are tuned to achieve a given probability of misdetection and false alarm rate. In addition, a validation of theoretical models is performed for correctness (through Monte-Carlo simulations) and closeness to reality (through real experiments).
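  • As an illustration of what such a Monte-Carlo correctness check can look like (using an assumed background RGB value, noise level, and sample count), consider the following sketch, which compares simulated d² values with the theoretical chi-square distribution.

```python
# Simulate Gaussian sensor noise on a background RGB value, push it through
# the normalization and Mahalanobis steps, and compare the empirical
# distribution of d^2 with a chi-square with 2 degrees of freedom (KS test).
import numpy as np
from scipy.stats import chi2, kstest

rng = np.random.default_rng(0)
rgb_true, sigma, n = np.array([120.0, 90.0, 60.0]), 2.0, 20000

samples = rgb_true + rng.normal(0.0, sigma, size=(n, 3))
rg = samples[:, :2] / samples.sum(axis=1, keepdims=True)   # normalized (r, g)

mean = rg.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(rg, rowvar=False))
diff = rg - mean
d2 = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)

print(kstest(d2, chi2(df=2).cdf))   # a small statistic supports the model
```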
  • Hypotheses verification and parameter estimation: Bayesian estimation is preferably used to evaluate candidate hypotheses and estimate object parameters by using a likelihood model, P(measurements | hypothesis), that takes into account the effects of the pre-processing steps and tuning parameters. In addition, the uncertainty of the estimate is derived to predict system performance.
  • One embodiment of the present invention includes a two camera surveillance system which continuously provides zoomed-in high resolution images of the face of a person present in a room. These images represent the input to higher-level vision modules, e.g., face recognition, compaction and event-logging.
  • In another embodiment, the present invention provides: 1) real-time performance on a low-cost PC, 2) a person misdetection rate of Π_m, 3) a person false-alarm rate of Π_f, and 4) adaptive zooming of the person irrespective of background scene structure with the maximal possible zoom based on the uncertainty of the estimated person attributes (e.g., location in three dimensions (3D), height, etc.), with the performance of the result characterized by the face resolution attainable in the area of the face pixel region (as a function of distance, contrast between background and object, and sensor noise variance and resolution) and the bias in the centering of the face. In addition, the method makes assumptions about scene structure, for example: the scene illumination consists of light sources with a similar spectrum (e.g., identical light sources in an office area), the number of people to be detected and tracked is bounded, and the probability of occlusion of persons (due to other persons) is small.
  • Referring to FIG. 2, to continuously monitor an entire scene, the present invention uses an omnidirectional sensor including an omni-camera 205 and a parabolic mirror 206, for example, the OmniCam of S. Nayar, “Omnidirectional Video Camera,” Proceedings of the DARPA Image Understanding Workshop, Vol. 1, pp. 235-242, 1997. This camera is preferably mounted below the ceiling 200 looking into the parabolic mirror located on the ceiling. The parabolic mirror 206 enables the camera 205 to see in all directions simultaneously. Note that FIG. 2 is an illustration of one embodiment of the present invention. Other embodiments are contemplated, including, for example, different mirror alignments, alternative camera designs (including, for example, catadioptric stereo, panoramic, omni, and foveal cameras), varying the orientation of the cameras, and multiple-camera systems. The present invention can be employed using a variety of cameras, with calibration modules (discussed below) that use a combination of real-world and image measurements to compensate for different perspectives.
  • The present invention uses omni-images to detect and estimate the precise location of a given person's foot in the room and this information is used to identify the pan, tilt and zoom settings for a high-resolution foveal camera. An omni-image is the scene as viewed from the omni-camera 205, typically in conjunction with a parabolic mirror 206, mounted preferably on the ceiling 200.
  • According to one embodiment of the present invention, the choice of the various estimation steps in the system is motivated by image priors and real-time requirements. The camera control parameters, e.g., pan and tilt, are selected based on the location estimate and its uncertainty (which is derived from statistical analysis of the estimation steps) so as to center the person's head location in the foveal image. The zoom parameter is set to the maximum value possible such that the camera view still encloses the person's head within the image.
  • The general Bayesian formulation of the person detection and location estimation problem does not suit the real-time constraints imposed by the application. In one embodiment of the present invention, this formulation is used only after a pruning step. The pruning step rules out a majority of false alarms by designing an indexing step motivated by the two dimensional (2D) image priors (region size, shape, intensity characteristics) induced by the prior distribution in the 3D scene. The prior distributions for person shape parameters, including, for example, size, height, and his/her 3D location, are reasonably simple. These priors on the person model parameters induce 2D spatially variant prior distributions in the projections, e.g., the region parameters for a given person in the image depend on the position in the image; the form of these distributions depends on the camera projection model and the 3D object shape. In addition to shape priors, image intensity/color priors can be used in the present invention.
  • Typically, a method according to the present invention does not make assumptions about the object intensity, e.g., the homogeneity of the object, since people can wear a variety of clothing and the color spectrum of the light source is therefore not constrained. However, in an alternative embodiment, in a surveillance application, the background is typically assumed to be a static scene (or a slowly time-varying scene) with known background statistics. Gaussian mixtures are typically used to approximate these densities. To handle shadowing and illumination changes, these distributions are computed after the calculation of an illumination-invariant measure from a local region in an image. The spectral components of the illuminants are assumed to have the same but unknown spectral distribution. Further, the noise model for CCD sensor noise 106 can be specified. This is typically chosen to be i.i.d. zero-mean Gaussian noise in each color band.
  • In one embodiment of the present invention, the system preferably includes five functional modules: calibration, illumination-invariant measure computation at each pixel, indexing functions to select sectors of interest for hypothesis generation, statistical estimation of person parameters (e.g., foot location estimation), and foveal camera control parameter estimation.
  • Referring to FIG. 1, a block diagram of the transformations applied to the input is shown. A sensor 100, for example, an omnidirectional camera, records a scene 105, preferably as a color image; the scene 105 is sent to input 110 as R̂(x,y), Ĝ(x,y), B̂(x,y). The sensor is also subject to sensor noise 106, which becomes part of the input 110.
  • The input 110, defined above, is transformed 115 (T: R³→R²), typically to compute an illumination-invariant measure r̂_c(x,y), ĝ_c(x,y) 120. The statistical model for the distribution of the invariant measure is influenced by the sensor noise model and the transformation T(.). The invariant measure mean (B_o(x,y) = (r_b(x,y), g_b(x,y))) and covariance matrix Σ_{B_o}(x,y) are computed at each pixel (x,y) from several samples of R(x,y), G(x,y), B(x,y) for the reference image 121 of the static scene. A change detection measure image d̂²(x,y) 130 is obtained by computing the Mahalanobis distance 125 between the current image data values r̂_c(x,y), ĝ_c(x,y) and the reference image data B_o(x,y). This distance image is used as input to two indexing functions P1( ) 135 and P2( ) 140. P1( ) 135 discards radial lines θ by choosing hysteresis thresholding parameters 136 that satisfy a given combination of probability of false alarm and miss-detection values, passing the results 137 to P2( ) 140. P2( ) 140 discards segments along the radial lines in the same manner, by choosing hysteresis thresholding parameters 138. The result is a set of regions with a high probability of significant change 141. At this point the method employs a full-blown statistical estimation technique 145 that uses the 3D model information 146, camera geometry information 147, and priors 148 (including object shape and 3D location) to estimate the number of objects and their positions 150. The method preferably estimates the control parameters 155 for the foveal camera based on the location estimates and uncertainties. Accordingly, the foveal camera is directed by the control parameters and hysteresis thresholding parameters, for example, a miss-detection threshold.
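  • By way of non-limiting example, the per-pixel normalization and change-detection measure described above can be sketched in Python with numpy (the function names and the array layout are illustrative assumptions of this sketch, not part of the disclosed system):
      import numpy as np

      def normalize_rg(img_rgb):
          """Map an HxWx3 RGB image to the illumination-invariant (r, g) space."""
          s = img_rgb.sum(axis=2, keepdims=True).astype(np.float64)
          s[s == 0] = 1.0                       # guard against division by zero
          return img_rgb[..., :2] / s           # r = R/(R+G+B), g = G/(R+G+B)

      def mahalanobis_d2(rg_current, rg_background_mean, cov_background):
          """Per-pixel Mahalanobis distance between current and reference chromaticities.

          rg_current, rg_background_mean : HxWx2 arrays
          cov_background                 : HxWx2x2 array of per-pixel covariances
          """
          diff = rg_current - rg_background_mean
          cov_inv = np.linalg.inv(cov_background)
          # d^2 = diff^T Sigma^{-1} diff, evaluated pixelwise
          return np.einsum('...i,...ij,...j->...', diff, cov_inv, diff)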
  • Additional modules are contemplated by the present invention, for example, a background adaptation module 111. To generalize the system and cover outdoor and hybrid illumination situations (indoor plus outdoor illumination) as well as slowly varying changes in the static background scene, the present invention incorporates a scheme described in “Adaptive background mixture models for real-time tracking”, Chris Stauffer, W. E. L. Grimson (Proceedings of the CVPR conference, 1999), incorporated herein by reference. It can be shown qualitatively that the statistics for background pixels can be approximated by a Gamma distribution, and that the statistics are stable within a given time window. In the present invention the background adaptation module is fused with the system, without changing the entire analysis and algorithm, by re-mapping the test statistic derived from the data so that the cumulative distribution function of the re-mapped test statistic approximates the cumulative distribution function of a Chi-square distribution. Therefore, the result of the Grimson approach is re-mapped pixelwise to obtain d̂_g² in block 112, following the transform described below. By adding d̂_g² (for each pixel) to the d̂² value 130 (see eq. 7), a new distance image is obtained. This distance image can be input to the index function 135.
  • The output of the background adaptation module 111 is also used to update the static background statistics, as shown in block 121.
  • The pixel values of the new distance measurement are also Chi-square distributed; the only difference is an increase in the degrees of freedom from two to three. The analysis remains the same, and the thresholds are derived as described below. This is an illustration of how different modules can be fused into an existing framework without changing the statistical analysis. After reading the present invention, formulation of these additional modules will be within the purview of one of ordinary skill in the art.
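  • By way of non-limiting example, the re-mapping performed in block 112 may be sketched as follows (a hedged illustration assuming numpy/scipy; the empirical-CDF matching shown here is one possible realization, not necessarily the exact transform of the preferred embodiment): the adaptive-background test statistic is passed through its empirical CDF under the background hypothesis and then through the inverse CDF of a Chi-square distribution with one degree of freedom, so that adding the result to d̂² yields a statistic that is approximately Chi-square with three degrees of freedom.
      import numpy as np
      from scipy import stats

      def remap_to_chi2(test_stat, background_samples, df=1):
          """Re-map a test statistic so its CDF approximates a chi-square CDF."""
          ref = np.sort(np.ravel(background_samples))
          # empirical CDF values, clipped away from 0 and 1 for numerical safety
          u = np.searchsorted(ref, np.ravel(test_stat), side='right') / (ref.size + 1)
          u = np.clip(u, 1e-6, 1 - 1e-6)
          return stats.chi2.ppf(u, df).reshape(np.shape(test_stat))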
  • The projection model for the two cameras is discussed below with respect to FIGS. 2 through 5. The following geometric model parameters are denoted as:
      • Ho height of OmniCam above floor (inches)
      • Hf height of foveal camera above floor (inches)
      • Hp person's height (inches)
      • Rh person's head radius (inches)
      • Rf person's foot position in world coordinates (inches)
      • Dc on floor projected distance between cameras (inches)
      • p(xc,yc) position of OmniCam center, (in omni-image) (pixel coordinates)
      • rm radius of parabolic mirror (in omni-image)(pixels)
      • rh radial distance to the person's head (in omni-image) (pixels)
      • rf radial distance to the person's foot (in omni-image) (pixels)
      • ϑ—angle between the person and the foveal camera relative to the OmniCam image center (please see FIG. 3).
      • θ—angle between the radial line corresponding to the person and the zero reference line (please see FIG. 3).
  • Capital variables denote quantities in 3D world coordinates, and lowercase variables are given in image coordinates. During the calibration step (a combination of real-world and image measurements), Ho, Hf, Dc, rm and p(xc,yc) are initialized and the corresponding standard deviations or tolerances are determined. In a preferred embodiment the calibration step is performed offline. Heights are typically measured from the floor 201 up.
  • Using the geometric features of an OmniCam 205, including a parabolic mirror, and under the hypothesis that the person 220 is standing upright, the relationships between r_f and r_h, respectively, and R_p can be shown to be:
    $$R_p = a\,H_o \quad\text{with}\quad a = \frac{2\,r_m\,r_f}{r_m^2 - r_f^2} \tag{1}$$
    $$R_p = b\,(H_o - H_p) \quad\text{with}\quad b = \frac{2\,r_m\,r_h}{r_m^2 - r_h^2} \tag{2}$$
    Let α and β be the foveal camera 210 control parameters for the tilt and pan angles, respectively. Further, let D_p be the projected real-world distance between the foveal camera 210 and the person 220. Assuming that the person's head is located approximately over his/her feet, and using basic trigonometry in FIGS. 2 and 3, it can easily be seen that D_p, α, and β are given by:
    $$D_p = \sqrt{D_c^2 + R_p^2 - 2\,D_c\,R_p\cos(\vartheta)} \tag{3}$$
    $$\tan(\alpha) = \frac{H_p - R_h - H_f}{D_p}; \qquad \sin(\beta) = \frac{R_p}{D_p}\sin(\vartheta) \tag{4}$$
    where ϑ is the angle between the person 220 and the foveal camera 210 relative to the OmniCam 205 position.
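  • By way of non-limiting example, equations (1) through (4) can be evaluated numerically as follows (an illustrative Python sketch; angles are in radians and the function names are assumptions of this sketch):
      import numpy as np

      def floor_distance_from_foot(r_f, r_m, H_o):
          """Equation (1): floor distance R_p of the person from the OmniCam axis."""
          a = 2.0 * r_m * r_f / (r_m**2 - r_f**2)
          return a * H_o

      def foveal_control_angles(R_p, D_c, vartheta, H_p, R_h, H_f):
          """Equations (3)-(4): distance to the foveal camera and tilt/pan angles."""
          D_p = np.sqrt(D_c**2 + R_p**2 - 2.0 * D_c * R_p * np.cos(vartheta))
          alpha = np.arctan((H_p - R_h - H_f) / D_p)      # tilt
          beta = np.arcsin(R_p * np.sin(vartheta) / D_p)  # pan
          return D_p, alpha, beta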
  • This step is the module that takes as input the current color image (R̂(x,y), Ĝ(x,y), B̂(x,y)), normalizes it to obtain (r̂_c(x,y), ĝ_c(x,y)), and compares it with the background statistical model (B_o(x,y), Σ_{B_o}(x,y)) to produce an illumination-invariant measure image d̂²(x,y). This section illustrates the derivation of the distribution of d̂²(x,y) given that the input image measurements R̂, Ĝ and B̂ are Gaussian with means R, G, B and identical standard deviation σ.
  • With respect to FIG. 1, the illumination prior assumption 116 is that the scene contains multiple light sources with the same spectral distribution and no constraint on individual intensities. To compensate for shadows, which are often present in the image, the method employs a shadow-invariant representation of the color data. The invariant representation is according to G. Wyszecki and W. S. Stiles, “Color Science: Concepts and Methods, Quantitative Data and Formulae,” John Wiley & Sons, 1982, incorporated herein by reference. Accordingly, let S = R + G + B. The illumination normalizing transform T: R³→R² appropriate for the method's assumptions is:
    $$r = \frac{R}{R+G+B}, \qquad g = \frac{G}{R+G+B}$$
    It can be shown that the uncertainties in the normalized estimates r̂ and ĝ depend not only on the sensor noise variance, but also on the actual true unknown values of the underlying samples (due to the non-linearities in the transformation T(.)). Based on the assumption of a moderate signal-to-noise ratio (i.e., σ << S), the method approximates (r̂, ĝ)ᵀ as having a normal distribution with pixel-dependent covariance matrix:
    $$\begin{pmatrix}\hat r \\ \hat g\end{pmatrix} = \begin{pmatrix}\dfrac{R+\eta_R}{S+\eta_R+\eta_G+\eta_B} \\ \dfrac{G+\eta_G}{S+\eta_R+\eta_G+\eta_B}\end{pmatrix} \sim N\!\left(\begin{pmatrix}r \\ g\end{pmatrix},\ \Sigma_{\hat r,\hat g}\right) \quad\text{with}\quad \Sigma_{\hat r,\hat g} = \frac{\sigma^2}{S^2}\begin{pmatrix}1-\dfrac{2R}{S}+\dfrac{3R^2}{S^2} & -\dfrac{R+G}{S}+\dfrac{3RG}{S^2} \\ -\dfrac{R+G}{S}+\dfrac{3RG}{S^2} & 1-\dfrac{2G}{S}+\dfrac{3G^2}{S^2}\end{pmatrix} \tag{5}$$
  • The values of σ²_{r̂,r̂}, σ²_{ĝ,ĝ}, and σ²_{r̂,ĝ} are determined offline for an entire OmniCam 205 frame, e.g., for each point or pixel on the image plane 207, and they vary spatially. Note that in the normalized space the covariance matrix for each pixel is different: bright regions in the covariance image correspond to regions with high variance in the normalized image. These regions correspond to dark regions in RGB space.
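  • By way of non-limiting example, the pixel-dependent covariance of equation (5) can be computed as follows (an illustrative sketch for a single pixel, assuming the true R, G, B values, or estimates thereof, and the sensor noise variance σ² are given):
      import numpy as np

      def rg_covariance(R, G, B, sigma2):
          """Equation (5): covariance of the normalized (r, g) estimates at one pixel."""
          S = float(R + G + B)
          c_rr = 1.0 - 2.0 * R / S + 3.0 * R**2 / S**2
          c_gg = 1.0 - 2.0 * G / S + 3.0 * G**2 / S**2
          c_rg = -(R + G) / S + 3.0 * R * G / S**2
          return (sigma2 / S**2) * np.array([[c_rr, c_rg],
                                             [c_rg, c_gg]])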
  • Since the covariance matrices in the normalized space are pixel-dependent, a method according to the present invention calculates the test statistic, i.e., the Mahalanobis distance d², that provides a normalized measure of how far a current pixel is from the background. Let μ̂_b be the vector of mean r_b and mean g_b at a certain background position (mean b_b is redundant, due to normalization), and μ̂_c be the corresponding vector of the current image pixel. Since
    $$\begin{pmatrix}\hat r_c - \hat r_b \\ \hat g_c - \hat g_b\end{pmatrix} \sim N\!\left(\begin{pmatrix}r_c - r_b \\ g_c - g_b\end{pmatrix},\ \Sigma_{\hat r_c,\hat g_c} + \Sigma_{\hat r_b,\hat g_b}\right) \tag{6}$$
    the method can define, for each pixel, a metric d² which corresponds to the probability that μ̂_c is a background pixel:
    $$\hat d^2 = (\hat\mu_b - \hat\mu_c)^T \left(2\,\Sigma_{\hat r_b,\hat g_b}\right)^{-1} (\hat\mu_b - \hat\mu_c) \tag{7}$$
    For background pixels, d̂² is approximately χ² distributed with two degrees of freedom. For object pixels, d̂² is non-central χ² distributed with two degrees of freedom and non-centrality parameter c.
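  • These two distributions determine the per-pixel trade-off between false alarms and misses. By way of illustration only (the decision thresholds of the preferred embodiment are applied to the projected statistics described next, not pixelwise), the error rates for a hypothetical per-pixel threshold could be evaluated with scipy as:
      from scipy import stats

      def pixel_error_rates(threshold, c):
          """False-alarm probability (central chi-square, 2 dof) and miss probability
          (non-central chi-square, 2 dof, non-centrality c) for the test d^2 > threshold."""
          p_false_alarm = stats.chi2.sf(threshold, 2)
          p_miss = stats.ncx2.cdf(threshold, 2, c)
          return p_false_alarm, p_miss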
  • To address the real-time computational requirements of the application, the method identifies sectored segments in the image that potentially contain people of interest. To perform this indexing step in a computationally efficient manner, the method defines two index functions P1( ) and P2( ) that are applied sequentially as shown in FIG. 1. Essentially, P1( ) and P2( ) are projection operations. For instance, define d̂²(R,θ) as the change detection measure image in polar coordinates with the coordinate system origin at the omni-image center p(xc,yc). Then, P1( ) is chosen to be the projection along radial lines to obtain M̂_θ, the test statistic that can be used to identify changes along a given direction θ. This test statistic is justified by the fact that the object projection is approximated by a line-set (approximated as an ellipse) whose major axis passes through the omni-image center, with a given length distribution that is a function of the radial foot position coordinates of the person in the omni-image. This section derives the expressions for the probabilities of false alarm and misdetection at this step as a function of the input distributions for d̂²(R,θ), the prior distribution for the expected fraction of the pixels along a given radial line belonging to the object, and the non-centrality parameter of d̂²(R,θ) at object locations.
  • Let L_θ^{x_c,y_c} be a radial line through p(xc,yc) parameterized by angle θ, and let M̂(θ) = Σ_r d̂_θ²(r) denote the cumulative measure of d̂² values at image positions p(θ,r), parameterized by angle θ and distance r in a polar coordinate system at p(xc,yc). Applying Canny's hysteresis thresholding technique on M̂(θ) provides the sectors of significant change, bounded by left and right angles θ_l and θ_r, respectively. Let r_m be the total number of pixels along a radial line L_θ^{x_c,y_c}, and k be the expected number of object pixels along this line. The distribution of k can be derived from the projection model and the 3D prior models for person height, size, and position described previously. The distribution of the cumulative measure is:
    $$\text{Background:}\quad \hat M_\theta \sim \chi^2_{2 r_m}(0) \tag{8}$$
    $$\text{Object:}\quad \hat M_\theta \sim \chi^2_{2(r_m-k)}(0) + \chi^2_{2k}(c) \tag{9}$$
    with c ∈ [0, ∞).
  • To obtain a false-alarm rate for false sectors equal to or less than x_f %, the method can set the lower threshold T_l so that
    $$\int_0^{T_l} \chi^2_{\hat M_\theta}(\varepsilon)\, d\varepsilon = 1 - x_f\% \tag{10}$$
    To guarantee a misdetection rate equal to or less than x_m %, the method can theoretically solve for an upper threshold T_u similarly, by evaluating the object distribution in the equation above. Note that k is a function of H_p, R_f, and c. Therefore, the illustrative method would need to know the distributions of H_p, R_f, and c to solve for T_u. Rather than make assumptions about the distribution of the non-centrality parameter c, the method instead uses a look-up table T_u(x_m) generated by simulations.
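  • By way of non-limiting example, the indexing function P1( ) can be sketched as follows (illustrative Python; the polar resampling, the hysteresis routine, and the treatment of x_f as a fraction are assumptions of this sketch, and the upper threshold would come from the simulation-generated look-up table):
      import numpy as np
      from scipy import stats

      def radial_accumulate(d2_polar):
          """P1 projection: M(theta) = sum over r of d^2(r, theta).

          d2_polar : array of shape (n_theta, r_m), the change image resampled to
                     polar coordinates about the omni-image center.
          """
          return d2_polar.sum(axis=1)

      def lower_threshold(x_f, r_m):
          """Equation (10): threshold exceeded by a background radial line
          (chi-square, 2*r_m dof) with probability at most x_f."""
          return stats.chi2.ppf(1.0 - x_f, 2 * r_m)

      def hysteresis_sectors(M_theta, t_low, t_high):
          """Keep angular sectors that stay above t_low and exceed t_high somewhere."""
          above_low = M_theta > t_low
          sectors, start = [], None
          for i, flag in enumerate(above_low):
              if flag and start is None:
                  start = i
              elif not flag and start is not None:
                  if M_theta[start:i].max() > t_high:
                      sectors.append((start, i - 1))
                  start = None
          if start is not None and M_theta[start:].max() > t_high:
              sectors.append((start, len(M_theta) - 1))
          return sectors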
  • The second index function P2( ) essentially takes as input the domain corresponding to the radial lines of interest and performs a pruning operation along the radial lines R. This is done by computing d̄_{θ_f}²(r), the integration of the values d̂²( ) along θ_f = θ + π/2 (within a finite window whose size is determined by the prior density of the minor axis of the ellipse projection), for each point r on the radial line θ. The derivation of the distribution of the test statistic and the choice of the thresholds are analogous to the step described above.
  • At this point the illustrative method has derived the distributions of the d̂² image measurements and has narrowed the hypotheses for people locations and attributes. The method then performs a Bayes estimation of person locations and attributes. This step uses the likelihood models L(d̂²|background) and L(d̂²|object) along with 2D prior models for person attributes induced by the 3D object priors P(R_p), P(H), P(θ) and P(S). The present embodiment uses the fact that the probability of occlusion of a person is small to assert that the probability of a sector containing multiple people is small. The center angle θ_f of a given sector would in this instance provide the estimate of the major axis of the ellipse corresponding to the person. It is then sufficient to estimate the foot location of the person along the radial line corresponding to θ_f. The center angle θ_f of the sector defines the estimate for the angular component of the foot position. The illustrative method approximates θ̂_f as normally distributed with unknown mean θ_f and variance σ²_{θ_f}. The θ_f's are estimated as the center positions of the angular sectors given by P1( ). The standard deviation of a given estimate can be determined by assuming that the width of the angular sector gives the 99-percentile confidence interval. Alternatively, this estimate can be obtained through sampling techniques.
  • Given the line θ_f, it is necessary to estimate the foot position of the person along this radial line. To find this estimate and the variance of the radial foot position r_f, the method chooses the best hypothesis for the foot position that minimizes the Bayes error. Let P(h_i|m) denote the posterior probability to be maximized, where h_i denotes the i-th of multiple foot-position hypotheses and m the measurements (d̄_{θ_f}²(r)), which are statistically independent; the superscript b or o denotes background or object, respectively:
    $$P(h_i \mid m) = P(h_i^b \mid m^b)\,P(h_i^o \mid m^o) = P(h_i^b \mid m^b)\bigl(1 - P(\bar h_i^o \mid m^o)\bigr) = \frac{p(m^b \mid h_i^b)\,P(h_i^b)}{p(m^b)} \cdot \frac{p(m^o) - p(m^o \mid \bar h_i^o)\,P(\bar h_i^o)}{p(m^o)} \tag{11}$$
    where p denotes the density function. P(h_i|m) becomes maximal for maximal p(m^b|h_i^b) and minimal p(m^o|h̄_i^o), so that
    $$\hat r_f = \arg\max_{r_f}\, \log\!\left(\frac{p(m^b \mid h_i^b)}{p(m^o \mid \bar h_i^o)}\right) = \arg\max_{r_f}\left(\sum_{r=r_f}^{r_h(r_f)-1} \bar d_{\theta_f}^2(r) \;-\; \sum_{r=0}^{r_f-1} \bar d_{\theta_f}^2(r) \;-\; \sum_{r=r_h(r_f)}^{r_m} \bar d_{\theta_f}^2(r)\right) \tag{12}$$
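  • A minimal sketch of the search in equation (12), as reconstructed above, follows (illustrative Python; the callable mapping a foot radius to the corresponding head radius is assumed to come from equations (1)-(2) and the person-height prior):
      import numpy as np

      def estimate_foot_radius(d2_line, head_radius_of):
          """Pick the foot radius r_f maximizing the log-likelihood-ratio score.

          d2_line        : 1-D array of integrated change values along theta_f
          head_radius_of : callable r_f -> r_h(r_f)
          """
          r_m = len(d2_line)
          best_rf, best_score = None, -np.inf
          for r_f in range(r_m):
              r_h = min(int(head_radius_of(r_f)), r_m)
              if r_h <= r_f:
                  continue
              obj = d2_line[r_f:r_h].sum()                     # hypothesized object segment
              bg = d2_line[:r_f].sum() + d2_line[r_h:].sum()   # hypothesized background
              score = obj - bg
              if score > best_score:
                  best_rf, best_score = r_f, score
          return best_rf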
  • In one embodiment of the present invention, an estimate of the uncertainty in the foot position r_f is made. The method provides pdfs up to the latest step in the algorithm. At this point it is affordable to simulate the distribution of r̂_f and generate σ²_{r̂_f} via perturbation analysis, since only a few estimates with known distributions are involved in a few operations. The method can approximate r̂_f as Gaussian distributed with unknown mean r_f and variance σ²_{r̂_f}.
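  • The perturbation analysis mentioned above can be sketched as a small Monte-Carlo loop (illustrative only; the perturbation model shown, additive Gaussian noise on the integrated change values, is an assumption of this sketch):
      import numpy as np

      def foot_radius_variance(estimator, d2_line, noise_sigma, n_trials=200, seed=0):
          """Monte-Carlo estimate of the variance of the foot-radius estimate."""
          rng = np.random.default_rng(seed)
          samples = []
          for _ in range(n_trials):
              perturbed = d2_line + rng.normal(0.0, noise_sigma, size=d2_line.shape)
              samples.append(estimator(perturbed))   # estimator returns a scalar r_f
          return np.var(samples)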
  • Once the foot position p(θ_f, r_f) is known, the method can apply formulas (1) through (4) above to estimate the 3D distances R_p and D_p, and the foveal camera control parameters tilt α, pan β and zoom factor z.
  • FIGS. 4 and 5 illustrate how uncertainties in the 3D radial distance R_p influence the foveal camera control parameters. For the following error-propagation steps the method assumes that r̂_m, r̂_p, Ĥ_o, Ĥ_p, R̂_h, Ĥ_f, and D̂_c are Gaussian random variables with true unknown means r_m, r_p, H_o, H_p, R_h, H_f, and D_c, and variances σ²_{r̂_m}, σ²_{r̂_p}, σ²_{Ĥ_o}, σ²_{Ĥ_p}, σ²_{R̂_h}, σ²_{Ĥ_f}, and σ²_{D̂_c}, respectively (all estimated in the calibration phase). The estimates and their uncertainties propagate through the geometric transformations. The method produces the final results for the uncertainties in tilt α and pan β, which are used to calculate the zoom parameter z (for more details, and derivations of σ²_{R̂_p} and σ²_{D̂_p}, see M. Greiffenhagen and V. Ramesh, “Auto-Camera-Man: Multi-Sensor Based Real-Time People Detection and Tracking System,” Technical Report, Siemens Corporate Research, Princeton, N.J., USA, November 1999):
    $$\sigma^2_{\tan\alpha} = \frac{\sigma^2_{\hat D_p}}{D_p^4}\left((H_p - R_h - H_f)^2 + \sigma^2_{\hat H_p} + \sigma^2_{\hat R_h} + \sigma^2_{\hat H_f}\right) + \frac{\sigma^2_{\hat H_p} + \sigma^2_{\hat R_h} + \sigma^2_{\hat H_f}}{D_p^2} \tag{13}$$
    $$\sigma^2_{\sin\beta} = \frac{R_p^2\,\sigma^2_{\hat\vartheta}\cos^2\vartheta}{D_p^2} + \left(\sin^2\vartheta + \sigma^2_{\hat\vartheta}\cos^2\vartheta\right)\left(\frac{R_p^2\,\sigma^2_{\hat D_p}}{D_p^4} + \frac{\sigma^2_{\hat R_p}}{D_p^2} + \frac{\sigma^2_{\hat R_p}\,\sigma^2_{\hat D_p}}{D_p^4}\right) \tag{14}$$
  • Given the uncertainties in the estimates, the method derives the horizontal and vertical angles of view for the foveal camera, γ_h and γ_v respectively, which map directly to the zoom parameter z. FIGS. 4 and 5 show the geometric relationships for the vertical case. The following equation provides the vertical angle of view:
    $$\gamma_v = 2\arctan\!\left(\frac{\hat R_h + f_v\,\sigma_{\tan\hat\alpha}\,\hat D_p}{\sqrt{\hat R_h^2 + \hat D_p'^2}}\right) \quad\text{with}\quad \hat D_p' = \frac{\hat D_p}{\cos\alpha} \tag{15}$$
    where the factor f_v solves
    $$\int_0^{f_v} N(0,1)\, d\xi = \frac{x_z}{2}\,\%$$
    given a user-specified confidence percentile x_z that the head is displayed in the foveal frame. Similar derivations apply to the horizontal case.
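  • By way of non-limiting example, equations (13) through (15), as reconstructed above, can be evaluated as follows (illustrative Python; here x_z is expressed as a fraction, e.g. 0.95, and all other symbols follow the calibration list):
      import numpy as np
      from scipy import stats

      def var_tan_alpha(D_p, H_p, R_h, H_f, s2_Dp, s2_Hp, s2_Rh, s2_Hf):
          """Equation (13): variance of tan(alpha)."""
          s2_sum = s2_Hp + s2_Rh + s2_Hf
          return (s2_Dp / D_p**4) * ((H_p - R_h - H_f)**2 + s2_sum) + s2_sum / D_p**2

      def var_sin_beta(R_p, D_p, vartheta, s2_vt, s2_Dp, s2_Rp):
          """Equation (14): variance of sin(beta)."""
          c2, s2 = np.cos(vartheta)**2, np.sin(vartheta)**2
          term = R_p**2 * s2_Dp / D_p**4 + s2_Rp / D_p**2 + s2_Rp * s2_Dp / D_p**4
          return R_p**2 * s2_vt * c2 / D_p**2 + (s2 + s2_vt * c2) * term

      def vertical_view_angle(R_h, D_p, alpha, sigma_tan_alpha, x_z):
          """Equation (15): vertical angle of view keeping the head in frame with
          (two-sided) confidence x_z; maps directly to the zoom setting."""
          f_v = stats.norm.ppf(0.5 + x_z / 2.0)
          D_p_prime = D_p / np.cos(alpha)
          return 2.0 * np.arctan((R_h + f_v * sigma_tan_alpha * D_p) /
                                 np.sqrt(R_h**2 + D_p_prime**2))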
  • The correctness of the theoretical expressions and approximations is verified through extensive simulations, for example with plots validating the expressions for illumination normalization (eqn. 5) and for the foveal camera control parameters (eqns. 13 and 14). This validation assumes correctness of the underlying statistical models. Validation of the models on real data is discussed below.
  • The correctness of the models is verified by comparing ground-truth values against module estimates for the mean and variance of the running system. In the following illustration of an embodiment of the present invention, eight positions P1-P8 are marked, having different radial distances and pan angles. Positions and test persons were chosen to simulate different positions, illumination, and contrast. The table of final foveal camera control parameters is for one person. Ground-truth values for the means were taken by measuring the tilt angle α and pan angle β by hand, and are compared against the corresponding means of the system measurements estimated from 100 trials per position and person. The variances calculated from the system estimates for pan and tilt angle are compared against the average of the corresponding variance estimates calculated based on the analysis. The comparison between system output and ground truth demonstrates the correctness of the model assumptions in the statistical modeling process (see Table 1).
    TABLE 1
    Validation: the first two rows show the predicted (σ̂²) and experimental (σ̃²)
    variances for the tilt angle, respectively; the next two rows correspond to the
    pan angle. All values are ×10⁻⁵.

    Quantity       P1     P2     P3     P4     P5     P6     P7     P8
    σ̂²_tan α̂      2.1    2.12   1.57   1.4    1.35   1.31   1.31   1.32
    σ̃²_tan α̂      2.05   2.04   1.6    1.34   1.36   1.32   1.4    1.31
    σ̂²_sin β̂      28.9   26.1   21.3   17.9   15.3   15.2   18.4   20.1
    σ̃²_sin β̂      25.9   24.1   19.5   15.1   14.9   15     18.1   19.3
  • The performance of the running system will now be discussed. The output of the foveal camera is sufficient as input for face recognition algorithms. Illustrating how the statistical analysis is used to optimize the camera setup, eqns. 13 and 14 suggest that the configuration that minimizes these uncertainties is one with a large inter-camera distance D_c and a foveal camera height H_f equal to the mean person eye-level height H_p.
  • The present invention is reliable in terms of detection and zooming over long-duration experiments within the operational limits denoted by the outer line of the upper-right contour plot.
  • The setup of the system (for example, placement of foveal camera) influences precision globally and locally. Preferred directions of low uncertainties can be used to adapt the system to user defined accuracy constraints in certain areas of the room.
  • In another embodiment of the present invention, a system for monitoring in and around an automobile is presented. The invention uses an omni-directional sensor (a standard camera plus a mirror assembly) to obtain a global view of the surroundings within and outside the automobile. The omni-camera video is used for detection and tracking of objects within and around the automobile. The concept is an extension of the methods described above with respect to tracking objects within a room. In this embodiment the system can be used to improve safety and security.
  • The video analysis system can include multiple modules. For example, a calibration module where the center of the omni-camera image is used with height information of the ceiling of the automobile to translate image coordinates to ground-plane coordinates; where a CAD model of the automobile is available, the image coordinates can be mapped to a 3D point on the interior of the automobile using this calibration step (if the automobile is not occupied). Another example is a change detection module that compares a reference map (a reference image plus the variation around the reference image) to the currently observed image map to determine a pixel-based change detection measure; this is done by transforming the color video stream into a normalized color space (to deal with illumination variation). The change detection measure is used to index into a set of possible hypotheses for object positions and locations. Yet another example is a background update module for varying background conditions (e.g., gain control changes, illumination changes). A further example is a grouping module that takes the change detection measure along with a geometric model of the environment and the objects to identify likely object locations; in the current embodiment, the method provides the areas in the image corresponding to the windows and models people as upright cylinders when they are outside of the automobile, while in the interior of the automobile people can be modeled by generalized cylinders. Still another example is an object tracking module that takes location information over time to predict object locations in the subsequent time step and to re-estimate their new locations. Preferably, the visualization is presented on a color liquid crystal display (LCD) panel mounted with the rear-view mirror. The visualization module presents geometrically warped video of the omni-cam video, which is useful for driver assistance (e.g., while the driver is backing up or changing lanes). Other modules are contemplated by the present invention including, for example, a module that determines an approaching object's potential threat, e.g., an object approaching at a higher rate of speed or from a particular direction.
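  • By way of non-limiting example, the chaining of these modules can be sketched schematically as follows (the stage names and signatures are hypothetical placeholders, not part of the disclosed system):
      from typing import Callable

      def run_pipeline(frame, calibrate: Callable, detect_changes: Callable,
                       update_background: Callable, group: Callable, track: Callable):
          """Chain the automotive video-analysis modules for one omni-image frame."""
          ground_plane_map = calibrate(frame)        # image -> ground-plane coordinates
          d2 = detect_changes(frame)                 # pixel-based change measure
          update_background(frame, d2)               # adapt reference statistics
          candidates = group(d2, ground_plane_map)   # likely object locations
          return track(candidates)                   # predict / re-estimate positions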
  • According to the automotive embodiment of the present invention, the OmniCam is a catadioptric system that includes two parts: a parabolic mirror and a standard CCD camera looking into it. The invention is useful as a sensor for driver assistance. It is also useful for monitoring the surroundings when the automobile is stationary and for recording videos in the event that a person approaches the automobile and attempts to gain unauthorized access. The omni-camera system can be used in conjunction with a pan-tilt camera to enable the capture of a zoomed-in image of the persons involved. Once a person gains unauthorized access to the automobile and an alarm is triggered, a security system integrating vision, a global positioning system (GPS) and a mobile phone can transmit the time, location and the face image of the person to a central security agency. In addition to the monitoring capability, the ability to present the panoramic view of the surroundings provides a method to alert the driver to potential danger in the surrounding area by visually emphasizing the region in the panoramic view. In addition, due to the mounting position of the omni-camera, looking up into a parabolic mirror located on the ceiling of the automobile (preferably centered), parts of the surroundings that are invisible to the driver are visible in the omni-view. Thus, the driver blind-spot area is significantly reduced. By evaluating the panoramic view it is possible to trigger warnings, e.g., if other cars enter the driver's blind spot. If automobile status information (speed, steering wheel position, predicted track) is combined with panoramic video processing, it is possible to alert a driver to impending dangers or potential accidents.
  • The present invention contemplates a system and method for tracking an object. The invention can be employed in varying circumstances, for example, video conferencing, distance learning, and security stations where a user can define an area of interest, thereby replacing traditional systems employing banks of monitors. The present invention also contemplates an application wherein the system is used in conjunction with a data-log for recording time and location together with images of persons present. In a data-log application the system can associate an image with recorded information upon the occurrence of an event, e.g., a person sitting at a computer terminal within an area defined for surveillance. The data-log portion of the system is preferably performed by a computer, where the computer records, for example, the time, location, and identity of the subject, as well as an accompanying image. The present invention is not limited to the above applications; rather, the invention can be implemented in any situation where object detection, tracking, and zooming are needed.
  • Having described preferred embodiments of the present invention having computationally efficient real-time detection and zooming capabilities, it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments of the invention disclosed which are within the scope and spirit of the invention as defined by the appended claims.

Claims (14)

1. A method for visually locating and tracking an object through a space, comprising the steps of:
choosing a plurality of modules for restricting a search function within the space to a plurality of regions with a high probability of significant change, the search function operating on images supplied by a camera;
deriving statistical models for errors, including quantifying an indexing step performed by an indexing module, and tuning system parameters; and
applying a likelihood model for candidate hypothesis evaluation and object parameters estimation for locating the object.
2. The method of claim 1, wherein the step of choosing the plurality of modules further comprises the steps of:
applying a calibration module for determining a static scene;
applying an illumination-invariant module for tracking image transformation; and
applying the indexing module for selecting regions of interest for hypothesis generation.
3. The method of claim 2, further comprising the steps of:
applying a statistical estimation module for estimating a number of objects and their positions; and
applying a foveal camera control module for estimating a plurality of control parameters of a foveal camera based on location estimates and uncertainties.
4. The method of claim 2, further comprising the step of applying a background adaptation module for detecting and tracking the object in dynamically varying illumination situations.
5. The method of claim 1, wherein each module is application specific based on a plurality of prior distributions for imposing restrictions on a search function.
6. The method of claim 5, wherein the plurality of prior distributions comprise:
an object geometry model;
a camera geometry model;
a camera error model; and
an illumination model.
7. The method of claim 1, wherein the camera is an omnicamera.
8. The method of claim 1, wherein the object is tracked using a foveal camera.
9. The method of claim 1, wherein the step of deriving statistical models is applied a plurality of times to achieve a given probability of misdetection and false alarm rate.
10. The method of claim 9, further comprising the step of validating a theoretical model for the space monitored for determining correctness and closeness to reality.
11. The method of claim 1, wherein the indexing module selects a plurality of regions with a high probability of significant change, motivated by a plurality of two dimensional image priors induced by a plurality of prior distributions in the space, wherein the space is three dimensional.
12. The method of claim 1, wherein the step of applying a likelihood model further comprises the step of estimating an uncertainty of the object's parameters for predicting a system's performance and for automating control of the system.
13. The method of claim 1, employed in an automobile wherein the space monitored comprises one of an interior compartment of the automobile and an exterior of the automobile.
14. A computer program product comprising computer program code stored on a computer readable storage medium for locating and tracking objects through a space, the computer program product comprising:
computer readable program code for causing a computer to choose a plurality of modules for restricting a search function within a context to a plurality of regions with a high probability of significant change within the space;
computer readable program code for causing a computer to derive statistical models for errors, including quantifying an indexing step, and tuning system parameters; and
computer readable program code for causing a computer to apply a likelihood model for candidate hypothesis evaluation and object parameters estimation for locating the object.
US11/360,800 2000-06-12 2006-02-23 Statistical modeling and performance characterization of a real-time dual camera surveillance system Abandoned US20060142981A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/360,800 US20060142981A1 (en) 2000-06-12 2006-02-23 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/484,994 US20070019073A1 (en) 2000-06-12 2006-07-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/592,532 US7006950B1 (en) 2000-06-12 2000-06-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/360,800 US20060142981A1 (en) 2000-06-12 2006-02-23 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/592,532 Continuation US7006950B1 (en) 2000-06-12 2000-06-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US11/484,994 Continuation US20070019073A1 (en) 2000-06-12 2006-07-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Publications (1)

Publication Number Publication Date
US20060142981A1 true US20060142981A1 (en) 2006-06-29

Family

ID=35922912

Family Applications (4)

Application Number Title Priority Date Filing Date
US09/592,532 Expired - Lifetime US7006950B1 (en) 2000-06-12 2000-06-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/112,930 Expired - Fee Related US7899209B2 (en) 2000-06-12 2005-04-22 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/360,800 Abandoned US20060142981A1 (en) 2000-06-12 2006-02-23 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/484,994 Abandoned US20070019073A1 (en) 2000-06-12 2006-07-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US09/592,532 Expired - Lifetime US7006950B1 (en) 2000-06-12 2000-06-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11/112,930 Expired - Fee Related US7899209B2 (en) 2000-06-12 2005-04-22 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Family Applications After (1)

Application Number Title Priority Date Filing Date
US11/484,994 Abandoned US20070019073A1 (en) 2000-06-12 2006-07-12 Statistical modeling and performance characterization of a real-time dual camera surveillance system

Country Status (1)

Country Link
US (4) US7006950B1 (en)

Families Citing this family (70)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10361802B1 (en) 1999-02-01 2019-07-23 Blanding Hovenweep, Llc Adaptive pattern recognition based control system and method
US8352400B2 (en) 1991-12-23 2013-01-08 Hoffberg Steven M Adaptive pattern recognition based controller apparatus and method and human-factored interface therefore
US7966078B2 (en) 1999-02-01 2011-06-21 Steven Hoffberg Network media appliance system and method
US7146260B2 (en) 2001-04-24 2006-12-05 Medius, Inc. Method and apparatus for dynamic configuration of multiprocessor system
US10298735B2 (en) 2001-04-24 2019-05-21 Northwater Intellectual Property Fund L.P. 2 Method and apparatus for dynamic configuration of a multiprocessor health data system
US7940299B2 (en) * 2001-08-09 2011-05-10 Technest Holdings, Inc. Method and apparatus for an omni-directional video surveillance system
US7657935B2 (en) * 2001-08-16 2010-02-02 The Trustees Of Columbia University In The City Of New York System and methods for detecting malicious email transmission
JP2003223633A (en) * 2002-01-29 2003-08-08 Sharp Corp Omnidirectional visual system
JP4010444B2 (en) * 2002-02-28 2007-11-21 シャープ株式会社 Omnidirectional monitoring control system, omnidirectional monitoring control method, and omnidirectional monitoring control program
US7178049B2 (en) 2002-04-24 2007-02-13 Medius, Inc. Method for multi-tasking multiple Java virtual machines in a secure environment
US7194110B2 (en) * 2002-12-18 2007-03-20 Intel Corporation Method and apparatus for tracking features in a video sequence
US20050169415A1 (en) * 2004-01-30 2005-08-04 Agere Systems Inc. Timing error recovery system
TWI253292B (en) * 2004-03-23 2006-04-11 Yu-Lin Chiang Pano camera monitoring and tracking system and method thereof
IL161082A (en) * 2004-03-25 2008-08-07 Rafael Advanced Defense Sys System and method for automatically acquiring a target with a narrow field-of-view gimbaled imaging sensor
US7593547B2 (en) * 2004-10-12 2009-09-22 Siemens Corporate Research, Inc. Video-based encroachment detection
US7337650B1 (en) 2004-11-09 2008-03-04 Medius Inc. System and method for aligning sensors on a vehicle
US20080291278A1 (en) * 2005-04-05 2008-11-27 Objectvideo, Inc. Wide-area site-based video surveillance system
US7583815B2 (en) * 2005-04-05 2009-09-01 Objectvideo Inc. Wide-area site-based video surveillance system
US7466842B2 (en) * 2005-05-20 2008-12-16 Mitsubishi Electric Research Laboratories, Inc. Modeling low frame rate videos with bayesian estimation
US20070076099A1 (en) * 2005-10-03 2007-04-05 Eyal Eshed Device and method for hybrid resolution video frames
US20080100473A1 (en) * 2006-10-25 2008-05-01 Siemens Corporate Research, Inc. Spatial-temporal Image Analysis in Vehicle Detection Systems
US20080272884A1 (en) * 2007-05-03 2008-11-06 Sybase 365, Inc. System and Method for Enhanced Threat Alerting
US9019381B2 (en) * 2008-05-09 2015-04-28 Intuvision Inc. Video tracking systems and methods employing cognitive vision
US20090296989A1 (en) 2008-06-03 2009-12-03 Siemens Corporate Research, Inc. Method for Automatic Detection and Tracking of Multiple Objects
DE102008049872A1 (en) * 2008-10-01 2010-04-29 Mobotix Ag Adjustment lock for surveillance cameras
US9358924B1 (en) 2009-05-08 2016-06-07 Eagle Harbor Holdings, Llc System and method for modeling advanced automotive safety systems
US8417490B1 (en) 2009-05-11 2013-04-09 Eagle Harbor Holdings, Llc System and method for the configuration of an automotive vehicle with modeled sensors
EP2430886B1 (en) * 2009-05-14 2012-10-31 Koninklijke Philips Electronics N.V. Method and system for controlling lighting
US8577083B2 (en) 2009-11-25 2013-11-05 Honeywell International Inc. Geolocating objects of interest in an area of interest with an imaging system
CN101719276B (en) * 2009-12-01 2015-09-02 北京中星微电子有限公司 The method and apparatus of object in a kind of detected image
US20110181716A1 (en) * 2010-01-22 2011-07-28 Crime Point, Incorporated Video surveillance enhancement facilitating real-time proactive decision making
US8385632B2 (en) * 2010-06-01 2013-02-26 Mitsubishi Electric Research Laboratories, Inc. System and method for adapting generic classifiers for object detection in particular scenes using incremental training
US9134399B2 (en) 2010-07-28 2015-09-15 International Business Machines Corporation Attribute-based person tracking across multiple cameras
US8515127B2 (en) 2010-07-28 2013-08-20 International Business Machines Corporation Multispectral detection of personal attributes for video surveillance
US8532390B2 (en) 2010-07-28 2013-09-10 International Business Machines Corporation Semantic parsing of objects in video
US10424342B2 (en) 2010-07-28 2019-09-24 International Business Machines Corporation Facilitating people search in video surveillance
WO2012048173A2 (en) 2010-10-07 2012-04-12 Siemens Corporation Multi-sensor system for high performance and reconfigurable outdoor surveillance
US9497388B2 (en) 2010-12-17 2016-11-15 Pelco, Inc. Zooming factor computation
US8448056B2 (en) * 2010-12-17 2013-05-21 Microsoft Corporation Validation analysis of human target
US8953039B2 (en) * 2011-07-01 2015-02-10 Utc Fire & Security Corporation System and method for auto-commissioning an intelligent video system
US9070285B1 (en) * 2011-07-25 2015-06-30 UtopiaCompression Corporation Passive camera based cloud detection and avoidance for aircraft systems
US9235895B2 (en) * 2011-12-13 2016-01-12 Hitachi, Ltd. Method for estimating direction of person standing still
US9082004B2 (en) 2011-12-15 2015-07-14 The Nielsen Company (Us), Llc. Methods and apparatus to capture images
US8886392B1 (en) 2011-12-21 2014-11-11 Intellectual Ventures Fund 79 Llc Methods, devices, and mediums associated with managing vehicle maintenance activities
US8704904B2 (en) 2011-12-23 2014-04-22 H4 Engineering, Inc. Portable system for high quality video recording
WO2013131036A1 (en) 2012-03-01 2013-09-06 H4 Engineering, Inc. Apparatus and method for automatic video recording
WO2013131100A1 (en) 2012-03-02 2013-09-06 H4 Engineering, Inc. Multifunction automatic video recording device
US9723192B1 (en) 2012-03-02 2017-08-01 H4 Engineering, Inc. Application dependent video recording device architecture
TWI468641B (en) * 2012-11-09 2015-01-11 Univ Nat Central Time synchronization calibration method and system for image taking and coordinate reading and delay time calculation method thereof
US10009579B2 (en) 2012-11-21 2018-06-26 Pelco, Inc. Method and system for counting people using depth sensor
US9367733B2 (en) 2012-11-21 2016-06-14 Pelco, Inc. Method and apparatus for detecting people by a surveillance system
US8769557B1 (en) 2012-12-27 2014-07-01 The Nielsen Company (Us), Llc Methods and apparatus to determine engagement levels of audience members
US9639747B2 (en) * 2013-03-15 2017-05-02 Pelco, Inc. Online learning method for people detection and counting for retail stores
JP6512793B2 (en) * 2014-11-07 2019-05-15 キヤノン株式会社 Imaging device, surveillance camera system, control method and program for imaging device
US9712828B2 (en) * 2015-05-27 2017-07-18 Indian Statistical Institute Foreground motion detection in compressed video data
US9721472B2 (en) * 2015-09-22 2017-08-01 Ford Global Technologies, Llc Formulating lane level routing plans
CN106888352B (en) * 2015-12-16 2020-12-18 中兴通讯股份有限公司 Coke pushing position determining method and device
JP6661082B2 (en) * 2016-03-30 2020-03-11 株式会社エクォス・リサーチ Image recognition device and image recognition program
GB2553570B (en) 2016-09-09 2021-05-19 Canon Kk Surveillance apparatus and surveillance method
US10462354B2 (en) * 2016-12-09 2019-10-29 Magna Electronics Inc. Vehicle control system utilizing multi-camera module
JP6984130B2 (en) * 2017-01-17 2021-12-17 オムロン株式会社 Image processing equipment, control systems, image processing equipment control methods, control programs, and recording media
CN106791701A (en) * 2017-01-20 2017-05-31 国网河北省电力公司衡水供电分公司 Power circuit positions patrol instrument
US10636173B1 (en) 2017-09-28 2020-04-28 Alarm.Com Incorporated Dynamic calibration of surveillance devices
US11012683B1 (en) 2017-09-28 2021-05-18 Alarm.Com Incorporated Dynamic calibration of surveillance devices
AU2017272325A1 (en) * 2017-12-08 2019-06-27 Canon Kabushiki Kaisha System and method of generating a composite frame
DE112018007277T5 (en) 2018-03-13 2021-01-28 Harman International Industries, Incorporated DEVICE AND METHOD FOR AUTOMATIC ERROR THRESHOLD DETECTION FOR IMAGES
US11711638B2 (en) 2020-06-29 2023-07-25 The Nielsen Company (Us), Llc Audience monitoring systems and related methods
US11860704B2 (en) 2021-08-16 2024-01-02 The Nielsen Company (Us), Llc Methods and apparatus to determine user presence
US11758223B2 (en) 2021-12-23 2023-09-12 The Nielsen Company (Us), Llc Apparatus, systems, and methods for user presence detection for audience monitoring
CN114596362B (en) * 2022-03-15 2023-03-14 云粒智慧科技有限公司 High-point camera coordinate calculation method and device, electronic equipment and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6188776B1 (en) * 1996-05-21 2001-02-13 Interval Research Corporation Principle component analysis of images for the automatic location of control points
US6188777B1 (en) * 1997-08-01 2001-02-13 Interval Research Corporation Method and apparatus for personnel detection and tracking
US6353679B1 (en) * 1998-11-03 2002-03-05 Compaq Computer Corporation Sample refinement method of multiple mode probability density estimation
US7233886B2 (en) * 2001-01-19 2007-06-19 Smartsignal Corporation Adaptive modeling of changed states in predictive condition monitoring
US7136507B2 (en) * 2003-11-17 2006-11-14 Vidient Systems, Inc. Video surveillance system with rule-based reasoning and multiple-hypothesis scoring

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5323470A (en) * 1992-05-08 1994-06-21 Atsushi Kara Method and apparatus for automatically tracking an object
US5434617A (en) * 1993-01-29 1995-07-18 Bell Communications Research, Inc. Automatic tracking camera control system
US5473369A (en) * 1993-02-25 1995-12-05 Sony Corporation Object tracking apparatus
US5574498A (en) * 1993-09-25 1996-11-12 Sony Corporation Target tracking system
US5953077A (en) * 1997-01-17 1999-09-14 Fox Sports Productions, Inc. System for displaying an object that is not visible to a camera
US20050259848A1 (en) * 2000-02-04 2005-11-24 Cernium, Inc. System for automated screening of security cameras
US6590999B1 (en) * 2000-02-14 2003-07-08 Siemens Corporate Research, Inc. Real-time tracking of non-rigid objects using mean shift
US6680745B2 (en) * 2000-11-10 2004-01-20 Perceptive Network Technologies, Inc. Videoconferencing method with tracking of face and dynamic bandwidth allocation

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100138094A1 (en) * 2008-12-02 2010-06-03 Caterpillar Inc. System and method for accident logging in an automated machine
US8473143B2 (en) * 2008-12-02 2013-06-25 Caterpillar Inc. System and method for accident logging in an automated machine
US20110178703A1 (en) * 2009-01-14 2011-07-21 Sjoerd Aben Navigation apparatus and method
US9615062B2 (en) 2010-12-30 2017-04-04 Pelco, Inc. Multi-resolution image display
JP2014527655A (en) * 2011-07-14 2014-10-16 バイエリッシェ モートーレン ウエルケ アクチエンゲゼルシャフトBayerische Motoren Werke Aktiengesellschaft Pedestrian gait recognition method and device for portable terminal
JP2015501578A (en) * 2011-10-14 2015-01-15 オムロン株式会社 Method and apparatus for projective space monitoring
US20190088011A1 (en) * 2017-09-20 2019-03-21 Boe Technology Group Co., Ltd. Method, device, terminal and system for visualization of vehicle's blind spot and a vehicle
US10573068B2 (en) * 2017-09-20 2020-02-25 Boe Technology Group Co., Ltd. Method, device, terminal and system for visualization of vehicle's blind spot and a vehicle

Also Published As

Publication number Publication date
US20100007740A1 (en) 2010-01-14
US7899209B2 (en) 2011-03-01
US7006950B1 (en) 2006-02-28
US20070019073A1 (en) 2007-01-25

Similar Documents

Publication Publication Date Title
US7006950B1 (en) Statistical modeling and performance characterization of a real-time dual camera surveillance system
Greiffenhagen et al. Statistical modeling and performance characterization of a real-time dual camera surveillance system
US11733370B2 (en) Building radar-camera surveillance system
US8599266B2 (en) Digital processing of video images
US9646212B2 (en) Methods, devices and systems for detecting objects in a video
Zhu et al. Reliable detection of overtaking vehicles using robust information fusion
US7623676B2 (en) Method and apparatus for tracking objects over a wide area using a network of stereo sensors
US9129397B2 (en) Human tracking method and apparatus using color histogram
US8189049B2 (en) Intrusion alarm video-processing device
US6628805B1 (en) Apparatus and a method for detecting motion within an image sequence
US7436887B2 (en) Method and apparatus for video frame sequence-based object tracking
US7321386B2 (en) Robust stereo-driven video-based surveillance
Boult et al. Omni-directional visual surveillance
JP2004531842A (en) Method for surveillance and monitoring systems
US20080240616A1 (en) Automatic camera calibration and geo-registration using objects that provide positional information
US20060215031A1 (en) Method and system for camera autocalibration
JP2004537790A (en) Moving object evaluation system and method
JP2004534315A (en) Method and system for monitoring moving objects
US20040141633A1 (en) Intruding object detection device using background difference method
JP2007209008A (en) Surveillance device
JP3910626B2 (en) Monitoring device
Snidaro et al. Quality-based fusion of multiple video sensors for video surveillance
KR20100013855A (en) Method for tracking moving object on multiple cameras using probabilistic camera hand-off
Abidi et al. Automatic target acquisition and tracking with cooperative fixed and PTZ video cameras
Dalka et al. Video content analysis in the urban area telemonitoring system

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION