WO2002091305A2 - Method and system, using a data-driven model for monocular face tracking - Google Patents
- Publication number
- WO2002091305A2 (PCT/US2002/014014)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- model
- tracking
- stereo data
- processor
- shape
- Prior art date
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/251—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/20—Image signal generators
- H04N13/204—Image signal generators using stereoscopic image cameras
- H04N13/239—Image signal generators using stereoscopic image cameras using two 2D image sensors having a relative position equal to or related to the interocular distance
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/189—Recording image signals; Reproducing recorded image signals
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- the present invention relates generally to the field of image processing. More particularly, the present invention relates to a method and system using a data-driven model for monocular face tracking.
- BACKGROUND OF THE INVENTION [0002]
- Monocular face tracking is the process of estimating facial motion, position, and shape based on monocular image sequences from a stationary camera.
- Monocular face tracking is a main process in many image processing systems such as a video conferencing system. For instance, in a video conferencing system, by estimating facial motion or position, the amount of facial data or information that needs to be exchanged or processed is reduced. That is, parameters related to the estimated facial motion, position, and shape can be exchanged or processed for outputting an image sequence instead of exchanging or processing large amounts of image data.
- One type of face tracking system is a face tracking system based on markers ("marker face tracking system").
- In a marker face tracking system, a user is required to wear color "markers" at known locations. The movement of the markers is thus parameterized to estimate facial position and shape.
- a disadvantage of the marker face tracking system is that it is invasive on the user. In particular, the user must place a number of color markers on varying positions of the face. Furthermore, the user must spend time putting on the markers, which adds a further complexity to using such a system.
- Another type of face tracking system is a model-based face tracking system.
- a model-based face tracking system uses a parameterized face shape model that can be used to estimate facial position and motion.
- In prior model-based face tracking systems, parameterized models are built using a manual process, e.g., by using a 3D scanner or a computer aided design (CAD) modeler.
- FIG. 1 illustrates an exemplary computing system for practicing the present invention
- FIG. 2 illustrates a flow diagram of an operation to perform monocular tracking using a data-driven model according to one embodiment
- FIG. 3 illustrates exemplary stereo input image sequences for stereo tracking to build the data-driven model of FIG. 2;
- FIG. 4 illustrates a four dimensional space of exemplary deformations learned from stereo input sequences
- FIG. 5 illustrates exemplary input image sequences for monocular tracking
- FIG. 6 illustrates a flow diagram of the operation to perform stereo tracking in FIG. 2 according to one embodiment
- FIG. 7 illustrates a flow diagram to calculate principal shape vectors in FIG. 2 according to one embodiment
- FIG. 8 illustrates a flow diagram to perform monocular tracking in FIG. 2 according to one embodiment.
- a method and system using a data-driven model for monocular face tracking are described, which provide a versatile system for tracking a three-dimensional (3D) object, e.g., a face, in an image sequence acquired using a single camera.
- stereo data based on input image sequences is obtained.
- a 3D model is built using the obtained stereo data.
- a monocular image sequence is tracked using the built 3D model.
- Principal Component Analysis is applied to the stereo data to learn, e.g., possible facial deformations, and to build a data-driven 3D model ("3D face model").
- the 3D face model can be used to approximate a generic shape (e.g., facial pose) as a linear combination of shape basis vectors based on PCA.
- a small number of shape basis vectors can be computed to build the 3D model, which provides a number of advantages. For instance, only a small number (e.g., 3 or 4) of shape basis vectors can be used to span, e.g., a variety of facial expressions such as smiling, talking, and raising eyebrows.
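As an illustration of this linear-combination idea, a shape can be synthesized as a mean shape plus a weighted sum of a few basis vectors. The sketch below uses random stand-in data; the mean, basis, and coefficient values are hypothetical, not learned from any real sequence.

```python
import numpy as np

# Hypothetical illustration: a face mesh of N = 19 points (3N = 57 coordinates)
# represented as a mean shape plus a linear combination of a small number
# (here p = 4) of shape basis vectors, as the 3D face model does.
rng = np.random.default_rng(0)
N, p = 19, 4
mean_shape = rng.standard_normal(3 * N)   # stand-in for the learned mean face
basis = rng.standard_normal((3 * N, p))   # stand-in for learned shape basis vectors

def deformed_shape(coeffs):
    """Shape = mean + sum_k a_k * X_k (the linear combination described above)."""
    return mean_shape + basis @ coeffs

a = np.array([0.5, -0.2, 0.0, 0.1])       # illustrative deformation coefficients
shape = deformed_shape(a)
print(shape.shape)                        # (57,)
```

With all coefficients at zero, the model reproduces the mean shape exactly, which is why only a handful of coefficients need to be estimated during tracking.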
- a 3D model can be built and stored in a database using stereo data from one or more users in which, e.g., a face of a new user can be tracked even though stereo data from the new user is not stored in the database.
- the following embodiments describe a system that tracks both 3D pose and shape of a facial image ("face") in front of a single video camera without using intrusive markers.
- the system also provides robust and accurate monocular tracking using a data-driven model.
- the system also provides generalization properties to enable face tracking of multiple persons with the same 3D model.
- monocular tracking techniques are described with respect to tracking of a 3D facial image. Nevertheless, the monocular tracking techniques described herein are not intended to be limited to any particular type of image and can be implemented with other types of 3D images such as moving body parts or inanimate objects.
- FIG. 1 illustrates an exemplary computing system 100 for practicing the present invention.
- the 3D model building techniques and monocular tracking techniques described herein can be implemented and utilized by computing system 100.
- Computing system 100 can represent, for example, a general purpose computer, workstation, portable computer, hand-held computing device, or other like computing device.
- the components of computing system 100 are exemplary in which one or more components can be omitted or added.
- a plurality of camera devices 128 can be used with computing system 100.
- computing system 100 includes a main unit 110 having a central processing unit (CPU) 102 and a co-processor 103 coupled to a display circuit 105, main memory 104, static memory 106, and flash memory 107 via bus 101.
- Main unit 110 of computing system 100 can also be coupled to a display 121, keypad input 122, cursor control 123, hard copy device 124, input/output (I/O) devices 125, and mass storage device
- Bus 101 is a standard system bus for communicating information and signals.
- CPU 102 and co-processor 103 are processing units for computing system 100.
- CPU 102 or co-processor 103 or both can be used to process information and/or signals for computing system 100.
- CPU 102 can be used to process code or instructions to perform the 3D data-driven model building techniques and monocular tracking techniques described herein.
- co-processor 103 can be used to process code or instructions to perform the same techniques as CPU 102.
- CPU 102 includes a control unit 131, an arithmetic logic unit (ALU) 132, and several registers 133, which can be used by CPU 102 for data and information processing purposes.
- Co-processor 103 can also include similar components as CPU 102.
- Main memory 104 can be, e.g., a random access memory (RAM) or some other dynamic storage device, for storing data, code, or instructions to be used by computing system 100.
- main memory 104 can store data related to input stereo image sequences and/or a 3D data-driven model as will be described in further detail below.
- Main memory 104 may also store temporary variables or other intermediate data during execution of code or instructions by CPU 102 or co-processor 103.
- Static memory 106 can be, e.g., a read only memory (ROM) and/or other static storage devices, which can store data and/or code or instructions to be used by computing system 100.
- Flash memory 107 is a memory device that can be used to store basic input/output system (BIOS) code or instructions for computing system 100.
- Display 121 can be, e.g., a cathode ray tube (CRT) or liquid crystal display (LCD).
- Display device 121 can display images, information, or graphics to a user.
- Main unit 110 of computing system 100 can interface with display 121 via display circuit 105.
- Keypad input 122 is an alphanumeric input device for communicating information and command selections for computing system 100.
- Cursor control 123 can be, e.g., a mouse, touchpad, trackball, or cursor direction keys, for controlling movement of an object on display 121.
- Hard copy device 124 can be, e.g., a laser printer, for printing information on paper, film, or some other like medium.
- Any number of input/output (I/O) devices 125 can be coupled to computing system 100.
- an I/O device such as a speaker can be coupled to computing system 100.
- Mass storage device 126 can be, e.g., a hard disk, read/writable CD or DVD player, or other large volume storage device.
- Camera devices 128 can be video image capturing devices, which can be used for the image processing techniques described herein.
- camera devices 128 include Digiclops™ camera systems, which provide an average frame rate of 4 fps with color images having a size of 640x480.
- the 3D data-driven model building techniques and monocular tracking techniques described herein can be performed by the hardware and/or software modules contained within computing system 100.
- CPU 102 or coprocessor 103 can execute code or instructions stored in a machine-readable medium, e.g., main memory 104 or static memory 106, to process stereo input sequences to build a 3D data-driven model as described herein.
- a machine-readable medium may include a mechanism that provides (i.e., stores and/or transmits) information in a form readable by a machine such as a computer or digital processing device.
- the machine-readable medium may include a read only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory devices, or other like memory devices.
- the code or instructions can be represented by carrier wave signals, infrared signals, digital signals, and by other like signals.
- a machine-readable medium can also be used to store a database for the 3D data-driven model described herein.
- FIG. 2 illustrates a functional flow diagram of an operation 200 for performing monocular tracking using a data-driven model according to one embodiment.
- operation 200 includes two stages.
- the first stage refers to operation block 210 or learning stage 210.
- Learning stage 210 learns the space of possible facial deformations by applying Principal Component Analysis (PCA) processing on real stereo tracking data to build a 3D data-driven model for monocular tracking.
- the 3D data-driven model can be used to approximate a generic shape as a linear combination of shape basis vectors.
- the second stage refers to operation block 220 in which monocular tracking is performed using the 3D data-driven model built in the learning stage.
- a stereo sequence is inputted.
- camera devices 128 can include a first camera and a second camera to capture image sequences from a left perspective and a right perspective such as that shown in FIG. 3.
- a first and second camera can capture image sequences, e.g., frames 1 to 100, of a person exhibiting varying facial movement and poses from a left and right perspective.
- the stereo input sequences can be inputted into computing system 100 for processing.
- the input stereo sequence is tracked.
- a low-complexity face mesh is used, e.g., the nineteen points at varying positions of the face as shown in FIG. 3
- each point is tracked independently to obtain a facial shape trajectory.
- the principal shape vectors are calculated, which will be explained in further detail below. Once the principal shape vectors are calculated, any facial movement or pose during monocular tracking can be approximated as a linear combination of the principal shape vectors.
- a monocular sequence is a sequence of images from a single camera. For example, as shown in FIG. 5, at each frame of the monocular input sequence (e.g., frames 1 through 72), the shape of the face can be approximated by a linear combination of the principal shape vectors of the computed model built in the learning stage 210. In particular, while a person changes facial expression and pose, the resulting optical flow information of the sequence can be used with the computed model to track the changes in pose and facial expression. [0032] The above operation can be implemented by the exemplary computing system 100.
- CPU 102 can execute code or instructions to build the 3D model and to perform the PCA processing, which will be described in further detail below.
- the data-driven 3D model can also be stored within memory storage devices of computing system 100.
- the data-driven 3D model is a "deformable face model," which will now be described.

Deformable Face Model
- the monocular facial sequence can be tracked in 3D space using the deformable face model described herein.
- let I_n be the nth image of the monocular facial sequence, as shown in FIG. 5, having seventy-two frames. A 3D structure of the face in each frame at time n is described by its pose and its shape
- R_n is a 3 x 3 rotation matrix
- t_n is a translation vector
- in one embodiment, a traditional pinhole camera model can be used to determine the image projection of each model point
- monocular tracking can thus be equivalent to inverting the projection map π for recovering the 3D shape X^i(n) and pose {R_n, t_n}
- the non-rigid shapes can be based on a linear combination of rigid shapes
- the shape coordinate vector X^i can accordingly be written as X^i(n) = X̄^i + Σ_{k=1}^p a_k(n) X_k^i (Equation 1), where X̄^i is the mean shape and the X_k^i are shape basis vectors
- the image projection map can then be reduced to a function of the pose parameters ω_n and t_n and a deformation vector a_n having a plurality of "deformation coefficients" a_k(n): p_n^i = π^i(a_n, ω_n, t_n) (Equation 2)
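As a rough sketch of the pinhole projection step used by the model-based projection map, the following projects one 3D point through an assumed camera. The rotation, translation, and focal length here are illustrative values, not values from the patent.

```python
import numpy as np

# Sketch (assumed details): projecting a 3D face point with a pinhole camera.
def pinhole_project(X, R, t, f=1.0):
    """pi(X) for one 3D point: rotate, translate, then perspective-divide."""
    Xc = R @ X + t                   # world -> camera frame
    return f * Xc[:2] / Xc[2]        # (x/z, y/z) scaled by focal length

R = np.eye(3)                        # identity rotation: camera-aligned pose
t = np.array([0.0, 0.0, 5.0])        # place the face 5 units in front of the camera
X = np.array([1.0, 2.0, 0.0])        # one model point after deformation
print(pinhole_project(X, R, t))      # [0.2 0.4]
```

Inverting this map from a single view is what makes monocular tracking ill-posed without the low-dimensional deformation model.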
- a monocular tracking procedure can thus be performed by combining optical flow constraints (e.g., Lucas-Kanade) with the specific form of the deformable model, which is represented by Equation 1, for simultaneous estimation of the deformation vector a_n and the pose parameters ω_n and t_n at every frame.
- a data-driven approach can be used that avoids manual construction of a non-rigid model.
- the principal shape basis vectors are derived from real 3D tracked data, which is also performed in the learning stage 210 as shown in FIG. 2.
- calibrated stereo cameras are used to track in 3D varying facial expressions and poses.
- a short stereo input sequence (e.g., as shown in FIG. 3) of approximately 100 to 150 frames can be used.
- the principal shape basis vectors X_k can be computed from the data tracked in operation blocks 202 and 204 of FIG. 2 using Principal Component Analysis (PCA) processing, which will be described in detail below.
- FIG. 6 illustrates a flow diagram of the operation 204 of FIG. 2 to perform stereo tracking according to one embodiment.
- operation 204 begins at operation block 604.
- a set of points on a left camera image and a right camera image is initialized.
- a set of N = 19 points P^i located on the eyes (2), nose (3), mouth (8), and eyebrows (6) are initialized on the left camera image and the right camera image as shown in FIG. 3.
- varying facial deformations are provided independently from pose such that the user is to maintain head pose as fixed as possible throughout the sequence while making a variety of different facial expressions, e.g., opening/closing mouth, smiling, raising eyebrows, etc.
- the set of points is indicated by a user of computing system 100 on the first left and right camera images.
- the stereo image sequence can be tracked using these points.
- the points do not all need to fall in textured areas of the image. This is a requirement for independent feature point tracking (to declare a point "good to track"), but not for model-based tracking. For example, the point at the tip of the nose falls in a totally textureless region, and the points on the outline of the mouth and on the eyebrows are edge features. All those points would be impossible to track individually using traditional optical flow techniques.
- the set of points is tracked by stereo triangulation. Stereo tracking is performed in 3D such that each point location X^i(n) (in the left camera reference frame) is updated so that its current left and right image projections approximately match the previous image projections (i.e., temporal tracking).
- the left and the right image projections are made to match approximately by considering a cost function measured between the left and right images.
- stereo tracking of the points P^i from frame n-1 to frame n is established by minimizing a cost function E, which is shown in Equation 3 below.
- in Equation 3, I_n^L and I_n^R refer to brightness vectors for the left and the right images at frame n.
- the first and second terms of Equation 3 represent traditional image matching cost terms accounting for independent left and right temporal tracking.
- the third term is used to maintain correspondence between the right and left images.
- the three coefficients (α1, α2, and α3) for the three terms are fixed weighting coefficients (i.e., the same for all the points) used to account for the variable reliability of the three terms.
- the value of the α3 coefficient is kept smaller than the α1 and α2 coefficients, and the ratios α1/α3 and α2/α3 are typically kept at a ratio value of 20.
- the values for the α1, α2, and α3 coefficients can be hardcoded separately for each of the 19 points on the face mesh as shown in FIG. 3. In one embodiment, each connected pair of points in the face mesh is considered separately.
- the values for α1, α2, and α3 can be 1, 1, and 0.05, respectively, for an average image area of approximately 100 pixels.
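A minimal sketch of such a three-term weighted cost, assuming simple sum-of-squared-differences patch matching; the helper names and patch data are illustrative, and only the weights 1, 1, 0.05 come from the text.

```python
import numpy as np

# Sketch of a three-term stereo tracking cost: two temporal matching terms
# plus a weaker left/right correspondence term.
a1, a2, a3 = 1.0, 1.0, 0.05            # fixed weights; a1/a3 = a2/a3 = 20

def ssd(patch_a, patch_b):
    """Sum-of-squared-differences image matching cost over a pixel patch."""
    return float(np.sum((patch_a - patch_b) ** 2))

def stereo_cost(left_prev, left_cur, right_prev, right_cur):
    """E = a1 * left temporal + a2 * right temporal + a3 * stereo correspondence."""
    return (a1 * ssd(left_cur, left_prev)      # left temporal tracking
            + a2 * ssd(right_cur, right_prev)  # right temporal tracking
            + a3 * ssd(left_cur, right_cur))   # left/right correspondence

patch = np.ones((10, 10))                      # a 100-pixel region of interest
print(stereo_cost(patch, patch, patch, patch)) # 0.0 for perfectly matched patches
```

Keeping a3 small lets temporal tracking dominate while still discouraging the left and right projections from drifting apart.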
- the total energy being minimized can be written as E(n) = E_I(n) + E_T(n) + E_S(n) + E_A(n), where E_I(n) is the image matching cost of Equation 3.
- the E_T(n) term is a temporal smoothing term, which minimizes the amplitude of the 3D velocity at each point.
- the E_S(n) term is a shape smoothing term, which is used to minimize the differences of velocities of neighboring points.
- the E_A(n) term is an anthropometric energy cost term, which is used to keep segment lengths as close as possible to their values computed in the first frame and to prevent drifts over long tracking sequences.
- all segments [P^i, P^j] that are subject to large stretches are assigned lower λ_ij and ε_ij values.
- points and segments that are known to be quite rigid are assigned higher values for ρ_i, λ_ij, and ε_ij, heavily penalizing any movement and stretch applied to them.
- in one embodiment, the values for ρ_i, λ_ij, and ε_ij are approximately 20000, 20000, and 100 for an average image area
- the minimization thus reduces to solving a linear system D dX = e, where dX is a 3N x 1 column vector consisting of all N vectors dX^i(n), and D and e are a 3N x 3N matrix and a 3N x 1 vector, respectively.
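The reduction to a linear system can be sketched as follows. D and e here are synthetic stand-ins, since assembling them from the energy terms is beyond this fragment; only the 3N = 57 dimensionality follows the text.

```python
import numpy as np

# Sketch: the minimization reduces to a linear system D dX = e of size 3N.
rng = np.random.default_rng(1)
N = 19
A = rng.standard_normal((3 * N, 3 * N))
D = A @ A.T + 3 * N * np.eye(3 * N)   # symmetric positive definite stand-in for D
e = rng.standard_normal(3 * N)        # stand-in right-hand side

dX = np.linalg.solve(D, e)            # stacked 3D displacement updates dX^i(n)
print(dX.shape)                       # (57,)
```

Solving one coupled system per frame is what lets the smoothing and anthropometric terms propagate information between neighboring points.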
- FIG. 7 illustrates a flow diagram of the operation 208 of FIG. 2 to calculate principal shape vectors according to one embodiment. Initially, operation 208 begins at operation block 702.
- the mean shape X̄ is computed.
- the p + 1 shape basis vectors X_k are computed using Singular Value Decomposition (SVD).
- the mean shape is first subtracted from each tracked shape: X̃(n) = X(n) − X̄.
- the resulting shape trajectory X̃(n) is then stored in a 3N x M matrix ("M").
- a Singular Value Decomposition is applied on M.
- M = U S V^T
- U = [u_1 u_2 ... u_3N]
- V = [v_1 v_2 ... v_M]
- the sum for M is truncated from 3N to p terms, which results in an optimal least squares approximation of the matrix M given a fixed budget of p vectors. This is equivalent to approximating each column vector of M (i.e., each 3D shape in the sequence) by its orthogonal projection onto the linear subspace spanned by the first p vectors u_1, ..., u_p. These vectors are precisely the remaining p deformation shape vectors X_k for k = 1, ..., p.
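The mean-subtraction, SVD, and truncation steps above can be sketched with NumPy on synthetic tracked shapes. The data is random; only the dimensions follow the text (N = 19 points, p = 4 basis vectors, M tracked frames).

```python
import numpy as np

# Sketch of the learning-stage computation: mean-subtract the M tracked shapes,
# stack them into a 3N x M matrix, apply SVD, and keep the first p left
# singular vectors as the principal shape basis.
rng = np.random.default_rng(2)
N, M, p = 19, 120, 4
shapes = rng.standard_normal((3 * N, M))     # stand-in for M stereo-tracked shapes

mean_shape = shapes.mean(axis=1, keepdims=True)
Mmat = shapes - mean_shape                   # the matrix "M" of the text
U, S, Vt = np.linalg.svd(Mmat, full_matrices=False)
shape_basis = U[:, :p]                       # principal shape vectors X_k

# Each tracked shape is approximated by its orthogonal projection onto the
# subspace spanned by the first p singular vectors.
coeffs = shape_basis.T @ Mmat                # p x M deformation coefficients
approx = shape_basis @ coeffs
print(shape_basis.shape, coeffs.shape)       # (57, 4) (4, 120)
```

Because NumPy returns singular values in decreasing order, taking the first p columns of U gives exactly the rank-p least squares approximation described above.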
- the resulting model of principal shape vectors is suitable for the monocular tracking stage. For instance, if a user produces a variety of facial expressions, the facial expressions can be tracked based on facial expressions that have been exposed to the system during the learning stage 210. It should be noted that since the vectors u_k are
- the quantities in Equations 1 and 2 are in units of the mean shape.
- in one embodiment, the units are in centimeters, and four principal shape vectors are used to cover most common facial expressions (e.g., mouth and eyebrow movements). Nevertheless, the number of principal shape vectors used can change based on the diversity of facial expressions performed for tracking.
- a four-dimensional space of deformations 411 through 414 is illustrated in FIG. 4, in which the deformations are computed from the stereo sequence shown in FIG. 3.
- the principal shape vectors can correspond to combinations of four main facial movements, e.g., smile, open/close mouth, left and right raised eyebrows.
- FIG. 8 illustrates a flow diagram of the operation 220 of FIG. 2 to perform monocular tracking using the computed-model in the learning stage 210 according to one embodiment.
- operation 220 begins at operation block 802 for an image sequence such as that shown in FIG. 5.
- parameters for shape and pose using image measurements are estimated from the image sequence.
- optical flow tracking techniques can be used to compute translational displacement of every point in an image given two successive frames, e.g. frames 1 and 2.
- Each image point can then be processed independently.
- all the points in the model are linked to each other through the parameterized 3D model given by Equation 1.
- the parameters defining the 3D model configuration are estimated all at once from image measurements.
- such parameters include a_n for shape and {ω_n, t_n} for pose.
- an optimal shape and pose are sought using a face model that best fits the subsequent frame. For instance, assume that the face model has been tracked from the first frame of the sequence I_1 to the (n-1)th frame I_{n-1}. The objective is then to find the optimal pose {ω_n, t_n} and deformation a_n of the face model that best fit the nth image I_n.
- in Equation 5, π is the model-based image projection map defined in Equation 2.
- the summation for Equation 4 is performed over small pixel windows, e.g., regions of interest (ROI).
- the first term in Equation 4 is a standard matching cost term, that is, the first term measures overall image mismatch between two successive images at the model points.
- the second term measures image mismatch between the current image I_n and the first image I_1.
- This additional term weakly enforces every facial feature to appear the same in the images from the beginning to the end of the sequence (in an image neighborhood sense). As such, it avoids tracking drifts and increases robustness. It is referred to as the drift monitoring energy term.
- the two energy terms are weighted relative to each other by the scalar variable "e."
- the variable e = 0.2, which emphasizes the tracking cost over the monitoring cost.
- Equation 6 is thus solved for "s" while assuming small motion between two consecutive frames.
- let I_t be the extended temporal derivative defined as follows:
- let ∇I be the derivative of the image brightness I_n with respect to s in the neighborhood
- a unique tracking solution "s" is computed for the overall model all at once, while in its original form, each image point is processed individually.
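A sketch of solving one stacked update for the whole model, assuming a linearized brightness-constancy system J s = −I_t. The Jacobian and temporal derivatives are random stand-ins, and the parameter split (4 deformation + 6 pose entries) is an assumption.

```python
import numpy as np

# Sketch (assumed formulation): stack the linearized optical-flow constraints
# from all visible model points and solve one least squares problem for the
# full parameter vector s, rather than solving each image point independently.
rng = np.random.default_rng(3)
n_constraints, n_params = 200, 10     # e.g., 4 deformation + 6 pose parameters
J = rng.standard_normal((n_constraints, n_params))  # stand-in gradient terms
I_t = rng.standard_normal(n_constraints)            # stand-in temporal derivatives

# One update "s" for the overall model, computed all at once.
s, *_ = np.linalg.lstsq(J, -I_t, rcond=None)
print(s.shape)                        # (10,)
```

Coupling all points through the shared parameter vector is what makes textureless and edge-only points usable: each contributes a weak constraint, and the model resolves the ambiguity.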
- a 3D model is used for tracking that is built from real data and parameterized with few coefficients.
- pose and deformation are known at time frame n.
- the same procedure can be reiterated multiple times (e.g., 4 or 5 times) at the fixed time frame n to refine the estimate.
- the same overall process is then repeated over the subsequent frames.
- region of interest (ROI) of each model point is not kept constant throughout the sequence. Instead, its size and geometry are recomputed at every frame based on the distance (depth) and orientation (local surface normal) of the point in space. The resulting regions of interest are small parallelograms as shown in FIG. 5. In particular, points that face away from the camera are declared “non visible”, have a zero-size region of interest assigned to them, and therefore do not contribute to the tracking update. [0069] Thus, a method and two-stage system for 3D tracking of pose and deformation, e.g., of the human face in monocular image sequences without the use of invasive special markers, have been described.
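The visibility rule described above (a zero-size ROI for points facing away from the camera) can be sketched with a simple normal/view-direction test; the function name and vectors are illustrative.

```python
import numpy as np

# Sketch: a model point whose local surface normal faces away from the camera
# is declared "non visible" and excluded from the tracking update.
def is_visible(normal, point, camera_center=np.zeros(3)):
    """Visible iff the local surface normal points back toward the camera."""
    view_dir = camera_center - point            # from the point toward the camera
    return float(np.dot(normal, view_dir)) > 0.0

nose_tip = np.array([0.0, 0.0, 5.0])            # a point in front of the camera
toward_camera = np.array([0.0, 0.0, -1.0])      # normal facing the camera
away = np.array([0.0, 0.0, 1.0])                # normal facing away
print(is_visible(toward_camera, nose_tip))      # True
print(is_visible(away, nose_tip))               # False
```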
- the first stage of the system learns the spaces of all possible facial deformations by applying Principal Component Analysis on real stereo tracking data.
- the resulting model approximates any generic shape as a linear combination of shape basis vectors.
- the second stage of the system uses this low-complexity deformable model for simultaneous estimation of pose and deformation of the face from a single image sequence. This stage is known as model-based monocular tracking.
- the data-driven approach for model construction is suitable for 3D tracking of non-rigid objects and offers an elegant and practical alternative to the task of manual construction of models using 3D scanners or CAD modelers.
- creating a model from real data allows for a large variety of facial deformations to be tracked with fewer parameters than handcrafted models and leads to increased robustness and tracking accuracy.
Abstract
Description
Claims
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB0328400A GB2393065B (en) | 2001-05-09 | 2002-05-02 | Method and system using a data-driven model for monocular face tracking |
AU2002303611A AU2002303611A1 (en) | 2001-05-09 | 2002-05-02 | Method and system, using a data-driven model for monocular face tracking |
KR1020037014619A KR100571115B1 (en) | 2001-05-09 | 2002-05-02 | Method and system using a data-driven model for monocular face tracking |
HK04104981A HK1062067A1 (en) | 2001-05-09 | 2004-07-08 | Method and system using a data-driven model for monocular face tracking. |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/852,398 | 2001-05-09 | ||
US09/852,398 US9400921B2 (en) | 2001-05-09 | 2001-05-09 | Method and system using a data-driven model for monocular face tracking |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002091305A2 true WO2002091305A2 (en) | 2002-11-14 |
WO2002091305A3 WO2002091305A3 (en) | 2003-09-18 |
Family
ID=25313204
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2002/014014 WO2002091305A2 (en) | 2001-05-09 | 2002-05-02 | Method and system, using a data-driven model for monocular face tracking |
Country Status (7)
Country | Link |
---|---|
US (1) | US9400921B2 (en) |
KR (1) | KR100571115B1 (en) |
CN (1) | CN1294541C (en) |
AU (1) | AU2002303611A1 (en) |
GB (1) | GB2393065B (en) |
HK (1) | HK1062067A1 (en) |
WO (1) | WO2002091305A2 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2006015092A2 (en) | 2004-07-30 | 2006-02-09 | Euclid Discoveries, Llc | Apparatus and method for processing video data |
EP2132680A1 (en) * | 2007-03-05 | 2009-12-16 | Seeing Machines Pty Ltd | Efficient and accurate 3d object tracking |
US9400921B2 (en) | 2001-05-09 | 2016-07-26 | Intel Corporation | Method and system using a data-driven model for monocular face tracking |
Families Citing this family (38)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10025922A1 (en) * | 2000-05-27 | 2001-12-13 | Robert Massen | Automatic photogrammetric digitization of bodies and objects |
GB2382289B (en) * | 2001-09-28 | 2005-07-06 | Canon Kk | Method and apparatus for generating models of individuals |
GB2389289B (en) * | 2002-04-30 | 2005-09-14 | Canon Kk | Method and apparatus for generating models of individuals |
DE10235657A1 (en) * | 2002-08-02 | 2004-02-12 | Leica Microsystems Heidelberg Gmbh | Process, arrangement and software for optimizing the image quality of moving objects taken with a microscope |
EP1556805B1 (en) * | 2002-10-22 | 2011-08-24 | Artoolworks | Tracking a surface in a 3-dimensional scene using natural visual features of the surface |
JP4210926B2 (en) * | 2004-01-16 | 2009-01-21 | 株式会社デンソー | Occupant protection system |
WO2006011153A2 (en) * | 2004-07-30 | 2006-02-02 | Extreme Reality Ltd. | A system and method for 3d space-dimension based image processing |
WO2007090945A1 (en) * | 2006-02-07 | 2007-08-16 | France Telecom | Method of tracking the position of the head in real time in a video image stream |
US8026931B2 (en) | 2006-03-16 | 2011-09-27 | Microsoft Corporation | Digital video effects |
CN100449567C (en) * | 2006-11-02 | 2009-01-07 | 中山大学 | 2-D main-element human-face analysis and identifying method based on relativity in block |
CN100423020C (en) * | 2006-12-15 | 2008-10-01 | 中山大学 | Human face identifying method based on structural principal element analysis |
KR100896065B1 (en) * | 2007-12-17 | 2009-05-07 | 한국전자통신연구원 | Method for producing 3d facial animation |
JP5239396B2 (en) * | 2008-02-28 | 2013-07-17 | セイコーエプソン株式会社 | Image output method, image output apparatus, and image output program |
US8525871B2 (en) * | 2008-08-08 | 2013-09-03 | Adobe Systems Incorporated | Content-aware wide-angle images |
US8538072B2 (en) * | 2008-08-27 | 2013-09-17 | Imprivata, Inc. | Systems and methods for operator detection |
CN101425183B (en) * | 2008-11-13 | 2012-04-25 | 上海交通大学 | Deformable body three-dimensional tracking method based on second-order cone programming |
US8624962B2 (en) * | 2009-02-02 | 2014-01-07 | Ydreams-Informatica, S.A. | Systems and methods for simulating three-dimensional virtual interactions from two-dimensional camera images |
JP2011039801A (en) * | 2009-08-12 | 2011-02-24 | Hitachi Ltd | Apparatus and method for processing image |
CN101763636B (en) * | 2009-09-23 | 2012-07-04 | 中国科学院自动化研究所 | Method for tracing position and pose of 3D human face in video sequence |
US8937592B2 (en) * | 2010-05-20 | 2015-01-20 | Samsung Electronics Co., Ltd. | Rendition of 3D content on a handheld device |
US8259994B1 (en) * | 2010-09-14 | 2012-09-04 | Google Inc. | Using image and laser constraints to obtain consistent and improved pose estimates in vehicle pose databases |
CN102122239B (en) * | 2011-03-21 | 2013-03-20 | 日照市活点网络科技有限公司 | Method for processing 3D images of Internet of things |
CN103814384B (en) * | 2011-06-09 | 2017-08-18 | 香港科技大学 | Image-based tracking |
JP5786259B2 (en) * | 2011-08-09 | 2015-09-30 | インテル・コーポレーション | Parameterized 3D face generation |
US9123144B2 (en) * | 2011-11-11 | 2015-09-01 | Microsoft Technology Licensing, Llc | Computing 3D shape parameters for face animation |
US9129147B1 (en) * | 2012-05-22 | 2015-09-08 | Image Metrics Limited | Optimal automatic capture of facial movements and expressions in video sequences |
KR101311600B1 (en) * | 2012-10-26 | 2013-09-26 | 동국대학교 산학협력단 | Medical position tracking apparatus |
US9406135B2 (en) | 2012-10-29 | 2016-08-02 | Samsung Electronics Co., Ltd. | Device and method for estimating head pose |
FR2998402B1 (en) * | 2012-11-20 | 2014-11-14 | Morpho | METHOD FOR GENERATING A FACE MODEL IN THREE DIMENSIONS |
EP2824913A1 (en) * | 2013-07-09 | 2015-01-14 | Alcatel Lucent | A method for generating an immersive video of a plurality of persons |
US20160070952A1 (en) * | 2014-09-05 | 2016-03-10 | Samsung Electronics Co., Ltd. | Method and apparatus for facial recognition |
GB201419441D0 (en) * | 2014-10-31 | 2014-12-17 | Microsoft Corp | Modifying video call data |
US9747573B2 (en) * | 2015-03-23 | 2017-08-29 | Avatar Merger Sub II, LLC | Emotion recognition for workforce analytics |
US10441604B2 (en) * | 2016-02-09 | 2019-10-15 | Albireo Ab | Cholestyramine pellets and methods for preparation thereof |
US11736756B2 (en) * | 2016-02-10 | 2023-08-22 | Nitin Vats | Producing realistic body movement using body images |
EP3324254A1 (en) * | 2016-11-17 | 2018-05-23 | Siemens Aktiengesellschaft | Device and method for determining the parameters of a control device |
WO2018227001A1 (en) * | 2017-06-07 | 2018-12-13 | Google Llc | High speed, high-fidelity face tracking |
CN109145758A (en) * | 2018-07-25 | 2019-01-04 | 武汉恩智电子科技有限公司 | Face recognition algorithm based on video surveillance |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1188948A (en) * | 1996-12-27 | 1998-07-29 | 大宇电子株式会社 | Method and apparatus for encoding facial movement |
US6014625A (en) * | 1996-12-30 | 2000-01-11 | Daewoo Electronics Co., Ltd | Method and apparatus for producing lip-movement parameters in a three-dimensional-lip-model |
JP3055666B2 (en) * | 1997-03-11 | 2000-06-26 | 株式会社エイ・ティ・アール知能映像通信研究所 | 3D image creation device |
US6047078A (en) * | 1997-10-03 | 2000-04-04 | Digital Equipment Corporation | Method for extracting a three-dimensional model using appearance-based constrained structure from motion |
CA2312315A1 (en) * | 1997-12-01 | 1999-06-10 | Arsev H. Eraslan | Three-dimensional face identification system |
US6272231B1 (en) * | 1998-11-06 | 2001-08-07 | Eyematic Interfaces, Inc. | Wavelet-based facial motion capture for avatar animation |
US6301370B1 (en) * | 1998-04-13 | 2001-10-09 | Eyematic Interfaces, Inc. | Face recognition from video images |
WO1999064944A2 (en) * | 1998-06-08 | 1999-12-16 | Microsoft Corporation | Compression of time-dependent geometry |
US6072496A (en) * | 1998-06-08 | 2000-06-06 | Microsoft Corporation | Method and system for capturing and representing 3D geometry, color and shading of facial expressions and other animated objects |
US6198485B1 (en) * | 1998-07-29 | 2001-03-06 | Intel Corporation | Method and apparatus for three-dimensional input entry |
IT1315446B1 (en) * | 1998-10-02 | 2003-02-11 | Cselt Centro Studi Lab Telecom | Procedure for the creation of three-dimensional facial models starting from face images |
JP2000293687A (en) * | 1999-02-02 | 2000-10-20 | Minolta Co Ltd | Three-dimensional shape data processor and three- dimensional shape data processing method |
US6200139B1 (en) * | 1999-02-26 | 2001-03-13 | Intel Corporation | Operator training system |
EP1039417B1 (en) * | 1999-03-19 | 2006-12-20 | Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. | Method and device for the processing of images based on morphable models |
JP3639476B2 (en) * | 1999-10-06 | 2005-04-20 | シャープ株式会社 | Image processing apparatus, image processing method, and recording medium recording image processing program |
JP2001142380A (en) * | 1999-11-12 | 2001-05-25 | Sony Corp | Device and method for producing hologram and hologram |
ATE393936T1 (en) * | 2000-03-08 | 2008-05-15 | Cyberextruder Com Inc | DEVICE AND METHOD FOR GENERATING A THREE-DIMENSIONAL REPRESENTATION FROM A TWO-DIMENSIONAL IMAGE |
US6807290B2 (en) * | 2000-03-09 | 2004-10-19 | Microsoft Corporation | Rapid computer modeling of faces for animation |
JP2001331799A (en) * | 2000-03-16 | 2001-11-30 | Toshiba Corp | Image processor and image processing method |
IT1320002B1 (en) * | 2000-03-31 | 2003-11-12 | Cselt Centro Studi Lab Telecom | Procedure for the animation of a synthesized human face model driven by an audio signal |
KR20020022504A (en) * | 2000-09-20 | 2002-03-27 | 박종만 | System and method for 3D animation authoring with motion control, facial animation, lip synchronizing and lip synchronized voice |
US6950104B1 (en) * | 2000-08-30 | 2005-09-27 | Microsoft Corporation | Methods and systems for animating facial features, and methods and systems for expression transformation |
US6850872B1 (en) * | 2000-08-30 | 2005-02-01 | Microsoft Corporation | Facial image processing methods and systems |
US7127081B1 (en) * | 2000-10-12 | 2006-10-24 | Momentum Bilgisayar, Yazilim, Danismanlik, Ticaret, A.S. | Method for tracking motion of a face |
US6975750B2 (en) * | 2000-12-01 | 2005-12-13 | Microsoft Corp. | System and method for face recognition using synthesized training images |
US9400921B2 (en) * | 2001-05-09 | 2016-07-26 | Intel Corporation | Method and system using a data-driven model for monocular face tracking |
US7027054B1 (en) * | 2002-08-14 | 2006-04-11 | Avaworks, Incorporated | Do-it-yourself photo realistic talking head creation system and method |
US6919892B1 (en) * | 2002-08-14 | 2005-07-19 | Avaworks, Incorporated | Photo realistic talking head creation system and method |
CA2654960A1 (en) * | 2006-04-10 | 2008-12-24 | Avaworks Incorporated | Do-it-yourself photo realistic talking head creation system and method |
WO2008141125A1 (en) * | 2007-05-10 | 2008-11-20 | The Trustees Of Columbia University In The City Of New York | Methods and systems for creating speech-enabled avatars |
2001
- 2001-05-09 US US09/852,398 patent/US9400921B2/en not_active Expired - Fee Related

2002
- 2002-05-02 CN CNB028094204A patent/CN1294541C/en not_active Expired - Fee Related
- 2002-05-02 KR KR1020037014619A patent/KR100571115B1/en not_active IP Right Cessation
- 2002-05-02 GB GB0328400A patent/GB2393065B/en not_active Expired - Fee Related
- 2002-05-02 AU AU2002303611A patent/AU2002303611A1/en not_active Abandoned
- 2002-05-02 WO PCT/US2002/014014 patent/WO2002091305A2/en not_active Application Discontinuation

2004
- 2004-07-08 HK HK04104981A patent/HK1062067A1/en not_active IP Right Cessation
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0807902A2 (en) * | 1996-05-16 | 1997-11-19 | Cyberclass Limited | Method and apparatus for generating moving characters |
US6163322A (en) * | 1998-01-19 | 2000-12-19 | Taarna Studios Inc. | Method and apparatus for providing real-time animation utilizing a database of postures |
Non-Patent Citations (2)
Title |
---|
GUENTER, B. et al., "Making Faces", Computer Graphics Proceedings, SIGGRAPH 98, Orlando, FL, 19-24 July 1998, ACM, New York, NY, pages 55-66, XP002205956, ISBN 0-89791-999-8 * |
KOUADIO, C. et al., "Real-time facial animation based upon a bank of 3D facial expressions", Computer Animation 98 Proceedings, Philadelphia, PA, USA, 8-10 June 1998, IEEE Computer Society, pages 128-136, XP010285090, ISBN 0-8186-8541-7 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9400921B2 (en) | 2001-05-09 | 2016-07-26 | Intel Corporation | Method and system using a data-driven model for monocular face tracking |
WO2006015092A2 (en) | 2004-07-30 | 2006-02-09 | Euclid Discoveries, Llc | Apparatus and method for processing video data |
EP1779294A2 (en) * | 2004-07-30 | 2007-05-02 | Euclid Discoveries, LLC | Apparatus and method for processing video data |
JP2008508801A (en) * | 2004-07-30 | 2008-03-21 | ユークリッド・ディスカバリーズ・エルエルシー | Apparatus and method for processing video data |
EP1779294A4 (en) * | 2004-07-30 | 2010-12-29 | Euclid Discoveries Llc | Apparatus and method for processing video data |
EP2602742A1 (en) * | 2004-07-30 | 2013-06-12 | Euclid Discoveries, LLC | Apparatus and method for processing video data |
EP2132680A1 (en) * | 2007-03-05 | 2009-12-16 | Seeing Machines Pty Ltd | Efficient and accurate 3d object tracking |
EP2132680A4 (en) * | 2007-03-05 | 2011-12-21 | Seeing Machines Pty Ltd | Efficient and accurate 3d object tracking |
US8848975B2 (en) | 2007-03-05 | 2014-09-30 | Seeing Machines Pty Ltd | Efficient and accurate 3D object tracking |
Also Published As
Publication number | Publication date |
---|---|
AU2002303611A1 (en) | 2002-11-18 |
KR100571115B1 (en) | 2006-04-13 |
GB0328400D0 (en) | 2004-01-14 |
GB2393065A (en) | 2004-03-17 |
CN1294541C (en) | 2007-01-10 |
HK1062067A1 (en) | 2004-10-15 |
US9400921B2 (en) | 2016-07-26 |
WO2002091305A3 (en) | 2003-09-18 |
US20030012408A1 (en) | 2003-01-16 |
GB2393065B (en) | 2005-04-20 |
CN1509456A (en) | 2004-06-30 |
KR20040034606A (en) | 2004-04-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9400921B2 (en) | | Method and system using a data-driven model for monocular face tracking |
DeCarlo et al. | | The integration of optical flow and deformable models with applications to human face shape and motion estimation |
Gokturk et al. | | A data-driven model for monocular face tracking |
Smolyanskiy et al. | | Real-time 3D face tracking based on active appearance model constrained by depth data |
Bickel et al. | | Multi-scale capture of facial geometry and motion |
Cai et al. | | 3d deformable face tracking with a commodity depth camera |
Brand et al. | | Flexible flow for 3d nonrigid tracking and shape recovery |
EP2710557B1 (en) | | Fast articulated motion tracking |
US6047078A (en) | | Method for extracting a three-dimensional model using appearance-based constrained structure from motion |
DeCarlo et al. | | Deformable model-based shape and motion analysis from images using motion residual error |
US20060285770A1 (en) | | Direct method for modeling non-rigid motion with thin plate spline transformation |
Kervrann et al. | | Statistical deformable model-based segmentation of image motion |
Agudo et al. | | Real-time 3D reconstruction of non-rigid shapes with a single moving camera |
Sung et al. | | Pose-robust facial expression recognition using view-based 2D+3D AAM |
JP2002319026A (en) | | Method for directly modeling non-rigid three-dimensional object from image sequence |
Ye et al. | | 3d morphable face model for face animation |
Chen et al. | | Single and sparse view 3d reconstruction by learning shape priors |
Achenbach et al. | | Accurate face reconstruction through anisotropic fitting and eye correction |
Chen et al. | | 3D active appearance model for aligning faces in 2D images |
Wang et al. | | Template-free 3d reconstruction of poorly-textured nonrigid surfaces |
Pham et al. | | Robust real-time performance-driven 3D face tracking |
Aguiar et al. | | Factorization as a rank 1 problem |
Marks et al. | | Tracking motion, deformation, and texture using conditionally gaussian processes |
CN111460741A (en) | | Fluid simulation method based on data driving |
Hou et al. | | Smooth adaptive fitting of 3D face model for the estimation of rigid and nonrigid facial motion in video sequences |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A2. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG UZ VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
ENP | Entry into the national phase | Ref document number: 0328400; Country of ref document: GB; Kind code of ref document: A; Free format text: PCT FILING DATE = 20020502 |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
WWE | Wipo information: entry into national phase | Ref document number: 028094204; Country of ref document: CN |
WWE | Wipo information: entry into national phase | Ref document number: 1020037014619; Country of ref document: KR |
REG | Reference to national code | Ref country code: DE; Ref legal event code: 8642 |
122 | Ep: pct application non-entry in european phase | |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | Wipo information: withdrawn in national office | Ref document number: JP |