Publication number: USRE36041 E
Publication type: Grant
Application number: US 08/340,615
Publication date: Jan. 12, 1999
Filing date: Nov. 16, 1994
Priority date: Nov. 1, 1990
Fee status: Paid
Also published as: DE69130616D1, DE69130616T2, EP0555380A1, EP0555380A4, EP0555380B1, US5164992, WO1992008202A1
Publication number (other formats): 08340615, 340615, US RE36041 E, US RE36041E, US-E-RE36041, USRE36041 E, USRE36041E
Inventors: Matthew Turk, Alex P. Pentland
Original assignee: Massachusetts Institute Of Technology
Face recognition system
US RE36041 E
Abstract
A recognition system for identifying members of an audience, the system including an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module responsive to the detection means for determining whether a detected image of a person identified by the detection means resembles one of a reference set of images of individuals.
Claims (25)
What is claimed is:
1. A recognition system for identifying members of an audience, the system comprising:
an imaging system which generates an image of the audience;
a selector module for selecting a portion of said generated image;
means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
a detection means which determines whether the selected image portion contains an image that can be classified as an image of a person, said detection means including means for representing said selected image portion as an input vector in said multi-dimensional image space and means for computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors, wherein said detection means uses the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
a recognition module responsive to said detection means for determining whether a detected image of a person identified by said detection means resembles one of the reference set of images of individuals.
2. The recognition system of claim 1 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.
3. The recognition system of claim 1 wherein said selector module comprises a motion detector for identifying the selected portion of said image by detecting motion.
4. The recognition system of claim 3 wherein said selector module further comprises a locator module for locating the portion of said image corresponding to a face of the person based on motion detected by said motion detector.
5. The recognition system of claim 1 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
6. The recognition system of claim 1 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.
7. The recognition system of claim 6 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.
8. The recognition system of claim 7 wherein said recognition module further comprises means for projecting said input vector onto said subspace.
9. The recognition system of claim 8 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
10. The recognition system of claim 8 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.
11. The recognition system of claim 10 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
12. A method for identifying members of an audience, the method comprising:
generating an image of the audience;
selecting a portion of said generated image;
representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
representing said selected image portion as an input vector in said multi-dimensional image space;
computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;
using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
if it is determined that the selected image contains an image that can be classified as an image of a person, determining whether said image of a person resembles one of a reference set of images of individuals.
13. The method of claim 12 further comprising the step of determining which one, if any, of the members of said reference set said image of a person resembles.
14. The method of claim 12 wherein the image of the audience is a sequence of image frames and wherein the method further comprises detecting motion within the sequence of image frames and wherein the selected image portion is determined on the basis of the detected motion.
15. The method of claim 12 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person further comprises comparing said computed distance to a preselected threshold.
16. The method of claim 15 wherein the step of determining whether said image of a person resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.
17. The method of claim 16 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.
18. The method of claim 17 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises projecting said input vector onto said subspace.
19. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises selecting a member of said reference set and computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
20. The method of claim 18 wherein the step of determining whether said image of a person resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.
21. The method of claim 20 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
22. A recognition system comprising:
an imaging system which generates an image;
a selector module for selecting a portion of said generated image;
means for representing a reference set of images of individuals as a set of eigenvectors in a multi-dimensional image space;
a detection means which determines whether the selected image portion contains an image that can be classified as an image of a person, said detection means including means for representing said selected image portion as an input vector in said multi-dimensional image space and means for computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors, wherein said detection means uses the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person; and
a recognition module responsive to said detection means for determining whether a detected image of a person identified by said detection means resembles one of the reference set of images of individuals.
23. The recognition system of claim 22 wherein said detection means further comprises a thresholding means for determining whether an image of a person is present by comparing said computed distance to a preselected threshold.
24. The recognition system of claim 22 wherein said image of a person is an image of a person's face and wherein said reference set comprises images of faces of said individuals.
25. The recognition system of claim 22 wherein said recognition module comprises means for representing each member of said reference set as a corresponding point in said subspace.
26. The recognition system of claim 25 wherein the location of each point in subspace associated with a corresponding member of said reference set is determined by projecting a vector associated with that member onto said subspace.
27. The recognition system of claim 26 wherein said recognition module further comprises means for projecting said input vector onto said subspace.
28. The recognition system of claim 27 wherein said recognition module further comprises means for selecting a particular member of said reference set and means for computing a distance within said subspace between a point identified by the projection of said input vector onto said subspace and the point in said subspace associated with said selected member.
29. The recognition system of claim 27 wherein said recognition module further comprises means for determining for each member of said reference set a distance in subspace between the location associated with that member in subspace and the point identified by the projection of said input vector onto said subspace.
30. The recognition system of claim 24 wherein said means for representing said reference set includes means for adding a member to said reference set by projecting into said subspace an input vector having a computed distance indicative of an image of a face.
31. A method comprising:
generating an image;
selecting a portion of said generated image;
representing a reference set of images of faces of individuals as a set of eigenvectors in a multi-dimensional image space;
representing said selected image portion as an input vector in said multi-dimensional image space;
computing the distance between a point identified by said input vector and a multi-dimensional subspace defined by said set of eigenvectors;
using the computed distance to determine whether the selected image portion contains an image that can be classified as an image of a person's face; and
if it is determined that the selected image contains an image that can be classified as an image of a person's face, determining whether said image of a person's face resembles one of a reference set of images of faces of individuals.
32. The method of claim 31 further comprising the step of determining which one, if any, of the members of said reference set said image of a person's face resembles.
33. The method of claim 31 wherein the step of determining whether the selected image portion contains an image that can be classified as an image of a person's face further comprises comparing said computed distance to a preselected threshold.
34. The method of claim 33 wherein the step of determining whether said image of a person's face resembles a member of said reference set comprises representing each member of said reference set as a corresponding point in said subspace.
35. The method of claim 34 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining the location of each point in subspace associated with a corresponding member of said reference set by projecting a vector associated with that member onto said subspace.
36. The method of claim 35 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises projecting said input vector onto said subspace.
37. The method of claim 36 wherein the step of determining whether said image of a person's face resembles a member of said reference set further comprises determining for each member of said reference set a distance in subspace between the location for that member in subspace and the point identified by the projection of said input vector onto said subspace.
Description
BACKGROUND OF THE INVENTION

The invention relates to a system for identifying members of a viewing audience.

For a commercial television network, the cost of its advertising time depends critically on the popularity of its programs among the television viewing audience. Popularity, in this case, is typically measured in terms of the program's share of the total audience viewing television at the time the program airs. As a general rule of thumb, advertisers prefer to place their advertisements where they will reach the greatest number of people. Thus, there is a higher demand among commercial advertisers for advertising time slots alongside more popular programs. Such time slots can also command a higher price.

Because the economics of television advertising depends so critically on the tastes and preferences of the television audience, the television industry invests a substantial amount of time, effort and money in measuring those tastes and preferences. One preferred approach involves monitoring the actual viewing habits of a group of volunteer families which represent a cross-section of all people who watch television. Typically, the participants in such a study allow monitoring equipment to be placed in their homes. Whenever a participant watches a television program, the monitoring equipment records the time, the identity of the program and the identity of the members of the viewing audience. Many of these systems require active participation by the television viewer to obtain the monitoring information. That is, the viewer must in some way interact with the equipment to record his presence in the viewing audience. If the viewer forgets to record his presence the monitoring statistics will be incomplete. In general, the less manual intervention required by the television viewer, the more likely it is that the gathered statistics on viewing habits will be complete and error free.

Systems have been developed which automatically identify members of the viewing audience without requiring the viewer to enter any information. For example, U.S. Pat. No. 4,858,000 to Daozheng Lu, issued Aug. 15, 1989, describes such a system. In that system, a scanner using infrared detectors locates a member of the viewing audience, captures an image of the located member, extracts a pattern signature for the captured image and then compares the extracted pattern signature to a set of stored pattern image signatures to identify the audience member.

SUMMARY OF THE INVENTION

In general, in one aspect, the invention is a recognition system for identifying members of an audience. The invention includes an imaging system which generates an image of the audience; a selector module for selecting a portion of the generated image; a detection means which analyzes the selected image portion to determine whether an image of a person is present; and a recognition module for determining whether a detected image of a person resembles one of a reference set of images of individuals.

Preferred embodiments include the following features. The recognition module also determines which one, if any, of the individuals in the reference set the detected image resembles. The selection means includes a motion detector for identifying the selected portion of the image by detecting motion and it includes a locator module for locating the portion of the image corresponding to the face of the person detected. In the recognition system, the detection means and the recognition module employ first and second pattern recognition techniques, respectively, to determine whether an image of a person is present in the selected portion of the image, and both pattern recognition techniques employ a set of eigenvectors in a multi-dimensional image space to characterize the reference set. In addition, the second pattern recognition technique also represents each member of the reference set as a point in a subspace defined by the set of eigenvectors. Also, the image of a person is an image of a person's face and the reference set includes images of faces of the individuals.

Also in preferred embodiments, the recognition system includes means for representing the reference set as a set of eigenvectors in a multi-dimensional image space and the detection means includes means for representing the selected image portion as an input vector in the multi-dimensional image space and means for computing the distance between a point identified by the input vector and a subspace defined by the set of eigenvectors. The detection means also includes a thresholding means for determining whether an image of a person is present by comparing the computed distance to a preselected threshold. The recognition module includes means for representing each member of the reference set as a corresponding point in the subspace. To determine the location of each point in subspace associated with a corresponding member of the reference set, a vector associated with that member is projected onto the subspace.

The recognition module also includes means for projecting the input vector onto the subspace, means for selecting a particular member of the reference set, and means for computing a distance within the subspace between a point identified by the projection of the input vector onto the subspace and the point in the subspace associated with the selected member.

In general, in another aspect, the invention is a method for identifying members of an audience. The invention includes the steps of generating an image of the audience; selecting a portion of the generated image; analyzing the selected image portion to determine whether an image of a person is present; and if an image of a person is determined to be present, determining whether the image of a person resembles one of a reference set of images of individuals.

One advantage of the invention is that it is fast, relatively simple and works well in a constrained environment, i.e., an environment for which the associated image remains relatively constant except for the coming and going of people. In addition, the invention determines whether a selected portion of an image actually contains an image of a face. If it is determined that the selected image portion contains an image of a face, the invention then determines which one of a reference set of known faces the detected face image most resembles. If the detected face image is not present among the reference set, the invention reports the presence of an unknown person in the audience. The invention has the ability to discriminate face images from images of other objects.

Other advantages and features will become apparent from the following description of the preferred embodiment and from the claims.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram of a face recognition system;

FIG. 2 is a flow diagram of an initialization procedure for the face recognition module;

FIG. 3 is a flow diagram of the operation of the face recognition module; and

FIG. 4 is a block diagram of a motion detection system for locating faces within a sequence of images.

STRUCTURE AND OPERATION

Referring to FIG. 1, in an audience monitoring system 2, a video camera 4, which is trained on an area where members of a viewing audience generally sit to watch the TV, sends a sequence of video image frames to a motion detection module 6. Video camera 4, which may, for example, be installed in the home of a family that has volunteered to participate in a study of public viewing habits, generates images of the TV viewing audience. Motion detection module 6 processes the sequence of image frames to identify regions of the recorded scene that contain motion, and thus may be evidence of the presence of a person watching TV. In general, motion detection module 6 accomplishes this by comparing successive frames of the image sequence so as to find those locations containing image data that changes over time. Since the image background (i.e., images of the furniture and other objects in the room) will usually remain unchanged from frame to frame, the areas of movement will generally be evidence of the presence of a person in the viewing audience.
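
The frame comparison described above can be illustrated with a minimal sketch. It assumes simple absolute differencing of two grayscale frames against a fixed threshold; the function names, the threshold value, and the bounding-box step are illustrative assumptions and are not taken from the patent, whose motion detection system is the subject of FIG. 4.

```python
import numpy as np

def motion_mask(prev_frame, next_frame, threshold=25):
    """Boolean mask of pixels whose intensity changed by more than `threshold`.

    `threshold` is a hypothetical tuning value, not a figure from the patent.
    """
    # Widen to a signed type so the subtraction of 8-bit values cannot wrap.
    diff = np.abs(next_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

def bounding_box(mask):
    """Return (top, bottom, left, right) of the changed region, or None."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None  # nothing moved between the two frames
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])

# Static background, then a bright 3x3 patch appears in the next frame.
background = np.zeros((8, 8), dtype=np.uint8)
frame = background.copy()
frame[2:5, 3:6] = 200
box = bounding_box(motion_mask(background, frame))
```

In this toy example the box encloses rows 2 to 4 and columns 3 to 5, which is the block a locator stage would hand on for face analysis.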

When movement is identified, a head locator module 8 selects a block of the image frame containing the movement and sends it to a face recognition module 10 where it is analyzed for the presence of recognizable faces. Face recognition module 10 performs two functions. First, it determines whether the image data within the selected block resembles a face. Then, if it does resemble a face, module 10 determines whether the face is one of a reference set of faces. The reference set may include, for example, the images of faces of all members of the family in whose house the audience monitoring system has been installed.

To perform its recognition functions, face recognizer 10 employs a multi-dimensional representation in which face images are characterized by a set of eigenvectors or "eigenfaces". In general, according to this technique, each image is represented as a vector (or a point) in very high dimensional image space in which each pixel of the image is represented by a corresponding dimension or axis. The dimension of this image space thus depends upon the size of the image being represented and can become very large for any reasonably sized image. For example, if the block of image data is N pixels by N pixels, then the multi-dimensional image space has dimension $N^2$. The image vector which represents the N×N block of image data in this multi-dimensional image space is constructed by simply concatenating the rows of the image data to generate a vector of length $N^2$.
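
As a small illustration of the row concatenation just described, the following sketch flattens an N×N block into a vector of length N². The helper name `image_to_vector` is an assumption made for this example.

```python
import numpy as np

def image_to_vector(block):
    """Concatenate the rows of an N x N image block into a length N*N vector."""
    block = np.asarray(block, dtype=np.float64)
    if block.ndim != 2 or block.shape[0] != block.shape[1]:
        raise ValueError("expected a square N x N block")
    # reshape(-1) on a C-ordered array walks the data row by row.
    return block.reshape(-1)

block = np.arange(9).reshape(3, 3)   # a 3 x 3 toy "image"
vec = image_to_vector(block)         # a point in 9-dimensional image space
```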

Face images, like all other possible images, are represented by points within this multi-dimensional image space. The distribution of faces, however, tends to be grouped within a region of the image space. Thus, the distribution of faces of the reference set can be characterized by using principal component analysis. The resulting principal components of the distribution of faces, or the eigenvectors of the covariance matrix of the set of face images, define the variation among the set of face images. These eigenvectors are typically ordered, each one accounting for a different amount of variation among the face images. They can be thought of as a set of features which together characterize the variation between face images within the reference set. Each face image location within the multi-dimensional image space contributes more or less to each eigenvector, so that each eigenvector represents a sort of ghostly face which is referred to herein as an eigenface.

Each individual face from the reference set can be represented exactly in terms of a linear combination of M non-zero eigenfaces. Each face can also be approximated using only the M' "best" eigenfaces, i.e., those that have the largest eigenvalues, and which therefore account for the most variance within the set of face images. The best M' eigenfaces span an M'-dimensional subspace (referred to hereinafter as "face space") of all possible images.

This approach to face recognition involves the initialization operations shown in FIG. 2 to "train" recognition module 10. First, a reference set of face images is obtained and each of the faces of that set is represented as a corresponding vector or point in the multi-dimensional image space (step 100). Then, using principal component analysis, the distribution of points for the reference set of faces is characterized in terms of a set of eigenvectors (or eigenfaces) (step 102). If a full characterization of the distribution of points is performed, it will yield $N^2$ eigenfaces of which M are non-zero. Of these, only the M' eigenfaces corresponding to the highest eigenvalues are chosen, where $M' < M \ll N^2$. This subset of eigenfaces is used to define a subspace (or face space) within the multi-dimensional image space. Finally, each member of the reference set is represented by a corresponding point within face space (step 104). For a given face, this is accomplished by projecting its point in the higher dimensional image space onto face space.

If additional faces are added to the reference set at a later time, these operations are repeated to update the set of eigenfaces characterizing the reference set.

After face recognition module 10 is initialized, it implements the steps shown in FIG. 3 to recognize face images supplied by head locator module 8. First, face recognition module 10 projects the input image (i.e., the image presumed to contain a face) onto face space by projecting it onto each of the M' eigenfaces (step 200). Then, module 10 determines whether the input image is a face at all (whether known or unknown) by checking to see if the image is sufficiently close to "face space" (step 202). That is, module 10 computes how far the input image in the multi-dimensional image space is from the face space and compares this to a preselected threshold. If the computed distance is greater than the preselected threshold, module 10 indicates that the input image does not represent a face, and motion detection module 6 locates the next block of the overall image which may contain a face image.

If the input image is sufficiently close to face space (i.e., the computed distance is less than the preselected threshold), recognition module 10 treats it as a face image and proceeds with determining whose face it is (step 206). This involves computing distances between the projection of the input image onto face space and each of the reference face images in face space. If the projected input image is sufficiently close to any one of the reference faces (i.e., the computed distance in face space is less than a predetermined distance), recognition module 10 identifies the input image as belonging to the individual associated with that reference face. If the projected input image is not sufficiently close to any one of the reference faces, recognition module 10 reports that a person has been located but the identity of the person is unknown.

The mathematics underlying each of these steps will now be described in greater detail.

Calculating Eigenfaces

Let a face image I(x,y) be a two-dimensional N by N array of (8-bit) intensity values. The face image is represented in the multi-dimensional image space as a vector of dimension $N^2$. Thus, a typical image of size 256 by 256 becomes a vector of dimension 65,536, or, equivalently, a point in 65,536-dimensional image space. An ensemble of images, then, maps to a collection of points in this huge space.

Images of faces, being similar in overall configuration, are not randomly distributed in this huge image space and thus can be described by a relatively low dimensional subspace. Using principal component analysis, one identifies the vectors which best account for the distribution of face images within the entire image space. These vectors, namely, the "eigenfaces", define the "face space". Each vector is of length $N^2$, describes an N by N image, and is a linear combination of the original face images of the reference set.

Let the training set of face images be $\Gamma_1, \Gamma_2, \Gamma_3, \ldots, \Gamma_M$. The average face of the set is defined by

$$\Psi = \frac{1}{M}\sum_{n=1}^{M}\Gamma_n. \tag{1}$$

Each face differs from the average by the vector $\Phi_i = \Gamma_i - \Psi$. This set of very large vectors is then subject to principal component analysis, which seeks a set of M orthonormal vectors, $u_n$, which best describes the distribution of the data. The kth vector, $u_k$, is chosen such that:

$$\lambda_k = \frac{1}{M}\sum_{n=1}^{M}\left(u_k^T\Phi_n\right)^2 \tag{2}$$

is a maximum, subject to:

$$u_l^T u_k = \delta_{lk} = \begin{cases} 1, & l = k \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

The vectors $u_k$ and scalars $\lambda_k$ are the eigenvectors and eigenvalues, respectively, of the covariance matrix

$$C = \frac{1}{M}\sum_{n=1}^{M}\Phi_n\Phi_n^T = \frac{1}{M}AA^T, \tag{4}$$

where the matrix $A = [\Phi_1\ \Phi_2\ \ldots\ \Phi_M]$. The matrix C, however, is $N^2$ by $N^2$, and determining the $N^2$ eigenvectors and eigenvalues can become an intractable task for typical image sizes.

If the number of data points in the face space is less than the dimension of the overall image space (namely, if $M < N^2$), there will be only $M-1$, rather than $N^2$, meaningful eigenvectors. (The remaining eigenvectors will have associated eigenvalues of zero.) One can solve for the $N^2$-dimensional eigenvectors in this case by first solving for the eigenvectors of an M by M matrix (e.g., solving a 16×16 matrix rather than a 16,384 by 16,384 matrix) and then taking appropriate linear combinations of the face images $\Phi_i$. Consider the eigenvectors $v_i$ of $A^TA$ such that:

$$A^TAv_i = \mu_i v_i \tag{5}$$

Premultiplying both sides by A yields:

$$AA^TAv_i = \mu_i Av_i \tag{6}$$

from which it is apparent that the vectors $Av_i$ are eigenvectors of $AA^T$, and therefore of the covariance matrix C, since a constant scale factor changes only the eigenvalues.

Following this analysis, it is possible to construct the M by M matrix $L = A^TA$, where $L_{mn} = \Phi_m^T\Phi_n$, and find the M eigenvectors, $v_l$, of L. These vectors determine linear combinations of the M training set face images to form the eigenfaces $u_l$:

$$u_l = \sum_{k=1}^{M} v_{lk}\Phi_k, \qquad l = 1, \ldots, M \tag{7}$$

With this analysis the calculations are greatly reduced, from the order of the number of pixels in the images ($N^2$) to the order of the number of images in the training set (M). In practice, the training set of face images will be relatively small ($M \ll N^2$), and the calculations become quite manageable. The associated eigenvalues provide a basis for ranking the eigenvectors according to their usefulness in characterizing the variation among the images.

In practice, a smaller M' is sufficient for identification, since accurate construction of the image is not a requirement. In this framework, identification becomes a pattern recognition task. The eigenfaces span an M'-dimensional subspace of the original $N^2$-dimensional image space. The M' significant eigenvectors of the L matrix are chosen as those with the largest associated eigenvalues. In test cases based upon M=16 face images, M'=7 eigenfaces were found to yield acceptable results, i.e., a level of accuracy sufficient for monitoring a TV audience for purposes of studying viewing habits and tastes.
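
The reduced eigenvalue computation described in this section can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the function name `compute_eigenfaces`, the use of `numpy.linalg.eigh`, the unit-length normalization of each eigenface, and the random stand-in data are all choices made for this example.

```python
import numpy as np

def compute_eigenfaces(faces, num_components):
    """Return (mean_face, eigenfaces, eigenvalues) for a training set.

    faces: array of shape (M, N*N), one flattened face image per row.
    eigenfaces: array of shape (M', N*N), one unit-length eigenface per row.
    """
    faces = np.asarray(faces, dtype=np.float64)
    M = faces.shape[0]
    if not 1 <= num_components <= M - 1:
        raise ValueError("at most M - 1 eigenfaces are meaningful")
    mean_face = faces.mean(axis=0)          # average face Psi
    A = (faces - mean_face).T               # columns are the Phi_n
    L = A.T @ A                             # small M x M matrix
    mu, V = np.linalg.eigh(L)               # eigenvalues in ascending order
    order = np.argsort(mu)[::-1][:num_components]
    U = A @ V[:, order]                     # each column is A v_l
    U /= np.linalg.norm(U, axis=0)          # normalize to unit length
    # Dividing by M gives the eigenvalues of the covariance matrix.
    return mean_face, U.T, mu[order] / M

# Random stand-in data: M = 6 "images" of 4 x 4 pixels, keeping M' = 3.
rng = np.random.default_rng(0)
faces = rng.random((6, 16))
mean_face, eigenfaces, eigenvalues = compute_eigenfaces(faces, 3)
```

Only a 6×6 eigenproblem is solved here even though the image space has 16 dimensions; with real images the saving is far larger, since the small matrix grows with the number of training faces rather than the number of pixels.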

A new face image (Γ) is transformed into its eigenface components (i.e., projected into "face space") by a simple operation,

$$\omega_k = u_k^T (\Gamma - \Psi) \qquad (8)$$

for $k = 1, \ldots, M'$. This describes a set of point-by-point image multiplications and summations, operations which may be performed at approximately frame rate on current image processing hardware.
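The projection of Eq. 8 is one dot product per eigenface. A minimal sketch, assuming images are flattened into plain lists and using a made-up mean image and two made-up orthonormal eigenfaces:

```python
def project_to_face_space(gamma, psi, eigenfaces):
    """Eq. 8: omega_k = u_k^T (Gamma - Psi) for each eigenface u_k."""
    phi = [g - p for g, p in zip(gamma, psi)]     # mean-adjusted image
    return [sum(u_i * phi_i for u_i, phi_i in zip(u, phi)) for u in eigenfaces]

# Illustrative values only: a 4-pixel "image", mean image, and two orthonormal eigenfaces.
psi = [10.0, 10.0, 10.0, 10.0]
eigenfaces = [[0.5, 0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5, -0.5]]
gamma = [12.0, 10.0, 14.0, 8.0]

omega = project_to_face_space(gamma, psi, eigenfaces)   # pattern vector of length M' = 2
```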

The weights form a vector $\Omega^T = [\omega_1, \omega_2, \ldots, \omega_{M'}]$ that describes the contribution of each eigenface in representing the input face image, treating the eigenfaces as a basis set for face images. The vector may then be used in a standard pattern recognition algorithm to find which of a number of pre-defined face classes, if any, best describes the face. The simplest method for determining which face class provides the best description of an input face image is to find the face class $k$ that minimizes the Euclidean distance

$$\varepsilon_k^2 = \|\Omega - \Omega_k\|^2 \qquad (9)$$

where $\Omega_k$ is a vector describing the $k$th face class. The face classes $\Omega_i$ are calculated by averaging the results of the eigenface representation over a small number of face images (as few as one) of each individual. A face is classified as belonging to class $k$ when the minimum $\varepsilon_k$ is below some chosen threshold $\theta_\varepsilon$. Otherwise the face is classified as "unknown", and optionally used to create a new face class.
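The nearest-class rule of Eq. 9 can be sketched as follows. The class labels, class vectors, and threshold value are hypothetical and chosen only to show both outcomes.

```python
import math

def nearest_face_class(omega, class_vectors, theta_eps):
    """Return (label, distance) of the closest class, or ("unknown", distance) if too far."""
    best_label, best_dist = None, math.inf
    for label, omega_k in class_vectors.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(omega, omega_k)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist < theta_eps:                 # minimum epsilon_k below the threshold
        return best_label, best_dist
    return "unknown", best_dist

# Hypothetical class vectors in a 2-dimensional face space.
classes = {"person_a": [2.0, 4.0], "person_b": [-3.0, 1.0]}

match = nearest_face_class([2.5, 3.5], classes, theta_eps=1.0)
miss = nearest_face_class([10.0, 10.0], classes, theta_eps=1.0)
```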

Because creating the vector of weights is equivalent to projecting the original face image onto the low-dimensional face space, many images (most of them looking nothing like a face) will project onto a given pattern vector. This is not a problem for the system, however, since the distance $\varepsilon$ between the image and the face space is easily computed: its square is simply the squared distance between the mean-adjusted input image $\Phi = \Gamma - \Psi$ and $\Phi_f = \sum_{k=1}^{M'} \omega_k u_k$, its projection onto face space:

$$\varepsilon^2 = \|\Phi - \Phi_f\|^2 \qquad (10)$$

Thus, there are four possibilities for an input image and its pattern vector: (1) near face space and near a face class; (2) near face space but not near a known face class; (3) distant from face space and near a face class; and (4) distant from face space and not near a known face class.

In the first case, an individual is recognized and identified. In the second case, an unknown individual is present. The last two cases indicate that the image is not a face image. Case three typically shows up as a false positive in most other recognition systems. In the described embodiment, however, the false recognition may be detected because of the significant distance between the image and the subspace of expected face images.
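Case three can be reproduced on a toy example: two inputs with identical pattern vectors, only one of which lies in face space. This is a sketch under assumed values (two made-up orthonormal eigenfaces and hand-built mean-adjusted inputs), not data from the described embodiment.

```python
import math

def pattern_vector(phi, eigenfaces):
    """Weights omega_k = u_k^T Phi for a mean-adjusted image Phi."""
    return [sum(u_i * p_i for u_i, p_i in zip(u, phi)) for u in eigenfaces]

def distance_from_face_space(phi, eigenfaces):
    """Eq. 10: epsilon = ||Phi - Phi_f||, with Phi_f the projection of Phi onto face space."""
    omega = pattern_vector(phi, eigenfaces)
    phi_f = [sum(w * u[j] for w, u in zip(omega, eigenfaces)) for j in range(len(phi))]
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(phi, phi_f)))

eigenfaces = [[0.5, 0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5, -0.5]]

face_like = [2.0, 0.0, 2.0, 0.0]        # lies inside the span of the eigenfaces
non_face = [7.0, 5.0, -3.0, -5.0]       # face_like plus a component orthogonal to face space

same_pattern = pattern_vector(face_like, eigenfaces) == pattern_vector(non_face, eigenfaces)
eps_face = distance_from_face_space(face_like, eigenfaces)
eps_non_face = distance_from_face_space(non_face, eigenfaces)
```

Both inputs would be matched to the same face class by the pattern vector alone; only the distance from face space separates them.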

Summary of Eigenface Recognition Procedure

To summarize, the eigenfaces approach to face recognition involves the following steps:

1. Collect a set of characteristic face images of the known individuals. This set may include a number of images for each person, with some variation in expression and in lighting. (Say four images of ten people, so M=40.)

2. Calculate the (40×40) matrix L, find its eigenvectors and eigenvalues, and choose the M' eigenvectors with the highest associated eigenvalues. (Let M'=10 in this example.)

3. Combine the normalized training set of images according to Eq. 7 to produce the (M'=10) eigenfaces uk.

4. For each known individual, calculate the class vector $\Omega_k$ by averaging the eigenface pattern vectors $\Omega$ (from Eq. 8) calculated from the original (four) images of the individual. Choose a threshold $\theta_\varepsilon$ which defines the maximum allowable distance from any face class, and a threshold $\theta_t$ which defines the maximum allowable distance from face space (according to Eq. 10).

5. For each new face image to be identified, calculate its pattern vector $\Omega$, the distances $\varepsilon_i$ to each known class, and the distance $\varepsilon$ to face space. If $\varepsilon > \theta_t$, classify the input image as not a face. If the minimum distance $\varepsilon_k \le \theta_\varepsilon$ and $\varepsilon \le \theta_t$, classify the input face as the individual associated with class vector $\Omega_k$. If the minimum distance $\varepsilon_k > \theta_\varepsilon$ and $\varepsilon \le \theta_t$, then the image may be classified as "unknown", and optionally used to begin a new face class.

6. If the new image is classified as a known individual, this image may be added to the original set of familiar face images, and the eigenfaces may be recalculated (steps 1-4). This gives the opportunity to modify the face space as the system encounters more instances of known faces.
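Steps 4 and 5 of the summary can be sketched as two small functions. The individual names, pattern vectors, and both threshold values are assumptions made for illustration; the distance from face space is taken as an already computed input.

```python
import math

def class_vector(pattern_vectors):
    """Step 4: average the pattern vectors of one individual's training images."""
    n = len(pattern_vectors)
    return [sum(column) / n for column in zip(*pattern_vectors)]

def classify(omega, eps_face_space, class_vectors, theta_eps, theta_t):
    """Step 5: return "not a face", "unknown", or the label of the matching class."""
    if eps_face_space > theta_t:
        return "not a face"
    label, eps_k = min(
        ((name, math.dist(omega, vec)) for name, vec in class_vectors.items()),
        key=lambda item: item[1],
    )
    return label if eps_k <= theta_eps else "unknown"

# Hypothetical training data: two pattern vectors per individual.
classes = {
    "alice": class_vector([[1.0, 2.0], [3.0, 2.0]]),
    "bob": class_vector([[-4.0, 0.0], [-6.0, 2.0]]),
}

known = classify([2.2, 1.9], 0.5, classes, theta_eps=1.0, theta_t=3.0)
not_face = classify([2.2, 1.9], 9.0, classes, theta_eps=1.0, theta_t=3.0)
unknown = classify([0.0, 8.0], 0.5, classes, theta_eps=1.0, theta_t=3.0)
```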

In the described embodiment, calculation of the eigenfaces is done offline as part of the training. The recognition currently takes about 400 msec running rather inefficiently in Lisp on a Sun 4, using face images of size 128×128. With some special-purpose hardware, the current version could run at close to frame rate (33 msec).

Designing a practical system for face recognition within this framework requires assessing the tradeoffs between generality, required accuracy, and speed. If the face recognition task is restricted to a small set of people (such as the members of a family or a small company), a small set of eigenfaces is adequate to span the faces of interest. If the system is to learn new faces or represent many people, a larger basis set of eigenfaces will likely be required.

Motion Detection And Head Tracking

In the described embodiment, motion detection module 6 and head locator module 8 locate and track the position of the head of any person within the scene viewed by video camera 4 by implementing the tracking algorithm depicted in FIG. 4. A sequence of image frames 30 from video camera 4 first passes through a spatio-temporal filtering module 32 which accentuates image locations which change with time. Spatio-temporal filtering module 32 identifies the locations of motion by performing a differencing operation on successive frames of the sequence of image frames. In the output of the spatio-temporal filter module 32, a moving person "lights up" whereas the other areas of the image containing no motion appear as black.

The spatio-temporal filtered image passes to a thresholding module 34 which produces a binary motion image identifying the locations of the image for which the motion exceeds a preselected threshold. That is, it locates the areas of the image containing the most motion. In all such areas, the presence of a person is postulated.
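The differencing and thresholding stages can be illustrated with the simplest possible variant: an absolute difference of two successive frames followed by a fixed threshold. The actual filter in module 32 may be more elaborate; the frame values and threshold below are assumed.

```python
def binary_motion_image(previous_frame, current_frame, threshold):
    """Return 1 where the absolute inter-frame difference exceeds threshold, else 0."""
    return [
        [1 if abs(cur - prev) > threshold else 0 for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous_frame, current_frame)
    ]

# Two tiny grayscale frames (illustrative values); something moves in the middle column.
previous_frame = [[10, 10, 10],
                  [10, 10, 10],
                  [10, 10, 10]]
current_frame = [[10, 90, 10],
                 [10, 95, 12],
                 [10, 10, 10]]

motion = binary_motion_image(previous_frame, current_frame, threshold=20)
```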

A motion analyzer module 36 analyzes the binary motion image to watch how "motion blobs" change over time to decide if the motion is caused by a person moving and to determine head position. A few simple rules are applied, such as "the head is the small upper blob above a larger blob (i.e., the body)", and "head motion must be reasonably slow and contiguous" (i.e., heads are not expected to jump around the image erratically).
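The two quoted rules can be expressed as a small heuristic. This is only a sketch: the blob representation (bounding boxes as dictionaries with `x`, `y`, `w`, `h`, with `y` increasing downward), the `max_jump` parameter, and the example blobs are all assumptions and do not come from the described embodiment.

```python
def find_head(blobs, previous_head=None, max_jump=20):
    """Pick a blob that sits directly above a larger blob and has not jumped too far."""
    for head in sorted(blobs, key=lambda b: b["y"]):        # topmost candidates first
        if previous_head is not None:
            dx = head["x"] - previous_head["x"]
            dy = head["y"] - previous_head["y"]
            if dx * dx + dy * dy > max_jump ** 2:
                continue                                    # erratic jump: reject candidate
        for body in blobs:
            larger = body["w"] * body["h"] > head["w"] * head["h"]
            overlaps_x = (head["x"] < body["x"] + body["w"]
                          and body["x"] < head["x"] + head["w"])
            is_above = head["y"] + head["h"] <= body["y"]
            if body is not head and larger and overlaps_x and is_above:
                return head
    return None

blobs = [
    {"x": 40, "y": 60, "w": 60, "h": 120},   # large lower blob (body)
    {"x": 55, "y": 20, "w": 30, "h": 35},    # small upper blob (head)
    {"x": 200, "y": 10, "w": 10, "h": 10},   # isolated noise blob
]

head = find_head(blobs)
jumped = find_head(blobs, previous_head={"x": 150, "y": 20})
```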

The motion image also allows for an estimate of scale. The size of the blob that is assumed to be the moving head determines the size of the subimage to send to face recognition module 10 (see FIG. 1). This subimage is rescaled to fit the dimensions of the eigenfaces.
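Rescaling the subimage to the eigenface dimensions can be done with any resampling method; the text does not say which one is used. A minimal sketch using nearest-neighbour resampling (an assumption) on a tiny made-up subimage:

```python
def rescale_nearest(image, out_height, out_width):
    """Nearest-neighbour resampling of a 2-D list of pixel values."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]

subimage = [[1, 2],
            [3, 4]]

enlarged = rescale_nearest(subimage, 4, 4)     # upscale to a 4 x 4 "eigenface size"
restored = rescale_nearest(enlarged, 2, 2)     # downscale back to 2 x 2
```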

Using "Face Space" To Locate The Face

Face space may also be used to locate faces in single images, either as an alternative to locating faces from motion (e.g. if there is too little motion or many moving objects) or as a method of achieving more precision than is possible by use of motion tracking alone.

Typically, images of faces do not change radically when projected into the face space, whereas the projections of non-face images appear quite different. This basic idea may be used to detect the presence of faces in a scene. To implement this approach, the distance $\varepsilon$ between the local subimage and face space is calculated at every location in the image. This calculated distance from face space is then used as a measure of "faceness". The result of calculating the distance from face space at every point in the image is a "face map" $\varepsilon(x,y)$ in which low values (i.e., the dark areas) indicate the presence of a face.
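A brute-force face map can be sketched by sliding a window over the image and evaluating the squared distance from face space at each position. For brevity this sketch indexes each window by its top-left corner rather than its center, and uses a single made-up 2×2 eigenface and mean image; all values are illustrative.

```python
def face_map(image, psi, eigenfaces, win):
    """Squared distance from face space for every win x win window (top-left indexed)."""
    rows, cols = len(image), len(image[0])
    result = []
    for y in range(rows - win + 1):
        row_out = []
        for x in range(cols - win + 1):
            gamma = [image[y + dy][x + dx] for dy in range(win) for dx in range(win)]
            phi = [g - p for g, p in zip(gamma, psi)]
            omega = [sum(u_i * p_i for u_i, p_i in zip(u, phi)) for u in eigenfaces]
            phi_f = [sum(w * u[j] for w, u in zip(omega, eigenfaces))
                     for j in range(len(phi))]
            row_out.append(sum((p - q) ** 2 for p, q in zip(phi, phi_f)))
        result.append(row_out)
    return result

psi = [5.0, 5.0, 5.0, 5.0]                    # flattened 2 x 2 mean image (assumed)
eigenfaces = [[0.5, -0.5, 0.5, -0.5]]         # one flattened 2 x 2 eigenface (assumed)
image = [[0, 0, 7, 3],
         [0, 0, 7, 3],
         [9, 1, 0, 0]]                        # the face-like patch sits at row 0, column 2

fmap = face_map(image, psi, eigenfaces, win=2)
best_value, best_location = min(
    (fmap[y][x], (y, x)) for y in range(len(fmap)) for x in range(len(fmap[0]))
)
```

The minimum of the map falls on the window that matches the mean image plus a multiple of the eigenface, which is the "dark area" the text describes.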

Direct application of Eq. 10, however, is rather expensive computationally. A simpler, more efficient method of calculating the face map ε(x,y) is as follows.

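To calculate the face map at every pixel of an image $I(x,y)$, the subimage centered at that pixel is projected onto face space and the projection is then subtracted from the original subimage. To project a subimage $\Gamma$ onto face space, one first subtracts the mean image (i.e., $\Psi$), resulting in $\Phi = \Gamma - \Psi$. With $\Phi_f$ being the projection of $\Phi$ onto face space, the distance measure at a given image location is then:

$$\varepsilon^2 = \|\Phi - \Phi_f\|^2 = (\Phi - \Phi_f)^T (\Phi - \Phi_f) = \Phi^T \Phi - \Phi^T \Phi_f - \Phi_f^T (\Phi - \Phi_f) = \Phi^T \Phi - \Phi_f^T \Phi_f \qquad (11)$$

since $\Phi_f \perp (\Phi - \Phi_f)$. Because $\Phi_f$ is a linear combination of the eigenfaces ($\Phi_f = \sum_{i=1}^{L} \omega_i u_i$) and the eigenfaces are orthonormal vectors,

$$\Phi_f^T \Phi_f = \sum_{i=1}^{L} \omega_i^2 \qquad (12)$$

and

$$\varepsilon^2(x,y) = \Phi^T(x,y)\,\Phi(x,y) - \sum_{i=1}^{L} \omega_i^2(x,y) \qquad (13)$$

where $\varepsilon(x,y)$ and $\omega_i(x,y)$ are scalar functions of image location, and $\Phi(x,y)$ is a vector function of image location.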

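The second term of Eq. 13 is calculated in practice by a correlation with the $L$ eigenfaces:

$$\sum_{i=1}^{L} \omega_i^2(x,y) = \sum_{i=1}^{L} \left[ u_i^T \Phi(x,y) \right]^2 = \sum_{i=1}^{L} \left[ u_i^T \Gamma(x,y) - u_i^T \Psi \right]^2 = \sum_{i=1}^{L} \left[ I(x,y) \otimes u_i - \Psi \otimes u_i \right]^2 \qquad (14)$$

where $\otimes$ is the correlation operator. The first term of Eq. 13 becomes

$$\Phi^T(x,y)\,\Phi(x,y) = \left[ \Gamma(x,y) - \Psi \right]^T \left[ \Gamma(x,y) - \Psi \right] = \Gamma^T(x,y)\,\Gamma(x,y) - 2\, I(x,y) \otimes \Psi + \Psi^T \Psi \qquad (15)$$

Since the average face $\Psi$ and the eigenfaces $u_i$ are fixed, the terms $\Psi^T \Psi$ and $\Psi \otimes u_i$ may be computed ahead of time.

Thus, the computation of the face map involves only $L+1$ correlations over the input image and the computation of the first term $\Gamma^T(x,y)\,\Gamma(x,y)$. This is computed by squaring the input image $I(x,y)$ and, at each image location, summing the squared values of the local subimage.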
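The decomposition can be checked at a single image location, where each correlation reduces to a dot product between the local subimage and a fixed template. The sketch below compares the direct evaluation of Eq. 10 with the expanded form of Eq. 13, using made-up values for the mean image, the eigenfaces, and the subimage.

```python
def dot(a, b):
    """Inner product of two flattened images."""
    return sum(x * y for x, y in zip(a, b))

def eps2_direct(gamma, psi, eigenfaces):
    """Eq. 10: project Phi onto face space and measure the squared residual."""
    phi = [g - p for g, p in zip(gamma, psi)]
    omega = [dot(u, phi) for u in eigenfaces]
    phi_f = [sum(w * u[j] for w, u in zip(omega, eigenfaces)) for j in range(len(phi))]
    return sum((p - q) ** 2 for p, q in zip(phi, phi_f))

def eps2_fast(gamma, psi, eigenfaces, psi_sq, psi_dot_u):
    """Eq. 13 expanded: Gamma.Gamma - 2 Gamma.Psi + Psi.Psi - sum_i (Gamma.u_i - Psi.u_i)^2."""
    first = dot(gamma, gamma) - 2.0 * dot(gamma, psi) + psi_sq
    second = sum((dot(gamma, u) - pu) ** 2 for u, pu in zip(eigenfaces, psi_dot_u))
    return first - second

psi = [5.0, 5.0, 5.0, 5.0]
eigenfaces = [[0.5, 0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5, -0.5]]

# Terms that depend only on Psi and the eigenfaces are computed once, ahead of time.
psi_sq = dot(psi, psi)
psi_dot_u = [dot(psi, u) for u in eigenfaces]

gamma = [9.0, 1.0, 0.0, 4.0]                  # one local subimage (illustrative)
direct = eps2_direct(gamma, psi, eigenfaces)
fast = eps2_fast(gamma, psi, eigenfaces, psi_sq, psi_dot_u)
```

In a full implementation the per-location dot products with $\Psi$ and each $u_i$ would be produced for all locations at once by correlating the image with those templates.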

Scale Invariance

Experiments reveal that recognition performance decreases quickly as the head size, or scale, is misjudged. The head size in the input image must therefore be close to that of the eigenfaces. The motion analysis can give an estimate of head size, from which the face image is rescaled to the eigenface size.

Another approach to the scale problem, which may be separate from or in addition to the motion estimate, is to use multiscale eigenfaces, in which an input face image is compared with eigenfaces at a number of scales. In this case the image will appear to be near the face space of only the closest scale eigenfaces. Equivalently, the input image (i.e., the portion of the overall image selected for analysis) can be scaled to multiple sizes, and the scale that results in the smallest distance measure to face space is used.
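The second variant (rescaling the selected portion to several sizes and keeping the best one) can be sketched as a search over candidate crop sizes. The crop strategy (a top-left square), the nearest-neighbour resampling, and all numeric values are assumptions made to keep the example small.

```python
def rescale_nearest(image, size):
    """Nearest-neighbour resampling of a square 2-D list to size x size."""
    n = len(image)
    return [[image[r * n // size][c * n // size] for c in range(size)] for r in range(size)]

def eps_squared(gamma, psi, eigenfaces):
    """Squared distance from face space (Eq. 10) for a flattened subimage."""
    phi = [g - p for g, p in zip(gamma, psi)]
    omega = [sum(a * b for a, b in zip(u, phi)) for u in eigenfaces]
    phi_f = [sum(w * u[j] for w, u in zip(omega, eigenfaces)) for j in range(len(phi))]
    return sum((p - q) ** 2 for p, q in zip(phi, phi_f))

def best_scale(image, candidate_sizes, psi, eigenfaces, face_size):
    """Return the crop size whose rescaled crop is closest to face space, plus all scores."""
    scores = {}
    for size in candidate_sizes:
        crop = [row[:size] for row in image[:size]]       # top-left square crop
        small = rescale_nearest(crop, face_size)
        gamma = [pixel for row in small for pixel in row]
        scores[size] = eps_squared(gamma, psi, eigenfaces)
    return min(scores, key=scores.get), scores

image = [[6, 6, 2, 2] for _ in range(4)]      # "face" pattern spanning the full 4 x 4 image
psi = [4.0, 4.0, 4.0, 4.0]                    # flattened 2 x 2 mean image (assumed)
eigenfaces = [[0.5, -0.5, 0.5, -0.5]]         # one flattened 2 x 2 eigenface (assumed)

chosen, scores = best_scale(image, [2, 3, 4], psi, eigenfaces, face_size=2)
```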

Other embodiments are within the following claims. For example, although the eigenfaces approach to face recognition has been presented as an information processing model, it may also be implemented using simple parallel computing elements, as in a connectionist system or artificial neural network.

Classifications
U.S. Classification: 382/118, 382/204, 382/201
International Classification: G07C9/00, A61B5/117, H04N7/28, H04H60/59, G06K9/62, H04N7/26, G06K9/00, H04H60/45, H04H1/00, H04H60/56
Cooperative Classification: H04N19/00387, H04N19/00, H04N21/42201, H04N19/00963, G06K9/00241, H04H60/45, H04H60/56, G06K9/6232, G06K9/6247, G06K9/00228, A61B5/1176, G07C9/00158, H04H60/59, G06K9/00275
European Classification: H04N21/422B, G07C9/00C2D, A61B5/117F, H04N7/26, G06K9/00F2H, H04N7/28, G06K9/00F1H, G06K9/62B4P, G06K9/62B4, H04N7/26J4, G06K9/00F1, H04H60/45, H04H60/56
Legal Events
Date | Code | Event | Description
Jul. 25, 2005 | PRDP | Patent reinstated due to the acceptance of a late maintenance fee | Effective date: 19990112
Jun. 30, 2005 | FPAY | Fee payment | Year of fee payment: 12
Jun. 30, 2005 | SULP | Surcharge for late payment |
Aug. 24, 2000 | FPAY | Fee payment | Year of fee payment: 8
Aug. 24, 2000 | SULP | Surcharge for late payment |