WO2004086751A2 - Method for estimating logo visibility and exposure in video - Google Patents


Info

Publication number
WO2004086751A2
Authority
WO
WIPO (PCT)
Prior art keywords
tuples
logo
pattern
points
invariants
Prior art date
Application number
PCT/CH2004/000182
Other languages
French (fr)
Other versions
WO2004086751A3 (en)
Inventor
Sergei Startchik
Original Assignee
Sergei Startchik
Priority date
Filing date
Publication date
Application filed by Sergei Startchik filed Critical Sergei Startchik
Priority to EP20040722778 priority Critical patent/EP1611738A2/en
Publication of WO2004086751A2 publication Critical patent/WO2004086751A2/en
Publication of WO2004086751A3 publication Critical patent/WO2004086751A3/en


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/60 - Type of objects
    • G06V20/62 - Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/635 - Overlay text, e.g. embedded captions in a TV program
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 - Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 - Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/757 - Matching configurations of points or features

Definitions

  • This step is an optional step in the search algorithm.
  • Fig. 1 An example image of a logo and the tuple being learned.
  • Fig. 5 Algorithm for finding a learned 5-tuple in the new image (pixel-based algorithm).
  • Fig. 6 Algorithm for finding a learned 5-tuple in the new image (sub-pixel algorithm).
  • The algorithms for learning tuples and for searching for them can be implemented as image processing software modules. This software can be run on a DSP within an embedded system or on a standard computer that is connected to a camera.

Abstract

The visibility and exposure of logos in video is of important commercial concern for advertisement on television. During a tennis match, for example, a logo placed on a flat or round surface appears from different viewpoints, partially visible or occluded. The logo visibility time, its size (relative to the screen), its position (relative to the terrain) and the percentage of the non-occluded part allow its advertising impact to be computed directly. This patent discloses a method to automatically compute the visible part of a given logo in a video sequence. For each frame of the video sequence the method computes four parameters that describe the visibility of a given logo. The first is the percentage of the visible part with respect to the whole logo. The second and third are the position and size of the logo with respect to the image. The fourth is the quality (blur, motion) of the visible part. The advantage of the new method is its capacity to work with real video sequences including high occlusion, illumination changes and blur. These conditions are realistic, making the method very useful and robust for this application. (Fig. 1) The obtained information can be used for logo modification or replacement as well as for quality enhancement.

Description

Method for estimating logo visibility and exposure in video.
References cited
US 2003/0016921 A1
TW 434520, US 6,100,941, WO 0007367, US 4817166, WO 0152547
Keywords: Logo visibility, Multimedia, Advertisement, Video sequence, geometric invariants, chromatic invariants, point tuples.
10 claims, 7 drawing sheets
I. TECHNICAL FIELD OF THE INVENTION
The visibility and exposure of logos in video is of important commercial concern for advertisement on television. During a tennis match, for example, a logo placed on a flat or round surface appears from different viewpoints, partially visible or occluded. The logo visibility time, its size (relative to the screen), its position (relative to the terrain) and the percentage of the non-occluded part allow its advertising impact to be computed directly. This patent discloses a method for the automatic computation of the visible part of a given logo in a video sequence. For each frame of the video sequence the method computes four parameters that describe the visibility of a given logo. The first parameter is the percentage of the visible part with respect to the whole logo. The second and third are the position and size of the logo with respect to the image. The fourth is the quality (motion blur) of the visible part.
The present invention belongs to the domain of registration (or matching) methods in computer vision. In particular, methods for the registration of two-dimensional (planar) patterns such as logos under projective transformation (modelling the image acquisition by a camera) are addressed. The registration of patterns is based on the registration of features of those patterns. Therefore, registration methods are classified with respect to the features that are used.
II. BACKGROUND ART
Technically speaking, some methods use template matching, where an example of the logo original is matched to the image by applying all possible transformations to the original and evaluating the match. The best match found gives the position of the logo. This method is very sensitive to occlusion, size changes and blur. Another family of methods uses local features of the pattern. In particular, if the shape is composed of several curves and edges, a comparison of their features can be used for registration. However, the limitation of this method is the requirement for the pattern/logo to have special features, which cannot be guaranteed in practice. The algorithm is unable to locate logos that do not have such features.
Existing patents are only distantly related to the current invention. For example, US 2003/0016921 A1 suggests a logo changing system. Other patents related to the application of logo (or other pattern) location in a video sequence are TW 434520, US 6,100,941 and WO 0007367. Methods in adjacent applications were also explored in US 4817166. The aspect of logo modification in a video sequence is covered in WO 0152547.
The advantage of the new method is its capacity to work under real conditions, including high occlusion of the logo, illumination changes and blur. These conditions are very frequent in sport events (see Fig. 2), making the method very useful for this application.
III. DISCLOSURE OF THE INVENTION
The method for the registration of the visible part of the logo is based on dense and massive matching of point tuples. Each point tuple is matched by comparing its properties, described by values that are invariant to the geometric transformation produced by the camera and to the chromatic transformations produced by illumination changes.
The general algorithm of registration is presented in Fig. 3 and comprises several stages explained in the following sections.
First, an example image is used to specify a user-defined zone. Second, as many locally unique individual tuples as possible are found in that zone and their geometric and chromatic features are computed. The information about each tuple, as well as the relative position of each tuple with respect to the user-defined zone, is stored. The learning of locally unique tuples from the example image is described in section III.3. The set of all tuples corresponds to the logo representation.
Third, the obtained representation is used to search for the learned logo in a given video sequence by analysing each individual frame. The stage of computing the visibility of a particular logo in one frame consists of registering the numerous learned tuples corresponding to the example image and verifying their consistency. The stage of registration of an individual tuple is explained in section III.1. The stage of computing the registration of a particular logo from registered tuples is described in section III.4.
The main advantage of the invention is that a top-down approach is used to look for point tuples and that no support region is used to find points. Therefore, high-resolution matching can be performed independently of any occlusion.
1. Registering an individual tuple
Registering a tuple of several points P_i in two images is done by comparing the joint geometric and chromatic properties of those points. Depending on the complexity of the transformations involved in image formation, a varying number of points and properties is required to perform the registration robustly.
The geometric transformation from the plane with the logo in the scene to the sensor of the camera (and so the image) can be modelled by a 2D affine, a 2D projective or a more complex transformation. An additional transformation would be required to account for lens distortion, but in the case of television cameras the distance to the object reduces the distortion effect. If a 2D affine transformation is used to model the camera, four points are necessary to compute two values that are independent of that transformation and thus characterize the tuple independently of it. If a 2D projective transformation is used, five points are already necessary to compute two different geometric invariant values. The known geometric invariant to projective transformation is the pair of cross-ratios of five points defined by:
I_1 = ( |m_431| |m_521| ) / ( |m_421| |m_531| )
I_2 = ( |m_421| |m_532| ) / ( |m_432| |m_521| )     (1)

where the determinants are defined as:

            | x_i  x_j  x_k |
|m_ijk| =   | y_i  y_j  y_k |     (2)
            |  1    1    1  |
where x_i, y_i are the coordinates of the point P_i in the image. These two invariants will be used to search for matching tuples.
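As a concrete check, the two cross-ratio invariants can be sketched as follows. Since the patent's equation is an image placeholder, the index choice below is one standard formulation of the five-point projective invariants and is an assumption, not the patent's exact formula:

```python
import numpy as np

def det3(p, q, r):
    # Determinant of the 3x3 matrix of homogeneous image coordinates
    # [x_p x_q x_r; y_p y_q y_r; 1 1 1].
    return np.linalg.det(np.array([
        [p[0], q[0], r[0]],
        [p[1], q[1], r[1]],
        [1.0, 1.0, 1.0],
    ]))

def projective_invariants(pts):
    # Two cross-ratio invariants of five coplanar points; unchanged by
    # any 2D projective transformation of the image plane.
    p1, p2, p3, p4, p5 = pts
    i1 = (det3(p4, p3, p1) * det3(p5, p2, p1)) / (det3(p4, p2, p1) * det3(p5, p3, p1))
    i2 = (det3(p4, p2, p1) * det3(p5, p3, p2)) / (det3(p4, p3, p2) * det3(p5, p2, p1))
    return i1, i2
```

Applying an arbitrary homography to the five points and recomputing leaves both values unchanged up to floating-point error, which is exactly the property the matching stage relies on.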
Another property of the tuple of points that can be used as a stable characteristic is a value based on the chromatic values of the individual points that is invariant to illumination changes. If the changing illumination (sun, clouds, reflections) is modelled by a scaling transformation in the rgb domain, two points are needed to compute one invariant value. The invariant to this chromatic transformation is a ratio of rgb values. Therefore, for the five points selected above, four invariant values can be computed.
First, we describe the way individual tuples are searched for in the image. Several types of transformations are considered here. Let us assume, first, that no illumination changes are present and that the geometric transformation is a 2D projective transformation. Adaptation to illumination changes is described in section 5. Thus, a five-tuple will be used for the search. A tuple of five points is characterized by two geometric invariants I_1, I_2 and five chromatic values c_1, c_2, c_3, c_4, c_5. The search tries to find a tuple satisfying such values.
The general algorithm for finding a tuple of points in the image that satisfies this constraint is described in Fig. 5. First, the areas of the color c_1 are found in the image, as well as the areas of the other four colors c_2, c_3, c_4, c_5. Then, tuples of points that strictly satisfy the geometric constraint of the projective invariants I_1, I_2 are found. Finally, the tuple that has the minimum difference between the learned chromatic values and the chromatic values of the found points:
min Σ_i ( c_i - c_i^learned )^2     (3)
is retained to obtain the best position of the tuple.
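The pixel-based search of Fig. 5 can be sketched as below. The helper names and the tolerance are illustrative assumptions, and the geometric test on (I_1, I_2) is passed in as a caller-supplied predicate rather than spelled out:

```python
import itertools
import numpy as np

def find_color_areas(img, value, tol=4):
    # Coordinates of pixels whose gray value lies within tol of the
    # searched chromatic value (the "areas of the color c_i").
    ys, xs = np.nonzero(np.abs(img.astype(int) - value) <= tol)
    return list(zip(xs.tolist(), ys.tolist()))

def search_tuple(img, learned_colors, geometric_ok):
    # Enumerate candidate 5-tuples (one point per learned color),
    # keep those passing the projective-invariant predicate, and
    # retain the tuple with the minimum chromatic error of eq. (3).
    candidates = [find_color_areas(img, c) for c in learned_colors]
    best, best_err = None, float("inf")
    for combo in itertools.product(*candidates):
        if not geometric_ok(combo):          # geometric constraint first
            continue
        err = sum(abs(int(img[y, x]) - c)    # chromatic error
                  for (x, y), c in zip(combo, learned_colors))
        if err < best_err:
            best, best_err = combo, err
    return best
```

A real implementation would prune the combinatorial product aggressively (for example by ordering candidates spatially), since exhaustive enumeration grows with the product of the five candidate-set sizes.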
2. Optimal searching with subpixel accuracy
If the learning of point tuples is performed in a pixel-based manner in the original image, the search should be done without relying on the pixel grid of the image being searched. Therefore, sub-pixel computations should be done during the search. This is achieved by selecting, instead of individual pixels, triples of neighbouring pixels whose intensity (or color) values establish an interval containing the value that is searched for, as shown in Fig. 7. The algorithm with subpixel precision is described in Fig. 6.
First, for each of the five points, instead of individual pixels, triangles of pixels (P_t, Q_t, R_t as shown in Fig. 7) are selected whose values define an interval containing the searched value. For example, if the searched graylevel intensity value is 128, a triangle of pixels with values 140, 124 and 127 contains the value 128 somewhere between them and can thus be selected.
Such a selection of triangles is done for each point as shown in Fig. 7. Now the position of each point S_t can vary inside the triangle P_t, Q_t, R_t; its coordinates are expressed in barycentric form with respect to the three points defining the triangle. This form defines each point position with two parameters u_t, v_t, so P_t = [x_t(u_t, v_t), y_t(u_t, v_t)]. Thus four points will be defined by eight parameters. For each of the four points an interpolated intensity value c_t(u_t, v_t) can be computed.
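The triangle selection and the barycentric interpolation can be sketched as follows; the function names are illustrative:

```python
import numpy as np

def brackets_value(tri_vals, target):
    # A triple of neighbouring pixels is selected if the searched value
    # lies between its minimum and maximum intensity (cf. Fig. 7).
    return min(tri_vals) <= target <= max(tri_vals)

def barycentric_point(tri_xy, tri_vals, u, v):
    # Position and interpolated intensity of a point S_t inside the
    # triangle (P_t, Q_t, R_t), parameterised by (u_t, v_t):
    # S = (1 - u - v) * P + u * Q + v * R, same weights for the values.
    w = np.array([1.0 - u - v, u, v])
    xy = w @ np.asarray(tri_xy, dtype=float)
    val = float(w @ np.asarray(tri_vals, dtype=float))
    return xy, val
```

With the example from the text, the triple (140, 124, 127) brackets the searched value 128, while a triple whose minimum exceeds 128 would be rejected.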
Using these point definitions, one can compute the position of the 5th point, whose position will thus be defined by the eight parameters u_1, v_1, u_2, v_2, u_3, v_3, u_4, v_4 and by I_1, I_2.
The geometric constraint, expressed with the geometric invariant values, is satisfied by checking whether the corresponding corners of the triangles P_1, P_2, P_3, P_4, P_5 satisfy the invariant values (within a certain tolerance).
For the fifth point the chromatic value is computed by interpolation, and the final expression for c_5 will depend on the eight parameters: c_5(u_1, v_1, u_2, v_2, u_3, v_3, u_4, v_4). Computing the chromatic values with those expressions c_i(u_i, v_i) and subtracting the learned values, one obtains the error function for the tuple position:
min_{u_i, v_i} Σ_i ( c_i(u_i, v_i) - c_i^learned )^2     (4)
Then, the interpolated values are computed and a closed-form solution is produced. Minimising this expression with respect to the eight parameters gives the position of the five points inside the triangles.
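One term of this minimisation can be sketched with a simple grid search over the barycentric parameters of a single point. This is an illustrative stand-in: in the full algorithm the fifth point is tied to the other four through I_1 and I_2, and the text's closed-form solution replaces the grid:

```python
import numpy as np

def refine_point(interp_fn, learned_val, steps=21):
    # Grid search over the barycentric parameters (u, v) of one point,
    # minimising one squared term of the error function (4).
    # interp_fn(u, v) returns the interpolated intensity at (u, v).
    grid = np.linspace(0.0, 1.0, steps)
    best_uv, best_err = (0.0, 0.0), float("inf")
    for u in grid:
        for v in grid:
            if u + v > 1.0:          # stay inside the triangle
                continue
            err = (interp_fn(u, v) - learned_val) ** 2
            if err < best_err:
                best_uv, best_err = (u, v), err
    return best_uv, best_err
```

For the triangle with corner values (140, 124, 127) from the example above, searching for the value 128 drives the error essentially to zero at some interior position.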
3. Learning the representative tuples
The algorithm for learning a suitable representation of a given pattern is outlined in Fig. 4 and illustrated in Fig. 1.
First, an area, the Learning zone (10), that contains an unoccluded pattern, the Logo (6), is specified by the user as in Fig. 1. Then, this area is searched for tuples of points that are representative of this whole area. In other words, tuples need to be found whose properties (geometric and chromatic invariant values) are sufficiently different from the properties of all other point tuples in the learning zone. Tuples satisfying such unicity conditions will not have similar tuples in the whole learning area and can thus be easily found at the search stage without mismatches.
The algorithm for the search for such unique tuples is described in Fig. 4. Fig. 1 illustrates the operation of representative tuple search. First, the Candidate tuple (14) is selected so that all points have different color or intensity values, to increase discrimination between points.
Then, a fixed neighbourhood, the Search zone (9), is defined around each of the points of the tuple in the Image frame (8). For each tuple of neighbouring points from those search zones, the Potential neighbour tuple (13), its geometric and chromatic invariants are computed and their values are compared with the invariants of the Candidate tuple (14) that is being tested for unicity. In practice, the Search zone (9) is taken to be the whole Learning zone (10) to obtain full unicity. Also, instead of comparing the values of each Candidate tuple (14) with all Potential neighbour tuples (13) in the Search zone (9), one needs only to compute invariant values for all possible tuples in the Learning zone (10) and select those that do not have similar ones.
The similarity between the Candidate tuple (14) and the Potential neighbour tuple (13) is measured according to several criteria. First, the difference of the absolute intensity values of corresponding points should satisfy:
| c_i^candidate - c_i^neighbor | < T_c     (5)
Second, the difference between the geometric invariant values should be below a certain threshold:

| I_1^candidate - I_1^neighbor | < T_g
| I_2^candidate - I_2^neighbor | < T_g     (6)
Finally, the difference between the chromatic values for the whole tuple, both for the values of individual points and for the combined chromatic values, should be below a threshold:

Σ_i ( c_i(u_i, v_i) - c_i^learned(u_i, v_i) ) < Δ     (7)
Tuples that satisfy this unicity criterion are stored as part of the logo representation. The use of their pixels in other tuples is then reduced (they will, however, be used in neighbour tuples for comparison). Because the redundancy of the representation is important, each pixel participates in a fixed number of tuples (for example, ten). The use of one pixel in too many tuples is avoided to prevent extreme dependence on this point.
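The unicity test of criteria (5) and (6) can be sketched as follows; the tuple encoding and the thresholds are illustrative assumptions:

```python
def is_unique(candidate, neighbours, t_c=8.0, t_g=0.02):
    # A candidate tuple is kept only if no neighbouring tuple is close
    # to it in both absolute intensity values and geometric invariant
    # values. Tuples are encoded as (colors, (I1, I2)).
    cand_colors, (i1_c, i2_c) = candidate
    for nb_colors, (i1_n, i2_n) in neighbours:
        colors_close = all(abs(a - b) < t_c
                           for a, b in zip(cand_colors, nb_colors))
        invariants_close = (abs(i1_c - i1_n) < t_g
                            and abs(i2_c - i2_n) < t_g)
        if colors_close and invariants_close:
            return False   # a similar tuple exists in the search zone
    return True
```

A neighbour must be similar in both respects to disqualify the candidate: a tuple with matching colors but clearly different invariants still counts as unique.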
4. Massive matching with many tuples
The learning process described above will gradually use all the pixels in the Learning zone (10) (cf. Fig. 1). Since each pixel is used several times in one or more Learned tuples (11), this area will be covered with several "layers" of tuples in a dense manner. With more tuples using a pixel, the whole representation gains in reliability. At the end, the whole representation corresponds to a large number of tuples characterised by their absolute intensity values, their geometric invariant values and their relative chromatic invariants. When such a dense representation is constructed, it can be used for locating the logo in a new image.
The operation of locating a logo in a new image corresponds to the search for the learned tuples in this image. The whole image is analysed for the presence of the learned tuples one by one. The density of the representation with tuples makes it possible to deal with almost arbitrary occlusion, since every point is virtually related to (at least) four other points in various parts of the image. However the Logo (6) is occluded by an Occluding object (7), as shown in Fig. 2, there will always be visible points that are not hidden by this object and that are related by tuples.
Once all tuples (and thus all points) that belong to the logo have been identified, one can proceed with their processing. First, the position of the logo frame is estimated. Every learned tuple contains information about its position relative to the Learning zone (10). This information (which is in fact a transformation) can be inverted, and the position of the Learning zone (10) (or reference frame) obtained from the position of the tuple. There will be false matches between tuples; therefore, taking the frame position confirmed by the majority of found tuples allows the right frame to be found. This frame is also used to compute the position of the logo on the screen as well as its size relative to the screen.
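The majority-vote confirmation of the frame position can be sketched as below. Quantising the proposed origins is an assumption of this sketch, used so that nearby proposals pool their votes while outliers from false matches are outvoted:

```python
from collections import Counter

def vote_frame_position(proposals, quantum=5):
    # Each matched tuple proposes a frame origin (x, y) obtained by
    # inverting its stored transformation. The quantised position with
    # the most votes is taken as the frame position.
    buckets = Counter((round(x / quantum), round(y / quantum))
                      for x, y in proposals)
    (bx, by), _ = buckets.most_common(1)[0]
    return bx * quantum, by * quantum
```

Eight consistent proposals near (100, 50) outvote two spurious ones, so the frame is recovered despite false matches.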
Once the reference frame is constructed, one can compute the visible part. This is estimated as the ratio of the number of pixels inside the frame covered by found tuples to the total number of pixels in the frame. This is done by computing the ratio of the number of visually found points to the total number of points associated with the particular logo. A different measure can be applied, since not all points are visible all the time, and a measure relative to pixels can be applied instead.
When the logo position and its visible part have been identified, several postprocessing operations can be invoked. First, the points inside the logo frame that were not identified as belonging to the logo (and thus belong to the occluding object) can be restored to the usual values of the logo. This operation would remove the occluding object and restore the full visibility of the logo.
Another operation that can be performed once all the point tuples have been identified is the modification of the visible part of the logo in order to improve its visual quality or provide a neat image of the logo. If the number of identified points is sufficient, many other points can be added to improve the resolution of the logo when it is viewed as a remote object.
Finally, the last operation that can be performed is the replacement of the logo by visual information that is perceptually different from the logo's appearance. This operation is useful for hiding the logo if its visibility in the video sequence is not desired.
5. Searching with illumination changes
When illumination changes occur, the chromatic values in the observed image are transformed. This transformation can be modelled with several approximating transformations, for example scaling of each of the RGB channels, a linear transformation in RGB space, etc. The scaling transformation corresponds to the scaling of every chromatic channel by an independent factor:

r' = s_r · r,   g' = s_g · g,   b' = s_b · b     (8)
Therefore, the absolute chromatic values of the analysed image cannot be used directly. The only reliable information that remains is the chromatic invariants, which are computed from the color values of the points in the tuple and are independent of such a transformation.
Chromatic invariants for this diagonal transform require the chromatic values of two points for their computation:

I_1 = r_1 / r_2,   I_2 = g_1 / g_2,   I_3 = b_1 / b_2     (9)

Therefore, we compute these invariant values for every pair of pixels in the tuple. These values then play the role of an additional constraint for the selection of tuples, making the selection independent of illumination changes: all pairs that have the same invariant value are retrieved. In the search image, these values can be computed only by combining several N-tuples. Even if the number of such combinations is high, we are obliged to compute them since no other information is available.
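A minimal sketch of these ratio invariants, assuming the diagonal scaling model of equation (8); the pixel values and gains are illustrative:

```python
def chromatic_invariants(p1, p2):
    # Equation (9): per-channel ratios between two points of a tuple.
    # Under r' = s_r*r, g' = s_g*g, b' = s_b*b the gains cancel, so
    # these values survive the illumination change.
    (r1, g1, b1), (r2, g2, b2) = p1, p2
    return (r1 / r2, g1 / g2, b1 / b2)

def scale_illumination(pixel, gains):
    # Diagonal illumination change of equation (8).
    return tuple(s * c for s, c in zip(gains, pixel))

p1, p2 = (120.0, 60.0, 30.0), (60.0, 30.0, 90.0)
gains = (0.5, 1.2, 0.8)
print(chromatic_invariants(p1, p2))
print(chromatic_invariants(scale_illumination(p1, gains),
                           scale_illumination(p2, gains)))  # same ratios
```

Because the per-channel gains divide out, two points of a tuple index the same table entry before and after the lighting change.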
Since the intensity values are discrete, a table can be constructed to perform this search. This step is an optional step in the search algorithm that is outlined in Fig. 5 and Fig. 6.
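Such a table can be sketched as a hash map keyed by the quantised invariant value; the tuple identifiers, quantisation step and values below are illustrative:

```python
def build_invariant_index(learned_tuples, quant=0.05):
    # Hash learned tuples by their quantised invariant value so that a
    # candidate invariant computed in a new image is looked up in O(1).
    index = {}
    for tuple_id, invariant in learned_tuples:
        key = round(invariant / quant)
        index.setdefault(key, []).append(tuple_id)
    return index

def lookup(index, invariant, quant=0.05):
    # Return the learned tuples whose invariant falls in the same bin.
    return index.get(round(invariant / quant), [])

learned = [("t1", 1.32), ("t2", 0.80), ("t3", 1.31)]
idx = build_invariant_index(learned)
print(lookup(idx, 1.30))  # → ['t1', 't3']
```

Quantisation absorbs the small deviations introduced by noise and discretised intensities, at the cost of occasionally merging nearby invariant values into one bin.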
IV. BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 An example image of a logo and tuple being learned
Fig. 2 Example of logo occlusion.
Fig. 3 General algorithm for estimating visibility.
Fig. 4 Algorithm for finding the candidate five-point tuple.
Fig. 5 Algorithm for finding learned 5-tuple in the new image (pixel-based algorithm).
Fig. 6 Algorithm for finding learned 5-tuple in the new image (sub-pixel algorithm).
Fig. 7 Subpixel search
V. DESCRIPTION OF THE PREFERRED EMBODIMENT
This section describes the implementation of the described invention and how it can be realized in practice.
The algorithms for learning tuples and searching for them can be implemented as image processing software modules. This software can run on a DSP within an embedded system or on a standard computer connected to a camera.

Claims

VI. WE CLAIM:
1. A method for automatic computation of the presence, position and visibility of 2D surface patterns (such as logos) in a video sequence based on matching sets of point tuples, related by geometric and illumination invariants, learned from an example image and registered in every frame of the video sequence.
2. A method as in claim 1 where the point tuples correspond to groups of five or more points linked by 2D projective geometric invariants, groups of four or more points linked by 2D affine geometric invariants, or sets of more than five points linked by invariants of transformations of higher order, including lens distortion.
3. A method as in claim 1 where the tuples of four, five or more points are characterised with chromatic invariants that are independent of illumination changes approximated by scaling or linear transformation in RGB space.
4. A method as in claim 1 where the algorithm of learning representative tuples of the pattern and their relationship from an example image of the pattern consists of several steps:
- take an image from the video sequence that contains the logo and select the reference frame (or rectangle) containing the fully visible pattern
- find point tuples that have geometric and chromatic invariant properties that are unique with respect to the whole pattern inside the reference frame
- store the properties of each tuple in an indexing table for fast access
- store relative coordinates of each tuple found with respect to the reference rectangle of the pattern.
5. A method as in claim 1 where the registration of tuples is done in a top-down manner without using any type of surrounding support region of each individual point.
6. A method as in claim 1 where every point of the learned pattern is used in several representative tuples that include points from various parts of the pattern thus making the representation of the pattern dense, distributed and redundant.
7. A method as in claim 1 where the algorithm of registering a learned pattern in a given image frame of a video sequence consists of the following steps:
- the search for all possible tuples from the learned pattern
- computation of the area of the visible part of the logo by evaluating the area covered by found tuples with respect to the whole pattern area
- estimation of the position, orientation and visual quality of the pattern
8. A method as in claim 1 where the geometric invariants of tuples are attributed a canonical value corresponding to the frontal view of the pattern in order to rectify the observed image to the frontal view of the pattern or to recover the orientation of the camera with respect to the plane that bears this pattern.
9. A method as in claim 1 where the pattern registered with numerous tuples can be completed with tuples from its model that were not registered in order to achieve full visibility of the pattern with increased quality; the registered tuples can be replaced with information that is perceptually different from the original logo in order to make the pattern unrecognizable by humans.
10. A method as in claim 1 where sets of point tuples characterising the logo as well as geometric and illumination invariants characterising the tuples can be used to estimate the similarity between different logos for similarity search in logo databases.
PCT/CH2004/000182 2003-03-27 2004-03-24 Method for estimating logo visibility and exposure in video WO2004086751A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP20040722778 EP1611738A2 (en) 2003-03-27 2004-03-24 Method for estimating logo visibility and exposure in video

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CHPCT03/00199 2003-03-27
CH0300199 2003-03-27

Publications (2)

Publication Number Publication Date
WO2004086751A2 true WO2004086751A2 (en) 2004-10-07
WO2004086751A3 WO2004086751A3 (en) 2005-02-03

Family

ID=33035093

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CH2004/000182 WO2004086751A2 (en) 2003-03-27 2004-03-24 Method for estimating logo visibility and exposure in video

Country Status (2)

Country Link
EP (1) EP1611738A2 (en)
WO (1) WO2004086751A2 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4817166A (en) 1986-05-05 1989-03-28 Perceptics Corporation Apparatus for reading a license plate
TW434520B (en) 1998-06-30 2001-05-16 Sony Corp Two-dimensional code recognition processing method, device therefor and medium
WO2000007367A2 (en) 1998-07-28 2000-02-10 Koninklijke Philips Electronics N.V. Apparatus and method for locating a commercial disposed within a video data stream
US6100941A (en) 1998-07-28 2000-08-08 U.S. Philips Corporation Apparatus and method for locating a commercial disposed within a video data stream
WO2001052547A1 (en) 2000-01-14 2001-07-19 Koninklijke Philips Electronics N.V. Simplified logo insertion in encoded signal
US20030016921A1 (en) 2001-07-23 2003-01-23 Seung-Kyu Paek Video reproducing/recording system to change a logo image/sound, and method of changing the logo image/sound

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DOERMAN D. S. ET AL.: "Logo recognition using geometric invariants", 2ND INT. CONF. ON DOCUMENT ANALYSIS AND RECOGNITION, October 1993 (1993-10-01), pages 894 - 897
G. MEDIONI; G.GUY; H. ROM; A. FRANÇOIS: "Real-Time Billboard Substitution in a Video Stream", PROC. 10TH TYRRHENIAN INTERNAT. WORKSHOP ON DIGITAL COMMUNICATION, 1998, pages 71 - 84

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7783130B2 (en) 2005-01-24 2010-08-24 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration Spatial standard observer
US8139892B2 (en) 2005-01-24 2012-03-20 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration (Nasa) Spatial standard observer
WO2008109608A1 (en) * 2007-03-05 2008-09-12 Sportvision, Inc. Automatic measurement of advertising effectiveness
CN107153809A (en) * 2016-03-04 2017-09-12 无锡天脉聚源传媒科技有限公司 A kind of method and device for confirming TV station's icon
CN107153809B (en) * 2016-03-04 2020-10-09 无锡天脉聚源传媒科技有限公司 Method and device for confirming television station icon
CN106792153A (en) * 2016-12-01 2017-05-31 腾讯科技(深圳)有限公司 A kind of video labeling processing method and processing device
CN106792153B (en) * 2016-12-01 2020-07-28 腾讯科技(深圳)有限公司 Video identification processing method and device and computer readable storage medium
CN113469216A (en) * 2021-05-31 2021-10-01 浙江中烟工业有限责任公司 Retail terminal poster identification and integrity judgment method, system and storage medium
CN113469216B (en) * 2021-05-31 2024-02-23 浙江中烟工业有限责任公司 Retail terminal poster identification and integrity judgment method, system and storage medium

Also Published As

Publication number Publication date
EP1611738A2 (en) 2006-01-04
WO2004086751A3 (en) 2005-02-03


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2004722778

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 2004722778

Country of ref document: EP