WO1999052063A1 - Feature motivated tracking and processing - Google Patents

Feature motivated tracking and processing

Info

Publication number
WO1999052063A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
tracking
entity
border
information
Application number
PCT/IL1998/000383
Other languages
French (fr)
Inventor
Ehud Spiegel
Semon Kogan
Genady Lesov
Inna Stainvas
Boris Epshtein
Original Assignee
Automedia Ltd.
Application filed by Automedia Ltd.
Publication of WO1999052063A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/20 - Image preprocessing
    • G06V10/24 - Aligning, centring, orientation detection or correction of the image
    • G06V10/245 - Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning

Definitions

  • the present invention relates to the processing of video streams, and in particular to mixing and performing special visual effects on the stream.
  • Video post-production is a field in which existing video streams are processed, to generate visual effects that are either not possible or very expensive to create in real life.
  • Two main types of visual effects are "special effects", in which the content of the video stream is modified, and "compositing", in which additional information, possibly from one or more video streams and/or a computer generated animation object, is combined with the first video stream.
  • a main tool for video post-production is the image mask, which is generally the size of the video image and which defines how processing is to be applied to each individual video image.
  • a mask having non-zero values in an area the size and shape of an object to be inserted, may be used to define parameters for inserting the object into a video frame. Each non-zero value corresponds to a relative weighting of a pixel from the object to be inserted and of a pixel from the video frame.
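  • As a rough illustration of how such a weighting mask could be applied, a minimal sketch in Python with NumPy follows (the function name and array layout are assumptions, not taken from this publication):

```python
import numpy as np

def composite(frame, obj, mask):
    """Blend an object into a video frame using a weighting mask.

    frame, obj : float arrays of shape (H, W, 3) with values in [0, 1]
    mask       : float array of shape (H, W); each non-zero value is the
                 relative weight given to the object pixel at that location.
    """
    alpha = mask[..., np.newaxis]              # broadcast the mask over the color channels
    return alpha * obj + (1.0 - alpha) * frame
```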
  • a video image is segmented into objects and/or into layers, each of which corresponds to a distance. Thereafter, the video is processed to generate the desired effects.
  • video post-processing refers to processing of video images for post-production purposes. Some special effects are applied at borders of objects. For example, since an object in motion will have blurred edges, one post-processing technique adds blurring to the border of a composited object mask. In other cases, for example semi-transparent objects, the effects may be applied over the entire object. In another example, noise may be added over an entire object.
  • video comprises a stream of consecutive images, so a mask should preferably be defined for each image.
  • in key-framing, a mask is manually defined for a small number of "key-frames" and the mask is interpolated for video frames between the key-frames.
  • the border of the object or the entire object is tracked between consecutive images, after the border is manually entered once.
  • PCT publication WO/97/06631 of February 20, 1997 and U.S. Patent application No. 08/692,297, titled “Apparatus and Method for Object Tracking", both by inventors E. Spiegel and Y. Pastor, the disclosures of which are incorporated herein by reference, describe a method of identifying and tracking "special" points on a border of an object. The location of the border itself may then be determined from the relative location of the special points, by matching an approximation of the border to the underlying image or by extrapolation for portions of the border which are invisible.
  • Four point tracking is an image annotation system in which four points on the image, which include prominent features, are tracked and an object is inserted into the picture so that it is attached to the points.
  • An example of four point tracking is found in the Adobe® After- Effects® Pro software, version 3.1, available from Adobe Systems Inc., 411 First Avenue South, Seattle, WA 98104-2871, USA.
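  • The underlying idea can be sketched as follows (this is not the After Effects implementation; the helper name and point ordering are assumptions): a homography computed from the four tracked points warps the inserted object onto them.

```python
import cv2
import numpy as np

def attach_object(frame, obj, tracked_pts):
    """Warp `obj` so its corners land on four tracked points and blend it into `frame`.

    tracked_pts : (4, 2) array of point locations in `frame`, ordered
                  top-left, top-right, bottom-right, bottom-left.
    """
    h, w = obj.shape[:2]
    corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    H = cv2.getPerspectiveTransform(corners, np.float32(tracked_pts))
    size = (frame.shape[1], frame.shape[0])
    warped = cv2.warpPerspective(obj, H, size)
    alpha = cv2.warpPerspective(np.ones((h, w), np.float32), H, size)[..., None]
    return (alpha * warped + (1.0 - alpha) * frame).astype(frame.dtype)
```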
  • the portions comprise lines, preferably user-defined sections of borders.
  • the portions comprise user-defined areas of objects.
  • the term "video sequence" is used to describe a sequence of images that may be displayed as cine; however, there is no requirement that the images be from a video source.
  • the video source may be a camera. Alternatively, it may be a synthetic source.
  • Another object of some preferred embodiments of the invention is to provide a method of video scene rescripting, in which various video entities are identified and allowed to interact with each other.
  • each video entity or type of entity has a behavior associated therewith. When the video sequence is played, these behaviors cause the entities to interact, preferably adding effects to the scenes and/or changing the content of the scene. Examples of changing the scene include, but are not limited to, adding, removing and changing the speed and direction of objects, removing artifacts such as wires, adding visual effects such as halos and annotations, and changing the way objects interact in the scene, such as making an elastic collision inelastic.
  • the term "scene rescripting" the term "scene rescripting"
  • a new scenario may be provided for the video sequence, which scenario defines actions to be "performed" by video entities in particular frames or groups of frames.
  • the video entities may interact with image portions from the current and/or previous or later frames.
  • the entities may interact with, or their behavior may be affected by, externally provided data or inputs.
  • video entities may interact with virtual entities that are not part of, and/or are not visible in, the video sequence.
  • the entities may export their external state and/or interactions with other entities in order to drive external programs.
  • the video entities are related to image content.
  • a video entity may correspond to groups of objects.
  • a video entity may correspond to a portion of an object, such as a hand or a finger.
  • a video entity may be arbitrary, preferably defined to help generate a special effect.
  • an entity may be defined to be an area to which an object is to be attached.
  • an entity may include a border section of an object, as many interactions occur at the borders of objects.
  • a video entity may have information associated therewith, including a three-dimensional representation of a real world object which the entity corresponds to.
  • a video entity may be created dynamically as a result of the interaction of two or more video entities.
  • entity interactions may include interactions between entities in different frames.
  • all or most of the interactions will be automatic.
  • at least some of the interactions may be manually directed.
  • manual direction includes identification of entities and entering parameters that modify the interactions.
  • a user may be prompted, during the interactions, to make certain decisions, for example, in unanticipated situations or where an especially high quality is desired.
  • the post-processing is related to insertion of the object into a video stream.
  • the post-processing may be embodied in a mask.
  • the post-processing may be applied differently on the two sides of the border.
  • Such information may include one or more of: identification, motion vector, type of effect(s) to perform, parameters of the effects to perform, width (which may be multi-pixel, sub-pixel or even an infinitesimally thin line), type of color, graphical elements expected at and/or near the border, direction of inside and outside, type of tracking to perform, parameters relating to tracking, type of estimation to perform when tracking is lost, type of key-frame interpolation to perform and parameters relating to the interpolation, type of border, the shape of the border, type of interpolation to perform for regenerating the border from what is actually tracked between frames, type of animation to apply per border, allowed distortion, allowed smoothing and a location in the current, a previous or later frame from which to copy information into the current frame, for example for wire removal. Additionally or alternatively, the information may include ranges of valid values.
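  • The kind of per-section information listed above might be held, for example, in a simple record structure; the following is a minimal sketch whose field names are assumptions, not terms from this publication:

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

@dataclass
class SectionInfo:
    """Hypothetical container for information associated with a border section."""
    identification: str
    motion_vector: Tuple[float, float] = (0.0, 0.0)
    effect: str = "blur"                       # type of effect to perform
    effect_params: Dict[str, float] = field(default_factory=dict)
    width: float = 1.0                         # may be multi-pixel or sub-pixel
    tracking_type: str = "correlation"         # e.g. "spline", "color-key", "none"
    lost_tracking_policy: str = "extrapolate"  # estimation to use when tracking is lost
    key_frame_interpolation: str = "linear"
    allowed_distortion: float = 0.1
```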
  • the information associated with video entities is dynamic information, which may change between frames.
  • dynamic information is a function of global parameters, which are relevant to the entire frame and/or sequence portion.
  • the information may be a function of other information (dynamic or static).
  • the information may be a function of the activity of individual video entities.
  • the information may be dependent on the frame number and/or a frame count to a key-frame (previous, future, minor, major or any combination).
  • the information may be dependent on a history of the video stream.
  • the information may be dependent on the image content at a portion of the image.
  • an image portion is defined using frame coordinates.
  • it may be defined relative to video entity coordinates, especially if the video entity is tracked between images.
  • the coordinate system used may be based on outside-supplied coordinates, image segmentation, video layers, distance information, coordinates of the scene and/or objects therein, preferably based on a model of the scene, camera coordinates and/or any combination thereof.
  • the information may be dependent on parameters of the photography used for generating the scene, such as pan speed and zoom angle.
  • such information is stored with the video sequence.
  • it may be determined by analyzing the sequence.
  • the information may be dependent on information associated with other video entities.
  • the content may be of portions of the current and/or in previous or later frames.
  • the coordinates of an image portion are taken from a current frame and applied to a previous and/or later frame, for example for an effect where an object dissolves.
  • the image content is from before post-processing is applied to the frame.
  • the information may depend on the content of an image portion after it has been partially or completely post processed and/or on a comparison between a post processed and non-post processed portion.
  • the information may be dependent on externally provided input and/or on information associated with the same or other video entities, such as user defined sections. It should be appreciated that combinations and/or dependencies of the above information affectors are also considered to be in the scope of some embodiments of the present invention.
  • examples of dynamic information include information dependent on the relative location of two objects, an increasing amount of blurring at a section the longer its exact position cannot be ascertained by tracking, and changes in the image content at a location relative to one border section affecting the dynamic information associated with a different section or with an entire border.
  • such information may be dependent on the content of the image, for example, a different amount of blurring for each color, or for different intensities.
  • the type of interpolation performed for key-framing may be dependent on whether the border section is represented as a spline or as a raster (a free-hand drawing in which each pixel is individually positioned).
  • a border may be tracked and/or adjusted to be aligned with an object border, by defining a range of color values to which all the pixels, either inside the object or on its background, belong.
  • the range of colors may be dynamically determined, for example, based on the average color of the background.
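  • A minimal sketch of such a dynamically determined color range, assuming the background color is characterized by the mean and spread of a sampled patch (the thresholding policy is illustrative only):

```python
import numpy as np

def background_mask(frame, background_patch, tolerance=3.0):
    """Mark pixels whose color falls inside a range derived from a sampled
    background patch (mean +/- tolerance * standard deviation per channel)."""
    samples = background_patch.reshape(-1, 3).astype(float)
    lo = samples.mean(axis=0) - tolerance * samples.std(axis=0)
    hi = samples.mean(axis=0) + tolerance * samples.std(axis=0)
    return np.all((frame >= lo) & (frame <= hi), axis=-1)
```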
  • dynamic information is evaluated using a function.
  • the function is one of a vendor-supplied library of functions, whose parameters may be modified by a user.
  • a user may provide a user defined function.
  • the information may be provided as a table.
  • the information may be provided as a script.
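  • For example, a user-supplied function for one piece of dynamic information might look like the following sketch (attribute and argument names are assumptions), echoing the example above of blurring that grows while a section's exact position cannot be confirmed:

```python
def blur_width(section, frame_index, last_confirmed_frame, speed):
    """Hypothetical dynamic-information function: the blur width grows with the
    section's speed and with the number of frames since its position was last
    confirmed by tracking."""
    frames_since_confirmed = frame_index - last_confirmed_frame
    return section.base_width + 0.5 * speed + 0.1 * frames_since_confirmed
```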
  • an order may be defined between the evaluation of the information and/or between the effecting of the postprocessing dictated by the information.
  • the results of applying the method in the two orders are not necessarily the same, as the information evaluation depends on what is happening in the image, while the post-processing is geared towards generating an image which better fulfills a user's desire.
  • a user may control, at runtime of the post-processing, an evaluation of the dynamic information and/or the postprocessing, by manipulating various parameters.
  • Such parameters may be manipulated before post-processing, during post processing and/or by stopping the post processing and then restarting or continuing it with new parameters.
  • the parameters may be entered by a user using standard input devices, such as a mouse, keyboard, graphical user interface, touch screen and voice input. Additionally or alternatively, values for such parameters may be entered using dedicated input devices, for example, a motion sensor.
  • such parameters are provided from an external source, such as a computer generating graphics to be incorporated into the image, a second video stream and/or additional video and/or computer equipment. These sources may also be used to provide the dynamic information itself.
  • Another aspect of the present invention relates to tracking arbitrary user defined border sections.
  • Such arbitrary user defined border sections may contain no distinctive features. In some cases, even correlation-based tracking of such sections will not be possible.
  • computer selected border points and border segments may also be user-defined as sections.
  • the arbitrary user-defined border sections comprise a higher organizational level, as they are preferably defined relative to such segments and points.
  • a user defined section may start or end at any point along a border, even one which is not tracked directly by the computer.
  • border sections may overlap.
  • a border section may include more than one segment.
  • border segment indicates a segment of a border which is directly tracked by a computer, preferably by tracking end points of the segments but alternatively or additionally by correlation tracking or by using other methods.
  • border section indicates a user defined section of the border, which has a logical meaning to the user.
  • the definition of the user defined section may be completely divorced from any technical considerations relating to tracking. Rather, it is preferably related only to image content and/or the desired post-processing.
  • the "user defined sections" may be automatically suggested by a computer based on a model of the object and/or based on image characteristics and/or based on an imprecise indication by the user.
  • a step of image segmentation and/or object detection may be performed automatically to help a user decide on user-defined sections.
  • a user can indicate an object, its border or a portion thereof to be a video entity.
  • the system tracks the effect of various parameter values on the operation of various aspects of the video post-processing and the extent to which these aspects are successful.
  • This information may be associated with a particular video entity, a particular frame, a group of frames, a scene, a type of video sequence, a type of scene and/or a type of entity. Additionally or alternatively, this information may be associated with various parameters and/or evaluated criteria of the entity, frame, scene or sequence, for example a frame's average background level or a section's smoothness.
  • One example of information which is preferably learned is what types of prominent points can be effectively tracked, for a particular type of border.
  • when the computer selects points to use as anchors (one or more) for an arbitrary tracked section, when segments are selected for tracking and/or when a user selects anchor points, it is possible to select points of a type which is expected to have a higher probability of success.
  • the indication is made on more than one frame. These frames may be consecutive or, in some preferred embodiments of the invention, there may be gaps between the frames.
  • the border as indicated in one frame and/or sections thereof and/or prominent points thereon may be directly copied to the next frame, for adjustment by the system and/or by the user.
  • the user instructs the system to "track" the border from one frame to the next, optionally after adjustment by the user. Tracking by the system may also be used by the user to assess whether the indicated border will be properly tracked. In a preferred embodiment of the invention, the tracking between non-consecutive frames is used as an aid in entering key-frame information.
  • the user when the user indicates a segment or a section, he associates information with it.
  • the information may be entered by inputting data into a structured data file, using a graphical user interface, interpolation of values using a spatial distribution function, interpolation of values between key-frames, using a script and/or combinations of these methods.
  • the information and any other properties may also be entered using scripts associated with this or other video entities.
  • Another aspect of some preferred embodiments of the present invention relates to object overlapping.
  • an overlap of user defined sections is detected.
  • the overlapped area may be treated as a special type of section having special properties which may be a combination of the two sections' properties.
  • a property of the sections may be which section's properties override the other section's properties.
  • the post-processing system defines two borders, one for each object, so that when the amount of overlap changes, each object may be properly post-processed.
  • the system automatically estimates the extent and location of a previously overlapped or otherwise hidden edge.
  • an object may be split into two and/or a common border may be defined between the objects.
  • the information associated with one or more of the sections includes whether the section is hidden, in contact with another video entity, partly occluded and/or occluding another entity, and/or combinations of these conditions. Additionally or alternatively, such information preferably includes the conditions under which a new section is created and/or its default properties.
  • various scripts are defined for interactions between types of objects, for example between elastic and inelastic objects or object borders or for interaction between animation objects and video objects. It should be appreciated that, in many cases, a video entity will have associated with it a zone of influence.
  • the zone will surround the entity.
  • the zone may be on only one side of the entity, may overlap the entity and/or may be located at a distance from the entity.
  • the size, shape, location and behavior of the zone are associated with each entity and constitute dynamic information.
  • more than one zone of influence may be associated simultaneously with each entity.
  • different scripts are defined for each such zone.
  • a spline is useful for approximating a curve; however, it may be difficult and/or time-consuming for a user to draw an appropriate spline.
  • a spline is stored as a set of spline parameters, end points and, optionally, additional control points.
  • a spline can be drawn using the stored parameters, after they are matched to the current frame conditions.
  • a user draws a border in free-hand style and/or using graphical tools.
  • at least a portion of the border may be drawn by color keying, i.e., as a border between the keyed area and the non-keyed area.
  • a user draws a border using a combination of such drawing methods.
  • the system then creates a spline that approximates what was drawn or a portion thereof.
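  • As a sketch of this step, a smoothing spline can be fitted to the drawn pixels, for example with SciPy (the smoothing value and sample count are illustrative defaults, not values from this publication):

```python
import numpy as np
from scipy.interpolate import splprep, splev

def fit_border_spline(drawn_points, smoothing=5.0, n_samples=200):
    """Approximate a free-hand drawn border (a sequence of (x, y) pixel
    coordinates) with a smoothing spline and resample points along it."""
    pts = np.asarray(drawn_points, dtype=float)
    tck, _ = splprep([pts[:, 0], pts[:, 1]], s=smoothing)
    u = np.linspace(0.0, 1.0, n_samples)
    x, y = splev(u, tck)
    return np.column_stack([x, y]), tck
```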
  • a user does not accurately follow the border being marked. In some cases this is intentional, in others it may be unintentional.
  • the user may indicate to the system which points along the drawn border are exact and which are only approximate. Alternatively or additionally to indicating points, the user may indicate graphic primitives and/or border portions which should be adjusted.
  • the system prompts the user to provide such an indication for points that are either far from any feature (for example, a high contrast area) or very close to such a feature.
  • the user is prompted for an indication only regarding points that are used to define the graphic primitives used to enter the border.
  • the user is prompted for such an indication regarding prominent points.
  • any point which is a junction between more than two lines and/or three or more colors and/or a point at which there is a large angle between meeting lines and/or colors is considered to be a possible candidate for correction and a user may be prompted for an indication by the computer.
  • points at which the underlying border of the object deviates by a substantial amount from an estimated border are considered to be candidates.
  • the system may suggest one or more possible features which the point is to be anchored to.
  • the system attempts to match the drawn border to an automatically detected border of the object, to determine such features and/or required corrections. Preferably, this determination is performed by tracking the drawn border to an underlying object border in the current frame.
  • a user can indicate a position offset from such a feature to the point. This offset may be explicitly entered or may be implicitly entered based on what was drawn.
  • the entire graphic primitive associated with the point is moved.
  • more than one pixel may be grouped by the system and/or by a user, to be treated as a single graphic primitive.
  • user defined sections and/or other video entities are entered using similar techniques.
  • the user indicates which points correspond to ends and control points of user defined sections.
  • Another aspect of some preferred embodiments of the present invention relates to automatically suggesting and/or generating tracking parameters for a video entity, such as a particular border portion.
  • user entered parameters or tracking methods may be tested and the user may be informed of the quality of, and/or problems with, his parameters.
  • the system suggests a type of tracking (spline, correlation, color-key) based on geometrical and/or color characteristics of the drawn or actual border and/or its vicinity.
  • spline tracking may be suggested for a smooth area.
  • Correlation tracking may be suggested for a unique looking area.
  • Color-key tracking may be suggested for a highly delicate area, such as hair, which has a distinct color.
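  • A very rough heuristic along these lines might look as follows (an assumption for illustration, not the method of this publication; the thresholds are arbitrary):

```python
import numpy as np

def suggest_tracking(border_patch):
    """Suggest a tracking type from a color patch sampled around the drawn border."""
    gray = border_patch.astype(float).mean(axis=-1)   # collapse color channels
    gy, gx = np.gradient(gray)
    edge_strength = np.hypot(gx, gy).mean()
    texture = gray.std()
    if edge_strength < 2.0:      # smooth, featureless area
        return "spline"
    if texture > 40.0:           # unique-looking, high-texture area
        return "correlation"
    return "color-key"           # delicate detail (e.g. hair) with a distinct color
```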
  • the type of tracking is determined based on the behavior of the border portion over a sequence of frames.
  • the behavior of the border may be analyzed over the entire length of the sequence, possibly skipping some frames. Frame skipping is especially useful when entering data for key-frames.
  • changes in the border are analyzed to determine which type of tracking and/or which parameter values would be most useful.
  • the tracking method and/or parameters are determined by trial-tracking the section with various possible tracking schemes.
  • the type of interpolation to use for drawing a border segment based on tracking of its end points and/or control points may also be determined, based on the changes in the form of the tracked border.
  • the type of tracking which is performed for invisible segments is also automatically evaluated and/or suggested.
  • such analysis is based on the smoothness of motion of a previously hidden border when it becomes visible.
  • other types of user entered information may be evaluated, for example, static information, dynamic information or scripts.
  • Such evaluation is preferably performed automatically, without user intervention, so it can proceed faster than real-time.
  • the evaluation is performed only on a subset of the images.
  • the evaluation is preferably based on an image quality criterion and/or on the number of generated exceptions.
  • a user may define part of a border to be dictated by one type of graphic primitive and another border portion to be dictated by another.
  • edges of a square table will preferably be dictated to be straight lines, while the outlines of ornamental legs of the table may be approximated by splines.
  • Still other parts of the object may be approximated by other graphic primitives, other splines and/or other parameters for the splines.
  • some parts of the border may be dictated to match a raster or free-hand definition.
  • the border graphic definition will include fixation points around which the graphic may be rotated and/or stretched, if the tracked border moves or changes shape. This is especially applicable to raster definition, for which, in some cases no changes will be desired, while in others, at least rotation will be desired.
  • raster borders are treated as a plurality of border segments, each of which is one pixel or less in size and which are preferably each anchored to a border feature.
  • a user defines an area and/or a boundary thereof to be tracked, and a tracking system tracks the area over a sequence of images.
  • some or all of the features described herein above with respect to boundary sections may also be applied to user defined areas, as well as video painting and different post-processing inside and outside the area.
  • an object to be inserted is anchored to such a user defined area. When the object on which the area is defined turns (in space), the size and/or shape of the area change and the inserted object is warped accordingly.
  • the inserted object may be automatically warped to match a warping of that body in the video sequence.
  • a single portion in a frame is automatically retouched based on a plurality of frames, for example for performing an averaging operation.
  • two portions of the frame are simultaneously retouched, each one in a different manner and/or from different source frames and/or different portions thereof.
  • wire removal in which support wires are removed by replacing their image with image pixels from a previous and/or a later frame.
  • the wire is automatically tracked.
  • areas in a previous and/or later frame which match the wire area are preferably automatically identified, for example as explained with respect to tracking arbitrary areas.
  • the copied image pixels are warped to fit with the current frame.
  • a smoothing filter is applied.
  • pixels from the same frame may also be used, for example for copying a portion of a sky or a portion of a complex pattern.
  • the areas from which the pixels are copied are tracked in all the frames so that there is a correspondence therebetween.
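  • A minimal sketch of the pixel-copying step of wire removal, assuming the source frame has already been tracked and warped so that its pixels correspond to the current frame (function and argument names are illustrative):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def remove_wire(frame, wire_mask, source_frame, feather=True):
    """Replace pixels under the wire with pixels from another frame, then
    lightly smooth the patched region to hide the seam.

    wire_mask : boolean array of shape (H, W), True where the wire is.
    """
    out = frame.copy()
    out[wire_mask] = source_frame[wire_mask]
    if feather:
        blurred = uniform_filter(out.astype(float), size=(3, 3, 1))
        out[wire_mask] = blurred[wire_mask].astype(out.dtype)
    return out
```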
  • retouching preferably utilizes a 3D representation of the retouched area, which is preferably stored as static or dynamic information.
  • the type of retouching performed is used to make a composited object appear to have been generated in the same manner and/or same photographic conditions as the rest of the scene.
  • the retouching may be responsive to more than one source area, for example, one area is used to provide the color balancing and another area is used to provide the pixel data.
  • tracking of arbitrary areas and border sections is achieved by defining a positional relationship between the arbitrary portion and special points which can be directly tracked.
  • These special points may be tracked using methods in the above described PCT publication and U.S. application or by correlation tracking or by other tracking methods known in the art.
  • features of the image may be identified anew for every image.
  • the tracking of arbitrary areas or border sections is based on additional points that are tracked solely for the purpose of tracking the arbitrary portion. Such points may be internal to the object on which the arbitrary portion is defined, even for border section tracking. Alternatively or additionally, such points may be border points or even be external to the object.
  • the arbitrary tracked border sections, points and/or areas are calculated areas, whose size, shape, orientation and/or associated information may be calculated from the points which are actually tracked, from other tracked entities or based on dynamic information, as described herein.
  • borders are tracked by tracking segments of the border.
  • Each such segment is preferably tracked by tracking its end points.
  • the tracking may be by other methods, for example correlation tracking, feature based tracking and/or other tracking methods as described in the above referenced PCT and US applications.
  • the form of the segment is preferably corrected to match the actual border shape.
  • the border is represented as a list of points which can be tracked. When necessary, new points may be added (for example, to fit a major change in the contour of an object). Additionally, points may be removed or ignored for a few frames, if segments of the border shrink or disappear.
  • the points actually identified in a current frame are matched to a list of points of the border.
  • each point is matched to a point in a region of interest having a size and/or location dependent on the motion of the border segment.
  • the matching of the point lists is done using a disparity analysis method.
  • the updated list of points is used for matching border segments in future frames.
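  • A greedy nearest-neighbour matcher can stand in for the disparity analysis mentioned above; the sketch below (names and the fallback policy are assumptions) searches for each previous border point inside a region of interest displaced by the segment's estimated motion:

```python
import numpy as np

def match_border_points(previous_pts, candidate_pts, motion, roi_radius=10.0):
    """Match border points from the previous frame to candidate points in the
    current frame. A point with no candidate inside its region of interest
    keeps its predicted position, so it can be ignored for a few frames."""
    previous_pts = np.asarray(previous_pts, dtype=float)
    candidate_pts = np.asarray(candidate_pts, dtype=float)
    motion = np.asarray(motion, dtype=float)
    matched = []
    for p in previous_pts:
        predicted = p + motion
        d = np.linalg.norm(candidate_pts - predicted, axis=1)
        i = int(np.argmin(d))
        matched.append(candidate_pts[i] if d[i] <= roi_radius else predicted)
    return np.array(matched)
```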
  • the above described postprocessing methods are applied to image compression, for example for video conferencing.
  • a different compression scheme and/or depth may be applied to each area.
  • the identification of areas allows different compression to be applied to each area.
  • these areas are automatically identified, albeit only once.
  • these areas may be identified for key-frames and/or when there is an error in the identification. Typically, it is much simpler computationally to detect a mismatch than to correct it.
  • these areas are identified by comparing the image with an image from a previous video conference, preferably for the same persons and/or setting, in which the sections and/or areas were identified.
  • the identification of areas is performed remotely by a server that accepts one or more frames from a beginning of the sequence and transmits back information regarding a suggested area breakdown for tracking and compression.
  • a server may be manually operated.
  • it employs a powerful computer, preferably with powerful pattern matching capability and/or object recognition ability.
  • the post-processing is applied to portions of a frame which are spaced, either spatially or temporally from the border section being tracked.
  • Such post processing may be applied in addition to or instead of processing at the section and/or may be of a different type.
  • such post processing includes throwing a shadow, relative to an externally provided location of a light source and scene geometry.
  • such post processing includes filling the object with a pattern, whose origin is relative to an arbitrary point on the boundary or in the object.
  • a laser beam and a future explosion are added to a video scene, based on a current orientation and location of a "blaster gun" object.
  • a method of video post processing a video sequence comprising: identifying a first video entity in at least one frame of the video sequence; identifying at least a second video entity in the video sequence; and automatically performing post-processing on a plurality of frames of the video sequence responsive to an interaction of the two entities.
  • a method of video post processing a video sequence comprising: tracking a first video entity across a plurality of frames of said sequence;
  • said first and second post-processing operations utilize different parameter values for a similar post-processing operation.
  • said first and second postprocessing operations comprise different post processing techniques.
  • said two entities comprise portions of a single object.
  • at least one of said first and said second postprocessing operations comprises adding a shadow.
  • at least one of said first and said second post-processing operations comprises compositing.
  • at least one of said first and said second post-processing operations comprises blurring.
  • at least one of said first and said second post-processing operations comprises applying a local filter.
  • at least one of said first and said second post-processing operations comprises smearing.
  • at least one of said first and said second post-processing operations comprises color adjustments.
  • at least one of said first and said second postprocessing operations comprises local color keying.
  • At least one of said first and said second post-processing operations comprises local addition of noise.
  • at least one of said first and said second post-processing operations comprises graphical annotation.
  • at least one of said first and said second post-processing operations comprises wrinkle removal.
  • at least one of said first and said second post-processing operations comprises video painting.
  • at least one of said first and said second postprocessing operations comprises object resizing.
  • at least one of said first and said second post-processing operations comprises warping.
  • said first video entity is tracked independently of said second video entity.
  • a method of video post processing a video sequence comprising:
  • the method comprises changing the information between frames.
  • the information is changed responsive to image content.
  • the content is at a location remote from the location of the entity with which it is associated.
  • the method comprises changing said information responsive to a performance of said tracking.
  • the method comprises changing said information responsive to information associated with a different entity.
  • the method comprises changing said information in a current frame responsive to information associated with a different frame.
  • the method comprises changing said information responsive to a frame-count parameter.
  • the method comprises changing said information responsive to a history of values for said information.
  • the method comprises changing said information responsive to a user input.
  • said first video entity is tracked independently of said second video entity.
  • said information comprises a type of tracking to use for that entity.
  • said information comprises a type classification of the entity.
  • said information comprises a color of the entity.
  • said information comprises graphical annotation data.
  • said information comprises at least part of an appearance/disappearance profile for the entity in the video sequence.
  • said information comprises a smoothness of the entity.
  • said information comprises a visibility of the entity.
  • said information comprises anchor points.
  • said information comprises at least one motion vector.
  • said information comprises parameters for key-framing.
  • said information comprises depth information.
  • said type of tracking may include an instruction not to track the entity.
  • said type of tracking fixes at least one degree of freedom of motion of the entity.
  • said type of tracking fixes at least one limitation on changes in the entity.
  • said limitation comprises a size change limitation.
  • said information comprises a parameter of the tracking for that entity.
  • said information comprises a script.
  • said script comprises a script for performing video post processing.
  • said script comprises a script for updating said information.
  • a method of video post processing a video sequence comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; and identifying an interaction between said two entities.
  • said first video entity is tracked independently of said second video entity.
  • said interaction is an overlap between said two entities.
  • the method comprising generating a new video entity as a result of said interaction.
  • said new entity is a shared border between two video objects.
  • said new entity replaces at least a portion of at least one of the first and second entities.
  • the method comprises modifying at least one of said first and second entities, in response to said interaction.
  • the method comprises deleting at least one of said first and second entities, in response to said interaction.
  • said entities comprise arbitrary points.
  • said entities comprise arbitrary areas.
  • said entities comprise arbitrary lines.
  • said lines overlap borders of objects.
  • said lines do not overlap borders of objects.
  • said entities comprise portions of the video frames which are directly tracked.
  • said entities have extents and wherein said extents are related to an image content.
  • a method of video post processing a video sequence comprising: tracking a video entity across a plurality of frames in the sequence;
  • said tracking comprises tracking prominent points and wherein said parameter comprises a rule for identifying a point as a prominent point.
  • a method of spline fitting comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type; and generating a spline which matches at least a portion of a border of said entered graphical object and which meets an error criterion.
  • the method comprises automatically sub-dividing said graphical object, at prominent points of the object and generating an individual spline for each sub-division.
  • said prominent points comprise points of maximum local curvature.
  • said prominent points are selected based on a local pattern.
  • said prominent points are selected based on a local noise analysis.
  • said prominent points are selected based on being a junction of three or more colors.
  • the method comprises automatically subdividing said graphical object, at non-prominent points of the object and generating an individual spline for each sub-division.
  • the method comprises adjusting parameters for said spline, such that loops in the spline are avoided.
  • a method of object border definition for tracking across a plurality of frames of a video sequence comprising: defining a first portion of said border as a first type of graphic primitive; defining at least a second portion of said border as a second, different, type of graphic primitive; and tracking said first and second portions across said plurality of frames.
  • at least one of said portions is defined using a free-hand graphical object.
  • at least one of said portions is defined using a spline graphical object.
  • at least one of said portions is defined using a color-key.
  • the method comprises tracking said first and said second borders across a plurality of frames in a video sequence.
  • said object border comprises an incomplete border.
  • said object border comprises a complete border.
  • a method of spline fitting comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type, which object comprises an outline associated with an object; automatically adjusting at least a portion of said outline to match at least a portion of said object; and generating a spline which matches at least a portion of said adjusted outline.
  • a method of video post processing a video sequence comprising: arbitrarily identifying an area on a frame of said sequence; tracking said arbitrary area across a plurality of frames of said sequence; and applying a video post-processing to said image responsive to said tracking.
  • a method of video post processing a video sequence comprising: arbitrarily identifying a line on a frame of said sequence; and tracking said arbitrary line across a plurality of frames of said sequence; applying a video post-processing to said image responsive to said tracking.
  • said arbitrary line comprises a portion of a border of an object.
  • said arbitrary identification is divorced from technical considerations regarding tracking.
  • a method of video post-processing planning comprising: manually identifying a video entity; automatically suggesting at least one parameter related to tracking of said entity; and automatically tracking said entity across a plurality of frames of a video sequence.
  • said parameter comprises a type of tracking.
  • a method of video post processing a video sequence comprising:
  • At least one of said image portions is in the same frame as the retouched frame.
  • said at least one image portion comprises at least two image portions in the same frame as the retouched frame.
  • At least one of said image portions is in a different frame from the retouched frame.
  • said at least one image portion comprises a plurality of image portions, at least two of which are in different frames.
  • said retouching comprises copying said at least one of said one or more image portion into the retouched frame.
  • said retouching comprises texture mapping said at least one image portion into the retouched frame.
  • said retouching comprises warping said at least one image portion into the retouched frame.
  • said retouching comprises blending said at least one image portion into the retouched frame.
  • retouching comprises color-balancing responsive to said at least one image portion.
  • retouching comprises graining responsive to said at least one image portion.
  • retouching comprises noise adding responsive to said at least one image portion.
  • retouching comprises adjusting an image of an inserted object, to match scene characteristics, responsive to said at least one image portion.
  • a method of three-dimensional surface reconstruction comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; and reconstructing a surface representation of at least a portion of an object, responsive to said tracking.
  • said entities comprise lines. Alternatively or additionally, said entities are attached to said object. Preferably, said entities comprise border portions of at least one object.
  • said first and second tracking comprise tracking changes in the shape of said entities. Alternatively or additionally, said first and second tracking comprise tracking changes in the relative locations of said entities.
  • the method comprises performing a post processing operation on said video sequence, responsive to said reconstructed surface.
  • the method comprises modifying a tracking of an entity associated with said surface, responsive to said reconstruction.
  • Fig. 1 is a flowchart of a method of video post-processing, in accordance with a preferred embodiment of the invention
  • Fig. 2 is a schematic diagram showing an object having edges, special points, segments and sections defined thereon, in accordance with a preferred embodiment of the invention
  • Fig. 3 is a schematic diagram illustrating a border definition method, in accordance with a preferred embodiment of the invention.
  • Fig. 4 is a schematic diagram of an arbitrarily defined area to be tracked, in accordance with a preferred embodiment of the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • Fig. 1 is a flowchart of a method of video post-processing, in accordance with a preferred embodiment of the invention.
  • a user indicates on a first video frame an approximate border of an object to which post-processing is to be applied (10). Such indication may be at least partially performed using color-keying, functional definition of the border, template matching and matching to a previous frame, as well as by drawing of one or more graphical primitives.
  • a computer may suggest a better fit between the drawn border and the object or a portion thereof.
  • a user then corrects the computer's suggestion (12) and may repeat the first step.
  • the user then defines sections of the border to be tracked (14). It should be appreciated that, in some cases, a user will start with this step, dispensing with steps (10) and (12). Alternatively or additionally, the user can directly identify user defined sections by drawing them, rather than marking them on a previously drawn border. Finally, a user defines dynamic information to be evaluated and/or post-processing to be performed for each border section. It should be appreciated that a section defined by a user for post-processing purposes may sometimes correspond exactly to a segment selected by the computer for tracking purposes.
  • the border definition effects the creation of an object (or video) mask that is used to perform a desired post-production effect.
  • the definition and tracking of the border section provides location coordinates and/or other parameters to a post-processing function.
  • more complex effects may be performed, for example wire removal, which includes replacing the removed pixels.
  • object speed changing, which may include adding blurring, recovering missing ("uncovered") information by copying from a previous or later image and/or warping of object portions whose image is affected by the speed of motion.
  • the information associated with each section may be outputted so that it can be used by an external device.
  • this device comprises a plug-in, which may be executed on the same computer.
  • the information may be outputted to a different computer.
  • the information is outputted to a video source, to control it, for example, the information may be outputted to a graphical object generator.
  • each of these and other post-processing effects may be a function of information associated with each user defined border section.
  • any information may be associated with each section, especially information relating to the drawing and analyzing of the frame.
  • the information associated includes one or more of:
  • (11) depth information which may be provided by a 3D camera or by analysis of the image.
  • Dynamic information is information which may change between video frames. Such changes may be a function of:
  • history such as history of dynamic information and/or image content
  • different post-processing is performed based on the value of the dynamic information, for example, a different amount of motion blurring may be performed depending on the velocity of motion.
  • the direction of the blurring is preferably dependent on the vector of motion, which vector may be different both in magnitude and in direction, for different sections.
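  • A simple sketch of such velocity-dependent, directional blurring (the kernel construction is an assumption; a real system would restrict the blur to the section's mask):

```python
import cv2
import numpy as np

def motion_blur(image, velocity):
    """Blur an image along a motion vector; the blur length follows the speed."""
    vx, vy = velocity
    length = max(1, int(round(np.hypot(vx, vy))))
    kernel = np.zeros((length, length), np.float32)
    angle = np.arctan2(vy, vx)
    c = (length - 1) / 2.0
    for t in np.linspace(-c, c, 2 * length):      # rasterize a line through the centre
        x = int(round(c + t * np.cos(angle)))
        y = int(round(c + t * np.sin(angle)))
        kernel[y, x] = 1.0
    kernel /= kernel.sum()
    return cv2.filter2D(image, -1, kernel)
```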
  • information is made dynamic by providing a script to be performed to evaluate the dynamic information for each section.
  • a function is provided to evaluate the dynamic information.
  • a script is associated with possible events for each section. Such events may include: first appearance, last appearance, disappearance, overlapping with another section or object, interaction with an external event, being at a position, deviating from a track, loss of tracking, size change and rate of change.
  • different scripts for the same purpose may be associated with various states of the system, the tracked video entities and/or user defined states.
  • a set of properties of a section to be tracked which may be dependent on local (possibly dynamic) information, relate to the tracking itself.
  • different sections have different types of tracking associated therewith.
  • the type of tracking may change based on the history of the information at the section.
  • each section may have different values for parameters relating to tracking, for example, in a correlation tracking method, the size of an ROI (region of interest) in which to perform the correlation or the size of the convolution kernel for the correlation.
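  • For example, correlation tracking with a per-section region of interest might be sketched as follows (OpenCV template matching is used as a stand-in; the default sizes are illustrative and image-border clipping is ignored):

```python
import cv2

def correlation_track(frame, template, predicted_xy, roi_half=32):
    """Locate a tracked patch by normalized correlation inside a region of
    interest centred on its predicted position; returns the new position and
    the best correlation score."""
    px, py = int(predicted_xy[0]), int(predicted_xy[1])
    th, tw = template.shape[:2]
    x0, y0 = max(0, px - roi_half), max(0, py - roi_half)
    x1 = min(frame.shape[1], px + roi_half + tw)
    y1 = min(frame.shape[0], py + roi_half + th)
    scores = cv2.matchTemplate(frame[y0:y1, x0:x1], template, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_loc = cv2.minMaxLoc(scores)
    return (x0 + best_loc[0], y0 + best_loc[1]), best_score
```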
  • Another example of a tracking relating parameter is the type of position estimation to use, especially the equations of motion to be used.
  • a tracking related parameter is relative weights for directly tracked points.
  • Another example is the range of values used for color-key based tracking.
  • One type of border tracking may be based on there being a distinct difference between what is inside an object and what is outside, for example using statistical definitions, even though there is no demarcation of the border on a pixel by pixel basis.
  • the border may be better defined by analyzing a region of interest around the border. For example, the border of a pair of blue pants on a blue background can be approximated by the existence of creases in the pants and not in the background.
  • the size of the region of interest may be a property of the section being tracked and/or of its underlying segments or tracked points. As can be appreciated, the above properties and parameters may also be dynamic information.
  • video entities, especially border sections, can be defined so as not to be tracked.
  • This definition is especially useful for rigid objects, for which, if one portion is tracked, the entire outline can be reconstructed.
  • a considerable amount of computer resources that might otherwise be devoted to tracking can be conserved.
  • some entities may be defined to be locked in position, so that even an approximation of their position is not necessary.
  • various restrictions may be placed on the allowed changes in location and/or orientation of the entity, for example, an object may only be allowed to move, move at less than a certain maximal velocity and/or rotate around a particular point.
  • various restrictions may be placed on an allowed change and/or rate of change of an entity's shape and/or pixel content.
  • when such restrictions reduce the accuracy of the tracking and/or special effects, the user is automatically notified.
  • the system automatically advances to a higher level of tracking ability, in which some of the restrictions may be removed.
  • a particular property of a section to be tracked is the method used for reconstruction of a border segment based on tracked portions of the border.
  • Values of this property may include using spline based interpolation or other graphic primitive based interpolation, different types of splines, fixed (exact) border based reconstruction and the level of precision to use for the tracking.
  • parameters that affect the level of precision include whether to adjust the spline parameters periodically to match the real border and/or whether to allow adding and/or removing of points which are tracked and/or what number of points may be added and/or removed.
  • Other examples of precision affecting information are the type of tracking for the underlying segments and/or the limitations placed on changes in the shape of a raster (free-hand) segment.
  • Animation is another particular property that may be associated with a section to be tracked.
  • an animation property may include a definition of a procedure which generates the animation.
  • the animation and/or the procedure may be dependent, inter alia, on dynamic information and/or on a history of dynamic information.
  • dynamic information is utilized for key-framing.
  • Information associated with user defined sections may include: type of interpolation, which key-frames to use for interpolation, the number of frames allowed between key-frames and the amount of motion allowed between key-frames. Different key-frames may be defined for different sections.
  • the information is interpolated, but not necessarily for all the frames and/or for all the sections.
  • the information is interpolated for the same section between frames, possibly using key-framing. Alternatively or additionally, the information is interpolated between two or more sections and/or between values defined for two or more points defined on a section.
  • overlapping of borders is detected and/or tracked. Preferably, special scripts are associated with such overlapped borders.
  • two border sections are preferably defined at the overlapping portion, thus, the two objects can later separate.
  • Fig. 2 is a schematic diagram showing an object 20 having edges, special points, segments and sections defined thereon, in accordance with a preferred embodiment of the invention.
  • a plurality of points 22, 24, 26 and 30 are distinguished as being suitable for being tracked between consecutive images. Such points are preferably automatically selected by the tracking system, for example as described in the above-referenced PCT publication.
  • a segment 22-24 is defined as a border segment which connects the two tracked points. As indicated above, this segment may be tracked by tracking its end points and reconstructing it using a stored spline or another graphic primitive. Preferably, the reconstructed border segment and/or stored spline parameters are adjusted to match the exact border of the object being tracked. By providing a large number of points to be tracked, it is generally possible to associate a border in one image with a border in a second image, even if some of the points are not identified.
  • a user defined section 28 (indicated by the dotted lines) may be defined arbitrarily and divorced from the automatically identifiable segments, such as segment 22-24 or 24-26.
  • the end points of segment 28 are identified as being anchored relative to identifiable points (such as points 22, 24, 26 and 30) or segments (such as segments 22-24 and 24-26).
  • the selection of either points or segments may depend on the tracking method used to identify the edge over a sequence of images.
  • the anchors may comprise other directly identifiable features, for example, a point on object 32, which is internal to object 20.
  • interior and/or prominent and/or anchor points are selected based on their being a junction, a local pattern, a color, a transparency, a local variance of noise, an angle, a color and/or gray-scale spectra, a statistically significant difference in intensity, blurring profile, edges, or on frequency analysis and/or based on a combination of these characteristics.
  • endpoints of the sections and/or control points therein are defined as being a fixed distance and/or orientation from certain anchors. Alternatively a range of distances is allowed.
  • the position is defined as a relative distance between a plurality of points.
  • the relative positioning is dependent on more global parameters, such as the size of the object (which can be estimated from the length of an appropriately defined user defined section or of a particular segment).
  • a control point may be used to force a minimum distance from the control point to an anchor.
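  • One plausible way to derive such a section endpoint from its anchors is a weighted combination plus a size-scaled offset; the sketch below is an assumption, not the combination rule of this publication:

```python
import numpy as np

def place_endpoint(anchor_points, weights, offset, size_estimate=1.0):
    """Compute a user-defined section endpoint from tracked anchor points:
    a weighted mean of the anchors plus an offset scaled by an estimate of the
    object's size."""
    anchors = np.asarray(anchor_points, dtype=float)
    w = np.asarray(weights, dtype=float)
    base = (w[:, None] * anchors).sum(axis=0) / w.sum()
    return base + size_estimate * np.asarray(offset, dtype=float)
```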
  • an optional stickiness parameter may be defined.
  • a designated portion of the section, or an end or a control point thereof, is automatically repositioned to be aligned with, or positioned relative to, a prominent point on the object.
  • the prominent point is predefined by a user.
  • the point may be automatically detected by the system.
  • this stickiness is restricted to a small number of pixels.
  • the location of an elbow section may be identified with the bend of the arm in the picture. In one preferred embodiment of the invention, this bend is automatically detected.
  • the user indicates the prominent point in a later frame and the system backtracks to a first frame to define the elbow location and/or to set parameters for the stickiness.
  • the system backtracks to modify a stickiness parameter, so that the motion of the user defined section will more closely resemble the actual motion. For example, it is usually undesirable for the user-defined section to move in a non-smooth motion only because the prominent point suddenly appears or disappears.
  • a user defined section may be an arbitrary line which does not follow a border.
  • such a line may be used to split an object into two portions, each of which is to be post-processed differently.
  • the user defined section may be defined so that it is unrelated to the contour of the border and depends only on locations of points on the border.
  • such a line may be used to define a special effect that is either inside the object or outside the object, for example a skeleton or a halo.
  • the user-defined section may be defined to substantially conform to at least a portion of the border.
  • instead of tracking the border, it may be interpolated, for example using key-framing.
  • this type of pseudo-tracking is of a
  • sub-sections can be defined relative to the sections, for example, a sub-section 34 of section 28.
  • the sections are organized in an object-oriented hierarchical structure, with one or more levels and/or entities being defined relative to the previous level.
  • a positional relationship is defined between entities on different hierarchical levels.
  • a section in a hierarchical level may also be anchored relative to a level two or more hierarchical stages away.
  • an order may be defined between anchors.
  • a relative weight of the anchors may be defined.
  • various properties may be defined to be inherited from other entities and/or other levels, especially scripts for dealing with events.
  • one or more of the tracked video entities may be invisible or occluded. However, these entities still exist and post processing can be performed responsive to them, for example x-ray viewing of the hidden object through a solid object by drawing a border and a skeleton of the hidden object on the occluding solid object. In addition, other entities and/or effects may be described to be relative to the invisible entity.
  • the segments and/or the special points that are tracked also belong to the hierarchy.
  • sub-segments may be defined on a lower hierarchical level than the segments.
  • information, especially information related to tracking, may be associated with each such element.
  • a script for information evaluation and/or video post-processing is defined for some or all the elements in the hierarchy. In some cases a user will desire to directly manipulate the segments and/or points as video entities in addition to or instead of user defined sections.
  • one or more sections to be tracked are automatically suggested, for example based on image characteristics, such as
  • such suggested sections are defined based on a model of the object, the scene and/or the border.
  • a section may be functionally defined, for example: "the clockwise-most x blue pixels in a border segment (or section) where x is a function of the length of the entire segment".
  • a section may be split into two, for example, when an object decomposes and/or explodes.
  • a new section may have to be defined and it may be defined as a function of the determination of the old section.
  • Fig. 3 is a schematic diagram illustrating a border definition method, in accordance with a preferred embodiment of the invention.
  • An object 40, for which a border is to be defined, is a key.
  • the border is shown distanced from the outline of the key for reasons of clarity. However, in a practical situation, the border will usually overlap the actual border of the key.
  • a user may draw a suggested border (step 10 in Fig. 1) using free-hand drawing and/or using graphic primitives and/or by correlation with existing templates and/or using color keying and/or any combination of the above.
  • border portion 44 will generally be a straight line and is preferably entered using a line primitive.
  • An expected type of error is that one or both of the end points is not properly located.
  • Border portion 48 is a free-hand pixel-by-pixel drawing which attempts to follow the exact outline of the key. The types of error to be expected are an inexact rendering of the outline (which can also be automatically identified and drawn by the system) and/or an offset from the actual border.
  • Border portion 50 is a spline that follows the curve of the key.
  • Border section 52 is a section which may be entered freehand or by spline and which follows the tip of a tassel on the key.
  • the system automatically identifies possibly erroneous data entry points and queries the user as to whether and how to correct them.
  • a point 54 on the drawn border should probably be aligned with point 56 on the key border.
  • Point 56 may be identified by its being a prominent point in the outline of the object.
  • Point 54 may be identified by its being a prominent point on the drawing of the border.
  • Point 58 is intentionally not directly related to the outline of the key, as a user will probably indicate to the system, so it will not be corrected.
  • When a point is confirmed, it may either be fixed in place relative to one or more identifiable anchor points or corrected to be aligned with a prominent point. When a correction is required, the point will generally move. As can be expected, when one point moves it can affect two or more border portions. In one preferred embodiment of the invention, free-hand border portions may only move and do not bend, while spline portions may also be defined to bend. Alternatively, a free-hand drawn portion may be split into two such segments, i.e., it can break at prominent points, automatically selected and/or user-suggested. Alternatively, such breaking is allowed only at a distance from prominent points. In addition, the free-hand border portion may be broken so as to minimize an alignment error. Alternatively, a free-hand border portion is treated as a group of segments, each of which is one pixel in size.
  • a spline may be generated from a free-hand drawing, for example for border portion 50.
  • the system identifies prominent points along the drawn portion and/or in the outline of the object and attempts to match one to the other. A minimum number of spline points is maintained which allows a good correspondence between the outline and the drawn border.
  • a fixed number of control and/or sub-division points are defined. If any change is required in their number, a user may be prompted to allow a change.
  • the drawn portion that is converted into a spline may also include other graphic primitives, including multiple types of splines.
  • the spline is generated by the following method. A user draws a border, using any of the techniques described herein, and optionally adjusts the border to more closely follow the border of the object and/or an arbitrary feature on or off the object. If subdivision of the segment is allowed, the segment is preferably subdivided into sub-segments at prominent points, and each of the sub-segments may be individually approximated (by spline or otherwise).
  • the number of sub-segments into which the segment is split preferably depends on the number of prominent points in the segment. Alternatively or additionally, the number of sub-segments may depend on the number of prominent points which are over a threshold of a prominence criterion. Alternatively or additionally, the number of sub-segments depends on the length of the sub-segments.
  • a spline is fitted for a particular sub-segment (or a whole segment if sub-division is not allowed).
  • a method for fitting a spline to an arbitrary line is described, for example, in "An Algorithm for Automatically Fitting Digital Curves", in Graphics Gems (vol. 1), edited by Andrew S. Glassner.
  • the spline fitting takes into account visual artifacts.
  • if the angle between the two ends of the spline is too large, especially if each end is directed to a different side of a line connecting the two ends, the angle of each end is preferably made smaller, for example reduced by half or by some other user defined value and/or reduced to below a threshold value. This adjustment preferably increases the convergence speed of the spline fitting. In addition, it reduces the probability that there will be a cusp or a loop in the spline.
  • if the handles of the spline cross, one or both of them are shortened so that they do not meet, since crossing of the handles may create a loop.
  • parameters of the spline are changed so that the cusp and/or the loop are eliminated, possibly at the expense of a larger error.
  • This threshold may be a constant or a function, possibly of the segment shape and/or length.
  • the fit error may be based on an average error along the border segment, on a weighted error, which takes into account portions where precision is more important (possibly by user input), an allowed variance, the number of sub-divisions and/or iterations performed, an error in other segments, especially neighboring segments, and/or it may be based on a cumulative error value. If the error is below the threshold, the fitting is complete. Otherwise, the segment is subdivided again.
  • additional subdivision may be performed at prominent points that, for some reason, were not included in the main subdivision.
  • the subdivision is performed at arbitrary points (automatically selected), which may be directly tracked or which may be indirectly tracked by using prominent points.
  • this process of sub-division and fitting is repeated until the error threshold is met, a certain number of iterations has passed and/or a subdivision limit is reached. A sketch of such a fit-and-subdivide loop is given in the second code example following this list.
  • when the object is, for example, a man, the arms will generally be defined as splines and the head and hands as a raster.
  • the hair portion border will generally be defined by color-keying relative to the raster. If a portion of the man, for example his pants, is the same color as the background, some segments may be tracked by approximation; for example, the knee position may be approximated by the position of the belt and of the shoes. Preferably, such an approximation takes into account the dynamics of the human leg, i.e., if the leg is bent the location of the knee is not on a line connecting the shoe and belt.
  • Fig. 4 is a schematic diagram of an arbitrarily defined area to be tracked, in accordance with a preferred embodiment of the invention.
  • Fig. 4 also illustrates applying a post-processing effect, such as object insertion, to such an area.
  • an object 60 is a man, and it is desired to "attach" a sheriff's badge 68 onto his shirt.
  • the shirt may be substantially featureless. Alternatively, it may be over-featured, for example being a dense plaid, individual points of which are very difficult to track using correlation tracking.
  • a plurality of special points and/or features on object 60 are tracked directly.
  • Such objects may include a pocket 64 and buttons 62.
  • these points may include directly tracked points or segments along the border or even indirectly tracked sections along the border.
  • An arbitrary area is defined relative to the tracked objects, which can serve as anchor points.
  • a plurality of relationships are defined between the anchors and the arbitrary area.
  • relationships are defined to more than three points in- and/or on- the border of the tracked area.
  • the type of- and parameters relating to- the relationships and/or other aspects of tracking are also individually associated with each section.
  • tracking of arbitrary points, areas and/or sections is used to reconstruct a three-dimensional surface from the two dimensional locations.
  • such reconstruction is used as an input for tracking arbitrary areas and/or sections and/or for performing post-processing.
  • the 3D information is preferably reconstructed by tracking the relative changes in the locations of tracked points and borders and applying various assumptions regarding rigidity. Such assumptions are preferably provided from a model of the 3D object being tracked. In addition, parallax effects and object hiding effects may also be detected.
  • arbitrary points are tracked in addition to or instead of tracking areas or border sections. It should be appreciated that it is possible to track not only the position of the arbitrary point but also its orientation relative to the rest of object 60 or portions thereof or other objects in the scene.
  • information, scripts, hierarchical structure and/or other features of the invention as described above with respect to arbitrary border section tracking may also be applied to arbitrary tracked areas.
  • the above methods are used to find a correspondence between two images of the same or similar scenes, at different times and/or different orientations and/or different color filters and/or different photographic parameters. Additionally or alternatively, these methods may be used for two parallel video scenes which contain similar video entities, border portions or other features which can be tracked. Additionally or alternatively, these methods may be used to simultaneously apply special effects or object insertions to a left and a right stereo pair.
  • Macintosh® or an IBM PC Pentium® computer. Alternatively or additionally, it may be at least partially embodied using special purpose hardware that preferably includes a video acquisition card.

Abstract

A method of video processing a video sequence, comprising tracking a first video entity (20) across a plurality of frames of said sequence, tracking a second video entity across a plurality of frames of said sequence, automatically applying a first post-processing operation to at least a portion of at least one of said frames, said post processing being associated with said first entity and substantially simultaneously applying a second post-processing operation (16) to at least a second portion of one of said frames, said post processing being associated with said second entity, where said first post-processing operation is substantially different from said second post-processing operation.

Description

FEATURE MOTIVATED TRACKING AND PROCESSING
FIELD OF THE INVENTION
The present invention relates to the processing of video streams, and in particular to mixing and performing special visual effects on the stream. RELATED APPLICATIONS
The present application claims the benefit under 35 USC 119(e) of a provisional patent application with like title, filed on August 10, 1998 in the US Patent and Trademark office, the disclosure of which is incorporated herein in its entirety by reference.
BACKGROUND OF THE INVENTION Video post-production is a field in which existing video streams are processed, to generate visual effects that are either not possible or very expensive to create in real life. Two main types of visual effects are "special effects", in which the content of the video stream is modified and "compositing", in which additional information, possibly from one or more video streams and/or a computer generated animation object, is combined with the first video stream.
A main tool for video post-production is the image mask, which is generally the size of the video image and which defines a mask to be applied to the processing of each individual video image. In one example, a mask, having non-zero values in an area the size and shape of an object to be inserted, may be used to define parameters for inserting the object into a video frame. Each non-zero value corresponds to a relative weighting of a pixel from the object to be inserted and of a pixel from the video frame. In some types of video post production, a video image is segmented into objects and/or into layers, each of which corresponds to a distance. Thereafter, the video is processed to generate the desired effects. As used herein, the term video post-processing refers to processing of video images for post-production purposes. Some special effects are applied at borders of objects. For example, since an object in motion will have blurred edges, one post-processing technique adds blurring to the border of a composited object mask. In other cases, for example semi-transparent objects, the effects may be applied over the entire object. In another example, noise may be added over an entire object. As can be appreciated, video comprises a stream of consecutive images, so a mask should preferably be defined for each image. In one type of method, called key-framing, a mask is manually defined for a small number of "key- frames" and the mask is interpolated for video frames between the key-frames. In another type of method, the border of the object or the entire object is tracked between consecutive images, after the border is manually entered once. PCT publication WO/97/06631 of February 20, 1997 and U.S. Patent application No. 08/692,297, titled "Apparatus and Method for Object Tracking", both by inventors E. Spiegel and Y. Pastor, the disclosures of which are incorporated herein by reference, describe a method of identifying and tracking "special" points on a border of an object. The location of the border itself may then be determined from the relative location of the special points, by matching an approximation of the border to the underlying image or by extrapolation for portions of the border which are invisible.
Four point tracking is an image annotation system in which four points on the image, which include prominent features, are tracked and an object is inserted into the picture so that it is attached to the points. An example of four point tracking is found in the Adobe® After- Effects® Pro software, version 3.1, available from Adobe Systems Inc., 411 First Avenue South, Seattle, WA 98104-2871, USA.
SUMMARY OF THE INVENTION
It is an object of some preferred embodiments of the present invention to provide a method of video post-processing in which the processing is performed for arbitrary, user defined, portions of a video stream. Preferably, the portions comprise lines, preferably user-defined sections of borders. Alternatively or additionally, the portions comprise user-defined areas of objects. As used herein, the term "video sequence" is used to describe a sequence of images that may be displayed as cine, however, there is no requirement that the images be from a video source. In some preferred embodiments of the invention, the video source may be a camera. Alternatively, it may be a synthetic source.
Another object of some preferred embodiments of the invention is to provide a method of video scene rescripting, in which various video entities are identified and allowed to interact with each other. In a preferred embodiment of the invention, each video entity or type of entity has a behavior associated therewith. When the video sequence is played, these behaviors cause the entities to interact, preferably adding effects to the scenes and/or changing the content of the scene. Examples of changing the scene include, but are not limited to adding, removing and changing the speed and direction of objects, removing artifacts such as wires, adding visual effects such as halos and annotations, and changing the way objects interact in the scene, such as making an elastic collision inelastic. Hence, the term "scene rescripting".
Alternatively or additionally to defining a behavior for each entity, a new scenario may be provided for the video sequence, which scenario defines actions to be "performed" by video entities in particular frames or groups of frames. Alternatively or additionally to interacting with each other, the video entities may interact with image portions from the current and/or
other frames. Alternatively or additionally the entities may interact- with or their behavior may be affected by- externally provided data or inputs. Alternatively or additionally, video entities may interact with virtual entities that are not part of- and/or are not visible in- the video sequence. Alternatively or additionally, the entities may export their external state and/or interactions with other entities in order to drive external programs.
In a preferred embodiment of the invention, the video entities are related to image content. Preferably, there is a correspondence between video entities and objects that are viewed in the video sequence, for example people, trees and cars. Alternatively or additionally, a video entity may correspond to groups of objects. Alternatively or additionally, a video entity may correspond to a portion of an object, such as a hand or a finger. Alternatively or additionally, a video entity may be arbitrary, preferably defined to help generate a special effect. In one example, an entity may be defined to be an area to which an object is to be attached. In another example, an entity may include a border section of an object, as many interactions occur at the borders of objects. In a preferred embodiment of the invention, a video entity may have information associated therewith, including a three-dimensional representation of a real world object which the entity corresponds to.
In some preferred embodiments of the invention, a video entity may be created dynamically as a result of the interaction of two or more video entities. In a preferred embodiment of the invention, entity interactions may include interactions between entities in different frames. In some preferred embodiments of the invention, all or most of the interactions will be automatic. Alternatively, at least some of the interactions may be manually directed. Preferably, manual direction includes identification of entities and entering parameters that modify the interactions. In addition, a user may be prompted, during the interactions, to make certain decisions, for example, in unanticipated situations or where an especially high quality is desired.
One aspect of some preferred embodiments of the present invention relates to performing different post processing for different border sections of objects. In one preferred embodiment of the invention, the post-processing is related to insertion of the object into a video stream. Alternatively or additionally, the post-processing may be embodied in a mask. Alternatively or additionally, the post-processing may be applied differently on the two sides of the border.
Another aspect of some preferred embodiments of the present invention is related to associating information with video entities, especially user defined border sections, and with portions of the video sequence. In a preferred embodiment of the invention, such information may include one or more of: identification, motion vector, type of effect(s) to perform, parameters of the effects to perform, width (which may be multi-pixel, sub-pixel or even an infinitesimally thin line), type of color, graphical elements expected at and/or near the border, direction of inside and outside, type of tracking to perform, parameters relating to tracking, type of estimation to perform when tracking is lost, type of key-frame interpolation to perform and parameters relating to the interpolation, type of border, the shape of the border, type of interpolation to perform for regenerating the border from what is actually tracked between frames, type of an animation to apply per border, allowed distortion, allowed smoothing and a location in the current, a previous or later frame from which to copy information into the current frame, for example for wire removal. Additionally or alternatively, the information may include ranges of valid values for the associated information. Preferably, when a value is outside the allowed ranges an exception occurs and a user may be notified.
Another aspect of some preferred embodiments of the present invention is that at least some of the information associated with video entities is dynamic information, which may change between frames. Substantially any type of information, such as listed with respect to the previous aspect, may be dynamic information. In some preferred embodiments of the invention, dynamic information is a function of global parameters, which are relevant to the entire frame and/or sequence portion. Alternatively or additionally, the information may be a function of other information (dynamic or static). Alternatively or additionally, the information may be a function of the activity of individual video entities. In accordance with one preferred embodiment of the invention, the information may be dependent on the frame number and/or a frame count to a key-frame (previous, future, minor, major or any combination). Alternatively or additionally, the information may be dependent on a history of the video stream.
Alternatively or additionally, the information may be dependent on the image content at a portion of the image.
In a preferred embodiment of the invention, an image portion is defined using frame coordinates. Alternatively or additionally, it may be defined relative to video entity coordinates, especially if the video entity is tracked between images. Alternatively or additionally, the coordinate system used may be based on outside-supplied coordinates, image segmentation, video layers, distance information, coordinates of the scene and/or objects therein, preferably based on a model of the scene, camera coordinates and/or any combination thereof. Alternatively or additionally, the information may be dependent on parameters of the photography used for generating the scene, such as pan speed and zoom angle. Preferably, such information is stored with the video sequence. Alternatively, it may be determined by analyzing the sequence. Alternatively or additionally, the information may be dependent on information associated with other video entities.
In a preferred embodiment of the invention where dynamic information is a function of image content, the content may be of portions of the current frame and/or of previous or later frames. In some preferred embodiments of the invention, the coordinates of an image portion are taken from a current frame and applied to a previous and/or later frame, for example for an effect where an object dissolves. Preferably, the image content is from before post-processing is applied to the frame. Alternatively or additionally, the information may depend on the content of an image portion after it has been partially or completely post processed and/or on a comparison between a post processed and non-post processed portion.
Alternatively or additionally, the information may be dependent on externally provided input and/or on information associated with the same or other video entities, such as user defined sections. It should be appreciated that combinations and/or dependencies of the above information affectors are also considered to be in the scope of some embodiments of the present invention.
Some examples of using dynamic information include: dynamic information being dependent on the relative location of two objects, an increasing amount of blurring at a section the longer its exact position cannot be ascertained by tracking and changes in the image content at a location relative to one border section affecting the dynamic information associated with a different section or with an entire border. In addition, such information may be dependent on the content of the image, for example, a different amount of blurring for each color, or for different intensities. In another example, the type of interpolation performed for key-framing may be dependent on whether the border section is represented as a spline or as a raster (a free hand drawing in which each pixel is individually positioned). In another example, a border may be tracked and/or adjusted to be aligned with an object border, by defining a range of color values to which all the pixels, either inside the object or on its background, belong. Preferably, when using such color key tracking, the range of colors may be dynamically determined, for example, based on the average color of the background. In accordance with one preferred embodiment of the invention, dynamic information is evaluated using a function. Preferably, the function is one of a vendor-supplied library of functions, whose parameters may be modified by a user. Alternatively, a user may provide a user defined function. Alternatively or additionally, the information may be provided as a table. Alternatively or additionally, the information may be provided as a script, preferably
written in an interpreted high level video-processing programming language. Preferably, such a language is created by adding video-processing specific procedures to an existing script language, such as Microsoft Visual Basic. In one preferred embodiment of the invention, especially if the dynamic information for different sections may interact, an order may be defined between the evaluation of the information and/or between the effecting of the postprocessing dictated by the information. It should be noted that the results of applying the method in the two orders are not necessarily the same, as the information evaluation depends on what is happening in the image, while the post-processing is geared towards generating an image which better fulfills a user's desire. In accordance with a preferred embodiment of the invention, a user may control, at runtime of the post-processing, an evaluation of the dynamic information and/or the postprocessing, by manipulating various parameters. Such parameters may be manipulated before post-processing, during post processing and/or by stopping the post processing and then restarting or continuing it with new parameters. The parameters may be entered by a user using standard input devices, such as a mouse, keyboard, graphical user interface, touch screen and voice input. Additionally or alternatively, values for such parameters may be entered using dedicated input devices, for example, a motion sensor. Alternatively or additionally, such parameters are provided from an external source, such as a computer generating graphics to be incorporated into the image, a second video stream and/or additional video and/or computer equipment. These sources may also be used to provide the dynamic information itself.
Another aspect of the present invention relates to tracking arbitrary user defined border sections. Such arbitrary user defined border sections may contain no distinctive features. In some cases, even correlation-based tracking of such sections will not be possible. It should be noted that in some cases computer selected border points and border segments may also be user-defined as sections. Preferably, however, the arbitrary user defined border sections comprise a higher organizational level, as they are preferably defined relative to such segments and points. A user defined section may start or end at any point along a border, even one which is not tracked directly by the computer. Additionally or alternatively, border sections may overlap. Additionally or alternatively, a border section may include more than one segment.
As used herein, the term border segment indicates a segment of a border which is directly tracked by a computer, preferably by tracking end points of the segments but alternatively or additionally by correlation tracking or by using other methods. The term border section indicates a user defined section of the border, which has a logical meaning to
the user, rather than only a technical reason relating to tracking, segmentation or the like. In some preferred embodiments of the invention, the definition of the user defined section may be completely divorced from any technical considerations relating to tracking. Rather, it is preferably related only to image content and/or the desired post-processing. In accordance with a preferred embodiment of the invention, alternatively to user identification of sections, the "user defined sections" may be automatically suggested by a computer based on a model of the object and/or based on image characteristics and/or based on an imprecise indication by the user. In a preferred embodiment of the invention, a step of image segmentation and/or object detection may be performed automatically to help a user decide on user-defined sections. Preferably, a user can indicate an object, its border or a portion thereof to be a video entity.
Another aspect of some preferred embodiments of the invention relates to learning. In a preferred embodiment of the invention, the system tracks the effect of various parameter values on the operation of various aspects of the video post-processing and the extent to which these aspects are successful. This information may be associated with a particular video entity, a particular frame, a group of frames, a scene, a type of video sequence, a type of scene and/or a type of entity. Additionally or alternatively, this information may be associated with various parameters and/or evaluated criteria of the entity, frame, scene or sequence, for example a frame's average background level or a section's smoothness. One example of information which is preferably learned is what types of prominent points can be effectively tracked, for a particular type of border. Thereafter, when the computer selects points to use as anchors (one or more) for an arbitrary tracked section, when segments are selected for tracking and/or when a user selects anchor points, it is possible to select points of a type which are expected to have a higher probability of success. In a preferred embodiment of the invention, when a user indicates a video entity or an object border, the indication is made on more than one frame. These frames may be consecutive or, in some preferred embodiments of the invention, there may be gaps between the frames. The border as indicated in one frame and/or sections thereof and/or prominent points thereon may be directly copied to the next frame, for adjustment by the system and/or by the user. Alternatively or additionally, the user instructs the system to "track" the border from one frame to the next, optionally after adjustment by the user. Tracking by the system may also be used by the user to assess whether the indicated border will be properly tracked. In a preferred embodiment of the invention, the tracking between non-consecutive frames is used as an aid in entering key-frame information.
In a preferred embodiment of the invention, when the user indicates a segment or a section, he associates information with it. In a preferred embodiment of the invention, the information may be entered by inputting data into a structured data file, using a graphical user interface, interpolation of values using a spatial distribution function, interpolation of values between key-frames, using a script and/or combinations of these methods. Preferably, the information and any other properties may also be entered using scripts associated with this or other video entities.
Another aspect of some preferred embodiments of the present invention relates to object overlapping. In accordance with a preferred embodiment of the invention, when two objects overlap, an overlap of user defined sections is detected. In some preferred embodiments of the invention, the overlapped area may be treated as a special type of section having special properties which may be a combination of the two sections' properties. As can be appreciated, a property of the sections may be which section's properties overrides the other section's properties. Preferably, when two objects overlap, the post-processing system defines two borders, one for each object, so that when the amount of overlap changes, each object may be properly post-processed. In a preferred embodiment of the invention, the system automatically estimates the extent and location of a previously overlapped or otherwise hidden edge. Alternatively or additionally, it requests a user intervention. In addition, an object may be split into two and/or a common border may be defined between the objects. In a preferred embodiment of the invention, the information associated with one or more of the sections includes whether the section is hidden, in contact with another video entity, partly occluded and/or occluding another entity, and/or combinations of these conditions. Additionally or alternatively, such information preferably includes the conditions under which a new section is created and/or its default properties. In a preferred embodiment of the invention, various scripts are defined for interactions between types of objects, for example between elastic and in-elastic objects or object borders or for interaction between animation objects and video objects. It should be appreciated that, in many cases, a video entity will have associated with it a zone of influence. Typically the zone will surround the entity. However, in some preferred embodiments of the invention, the zone may be on only one side of the entity, may overlap the entity and/or may be located at a distance from the entity. Preferably, the size, shape, location and behavior of the zone are associated with each entity and constitute dynamic information. Preferably, more than one zone of influence may be associated simultaneously with each entity. Preferably, different scripts are defined for each such zone. Preferably, the zones are also part of an object-orientated hierarchical structure. Another aspect of some preferred embodiments of the invention relates to a method of entering a spline into a graphics system, especially for a border tracking and/or video postprocessing system. In many cases a spline is useful for approximating a curve, however, it may be difficult and/or time-consuming for a user to draw an appropriate spline. Preferably a spline is stored as a set of spline parameters, end points and, optionally, additional control points. When a border segment needs to be drawn, a spline can be drawn using the stored parameters, after they are matched to the current frame conditions. In a preferred embodiment of the invention, a user draws a border in free-hand style and/or using graphical tools. Alternatively or additionally, at least a portion of the border may be drawn by color keying, i.e., as a border between the keyed area and the non-keyed area. In a preferred embodiment of the invention, a user draws a border using a combination of such drawing methods. The system then creates a spline that approximates what was drawn or a portion thereof. 
In many cases, a user does not accurately follow the border being marked. In some cases this is intentional, in others it may be unintentional. In accordance with a preferred embodiment of the invention, the user may indicate to the system which points along the drawn border are exact and which are only approximate. Alternatively or additionally to indicating points, the user may indicate graphic primitives and/or border portions which should be adjusted. Preferably, the system prompts the user to provide such an indication for points that are either far from any feature (for example, a high contrast area) or very close to such a feature. Preferably, the user is prompted for an indication only regarding points that are used to define the graphic primitives used to enter the border. Alternatively or additionally, the user is prompted for such an indication regarding prominent points. Preferably, any point which is a junction between more than two lines and/or three or more colors and/or a point at which there is a large angle between meeting lines and/or colors, is considered to be a possible candidate for correction and a user may be prompted for an indication by the computer. Alternatively or additionally, points at which the underlying border of the object deviates to a substantial amount from an estimated border, are considered to be candidates.
In a preferred embodiment of the invention, when a user indicates that such a candidate point is to be corrected, the system may suggest one or more possible features which the point is to be anchored to. In a preferred embodiment of the invention, the system attempts to match the drawn border to an automatically detected border of the object, to determine such features and/or required corrections. Preferably, this determination is performed by tracking the drawn border to an underlying object border in the current frame. In a preferred embodiment of the invention, a user can indicate a position offset from such a feature to the point. This offset may be explicitly entered or may be implicitly entered based on what was drawn. Preferably, when a point is moved, the entire graphic primitive associated with the point is moved. In a preferred embodiment of the invention, when a border section is entered using individual pixels, more than one pixel may be grouped by the system and/or by a user, to be treated as a single graphic primitive.
In a preferred embodiment of the invention, user defined sections and/or other video entities are entered using similar techniques. Preferably, the user indicates which points correspond to ends and control points of user defined sections.
Another aspect of some preferred embodiments of the present invention relates to automatically suggesting and/or generating tracking parameters for a video entity, such as a particular border portion. Alternatively, user entered parameters or tracking methods may be tested and the user may be informed of the quality of- and/or problems with- his parameters. In a preferred embodiment of the invention, the system suggests a type of tracking (spline, correlation, color-key) based on geometrical and/or color characteristics of the drawn or actual border and/or its vicinity. For example, spline tracking may be suggested for a smooth area. Correlation tracking may be suggested for a unique looking area. Color-key tracking may be suggested for a highly delicate area, such as hair, which has a distinct color.
In a preferred embodiment of the invention, the type of tracking is determined based on the behavior of the border portion over a sequence of frames. Alternatively or additionally, the behavior of the border may be analyzed over the entire length of the sequence, possibly skipping some frames. Frame skipping is especially useful when entering data for key-frames.
Preferably, changes in the border are analyzed to determine which type of tracking and/or which parameter values would be most useful. Alternatively, the tracking method and/or parameters are determined by trial-tracking the section with various possible tracking schemes. In a preferred embodiment of the invention, the type of interpolation to use for drawing a border segment based on tracking of its end points and/or control points may also be determined, based on the changes in the form of the tracked border. In a preferred embodiment of the invention, the type of tracking which is performed for invisible segments is also automatically evaluated and/or suggested. Preferably, such analysis is based on the smoothness of motion of a previously hidden border when it becomes visible. In a preferred embodiment of the invention, other types of user entered information may be evaluated, for example, static information, dynamic information or scripts. Such evaluation is preferably performed automatically, without user intervention, so it can proceed faster than real-time. In a preferred embodiment of the invention, the evaluation is performed only on a subset of the images,
preferably a subset suggested by a user. The evaluation is preferably based on an image quality criterion and/or on the number of generated exceptions.
Another aspect of some preferred embodiments of the invention relates to a method of better defining a border, for a border tracking and/or a video post-processing system. In a preferred embodiment of the invention, a user may define part of a border to be dictated by one type of graphic primitive and another border portion to be dictated by another. For example, edges of a square table will preferably be dictated to be straight lines, while the outlines of ornamental legs of the table may be approximated by splines. Still other parts of the object may be approximated by other graphic primitives, other splines and/or other parameters for the splines. Additionally, some parts of the border may be dictated to match a raster or free-hand definition. In some preferred embodiments of the invention, the border graphic definition will include fixation points around which the graphic may be rotated and/or stretched, if the tracked border moves or changes shape. This is especially applicable to raster definition, for which, in some cases no changes will be desired, while in others, at least rotation will be desired. In a preferred embodiment of the invention, raster borders are treated as a plurality of border segments, each of which is one pixel or less in size and which are preferably each anchored to a border feature.
It should be appreciated that where two graphic primitives meet, there may be some interaction between them. In some cases, it will not be possible to provide a precise border, for example, if four fixed length line definitions are provided to track a size-changing square. Preferably, a best fit border is then determined. Alternatively, some points along the border are considered to be more important than others with respect to the quality of their tracking and the best fit provides a better fit for the more important points.
Another aspect of the present invention relates to tracking arbitrary user defined areas of an image. In accordance with a preferred embodiment of the invention, a user defines an area and/or a boundary thereof to be tracked, and a tracking system tracks the area over a sequence of images. In a preferred embodiment some or all of the features described herein above with respect to boundary sections may also be applied to user defined areas, as well as video painting and different post-processing inside and outside the area. In accordance with a preferred embodiment of the invention, an object to be inserted is anchored to such a user defined area. When the object on which the area is defined turns (in space), the size and/or shape of the area change and the inserted object is warped accordingly. In addition, if the area is defined on a non-rigid body, the inserted object may be automatically warped to match a warping of that body in the video sequence.
Another aspect of some embodiments of the present invention relates to real-time processing of video images. Some of the embodiments described herein are preferably applied to linear video. Preferably, a look ahead of between 5 and 20 frames is used, which causes a delay of up to one second. Another aspect of some embodiments of the present invention relates to automatic retouching of a video frame based on the content of a previous or a later frame or a combination thereof. Alternatively or additionally, retouching comprises automatically retouching a first portion of a video frame based on other portions of the same frame.
Alternatively or additionally, a single portion in a frame is automatically retouched based on a plurality of frames, for example for performing an averaging operation. In a preferred embodiment of the invention, two portions of the frame are simultaneously retouched, each one in a different manner and/or from different source frames and/or different portions thereof.
One example of automatic retouching is wire removal, in which support wires are removed by replacing their image with image pixels from a previous and/or a later frame. Preferably, the wire is automatically tracked. Additionally or alternatively, areas in a previous and/or later frame which match the wire area are preferably automatically identified, for example as explained with respect to tracking arbitrary areas. Preferably, the copied image pixels are warped to fit with the current frame. Additionally or alternatively, a smoothing filter is applied. Alternatively to using pixels from other frames, pixels from the same frame may also be used, for example for copying a portion of a sky or a portion of a complex pattern.
Preferably, when retouching based on a plurality of images, the areas from which the pixels are copied are tracked in all the frames so that there is a correspondence therebetween.
In a preferred embodiment of the invention, other operations may be used for retouching and/or for special effects which relate a plurality of image portions, including blending, warping, morphing, texture mapping, graining, noise adding, artifact adding and color balancing. Morphing preferably utilizes a 3D representation of the retouched area, which is preferably stored as static or dynamic information. In a preferred embodiment of the invention, the type of retouching performed is used to make a composited object appear to have been generated in the same manner and/or same photographic conditions as the rest of the scene. In addition, the retouching may be responsive to more than one source area, for example, one area is used to provide the color balancing and another area is used to provide the pixel data.
In accordance with a preferred embodiment of the invention, tracking of arbitrary areas and border sections is achieved by defining a positional relationship between points which
define the arbitrary area (or boundary or point) to be tracked and special points and/or features of the video image which are directly tracked. These special points may be tracked using methods in the above described PCT publication and U.S. application or by correlation tracking or by other tracking methods known in the art. Alternatively or additionally, features of the image may be identified anew for every image. Additionally or alternatively, the tracking of arbitrary areas or border sections is based on additional points that are tracked solely for the purpose of tracking the arbitrary portion. Such points may be internal to the object on which the arbitrary portion is defined, even for border section tracking. Alternatively or additionally, such points may be border points or even be external to the object. It should be appreciated that in some preferred embodiments of the invention, when an arbitrary point is tracked, its orientation may also be tracked, especially its orientation relative to a different point on the object and/or in the same image. In a preferred embodiment of the invention, the arbitrary tracked border sections, points and/or areas are calculated areas, whose size, shape, orientation and/or associated information may be calculated from the points which are actually tracked, from other tracked entities or based on dynamic information, as described herein.
In a preferred embodiment of the invention, borders are tracked by tracking segments of the border. Each such segment is preferably tracked by tracking its end points. Alternatively or additionally, the tracking may be by other methods, for example correlation tracking, feature based tracking and/or other tracking methods as described in the above referenced PCT and US applications. In addition, the form of the segment is preferably corrected to match the actual border shape. In a preferred embodiment of the invention, the border is represented as a list of points which can be tracked. When necessary, new points may be added (for example, to fit a major change in the contour of an object). Additionally, points may be removed or ignored for a few frames, if segments of the border shrink or disappear. In a preferred embodiment of the invention, the points actually identified in a current frame are matched to a list of points of the border. Preferably each point is matched to a point in a region of interest having a size and/or location dependent on the motion of the border segment. Preferably, the matching of the point lists is done using a disparity analysis method. Preferably, the updated list of points is used for matching border segments in future frames. In accordance with a preferred embodiment of the invention, the above described postprocessing methods are applied to image compression, for example for video conferencing. In accordance with a preferred embodiment of the invention, once borders and areas are defined, a different compression scheme and/or depth may be applied to each area. In one preferred embodiment of the invention, the identification of areas for which different compression may
be used is manual. Alternatively, these areas are automatically identified, albeit only once. Alternatively or additionally, these areas may be identified for key-frames and/or when there is an error in the identification. Typically, it is much simpler computationally to detect a mismatch than to correct it. Preferably, these areas are identified by comparing the image with an image from a previous video conference, preferably for the same persons and/or setting, in which the sections and/or areas were identified. Alternatively or additionally, the identification of areas is performed remotely by a server that accepts one or more frames from a beginning of the sequence and transmits back information regarding a suggested area breakdown for tracking and compression. Such a server may be manually operated. Alternatively, it employs a powerful computer, preferably with powerful pattern matching capability and/or object recognition ability.
In accordance with a preferred embodiment of the invention, the post-processing is applied to portions of a frame which are spaced, either spatially or temporally from the border section being tracked. Such post processing may be applied in addition to or instead of processing at the section and/or may be of a different type. In one example, such post processing includes throwing a shadow, relative to an externally provided location of a light source and scene geometry. In another example, such post processing includes filling the object with a pattern, whose origin is relative to an arbitrary point on the boundary or in the object. In another example, a laser beam and a future explosion are added to a video scene, based on a current orientation and location of a "blaster gun" object.
In another preferred embodiment of the invention, even when border sections are tracked, there is no necessity to track an entire border and/or to apply post-processing to more than a portion of the object and/or its border. Thus, in some cases, it may not be necessary to define an entire border. There is thus provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: identifying a first video entity in at least one frame of the video sequence; identifying at least a second video entity in the video sequence; and automatically performing post-processing on a plurality of frames of the video sequence responsive to an interaction of the two entities.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence;
tracking a second video entity across a plurality of frames of said sequence; automatically applying a first post-processing operation to at least a portion of at least one of said frames, said post processing being associated with said first entity; and substantially simultaneously applying a second post-processing operation to at least a second portion of one of said frames, said post processing being associated with said second entity, where said first post-processing operation is substantially different from said second post-processing operation.
Preferably, said first and second post-processing operations utilize different parameter values for a similar post-processing operation. Alternatively, said first and second postprocessing operations comprise different post processing techniques.
In a preferred embodiment of the invention, said two entities comprise portions of a single object. Alternatively or additionally, at least one of said first and said second postprocessing operations comprises adding a shadow. Alternatively or additionally, at least one of said first and said second post-processing operations comprises compositing. Alternatively or additionally, at least one of said first and said second post-processing operations comprises blurring. Alternatively or additionally, at least one of said first and said second post-processing operations comprises applying a local filter. Alternatively or additionally, at least one of said first and said second post-processing operations comprises smearing. Alternatively or additionally, at least one of said first and said second post-processing operations comprises color adjustments. Alternatively or additionally, at least one of said first and said second postprocessing operations comprises local color keying. Alternatively or additionally, at least one of said first and said second post-processing operations comprises local addition of noise. Alternatively or additionally, at least one of said first and said second post-processing operations comprises graphical annotation. Alternatively or additionally, at least one of said first and said second post-processing operations comprises wrinkle removal. Alternatively or additionally, at least one of said first and said second post-processing operations comprises video painting. Alternatively or additionally, at least one of said first and said second postprocessing operations comprises object resizing. Alternatively or additionally, at least one of said first and said second post-processing operations comprises warping.
In a preferred embodiment of the invention, said first video entity is tracked independently of said second video entity.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising:
tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; associating different post-processing information with the different entities. Preferably the method comprises changing the information between frames. Preferably the information is changed responsive to image content. Preferably, the content is at a location remote from the location of the entity with which it is associated.
In a preferred embodiment of the invention, the method comprises changing said information responsive to a performance of said tracking. Alternatively or additionally, the method comprises changing said information responsive to information associated with a different entity. Alternatively or additionally, the method comprises changing said information in a current frame responsive to information associated with a different frame. Alternatively or additionally, the method comprises changing said information responsive to a frame-count parameter. Alternatively or additionally, the method comprises changing said information responsive to a history of values for said information. Alternatively or additionally, the method comprises changing said information responsive to a user input.
In a preferred embodiment of the invention, said first video entity is tracked independently of said second video entity. In a preferred embodiment of the invention, said information comprises a type of tracking to use for that entity. Alternatively or additionally, said information comprises a type classification of the entity. Alternatively or additionally, said information comprises a color of the entity. Alternatively or additionally, said information comprises graphical annotation data.
Alternatively or additionally, said information comprises at least part of an appearance/disappearance profile for the entity in the video sequence. Alternatively or additionally, said information comprises a smoothness of the entity. Alternatively or additionally, said information comprises a visibility of the entity. Alternatively or additionally, said information comprises anchor points. Alternatively or additionally, said information comprises at least one motion vector. Alternatively or additionally, said information comprises parameters for key-framing. Alternatively or additionally, said information comprises depth information.
In a preferred embodiment of the invention, said type of tracking may include an instruction not to track the entity. Alternatively or additionally, said type of tracking fixes at least one degree of freedom of motion of the entity. Preferably, said at least one degree
comprises translational motion. Alternatively or additionally, said type of tracking fixes at least one limitation on changes in the entity. Preferably, said limitation comprises a size change limitation.
In a preferred embodiment of the invention, said information comprises a parameter of the tracking for that entity. Alternatively or additionally, said information comprises a script. Preferably, said script comprises a script for performing video post processing. Alternatively or additionally, said script comprises a script for updating said information.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; and identifying an interaction between said two entities.
Preferably, said first video entity is tracked independently of said second video entity. In a preferred embodiment of the invention, said interaction is an overlap between said two entities.
In a preferred embodiment of the invention, the method comprises generating a new video entity as a result of said interaction. Preferably, said new entity is a shared border between two video objects. Alternatively or additionally, said new entity replaces at least a portion of at least one of the first and second entities. In a preferred embodiment of the invention, the method comprises modifying at least one of said first and second entities, in response to said interaction. Alternatively or additionally, the method comprises deleting at least one of said first and second entities, in response to said interaction.
In preferred embodiments of the invention as described herein above, said entities comprise arbitrary points. Alternatively or additionally, said entities comprise arbitrary areas. Alternatively or additionally, said entities comprise arbitrary lines. Preferably, said lines overlap borders of objects. Alternatively, said lines do not overlap borders of objects.
Alternatively or additionally, said entities comprise portions of the video frames which are directly tracked. Alternatively or additionally, said entities have extents and wherein said extents are related to an image content.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: tracking a video entity across a plurality of frames in the sequence;
automatically performing video post-processing responsive to the tracking, which post-processing has a success criterion; and determining from the success criterion of said post-processing, at least one parameter which affects said success. Preferably, said tracking comprises tracking prominent points and wherein said parameter comprises a rule for identifying a point as a prominent point.
There is also provided in accordance with a preferred embodiment of the invention, a method of spline fitting, comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type; and generating a spline which matches at least a portion of a border of said entered graphical object and which meets an error criterion.
Preferably, the method comprises automatically sub-dividing said graphical object at prominent points of the object, and generating an individual spline for each sub-division. Preferably, said prominent points comprise points of maximum local curvature. Alternatively or additionally, said prominent points are selected based on a local pattern. Alternatively or additionally, said prominent points are selected based on a local noise analysis. Alternatively or additionally, said prominent points are selected based on being a junction of three or more colors. In a preferred embodiment of the invention, the method comprises automatically subdividing said graphical object at non-prominent points of the object, and generating an individual spline for each sub-division.
In a preferred embodiment of the invention, the method comprises adjusting parameters for said spline, such that loops in the spline are avoided. There is also provided in accordance with a preferred embodiment of the invention, a method of object border definition for tracking across a plurality of frames of a video sequence, comprising: defining a first portion of said border as a first type of graphic primitive; defining at least a second portion of said border as a second, different, type of graphic primitive; and tracking said first and second portions across said plurality of frames. Preferably, at least one of said portions is defined using a free-hand graphical object. Alternatively or additionally, at least one of said portions is defined using a spline graphical object. Alternatively or additionally, at least one of said portions is defined using a color-key.
In a preferred embodiment of the invention, the method comprises tracking said first and said second borders across a plurality of frames in a video sequence.
In a preferred embodiment of the invention, said object border comprises an incomplete border. Alternatively, said object border comprises a complete border.
There is also provided in accordance with a preferred embodiment of the invention, a method of spline fitting, comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type, which object comprises an outline associated with an object; automatically adjusting at least a portion of said outline to match at least a portion of said object; and generating a spline which matches at least a portion of said adjusted outline.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: arbitrarily identifying an area on a frame of said sequence; tracking said arbitrary area across a plurality of frames of said sequence; and applying a video post-processing to said image responsive to said tracking.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising: arbitrarily identifying a line on a frame of said sequence; tracking said arbitrary line across a plurality of frames of said sequence; and applying a video post-processing to said image responsive to said tracking. Preferably, said arbitrary line comprises a portion of a border of an object. In a preferred embodiment of the invention, said arbitrary identification is divorced from technical considerations regarding tracking.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post-processing planning, comprising: manually identifying a video entity; automatically suggesting at least one parameter related to tracking of said entity; and automatically tracking said entity across a plurality of frames of a video sequence.
Preferably, said parameter comprises a type of tracking.
There is also provided in accordance with a preferred embodiment of the invention, a method of video post processing a video sequence, comprising:
tracking a first video entity across a plurality of frames of said sequence; retouching a plurality of frames of said sequence, responsive to said tracking and responsive to one or more image portions in said sequence.
Preferably, at least one of said image portions is in the same frame as the retouched frame. Preferably, said at least one image portion comprises at least two image portions in the same frame as the retouched frame.
In a preferred embodiment of the invention, at least one of said image portions is in a different frame from the retouched frame. Preferably, said at least one image portion comprises a plurality of image portions, at least two of which are in different frames. In a preferred embodiment of the invention, said retouching comprises copying at least one of said one or more image portions into the retouched frame. Preferably, said retouching comprises texture mapping said at least one image portion into the retouched frame.
Alternatively or additionally, said retouching comprises warping said at least one image portion into the retouched frame. Alternatively or additionally, said retouching comprises blending said at least one image portion into the retouched frame. Alternatively or additionally, retouching comprises color-balancing responsive to said at least one image portion. Alternatively or additionally, retouching comprises graining responsive to said at least one image portion. Alternatively or additionally, retouching comprises noise adding responsive to said at least one image portion. Alternatively or additionally, retouching comprises adjusting an image of an inserted object, to match scene characteristics, responsive to said at least one image portion.
There is also provided in accordance with a preferred embodiment of the invention, a method of three-dimensional surface reconstruction, comprising: tracking a first video entity across a plurality of frames of a video sequence; tracking a second video entity across a plurality of frames of said sequence; and reconstructing a surface representation of at least a portion of an object, responsive to said tracking.
Preferably, said entities comprise lines. Alternatively or additionally, said entities are attached to said object. Preferably, said entities comprise border portions of at least one object. In a preferred embodiment of the invention, said first and second tracking comprise tracking changes in the shape of said entities. Alternatively or additionally, said first and second tracking comprise tracking changes in the relative locations of said entities.
In a preferred embodiment of the invention, the method comprises performing a post processing operation on said video sequence, responsive to said reconstructed surface.
In a preferred embodiment of the invention, the method comprises modifying a tracking of an entity associated with said surface, responsive to said reconstruction.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be more clearly understood from the following detailed description of the preferred embodiments of the invention and from the attached drawings, in which:
Fig. 1 is a flowchart of a method of video post-processing, in accordance with a preferred embodiment of the invention;
Fig. 2 is a schematic diagram showing an object having edges, special points, segments and sections defined thereon, in accordance with a preferred embodiment of the invention;
Fig. 3 is a schematic diagram illustrating a border definition method, in accordance with a preferred embodiment of the invention; and
Fig. 4 is a schematic diagram of an arbitrarily defined area to be tracked, in accordance with a preferred embodiment of the invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Fig. 1 is a flowchart of a method of video post-processing, in accordance with a preferred embodiment of the invention. In a typical first step, a user indicates on a first video frame an approximate border of an object to which post-processing is to be applied (10). Such indication may be at least partially performed using color-keying, functional definition of the border, template matching and matching to a previous frame, as well as by drawing of one or more graphical primitives. Thereafter, a computer may suggest a better fit between the drawn border and the object or a portion thereof. A user then corrects the computer's suggestion (12) and may repeat the first step.
The user then defines sections of the border to be tracked (14). It should be appreciated that, in some cases, a user will start with this step, dispensing with steps (10) and (12). Alternatively or additionally, the user can directly identify user defined sections by drawing them, rather than marking them on a previously drawn border. Finally, a user defines dynamic information to be evaluated and/or post-processing to be performed for each border section. It should be appreciated that a section defined by a user for post-processing purposes may sometimes correspond exactly to a segment selected by the computer for tracking purposes.
In accordance with a preferred embodiment of the invention, the border definition effects the creation of an object (or video) mask that performs a desired post-production effect. Alternatively, the definition and tracking of the border section provides location coordinates and/or other parameters to a post-processing function.
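By way of a non-limiting illustration, the rasterization of a tracked border into such a mask might look as follows. This is a minimal sketch in Python, assuming the border is available as an ordered list of (x, y) vertices; the function name border_to_mask and the use of matplotlib's Path class are illustrative choices and are not part of the described system.

import numpy as np
from matplotlib.path import Path

def border_to_mask(border_points, frame_shape):
    # Rasterize a closed border (ordered (x, y) vertices) into a binary
    # object mask with the same height and width as the video frame.
    h, w = frame_shape[:2]
    # Sample every pixel centre and test whether it falls inside the border.
    ys, xs = np.mgrid[0:h, 0:w]
    pixel_centres = np.column_stack([xs.ravel() + 0.5, ys.ravel() + 0.5])
    inside = Path(border_points).contains_points(pixel_centres)
    return inside.reshape(h, w).astype(np.uint8)

# Example: a mask for a 480x640 frame and a triangular border.
mask = border_to_mask([(100, 100), (300, 120), (200, 400)], (480, 640))

The resulting mask (or a smoothed version of it, for softer edges) can then be handed to whatever post-processing function consumes the tracked border.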
In accordance with a preferred embodiment of the invention, one or more of the following effects and combinations thereof may be performed, including:
(1) different post-processing on each side of the user defined section and/or underlying the section itself;
(2) compositing video and object insertion;
(3) blurring, especially motion blur;
(4) smoothing and other local filter applications;
(5) smearing;
(6) color adjustments, changing, correction and offsetting;
(7) local color key effects;
(8) selective addition of noise;
(9) adding graphics, such as a border in a color;
(10) wrinkle removal;
(11) video painting;
(12) resizing; and
(13) warping.
In addition, more complex effects may be performed, for example wire removal, which includes replacing the removed pixels. Another example of an effect is object speed changing, which may include adding blurring, filling in missing ("uncovered") information by copying from a previous or later image and/or warping of object portions whose image is affected by the speed of motion.
In a preferred embodiment of the invention, the information associated with each section may be outputted so that it can be used by an external device. In one preferred embodiment of the invention, this device comprises a plug-in, which may be executed on the same computer. Alternatively, the information may be outputted to a different computer. In a particular example, the information is outputted to a video source, to control it; for example, the information may be outputted to a graphical object generator.
Each of these and other post-processing effects may be a function of information associated with each user defined border section. In some preferred embodiments of the invention, any information may be associated with each section, especially information relating to the drawing and analyzing of the frame. Preferably, the associated information includes one or more of:
(1) type;
(2) color;
(3) text to be displayed in association therewith;
(4) information associated with the video sequence, such as the first frame in which it appears and/or a general appearance/disappearance profile;
(5) smoothness;
(6) visibility;
(7) user defined parameters;
(8) neighboring borders;
(9) motion vector;
(10) parameters for key-framing, including the type of interpolation to use, which frames are key frames for which section and which key-frames are minor or major; and
(11) depth information, which may be provided by a 3D camera or by analysis of the image.
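The kind of per-section record implied by the list above might be represented, purely for illustration, as the following Python data structure; the field names are assumptions made for this sketch and are not prescribed by the invention.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SectionInfo:
    # Illustrative static information attached to one user defined border section.
    section_type: str = "border"                             # (1) type
    color: Optional[tuple] = None                            # (2) color
    label_text: str = ""                                     # (3) text displayed with the section
    appearance_profile: list = field(default_factory=list)   # (4) appearance/disappearance profile
    smoothness: float = 0.0                                   # (5) smoothness
    visible: bool = True                                      # (6) visibility
    user_params: dict = field(default_factory=dict)           # (7) user defined parameters
    neighbors: list = field(default_factory=list)             # (8) neighboring borders
    motion_vector: tuple = (0.0, 0.0)                         # (9) motion vector
    keyframing: dict = field(default_factory=dict)            # (10) key-framing parameters
    depth: Optional[float] = None                             # (11) depth information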
In a preferred embodiment of the invention, dynamic information is associated with one or more of the user defined sections. Dynamic information is information which may change between video frames. Such changes may be a function of:
(1) information at other sections;
(2) information in other frames;
(3) activity of the post-processing system, including its ability to track;
(4) image content, local and remote to the section;
(5) time and/or frame count, especially relative to key-frames;
(6) history such as history of dynamic information and/or image content;
(7) global frame information; and
(8) user input.
In addition, some types of information which may be associated with and/or evaluated for various sections are inherently dynamic, for example motion vectors, which can be used for better object insertion.
In a preferred embodiment of the invention, different post-processing is performed based on the value of the dynamic information, for example, a different amount of motion blurring may be performed depending on the velocity of motion. In addition, the direction of the blurring is preferably dependent on the vector of motion, which vector may be different both in magnitude and in direction, for different sections.
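A velocity-dependent, direction-dependent blur of this kind could be realized along the following lines; this is only a sketch under the assumption that the section's motion vector is known in pixels per frame, and the simple line-shaped kernel is an illustrative choice rather than the blur model of the invention.

import numpy as np
from scipy.ndimage import convolve

def motion_blur(region, motion_vector, gain=1.0):
    # Blur a grayscale image region along its motion vector; the blur length
    # grows with the speed of the tracked section.
    vx, vy = motion_vector
    speed = np.hypot(vx, vy)
    length = max(1, int(round(gain * speed)))
    kernel = np.zeros((2 * length + 1, 2 * length + 1), dtype=float)
    # Draw a discrete line through the kernel centre in the motion direction.
    for t in np.linspace(-1.0, 1.0, 2 * length + 1):
        r = int(round(length + t * length * (vy / speed))) if speed else length
        c = int(round(length + t * length * (vx / speed))) if speed else length
        kernel[r, c] = 1.0
    kernel /= kernel.sum()
    return convolve(region.astype(float), kernel, mode="nearest")

# A faster section gets a longer blur kernel, oriented along its motion.
blurred = motion_blur(np.random.rand(64, 64), motion_vector=(6.0, 2.0))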
In a preferred embodiment of the invention, information is made dynamic by providing a script to be performed to evaluate the dynamic information for each section. Alternatively or additionally, a function is provided to evaluate the dynamic information.
In a preferred embodiment of the invention, additional types of scripts may be provided for each section, including a script relating to video post processing and a script relating to external data processing. In a preferred embodiment of the invention, a script is associated with possible events for each section. Such events may include: first appearance, last appearance, disappearance, overlapping with another section or object, interaction with an external event, being at a position, deviating from a track, loss of tracking, size change and rate of change. In addition, different scripts, for the same purpose, may be associated with various states of the system, the tracked video entities and/or user defined states.
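One possible, purely illustrative way to associate scripts with per-section events is a small registry of callables keyed by event name; the event names and the dispatch mechanism below are assumptions of this sketch, not definitions taken from the invention.

class ScriptedSection:
    def __init__(self, name):
        self.name = name
        self.event_scripts = {}   # event name -> callable(section, frame_index)

    def on(self, event, script):
        self.event_scripts[event] = script

    def fire(self, event, frame_index):
        script = self.event_scripts.get(event)
        if script is not None:
            script(self, frame_index)

elbow = ScriptedSection("elbow")
elbow.on("loss_of_tracking", lambda s, f: print(f"{s.name}: re-acquire at frame {f}"))
elbow.on("overlap", lambda s, f: print(f"{s.name}: define shared border at frame {f}"))
elbow.fire("overlap", 42)

In practice the registered scripts would update the section's dynamic information or trigger a post-processing operation rather than print a message.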
In accordance with a preferred embodiment of the invention, a set of properties of a section to be tracked, which may be dependent on local (possibly dynamic) information, relates to the tracking itself. Preferably, different sections have different types of tracking associated therewith. Alternatively or additionally, the type of tracking may change based on the history of the information at the section. Alternatively or additionally, each section may have different values for parameters relating to tracking, for example, in a correlation tracking method, the size of an ROI (region of interest) in which to perform the correlation or the size of the convolution kernel for the correlation. Another example of a tracking-related parameter is the type of position estimation to use, especially the equations of motion to be used. Another example of a tracking-related parameter is relative weights for directly tracked points. Another example is the range of values used for color-key based tracking. One type of border tracking may be based on there being a distinct difference between what is inside an object and what is outside, for example using statistical definitions, even though there is no demarcation of the border on a pixel by pixel basis. In yet another example, the border may be better defined by analyzing a region of interest around the border. For example, the border of a pair of blue pants on a blue background can be approximated by the existence of creases in the pants and not in the background. The size of the region of interest may be a property of the section being tracked and/or of its underlying segments or tracked points. As can be appreciated, the above properties and parameters may also be dynamic information.
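For concreteness, one possible realization of correlation tracking with per-section parameters is normalized template matching, where the ROI size and the template (kernel) size are read from the section's own tracking properties. The sketch below uses OpenCV's matchTemplate; it assumes single-channel frames and a point lying far enough from the frame border, and it is not presented as the tracking method of the invention.

import cv2

def track_point(prev_frame, next_frame, point, template_size=15, roi_size=41):
    # Correlation tracking of one point; template_size and roi_size would be
    # taken from the per-section tracking parameters described above.
    x, y = point
    t, r = template_size // 2, roi_size // 2
    template = prev_frame[y - t:y + t + 1, x - t:x + t + 1]
    roi = next_frame[y - r:y + r + 1, x - r:x + r + 1]
    scores = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, best, _, loc = cv2.minMaxLoc(scores)
    # Translate the best match back into frame coordinates.
    new_x = x - r + loc[0] + t
    new_y = y - r + loc[1] + t
    return (new_x, new_y), best   # 'best' can serve as a tracking-quality measure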
In a preferred embodiment of the invention, video entities, especially border sections, can be defined as not to be tracked. This definition is especially useful for rigid objects, for which, if one portion is tracked, the entire outline can be reconstructed. Thus, a considerable amount of computer resources that might otherwise be devoted to tracking can be conserved.
Additionally or alternatively, some entities may be defined to be locked in position, so that even an approximation of their position is not necessary. Alternatively or additionally, various restrictions may be placed on the allowed changes in location and/or orientation of the entity, for example, an object may only be allowed to move, move at less than a certain maximal velocity and/or rotate around a particular point. Alternatively or additionally, various restrictions may be placed on an allowed change and/or rate of change of an entity's shape and/or pixel content.
In a preferred embodiment of the invention, when such restrictions reduce the accuracy of the tracking and/or special effects, the user is automatically notified. Alternatively or additionally, the system automatically advances to a higher level of tracking ability, in which some of the restrictions may be removed.
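Such restrictions can be pictured as a projection of the estimated motion onto the allowed degrees of freedom, for example as in the following sketch; the parameter names are assumptions made for illustration only.

import numpy as np

def constrain_motion(displacement, allow=("tx", "ty"), max_speed=None):
    # Keep only the permitted translational components and clamp the speed.
    dx, dy = displacement
    if "tx" not in allow:
        dx = 0.0
    if "ty" not in allow:
        dy = 0.0
    if max_speed is not None:
        speed = np.hypot(dx, dy)
        if speed > max_speed > 0:
            dx, dy = dx * max_speed / speed, dy * max_speed / speed
    return dx, dy

# An entity restricted to horizontal motion of at most 3 pixels per frame.
print(constrain_motion((5.0, 2.0), allow=("tx",), max_speed=3.0))   # (3.0, 0.0)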
In accordance with a preferred embodiment of the invention, a particular property of a section to be tracked, which may be dependent on local (possibly dynamic) information, is the method used for reconstruction of a border segment based on tracked portions of the border. Values of this property may include using spline based interpolation or other graphic primitive based interpolation, different types of splines, fixed (exact) border based reconstruction and the level of precision to use for the tracking. Examples of parameters that affect the level of precision include whether to adjust the spline parameters periodically to match the real border and/or whether to allow adding and/or removing of points which are tracked and/or what number of points may be added and/or removed. Other examples of precision-affecting information are the type of tracking for the underlying segments and/or the limitations placed on changes in the shape of a raster (free-hand) segment.
Animation is another particular property that may be associated with a section to be tracked. In accordance with a preferred embodiment of the invention, an animation property may include a definition of a procedure which generates the animation. The animation and/or the procedure may be dependent, inter alia, on dynamic information and/or on a history of dynamic information.
In accordance with another preferred embodiment of the invention, dynamic information is utilized for key-framing. Information associated with user defined sections may include: the type of interpolation, which key-frames to use for interpolation, the number of frames allowed between key-frames and the amount of motion allowed between key-frames. Different key-frames may be defined for different sections.
In a preferred embodiment of the invention, instead of evaluating dynamic information, the information is interpolated, but not necessarily for all the frames and/or for all the sections.
In one preferred embodiment of the invention, the information is interpolated for the same section between frames, possibly using key-framing. Alternatively or additionally, the information is interpolated between two or more sections and/or between values defined for two or more points defined on a section. In accordance with a preferred embodiment of the invention, overlapping of borders is detected and/or tracked. Preferably, special scripts are associated with such overlapped borders. In addition, two border sections are preferably defined at the overlapping portion, so that the two objects can later separate.
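As a simple illustration of key-framed information, a per-section value defined only at key-frames can be linearly interpolated for the frames in between; the linear rule below is just one possible interpolation type.

import numpy as np

def interpolate_keyframes(keyframes, frame):
    # keyframes maps frame index -> value (scalar or array-like).
    frames = sorted(keyframes)
    if frame <= frames[0]:
        return keyframes[frames[0]]
    if frame >= frames[-1]:
        return keyframes[frames[-1]]
    prv = max(f for f in frames if f <= frame)
    nxt = min(f for f in frames if f >= frame)
    if nxt == prv:
        return keyframes[prv]
    w = (frame - prv) / (nxt - prv)
    return (1 - w) * np.asarray(keyframes[prv]) + w * np.asarray(keyframes[nxt])

# A section's smoothness value defined only at key-frames 0 and 10.
print(interpolate_keyframes({0: 0.2, 10: 0.8}, frame=4))   # 0.44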
Fig. 2 is a schematic diagram showing an object 20 having edges, special points, segments and sections defined thereon, in accordance with a preferred embodiment of the invention. A plurality of points 22, 24, 26 and 30 are distinguished as being suitable for being tracked between consecutive images. Such points are preferably automatically selected by the tracking system, for example as described in the above-referenced PCT publication.
Alternatively or additionally, some of the points may be selected by a user. Typically such points include junctions and/or prominent edges or corners. A segment 22-24 is defined as a border segment which connects the two tracked points. As indicated above, this segment may be tracked by tracking its end points and reconstructing it using a stored spline or another graphic primitive. Preferably, the reconstructed border segment and/or stored spline parameters are adjusted to match the exact border of the object being tracked. By providing a large number of points to be tracked, it is generally possible to associate a border in one image with a border in a second image, even if some of the points are not identified. A user defined section 28 (indicated by the dotted lines) may be defined arbitrarily and divorced from the automatically identifiable segments, such as segment 22-24 or 24-26.
In accordance with a preferred embodiment of the invention, the end points of segment 28 are identified as being anchored relative to identifiable points (such as points 22, 24, 26 and 30) or segments (such as segments 22-24 and 24-26). The selection of either points or segments may depend on the tracking method used to identify the edge over a sequence of images. Alternatively or additionally, the anchors may comprise other directly identifiable features, for example, a point on object 32, which is internal to object 20. In accordance with a preferred embodiment of the invention, interior and/or prominent and/or anchor points are selected based on their being a junction, a local pattern, a color, a transparency, a local variance of noise, an angle, a color and/or gray-scale spectra, a statistically significant difference in intensity, blurring profile, edges, or on frequency analysis and/or based on a combination of these characteristics.
In a preferred embodiment of the invention, endpoints of the sections and/or control points therein are defined as being a fixed distance and/or orientation from certain anchors. Alternatively, a range of distances is allowed. Preferably, the position is defined as a relative distance between a plurality of points. Alternatively or additionally, the relative positioning is dependent on more global parameters, such as the size of the object (which can be estimated from the length of an appropriately defined user defined section or of a particular segment). In a preferred embodiment of the invention, a control point may be used to force a minimum distance from the control point to an anchor.
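One simple way to realize such anchor-relative placement is to let each anchor "vote" for the point's position using the offset recorded when the section was defined, and to combine the votes with per-anchor weights (for example, tracking quality). This is only an illustrative sketch; the weighting scheme is an assumption.

import numpy as np

def place_relative_to_anchors(anchor_offsets, anchors, weights=None):
    anchors = np.asarray(anchors, dtype=float)          # current anchor positions
    offsets = np.asarray(anchor_offsets, dtype=float)   # offsets recorded at set-up
    if weights is None:
        weights = np.ones(len(anchors))
    votes = anchors + offsets
    return tuple(np.average(votes, axis=0, weights=np.asarray(weights, dtype=float)))

# An end point defined 20 pixels to the right of anchor A and 15 pixels above anchor B.
print(place_relative_to_anchors([(20, 0), (0, -15)], [(100, 50), (120, 70)]))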
In a preferred embodiment of the invention, when the section is tracked, it is positioned relative to the directly tracked points which are closest to it. If, however, one of these points is invisible or tracked with a low quality, a point that is further away or that has a lower stability (e.g., not on a rigid body) is used. In a preferred embodiment of the invention, an optional stickiness parameter may be defined. In this embodiment, a designated portion of the section, or an end or a control point thereof, is automatically repositioned to be aligned with, or positioned relative to, a prominent point on the object. Preferably, the prominent point is predefined by a user. Alternatively, the point may be automatically detected by the system. Preferably, this stickiness is restricted to a small number of pixels. In one example, if a straight arm is bent, it is desirable that the location of an elbow section be identified with the bend of the arm in the picture. In one preferred embodiment of the invention, this bend is automatically detected. Alternatively or additionally, the user indicates the prominent point in a later frame and the system backtracks to a first frame to define the elbow location and/or to set parameters for the stickiness. Alternatively or additionally, the system backtracks to modify a stickiness parameter, so that the motion of the user defined section will more closely resemble the actual motion. For example, it is usually undesirable for the user-defined section to move in a non-smooth motion only because the prominent point suddenly appears or disappears.
In a preferred embodiment of the invention, a user defined section may be an arbitrary line which does not follow a border. In one example, such a line may be used to split an object into two portions, each of which is to be post-processed differently. Thus, the user defined section may be defined to be related not to the contour of the border, but only to locations of points on the border. In another example, such a line may be used to define a special effect that is either inside the object or outside the object, for example a skeleton or a halo. Thus, the user-defined section may be defined to substantially conform to at least a portion of the border. In a preferred embodiment of the invention, instead of tracking the border, it may be interpolated, such as using key-framing. However, this type of pseudo-tracking is of a
generally lower quality since there is no clear indication of exactly where the border is. Thus, many types of special effects, such as object compositing, may create artifacts. Preferably, the edges of such a composited object are smoothed, to correct for any artifacts caused by not knowing exactly where the border is. In addition, key-framing usually requires a significant increase in the amount of user input required to post-process a video sequence, as each key-frame is typically manually entered and/or adjusted.
In accordance with a preferred embodiment of the invention, sub-sections can be defined relative to the sections, for example, a sub-section 34 of section 28. Preferably, the sections are organized in an object-oriented hierarchical structure, with one or more levels and/or entities being defined relative to the previous level. In one example, a positional relationship is defined between entities on different hierarchical levels. In some preferred embodiments of the invention, a section in a hierarchical level may also be anchored relative to a level two or more hierarchical stages away. In case of conflict between the different anchors, an order may be defined between anchors. Additionally, a relative weight of the anchors may be defined. In addition, various properties may be defined to be inherited from other entities and/or other levels, especially scripts for dealing with events.
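The hierarchical organization with inherited properties and scripts might be pictured as follows; the class and property names are invented for this sketch and do not reflect an actual data format of the invention.

class VideoEntity:
    # A property or script not set on an entity is inherited from its parent.
    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.children = []
        self.properties = properties
        if parent is not None:
            parent.children.append(self)

    def get(self, key, default=None):
        if key in self.properties:
            return self.properties[key]
        if self.parent is not None:
            return self.parent.get(key, default)
        return default

body = VideoEntity("body", tracking="correlation", on_overlap="define_shared_border")
arm = VideoEntity("arm", parent=body, tracking="spline")
elbow = VideoEntity("elbow", parent=arm)
print(elbow.get("tracking"))     # "spline", inherited from the arm
print(elbow.get("on_overlap"))   # "define_shared_border", inherited from the body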
In a preferred embodiment of the invention, one or more of the tracked video entities may be invisible or occluded. However, these entities still exist and post processing can be performed responsive to them, for example x-ray viewing of the hidden object through a solid object by drawing a border and a skeleton of the hidden object on the occluding solid object. In addition, other entities and/or effects may be described to be relative to the invisible entity.
Alternatively to a hierarchical structure, at least part of the information and/or video entities may be represented using other types of data structures, such as lists, with arbitrary relationships between the elements. In accordance with a preferred embodiment of the invention, the segments and/or the special points that are tracked also belong to the hierarchy. Preferably, sub-segments may be defined on a lower hierarchical level than the segments. In a preferred embodiment of the invention, information, especially information related to tracking, may be associated with each such element. Preferably, a script for information evaluation and/or video post-processing is defined for some or all the elements in the hierarchy. In some cases a user will desire to directly manipulate the segments and/or points as video entities in addition to or instead of user defined sections.
In accordance with a preferred embodiment of the invention, one or more sections to be tracked are automatically suggested, for example based on image characteristics, such as
symmetry and entropy. Alternatively, such suggested sections are defined based on a model of the object, the scene and/or the border.
In some preferred embodiments of the invention, a section may be functionally defined, for example: "the clockwise-most x blue pixels in a border segment (or section) where x is a function of the length of the entire segment". In addition, in some preferred embodiments of the invention, a section may be split into two, for example, when an object decomposes and/or explodes. In another example, if the hand in Fig. 2 is bent and/or moved to partially overlap the body, a new section may have to be defined and it may be defined as a function of the determination of the old section.
Fig. 3 is a schematic diagram illustrating a border definition method, in accordance with a preferred embodiment of the invention. An object 40, for which a border is to be defined, is a key. The border is shown distanced from the outline of the key for reasons of clarity. However, in a practical situation, the border will usually overlap the actual border of the key. In accordance with a preferred embodiment of the invention, a user may draw a suggested border (step 10 in Fig. 1) using free-hand drawing and/or using graphic primitives and/or by correlation with existing templates and/or using color keying and/or any combination of the above. As can be appreciated, such an initial entry of a border may not be very precise. However, it should be noted that the type of error for different parts of the object may be very different. In the example of Fig. 3, border portion 44 will generally be a straight line and is preferably entered using a line primitive. An expected type of error is that one or both of the end points is not properly located. Border portion 48 is a free-hand pixel-by-pixel drawing which attempts to follow the exact outline of the key. The types of error to be expected are an inexact rendering of the outline (which can also be automatically identified and drawn by the system) and/or an offset from the actual border. Border portion 50 is a spline that follows the curve of the key. Border section 52 is a section which may be entered freehand or by spline and which follows the tip of a tassel on the key.
In accordance with a preferred embodiment of the invention, the system automatically identifies possibly erroneous data entry points and queries the user as to whether and how to correct them. In one example, a point 54 on the drawn border should probably be aligned with point 56 on the key border. Point 56 may be identified by its being a prominent point in the outline of the object. Point 54 may be identified by its being a prominent point on the drawing of the border. Point 58, however, is intentionally not directly related to the outline of the key, as a user will probably indicate to the system, so it will not be corrected. In addition to
aligning control points of graphical primitives, it is possible to align lines and the like. In addition, if an approximate raster is drawn, an exact match of the raster to the border may be suggested by the system, optionally at a user's request. This exact matching may be termed a
"free-free" coπection, since the suggested coπection may also be a free-hand graphical object. Additionally or alternatively, a user can himself draw an accurate representation of the border.
When a point is confirmed, it may either be fixed in place relative to one or more identifiable anchor points or corrected to be aligned with a prominent point. When a correction is required, the point will generally move. As can be expected, when one point moves it can affect two or more border portions. In one preferred embodiment of the invention, free-hand border portions may only move and do not bend, while spline portions may also be defined to bend. Alternatively, a free-hand drawn portion may be split into two such segments, i.e., it can break at prominent points, automatically selected and/or user-suggested. Alternatively, such breaking is allowed only at a distance from prominent points. In addition, the free-hand border portion may be broken so as to minimize an alignment error. Alternatively, a free-hand border portion is treated as a group of segments, each of which is one pixel in size.
In accordance with a preferred embodiment of the invention, a spline may be generated from a free-hand drawing, for example for border portion 50. Preferably, the system identifies prominent points along the drawn portion and/or in the outline of the object and attempts to match one to the other. A minimum number of spline points is maintained which allows a good correspondence between the outline and the drawn border. In one preferred embodiment of the invention, a fixed number of control and/or sub-division points are defined. If any change is required in their number, a user may be prompted to allow a change. In a preferred embodiment of the invention, the drawn portion that is converted into a spline may also include other graphic primitives, including multiple types of splines. In a preferred embodiment of the invention, the spline is generated by the following method. A user draws a border, using any of the techniques described herein, and optionally adjusts the border to more closely follow the border of the object and/or an arbitrary feature on or off the object. If subdivision of the segment is allowed, the segment is preferably subdivided into sub-segments at prominent points, and each of the sub-segments may be individually approximated (by spline or otherwise). The number of sub-segments into which the segment is split preferably depends on the number of prominent points in the segment. Alternatively or additionally, the number of sub-segments may depend on the number of prominent points which are over a threshold of a prominence criterion. Alternatively or additionally, the number of sub-segments depends on the length of the sub-segments.
Preferably, different values for one or more of these various limitations may be associated with each section and/or segment.
A spline is fitted for a particular sub-segment (or a whole segment if sub-division is not allowed). A method for fitting a spline to an arbitrary line is described, for example, in "An Algorithm for Automatically Fitting Digital Curves", in Graphics Gems (vol. 1), edited by
Andrew J. Glassner, p. 612, published by AP Professional, Academic Press, 1990, the disclosure of which is incorporated herein by reference.
In a preferred embodiment of the invention, the spline fitting takes into account visual artifacts. In one example, if the angle between the two ends of the spline is too large, especially if each end is directed to a different side of a line connecting the two ends, the angle of each end is preferably made smaller, for example reduced by half or by some other user defined value and/or reduced to below a threshold value. This adjustment preferably increases the convergence speed of the spline fitting. In addition, it reduces the probability that there will be a cusp or a loop in the spline. In another example, if the handles of the spline cross, one or both of them are shortened so that they do not meet, since crossing of the handles may create a loop. In a preferred embodiment of the invention, if there is a cusp and/or a loop in the spline, parameters of the spline are changed so that the cusp and/or the loop are eliminated, possibly at the expense of a larger error.
After fitting, a check is made to see if the fitting results in a fit error, which is compared to a threshold. This threshold may be a constant or a function, possibly of the segment shape and/or length. In addition, the fit error may be based on an average error along the border segment, on a weighted error, which takes into account portions where precision is more important (possibly by user input), an allowed variance, the number of sub-divisions and/or iterations performed, an error in other segments, especially neighboring segments, and/or it may be based on a cumulative error value. If the error is below the threshold, the fitting is complete. Otherwise, the segment is subdivided again. If such additional subdivision is not allowed, an attempt may be made to fit a different type of graphical object, such as a different spline type, a straight line, a raster or based on a table of shapes for special situations. The additional subdivision may be performed at prominent points that, for some reason, were not included in the main subdivision. Alternatively, the subdivision is performed at arbitrary points (automatically selected), which may be directly tracked or which may be indirectly tracked by using prominent points. Preferably, this process of sub-division and fitting is repeated until the error threshold is met, a certain number of iterations has passed and/or a subdivision limit is reached.
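The subdivide-and-fit loop can be sketched as follows. This illustration uses a simple least-squares cubic Bezier fit with chord-length parameterization rather than the full Graphics Gems algorithm, and it subdivides at the point of largest error as a stand-in for a prominent point; the error threshold is a plain constant. None of these choices is prescribed by the invention.

import numpy as np

def _fit_cubic_bezier(points):
    # Least-squares cubic Bezier through the end points of `points` (n x 2),
    # using chord-length parameterization; returns the control points and the
    # per-point fit error.
    pts = np.asarray(points, dtype=float)
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d / d[-1]
    p0, p3 = pts[0], pts[-1]
    a1 = 3 * (1 - t) ** 2 * t           # basis for control point P1
    a2 = 3 * (1 - t) * t ** 2           # basis for control point P2
    rhs = pts - np.outer((1 - t) ** 3, p0) - np.outer(t ** 3, p3)
    ctrl, *_ = np.linalg.lstsq(np.column_stack([a1, a2]), rhs, rcond=None)
    bez = np.array([p0, ctrl[0], ctrl[1], p3])
    fitted = (np.outer((1 - t) ** 3, bez[0]) + np.outer(a1, bez[1])
              + np.outer(a2, bez[2]) + np.outer(t ** 3, bez[3]))
    return bez, np.linalg.norm(fitted - pts, axis=1)

def fit_border(points, tol=1.5, max_depth=6):
    # Fit a chain of cubic Beziers to a drawn border portion, subdividing at
    # the point of largest error until the error threshold is met.
    bez, err = _fit_cubic_bezier(points)
    if err.max() <= tol or max_depth == 0 or len(points) < 8:
        return [bez]
    split = min(max(int(err.argmax()), 2), len(points) - 3)  # keep both halves usable
    return (fit_border(points[:split + 1], tol, max_depth - 1)
            + fit_border(points[split:], tol, max_depth - 1))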
It should be appreciated that in some preferred embodiments of the invention there is no obligation on the system to directly track any of the points used by the user to draw the border. Generally, however, one or more of the points used by the user to define sections may also be selected by the system for the purpose of tracking the border. Conversely, the system may suggest section definitions based on the user entered border drawing.
In a particular example of definition of edge types for tracking, in accordance with a preferred embodiment of the invention, in the man of Fig. 2, the arms will generally be defined as splines and the head and hands as a raster. The hair portion border will generally be defined by color-keying relative to the raster. If a portion of the man, for example his pants, is the same color as the background, some segments may be tracked by approximation, for example the knee position may be approximated by the position of the belt and of the shoes. Preferably, such an approximation takes into account the dynamics of the human leg, i.e., if the leg is bent the location of the knee is not on a line connecting the shoe and belt.
Fig. 4 is a schematic diagram of an arbitrarily defined area to be tracked, in accordance with a preferred embodiment of the invention. In some cases it is desirable to apply a post-processing effect, such as object insertion, to an arbitrary portion of an object in a video stream. In the example of Fig. 4, an object 60 is a man, and it is desired to "attach" a sheriff's badge 68 onto his shirt. The shirt may be substantially featureless. Alternatively, it may be over-featured, for example being a dense plaid, individual points of which are very difficult to track using correlation tracking. In accordance with a preferred embodiment of the invention, a plurality of special points and/or features on object 60 are tracked directly. Such objects may include a pocket 64 and buttons 62. Alternatively or additionally, these points may include directly tracked points or segments along the border or even indirectly tracked sections along the border. An arbitrary area is defined relative to the tracked objects, which can serve as anchor points. Preferably, when the anchoring is defined, a plurality of relationships are defined between the anchors and the arbitrary area. Preferably, relationships are defined to more than three points in and/or on the border of the tracked area. Thus, the problem of determining the location and shape of the area is over-defined, rather than under-defined. In particular, it becomes possible to distort the tracked area in response to distortion of object 60. For example, when the man turns, the relative horizontal locations of all the objects change, so that the horizontal dimension of the tracked area will also change, providing a more realistic object insertion. In a preferred embodiment of the invention, the type of, and parameters relating to, the relationships and/or other aspects of tracking are also individually associated with each section.
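The over-determined relationship between the anchors and the inserted area can be illustrated with a least-squares affine fit: with more than three anchors the transform is over-constrained, so the badge stretches and shears together with the shirt. The sketch below is only one possible realization; the anchor coordinates are invented for the example.

import numpy as np

def fit_affine(src_anchors, dst_anchors):
    # Least-squares affine transform mapping anchor positions in the set-up
    # frame to their tracked positions in the current frame.
    src = np.asarray(src_anchors, dtype=float)
    dst = np.asarray(dst_anchors, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])        # (n, 3)
    affine, *_ = np.linalg.lstsq(A, dst, rcond=None)    # (3, 2)
    return affine

def transform_points(points, affine):
    pts = np.asarray(points, dtype=float)
    return np.hstack([pts, np.ones((len(pts), 1))]) @ affine

# Four anchors (pocket corners, buttons) at set-up and their tracked positions
# after the actor turns; the badge outline follows the same mapping.
affine = fit_affine([(10, 10), (60, 12), (12, 80), (58, 78)],
                    [(14, 11), (52, 13), (17, 79), (50, 77)])
badge_outline = transform_points([(30, 30), (40, 30), (35, 45)], affine)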
In a preferred embodiment of the invention, tracking of arbitrary points, areas and/or sections is used to reconstruct a three-dimensional surface from the two-dimensional locations. Preferably, such reconstruction is used as an input for tracking arbitrary areas and/or sections and/or for performing post-processing. The 3D information is preferably reconstructed by tracking the relative changes in the locations of tracked points and borders and applying various assumptions regarding rigidity. Such assumptions are preferably provided from a model of the 3D object being tracked. In addition, parallax effects and object hiding effects may also be detected.
In a preferred embodiment of the invention, arbitrary points are tracked in addition to or instead of tracking areas or border sections. It should be appreciated that it is possible to track not only the position of the arbitrary point but also its orientation relative to the rest of object 60 or portions thereof or other objects in the scene.
In a preferred embodiment of the invention, information, scripts, hierarchical structure and/or other features of the invention as described above with respect to arbitrary border section tracking may also be applied to arbitrary tracked areas.
It should be appreciated that although many of the above embodiments have been described for video sequences which comprise consecutive image frames, this is not necessary in all embodiments of the invention. In one preferred embodiment of the invention, the above methods are used to find a correspondence between two images of the same or similar scenes, at different times and/or different orientations and/or different color filters and/or different photographic parameters. Additionally or alternatively, these methods may be used for two parallel video scenes which contain similar video entities, border portions or other features which can be tracked. Additionally or alternatively, these methods may be used to simultaneously apply special effects or object insertions to a left and a right stereo pair. It should be appreciated that the above described apparatus and methods for video post-processing contain many features, not all of which need be practiced in all embodiments of the invention. Rather, various embodiments of the invention will utilize only some of the above described techniques, features or methods and/or combinations thereof. In addition, although the present invention has been described mainly with reference to methods, the scope of the invention is intended to cover also apparatus for performing such methods and especially software for doing so. The scope of the invention also includes computer readable media on which such software is stored. In a preferred embodiment of the invention, the system is embodied as a plug-in software module for an existing post-production program, such as
Adobe® After-Effects®, which can be run on a general purpose computer, such as an Apple®
Macintosh® or an IBM PC Pentium® computer. Alternatively or additionally, it may be at least partially embodied using special purpose hardware that preferably includes a video acquisition card.
It will be appreciated by a person skilled in the art that the present invention is not limited by what has thus far been described. Rather, the present invention is limited only by the claims which follow.

Claims

1. A method of video post processing a video sequence, comprising: identifying a first video entity in at least one frame of the video sequence; identifying at least a second video entity in the video sequence; and automatically performing post-processing on a plurality of frames of the video sequence responsive to an interaction of the two entities.
2. A method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; automatically applying a first post-processing operation to at least a portion of at least one of said frames, said post processing being associated with said first entity; and substantially simultaneously applying a second post-processing operation to at least a second portion of one of said frames, said post processing being associated with said second entity, wherein said first post-processing operation is substantially different from said second post-processing operation.
3. A method according to claim 2, wherein said first and second post-processing operations utilize different parameter values for a similar post-processing operation.
4. A method according to claim 2, wherein said first and second post-processing operations comprise different post processing techniques.
5. A method according to claim 2, wherein said two entities comprise portions of a single object.
6. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises adding a shadow.
7. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises compositing.
8. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises blurring.
9. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises applying a local filter.
10. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises smearing.
11. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises color adjustments.
12. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises local color keying.
13. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises local addition of noise.
14. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises graphical annotation.
15. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises wrinkle removal.
16. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises video painting.
17. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises object resizing.
18. A method according to claim 2, wherein at least one of said first and said second post-processing operations comprises warping.
19. A method according to any of claims 2-18, wherein said first video entity is tracked independently of said second video entity.
20. A method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; associating different post-processing information with the different entities.
21. A method according to claim 20, comprising changing the information between frames.
22. A method according to claim 21, comprising changing said information responsive to image content.
23. A method according to claim 22, wherein said content is at a location remote from the location of the entity with which it is associated.
24. A method according to claim 21, comprising changing said information responsive to a performance of said tracking.
25. A method according to claim 21, comprising changing said information responsive to a performance of said tracking.
26. A method according to claim 21, comprising changing said information responsive to information associated with a different entity.
27. A method according to claim 21, comprising changing said information in a current frame responsive to information associated with a different frame.
28. A method according to claim 21, comprising changing said information responsive to a frame-count parameter.
29. A method according to claim 21, comprising changing said information responsive to a history of values for said information.
30. A method according to claim 21, comprising changing said information responsive to a user input.
31. A method according to any of claims 20-30, wherein said first video entity is tracked independently of said second video entity.
32. A method according to any of claims 20-30, wherein said information comprises a type of tracking to use for that entity.
33. A method according to claim 32, wherein said type of tracking includes an instruction not to track the entity.
34. A method according to claim 32, wherein said type of tracking fixes at least one degree of freedom of motion of the entity.
35. A method according to claim 34, wherein said at least one degree comprises translational motion.
36. A method according to claim 32, wherein said type of tracking fixes at least one limitation on changes in the entity.
37. A method according to claim 36, wherein said limitation comprises a size change limitation.
38. A method according to any of claims 20-30, wherein said information comprises a type classification of the entity.
39. A method according to any of claims 20-30, wherein said information comprises a color of the entity.
40. A method according to any of claims 20-30, wherein said information comprises graphical annotation data.
41. A method according to any of claims 20-30, wherein said information comprises at least part of an appearance/disappearance profile for the entity in the video sequence.
42. A method according to any of claims 20-30, wherein said information comprises a smoothness of the entity.
43. A method according to any of claims 20-30, wherein said information comprises a visibility of the entity.
44. A method according to any of claims 20-30, wherein said information comprises anchor points.
45. A method according to any of claims 20-30, wherein said information comprises at least one motion vector.
46. A method according to any of claims 20-30, wherein said information comprises parameters for key-framing.
47. A method according to any of claims 20-30, wherein said information comprises depth information.
48. A method according to any of claims 20-30, wherein said information comprises a parameter of the tracking for that entity.
49. A method according to any of claims 20-30, wherein said information comprises a script.
50. A method according to claim 49, wherein said script comprises a script for performing video post processing.
51. A method according to claim 49, wherein said script comprises a script for updating said information.
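By way of illustration only, the following minimal Python sketch shows one possible way to associate distinct, per-frame post-processing information with independently tracked entities, in the spirit of claims 20-51; the class, field and rule names are hypothetical and do not appear in the application.

from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class TrackedEntity:
    # One tracked video entity together with its associated
    # post-processing information (claim 20).
    name: str
    info: Dict[str, Any] = field(default_factory=dict)
    history: List[Dict[str, Any]] = field(default_factory=list)  # claim 29

    def update_info(self, frame_index: int,
                    rule: Callable[[int, Dict[str, Any]], Dict[str, Any]]) -> None:
        # Change the information between frames (claim 21), here driven by a
        # frame-count parameter (claim 28) or any other supplied rule.
        self.history.append(dict(self.info))
        self.info = rule(frame_index, self.info)

# Two entities tracked independently (claim 31) with different information.
ball = TrackedEntity("ball", info={"tracking_type": "full", "color": (255, 0, 0)})
logo = TrackedEntity("logo", info={"tracking_type": "translation_only", "visibility": 1.0})

# Example rule: fade the logo's visibility over 50 frames (claims 28, 43).
fade = lambda f, info: {**info, "visibility": max(0.0, 1.0 - f / 50.0)}
for frame in range(3):
    logo.update_info(frame, fade)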
52. A method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence; tracking a second video entity across a plurality of frames of said sequence; and identifying an interaction between said two entities.
53. A method according to claim 52, wherein said first video entity is tracked independently of said second video entity.
54. A method according to claim 52, wherein said interaction is an overlap between said two entities.
55. A method according to claim 52, comprising generating a new video entity as a result of said interaction.
56. A method according to claim 55, wherein said new entity is a shared border between two video objects.
57. A method according to any of claims 55-56, wherein said new entity replaces at least a portion of at least one of the first and second entities.
58. A method according to any of claims 52-56, comprising modifying at least one of said first and second entities, in response to said interaction.
59. A method according to any of claims 52-56, comprising deleting at least one of said first and second entities, in response to said interaction.
60. A method according to any of claims 1-18, 20-30 or 52-56, wherein said entities comprise arbitrary points.
61. A method according to any of claims 1-18, 20-30 or 52-56, wherein said entities comprise arbitrary areas.
62. A method according to any of claims 1-18, 20-30 or 52-56, wherein said entities comprise arbitrary lines.
63. A method according to claim 62, wherein said lines overlap borders of objects.
64. A method according to claim 62, wherein said lines do not overlap borders of objects.
65. A method according to any of claims 1-18, 20-30 or 52-56, wherein said entities comprise portions of the video frames which are directly tracked.
66. A method according to any of claims 1-18, 20-30 or 52-56, wherein said entities have extents and wherein said extents are related to an image content.
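As a purely illustrative sketch of the interaction detection of claims 52-59 (using hypothetical names, and assuming each tracked entity is available as a boolean pixel mask), the overlap of two entities can be computed directly and, if non-empty, treated as a new entity.

import numpy as np

def overlap(mask_a: np.ndarray, mask_b: np.ndarray) -> np.ndarray:
    # The interaction considered here is an overlap (claim 54): the logical
    # AND of the two entity masks.  A non-empty result may itself become a
    # new video entity (claim 55), e.g. a shared border region (claim 56).
    return np.logical_and(mask_a, mask_b)

a = np.zeros((6, 8), dtype=bool); a[1:4, 1:5] = True   # first tracked entity
b = np.zeros((6, 8), dtype=bool); b[2:6, 3:7] = True   # second tracked entity
shared = overlap(a, b)
if shared.any():
    # In response, one of the entities could be modified or deleted
    # (claims 58-59); here we merely report the interaction.
    print("entities interact over", int(shared.sum()), "pixels")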
67. A method of video post processing a video sequence, comprising: tracking a video entity across a plurality of frames in the sequence; automatically performing video post-processing responsive to the tracking, which post-processing has a success criterion; and determining, from the success criterion of said post-processing, at least one parameter which affects said success.
68. A method according to claim 67, wherein said tracking comprises tracking prominent points and wherein said parameter comprises a rule for identifying a point as a prominent point.
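A reduced sketch of the feedback described in claims 67-68 might look as follows, assuming a hypothetical scoring function that runs the tracking and post-processing for a given prominent-point threshold and returns a success score between 0 and 1; none of these names are taken from the application.

from typing import Callable

def tune_prominence_threshold(score: Callable[[float], float], threshold: float,
                              target: float = 0.8, step: float = 0.05,
                              max_iter: int = 20) -> float:
    # Repeatedly run tracking and post-processing, and lower the
    # prominent-point threshold until the success criterion is met
    # (claims 67-68) or the iteration budget is exhausted.
    for _ in range(max_iter):
        if score(threshold) >= target:
            break
        threshold *= (1.0 - step)
    return threshold

# Toy usage with a made-up score that improves as the threshold drops.
best = tune_prominence_threshold(lambda t: 1.0 - t, threshold=0.5)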
69. A method of spline fitting, comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type; and generating a spline which matches at least a portion of a border of said entered graphical object and which meets an error criterion.
70. A method according to claim 69, comprising automatically sub-dividing said graphical object at prominent points of the object and generating an individual spline for each sub-division.
71. A method according to claim 70, wherein said prominent points comprise points of maximum local curvature.
72. A method according to claim 70, wherein said prominent points are selected based on a local pattern.
73. A method according to claim 70, wherein said prominent points are selected based on a local noise analysis.
74. A method according to claim 70, wherein said prominent points are selected based on being a junction of three or more colors.
75. A method according to any of claims 69-74, comprising automatically sub-dividing said graphical object at non-prominent points of the object and generating an individual spline for each sub-division.
76. A method according to any of claims 69-74, comprising adjusting parameters for said spline, such that loops in the spline are avoided.
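For claims 69-76, the sketch below (hypothetical helper names, numpy only) sub-divides a hand-entered border polyline at its points of maximum local curvature and fits one cubic segment per sub-division, falling back to the raw points when a fit does not meet the error criterion; it is an illustration under those assumptions, not the fitting method of the application.

import numpy as np

def prominent_points(pts: np.ndarray, k: int = 3) -> np.ndarray:
    # Indices of the k interior points of maximum local curvature (claim 71),
    # estimated from the turning angle at each interior point.
    v1, v2 = pts[1:-1] - pts[:-2], pts[2:] - pts[1:-1]
    cross = v1[:, 0] * v2[:, 1] - v1[:, 1] * v2[:, 0]
    dot = (v1 * v2).sum(axis=1)
    angle = np.abs(np.arctan2(cross, dot))
    return np.sort(np.argsort(angle)[-k:] + 1)

def fit_segments(pts: np.ndarray, max_err: float = 1.0):
    # Sub-divide at prominent points and fit an individual cubic to each
    # sub-division (claim 70); a fit is kept only if it meets the error
    # criterion of claim 69, otherwise the raw points are retained.
    cuts = [0, *prominent_points(pts).tolist(), len(pts) - 1]
    out = []
    for a, b in zip(cuts[:-1], cuts[1:]):
        seg = pts[a:b + 1]
        t = np.linspace(0.0, 1.0, len(seg))
        deg = min(3, len(seg) - 1)
        cx, cy = np.polyfit(t, seg[:, 0], deg), np.polyfit(t, seg[:, 1], deg)
        fitted = np.stack([np.polyval(cx, t), np.polyval(cy, t)], axis=1)
        err = np.hypot(*(fitted - seg).T).max()
        out.append((cx, cy) if err <= max_err else seg)
    return out

# Toy usage: a "V" shaped border whose corner is the prominent point.
pts = np.array([[x, abs(x - 5.0)] for x in range(11)], dtype=float)
segments = fit_segments(pts)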
77. A method of object border definition for tracking across a plurality of frames of a video sequence, comprising: defining a first portion of said border as a first type of graphic primitive; defining at least a second portion of said border as a second, different, type of graphic primitive; and tracking said first and second portions across said plurality of frames.
78. A method according to claim 77, wherein at least one of said portions is defined using a free-hand graphical object.
79. A method according to claim 77, wherein at least one of said portions is defined using a spline graphical object.
80. A method according to claim 77, wherein at least one of said portions is defined using a color-key.
81. A method according to any of claims 77-80, comprising tracking said first and said second borders across a plurality of frames in a video sequence.
82. A method according to any of claims 77-80, wherein said object border comprises an incomplete border.
83. A method according to any of claims 77-80, wherein said object border comprises a complete border.
84. A method of spline fitting, comprising: manually entering a graphical object, comprising more than one graphical primitive of more than one type, which object comprises an outline associated with an object; automatically adjusting at least a portion of said outline to match at least a portion of said object; and generating a spline which matches at least a portion of said adjusted outline.
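Claims 77-84 allow a single border to be assembled from portions of different primitive types. One possible, entirely hypothetical representation with a toy per-frame update is sketched below; a real tracker would re-estimate each portion from the image rather than apply a single translation.

from dataclasses import dataclass
from typing import List, Tuple, Union
import numpy as np

@dataclass
class FreehandPortion:
    points: np.ndarray          # raw user-drawn polyline (claim 78)

@dataclass
class SplinePortion:
    control_points: np.ndarray  # spline control points (claim 79)

@dataclass
class ColorKeyPortion:
    key_color: Tuple[int, int, int]  # claim 80
    tolerance: int

# A (possibly incomplete, claim 82) object border built from portions of
# different primitive types (claim 77); each portion is tracked separately.
BorderPortion = Union[FreehandPortion, SplinePortion, ColorKeyPortion]
border: List[BorderPortion] = [
    FreehandPortion(np.array([[0, 0], [5, 2], [9, 3]], dtype=float)),
    SplinePortion(np.array([[9, 3], [12, 8], [9, 14]], dtype=float)),
    ColorKeyPortion(key_color=(0, 255, 0), tolerance=30),
]

def track_border(border: List[BorderPortion], motion: np.ndarray) -> List[BorderPortion]:
    # Toy per-frame update: translate geometric portions by an estimated
    # motion vector; color-keyed portions are re-derived from the image and
    # need no geometric update in this sketch.
    out: List[BorderPortion] = []
    for p in border:
        if isinstance(p, FreehandPortion):
            out.append(FreehandPortion(p.points + motion))
        elif isinstance(p, SplinePortion):
            out.append(SplinePortion(p.control_points + motion))
        else:
            out.append(p)
    return out

next_border = track_border(border, motion=np.array([1.5, -0.5]))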
85. A method of video post processing a video sequence, comprising: arbitrarily identifying an area on a frame of said sequence; tracking said arbitrary area across a plurality of frames of said sequence; and applying a video post-processing to said image responsive to said tracking.
86. A method of video post processing a video sequence, comprising: arbitrarily identifying a line on a frame of said sequence; tracking said arbitrary line across a plurality of frames of said sequence; and applying a video post-processing to said image responsive to said tracking.
87. A method according to claim 86, wherein said arbitrary line comprises a portion of a border of an object.
88. A method according to any of claims 85-87, wherein said arbitrary identification is divorced from technical considerations regarding tracking.
89. A method of video post-processing planning, comprising: manually identifying a video entity; automatically suggesting at least one parameter related to tracking of said entity; and automatically tracking said entity across a plurality of frames of a video sequence.
90. A method according to claim 89, wherein said parameter comprises a type of tracking.
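An arbitrary rectangular area of the kind contemplated by claims 85-90 could, for instance, be followed from frame to frame by exhaustive block matching; the search radius is the sort of tracking parameter that could be suggested automatically (claim 89). The following is a sketch under those assumptions, with made-up names, not the tracking method of the application.

import numpy as np

def track_area(prev: np.ndarray, cur: np.ndarray, box, search: int = 8):
    # box = (top, left, height, width) of the arbitrarily identified area
    # (claim 85) in the previous grayscale frame; returns the best-matching
    # box in the current frame by sum-of-squared-differences search.
    y, x, h, w = box
    template = prev[y:y + h, x:x + w].astype(float)
    best, best_pos = np.inf, (y, x)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ny, nx = y + dy, x + dx
            if ny < 0 or nx < 0 or ny + h > cur.shape[0] or nx + w > cur.shape[1]:
                continue
            ssd = float(((cur[ny:ny + h, nx:nx + w].astype(float) - template) ** 2).sum())
            if ssd < best:
                best, best_pos = ssd, (ny, nx)
    return (*best_pos, h, w)

# Toy usage: the whole frame shifts by (2, 3) pixels between frames.
prev = np.random.rand(120, 160)
cur = np.roll(prev, (2, 3), axis=(0, 1))
print(track_area(prev, cur, box=(40, 60, 16, 16)))   # expected near (42, 63, 16, 16)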
91. A method of video post processing a video sequence, comprising: tracking a first video entity across a plurality of frames of said sequence; retouching a plurality of frames of said sequence, responsive to said tracking and responsive to one or more image portions in said sequence.
92. A method according to claim 91, wherein at least one of said image portions is in the same frame as the retouched frame.
93. A method according to claim 92, wherein said at least one image portion comprises at least two image portions in the same frame as the retouched frame.
94. A method according to any of claims 91-93, wherein at least one of said image portions is in a different frame from the retouched frame.
95. A method according to claim 94, wherein said at least one image portion comprises a plurality of image portions, at least two of which are in different frames.
96. A method according to any of claims 91-93, wherein said retouching comprises copying at least one of said one or more image portions into the retouched frame.
97. A method according to any of claims 91-93, wherein said retouching comprises texture mapping said at least one image portion into the retouched frame.
98. A method according to any of claims 91-93, wherein said retouching comprises warping said at least one image portion into the retouched frame.
99. A method according to any of claims 91-93, wherein said retouching comprises blending said at least one image portion into the retouched frame.
100. A method according to any of claims 91-93, wherein retouching comprises color-balancing responsive to said at least one image portion.
101. A method according to any of claims 91-93, wherein retouching comprises graining responsive to said at least one image portion.
102. A method according to any of claims 91-93, wherein retouching comprises noise adding responsive to said at least one image portion.
103. A method according to any of claims 91-93, wherein retouching comprises adjusting an image of an inserted object, to match scene characteristics, responsive to said at least one image portion.
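As a non-authoritative sketch of the retouching of claims 91-103 (hypothetical function and parameter names), an image portion taken from another frame can be blended into the tracked location of the frame being retouched.

import numpy as np

def retouch(frame: np.ndarray, patch: np.ndarray, top_left, alpha: float = 0.8) -> np.ndarray:
    # Blend an image portion (claim 99) -- possibly copied from a different
    # frame (claim 94) -- into the retouched frame at the location supplied
    # by the tracker; alpha = 1.0 degenerates to plain copying (claim 96).
    y, x = top_left
    h, w = patch.shape[:2]
    out = frame.astype(float).copy()
    out[y:y + h, x:x + w] = alpha * patch + (1.0 - alpha) * out[y:y + h, x:x + w]
    return np.clip(out, 0, 255).astype(np.uint8)

# Toy usage on an 8-bit grayscale frame.
frame = np.full((100, 100), 128, dtype=np.uint8)
patch = np.full((20, 20), 30, dtype=np.uint8)
print(retouch(frame, patch, top_left=(40, 40))[50, 50])   # blended value, about 50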
104. A method of three-dimensional surface reconstruction, comprising: tracking a first video entity across a plurality of frames of a video sequence; tracking a second video entity across a plurality of frames of said sequence; and reconstructing a surface representation of at least a portion of an object, responsive to said tracking.
105. A method according to claim 104, wherein said entities comprise lines.
106. A method according to claim 104 or claim 105, wherein said entities are attached to said object.
107. A method according to claim 106, wherein said entities comprise border portions of at least one object.
108. A method according to any of claims 104-105, wherein said first and second tracking comprise tracking changes in the shape of said entities.
109. A method according to any of claims 104-105, wherein said first and second tracking comprise tracking changes in the relative locations of said entities.
110. A method according to any of claims 104-105, comprising performing a post processing operation on said video sequence, responsive to said reconstructed surface.
111. A method according to any of claims 104-105, comprising modifying a tracking of an entity associated with said surface, responsive to said reconstruction.
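Finally, for the surface reconstruction of claims 104-111, a heavily reduced sketch is given below. It assumes a purely lateral camera translation, so that depth follows from the stereo-style parallax of the tracked entity points; all names and the simplification itself are assumptions, not the reconstruction method of the application.

import numpy as np

def depth_from_parallax(positions_f1: np.ndarray, positions_f2: np.ndarray,
                        camera_shift: float, focal: float) -> np.ndarray:
    # Under a purely lateral camera translation, the depth of each tracked
    # point is inversely proportional to its image parallax between two
    # frames (z = focal * shift / disparity), as in stereo triangulation;
    # the relative locations of the entities drive the result (claim 109).
    disparity = np.abs(positions_f1[:, 0] - positions_f2[:, 0])
    disparity = np.where(disparity < 1e-6, 1e-6, disparity)  # avoid divide-by-zero
    return focal * camera_shift / disparity

# Tracked points of two entities (e.g. border portions, claim 107) in two frames.
p1 = np.array([[100.0, 50.0], [220.0, 80.0]])
p2 = np.array([[110.0, 50.0], [223.0, 80.0]])
print(depth_from_parallax(p1, p2, camera_shift=0.1, focal=800.0))  # nearer point moves more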
PCT/IL1998/000383 1998-04-05 1998-08-13 Feature motivated tracking and processing WO1999052063A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IL123955 1998-04-05
US9602498P 1998-08-10 1998-08-10

Publications (1)

Publication Number Publication Date
WO1999052063A1 true WO1999052063A1 (en) 1999-10-14

Family

ID=22254758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL1998/000383 WO1999052063A1 (en) 1998-04-05 1998-08-13 Feature motivated tracking and processing

Country Status (1)

Country Link
WO (1) WO1999052063A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010151137A1 (en) 2009-06-24 2010-12-29 Tandberg Telecom As Method and device for modifying a composite video signal layout

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5052045A (en) * 1988-08-29 1991-09-24 Raytheon Company Confirmed boundary pattern matching
US5430496A (en) * 1992-04-29 1995-07-04 Canon Kabushiki Kaisha Portable video animation device for creating a real-time animated video by combining a real-time video signal with animation image data
US5512939A (en) * 1994-04-06 1996-04-30 At&T Corp. Low bit rate audio-visual communication system having integrated perceptual speech and video coding
US5550928A (en) * 1992-12-15 1996-08-27 A.C. Nielsen Company Audience measurement system and method
US5611036A (en) * 1990-11-30 1997-03-11 Cambridge Animation Systems Limited Apparatus and method for defining the form and attributes of an object in an image
US5657251A (en) * 1995-10-02 1997-08-12 Rockwell International Corporation System and process for performing optimal target tracking

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5052045A (en) * 1988-08-29 1991-09-24 Raytheon Company Confirmed boundary pattern matching
US5611036A (en) * 1990-11-30 1997-03-11 Cambridge Animation Systems Limited Apparatus and method for defining the form and attributes of an object in an image
US5692117A (en) * 1990-11-30 1997-11-25 Cambridge Animation Systems Limited Method and apparatus for producing animated drawings and in-between drawings
US5430496A (en) * 1992-04-29 1995-07-04 Canon Kabushiki Kaisha Portable video animation device for creating a real-time animated video by combining a real-time video signal with animation image data
US5550928A (en) * 1992-12-15 1996-08-27 A.C. Nielsen Company Audience measurement system and method
US5512939A (en) * 1994-04-06 1996-04-30 At&T Corp. Low bit rate audio-visual communication system having integrated perceptual speech and video coding
US5657251A (en) * 1995-10-02 1997-08-12 Rockwell International Corporation System and process for performing optimal target tracking

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010151137A1 (en) 2009-06-24 2010-12-29 Tandberg Telecom As Method and device for modifying a composite video signal layout

Similar Documents

Publication Publication Date Title
US6124864A (en) Adaptive modeling and segmentation of visual image streams
Xu et al. Video-based characters: creating new human performances from a multi-view video database
Zhang et al. Geometry-driven photorealistic facial expression synthesis
Beeler et al. High-quality passive facial performance capture using anchor frames.
Zhang et al. Robust and rapid generation of animated faces from video images: A model-based modeling approach
US9036898B1 (en) High-quality passive performance capture using anchor frames
US6249285B1 (en) Computer assisted mark-up and parameterization for scene analysis
KR100720309B1 (en) Automatic 3D modeling system and method
US7065233B2 (en) Rapid computer modeling of faces for animation
US20080018668A1 (en) Image Processing Device and Image Processing Method
US9224245B2 (en) Mesh animation
KR20120054550A (en) Method and device for detecting and tracking non-rigid objects in movement, in real time, in a video stream, enabling a user to interact with a computer system
WO1999026198A2 (en) System and method for merging objects into an image sequence without prior knowledge of the scene in the image sequence
US10467793B2 (en) Computer implemented method and device
JP4246516B2 (en) Human video generation system
WO2001026050A2 (en) Improved image segmentation processing by user-guided image processing techniques
Mori et al. Efficient use of textured 3D model for pre-observation-based diminished reality
WO2009151755A2 (en) Video processing
Wenninger et al. Realistic virtual humans from smartphone videos
Mori et al. InpaintFusion: incremental RGB-D inpainting for 3D scenes
JP3411469B2 (en) Frame multiplex image creation method
Snavely et al. Stylizing 2.5-d video
Fischer et al. Measuring the discernability of virtual objects in conventional and stylized augmented reality
Park et al. Virtual object placement in video for augmented reality
WO1999052063A1 (en) Feature motivated tracking and processing

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): US

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 09680663

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: CA

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase