US20140218358A1 - Automatic tracking matte system - Google Patents


Info

Publication number
US20140218358A1
Authority
US
United States
Prior art keywords
matte
camera
undistorted
live action
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/344,878
Inventor
Newton Eliot Mack
Phillip R. Mass
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lightcraft Tech LLC
Original Assignee
Lightcraft Tech LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lightcraft Tech LLC
Priority to US14/344,878
Publication of US20140218358A1
Assigned to LIGHTCRAFT TECHNOLOGY LLC. Assignment of assignors interest (see document for details). Assignors: MACK, NEWTON ELIOT; MASS, PHILIP R

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/156 Mixing image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Definitions

  • This invention relates to systems and methods for combining real scene elements from a video, film, digital type camera or the like with virtual scene elements from a virtual camera into a finished composite image, and more particularly, to systems and methods for creating drawn shapes that move along with the camera's motion to control how the virtual and real scene elements are combined.
  • the state of the art in combining real world imagery with additional imagery from another source is a process that requires careful control over which sections of each image are to be used in the final composite image.
  • One common application is to combine images generated by a computer with images acquired from a traditional motion picture, video or digital camera.
  • the areas of each image that are to be preserved or modified must be defined. These areas are typically called mattes.
  • Mattes may be defined in a number of ways. In traditional compositing, the mattes are frequently defined by having an artist mark points around the perimeter of the object to be preserved or removed. The computer then connects the dots to form a closed shape, which forms the matte. Problems can arise, however, if the object and/or the camera move relative to the other.
  • a moving camera or object is handled by making a tracking matte, or a matte that moves along with the object. While the methods of moving the matte along with the object vary, they typically center around having the user specify an area of high contrast in the live action image, measuring how that image moves around in the frame, and connecting the motion of the drawn matte to the motion of the high contrast object.
  • This process works, but has several limitations. Firstly, if the high contrast area is located on the front of a character's shirt, for example, and the character turns around, or if the camera moves around to another side of the character, the local effect is destroyed. Secondly, the process of measuring the camera motion by tracking the individual pixels of the high contrast part of the image is both fragile and time-consuming if there is no additional camera data to work from. It typically cannot be computed in real time, and if a frame of the live action image has a lighting change where the pattern is unrecognizable, the artist must re-specify the high contrast area at the frame of failure to continue the process. The process of creating all of the multiple overlapping mattes that are used in a sophisticated visual effects shot can exceed the time required to complete the rest of the shot due to the handwork required.
  • the high contrast area that was being tracked can simply disappear from the image, resulting in the matte failing to track the camera lens change.
  • the pixel tracking based methods do not work well for the demands of real time visual effects processing, which must be very rapid to compute as well as robust to the frame by frame changes in the live action video image.
  • In real time processing, mattes have traditionally been created by surveying the edges of the green screen background using an architectural measurement tool such as a total station, and creating a model of the matte in 3D space.
  • models of this type cannot be rapidly modified by the artist under typical time pressure conditions found in entertainment production.
  • an artist selects points on a computer screen to generate a rough outline around the object to be removed or preserved. These points are selected using a 2D display of the live action image, typically by locating a pointer in the desired location and pressing a selection button. The user clicks a mouse around the border of the object, and then selects the inside or the outside of the finished outline to determine on which side of the line the matte will be active. The user can also begin or end the outline at an edge of the screen, in which case the system extrapolates the matte for a given distance out from the edge of the screen. This distance can be five meters or more, generally between one and ten meters.
  • the above process generates a 2D outline.
  • the shape must be converted to a 3D representation.
  • This 3D shape can be a set of attached polygons whose outer perimeter matches the outline of the points that the user selected.
  • the 3D polygon mesh exists at a given point in 3D space.
  • the 3D polygon mesh can be created in a plane normal to the axis of the main virtual camera when the matte is initiated, and at a distance specified by the user.
  • the 2D representation is viewed from the position of the current live action camera.
  • the 3D mesh can be projected from a virtual camera with the same position and orientation as the live action camera.
  • the further away from the camera the polygon mesh is moved, the larger it must become for the 2D points to remain in the same relative position on the live action image.
  • This computation can be done automatically by geometric projection as the user moves the 3D polygon mesh closer or further away from the virtual camera. This automatic calculation can take into account the current position and orientation of the virtual and live action cameras, the current focal length and distortion of the cameras, and the sensor size of the cameras.
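  • The rescaling described above can be sketched as follows. This is a minimal illustration, assuming a planar mesh and a stored camera position from the moment the matte was created. The function name and argument layout are hypothetical and are not taken from the disclosure, and lens distortion is ignored here.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def rescale_mesh_about_camera(points: List[Vec3], camera_pos: Vec3,
                              old_distance: float,
                              new_distance: float) -> List[Vec3]:
    """Slide every vertex along its ray from the camera position.

    Scaling about the camera by new_distance / old_distance moves a plane
    that sat at old_distance out to new_distance while leaving the 2D
    projection of each vertex unchanged.
    """
    if old_distance <= 0 or new_distance <= 0:
        raise ValueError("distances must be positive")
    scale = new_distance / old_distance
    return [
        tuple(c + scale * (p - c) for p, c in zip(point, camera_pos))
        for point in points
    ]


# Doubling the distance doubles the mesh size as seen from the camera.
mesh = [(1.0, 2.0, 4.0), (-1.0, 0.5, 4.0)]
moved = rescale_mesh_about_camera(mesh, (0.0, 0.0, 0.0), 4.0, 8.0)
```

  • Because each vertex stays on the same ray through the stored camera position, the ratio of its lateral offset to its depth is preserved, which is why the outline stays registered to the live action image.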
  • the camera may have moved in this interval, but to keep the points aligned correctly with the original object, the normal along which the 3D mesh is scaled up or down must be known.
  • the mesh points can be manipulated by the artist directly in the 2D user interface, but may be constrained to move only in the original 3D plane in which they were created.
  • a unified matte system is created with individual points that are entered either on the screen in a 2D form as described herein, or directly in 3D from survey coordinate data. Once a given polygon is entered, the various points can be forced into a plane. This plane then defines where the individual points can move when later edited. Thereby, the artist can simply click and drag on an existing matte point to edit it, knowing that it will stay in the plane in which it was created.
  • the 3D mesh object(s) is (are) rendered in separate passes, and grouped together to form the overall set of despill, garbage, or other types of mattes.
  • the points of the mesh can also be entered using 3D survey data.
  • This 3D data can be determined in a variety of ways, including photogrammetry techniques and laser surveying instruments such as a total station.
  • the first three entered points of survey data can be used to set the plane of the rest of the entered survey points of that polygon.
  • the mesh can be made to move along with a separate form of tracking.
  • a separate motion capture system can measure the 3D location of a person, face, or object in real time, and locate the 3D matte mesh at the location of the person.
  • the basic matte shape can be used for many different applications such as a garbage matte (removal of foreground), a despill matte (removal of extra blue or green color), a color grading matte (selective enhancement of one area of the scene's color), and so forth.
  • the matte distance can be set automatically by measuring the distance from the camera to the subject, such as by acoustic or optical methods, or by measuring the current focus distance from the lens system.
  • a method for creating mattes whose shape can be drawn by an artist, but which tracks automatically as the camera or object moves, is provided.
  • the computations required to move the mattes can be performed in real time.
  • the matte tracking can automatically handle variations in lens focal length or distortion.
  • the matte data can be entered in standard 3D survey coordinate form and rapidly modified by the artist during production.
  • a matte tracking method can be achieved with data that is already existing in a real-time compositing and camera tracking system.
  • FIG. 1 is a perspective view of an embodiment in accordance with the present disclosure.
  • FIG. 2 is a front view of a live action image with a set of points around it representing a rough matte outline in accordance with an embodiment of the present disclosure.
  • FIG. 3 is a perspective view of a 3D compositing scene in accordance with the present disclosure.
  • FIG. 4 is a perspective view of a polygon mesh in a 3D compositing scene in accordance with the present disclosure.
  • FIG. 5 is a top view of a 3D compositing scene in accordance with the present disclosure.
  • FIG. 6 is a perspective view of a live action environment located within a coordinate system in accordance with the present disclosure.
  • FIG. 7 depicts a matte outline before and after applying lens distortion, which uses the present disclosure to generate an automatically tracking matte.
  • FIG. 8 is a block diagram that depicts the data flow through a system of the present disclosure.
  • a rapid, efficient, reliable system for generating an automatically tracking matte that significantly speeds the integration of live action and virtual composite images.
  • Applications ranging from video games to feature films can implement the system in a fraction of the time typically spent tracking multiple areas of high contrast in the image by hand.
  • the system thereby can greatly reduce the cost and complexity of controlling matte motion, and enables a much wider usage of the virtual production method.
  • the process can work with a real-time video feed from a camera, which is presently available on most “still” cameras as well.
  • the process can work with a “video tap” mounted on a film camera, in systems where the image is converted to a standard video format that can be processed.
  • An objective of the present disclosure is to provide a method and apparatus for creating automatically tracking mattes for a live action subject that enable rapid control over how different areas of the image are processed.
  • a scene camera 130 with a lens 190 is positioned to capture a live action image 20 of a subject 10 in front of a background 100 .
  • the subject(s) 10 can be actors, props, and physical sets.
  • the background 100 may have obstructions 115 that may cause problems with the keying process. Per an aspect of the disclosure, the obstructions 115 can be dealt with by creating matte objects of this disclosure.
  • the scene camera 130 can be mounted on a camera tracking system 140 .
  • this camera tracking system 140 can be an encoded pedestal, dolly, jib, crane, or any other form of camera position, orientation, and field-of-view measuring system. Focus distance may also be measured, as the parameters of a lens can change while focusing. There may be more than one scene camera to enable different views of the subject's performance to be captured.
  • the scene camera 130 and the camera tracking system 140 are connected to a video processing system 150 , as depicted in FIG. 8 .
  • the video processing system 150 takes the incoming live action video image 20 , generates the corresponding background virtual matte shape 60 , and performs the automatic tracking matte process using the two images.
  • the video processing system can include a computer with a live video input, a camera tracking data input, and a video card capable of processing 2D and 3D computer graphics calculations.
  • An embodiment of the present disclosure is illustrated in FIG. 2 .
  • a live action subject 10 is shown in the center of a live action image 20 with a perimeter 30 around the edge.
  • a matte shape 60 is displayed along with the live action image 20 in a user interface 240 .
  • the matte shape 60 outlined by a set of points 40 connected by segments 50 , is drawn around subject 10 .
  • Perimeter points 42 are similar to points 40 , but are located within the boundary set by perimeter 30 . The user selects the location of points 40 one at a time by clicking on the screen.
  • the matte shape 60 will extend off the screen in the direction of the lines 50 as they extend off the screen.
  • the program will recognize this as a closed ring.
  • the user selects whether the matte shape 60 is to be on the inside or the outside of the outline 50 . This can be done by detecting on which side of the closed shape the user places the mouse pointer and clicks, which selects either the inside or the outside of outline 50 .
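  • One common way to decide on which side of a closed outline a click falls is an even-odd ray-casting test. The disclosure does not name a specific algorithm, so the sketch below is an assumed stand-in, and the function name is hypothetical.

```python
from typing import List, Tuple

Point2 = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point2]) -> bool:
    """Even-odd test: count outline edges crossed by a ray going toward +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


outline = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
clicked_inside = point_in_polygon(2.0, 2.0, outline)
```

  • A click for which the test returns true would activate the matte inside the outline, and any other click would activate it outside.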
  • the display in the user interface is 2D, but for correct alignment, all of the various components exist as 3D objects in a virtual scene.
  • the transformation of 3D to 2D coordinates can be done using standard projection geometry calculations that are well known to those skilled in the art.
  • a virtual camera 70 has a frustum 80 that describes the field of view of the virtual camera.
  • the parameters of virtual camera 70 and frustum 80 match the parameters of the live action camera 130 and the lens 190 .
  • a live action image 20 containing a subject 10 is located at a distance from virtual camera 70 , and is centered on the optical axis 82 of the virtual camera.
  • the size of the live action image 20 in the virtual space is determined by its distance from virtual camera 70 ; the further away the live action image is placed in the virtual space, the larger the image must be to fill the view angle described by virtual frustum 80 .
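  • The relationship between distance and image size can be sketched with a simple pinhole model. This is an illustration only: it assumes an ideal lens with no distortion, and the function and parameter names are hypothetical.

```python
from typing import Tuple


def image_plane_size(distance: float, focal_length_mm: float,
                     sensor_width_mm: float,
                     sensor_height_mm: float) -> Tuple[float, float]:
    """Width and height an image plane needs in order to fill the frustum.

    By similar triangles, size / distance equals sensor / focal length, so
    the result is in the same units as distance.
    """
    if focal_length_mm <= 0:
        raise ValueError("focal length must be positive")
    width = distance * sensor_width_mm / focal_length_mm
    height = distance * sensor_height_mm / focal_length_mm
    return width, height


# A plane placed twice as far away must be twice as large.
near = image_plane_size(10.0, 25.0, 25.0, 12.5)
far = image_plane_size(20.0, 25.0, 25.0, 12.5)
```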
  • the matte shape 60 is shown as located in the 3D space in between virtual camera 70 and live action image 20 . Since the user controls the distance between virtual camera 70 and matte shape 60 , the matte shape can also be located further away from the virtual camera than live action image 20 .
  • an automatic extension 62 of matte shape 60 is seen; this extrapolates the direction of segments 50 that end in perimeter points 42 to go past the limits of the screen.
  • the automatic extension 62 can extend five meters, for example, past the original visible edge of matte shape 60 .
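  • The extrapolation of a segment past a perimeter point can be sketched as below. This is a minimal illustration, assuming the extension simply continues in a straight line along the last segment; the function name and the default of five units are hypothetical choices based on the example distance given above.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def extend_past_edge(inner: Vec3, edge: Vec3, extension: float = 5.0) -> Vec3:
    """Continue the segment inner -> edge for `extension` units beyond edge."""
    dx, dy, dz = (e - i for e, i in zip(edge, inner))
    length = math.sqrt(dx * dx + dy * dy + dz * dz)
    if length == 0:
        raise ValueError("segment has zero length, direction is undefined")
    k = extension / length
    return (edge[0] + k * dx, edge[1] + k * dy, edge[2] + k * dz)


# A segment of length 5 ending at the screen edge is continued 5 more units.
extended = extend_past_edge((0.0, 0.0, 10.0), (3.0, 4.0, 10.0))
```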
  • the orientation of matte shape 60 can be created perpendicular to the optical axis 82 , and at a user specified distance from the virtual camera 70 . Thereby, the user can measure out how far away the live action subject 10 is, enter that distance into the interface, and know that the matte shape 60 is being created at a matching distance from the virtual camera 70 .
  • Matte shape 60 can also be created using direct input of 3D survey data, measured with an architectural survey tool such as a total station.
  • the entered points of matte shape 60 can be forced to lie on the same plane by using the first three entered points to set the plane definition, with additional entered points projected into that plane to enforce planarity.
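  • Enforcing planarity from the first three points can be sketched as follows. The helper names are hypothetical, and orthogonal projection is assumed as the way later points are brought into the plane.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_from_points(p0: Vec3, p1: Vec3, p2: Vec3) -> Tuple[Vec3, Vec3]:
    """Return (origin, unit normal) of the plane through three points."""
    n = _cross(_sub(p1, p0), _sub(p2, p0))
    length = math.sqrt(_dot(n, n))
    if length == 0:
        raise ValueError("the three points are collinear")
    return p0, (n[0] / length, n[1] / length, n[2] / length)


def project_to_plane(p: Vec3, origin: Vec3, normal: Vec3) -> Vec3:
    """Drop p onto the plane along the plane normal."""
    d = _dot(_sub(p, origin), normal)
    return (p[0] - d * normal[0], p[1] - d * normal[1], p[2] - d * normal[2])


origin, normal = plane_from_points((0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0))
flattened = project_to_plane((3.0, 4.0, 7.0), origin, normal)
```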
  • FIG. 4 demonstrates an embodiment of this method.
  • the outline 50 is automatically converted into a set of polygons 90 with internal edges 92 using an automatic tessellation routine.
  • the automatic extension 62 is similarly tessellated.
  • This automatic tessellation routine can be done using an algorithm called Delaunay triangulation as is well known to practitioners.
  • FIG. 5 demonstrates an embodiment of this process wherein virtual camera 70 and virtual frustum 80 are viewed from the top down to make their geometry clearer. Three positions of matte shape 60 are shown, decreasing in size as they move closer to virtual camera 70 and increasing in size as they move further away.
  • the size of matte shape 60 is scaled in proportion to the viewing angle or field of view of virtual frustum 80 .
  • the user can also adjust the overall matte shape 60 by moving the points 40 after the original shape has been created.
  • the points 40 can be constrained to their original created plane in 3D space as they are moved around. This enables the artist to manipulate points using a convenient interactive 2D interface common in computers, but have the points stay in the correct 3D plane.
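  • Keeping a dragged point in its original plane can be sketched as a ray and plane intersection: the mouse position defines a ray from the virtual camera, and the point is placed where that ray meets the stored plane. This is an assumed approach with hypothetical names, not a statement of how the disclosed system is implemented.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def drag_point_on_plane(ray_origin: Vec3, ray_dir: Vec3,
                        plane_origin: Vec3,
                        plane_normal: Vec3) -> Optional[Vec3]:
    """Intersect the mouse ray with the plane the matte point was created in.

    Returns None when the ray is parallel to the plane or the plane lies
    behind the camera, in which case the edit would be ignored.
    """
    denom = _dot(ray_dir, plane_normal)
    if abs(denom) < 1e-9:
        return None
    to_plane = (plane_origin[0] - ray_origin[0],
                plane_origin[1] - ray_origin[1],
                plane_origin[2] - ray_origin[2])
    t = _dot(to_plane, plane_normal) / denom
    if t <= 0:
        return None
    return (ray_origin[0] + t * ray_dir[0],
            ray_origin[1] + t * ray_dir[1],
            ray_origin[2] + t * ray_dir[2])


new_position = drag_point_on_plane((0.0, 0.0, 0.0), (0.5, 0.0, 1.0),
                                   (0.0, 0.0, 4.0), (0.0, 0.0, 1.0))
```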
  • the matte shape 60 will need to move along with the subject 10 . This can occur when a foreground subject 10 is moving. (On the other hand, the matte shape does not need to move when it is drawn around a background object, such as a green screen wall, that does not move.)
  • a 3D tracking device 130 can be used to measure the position 120 of the subject 10 in the stage.
  • the position 120 of subject 10 is measured with respect to a coordinate system 110 .
  • This coordinate system 110 can be located identically to the virtual coordinate system used for the rest of the background.
  • the 3D tracking device 130 can be any type of system that can resolve the location of the subject 10 on the stage; and as an example it can be a markerless motion capture system. Since the position 120 of the subject 10 is known by the system, the position of matte shape 60 can be connected to the position 120 of subject 10 , with the result being that the movement of matte shape 60 will be locked to subject 10 even as both subject 10 and virtual camera 70 move around the scene. The orientation of matte shape 60 can change to remain normal to the virtual camera 70 as it is moved.
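  • Locking a matte to a tracked subject can be sketched as a translation by the subject's change in position, plus a normal that points from the matte toward the camera. The function names are hypothetical, and a pure translation is an assumption; the disclosure does not specify the update rule.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def follow_subject(points: List[Vec3], old_subject: Vec3,
                   new_subject: Vec3) -> List[Vec3]:
    """Translate the matte vertices by the subject's measured displacement."""
    delta = tuple(n - o for n, o in zip(new_subject, old_subject))
    return [tuple(p + d for p, d in zip(point, delta)) for point in points]


def facing_normal(matte_center: Vec3, camera_pos: Vec3) -> Vec3:
    """Unit vector from the matte toward the camera, used as the plane normal."""
    v = tuple(c - m for c, m in zip(camera_pos, matte_center))
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("camera and matte are at the same position")
    return tuple(x / length for x in v)


moved = follow_subject([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                       (0.0, 0.0, 5.0), (1.0, 0.0, 5.0))
normal = facing_normal((0.0, 0.0, 5.0), (0.0, 0.0, 0.0))
```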
  • matte shape 60 is created by connecting segments 50 together and the end points of segments 50 are the selected points 40 .
  • the live action image from which the user is selecting points 40 has lens distortion.
  • the selected points 40 in the interface have distortion, which is removed before being converted to a polygon mesh.
  • an undistorted matte shape 62 is created by generating undistorted points 44 based on applying lens distortion removal calculations to the original segment points 40 , and connecting them with undistorted segments 52 .
  • the calculation of undistorted points 44 on the X,Y plane of the user interface and the rendered matte from the original points location can be computed with the following equations:
  • the value of K1 can be generated by a lens calibration system that measures the current distortion of the physical lens at its current setting.
  • An example of a lens calibration system is described in U.S. patent application Ser. No. 12/832,480, which was published as U.S. Patent Publication No. 20110026014 and whose entire contents are hereby incorporated by reference.
  • the conversion of the undistorted points 44 and segments 52 into 3D coordinates can be completed with standard projection geometry calculations well known to practitioners in the field.
  • the undistorted matte shape 62 can be rendered in 2D space and the reverse of the above distortion calculations can be applied to it. In this way, the undistorted matte shape 62 is properly displayed no matter what the current live action lens is doing.
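  • The distortion equations themselves are not reproduced in this text. As a hedged illustration only, the sketch below assumes a common single-coefficient radial model, x_d = x_u (1 + K1 r_u^2), in normalized image coordinates centered on the optical axis. The model, the function names, and the fixed-point inversion are all assumptions and may differ from the equations of the disclosure.

```python
from typing import Tuple


def add_distortion(xu: float, yu: float, k1: float) -> Tuple[float, float]:
    """Assumed radial model: scale the point by (1 + k1 * r^2)."""
    r2 = xu * xu + yu * yu
    scale = 1.0 + k1 * r2
    return xu * scale, yu * scale


def remove_distortion(xd: float, yd: float, k1: float,
                      iterations: int = 20) -> Tuple[float, float]:
    """Invert the assumed model by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the scale
    evaluated at the current estimate. This converges for mild distortion.
    """
    xu, yu = xd, yd
    for _ in range(iterations):
        r2 = xu * xu + yu * yu
        scale = 1.0 + k1 * r2
        xu, yu = xd / scale, yd / scale
    return xu, yu


# Round trip: distorting and then undistorting recovers the original point.
xd, yd = add_distortion(0.5, 0.3, -0.1)
xu, yu = remove_distortion(xd, yd, -0.1)
```

  • Under this assumed model the forward direction is closed form while the inverse is iterative, so a real-time system would likely cap the iteration count as done here.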
  • The data flow of the system is illustrated in FIG. 8 .
  • a number of the processing steps described in earlier figures are combined into the video processing system 150 .
  • a scene camera 130 transmits a live action image 20 to 2D compositor 180 .
  • Camera tracking system 140 measures and transmits camera data 160 to data combiner 300 .
  • Lens 190 transmits lens position data 200 to the lens calibration table 210 .
  • Lens calibration table 210 looks up the appropriate lens data 230 and transmits that data to data combiner 300 .
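  • A lens calibration lookup of this kind can be sketched as a sorted table indexed by lens position. The table layout (position, focal length, K1), the linear interpolation between entries, and all names are assumptions for illustration; the disclosure only states that the table looks up the appropriate lens data.

```python
from bisect import bisect_left
from typing import List, Tuple

# Hypothetical row layout: (lens position, focal length in mm, K1).
LensRow = Tuple[float, float, float]


def lookup_lens_data(table: List[LensRow], position: float) -> Tuple[float, float]:
    """Return (focal_length_mm, k1) for a lens position.

    Interpolates linearly between the two nearest calibrated entries and
    clamps to the first or last entry outside the calibrated range.
    """
    positions = [row[0] for row in table]
    if position <= positions[0]:
        return table[0][1], table[0][2]
    if position >= positions[-1]:
        return table[-1][1], table[-1][2]
    i = bisect_left(positions, position)
    p0, f0, k0 = table[i - 1]
    p1, f1, k1 = table[i]
    t = (position - p0) / (p1 - p0)
    return f0 + t * (f1 - f0), k0 + t * (k1 - k0)


calibration = [(0.0, 20.0, -0.2), (1.0, 40.0, -0.1)]
focal_length, k1_value = lookup_lens_data(calibration, 0.5)
```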
  • Data combiner 300 then transmits the combined camera and lens data 310 to the 3D renderer 290 , to distortion removal processor 260 , to 2D-to-3D converter 270 , and to distortion addition processor 170 .
  • Using the combined data 310 , which includes lens data 230 , the distortion removal processor 260 creates a set of undistorted points 44 that are transmitted to the 2D-to-3D converter 270 .
  • the distortion removal processor 260 can use the distortion algorithms mentioned with respect to FIG. 7 .
  • the 2D-to-3D converter 270 calculates the matte shape 60 and sends it to 3D renderer 290 .
  • the calculation of the 3D matte geometry based on the undistorted points 44 and the combined data 310 can use projection geometry calculations that are well known to those skilled in the art.
  • 3D renderer 290 can use matte shape 60 and the combined camera and lens data 310 to place a virtual camera 70 and frustum 80 .
  • the 3D renderer generates a 2D undistorted matte shape 62 .
  • the creation of a 2D undistorted shape from 3D geometry is essentially the reverse of the 2D-to-3D conversion mentioned in the previous paragraph, and is well known to those skilled in the art.
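  • The 2D-to-3D conversion and its reverse can be sketched with a pinhole camera described by a position and three orthonormal axes. The coordinates here are normalized (offset divided by depth), and the function names and camera representation are hypothetical choices for illustration.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def unproject(xu: float, yu: float, distance: float, cam_pos: Vec3,
              right: Vec3, up: Vec3, forward: Vec3) -> Vec3:
    """Place an undistorted 2D point on a plane `distance` along the axis."""
    return tuple(
        c + distance * (f + xu * r + yu * u)
        for c, r, u, f in zip(cam_pos, right, up, forward)
    )


def project(point: Vec3, cam_pos: Vec3,
            right: Vec3, up: Vec3, forward: Vec3) -> Tuple[float, float]:
    """Reverse of unproject: 3D point back to undistorted 2D coordinates."""
    v = (point[0] - cam_pos[0], point[1] - cam_pos[1], point[2] - cam_pos[2])
    depth = _dot(v, forward)
    if depth <= 0:
        raise ValueError("point is behind the camera")
    return _dot(v, right) / depth, _dot(v, up) / depth


cam = (1.0, 2.0, 3.0)
axes = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
point_3d = unproject(0.25, -0.5, 4.0, cam, *axes)
point_2d = project(point_3d, cam, *axes)
```

  • Projecting with a later camera pose instead of the original one is what makes the stored 3D matte appear to track as the camera moves.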
  • the 3D renderer 290 then sends the 2D undistorted matte shape 62 to the distortion addition processor 170 .
  • the distortion addition processor 170 using the lens data 230 contained in combined data 310 , creates a distorted 2D matte image 175 and sends it to 2D compositor 180 .
  • the calculations to add this distortion can be the same as described for FIG. 7 .
  • a goal of this 2D-to-3D and 3D-to-2D conversion is to allow the user to select and manipulate points on a 2D user interface 240 containing live action image 20 that actually generate a correct matte shape 60 , which, when rendered with the same lens distortion as the live action image 20 , results in a matte image 175 that lines up with the original perimeter point 40 selected by the user. Otherwise, the matte image 175 would appear in a different place than that selected, and this would be a frustrating interface for the user.
  • the same rendering and distortion addition process can be used to create virtual background scenes that will be combined with the live action image 20 in the 2D compositor 180 .
  • Background scene geometry 320 from an external 3D content creation software program such as Maya is loaded into the 3D renderer 290 , which generates an undistorted background image 340 .
  • This is sent to the distortion addition processor 170 , which then applies the same lens distortion addition used for the matte image 175 to result in background image 185 .
  • 2D compositor 180 uses the matte image 175 to selectively process portions of the live action image 20 in combination with background image 185 to generate a composited image 320 .
  • the composited image 320 can be delivered in the form of a live action actor placed into a virtual background, for example.
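  • The role of the matte in the compositor can be sketched as a per-pixel blend. This is a simplified illustration that assumes a matte value of 1.0 means "show the background" and 0.0 means "keep the live action pixel"; the real compositor also performs keying and other processing, and the names below are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]
Image = List[List[Pixel]]


def composite_pixel(live: Pixel, background: Pixel, matte: float) -> Pixel:
    """Linear blend of one pixel; matte is expected to lie in [0, 1]."""
    return tuple(l * (1.0 - matte) + b * matte for l, b in zip(live, background))


def composite_image(live: Image, background: Image,
                    matte: List[List[float]]) -> Image:
    """Apply composite_pixel over images stored as rows of RGB tuples."""
    return [
        [composite_pixel(l, b, m) for l, b, m in zip(live_row, bg_row, matte_row)]
        for live_row, bg_row, matte_row in zip(live, background, matte)
    ]


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
result = composite_image([[red, red]], [[blue, blue]], [[0.0, 1.0]])
```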
  • the user simply clicks on perimeter points 40 and they appear correctly on the screen of the user interface 240 in the expected position. This is because they have been correctly converted to accurate 3D spatial coordinates and re-drawn with matching lens and camera data.
  • the convenience of 2D drawn mattes is preserved, while operating in a fully-tracked 3D world, which is needed for complex real-time visual effects.
  • the following prompts are provided to the user at the user interface: selectable and draggable points that overlay a live action image.
  • An alternative program provides the following prompts: numerical XYZ entry fields for direct input of 3D coordinate points.
  • the resulting drawn or surveyed mattes can be used in a variety of manners.
  • the mattes can be used as a garbage matte or a despill matte.
  • Garbage mattes are used to completely remove unwanted sections (like a hanging microphone in front of the green screen) of the live action image.
  • the garbage mattes replace that part of the live action image with the computer-generated image underneath.
  • despill mattes are used to preserve part of the foreground image from being keyed (the green area made transparent), but still “clamping” the green (limiting the green level to the largest of either the red or the blue levels) to remove the greenish cast that otherwise permeates all through the image from the reflected light off the green screen.
  • An example is a green screen placed outside a window, but the green reflects onto a glass table indoors, making it green.
  • a despill matte removes the green tinge from the glass top, but without making it transparent. That is, a despill matte defines the part of the live action foreground to apply only the despill process, as opposed to the keying process, both of which are well known to practitioners in the art.
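  • The green "clamping" described above can be sketched per pixel: the green level is limited to the larger of the red and blue levels, and the pixel is left opaque. Applying it only where a despill matte is active is shown with a hypothetical boolean flag.

```python
from typing import Tuple

Pixel = Tuple[float, float, float]


def despill(pixel: Pixel) -> Pixel:
    """Limit green to the larger of red and blue; red and blue are unchanged."""
    r, g, b = pixel
    return (r, min(g, max(r, b)), b)


def apply_despill_matte(pixel: Pixel, in_matte: bool) -> Pixel:
    """Despill only inside the matte; pixels outside it are returned as-is."""
    return despill(pixel) if in_matte else pixel


# A green-tinted glass pixel loses its cast but is not made transparent.
tinted = (0.2, 0.9, 0.4)
cleaned = apply_despill_matte(tinted, True)
```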
  • An alternative embodiment is the creation of the ‘holdout’ matte. This is typically based on live action objects in the scene, and is used to force virtual objects to be behind the live action objects, or to enable virtual objects to cast virtual shadows on live action objects. This is the area of use most likely for 3D mattes generated from natural feature tracking.
  • the 3D objects that are used to describe the matte positions can be saved and exported to external applications for post-production. They can be saved into a Collada or other 3D file format that is easily imported into other standard visual effects applications.
  • Alternative embodiments include using the mattes to drive a color grading process, so that the matte defines the part of the image to which to apply a color transformation. In this way, the process of correcting images manually, shot by shot, can be heavily automated.
  • Additional alternative embodiments include the automated movement of different points in the matte according to different tracking points from a 3D tracking system, or using facial tracking connected to the main camera to drive the matte tracking to only track facial features.
  • systems of the present disclosure have many unique advantages such as those discussed immediately below.
  • the artist can edit the 3D points by dragging them around in a 2D interface, while preserving their location on their original 2D plane. This gives the artist fast interaction, while avoiding confusing “out of plane” geometry.
  • Using a 2D interface can be accomplished by real time undistortion and re-distortion, to create correctly matched geometry while providing a convenient, familiar 2D interface.
  • Most compositors only work with 2D, and 3D can be confusing to them.
  • Automatically extending the matte beyond the edges when using the perimeter points allows the compositor to extend the matte without requiring the camera operator to move back and forth.
  • the system allows the mattes to be stored and exported for future use, which is particularly useful for example for the following applications: Nuke, After Effects, Shake, Flame, and Inferno.
  • a system of the present disclosure can include a graphics card or CPU that includes: (a) a distortion removal processor 260 programmed to create a set of undistorted points; (b) a 2D-to-3D converter 270 configured to use the set of undistorted points to calculate 3D matte geometry; (c) a 3D renderer 290 configured to use the 3D matte geometry to generate a 2D undistorted matte shape; (d) a distortion addition processor 170 programmed to use the 2D undistorted matte shape to create a distorted 2D matte shape; and (e) a 2D compositor 180 configured to use the distorted 2D matte shape to combine at least one portion of a live action image with at least one other image to generate a composited image.
  • the composited image can be delivered in the form of a high definition serial digital interface signal to an external recording system.
  • An example of a commercially available graphics card that can be so programmed is the Quadro card available from nVidia Corporation of Santa Clara, Calif.
  • the above-mentioned graphics card or CPU can also include data combiner 300 and lens calibration table 210 , or the processes can be divided between a graphics card and a CPU.
  • the present automatic matte tracking system can be based on the prior art Previzion system, which is/was available from Lightcraft Technology of Venice, Calif.
  • the Previzion system includes a camera tracking system, a lens calibration system, a real-time compositing system, and a built-in 3D renderer.
  • the tracking mattes feature adds the ability to hand draw mattes in 2D on the screen, that are then converted into a 3D space by the system, enabling it to move automatically as the camera moves, and in real time.
  • An example of a publication disclosing the prior art Previzion system is the Previzion product brochure, entitled Previzion Specifications 2011, published on Apr. 8, 2011, and whose contents are incorporated by reference.
  • An embodiment of the present system can be made by modifying the prior art Previzion system by adding a tab to the user interface where the user can create the present matte and adjust it.
  • the prior art Previzion system can be adapted by the addition of the drawable mattes, the computations of their positions and orientations and their adjustments using the saved common plane of the 3D points.
  • Previzion is unique in that the 2D video processing and the 3D rendering are being done in the same product. In contrast, most other systems have separate consoles for 2D and 3D, which are used to separately create the 3D background virtual scene and merge it with the 2D live action scene.
  • the 3D box that has the tracking matte information can send it to the 2D box, in the form of another 2D video signal that is a black-and-white garbage matte.
  • 2D-3D and 3D-2D conversions can be done quickly while taking into account lens distortion.
  • the distortion removal processor, 2D-to-3D converter, 3D renderer, distortion addition processor, and 2D compositor can all be performed on a graphics card of the system.
  • a video I/O card handles the video input and output.
  • a program of the present disclosure can be delivered as an executable code that is installed on a target system.
  • the same math can work in a browser as it is largely a matter of geometry and input.

Abstract

A system for generating automatically tracking mattes that rapidly integrates live action and virtual composite images.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit of co-pending provisional application Ser. No. 61/565,884, filed Dec. 1, 2011, and whose entire contents are hereby incorporated by reference.
  • TECHNICAL FIELD
  • This invention relates to systems and methods for combining real scene elements from a video, film, digital type camera or the like with virtual scene elements from a virtual camera into a finished composite image, and more particularly, to systems and methods for creating drawn shapes that move along with the camera's motion to control how the virtual and real scene elements are combined.
  • BACKGROUND
  • The state of the art in combining real world imagery with additional imagery from another source is a process that requires careful control over which sections of each image are to be used in the final composite image. One common application is to combine images generated by a computer with images acquired from a traditional motion picture, video or digital camera. In order to seamlessly combine the images, the areas of each image that are to be preserved or modified must be defined. These areas are typically called mattes.
  • Mattes may be defined in a number of ways. In traditional compositing, the mattes are frequently defined by having an artist mark points around the perimeter of the object to be preserved or removed. The computer then connects the dots to form a closed shape, which forms the matte. Problems can arise, however, if the object and/or the camera move relative to the other.
  • In traditional computer compositing, a moving camera or object is handled by making a tracking matte, or a matte that moves along with the object. While the methods of moving the matte along with the object vary, they typically center around having the user specify an area of high contrast in the live action image, measuring how that image moves around in the frame, and connecting the motion of the drawn matte to the motion of the high contrast object.
  • This process works, but has several limitations. Firstly, if the high contrast area is located on the front of a character's shirt, for example, and the character turns around, or if the camera moves around to another side of the character, the local effect is destroyed. Secondly, the process of measuring the camera motion by tracking the individual pixels of the high contrast part of the image is both fragile and time-consuming if there is no additional camera data to work from. It typically cannot be computed in real time, and if a frame of the live action image has a lighting change where the pattern is unrecognizable, the artist must re-specify the high contrast area at the frame of failure to continue the process. The process of creating all of the multiple overlapping mattes that are used in a sophisticated visual effects shot can exceed the time required to complete the rest of the shot due to the handwork required.
  • In addition, if the live action camera is zoomed in, the high contrast area that was being tracked can simply disappear from the image, resulting in the matte failing to track the camera lens change.
  • Accordingly, the pixel tracking based methods do not work well for the demands of real time visual effects processing, which must be very rapid to compute as well as robust to the frame by frame changes in the live action video image.
  • In real time processing, mattes have traditionally been created by surveying the edges of the green screen background using an architectural measurement tool such as a total station, and creating a model of the matte in 3D space. However, models of this type cannot be rapidly modified by the artist under typical time pressure conditions found in entertainment production.
  • SUMMARY
  • Various embodiments of an automatic tracking matte system are disclosed herein. In one embodiment, an artist selects points on a computer screen to generate a rough outline around the object to be removed or preserved. These points are selected using a 2D display of the live action image, typically by locating a pointer in the desired location and pressing a selection button. The user clicks a mouse around the border of the object, and then selects the inside or the outside of the finished outline to determine on which side of the line the matte will be active. The user can also begin or end the outline at an edge of the screen, in which case the system extrapolates the matte for a given distance out from the edge of the screen. This distance can be five meters or more, generally between one and ten meters.
  • The above process generates a 2D outline. However, for the matte to track properly in a 3D space, the shape must be converted to a 3D representation. This 3D shape can be a set of attached polygons whose outer perimeter matches the outline of the points that the user selected. The 3D polygon mesh exists at a given point in 3D space. The 3D polygon mesh can be created in a plane normal to the axis of the main virtual camera when the matte is initiated, and at a distance specified by the user.
  • Since the mesh is created by drawing around a live action object, the 2D representation is viewed from the position of the current live action camera. For the 3D mesh to line up accurately, it can be projected from a virtual camera with the same position and orientation as the live action camera. In addition, the further away from the camera the polygon mesh is moved, the larger it must become for the 2D points to remain in the same relative position on the live action image. This computation can be done automatically by geometric projection as the user moves the 3D polygon mesh closer or further away from the virtual camera. This automatic calculation can take into account the current position and orientation of the virtual and live action cameras, the current focal length and distortion of the cameras, and the sensor size of the cameras.
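  • The geometric projection described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes an ideal pinhole camera with no lens distortion, a camera-space convention (camera at the origin looking down +Z, X right, Y up), and normalized screen coordinates in [-1, 1]. The function names and parameters are hypothetical.

```python
import math


def unproject_to_plane(points_ndc, distance, hfov_deg, aspect):
    """Place normalized 2D points on a plane `distance` ahead of the camera.

    Assumed convention (not from the source): camera at the origin looking
    down +Z, X to the right, Y up. `points_ndc` holds (x, y) pairs in [-1, 1]
    spanning the image width and height. `aspect` is width / height.
    """
    half_w = distance * math.tan(math.radians(hfov_deg) / 2.0)
    half_h = half_w / aspect
    return [(x * half_w, y * half_h, distance) for x, y in points_ndc]


def project_to_ndc(points_cam, hfov_deg, aspect):
    """Project camera-space 3D points back to normalized 2D coordinates."""
    t = math.tan(math.radians(hfov_deg) / 2.0)
    return [(x / (z * t), y * aspect / (z * t)) for x, y, z in points_cam]


# A three-point outline placed at 2 m and at 4 m from the camera.
outline = [(-0.5, 0.25), (0.5, 0.25), (0.0, -0.5)]
near = unproject_to_plane(outline, 2.0, 60.0, 16 / 9)
far = unproject_to_plane(outline, 4.0, 60.0, 16 / 9)
```

  • In this sketch the mesh at 4 m is twice the size of the mesh at 2 m, yet both project back to the same 2D outline, which is the behavior the paragraph above requires. A full system would also apply the camera's position and orientation and the lens distortion.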
  • After creating the mesh, the user will frequently need to adjust the position and/or shape of an existing mesh. The camera may have moved in this interval, but to keep the points aligned correctly with the original object, the normal along which the 3D mesh is scaled up or down must be known. The mesh points can be manipulated by the artist directly in the 2D user interface, but may be constrained to move only in the original 3D plane in which they were created.
  • According to an aspect of the disclosure a unified matte system is created with individual points that are entered either on the screen in a 2D form as described herein, or directly in 3D from survey coordinate data. Once a given polygon is entered, the various points can be forced into a plane. This plane then defines where the individual points can move when later edited. Thereby, the artist can simply click and drag on an existing matte point to edit it, knowing that it will stay in the plane in which it was created.
  • According to one aspect of the disclosure the 3D mesh object(s) is (are) rendered in separate passes, and grouped together to form the overall set of despill, garbage, or other types of mattes.
  • According to another aspect of the disclosure the points of the mesh can also be entered using 3D survey data. This 3D data can be determined in a variety of ways, including photogrammetry techniques and laser surveying instruments such as a total station. In this embodiment, the first three entered points of survey data can be used to set the plane of the rest of the entered survey points of that polygon.
  • According to a further aspect of the disclosure the mesh can be made to move along with a separate form of tracking. For example, a separate motion capture system can measure the 3D location of a person, face, or object in real time, and locate the 3D matte mesh at the location of the person.
  • According to a still further aspect of the disclosure the basic matte shape can be used for many different applications such as a garbage matte (removal of foreground), a despill matte (removal of extra blue or green color), a color grading matte (selective enhancement of one area of the scene's color), and so forth.
  • According to a yet still further aspect of the disclosure the matte distance can be set automatically by measuring the distance from the camera to the subject, such as by acoustic or optical methods, or by measuring the current focus distance from the lens system.
  • According to an aspect of the disclosure a method for creating mattes whose shape can be drawn by an artist, but which tracks automatically as the camera or object moves, is provided.
  • According to another aspect of the disclosure the computations required to move the mattes can be performed in real time.
  • According to a further aspect of the disclosure the matte tracking can automatically handle variations in lens focal length or distortion.
  • According to a still further aspect of the disclosure the matte data can be entered in standard 3D survey coordinate form and rapidly modified by the artist during production.
  • According to another aspect a matte tracking method can be achieved with data that is already existing in a real-time compositing and camera tracking system.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments, taken in conjunction with the accompanying drawings.
  • FIG. 1 is a perspective view of an embodiment in accordance with the present disclosure.
  • FIG. 2 is a front view of a live action image with a set of points around it representing a rough matte outline in accordance with an embodiment of the present disclosure.
  • FIG. 3 is a perspective view of a 3D compositing scene in accordance with the present disclosure.
  • FIG. 4 is a perspective view of a polygon mesh in a 3D compositing scene in accordance with the present disclosure.
  • FIG. 5 is a top view of a 3D compositing scene in accordance with the present disclosure.
  • FIG. 6 is a perspective view of a live action environment located within a coordinate system in accordance with the present disclosure.
  • FIG. 7 depicts a matte outline before and after applying lens distortion, which uses the present disclosure to generate an automatically tracking matte.
  • FIG. 8 is a block diagram that depicts the data flow through a system of the present disclosure.
  • DETAILED DESCRIPTION
  • The following is a detailed description of the presently known best mode(s) of carrying out the inventions. This description is not to be taken in a limiting sense, but is made merely for the purpose of illustrating the general principles of the inventions.
  • A rapid, efficient, reliable system is disclosed herein for generating an automatically tracking matte that significantly speeds the integration of live action and virtual composite images. Applications ranging from video games to feature films can implement the system in a fraction of the time typically spent tracking multiple areas of high contrast in the image by hand. The system thereby can greatly reduce the cost and complexity of controlling matte motion, and enables a much wider usage of the virtual production method.
  • Since the present process is primarily for joining live action with computer-generated elements, its applications for video games may be limited. The process can work with a real-time video feed from a camera, which is presently available on most “still” cameras as well. The process can work with a “video tap” mounted on a film camera, in systems where the image is converted to a standard video format that can be processed.
  • An objective of the present disclosure is to provide a method and apparatus for creating automatically tracking mattes for a live action subject that enable rapid control over how different areas of the image are processed.
  • Referring to FIG. 1, an embodiment of the present disclosure is depicted. A scene camera 130 with a lens 190 is positioned to capture a live action image 20 of a subject 10 in front of a background 100. The subject(s) 10, for example, can be actors, props, and physical sets. The background 100 may have obstructions 115 that may cause problems with the keying process. Per an aspect of the disclosure, the obstructions 115 can be dealt with by creating matte objects of this disclosure.
  • The scene camera 130 can be mounted on a camera tracking system 140. This camera tracking system 140 can be an encoded pedestal, dolly, jib, crane, or any other form of camera position, orientation, and field-of-view measuring system. Focus distance may also be measured, as the parameters of a lens can change while focusing. There may be more than one scene camera to enable different views of the subject's performance to be captured.
  • The scene camera 130 and the camera tracking system 140 are connected to a video processing system 150, as depicted in FIG. 8. The video processing system 150 takes the incoming live action video image 20, generates the corresponding background virtual matte shape 60, and performs the automatic tracking matte process using the two images. The video processing system can include a computer with a live video input, a camera tracking data input, and a video card capable of processing 2D and 3D computer graphics calculations.
  • An embodiment of the present disclosure is illustrated in FIG. 2. A live action subject 10 is shown in the center of a live action image 20 with a perimeter 30 around the edge. A matte shape 60 is displayed along with the live action image 20 in a user interface 240. The matte shape 60, outlined by a set of points 40 connected by segments 50, is drawn around subject 10. Perimeter points 42 are similar to points 40, but are located within the boundary set by perimeter 30. The user selects the location of points 40 one at a time by clicking on the screen.
  • Pursuant to one embodiment, if the user starts and ends the shape by creating a perimeter boundary point 42 within the perimeter 30, the matte shape 60 will extend off the screen in the direction of the lines 50 as they extend off the screen. On the other hand, if the user selects the start point after selecting several other points, the program will recognize this as a closed ring.
  • Once the matte outline 50 has been defined, the user selects whether the matte shape 60 is to be on the inside or the outside of the outline 50. This can be done by detecting the side of the closed shape to which the user moves the mouse pointer; a click then sets the inside or the outside of outline 50 as the selected side.
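  • One way to decide on which side of a closed outline a click falls is an even-odd ray casting test. The sketch below is an assumption about how such a check could be done, not a description of the disclosed program; the function name is hypothetical.

```python
def point_in_polygon(pt, outline):
    """Even-odd ray casting test: True if `pt` lies inside the closed outline.

    `outline` is a list of (x, y) vertices; the last vertex connects back to
    the first. A horizontal ray is cast from `pt` toward +X and the number of
    edge crossings is counted.
    """
    x, y = pt
    inside = False
    n = len(outline)
    for i in range(n):
        x1, y1 = outline[i]
        x2, y2 = outline[(i + 1) % n]
        # Only edges that straddle the ray's height can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
click_inside = point_in_polygon((0.5, 0.5), square)
click_outside = point_in_polygon((2.0, 0.5), square)
```

  • A click that tests as inside would make the matte active within the outline; a click that tests as outside would make it active everywhere else.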
  • The display in the user interface is 2D, but for correct alignment, all of the various components exist as 3D objects in a virtual scene. Referring to FIG. 3, the transformation of 3D to 2D coordinates can be done using standard projection geometry calculations that are well known to those skilled in the art. A virtual camera 70 has a frustum 80 that describes the field of view of the virtual camera. The parameters of virtual camera 70 and frustum 80 match the parameters of the live action camera 130 and the lens 190.
  • A live action image 20 containing a subject 10 is located at a distance from virtual camera 70, and is centered on the optical axis 82 of the virtual camera. The size of the live action image 20 in the virtual space is determined by its distance from virtual camera 70; the further away the live action image is placed in the virtual space, the larger the image must be to fill the view angle described by virtual frustum 80. The matte shape 60 is shown as located in the 3D space in between virtual camera 70 and live action image 20. Since the user controls the distance between virtual camera 70 and matte shape 60, the matte shape can also be located further away from the virtual camera than live action image 20. In this image (see FIG. 3), an automatic extension 62 of matte shape 60 is seen; this extrapolates the direction of segments 50 that end in perimeter points 42 to go past the limits of the screen. The automatic extension 62 can extend five meters, for example, past the original visible edge of matte shape 60.
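  • The automatic extension can be illustrated by extrapolating the last segment past its perimeter point. This is a sketch under stated assumptions: points are already in 3D scene coordinates measured in meters, and the helper name and the default length of five meters (taken from the example above) are illustrative only.

```python
import math


def extend_past_edge(inner_pt, edge_pt, length_m=5.0):
    """Extrapolate the segment inner_pt -> edge_pt by `length_m` beyond edge_pt.

    Both points are 3D tuples in meters. The returned point lies on the same
    line, `length_m` past the perimeter point.
    """
    direction = [e - i for i, e in zip(inner_pt, edge_pt)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("segment has zero length; direction is undefined")
    return tuple(e + length_m * c / norm for e, c in zip(edge_pt, direction))


# A segment that reaches the screen edge at x = 3 m on a plane 4 m away.
extended = extend_past_edge((0.0, 0.0, 4.0), (3.0, 0.0, 4.0))
```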
  • The orientation of matte shape 60 can be created perpendicular to the optical axis 82, and at a user specified distance from the virtual camera 70. Thereby, the user can measure out how far away the live action subject 10 is, enter that distance into the interface, and know that the matte shape 60 is being created at a matching distance from the virtual camera 70.
  • When entered as 2D points on a plane normal to the user's viewing axis, the points 40 all lie on the same plane. Matte shape 60 can also be created using direct input of 3D survey data, measured with an architectural survey tool such as a total station. The entered points of matte shape 60 can be forced to lie on the same plane by using the first three entered points to set the plane definition, with additional entered points projected into that plane to enforce planarity.
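  • The planarity step described above (first three points define the plane, later points are projected into it) can be sketched with basic vector arithmetic. The helper names are hypothetical, and the collinearity tolerance is an arbitrary choice for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def plane_from_first_three(p0, p1, p2):
    """Return (origin, unit normal) of the plane through three points."""
    n = _cross(_sub(p1, p0), _sub(p2, p0))
    length = math.sqrt(_dot(n, n))
    if length < 1e-12:
        raise ValueError("first three points are collinear; no unique plane")
    return p0, tuple(c / length for c in n)


def project_onto_plane(p, origin, normal):
    """Move `p` along the plane normal until it lies in the plane."""
    offset = _dot(_sub(p, origin), normal)
    return tuple(c - offset * n for c, n in zip(p, normal))


def planarize(points):
    """Force every point into the plane set by the first three points."""
    origin, normal = plane_from_first_three(*points[:3])
    return [project_onto_plane(p, origin, normal) for p in points]


survey = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (2.0, 3.0, 0.5)]
flat = planarize(survey)
```

  • Here the fourth survey point sits 0.5 m off the plane and is pulled back onto it; the first three points are unchanged.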
  • To correctly render a 3D shape, the outline can be broken up into individual triangular elements. FIG. 4 demonstrates an embodiment of this method. The outline 50 is automatically converted into a set of polygons 90 with internal edges 92 using an automatic tessellation routine. The automatic extension 62 is similarly tessellated. This automatic tessellation routine can be done using an algorithm called Delaunay triangulation as is well known to practitioners.
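  • To illustrate what a tessellation routine produces, the sketch below breaks a planar outline into triangles. Note that it uses ear clipping rather than the Delaunay triangulation named above, purely so that the example stays self-contained with no external library; a production system could use either. It assumes a simple (non-self-intersecting) outline given as 2D coordinates in the matte plane, and all names are hypothetical.

```python
def signed_area(poly):
    """Signed area of a 2D polygon; positive when vertices run counter-clockwise."""
    return 0.5 * sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]))


def _cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _in_triangle(p, a, b, c):
    # True if p is inside or on the border of counter-clockwise triangle abc.
    return _cross(a, b, p) >= 0 and _cross(b, c, p) >= 0 and _cross(c, a, p) >= 0


def triangulate(outline):
    """Ear-clipping triangulation of a simple polygon; returns a list of triangles."""
    pts = list(outline)
    if signed_area(pts) < 0:
        pts.reverse()  # work in counter-clockwise order
    idx = list(range(len(pts)))
    triangles = []
    while len(idx) > 3:
        for k in range(len(idx)):
            i, j, l = idx[k - 1], idx[k], idx[(k + 1) % len(idx)]
            a, b, c = pts[i], pts[j], pts[l]
            if _cross(a, b, c) <= 0:
                continue  # reflex or degenerate corner, not an ear
            if any(_in_triangle(pts[m], a, b, c) for m in idx if m not in (i, j, l)):
                continue  # another vertex blocks this ear
            triangles.append((a, b, c))
            del idx[k]
            break
        else:
            raise ValueError("no ear found; outline is not a simple polygon")
    triangles.append(tuple(pts[m] for m in idx))
    return triangles


# An L-shaped (concave) outline with area 3.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
mesh = triangulate(l_shape)
```

  • An outline with n vertices yields n - 2 triangles, and their areas sum to the area of the outline.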
  • As the user can adjust the distance of the matte shape 60 from the camera, the size of matte shape 60 must decrease or increase as it is moved closer to or further away from the virtual camera 70. FIG. 5 demonstrates an embodiment of this process wherein virtual camera 70 and virtual frustum 80 are viewed from the top down to make their geometry clearer. Three positions of matte shape 60 are shown, decreasing or increasing in size as they are closer to or further away from virtual camera 70. The size of matte shape 60 is scaled in proportion to the viewing angle or field of view of virtual frustum 80.
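  • The scaling follows from similar triangles in the top-down view. As a sketch, assuming an ideal pinhole camera with horizontal field of view θ (the symbols w, d, and θ are introduced here for illustration and do not appear in the figures), the width w of the frustum at distance d from the virtual camera, and the ratio of widths at two distances, are:

```latex
w(d) = 2\,d\,\tan\!\left(\frac{\theta}{2}\right),
\qquad
\frac{w(d_2)}{w(d_1)} = \frac{d_2}{d_1}
```

  • A matte that spans a fixed fraction of the image therefore doubles in size when its distance from the virtual camera doubles.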
  • The user can also adjust the overall matte shape 60 by moving the points 40 after the original shape has been created. The points 40 can be constrained to their original created plane in 3D space as they are moved around. This enables the artist to manipulate points using a convenient interactive 2D interface common in computers, but have the points stay in the correct 3D plane.
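  • One plausible way to keep a dragged point in its original plane is to cast a ray from the virtual camera through the new 2D pointer position and intersect it with the stored plane. The sketch below shows only the intersection step; it assumes the ray direction has already been derived from the undistorted pointer position, and the function name is hypothetical.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intersect_ray_plane(ray_origin, ray_dir, plane_point, plane_normal):
    """Return the point where a ray meets a plane, or None if it never does.

    `ray_origin` is the virtual camera position, `ray_dir` the direction
    through the dragged 2D point, and the plane is the one saved when the
    matte was created.
    """
    denom = _dot(ray_dir, plane_normal)
    if abs(denom) < 1e-12:
        return None  # ray is parallel to the plane
    t = _dot(_sub(plane_point, ray_origin), plane_normal) / denom
    if t <= 0:
        return None  # plane is behind the camera
    return tuple(o + t * d for o, d in zip(ray_origin, ray_dir))


# Camera at the origin; matte plane 4 m ahead, facing the camera.
new_position = intersect_ray_plane((0.0, 0.0, 0.0), (0.5, 0.0, 1.0),
                                   (0.0, 0.0, 4.0), (0.0, 0.0, 1.0))
```

  • The returned 3D position replaces the edited point, so the point follows the pointer on screen while remaining in the plane in which it was created.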
  • In some cases the matte shape 60 will need to move along with the subject 10. This can occur when a foreground subject 10 is moving. (On the other hand, the matte shape does not need to move when it is drawn around a background object, such as a green screen wall, that does not move.)
  • Referring to FIG. 6, a 3D tracking device 130 can be used to measure the position 120 of the subject 10 in the stage. The position 120 of subject 10 is measured with respect to a coordinate system 110. This coordinate system 110 can be located identically to the virtual coordinate system used for the rest of the background. The 3D tracking device 130 can be any type of system that can resolve the location of the subject 10 on the stage; as an example, it can be a markerless motion capture system. Since the position 120 of the subject 10 is known by the system, the position of matte shape 60 can be connected to the position 120 of subject 10, with the result being that the movement of matte shape 60 will be locked to subject 10 even as both subject 10 and virtual camera 70 move around the scene. The orientation of matte shape 60 can change to remain normal to the virtual camera 70 as it is moved.
  • All physical lenses exhibit distortion, which must be handled to correctly match the matte shapes to live action, an example of which is shown in FIG. 7. As before, matte shape 60 is created by connecting segments 50 together and the end points of segments 50 are the selected points 40. However, the live action image from which the user is selecting points 40 has lens distortion. To create a correct matte object 60 that works correctly in 3D coordinate space, the selected points 40 in the interface have distortion, which is removed before being converted to a polygon mesh.
  • To render a matte shape 60 that correctly fits to distorted points 40, an undistorted matte shape 62 is created by generating undistorted points 44 based on applying lens distortion removal calculations to the original segment points 40, and connecting them with undistorted segments 52. The calculation of undistorted points 44 on the X,Y plane of the user interface and the rendered matte from the original point locations can be computed with the following equations:

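  • $X_{\text{undistorted}} = X_{\text{distorted}} \cdot \left(1 + K_1 \cdot \text{radius}^2\right)$

  • $Y_{\text{undistorted}} = Y_{\text{distorted}} \cdot \left(1 + K_1 \cdot \text{radius}^2\right)$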
  • The value of K1 can be generated by a lens calibration system that measures the current distortion of the physical lens at its current setting. An example of a lens calibration system is described in U.S. patent application Ser. No. 12/832,480, which was published as U.S. Patent Publication No. 20110026014 and whose entire contents are hereby incorporated by reference. The conversion of the undistorted points 44 and segments 52 into 3D coordinates can be completed with standard projection geometry calculations well known to practitioners in the field. To then display the correctly distorted matte shape 60, the undistorted matte shape 62 can be rendered in 2D space and the reverse of the above distortion calculations can be applied to it. In this way, the matte shape is properly displayed no matter what the current live action lens is doing.
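  • The single-coefficient radial model given for FIG. 7, and its reverse, can be sketched as follows. Two assumptions are made that the text does not state: the radius is measured from the image center in normalized coordinates using the distorted point, and the reverse mapping is computed by fixed-point iteration (one common numerical choice; the disclosure does not specify how the reverse is obtained). Function names are hypothetical.

```python
def undistort(x_d, y_d, k1):
    """Remove radial distortion: scale the distorted point by (1 + K1 * radius^2)."""
    r2 = x_d * x_d + y_d * y_d  # radius^2 of the distorted point (assumed convention)
    scale = 1.0 + k1 * r2
    return x_d * scale, y_d * scale


def redistort(x_u, y_u, k1, iterations=20):
    """Re-apply distortion by numerically inverting `undistort`.

    Starts from the undistorted point and repeatedly divides by the scale
    factor evaluated at the current estimate. This converges for the mild
    distortion values used here; strong distortion may need more care.
    """
    x_d, y_d = x_u, y_u
    for _ in range(iterations):
        r2 = x_d * x_d + y_d * y_d
        scale = 1.0 + k1 * r2
        x_d, y_d = x_u / scale, y_u / scale
    return x_d, y_d


k1 = 0.1                      # illustrative value, not a calibrated coefficient
clicked = (0.6, 0.4)          # point selected on the distorted live action image
clean = undistort(*clicked, k1)
drawn = redistort(*clean, k1)
```

  • Removing and then re-adding distortion returns the clicked location, which is why the rendered matte lines up with the points the user selected.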
  • The data flow of the system is illustrated in FIG. 8. A number of the processing steps described in earlier figures are combined into the video processing system 150. A scene camera 130 transmits a live action image 20 to 2D compositor 180. Camera tracking system 140 measures and transmits camera data 160 to data combiner 300. Lens 190 transmits lens position data 200 to the lens calibration table 210. Lens calibration table 210 looks up the appropriate lens data 230 and transmits that data to data combiner 300. Data combiner 300 then transmits the combined camera and lens data 310 to the 3D renderer 290, to distortion removal processor 260, to 2D-to-3D converter 270, and to distortion addition processor 170.
  • The user clicks perimeter points 40 and perimeter boundary points 42 on the user interface 240, which transmits these points to the distortion removal processor 260. Using the combined data 310, which includes lens data 230, the distortion removal processor 260 creates a set of undistorted points 44 that are transmitted to the 2D-to-3D converter 270. The distortion removal processor 260 can use the distortion algorithms mentioned with respect to FIG. 7. Using the current camera and lens data contained in combined data 310, the 2D-to-3D converter 270 calculates the matte shape 60 and sends it to 3D renderer 290. The calculation of the 3D matte geometry based on the undistorted points 44 and the combined data 310 can use projection geometry calculations that are well known to those skilled in the art.
  • 3D renderer 290 can use matte shape 60 and the combined camera and lens data 310 to place a virtual camera 70 and frustum 80. The 3D renderer generates a 2D undistorted matte shape 62. The creation of a 2D undistorted shape from 3D geometry is essentially the reverse of the 2D-to-3D conversion mentioned in the previous paragraph, and is well known to those skilled in the art. The 3D renderer 290 then sends the 2D undistorted matte shape 62 to the distortion addition processor 170. The distortion addition processor 170, using the lens data 230 contained in combined data 310, creates a distorted 2D matte image 175 and sends it to 2D compositor 180. The calculations to add this distortion can be the same as described for FIG. 7.
  • A goal of this 2D-to-3D and 3D-to-2D conversion is to allow the user to select and manipulate points on a 2D user interface 240 containing live action image 20 and have those points generate a correct matte shape 60. When that shape is rendered with the same lens distortion as the live action image 20, the resulting matte image 175 lines up with the original perimeter points 40 selected by the user. Otherwise, the matte image 175 would appear in a different place than that selected, and this would be a frustrating interface for the user.
  • The same rendering and distortion addition process can be used to create virtual background scenes that will be combined with the live action image 20 in the 2D compositor 180. Background scene geometry 320 from an external 3D content creation software program such as Maya is loaded into the 3D renderer 290, which generates an undistorted background image 340. This is sent to the distortion addition processor 170, which then applies the same lens distortion addition used for the matte image 175 to result in background image 185.
  • 2D compositor 180 uses the matte image 175 to selectively process portions of the live action image 20 in combination with background image 185 to generate a composited image 320. (The composited image 320 can be delivered in the form of a live action actor placed into a virtual background, for example.) Because of the correct removal, rendering, and addition of lens distortion information, the user simply clicks on perimeter points 40 and they appear correctly on the screen of the user interface 240 in the expected position. This is because they have been correctly converted to accurate 3D spatial coordinates and re-drawn with matching lens and camera data. Thus, the convenience of 2D drawn mattes is preserved, while operating in a fully-tracked 3D world, which is needed for complex real-time visual effects.
  • According to one program of a system of this disclosure, the following prompts are provided to the user at the user interface: selectable and draggable points that overlay a live action image. An alternative program provides the following prompts: numerical XYZ entry fields for direct input of 3D coordinate points.
  • The resulting drawn or surveyed mattes can be used in a variety of manners. The mattes can be used as a garbage matte or a despill matte.
  • Garbage mattes are used to completely remove unwanted sections (like a hanging microphone in front of the green screen) of the live action image. The garbage mattes replace that part of the live action image with the computer-generated image underneath.
  • On the other hand, despill mattes are used to preserve part of the foreground image from being keyed (the green area made transparent), while still “clamping” the green (limiting the green level to the largest of either the red or the blue levels) to remove the greenish cast that otherwise permeates the image from the light reflected off the green screen. An example is a green screen placed outside a window, where the green reflects onto a glass table indoors, making it green. A despill matte removes the green tinge from the glass top, but without making it transparent. That is, a despill matte defines the part of the live action foreground to which only the despill process is applied, as opposed to the keying process, both of which are well known to practitioners in the art.
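  • The difference between the two matte types can be shown per pixel. This sketch assumes RGB values in the range 0 to 1 and a foreground opacity `key_alpha` supplied by a separate keyer; the function names, the parameters, and the simple linear blend are hypothetical illustrations rather than the disclosed compositor.

```python
def despill(r, g, b):
    """Clamp green to the larger of red and blue, leaving red and blue untouched."""
    return (r, min(g, max(r, b)), b)


def composite_pixel(live, background, in_garbage_matte, in_despill_matte, key_alpha):
    """Combine one live action pixel with one background pixel.

    - Inside a garbage matte the live pixel is discarded entirely.
    - Inside a despill matte the live pixel stays opaque but is despilled.
    - Elsewhere the keyer's opacity `key_alpha` blends live over background.
    """
    if in_garbage_matte:
        return background
    if in_despill_matte:
        return despill(*live)
    return tuple(key_alpha * f + (1.0 - key_alpha) * b
                 for f, b in zip(live, background))


greenish_glass = (0.2, 0.9, 0.4)
virtual_bg = (0.0, 0.0, 1.0)
kept = composite_pixel(greenish_glass, virtual_bg, False, True, 0.0)
removed = composite_pixel(greenish_glass, virtual_bg, True, False, 1.0)
```

  • In the despill case the glass table keeps its opacity even though the keyer would have made it transparent, and only its green level is reduced.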
  • An alternative embodiment is the creation of the ‘holdout’ matte. This is typically based on live action objects in the scene, and is used to force virtual objects to be behind the live action objects, or to enable virtual objects to cast virtual shadows on live action objects. This is the area of use most likely for 3D mattes generated from natural feature tracking.
  • In addition, the 3D objects that are used to describe the matte positions can be saved and exported to external applications for post-production. They can be saved into a Collada or other 3D file format that is easily imported into other standard visual effects applications.
  • Alternative embodiments include using the mattes to drive a color grading process, so that the matte defines the part of the image to which to apply a color transformation. In this way, the process of correcting images manually, shot by shot, can be heavily automated.
  • Additional alternative embodiments include the automated movement of different points in the matte according to different tracking points from a 3D tracking system, or using facial tracking connected to the main camera to drive the matte tracking to only track facial features.
  • Thus, systems of the present disclosure have many unique advantages such as those discussed immediately below. The artist can edit the 3D points by dragging them around in a 2D interface, while preserving their location on their original 2D plane. This gives the artist fast interaction, while avoiding confusing “out of plane” geometry. Using a 2D interface can be accomplished by real time undistortion and re-distortion, to create correctly matched geometry while providing a convenient, familiar 2D interface. Most compositors only work with 2D, and 3D can be confusing to them. Automatically extending the matte beyond the edges when using the perimeter points allows the compositor to extend the matte without requiring the camera operator to move back and forth. The system allows the mattes to be stored and exported for future use, which is particularly useful for example for the following applications: Nuke, After Effects, Shake, Flame, and Inferno.
  • A system of the present disclosure can include a graphics card or CPU that includes: (a) a distortion removal processor 260 programmed to create a set of undistorted points; (b) a 2D-to-3D converter 270 configured to use the set of undistorted points to calculate 3D matte geometry; (c) a 3D renderer 290 configured to use the 3D matte geometry to generate a 2D undistorted matte shape; (d) a distortion addition processor 170 programmed to use the 2D undistorted matte shape to create a distorted 2D matte shape; and (e) a 2D compositor 180 configured to use the distorted 2D matte shape to combine at least one portion of a live action image with at least one other image to generate a composited image. The composited image can be delivered in the form of a high definition serial digital interface signal to an external recording system. An example of a commercially available graphics card that can be so programmed is the Quadro card available from nVidia Corporation of Santa Clara, Calif.
  • The above-mentioned graphics card or CPU can also include data combiner 300 and lens calibration table 210, or the processes can be divided between a graphics card and a CPU.
  • The present automatic matte tracking system can be based on the prior art Previzion system, which is/was available from Lightcraft Technology of Venice, Calif. The Previzion system includes a camera tracking system, a lens calibration system, a real-time compositing system, and a built-in 3D renderer. The tracking mattes feature adds the ability to hand-draw mattes in 2D on the screen; the system then converts them into 3D space so that they move automatically, in real time, as the camera moves. An example of a publication disclosing the prior art Previzion system is the Previzion product brochure, entitled Previzion Specifications 2011, published on Apr. 8, 2011, the contents of which are incorporated by reference.
  • An embodiment of the present system can be made by modifying the prior art Previzion system by adding a tab to the user interface where the user can create the present matte and adjust it. The prior art Previzion system can be adapted by the addition of the drawable mattes, the computations of their positions and orientations and their adjustments using the saved common plane of the 3D points.
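  • One way an adjustment of the matte's size and distance could keep the outline matched to the live action object is to scale the 3D points about the camera position recorded when the outline was drawn. This is an illustrative assumption rather than a method stated in the disclosure. Under this assumption the matte plane moves parallel to itself, so the saved normal is unchanged, and the outline projects to the same place from the original viewpoint. The names push_matte_plane, view_from and cam_at_draw are hypothetical.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def push_matte_plane(points: List[Vec3], cam_at_draw: Vec3, scale: float) -> List[Vec3]:
    """Scale matte points about the camera position saved at creation time.

    scale > 1 moves the matte further away and enlarges it; scale < 1 brings
    it closer and shrinks it. Rays from cam_at_draw through the points are
    unchanged, so the outline seen from that position is preserved.
    """
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    c = cam_at_draw
    return [(c[0] + scale * (p[0] - c[0]),
             c[1] + scale * (p[1] - c[1]),
             c[2] + scale * (p[2] - c[2])) for p in points]


def view_from(p: Vec3, cam_pos: Vec3) -> Vec2:
    """Pinhole projection (assumed axis-aligned camera looking along +Z)."""
    X, Y, Z = p[0] - cam_pos[0], p[1] - cam_pos[1], p[2] - cam_pos[2]
    return (X / Z, Y / Z)


# Example: double the distance of a matte drawn 5 units from the camera.
near = [(0.5, 0.0, 5.0), (-1.0, 2.0, 5.0)]
far = push_matte_plane(near, (0.0, 0.0, 0.0), 2.0)
```

  • From any other camera position the near and far versions project differently, which is what lets the operator choose the depth at which the matte shows the correct parallax.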
  • Previzion is unique in that the 2D video processing and the 3D rendering are done in the same product. In contrast, most other systems have separate consoles for 2D and 3D, which are used to separately create the 3D background virtual scene and merge it with the 2D live action scene.
  • However, the 3D box that has the tracking matte information can send it to the 2D box, in the form of another 2D video signal that is a black-and-white garbage matte. This would essentially be the 3D box rendering the matte shapes, as it does in the Previzion system, but the final image assembly would be done externally in another system (like an Ultimatte HD, which is available from the Ultimatte Corporation of Chatsworth, Calif.) that takes in both the black/white garbage matte signal and the live action blue or green screen signal.
  • Most Ultimatte/other third party keyers already have a live input for the garbage matte signal, so it is straightforward to interface the tracking garbage mattes of the present disclosure to external keyers. However, the 2D Ultimatte system has no user interface that can select points that are connected to the separate 3D rendering system, such as is described here.
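  • The black-and-white garbage matte signal described above can be pictured as a rasterized version of the projected outline. The sketch below is a hedged illustration only: it fills a polygon with an even-odd test at pixel centers and then forces a keyer's alpha to zero inside the matte. The white-inside polarity, the 8-bit value range, and the function names (point_in_polygon, render_garbage_matte, apply_garbage_matte) are assumptions, and no real keyer's interface is modeled.

```python
from typing import List, Sequence, Tuple

Vec2 = Tuple[float, float]


def point_in_polygon(x: float, y: float, poly: Sequence[Vec2]) -> bool:
    """Even-odd rule: count polygon edges crossed by a ray going in +X."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the scanline, so y1 != y2
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def render_garbage_matte(poly: Sequence[Vec2], width: int, height: int) -> List[List[int]]:
    """Return rows of 0/255 values; 255 marks pixels whose center is inside."""
    return [[255 if point_in_polygon(col + 0.5, row + 0.5, poly) else 0
             for col in range(width)]
            for row in range(height)]


def apply_garbage_matte(alpha: List[List[int]], matte: List[List[int]]) -> List[List[int]]:
    """Zero the keyer alpha wherever the garbage matte is white."""
    return [[0 if m == 255 else a for a, m in zip(alpha_row, matte_row)]
            for alpha_row, matte_row in zip(alpha, matte)]


# Example: a square outline on a tiny 4x4 frame.
matte = render_garbage_matte([(1, 1), (3, 1), (3, 3), (1, 3)], 4, 4)
```

  • In a split 3D/2D arrangement, only the rendered matte frames would need to cross between the two boxes, which is why a keyer with a live garbage matte input is sufficient.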
  • The more complicated uses of the mattes (such as despill and color correction) that are easy to do in Previzion can be re-created with an external keying system. They can be done, for example, by manually tracking points of high contrast in the 2D image in Nuke, available from The Foundry Visionmongers Ltd. of London, UK, or in similar compositing packages, and then creating outlines from these points. This is typically not a real-time process, and requires days or weeks of work for a single shot.
  • Pursuant to an aspect of the present disclosure, what the camera and camera lens are doing is knowable to the present system. Thus, 2D-to-3D and 3D-to-2D conversions can be done quickly while taking lens distortion into account. The distortion removal processor, 2D-to-3D converter, 3D renderer, distortion addition processor, and 2D compositor can all be implemented on a graphics card of the system. A video I/O card handles the video input and output.
  • A program of the present disclosure can be delivered as executable code that is installed on a target system. The same math can work in a browser, as it is largely a matter of geometry and input.
  • Although the inventions disclosed herein have been described in terms of the preferred embodiments above, numerous modifications and/or additions to the above-described preferred embodiments would be readily apparent to one skilled in the art. The embodiments can be defined, for example, as methods carried out by any one, any subset of or all of the components as a system of one or more components in a certain structural and/or functional relationship; as methods of making, installing and assembling; as methods of using; methods of commercializing; as methods of making and using the terminals; as kits of the different components; as an entire assembled workable system; and/or as sub-assemblies or sub-methods. It is intended that the scope of the present inventions extend to all such modifications and/or additions and that the scope of the present inventions is limited solely by the claims set forth below.

Claims (20)

1. A method comprising:
generating an outline around an object, which is to be removed or preserved, in a 2D display of a live action image from a live action camera;
converting the shape defined by the 2D outline into a 3D mesh, wherein the converting includes removing lens distortion;
projecting the 3D mesh from a virtual camera having the same position and orientation as the live action camera;
creating a 2D undistorted matte shape from the 3D mesh;
distorting the 2D undistorted matte shape to match the as-shot, distorted live action image; and
processing, using the 2D distorted matte shape, portions of the live action image with other images to form a composite image.
2. A method comprising:
generating an outline around an object, which is to be removed or preserved, in a 2D display of a live action image from a live action camera;
converting the shape defined by the 2D outline into a 3D mesh;
projecting the 3D mesh from a virtual camera having the same position and orientation as the live action camera;
creating a 2D undistorted matte shape from the 3D mesh; and
processing, using the 2D undistorted matte shape, portions of the live action image with other images to form a composite image.
3. The method of claim 2 wherein the converting includes removing lens distortion.
4. The method of claim 2 wherein the processing includes distorting the 2D undistorted matte shape to match the as-shot, distorted live action image.
5. The method of claim 2 wherein the user interface automatically handles distortion processing of points of the user selected outline points.
6. The method of claim 2 wherein the normal of the direction the camera was facing when the geometry was first created is preserved in the system, so that even after the camera moves, the matte size and 3D position can be adjusted while preserving the correct match of the outline to the live action object.
7. The method of claim 2 wherein the outline generation step includes editing the 3D points by dragging them around in a 2D interface, while preserving their location on their original 2D plane.
8. The method of claim 2 wherein the method includes real time undistortion and re-distortion, to thereby create correctly matched geometry.
9. A video processing system, comprising:
a distortion removal processor programmed to create a set of undistorted points;
a 2D-to-3D converter configured to use the set of undistorted points to calculate 3D matte geometry;
a 3D renderer configured to use the 3D matte geometry to generate a 2D undistorted matte shape;
a distortion addition processor programmed to use the 2D undistorted matte shape to create a distorted 2D matte shape; and
a 2D compositor configured to use the distorted 2D matte shape to combine at least one portion of a live action image with at least one other image to generate a composited image.
10. The video processing system of claim 9 wherein the distortion removal processor uses camera and lens data to create the set of undistorted points.
11. The video processing system of claim 9 wherein the 2D-to-3D converter uses lens data to create the distorted 2D matte shape.
12. The video processing system of claim 9 wherein the 3D renderer uses camera and lens data to generate the undistorted matte shape.
13. The video processing system of claim 9 wherein the distortion removal processor, the 3D renderer and the 2D-to-3D converter are all configured to receive combined data combined from camera data from a camera tracking system and lens data from a lens calibration table.
14. The video processing system of claim 9 wherein the 3D renderer is configured to receive background scene geometry and to generate an undistorted background image, and wherein the distortion addition processor is programmed to receive the undistorted background image and to generate a background image for delivery to the 2D compositor.
15. (canceled)
16. (canceled)
17. (canceled)
18. (canceled)
19. (canceled)
20. (canceled)
US14/344,878 2011-12-01 2012-11-30 Automatic tracking matte system Abandoned US20140218358A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/344,878 US20140218358A1 (en) 2011-12-01 2012-11-30 Automatic tracking matte system

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201161565884P 2011-12-01 2011-12-01
PCT/US2012/067460 WO2013082539A1 (en) 2011-12-01 2012-11-30 Automatic tracking matte system
US14/344,878 US20140218358A1 (en) 2011-12-01 2012-11-30 Automatic tracking matte system

Publications (1)

Publication Number Publication Date
US20140218358A1 (en) 2014-08-07

Family

ID=48536139

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/344,878 Abandoned US20140218358A1 (en) 2011-12-01 2012-11-30 Automatic tracking matte system
US14/209,403 Expired - Fee Related US9014507B2 (en) 2011-12-01 2014-03-13 Automatic tracking matte system

Country Status (6)

Country Link
US (2) US20140218358A1 (en)
EP (1) EP2786303A4 (en)
CN (1) CN104169941A (en)
CA (1) CA2857157A1 (en)
HK (1) HK1204378A1 (en)
WO (1) WO2013082539A1 (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9171379B2 (en) 2012-04-13 2015-10-27 Lightcraft Technology Llc Hybrid precision tracking
NO20140637A1 (en) * 2014-05-21 2015-11-23 The Future Group As Virtual protocol
JP6564295B2 (en) * 2015-10-20 2019-08-21 オリンパス株式会社 Composite image creation device
US10372228B2 (en) 2016-07-20 2019-08-06 Usens, Inc. Method and system for 3D hand skeleton tracking
CN106295525B (en) * 2016-07-29 2019-12-03 深圳迪乐普数码科技有限公司 A kind of video image method of replacing and terminal
CN107707843A (en) * 2016-08-09 2018-02-16 深圳泰和创邦文化传媒有限公司 A kind of Video Composition and the method for imprinting
WO2018045532A1 (en) * 2016-09-08 2018-03-15 深圳市大富网络技术有限公司 Method for generating square animation and related device
CN106485788B (en) * 2016-10-21 2019-02-19 重庆虚拟实境科技有限公司 A kind of game engine mixed reality image pickup method
US20190279428A1 (en) * 2016-11-14 2019-09-12 Lightcraft Technology Llc Team augmented reality system
CN107888811B (en) * 2017-11-22 2018-09-14 黄海滨 It is automatic stingy as camera based on human body attitude
US11067805B2 (en) * 2018-04-19 2021-07-20 Magic Leap, Inc. Systems and methods for operating a display system based on user perceptibility
DE102018118187A1 (en) * 2018-07-27 2020-01-30 Carl Zeiss Ag Process and data processing system for the synthesis of images
CN111178127B (en) * 2019-11-20 2024-02-20 青岛小鸟看看科技有限公司 Method, device, equipment and storage medium for displaying image of target object
CN113253890B (en) * 2021-04-02 2022-12-30 中南大学 Video image matting method, system and medium
CN113244616B (en) * 2021-06-24 2023-09-26 腾讯科技(深圳)有限公司 Interaction method, device and equipment based on virtual scene and readable storage medium
CN114417075B (en) * 2022-03-31 2022-07-08 北京优锘科技有限公司 Method, device, medium and equipment for establishing path-finding grid data index

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5714997A (en) * 1995-01-06 1998-02-03 Anderson; David P. Virtual reality television system
US6134346A (en) * 1998-01-16 2000-10-17 Ultimatte Corp Method for removing from an image the background surrounding a selected object
US6208347B1 (en) * 1997-06-23 2001-03-27 Real-Time Geometry Corporation System and method for computer modeling of 3D objects and 2D images by mesh constructions that incorporate non-spatial data such as color or texture
US20020024520A1 (en) * 2000-08-31 2002-02-28 Hong-Yang Wang Method of using a 3D polygonization operation to make a 2D picture show facial expression
US6445810B2 (en) * 1997-08-01 2002-09-03 Interval Research Corporation Method and apparatus for personnel detection and tracking
US20050168485A1 (en) * 2004-01-29 2005-08-04 Nattress Thomas G. System for combining a sequence of images with computer-generated 3D graphics
US20070065002A1 (en) * 2005-02-18 2007-03-22 Laurence Marzell Adaptive 3D image modelling system and apparatus and method therefor
US20070098296A1 (en) * 2005-10-29 2007-05-03 Christophe Souchard Estimating and removing lens distortion from scenes
US20080278479A1 (en) * 2007-05-07 2008-11-13 Microsoft Corporation Creating optimized gradient mesh of a vector-based image from a raster-based image
US20090202114A1 (en) * 2008-02-13 2009-08-13 Sebastien Morin Live-Action Image Capture
US20090209343A1 (en) * 2008-02-15 2009-08-20 Eric Foxlin Motion-tracking game controller
US20100189342A1 (en) * 2000-03-08 2010-07-29 Cyberextruder.Com, Inc. System, method, and apparatus for generating a three-dimensional representation from one or more two-dimensional images
US20110038536A1 (en) * 2009-08-14 2011-02-17 Genesis Group Inc. Real-time image and video matting
US20110128286A1 (en) * 2009-12-02 2011-06-02 Electronics And Telecommunications Research Institute Image restoration apparatus and method thereof
US20110128377A1 (en) * 2009-11-21 2011-06-02 Pvi Virtual Media Services, Llc Lens Distortion Method for Broadcast Video
US20120163656A1 (en) * 2005-12-15 2012-06-28 Trimble Navigation Limited Method and apparatus for image-based positioning
US20120183238A1 (en) * 2010-07-19 2012-07-19 Carnegie Mellon University Rapid 3D Face Reconstruction From a 2D Image and Methods Using Such Rapid 3D Face Reconstruction
US8339418B1 (en) * 2007-06-25 2012-12-25 Pacific Arts Corporation Embedding a real time video into a virtual environment
US8411931B2 (en) * 2006-06-23 2013-04-02 Imax Corporation Methods and systems for converting 2D motion pictures for stereoscopic 3D exhibition

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6342884B1 (en) 1999-02-03 2002-01-29 Isurftv Method and apparatus for using a general three-dimensional (3D) graphics pipeline for cost effective digital image and video editing, transformation, and representation
GB0114157D0 (en) 2001-06-11 2001-08-01 Canon Kk 3D Computer modelling apparatus
US6809745B1 (en) * 2001-10-01 2004-10-26 Adobe Systems Incorporated Compositing two-dimensional and 3-dimensional images
US7003161B2 (en) * 2001-11-16 2006-02-21 Mitutoyo Corporation Systems and methods for boundary detection in images
US20030202120A1 (en) * 2002-04-05 2003-10-30 Mack Newton Eliot Virtual lighting system
US6974373B2 (en) * 2002-08-02 2005-12-13 Geissler Technologies, Llc Apparatus and methods for the volumetric and dimensional measurement of livestock
GB2392072B (en) 2002-08-14 2005-10-19 Autodesk Canada Inc Generating Image Data
US7542034B2 (en) * 2004-09-23 2009-06-02 Conversion Works, Inc. System and method for processing video images
US20060165310A1 (en) 2004-10-27 2006-07-27 Mack Newton E Method and apparatus for a virtual scene previewing system
US20070248283A1 (en) * 2006-04-21 2007-10-25 Mack Newton E Method and apparatus for a wide area virtual scene preview system
US7854518B2 (en) 2006-06-16 2010-12-21 Hewlett-Packard Development Company, L.P. Mesh for rendering an image frame
US7692647B2 (en) 2006-09-14 2010-04-06 Microsoft Corporation Real-time rendering of realistic rain
US7764286B1 (en) * 2006-11-01 2010-07-27 Adobe Systems Incorporated Creating shadow effects in a two-dimensional imaging space
US8059894B1 (en) * 2006-12-19 2011-11-15 Playvision Technologies, Inc. System and associated methods of calibration and use for an interactive imaging environment
KR100894874B1 (en) 2007-01-10 2009-04-24 주식회사 리얼이미지 Apparatus and Method for Generating a Stereoscopic Image from a Two-Dimensional Image using the Mesh Map
US20080252746A1 (en) * 2007-04-13 2008-10-16 Newton Eliot Mack Method and apparatus for a hybrid wide area tracking system
US8031210B2 (en) * 2007-09-30 2011-10-04 Rdv Systems Ltd. Method and apparatus for creating a composite image
US7999862B2 (en) * 2007-10-24 2011-08-16 Lightcraft Technology, Llc Method and apparatus for an automated background lighting compensation system
US8824861B2 (en) * 2008-07-01 2014-09-02 Yoostar Entertainment Group, Inc. Interactive systems and methods for video compositing
EP2460054A4 (en) * 2009-07-31 2013-03-06 Lightcraft Technology Llc Methods and systems for calibrating an adjustable lens
JP2011101752A (en) * 2009-11-11 2011-05-26 Namco Bandai Games Inc Program, information storage medium, and image generating device
US20140218358A1 (en) 2011-12-01 2014-08-07 Lightcraft Technology, Llc Automatic tracking matte system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sony Eye Cam 2007, (Sony Eye Cam specification, describing an adjustable lens, Dec. 30 2010, Archive of amazon.com product sale page) *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10015478B1 (en) 2010-06-24 2018-07-03 Steven M. Hoffberg Two dimensional to three dimensional moving image converter
US11470303B1 (en) 2010-06-24 2022-10-11 Steven M. Hoffberg Two dimensional to three dimensional moving image converter
US9014507B2 (en) 2011-12-01 2015-04-21 Lightcraft Technology Llc Automatic tracking matte system
US10164776B1 (en) 2013-03-14 2018-12-25 goTenna Inc. System and method for private and point-to-point communication between computing devices
WO2018089041A1 (en) * 2016-11-14 2018-05-17 Lightcraft Technology Llc Team augmented reality system

Also Published As

Publication number Publication date
EP2786303A1 (en) 2014-10-08
CN104169941A (en) 2014-11-26
US9014507B2 (en) 2015-04-21
HK1204378A1 (en) 2015-11-13
WO2013082539A1 (en) 2013-06-06
EP2786303A4 (en) 2015-08-26
CA2857157A1 (en) 2013-06-06
US20140192147A1 (en) 2014-07-10

Legal Events

Date Code Title Description
AS Assignment

Owner name: LIGHTCRAFT TECHNOLOGY LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MACK, NEWTON ELIOT;MASS, PHILIP R;REEL/FRAME:035182/0223

Effective date: 20150310

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION