CN100413327C - A video object annotation method based on contour spatio-temporal features - Google Patents

A video object annotation method based on contour spatio-temporal features

Info

Publication number
CN100413327C
CN100413327C CNB2006100533980A CN200610053398A
Authority
CN
China
Prior art keywords
key frame
color
frame
background
foreground
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100533980A
Other languages
Chinese (zh)
Other versions
CN1997114A (en)
Inventor
Zhuang Yueting (庄越挺)
Dong Zhaohua (董兆华)
Xiao Jun (肖俊)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CNB2006100533980A priority Critical patent/CN100413327C/en
Publication of CN1997114A publication Critical patent/CN1997114A/en
Application granted granted Critical
Publication of CN100413327C publication Critical patent/CN100413327C/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

This invention discloses a method for annotating foreground objects in video, comprising the following steps: (a) divide a video sequence into several segments, each containing one key frame and several non-key frames; (b) for each key frame, ask the user to provide hints designating key parts of the foreground and background objects, and label the key frame accordingly; (c) for each non-key frame, label it according to the key-frame results, taking the color distribution and shape of the foreground and the color of the background as known information.

Description

A video object annotation method based on contour spatio-temporal features
Technical field
The present invention relates to the field of video processing, and in particular to a video object annotation method based on contour spatio-temporal features.
Background technology
With the popularization of digital cameras and camcorders, interactive image and video processing has become a very active frontier research direction. Efficiently extracting the foreground of a video by interactive means, then compositing the extracted foreground into a new video sequence or applying operations such as cartoon-style editing to it, has become an important technique in the video field.
A video consists of successive image frames, and segmenting foreground from background enables interactive operations on every frame of the video stream. One could apply a still-image foreground extraction method frame by frame to obtain the foreground and background of each frame, and hence of the whole video. This approach suffers from two problems. First, it requires a large amount of tedious repeated work, since the user must interactively mark foreground and background on every frame. Second, it treats each frame independently, ignoring the temporal continuity between frames, so even small differences between consecutive frames produce visible jumps.
If object motion in the video stream could be tracked accurately, foreground and background could be extracted interactively on key frames only, and the interactive knowledge could then be propagated to the non-key frames via the motion-tracking results, extracting their foreground and background automatically. Hertzmann et al. use optical-flow estimation to track object motion [3], but current optical-flow estimation is not robust enough on ordinary video, so it cannot by itself recover the foreground on non-key frames; the flow estimate can, however, serve as a constraint that dynamically updates the tracking process. Following this idea, the present invention proposes a method that extracts foreground and background interactively on key frames, and labels non-key frames using their temporal correlation with the key frame and their own spatial coherence.
A large body of work performs foreground extraction on the basis of contour extraction. Hall et al. proposed a user-supervised contour extraction method [2] in which the user outlines the foreground object in some frames and the contours of the remaining frames are obtained by interpolation, yielding the foreground regions. This method requires considerable manual outlining, and for fast-moving video either more frames must be outlined by hand or the interpolation error on intermediate frames grows. Agarwala et al. proposed a contour extraction method based on optimization and user interaction [1], which reduces the amount of interaction. These methods are nevertheless limited: they represent the object shape by an approximate silhouette, so objects with rich edge detail easily lose that detail, and they require a clear boundary between foreground object and background.
Other work performs foreground extraction on the basis of partitioning the object into blocks. Wang et al. proposed an interactive graph-cut video foreground extraction method [6] that pre-segments the images with Mean-Shift to reduce the number of parts. They add a local cost function to the global cost function, statistically modeling the background and the regions straddling the marked boundary, and then apply a min-cut. Li et al.'s algorithm [4] also extracts video objects with a graph-cut algorithm; it combines the color correlation of each part with the foreground and background color distributions on the key frame, maximizes the color difference between adjacent regions straddling the object edge, and also takes the temporal correlation of object motion into account. However, when the object color is similar to the surrounding background color, both methods misjudge the edge.
[1] A. Agarwala, A. Hertzmann, D. H. Salesin, and S. M. Seitz. Keyframe-Based Tracking for Rotoscoping and Animation. In Proceedings of ACM SIGGRAPH 2004. 2004. pp. 584-591
[2] J. Hall, D. Greenhill and G. Jones. Segmenting Film Sequences using Active Surfaces. In International Conference on Image Processing (ICIP). 1997. pp. 751-754
[3] A. Hertzmann and K. Perlin. Painterly Rendering for Video and Interaction. In Proceedings of the 1st International Symposium on Non-photorealistic Animation and Rendering. 2000. pp. 7-12
[4] Y. Li, J. Sun and H. Y. Shum. Video Object Cut and Paste. In Proceedings of ACM SIGGRAPH 2005. 2005. pp. 595-600
[5] L. Vincent and P. Soille. Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations. IEEE Trans. on Pattern Analysis and Machine Intelligence. 1991. 13(6), pp. 583-598
[6] J. Wang, P. Bhat, R. A. Colburn, M. Agrawala and M. F. Cohen. Interactive Video Cutout. In Proceedings of ACM SIGGRAPH 2005. 2005. pp. 585-594
Summary of the invention
The purpose of this invention is to provide a video object annotation method based on contour spatio-temporal features.
It comprises the following steps:
(1) Divide a video into several segments, each containing several frames, of which one is a key frame and the rest are non-key frames;
(2) For the key frame, require the user to input hints designating key parts of the foreground and background objects, then label the key frame, determining the membership of each part on the frame;
(3) For each non-key frame, label it according to the key-frame results, using the color distribution and shape information of the foreground and the color information of the background.
Dividing a video into segments: the number of frames in each segment is inversely proportional to the speed of object motion in the video. When objects move quickly each segment contains few frames; when they move slowly each segment contains many frames.
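As a concrete illustration of this inverse relation, the segment length could be picked from an estimated motion speed as sketched below. The base constant and the clamping bounds are illustrative assumptions, not values given in the patent.

```python
def frames_per_segment(motion_speed, base=30.0, min_len=4, max_len=30):
    """Frames per video segment, inversely proportional to motion speed.

    motion_speed: average per-frame displacement magnitude (pixels/frame).
    base, min_len, max_len: hypothetical tuning constants (assumptions).
    """
    if motion_speed <= 0:
        return max_len                  # static scene: longest segments
    n = int(round(base / motion_speed))
    return max(min_len, min(max_len, n))
```

For example, at 3 pixels/frame this gives 10-frame segments, which matches the segment length used in the embodiments below.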
For the key frame, requiring the user to input hints designating key parts of the foreground and background objects: on the key-frame image the user outlines foreground or background with the mouse, drawing points, line segments and polygons. These points, line segments and polygons are hard constraints of the key-frame labeling, that is, the foreground or background membership of these parts cannot change during labeling;
Labeling the key frame, determining the membership of each part on the frame, comprises the following steps:
(1) Preprocess the image with the immersion watershed algorithm, grouping adjacent pixels whose colors differ within a threshold into the same region;
(2) For each region, take the average color of the region as its region color;
(3) Cluster the color values of the pixels on the user-designated foreground and background, obtaining a set of background color centers and foreground color centers;
(4) Define the data difference of a region block as the minimum distance between its region color and the foreground (or background) color centers, and define the difference between adjacent region blocks as the distance between their colors;
(5) From the data differences and the differences between adjacent blocks, construct a graph for graph cutting with each region block as a node, then compute a min-cut on the graph, obtaining a near-optimal labeling of the image.
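A minimal sketch of steps (1)-(2): the patent preprocesses with the immersion watershed of Vincent and Soille, but as a simpler stand-in (an assumption, not the patented preprocessing) one can merge adjacent pixels whose colors differ by less than a threshold into region blocks with union-find:

```python
import numpy as np

def presegment(img, thresh=20.0):
    """Group 4-adjacent pixels whose colors differ by < thresh into regions."""
    h, w = img.shape[:2]
    parent = list(range(h * w))

    def find(a):                      # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    img = img.astype(float)
    for y in range(h):
        for x in range(w):
            if x + 1 < w and np.linalg.norm(img[y, x] - img[y, x + 1]) < thresh:
                union(y * w + x, y * w + x + 1)
            if y + 1 < h and np.linalg.norm(img[y, x] - img[y + 1, x]) < thresh:
                union(y * w + x, (y + 1) * w + x)
    roots = np.array([find(i) for i in range(h * w)])
    _, labels = np.unique(roots, return_inverse=True)   # relabel to 0..n-1
    return labels.reshape(h, w)
```

On a synthetic image whose left half is black and right half is white, this produces exactly two region blocks; the per-region mean color of step (2) is then a simple masked average.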
For non-key frames, labeling according to the key-frame results, the color distribution and shape information of the foreground, and the color information of the background comprises the following steps:
(1) According to the key-frame labeling, cluster the colors of foreground and background; this clustering will be used in the data differences of the non-key frames;
(2) From the key-frame labeling, obtain the contour of the foreground object; use the belief propagation algorithm to estimate the object motion within a certain motion range, obtaining the approximate position of the object contour on the non-key frame; this contour information supplements the adjacent-block differences;
(3) From the data differences and the adjacent-block differences, construct a graph on the non-key frame with each region block as a node, compute a min-cut on this graph, and obtain the labeling of the non-key frame.
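The color clustering shared by key-frame steps (3)-(4) above and by step (1) here can be sketched with a plain Lloyd k-means; this is an assumption, since the patent does not name the clustering algorithm it uses:

```python
import numpy as np

def kmeans(colors, k, iters=20, seed=0):
    """Cluster seed-pixel colors into k color centers (naive Lloyd k-means)."""
    colors = np.asarray(colors, dtype=float)
    rng = np.random.default_rng(seed)
    centers = colors[rng.choice(len(colors), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(colors[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = colors[assign == j].mean(axis=0)
    return centers

def data_difference(region_color, centers):
    """Minimum distance from a region's mean color to a set of class centers."""
    centers = np.asarray(centers, dtype=float)
    c = np.asarray(region_color, dtype=float)
    return float(np.linalg.norm(centers - c, axis=1).min())
```

A reddish region block then scores much closer to the foreground centers of a red/green seed set than to blue background centers, which is exactly the comparison the data difference relies on.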
Beneficial effects of the invention
Existing video foreground labeling methods commonly mark the foreground object edge inaccurately when the foreground and background colors are similar. The present invention uses the belief propagation algorithm to transfer the key-frame interaction information and the foreground shape to the non-key frames, and jointly considers, on each non-key frame, the color correlation of each part (region block) with the foreground and background distributions, the color difference between adjacent regions, and the shape information when solving for the non-key-frame labeling. Experimental results show that the invention resolves the problem of inaccurate labeling of foreground object edges.
Description of drawings
Fig. 1 is a flowchart of the video object annotation method based on contour spatio-temporal features; the three boxes indicate the three steps of the invention, whose inputs are the video sequence and the user interaction on the key frame;
Fig. 2 shows the construction of the graphs for graph cutting on the key frame and the non-key frames of the present invention; the solid box is the graph on the key frame and the dashed box the graph on a non-key frame;
Fig. 3 shows the message-passing process in the Markov network;
Fig. 4 illustrates the computation of contour-edge curvature in the present invention;
Fig. 5 shows the user interaction on a key frame of the present invention and the labeling result;
Fig. 6 shows the contour motion-estimation result and the labeling result of the present invention;
Fig. 7(a) is the video foreground labeling result of Li et al.'s method,
Fig. 7(b) is the labeling result of the present invention;
Fig. 8 compares the labeling results of the present invention and of Li et al.'s method; the first row is the original video sequence, the middle row the result of Li et al.'s method, and the last row the result of the present invention.
Embodiment
The present invention uses a graph-cut algorithm to label the foreground on key frames and non-key frames. Since the labeling is binary, define the label set X = {0, 1}, where 0 denotes background and 1 denotes foreground. A 2D graph is constructed on each image, as shown in Fig. 2; the solid box represents the key frame and the dashed box an intermediate frame. Let the 2D graph be G = {V, ε}, where V is the set of regions on the image and ε the set of edges connecting the regions and the label terminals. For brevity, Fig. 2 omits some of the edges between regions and label terminals.
To improve processing speed, every frame of the video is first preprocessed with the watershed algorithm [5], partitioning it into small region blocks. The watershed algorithm over-segments, so it preserves object contours well; the nodes shown in Fig. 2 are therefore not pixels but these over-segmented regions. For the key frame, solving the labeling problem amounts to minimizing the Gibbs energy E(X):
E(X) = \sum_{i \in V} E_d(x_i) + \alpha \sum_{(i,j) \in \varepsilon} E_l(x_i, x_j)    (1)
where E_d(x_i) is the data term, namely the correlation of region i's average color with the color distributions of foreground and background, and E_l(x_i, x_j) is the color difference between two regions i and j straddling the object edge. α is a tuning parameter balancing the two terms in the whole energy function; this work takes α = 1.5. α can be set empirically: for video with clearly visible object contours it can be set smaller, and for video whose background and foreground colors are similar it can be set larger. The functions in formula (1) are defined as follows:
E_d(x_i = 1) = 0,\quad E_d(x_i = 0) = \infty \quad \forall i \in F
E_d(x_i = 1) = \infty,\quad E_d(x_i = 0) = 0 \quad \forall i \in B
E_d(x_i = 1) = \frac{d_i^F}{d_i^F + d_i^B},\quad E_d(x_i = 0) = \frac{d_i^B}{d_i^F + d_i^B} \quad \forall i \notin F \cup B    (2)
E_l(x_i, x_j) = |x_i - x_j| \, e^{-\alpha \|c_i - c_j\|^2}    (3)
where d_i^F = \min_m \|c_i - K_m^F\|, d_i^B = \min_n \|c_i - K_n^B\|, and \|\cdot\| denotes Euclidean distance. F is the set of foreground seed points designated by the user, B the set of background seed points, c_i the average color of region i, K_m^F the m-th color center obtained by clustering the foreground seed points, and K_n^B the n-th color center obtained by clustering the background seed points. E_l(x_i, x_j) is 0 when two adjacent regions are given the same label, i.e. belong to the same object; it takes a nonzero value only when the two regions are labeled differently, i.e. when the object boundary passes between them.
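Equations (2)-(3) can be written out directly. The following is a minimal sketch under the definitions above: seed regions get hard 0/infinity costs, and all other regions are scored by their distance to the nearest color center of each class.

```python
import numpy as np

INF = float("inf")

def E_d(label, region_color, fg_centers, bg_centers, in_F=False, in_B=False):
    """Data term of equation (2); label is 1 for foreground, 0 for background."""
    if in_F:                                   # hard foreground seed
        return 0.0 if label == 1 else INF
    if in_B:                                   # hard background seed
        return 0.0 if label == 0 else INF
    c = np.asarray(region_color, dtype=float)
    dF = np.linalg.norm(np.asarray(fg_centers, float) - c, axis=1).min()
    dB = np.linalg.norm(np.asarray(bg_centers, float) - c, axis=1).min()
    return dF / (dF + dB) if label == 1 else dB / (dF + dB)

def E_l(xi, xj, ci, cj, alpha=1.5):
    """Link term of equation (3): zero for equal labels, small across strong edges."""
    diff = np.asarray(ci, dtype=float) - np.asarray(cj, dtype=float)
    return abs(xi - xj) * float(np.exp(-alpha * np.dot(diff, diff)))
```

Note how the link term vanishes for equal labels and decays with the squared color difference, so cutting between similar-colored neighbors is expensive while cutting across a strong color edge is cheap.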
Since the intermediate non-key frames themselves carry no interaction information that can be used directly, the belief propagation algorithm (introduced below) is used to estimate the motion of the user interaction information on the key frame, yielding approximate user interaction information on the intermediate frames. This interaction information is estimated in the same way as pixels off the contour edge: its observation function depends only on brightness and its potential function only on spatial continuity, so the whole motion-estimation process is the same as the contour tracking described below except that λ_G and λ_C in formulas (11) and (12) are set to 0, disabling the gradient and curvature terms. Numerically, the propagated interaction information yields the function E_d; the function E_l is obtained exactly as on the key frame. However, to exploit the temporal continuity between video frames, the energy function of a non-key frame differs from that of the key frame. It can be expressed as:
E(X) = \sum_{i \in V} E_d(x_i) + \alpha \sum_{(i,j) \in \varepsilon} E_l(x_i, x_j) + \beta \sum_{(i,j) \in \varepsilon} E_s(x_i, x_j)    (4)
Comparing formulas (1) and (4), the energy function on a non-key frame adds a shape-constraint term E_s(x_i, x_j). A contour tracking algorithm based on shape features computes, from the foreground contour features on the key frame and the temporal continuity between consecutive frames, the approximate position of the object contour on the non-key frame, from which E_s(x_i, x_j) is obtained. The shape-feature-based contour tracking algorithm is described in detail below.
The present invention minimizes formulas (1) and (4) with the max-flow algorithm proposed by Boykov et al., an approximately globally optimal method suitable for image labeling problems.
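The patent relies on the Boykov max-flow algorithm; as a stand-in for illustration only, here is textbook Edmonds-Karp max-flow / min-cut on a tiny adjacency-matrix graph. Nodes are region blocks plus a source (foreground terminal) and a sink (background terminal); after the cut, blocks still reachable from the source take the foreground label.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on capacity matrix cap; returns (flow, fg mask)."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            break
        # bottleneck capacity along the path, then augment
        v, bott = t, float("inf")
        while v != s:
            u = parent[v]
            bott = min(bott, cap[u][v] - flow[u][v])
            v = u
        v = t
        while v != s:
            u = parent[v]
            flow[u][v] += bott
            flow[v][u] -= bott
            v = u
        total += bott
    # nodes reachable from s in the residual graph form the source side of the cut
    seen = [False] * n
    seen[s] = True
    q = deque([s])
    while q:
        u = q.popleft()
        for v in range(n):
            if not seen[v] and cap[u][v] - flow[u][v] > 0:
                seen[v] = True
                q.append(v)
    return total, seen
```

On a 4-node example (source 0, two region blocks 1 and 2, sink 3) with terminal weights playing the role of E_d, the min-cut assigns block 1 to the foreground side and block 2 to the background side.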
The present invention reduces the spatio-temporal characteristics of the object contour to four: brightness preservation, gradient preservation, spatial continuity, and curvature preservation. The constraints implied by these four spatio-temporal characteristics guide the contour tracking, and the belief propagation algorithm is used to approximately reason about the dynamic changes of these spatio-temporal constraints.
Solving for the object motion means assigning labels to the motion so that the posterior probability P(X|Y) is maximized, where X = {x_i} is the label set, x_i = (u_i, v_i), with u and v denoting the horizontal and vertical displacements, and Y = {I, I'} are the observed key frame and non-key frame. A Markov network is constructed as shown in Fig. 3. The posterior probability P(X|Y) can be expressed as:
P(X|Y) \propto \prod_i \varphi_i(x_i, y_i) \prod_i \prod_{j \in N(i)} \psi_{ij}(x_i, x_j)    (5)
\varphi_i(x_i, y_i) is the observation function, used to compute the probability P(y_i | x_i); \psi_{ij}(x_i, x_j) is the potential function, used to measure the compatibility of the labels of adjacent nodes.
Markov theory holds that in a Markov random field the conditional probability of a node is influenced only by its neighbors. The main purpose of belief propagation is to pass messages between adjacent nodes on a four-connected graph; each message is a vector over the possible labels. Let m_{ij}^t be the message node i sends to node j at time t, m_i^t the local evidence of node i at time t, and b_i the belief of node i. Belief propagation is an iterative algorithm; each iteration proceeds as:
m_{ij}^{t+1}(x_j) = \frac{1}{Z} \max_{x_i} \psi_{ij}(x_i, x_j) \, m_i^t(x_i) \prod_{k \in N(i) \setminus j} m_{ki}^t(x_i)    (6)
m_i^t(x_i) is identical at every iteration; its value is \varphi_i(x_i, y_i). N(i) \setminus j denotes the set of nodes adjacent to node i other than j, and Z is a normalization constant. The final belief is:
b_i(x_i) = \frac{1}{Z} m_i(x_i) \prod_{j \in N(i)} m_{ji}(x_i)    (7)
and the label value is:
x_i = \arg\max_{x_k} b_i(x_k)    (8)
In a numerical implementation, the multiplications in formulas (6) and (7) are too expensive, so the computation is transformed into log space, giving:
m_{ij}^{t+1}(x_j) = \max_{x_i} \Big( \psi_{ij}(x_i, x_j) + m_i^t(x_i) + \sum_{k \in N(i) \setminus j} m_{ki}^t(x_i) \Big)    (9)
b_i(x_i) = m_i(x_i) + \sum_{j \in N(i)} m_{ji}(x_i)    (10)
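A minimal log-space max-product sketch of equations (8)-(10), written for a 3-node chain rather than the four-connected grid of the patent (an assumption made for brevity; all nodes are assumed to share one label set). On a chain, the incoming-message sum of (9) has a single term, and the per-node argmax of the beliefs recovers the exact MAP labeling.

```python
import numpy as np

def bp_chain_map(unary, pairwise):
    """MAP labels on a chain MRF by log-space max-product belief propagation.

    unary:    list of n log-score arrays, one per node (plays the role of m_i).
    pairwise: list of n-1 matrices; pairwise[i][a, b] is the log compatibility
              of node i taking label a and node i+1 taking label b.
    """
    n = len(unary)
    fwd = [np.zeros_like(u) for u in unary]   # message into node i from the left
    bwd = [np.zeros_like(u) for u in unary]   # message into node i from the right
    for i in range(n - 1):                    # left-to-right pass, eq. (9)
        fwd[i + 1] = ((unary[i] + fwd[i])[:, None] + pairwise[i]).max(axis=0)
    for i in range(n - 1, 0, -1):             # right-to-left pass, eq. (9)
        bwd[i - 1] = ((unary[i] + bwd[i])[None, :] + pairwise[i - 1]).max(axis=1)
    beliefs = [unary[i] + fwd[i] + bwd[i] for i in range(n)]   # eq. (10)
    return [int(b.argmax()) for b in beliefs]                  # eq. (8)
```

With agreement-rewarding pairwise scores and a strong unary preference at one end, all three nodes settle on the same label, which matches the brute-force maximum of the total score.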
Between consecutive video frames, the brightness, gradient and curvature of a moving object change little, and the motion at adjacent moments is continuous. Analyzing these constraints, the brightness and gradient values affect the observation function, while the spatial continuity and curvature of the motion affect the potential function, so these functions can be expressed as:
\varphi_i(x_i) = \exp(-(\lambda_I E_I(x_i) + \lambda_G E_G(x_i)))    (11)
\psi_{ij}(x_i, x_j) = \exp(-(\lambda_N E_N(x_i, x_j) + \lambda_C E_C(x_i, x_j)))    (12)
where E_I is the image brightness preservation constraint, E_G the gradient preservation constraint, E_N the spatial continuity constraint, and E_C the curvature preservation constraint; \lambda_I, \lambda_G, \lambda_N and \lambda_C are the weights of the corresponding sub-energy terms.
Suppose f(x, y, t) is the gray value of the pixel at coordinates (x, y) on frame t, and f(x+u, y+v, t+dt) the gray value of the pixel at coordinates (x+u, y+v) on frame t+dt, where u and v are the horizontal and vertical displacements of the pixel. By Taylor expansion,
f(x+u, y+v, t+dt) = f(x, y, t) + f_x u + f_y v + f_t \, dt + O(\partial^2)    (13)
Since O(\partial^2) is a very small quantity, we have:
f(x+u, y+v, t+dt) \approx f(x, y, t) + f_x u + f_y v + f_t \, dt    (14)
Since an object's brightness changes very little between consecutive frames, the image brightness constraint minimizes the difference between f(x+u, y+v, t+dt) and f(x, y, t); therefore
E_I(x_i) = f_x u_i + f_y v_i + f_t \, dt    (15)
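Equation (15) can be checked numerically with finite differences; a sketch with dt = 1, where np.gradient supplies f_x and f_y and the frame difference supplies f_t. On a linear ramp translated by exactly one pixel, the residual for the true motion vector is zero.

```python
import numpy as np

def brightness_residual(f1, f2, u, v):
    """Brightness-constancy residual f_x*u + f_y*v + f_t of equation (15), dt = 1."""
    fx = np.gradient(f1, axis=1)   # horizontal intensity derivative
    fy = np.gradient(f1, axis=0)   # vertical intensity derivative
    ft = f2 - f1                   # temporal derivative between the two frames
    return fx * u + fy * v + ft
```

The gradient term E_G of equation (16) is the same expression with the gradient magnitude image in place of the intensity image.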
In general, the gradient is large on the object contour, which distinguishes it from non-contour parts, so the gradient value is used here as an important indicator of whether a position lies on the object contour. Let g(x, y, t) be the gradient of the pixel at coordinates (x, y) on frame t; in the same way we obtain:
E_G(x_i) = g_x u_i + g_y v_i + g_t \, dt    (16)
To preserve the object's spatial continuity, adjacent parts of the object should move continuously; therefore:
E_N(x_i, x_j) = |u_i - u_j| + |v_i - v_j|    (17)
During the object's motion, the contour shape remains roughly unchanged; that is, the curvature at each point on the object contour stays constant. We approximate the curvature by the second difference of the contour line,
c = \| p_j + p_k - 2 p_i \|    (18)
where p_i, p_j and p_k are three adjacent points on the contour line, as shown in Fig. 4.
The curvature preservation energy function is:
E_C = \| (p_j^{t+dt} + p_k^{t+dt} - 2 p_i^{t+dt}) - (p_j^t + p_k^t - 2 p_i^t) \|^2    (19)
Since p_i^{t+dt} - p_i^t = (u_i \cdot dt, v_i \cdot dt), the above can be converted into:
E_C(x_i, x_j) = \big( (u_j + u_k - 2 u_i)^2 + (v_j + v_k - 2 v_i)^2 \big) \cdot (dt)^2    (20)
Approximating (u_k, v_k) by (u_i, v_i), the above then depends only on the labels of i and j, giving:
E_C(x_i, x_j) = \big( (u_j - u_i)^2 + (v_j - v_i)^2 \big) \cdot (dt)^2    (21)
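The two pairwise constraints, spatial continuity (17) and curvature preservation (21), reduce to one-line functions of the neighboring motion labels (dt = 1 here): both vanish exactly when two adjacent contour points move with the same velocity.

```python
def E_N(ui, vi, uj, vj):
    """Spatial-continuity term of equation (17)."""
    return abs(ui - uj) + abs(vi - vj)

def E_C(ui, vi, uj, vj, dt=1.0):
    """Curvature-preservation term of equation (21), after the (u_k, v_k) approximation."""
    return ((uj - ui) ** 2 + (vj - vi) ** 2) * dt ** 2
```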
This yields the observation and potential functions (11) and (12). Using the belief propagation algorithm according to formulas (9), (10) and (8), the motion vector (u, v) of each point on the contour edge is obtained, and hence the contour position on the non-key frame. As shown in Fig. 6, (a) is the contour obtained from the key-frame labeling result, and (b) is the result of tracking the key-frame contour with belief propagation. The tracking is fairly accurate; although there is some error around the head, as contour information the result is sufficient for the final labeling of the non-key frame.
The shape term in formula (4) is as follows:
E_s(x_i, x_j) = 1 - e^{-d_{ij}}    (22)
where d_{ij} is the minimum distance from the midpoint between i and j to the contour edge. The closer an edge lies to the contour, the smaller this term, and the more likely the cut passes there.
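Equation (22) is direct to implement given the tracked contour as a point set; this sketch takes the minimum Euclidean distance from the edge midpoint to the contour points:

```python
import numpy as np

def E_s(midpoint, contour_points):
    """Shape term of equation (22): cheap to cut near the tracked contour."""
    pts = np.asarray(contour_points, dtype=float)
    d = np.linalg.norm(pts - np.asarray(midpoint, dtype=float), axis=1).min()
    return 1.0 - np.exp(-d)
```

A midpoint lying on the contour costs 0 to cut across, and the cost grows toward 1 as the midpoint moves away, steering the graph cut toward separating foreground from background along the tracked contour.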
Embodiment 1
Foreground labeling of a segment of indoor video proceeds as follows:
(1) First divide it into several segments, each containing 10 frames, one of which is the key frame. Preprocess these frames with the immersion watershed algorithm so that each image is composed of small blocks.
(2) On the key frame, the user interactively designates some foreground parts and background parts, as shown in Fig. 5(a). Applying the graph-cut algorithm with formula (1) labels the key frame, giving the result shown in Fig. 5(b).
(3) The foreground contour on the key frame is shown in Fig. 6(a). The belief propagation algorithm then transfers this contour information to the non-key frames, and the shape term is computed with formula (22). Applying the graph-cut algorithm with formula (4) labels the non-key frames, giving the result shown in Fig. 6(b).
The parameters are set as follows: α = 1.5, β = 0.8, λ_I = 1.0, λ_G = 1.0, λ_N = 1.0, λ_C = 2.0.
Embodiment 2
Foreground labeling of a segment of outdoor video proceeds as follows:
(1) First divide it into several segments, each containing 10 frames, one of which is the key frame. Preprocess these frames with the immersion watershed algorithm so that each image is composed of small blocks.
(2) On the key frame, the user interactively designates some foreground parts and background parts with strokes. Applying the graph-cut algorithm with formula (1) labels the key frame, giving the foreground labeling of the key frame.
(3) The belief propagation algorithm transfers the foreground contour information on the key frame to the non-key frames, and the shape term is computed with formula (22). Applying the graph-cut algorithm with formula (4) labels the non-key frames, giving their foreground labeling.
The parameters can be set as: α = 1.0, with the other parameters as in Embodiment 1. The resulting video foreground labeling is shown in the third row of Fig. 8.

Claims (3)

1. A video object annotation method based on contour spatio-temporal features, characterized by comprising the steps of:
(1) dividing a video into several segments, each containing several frames, of which one is a key frame and the rest are non-key frames;
(2) for the key frame, requiring the user to input hints designating key parts of the foreground and background objects, then labeling the key frame, determining the membership of each part on the frame;
(3) for each non-key frame, labeling it according to the key-frame results, using the color distribution and shape information of the foreground and the color information of the background;
said labeling of the key frame, determining the membership of each part on the frame, comprising the steps of:
(1) preprocessing the image with the immersion watershed algorithm, grouping adjacent pixels whose colors differ within a threshold into the same region;
(2) for each region, taking the average color of the region as its region color;
(3) clustering the color values of the pixels on the user-designated foreground and background, obtaining a set of background color centers and foreground color centers;
(4) defining the data difference of a region block as the minimum distance between its region color and the foreground (or background) color centers, and defining the difference between adjacent region blocks as the distance between their colors;
(5) from the data differences and the differences between adjacent blocks, constructing a graph for graph cutting with each region block as a node, then computing a min-cut on the graph, obtaining a near-optimal labeling of the image;
said labeling of non-key frames according to the key-frame results, the color distribution and shape information of the foreground, and the color information of the background comprising the steps of:
(1) clustering the colors of foreground and background according to the key-frame labeling, the clustering being used in the data differences of the non-key frames;
(2) obtaining the contour of the foreground object from the key-frame labeling, using the belief propagation algorithm to estimate the object motion within a certain motion range, obtaining the approximate position of the object contour on the non-key frame, this contour information supplementing the adjacent-block differences;
(3) from the data differences and the adjacent-block differences, constructing a graph on the non-key frame with each region block as a node, computing a min-cut on this graph, and obtaining the labeling of the non-key frame.
2. The video object annotation method based on contour spatio-temporal features according to claim 1, characterized in that said dividing of a video into segments is performed according to the speed of object motion in the video: the number of frames in each segment is inversely proportional to the speed of object motion, so that each segment contains few frames when objects move quickly, and many frames otherwise.
3. The video object annotation method based on contour spatio-temporal features according to claim 1, characterized in that, for the key frame, the user is required to input hints designating key parts of the foreground and background objects as follows: on the key-frame image the user outlines foreground or background with the mouse, drawing points, line segments and polygons; these points, line segments and polygons are hard constraints of the key-frame labeling, that is, the foreground or background membership of the outlined parts cannot change during labeling.
CNB2006100533980A 2006-09-14 2006-09-14 A video object mask method based on the profile space and time feature Expired - Fee Related CN100413327C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100533980A CN100413327C (en) 2006-09-14 2006-09-14 A video object mask method based on the profile space and time feature

Publications (2)

Publication Number Publication Date
CN1997114A CN1997114A (en) 2007-07-11
CN100413327C true CN100413327C (en) 2008-08-20

Family

ID=38252017

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100533980A Expired - Fee Related CN100413327C (en) 2006-09-14 2006-09-14 A video object mask method based on the profile space and time feature

Country Status (1)

Country Link
CN (1) CN100413327C (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101799929B (en) * 2009-02-11 2013-01-23 富士通株式会社 Designated color layer extracting device and method
CN101834981B (en) * 2010-05-04 2011-11-23 崔志明 Video background extracting method based on online cluster
CN103065300B (en) * 2012-12-24 2015-03-25 安科智慧城市技术(中国)有限公司 Method for video labeling and device for video labeling
CN105224669B (en) * 2015-10-10 2018-11-30 浙江大学 A kind of motion retrieval method based on GMM semantic feature
CN106682595A (en) * 2016-12-14 2017-05-17 南方科技大学 Image content marking method and apparatus thereof
CN109934852B (en) * 2019-04-01 2022-07-12 重庆理工大学 Video description method based on object attribute relation graph
CN110009654B (en) * 2019-04-10 2022-11-25 大连理工大学 Three-dimensional volume data segmentation method based on maximum flow strategy
CN112927238B (en) * 2019-12-06 2022-07-01 四川大学 Core sequence image annotation method combining optical flow and watershed segmentation

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998021688A1 (en) * 1996-11-15 1998-05-22 Sarnoff Corporation Method and apparatus for efficiently representing, storing and accessing video information
EP1225518A2 (en) * 2001-01-20 2002-07-24 Samsung Electronics Co., Ltd. Apparatus and method for generating object-labelled images in a video sequence
US6560281B1 (en) * 1998-02-24 2003-05-06 Xerox Corporation Method and apparatus for generating a condensed version of a video sequence including desired affordances
US6956573B1 (en) * 1996-11-15 2005-10-18 Sarnoff Corporation Method and apparatus for efficiently representing storing and accessing video information

Also Published As

Publication number Publication date
CN1997114A (en) 2007-07-11

Similar Documents

Publication Publication Date Title
CN100413327C (en) A video object mask method based on the profile space and time feature
Xie et al. Semantic instance annotation of street scenes by 3d to 2d label transfer
Criminisi et al. Bilayer segmentation of live video
Nagaraja et al. Video segmentation with just a few strokes
Grady et al. Random walks for interactive alpha-matting
CN109035293B (en) Method suitable for segmenting remarkable human body example in video image
Levin et al. A closed-form solution to natural image matting
EP1899897B1 (en) Video object cut and paste
CN110853026B (en) Remote sensing image change detection method integrating deep learning and region segmentation
CN100583158C (en) Cartoon animation fabrication method based on video extracting and reusing
CN103559719A (en) Interactive graph cutting method
CN100505884C (en) A shed image division processing method
Rahnama et al. R3SGM: Real-time raster-respecting semi-global matching for power-constrained systems
CN101588459A (en) A kind of video keying processing method
Xiao et al. Accurate motion layer segmentation and matting
CN103400386A (en) Interactive image processing method used for video
Chen et al. Background estimation using graph cuts and inpainting
CN103279961A (en) Video segmentation method based on depth recovery and motion estimation
CN101989353A (en) Image matting method
Lu et al. Coherent parametric contours for interactive video object segmentation
Zhang et al. Multi-view video based multiple objects segmentation using graph cut and spatiotemporal projections
Zhou et al. An efficient two-stage region merging method for interactive image segmentation
Adam et al. On scene segmentation and histograms-based curve evolution
CN102270338B (en) Method for effectively segmenting repeated object based on image representation improvement
Liang et al. Helixsurf: A robust and efficient neural implicit surface learning of indoor scenes with iterative intertwined regularization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080820

Termination date: 20120914