CN103561420B - Method for detecting abnormality based on data snapshot figure - Google Patents

Publication number
CN103561420B
Authority
CN
China
Prior art keywords
event
data
node
sequence
diagram
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310549381.4A
Other languages
Chinese (zh)
Other versions
CN103561420A (en
Inventor
吕建华
张柏礼
魏巨巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN201310549381.4A
Publication of CN103561420A
Application granted
Publication of CN103561420B
Legal status: Expired - Fee Related
Anticipated expiration

Abstract

The invention discloses an anomaly detection method based on data snapshot graphs, comprising the following steps: (1) collecting and preprocessing the detection data in the currently monitored region of a wireless sensor network, and determining the event region; (2) obtaining the data set related to the current event, abstracting the event data with a graph model, and converting the event data into an event data snapshot graph; (3) using a graph similarity algorithm based on structure association degree to query the event pattern graph database for event pattern graphs similar to the event graph, and thereby determining the type of the current event. The event pattern graph database is a set of event pattern graphs; each event pattern graph is an event data snapshot graph and serves as an abstract description of an event type. In the method provided by the invention, event graphs can be obtained from domain expert knowledge or from data analysis; the method targets complex events, improves event detection efficiency, and reduces the false alarm rate.

Description

Anomaly detection method based on data snapshot graphs
Technical field
The present invention relates to an anomaly detection method for wireless sensor networks, and in particular to an anomaly detection method based on data snapshot graphs.
Background art
Current state of anomaly detection in wireless sensor networks
In a wireless sensor network, abnormal sensor node data can arise for many reasons: the sensor node itself has failed, the collected data contain noise, or an anomalous event has occurred in the network. Anomaly detection in a wireless sensor network means detecting these abnormal data and reporting them to the user so that the user can make the corresponding decision. Many users, however, require not only that the nodes with abnormal data be identified, but also that the specific type of anomalous event causing the abnormal data be detected. This kind of anomaly detection, also called anomalous event detection or event detection, is of great practical importance. In a fire detection application, for example, when the sensor network data become abnormal, the abnormal data must be examined to determine which anomalous event caused them, that is, whether a fire or some other event has occurred in the monitored region.
Wireless sensor networks are data-centric, and their data exhibit strong temporal correlation. In general, if the data of a node at a given time are regarded as a vertex of a data graph and the temporal correlations between data are regarded as its edges, a graph model is a natural way to describe event features. Many studies have shown that graph models are highly capable of describing complex events and can be applied to complex event detection in wireless sensor networks. If a database is built from a set of event graphs and the associated event information, complex event detection in a sensor network can be treated as a query processing problem over a collection of graph data. When an event occurs, the related data are collected and an event query graph is built; querying the database for matching graph data then yields information about the current event, such as the event type, its likely cause, its possible future development, and effective countermeasures. This information is an important basis for decision making.
The first task of event detection is to build an event model; a suitable event model is the basis for accurate event detection. Event detection in wireless sensor networks has been studied widely and in depth. Most existing techniques are threshold based: whether an event has occurred depends on whether the measured value of the monitored attribute exceeds a preset threshold. This scheme offers limited decision support and can cause false alarms: a node may exceed the threshold because an event has occurred, but also because of an equipment fault or a transmission fault.
To address the shortcomings of threshold-based detection, event detection techniques based on contour maps have been proposed (W. Xue, Q. Luo, L. Chen and Y. Liu. Contour Map Matching for Event Detection in Sensor Networks [C]. In Proceedings of ACM SIGMOD, 2006; and Y. Liu, M. Li. Iso-Map: Energy-Efficient Contour Mapping in Wireless Sensor Networks [C]. In Proceedings of IEEE ICDCS, Toronto, Canada, June 2007). Contour map techniques abstract an event in the sensor network into a spatio-temporal model of the nodes' sensed data and query for event occurrences by model matching, which can significantly improve detection efficiency. A contour map is also a graph model and can effectively describe the spatio-temporal data features of an event, but contour map patterns are all derived from expert knowledge and therefore lack generality.
Current state of similarity queries over graph data
A graph similarity query can be defined formally as follows: given a graph database D = {g_1, g_2, …, g_n} and a query graph q, the similarity query returns the set of graphs {g_i | g_i ∈ D and the similarity between g_i and q satisfies the given threshold}.
The key problem in graph similarity queries is finding a measure that quantifies the similarity of two graphs. Some researchers have proposed graph edit distance for this purpose. Graph edit distance is adapted from string matching: the ideas of string alignment distance and edit distance are used to construct the edit and alignment distance of graphs. Comparing two graphs requires three kinds of edit operations: insertion, deletion and substitution. Methods based on graph edit distance compute similarity indirectly, and their computational complexity is high; the problem is NP-hard. Besides graph edit distance, the maximum common subgraph, that is, the largest common part of two graphs, is also used to measure the similarity between two graph structures. Methods based on the maximum common subgraph compute similarity directly, but they rely on subgraph isomorphism and are therefore also computationally expensive. H. Bunke and K. Shearer (A graph distance metric based on maximal common subgraph. Pattern Recognition, 19:25-259, 1998) used the maximal common subgraph to measure graph structure similarity.
Because computing graph edit distance and finding the maximum common subgraph are both NP-hard, similarity queries based on either method usually first compute an upper or lower bound on the similarity of two graphs. Computing these bounds is much cheaper than computing the structural similarity directly, and the bounds can be used to filter out part of the non-result set. Grafil (X. Yan, F. Zhu, P. S. Yu, et al. Feature-based Similarity Search in Graph Structures [J]. ACM Transactions on Database Systems (TODS), 2006, 31(4): 1418-1453) is an algorithm for the subgraph similarity problem; a subgraph similarity query retrieves the graphs whose common subgraph with a given query graph satisfies certain conditions. Grafil uses the maximum common subgraph to measure the similarity of two graphs and introduces the concept of the edge relaxation ratio. It extracts features from the graph database and builds a feature-graph matrix index. At query time it determines the substructure features contained in the query graph and translates the relaxation of query graph edges into a reduction in the number of features the query graph contains; by computing the maximum number of features that may be lost after relaxing edges, it can filter out part of the non-result set in advance and thereby reduce the problem complexity.
Algorithms for graph edit distance fall into two main classes: exact algorithms and approximate algorithms. Many exact algorithms (K. Riesen, S. Fankhauser, H. Bunke. Speeding up graph edit distance computation with a bipartite heuristic. In MLG '07; and M. Neuhaus, K. Riesen, and H. Bunke. Fast suboptimal algorithms for the computation of graph edit distance. In SSPR '06) are based on the well-known A* algorithm (P. Hart, N. Nilsson, B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. SSC, 4(2): 100-107, 1968). Exact algorithms, however, can typically handle only graphs with fewer than 12 vertices, so many algorithms for computing upper and lower bounds of the edit distance have been proposed.
BLP (D. Justice, A. Hero. A Binary Linear Programming Formulation of the Graph Edit Distance [J]. IEEE Trans. Pattern Anal. Mach. Intell., 2006, 28(8): 1200-1214) gives a method for computing the edit distance of two unweighted labeled graphs together with its upper and lower bounds; it converts the minimization problem into a 0-1 integer linear program. An unweighted labeled graph is a graph whose vertices carry labels and whose edges carry no weights. BLP regards the two graphs whose edit distance is to be computed as subgraphs of a graph represented by an edit grid; the edit operations between the two graphs cannot exceed this edit grid, because the size of the grid (its length and width) is exactly the sum of the numbers of vertices of the two graphs. The paper proves that edit operations on graphs are equivalent to changes in the state of the edit grid, and that if the cost of an edit operation is a metric, the resulting edit distance is also a metric. Because the model is a 0-1 integer linear program, for which no polynomial-time algorithm is known, the domain of the variables is relaxed to [0, 1]. The problem then becomes an ordinary linear program, which can be solved in polynomial time, for example by the interior point method. Since the variable domain of the relaxed linear program is a superset of the original domain and the model is a minimization problem, the relaxed model computes a lower bound on the edit distance of the two graphs, and this lower bound can be used to filter out some database graphs.
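The bounding argument in this paragraph can be stated compactly. The symbols below (cost vector c, constraint system Ax ≤ b, threshold τ, and the names d_LP and d_GED) are generic placeholders for an arbitrary 0-1 program; they are assumptions made for illustration and not the notation of the BLP paper.

```latex
\[
\underbrace{\min_{x \in [0,1]^n,\; Ax \le b} c^{\top} x}_{d_{LP}\ \text{(relaxed linear program)}}
\;\le\;
\underbrace{\min_{x \in \{0,1\}^n,\; Ax \le b} c^{\top} x}_{d_{GED}\ \text{(0-1 program)}}
\qquad \text{because } \{0,1\}^n \subseteq [0,1]^n .
\]
```

Consequently, a database graph whose relaxed value d_LP already exceeds the query threshold τ cannot satisfy d_GED ≤ τ and can be discarded without solving the integer program.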
Comparing Stars (Z. Zeng, A. K. T. Tung, J. Wang et al. Comparing Stars: On Approximating Graph Edit Distance [C]. In VLDB, 2009) uses edit distance to measure the similarity between two graphs. The paper represents a graph as a set of star structures and computes upper and lower bounds on the edit distance of two graphs by comparing their corresponding sets of star structures; this computation completes in polynomial time.
Summary of the invention
Objective of the invention: to overcome the deficiencies of the prior art, the present invention abstracts an event into a data snapshot graph based on the snapshot data of the wireless sensor network event, and provides an anomaly detection method based on data snapshot graphs (Data Snapshot Graph Based Anomaly Detection Algorithm, DSG for short). Event graphs can be obtained from domain expert knowledge or from data analysis.
Technical solution: to achieve the above objective, the present invention adopts the following technical solution:
An anomaly detection method based on data snapshot graphs, comprising the following steps:
(1) collecting and preprocessing the detection data in the currently monitored region of the wireless sensor network, and determining the event-related region;
(2) obtaining the data set related to the current event, abstracting the event data set with a graph model, and converting the event data set into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structure association degree to query the event pattern graph database for event pattern graphs similar to the event data snapshot graph of the current event, and thereby determining the type of the current event.
The event pattern graph database is a set of event pattern graphs; each event pattern graph is an event data snapshot graph and serves as an abstract description of an event type.
An event pattern graph is obtained from domain expert knowledge or from data analysis, and is an event graph based on a data snapshot. The data snapshot is the data set of all nodes in the sensor network at the time the event occurs; the event graph built from this data set is the snapshot graph at the event time and is also the event pattern graph of that event.
The graph similarity search algorithm based on structure association degree extracts basic structures from the graph data and converts the graph data into a sequence of basic structures according to the association degree between them. The graph similarity query problem is thereby converted into a sequence similarity query problem, which effectively reduces query complexity and makes the method suitable for event detection applications.
In step (1), a node association graph is built from the physical correlation and data correlation of the sensor nodes, and the event region is determined from the node association graph. Node association graphs include the global node association graph and its subgraphs, and are built as follows:
The node association graph at time t is expressed formally as:
G_t = <V, E, ID, f_v>
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; and f_v: V → ID is the vertex labeling function. Graph vertices correspond one-to-one to sensor nodes; each node of the wireless sensor network forms one vertex of the node association graph;
Let d(v_i)_t denote the monitoring data of vertex v_i at time t. The edge set E is constructed as follows: for any two vertices v_1, v_2 ∈ V, if the sensor nodes corresponding to v_1 and v_2 are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions f_1 and f_2 such that f_1(d(v_1)_t) = f_2(d(v_2)_t), then the edge (v_1, v_2) ∈ E;
The event-related region is determined as follows: at the event detection time t, for any vertex v_i ∈ V, if |d(v_i)_{t-1} - d(v_i)_t| / |d(v_i)_{t-1} + d(v_i)_t| ≥ e, then v_i is an event-related vertex; the region containing all event-related vertices at time t is the event-related region. The constant e is a preset value, typically chosen between 2.5% and 5%;
The node association graph obtained after the event boundary has been determined is a subgraph of the global node association graph, defined as follows:
Ge_t = <V, E, ID, f_v>
where V is the vertex set of the graph and contains all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; and f_v: V → ID is the vertex labeling function. Graph vertices correspond one-to-one to sensor nodes.
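The event-region test of step (1) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the dictionary-based data layout and the sample readings are hypothetical, and keeping exactly the edges whose two endpoints are event-related (an induced subgraph) is an assumption, since the text only says that Ge_t is a subgraph of the global graph.

```python
def event_related_vertices(prev, curr, e=0.05):
    """Return vertices whose relative change between t-1 and t is at least e.

    prev and curr map a vertex id to its reading at t-1 and t respectively.
    """
    related = set()
    for v, now in curr.items():
        before = prev[v]
        denom = abs(before + now)
        if denom == 0:
            continue  # assumption: skip vertices where the ratio is undefined
        if abs(before - now) / denom >= e:
            related.add(v)
    return related


def induced_subgraph(edges, vertices):
    """Keep the undirected edges whose two endpoints are both event-related."""
    return {(u, v) for (u, v) in edges if u in vertices and v in vertices}


# Hypothetical temperature readings at t-1 and t for four nodes on a chain.
prev = {"n1": 20.0, "n2": 20.0, "n3": 21.0, "n4": 20.5}
curr = {"n1": 20.2, "n2": 35.0, "n3": 40.0, "n4": 20.4}
edges = {("n1", "n2"), ("n2", "n3"), ("n3", "n4")}

region = event_related_vertices(prev, curr, e=0.05)
sub_edges = induced_subgraph(edges, region)
print(region, sub_edges)
```

With these sample values only n2 and n3 change by more than 5%, so the event-related region consists of those two vertices and the single edge between them.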
In step (2), the event data set is abstracted with a graph model and converted into an event data snapshot graph. The data snapshot and the event data snapshot graph are defined as follows:
1) The data snapshot S of a wireless sensor network is defined as follows:
For a wireless sensor network N with k nodes {n_1, n_2, …, n_k}, the data snapshot of N at time t is the set {d(n_1)_t, d(n_2)_t, …, d(n_k)_t};
2) The event data snapshot graph Gs_t at time t is computed from the node association graph Ge_t at time t according to the correlation of the node data, and is expressed formally as:
Gs_t = <V, E, ID, DV, f_v, g_v>
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; DV = {d(v_i)_t} is the set of monitored values at time t of all sensor nodes in the event region; f_v: V → ID is the vertex labeling function, with graph vertices in one-to-one correspondence with sensor nodes; and g_v: V → DV is the vertex data mapping function;
The vertex set of the event data snapshot graph Gs_t is identical to that of the node association graph Ge_t and contains all sensor nodes in the event region;
The edge set E(Gs_t) of the event data snapshot graph Gs_t is constructed as follows:
A) for any edge (v_1, v_2) ∈ E(Ge_t), if (d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e, then the directed edge <v_2, v_1> ∈ E(Gs_t);
B) for any edge (v_1, v_2) ∈ E(Ge_t), if (d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e, then the directed edge <v_1, v_2> ∈ E(Gs_t);
C) for any edge (v_1, v_2) ∈ E(Ge_t), if |d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e, then both directed edges <v_2, v_1> ∈ E(Gs_t) and <v_1, v_2> ∈ E(Gs_t) exist;
where the constant e is a preset value, typically chosen between 2.5% and 5%. The event data snapshot graph is a directed graph that describes the data state of each node in the event region of the wireless sensor network and the relationships between those data states;
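Rules A) to C) can be sketched as follows. The function name, the data layout and the sample readings are hypothetical; the sketch assumes positive readings, so that the denominator d(v_1)_t + d(v_2)_t is positive, and it adds no edge when the ratio equals e exactly, because none of the three rules covers that case as written.

```python
def snapshot_edges(assoc_edges, d, e=0.05):
    """Derive the directed edges of Gs_t from the undirected edges of Ge_t.

    assoc_edges: iterable of (v1, v2) pairs; d: vertex id -> reading at time t.
    """
    directed = set()
    for v1, v2 in assoc_edges:
        total = d[v1] + d[v2]
        if total <= 0:
            continue  # assumption: the rules are only applied to positive sums
        rel = (d[v1] - d[v2]) / total
        if rel > e:            # rule A: v1 is clearly larger, edge v2 -> v1
            directed.add((v2, v1))
        elif -rel > e:         # rule B: v2 is clearly larger, edge v1 -> v2
            directed.add((v1, v2))
        elif abs(rel) < e:     # rule C: approximately equal, both directions
            directed.add((v1, v2))
            directed.add((v2, v1))
    return directed


# Hypothetical readings: a is much hotter than b, while b and c are close.
d = {"a": 30.0, "b": 20.0, "c": 20.5}
gs_edges = snapshot_edges([("a", "b"), ("b", "c")], d, e=0.05)
print(sorted(gs_edges))
```

Under these rules a single directed edge points from the vertex with the smaller reading to the vertex with the larger one, and a pair of opposite edges marks two readings as approximately equal.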
3) The event data snapshot graph may still be very large, which obscures the event pattern features and degrades event detection. The event data snapshot graph therefore needs to be simplified so that it describes the pattern features of the event more abstractly; simplification also reduces the scale of the graph data and improves storage efficiency and processing performance. The present invention simplifies the event data snapshot graph by merging sensor nodes, according to the following rules:
A) The data of merged nodes must be approximately equal: for v_1, v_2 ∈ V(Gs_t), if <v_1, v_2> ∈ E(Gs_t) and |d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e, then v_1 and v_2 are merged into a new node;
B) When two or more approximately equal nodes are merged into a new node, all edges associated with those nodes are attached to the new node.
Simplifying the event data snapshot graph by these merging rules eliminates redundancy and describes the data features more abstractly. The level of abstraction is determined by the data merging error range e: the larger e is, the higher the degree of merging and the simpler the event pattern; conversely, the smaller e is, the lower the degree of merging and the more complex the event pattern. The constant e is a preset value, typically chosen between 2.5% and 5%.
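One possible way to apply the two merging rules is a union-find pass over the edges, sketched below. Several choices here are assumptions that the text does not specify: the merged vertex is named after its members, its data value is taken as the mean of the merged readings, and merging is transitive, so a chain of pairwise-close nodes collapses into one vertex even if its two ends differ by more than e.

```python
def merge_nodes(edges, d, e=0.05):
    """Merge approximately equal adjacent vertices of a snapshot graph.

    edges: set of directed (u, v) pairs; d: vertex id -> reading at time t.
    Returns (merged_edges, merged_data).
    """
    parent = {v: v for v in d}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v1, v2 in edges:
        total = d[v1] + d[v2]
        if total > 0 and abs(d[v1] - d[v2]) / total < e:  # rule A
            parent[find(v1)] = find(v2)

    groups = {}
    for v in d:
        groups.setdefault(find(v), []).append(v)

    name = {}
    merged_data = {}
    for members in groups.values():
        label = "+".join(sorted(members))  # hypothetical naming scheme
        merged_data[label] = sum(d[v] for v in members) / len(members)
        for v in members:
            name[v] = label

    # Rule B: re-attach every edge to the merged vertices, dropping self-loops.
    merged_edges = {(name[u], name[v]) for u, v in edges if name[u] != name[v]}
    return merged_edges, merged_data


d = {"a": 30.0, "b": 20.0, "c": 20.5}
edges = {("b", "a"), ("b", "c"), ("c", "b")}
merged_edges, merged_data = merge_nodes(edges, d, e=0.05)
print(merged_edges, merged_data)
```

In this example b and c are merged into one vertex, and the edge that used to leave b now leaves the merged vertex.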
In step (3), the graph similarity algorithm based on structure association degree first extracts the structural feature sequence of the graph data based on structure association degree, converting the graph similarity query into a similarity query over structural feature sequences; it then searches the event pattern graph database for event pattern graphs similar to the event data snapshot graph and determines the type of the current event. The detailed procedure comprises the following steps:
1) The basic structures of graph data are defined as the ring (cycle) structure, the star structure and the line structure. Compared with other structure types, such as frequent subgraphs and frequent subtrees, these three basic structures are easier to obtain and still capture the basic structural information of the graph. They are defined as follows:
Ring structure: a sequence of edges in the graph forms a closed ring with at least 3 edges. The ring structure is denoted cycle(s), s = {v | v ∈ V ∧ the vertices v form a ring}. The closed ring must not nest other rings, that is, it is a simple ring;
Star structure: a core vertex v_0 in the graph is connected to several other vertices, no two of which are connected to each other, and degree(v_0) ≥ 3. The star structure is denoted star(v_0, s), s = {v | v_0, v ∈ V ∧ v is a neighbor of v_0}, where degree(v_0) denotes the degree of vertex v_0;
Line structure: a chain of vertices connected end to end. The line structure is denoted line(s), s = {v | v ∈ V ∧ degree(v) ≤ 2}, where degree(v) denotes the degree of vertex v;
2) The basic structures are extracted as follows:
1. first find all ring structures in the graph by depth-first traversal with backtracking;
2. compare any two ring structures A and B; if A is a subset of B, that is, ring structure B contains ring structure A, delete ring structure B;
3. repeat step 2 until no ring structure contains another ring structure, which yields all simple rings;
4. compute the degree of every vertex in the graph; each vertex with degree greater than or equal to 3 is taken as the core of a star structure;
5. compute the degree of every vertex in the graph; if a vertex has degree 1 and its adjacent vertex has degree less than or equal to 2, continue traversing adjacent vertices until a vertex with degree greater than 2 is reached, which forms a line structure;
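The five extraction steps above can be sketched as follows for a small graph. This is an illustration under stated assumptions: the graph is treated as undirected and stored as a dictionary of neighbor sets, vertex ids are comparable, each structure is represented only by its vertex set, and a line stops before the first vertex of degree greater than 2. All function names are hypothetical, and the exhaustive ring enumeration is only practical for small graphs.

```python
def simple_rings(adj):
    """Steps 1-3: enumerate ring vertex sets by DFS with backtracking, then
    drop every ring whose vertex set strictly contains another ring."""
    rings = set()

    def dfs(start, v, path, on_path):
        for w in adj[v]:
            if w == start and len(path) >= 3:
                rings.add(frozenset(path))
            elif w not in on_path and w > start:
                # Only extend through ids greater than start, so each ring is
                # discovered from its smallest vertex.
                on_path.add(w)
                path.append(w)
                dfs(start, w, path, on_path)
                path.pop()
                on_path.remove(w)

    for s in adj:
        dfs(s, s, [s], {s})
    return {r for r in rings if not any(other < r for other in rings)}


def stars(adj):
    """Step 4: every vertex of degree >= 3 is the core of a star."""
    return {v: set(ns) for v, ns in adj.items() if len(ns) >= 3}


def lines(adj):
    """Step 5: walk from each degree-1 vertex while the degree stays <= 2."""
    found = set()
    for v in adj:
        if len(adj[v]) != 1:
            continue
        (nxt,) = adj[v]
        if len(adj[nxt]) > 2:
            continue
        path, prev, cur = [v], v, nxt
        while len(adj[cur]) <= 2:
            path.append(cur)
            following = [w for w in adj[cur] if w != prev]
            if not following:
                break  # reached the other end of an isolated path
            prev, cur = cur, following[0]
        found.add(frozenset(path))
    return found


# Hypothetical graph: triangle 1-2-3 with a tail 3-4-5-6.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
print(simple_rings(adj), stars(adj), lines(adj))
```

For this graph the sketch finds one ring on vertices 1, 2 and 3, one star centered on vertex 3, and one line on vertices 4, 5 and 6.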
3) The structural feature sequence of the graph data is extracted based on structure association degree as follows:
Because the basic structures differ in importance, they are ordered by importance so that the graph structure data are converted into a sequence of basic structures; the importance of each structure is measured by its degree of association with the other structures:
Association: for any two basic structures s_i and s_j in a graph, if cvNum(s_i, s_j) ≥ 1, then s_i and s_j are associated, denoted incident(s_i, s_j) = 1; if cvNum(s_i, s_j) = 0, then incident(s_i, s_j) = 0, meaning that s_i and s_j are not associated. Formally:
$$ incident(s_i, s_j) = \begin{cases} 1 & \text{if } cvNum(s_i, s_j) \ge 1 \\ 0 & \text{if } cvNum(s_i, s_j) = 0 \end{cases} $$
where cvNum(s_i, s_j) denotes the number of vertices shared by structures s_i and s_j, and i ≠ j;
Association degree based on the number of associated structures: given a graph g containing N basic structures, the association degree of the i-th basic structure s_i is:
$$ sNum\_CD(s_i) = \sum_{j=1,\, j \ne i}^{N} incident(s_i, s_j) $$
where 1 ≤ i ≤ N, so that sNum_CD(s_i) ≤ N - 1. If a basic structure s is associated with k basic structures, its association degree is sNum_CD(s) = k;
According to the above definitions, the event data snapshot graph is converted into a sequence of basic structures ordered by association degree;
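The association degree and the resulting ordering can be sketched as below. The names cv_num, association_degrees and structure_sequence are hypothetical, structures are represented by their vertex sets, and breaking ties alphabetically by structure name is an assumption, since the text does not say how structures with equal association degree are ordered.

```python
def cv_num(s_i, s_j):
    """Number of vertices shared by two basic structures."""
    return len(s_i & s_j)


def association_degrees(structures):
    """Map each structure name to sNum_CD: how many other structures share
    at least one vertex with it."""
    return {
        a: sum(
            1
            for b in structures
            if b != a and cv_num(structures[a], structures[b]) >= 1
        )
        for a in structures
    }


def structure_sequence(structures):
    """Order structure names by decreasing association degree."""
    degree = association_degrees(structures)
    return sorted(structures, key=lambda name: (-degree[name], name))


# Vertex sets of a ring, a star (core 3 plus neighbors) and a line.
structures = {
    "cycle": frozenset({1, 2, 3}),
    "star": frozenset({1, 2, 3, 4}),
    "line": frozenset({4, 5, 6}),
}
print(association_degrees(structures), structure_sequence(structures))
```

Here the star shares vertices with both other structures, so it has association degree 2 and comes first in the sequence.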
4) The similarity query algorithm over structural feature sequences of graph data proceeds as follows:
The similarity between a source string S and a target string T is computed by edit distance. The edit distance is the minimum number, or minimum cost, of edit operations needed to transform S into T, where an edit operation deletes, inserts or replaces the character at some position of the string. Each operation has an associated cost, and the cost of a given sequence of operations equals the sum of the costs of the individual operations in the sequence;
In the basic structure sequence of an event data snapshot graph, the earlier a structure appears, the larger its association degree and the greater its importance, and hence the more likely it is to represent a main feature of the graph. The first structure in the sequence is the most important in the graph, so editing it should cost the most. A monotonically decreasing exponential function f(x) = a^(-x) is therefore defined as the cost function for each single-character edit operation;
Sequence edit distance similarity: given a sequence database Set = {s_1, s_2, …, s_n}, a query sequence qStr and an edit distance threshold τ, the sequence similarity query returns all sequences s_i in Set that satisfy SED(qStr, s_i) < τ;
Given a query sequence, the edit distance between each sequence in the sequence database and the query graph sequence is computed using string edit distance, and the result consists of all sequences in the database whose edit distance to the query sequence is less than the given cost threshold τ.
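A position-weighted edit distance of this kind can be sketched with the usual dynamic program. Several details are assumptions, because the text gives only the cost function f(x) = a^(-x): here x is the zero-based position of the edited character, deletions and substitutions are charged at the source position, insertions at the target position, a defaults to 2, and each structure is encoded as one character (S for star, C for cycle, L for line).

```python
def sed(src, tgt, a=2.0):
    """Sequence edit distance with position-dependent cost a ** (-position)."""

    def cost(pos):
        return a ** (-pos)

    m, n = len(src), len(tgt)
    dist = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = dist[i - 1][0] + cost(i - 1)   # delete src[i-1]
    for j in range(1, n + 1):
        dist[0][j] = dist[0][j - 1] + cost(j - 1)   # insert tgt[j-1]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitute = 0.0 if src[i - 1] == tgt[j - 1] else cost(i - 1)
            dist[i][j] = min(
                dist[i - 1][j] + cost(i - 1),        # deletion
                dist[i][j - 1] + cost(j - 1),        # insertion
                dist[i - 1][j - 1] + substitute,     # match or substitution
            )
    return dist[m][n]


def similar_sequences(database, query, tau, a=2.0):
    """Return the sequences whose distance to the query is below tau."""
    return [s for s in database if sed(query, s, a) < tau]


database = ["SCL", "SCC", "CCL"]
print([sed("SCL", s) for s in database])
print(similar_sequences(database, "SCL", tau=0.5))
```

Changing the last structure of the query costs 0.25, whereas changing the first costs 1.0, so with τ = 0.5 the pattern that differs only in its least important structure is still returned while the one that differs in its most important structure is not.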
Beneficial effects: in the anomaly detection method based on data snapshot graphs provided by the invention, event graphs can be obtained from domain expert knowledge or from data analysis, and the false alarm rate can be reduced.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 is event pattern graph 1;
Fig. 3 is event pattern graph 2;
Fig. 4 is event pattern graph 3;
Fig. 5 shows the event detection performance when nodes are merged under different temperature relative errors;
Fig. 6 shows the event detection performance when nodes are merged under different humidity relative errors;
Fig. 7 shows the event detection performance when nodes are merged under different oxygen content relative errors.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawings.
Explanation of the principle of the solution
In a wireless sensor network, the occurrence of an event is necessarily reflected in changes in the state of the sensor nodes' monitoring data, and the inherent features of the event give rise to a data pattern specific to that event. If abstract features are extracted from the data and this data pattern is identified, then when the sensor network exhibits the pattern again, the occurrence of the corresponding event can be determined from the similarity of the data patterns. Wireless sensor networks are data-centric, and their data exhibit strong temporal correlation; if the data of a node at a given time are regarded as a vertex of a data graph and the temporal correlations between data are regarded as its edges, a graph model is a natural way to describe event features. Wireless sensor network event detection can thus be converted into a similarity query problem over graph model data.
Fig. 1 shows an anomaly detection method based on data snapshot graphs (DSG), comprising the following steps:
(1) collecting and preprocessing the detection data in the currently monitored region of the wireless sensor network, and determining the event-related region;
(2) obtaining the data set related to the current event, abstracting the event data set with a graph model, and converting the event data set into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structure association degree to query the event pattern graph database for event pattern graphs similar to the event data snapshot graph of the current event, and thereby determining the type of the current event.
The event pattern graph database is a set of event pattern graphs; each event pattern graph is an event data snapshot graph and serves as an abstract description of an event type.
An event pattern graph is obtained from domain expert knowledge or from data analysis, and is an event graph based on a data snapshot. The data snapshot is the data set of all nodes in the sensor network at the time the event occurs; the event graph built from this data set is the snapshot graph at the event time and is also the event pattern graph of that event.
The graph similarity search algorithm based on structure association degree extracts basic structures from the graph data and converts the graph data into a sequence of basic structures according to the association degree between them. The graph similarity query problem is thereby converted into a sequence similarity query problem, which effectively reduces query complexity and makes the method suitable for event detection applications.
Based on domain expert knowledge: in some wireless sensor network applications, the data features of particular events are known, and this knowledge can be used to build event pattern graphs.
Based on data analysis: in many wireless sensor network applications, the data features of an event show a certain regularity but are often hidden in large volumes of data; the present invention uses graph data to describe the data features of an event abstractly and to build the event pattern graph.
The detection data in the currently monitored region of the wireless sensor network are collected and preprocessed, and the event-related region is determined, as follows:
1) Building the node association graph
The node association graph describes the association relationships between sensor nodes in the wireless sensor network at time t. The association between nodes carries two kinds of information: 1. physical association: whether the nodes are single-hop communication neighbors; 2. data association: whether the nodes' detection data are correlated.
The node association graph at time t is expressed formally as:
G_t = <V, E, ID, f_v>
where V is the vertex set of the graph, containing all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; and f_v: V → ID is the vertex labeling function. Graph vertices correspond one-to-one to sensor nodes; each node of the wireless sensor network forms one vertex of the node association graph.
Let d(v_i)_t denote the monitoring data of vertex v_i at time t. The edge set E is constructed as follows: for any two vertices v_1, v_2 ∈ V, if the sensor nodes corresponding to v_1 and v_2 are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions f_1 and f_2 such that f_1(d(v_1)_t) = f_2(d(v_2)_t), then the edge (v_1, v_2) ∈ E.
The constant k is a predefined value: the larger k is, the more complex the node association graph; conversely, the smaller k is, the simpler the graph. If k = 1, the node association graph is identical to the wireless sensor network topology. In general k = 2 may be chosen, so that the node association graph captures both the physical association and the data association between nodes without the graph structure becoming overly complex. The choice of the functions $f_1$ and $f_2$ depends on the data characteristics; they may describe either a quantitative or a qualitative correlation between data.
From the above definition, the node association graph is an undirected graph that describes the association relations between wireless sensor network nodes, covering both physical association and data association.
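The edge construction rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `hops_within`, `relative_close`, and `build_association_graph` are hypothetical, and the data-correlation test (readings within 5% of each other) is only an assumed stand-in for the unspecified functions $f_1$ and $f_2$.

```python
from collections import deque

def hops_within(adj, src, k):
    """Breadth-first search: map each node reachable from src in <= k hops to its hop count."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def relative_close(a, b, tol=0.05):
    """Assumed correlation test standing in for 'f1(a) == f2(b)'."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-12)

def build_association_graph(adj, data, k=2, correlated=relative_close):
    """adj: symmetric single-hop neighbor map; data: reading d(v)_t per node.
    Returns the undirected edge set E as a set of frozensets."""
    edges = set()
    for v in adj:
        for w, hops in hops_within(adj, v, k).items():
            if w == v:
                continue
            # Edge if single-hop neighbors, or k-hop neighbors with correlated data.
            if hops == 1 or correlated(data[v], data[w]):
                edges.add(frozenset((v, w)))
    return edges
```

With k = 1 the function returns exactly the single-hop topology, which matches the remark above that the node association graph then coincides with the network topology.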
2) Determine the event-related region
At the time t of event detection, for any vertex $v_i \in V$, if $|d(v_i)_{t-1} - d(v_i)_t| \,/\, |d(v_i)_{t-1} + d(v_i)_t| \ge e$, then $v_i$ is an event-related vertex, and the region containing all event-related vertices at time t is the event-related region. The constant e is a preset value, typically chosen between 2.5% and 5%.
After the event boundary is determined, the resulting node association graph is a subgraph of the global node association graph, defined as follows:
$Ge_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $f_v: V \to ID$ is the vertex labeling function; graph vertices correspond one-to-one to sensor nodes.
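A minimal sketch of this step, under stated assumptions: the comparison operator in the relative-change test is read as "greater than or equal to e", a zero denominator is skipped, and edges are stored as frozensets. The function names are hypothetical.

```python
def event_related_vertices(prev, curr, e=0.05):
    """prev, curr: readings d(v)_{t-1} and d(v)_t per vertex.
    A vertex is event-related when its relative change reaches the threshold e."""
    related = set()
    for v in curr:
        denom = abs(prev[v] + curr[v])
        if denom == 0:
            continue  # assumption: the ratio is undefined, so the vertex is skipped
        if abs(prev[v] - curr[v]) / denom >= e:
            related.add(v)
    return related

def induced_subgraph(edges, vertices):
    """Keep only the edges of the global graph whose endpoints are both event-related."""
    return {edge for edge in edges if edge <= vertices}
```

The induced subgraph plays the role of $Ge_t$: it has the event-related vertices as its vertex set and inherits the edges of the global node association graph between them.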
The data set related to the current event is obtained and abstracted with a graph model, converting the event data set into an event data snapshot graph. The specific steps are as follows:
1) Obtain the data snapshot at time t
The wireless sensor network data snapshot S is defined as follows:
For a wireless sensor network N with k nodes $\{n_1, n_2, \ldots, n_k\}$, the data snapshot of N at time t is the set $\{d(n_1)_t, d(n_2)_t, \ldots, d(n_k)_t\}$.
2) Compute the event data snapshot graph
The event data snapshot graph $Gs_t$ at time t is computed from the node association graph $Ge_t$ at time t according to the data correlation between nodes. It is formally expressed as:
$Gs_t = \langle V, E, ID, DV, f_v, g_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $DV = \{d(v_i)_t\}$ is the set of monitored values at time t of all sensor nodes in the event region; $f_v: V \to ID$ is the vertex labeling function, with graph vertices corresponding one-to-one to sensor nodes; $g_v: V \to DV$ is the data mapping function of the vertices.
The vertex set of the event data snapshot graph $Gs_t$ is identical to that of the node association graph $Ge_t$ and comprises all sensor nodes in the event region.
The edge set $E(Gs_t)$ of the event data snapshot graph $Gs_t$ is constructed as follows:
a) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e$, then there is a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$;
b) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e$, then there is a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$;
c) for any edge $(v_1, v_2) \in E(Ge_t)$, if $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then there are both a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$ and a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$.
The constant e is a preset value, typically chosen between 2.5% and 5%. The event data snapshot graph is a directed graph that describes the data state of each node in the wireless sensor network event region and the relations between those data states.
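Rules a) to c) can be sketched as below. This is an illustrative reading, not the authors' code: `snapshot_edges` is a hypothetical name, readings are assumed positive so that the denominator is positive, and the boundary case where the ratio equals e exactly produces no edge because the rules above leave it undefined.

```python
def snapshot_edges(assoc_edges, data, e=0.05):
    """assoc_edges: iterable of undirected pairs from Ge_t; data: reading d(v)_t per vertex.
    Returns directed edges (tail, head); one-way edges point from low to high values."""
    directed = set()
    for edge in assoc_edges:
        v1, v2 = tuple(edge)
        total = data[v1] + data[v2]
        if total == 0:
            continue  # assumption: skip pairs where the ratio is undefined
        ratio = (data[v1] - data[v2]) / total
        if ratio > e:            # rule a): v1 clearly higher, edge <v2, v1>
            directed.add((v2, v1))
        elif -ratio > e:         # rule b): v2 clearly higher, edge <v1, v2>
            directed.add((v1, v2))
        elif abs(ratio) < e:     # rule c): approximately equal, edges both ways
            directed.add((v1, v2))
            directed.add((v2, v1))
    return directed
```

Because the rules are symmetric in $v_1$ and $v_2$, the result does not depend on the order in which the endpoints of an undirected edge are listed.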
3) Simplify the event data snapshot graph
The event data snapshot graph may still be very large, which obscures the event pattern features and degrades event detection. The graph therefore needs to be simplified so that it describes the pattern features of the event more abstractly, while also reducing the scale of the graph data and improving storage efficiency and processing performance. The present invention simplifies the event data snapshot graph by merging sensor nodes, according to the following rules:
a) The data of merged nodes must be approximately equal: for $v_1, v_2 \in V(Gs_t)$, if $\langle v_1, v_2 \rangle \in E(Gs_t)$ and $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then $v_1$ and $v_2$ are merged into one new node;
b) When two or more approximately equal nodes are merged into a new node, all edges associated with those nodes are attached to the new node.
Simplifying the event data snapshot graph by these merging rules eliminates redundancy and describes the data characteristics more abstractly. The level of abstraction is determined by the merging error range e: the larger e is, the higher the degree of merging and the simpler the event pattern; conversely, the smaller e is, the lower the degree of merging and the more complex the event pattern. The constant e is a preset value, typically chosen between 2.5% and 5%.
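One way to realise the merging rules is a union-find pass over the directed edges, sketched below. This is an assumption about the mechanics rather than the patented procedure: merging is applied transitively (if a merges with b and b with c, all three end up in one node), readings are assumed positive, and the name `simplify` is hypothetical.

```python
def simplify(directed, data, e=0.05):
    """directed: set of (tail, head) edges of Gs_t; data: reading per vertex.
    Returns (merged_nodes, merged_edges); each merged node is a frozenset of originals."""
    parent = {v: v for v in data}

    def find(v):
        # Path-halving lookup of the representative of v's group.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Rule a): merge endpoints of an edge whose readings are approximately equal.
    for v1, v2 in directed:
        total = data[v1] + data[v2]
        if total != 0 and abs(data[v1] - data[v2]) / total < e:
            parent[find(v1)] = find(v2)

    groups = {}
    for v in data:
        groups.setdefault(find(v), set()).add(v)
    label = {v: frozenset(groups[find(v)]) for v in data}

    # Rule b): re-attach every edge to the merged nodes, dropping edges inside a group.
    merged_edges = {(label[a], label[b]) for a, b in directed if label[a] != label[b]}
    return set(label.values()), merged_edges
```

A larger e merges more vertices and yields a smaller, more abstract graph, which is the trade-off described above.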
A graph similarity algorithm based on structural association degree is used to query the event pattern graph database for event pattern graphs similar to the event graph, and thereby determine the type of the current event. Similarity queries over graph data are expensive and unsuitable for event detection scenarios with strict real-time requirements. The present invention therefore extracts a structural feature sequence from the graph data based on structural association degree, converting the graph similarity query into a similarity query over structural feature sequences. The event pattern database stores the structural feature sequences corresponding to the various event pattern graphs. The specific scheme is as follows:
1) Extract the basic structures of the graph data
The basic structures of graph data are defined as the ring (cycle) structure, the star structure, and the line structure. Compared with other structure types, such as frequent subgraphs and frequent subtrees, these three basic structures are easier to obtain and still capture the basic structural information of the graph. The three basic structures are defined as follows:
Ring structure: a series of vertices in the graph forms a closed loop containing at least 3 edges. The ring structure is denoted cycle(s), $s = \{v \mid v \in V \land v \text{ lies on the ring}\}$. The closed loop must not nest other rings, that is, it is a simple ring;
Star structure: a core vertex $v_0$ in the graph connects to several other vertices, no two of which are connected to each other, and $\mathrm{degress}(v_0) \ge 3$. The star structure is denoted star($v_0$, s), $s = \{v \mid v_0, v \in V \land v \text{ is a neighbor of } v_0\}$, where $\mathrm{degress}(v_0)$ denotes the degree of node $v_0$;
Line structure: a structure formed by a chain of vertices connected end to end. The line structure is denoted line(s), $s = \{v \mid v \in V \land \mathrm{degress}(v) \le 2\}$, where $\mathrm{degress}(v)$ denotes the degree of node v.
A graph may contain many ring structures, and ring structures are sometimes nested in one another. Considering all ring structures would both increase the complexity of the problem and count some ring structures repeatedly. Because every ring structure is composed of basic rings, the present invention considers only basic ring structures. The basic idea is to first find all ring structures in the graph by depth-first traversal with backtracking, and then compare every pair of ring structures A and B: if A is a subset of B, that is, ring structure B contains ring structure A, then B is deleted. What remains in the end are the basic ring structures.
When extracting star and line structures, the degree of every vertex in the graph is computed first; each vertex with degree greater than or equal to 3 is taken as the core of a star structure. If a vertex has degree 1 and its adjacent vertex has degree less than or equal to 2, the traversal continues through adjacent vertices until a vertex with degree greater than 2 is reached, thereby forming a line structure.
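The extraction procedure can be sketched as follows. This is a small illustration under several assumptions: the graph is treated as undirected and given as an adjacency map with comparable vertex labels, a ring is represented by its vertex set (so two different rings over the same vertices are not distinguished), and a line stops before the first vertex of degree greater than 2. All function names are hypothetical.

```python
def all_cycles(adj):
    """Depth-first traversal with backtracking; returns vertex sets of rings with >= 3 edges."""
    found = set()

    def dfs(start, current, path, on_path):
        for w in adj[current]:
            if w == start and len(path) >= 3:
                found.add(frozenset(path))
            elif w not in on_path and w > start:
                # Only visit vertices larger than start so each ring is rooted once.
                on_path.add(w)
                path.append(w)
                dfs(start, w, path, on_path)
                path.pop()
                on_path.remove(w)

    for s in adj:
        dfs(s, s, [s], {s})
    return found

def basic_cycles(adj):
    """Delete every ring B that strictly contains another ring A."""
    cycles = all_cycles(adj)
    return {c for c in cycles if not any(other < c for other in cycles)}

def stars(adj):
    """Each vertex of degree >= 3 is the core of a star; maps core -> neighbor set."""
    return {v: frozenset(adj[v]) for v in adj if len(adj[v]) >= 3}

def lines(adj):
    """Start at each degree-1 vertex and walk while the next vertex has degree <= 2."""
    result = set()
    for v in adj:
        if len(adj[v]) != 1:
            continue
        path, prev, cur = [v], None, v
        while True:
            nxt = [w for w in adj[cur] if w != prev]
            if not nxt or len(adj[nxt[0]]) > 2:
                break
            prev, cur = cur, nxt[0]
            path.append(cur)
        if len(path) >= 2:
            result.add(frozenset(path))
    return result
```

Enumerating all rings is exponential in the worst case, which is acceptable here only because the simplified event data snapshot graph is expected to be small.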
2) Extract the structural feature sequence of the graph data based on structural association degree
After all basic structures have been extracted, the next step is to rank them by importance, converting the graph structure data into a sequence of basic structures. The importance of each structure is measured by the degree of association between structures:
Association: for any two basic structures $s_i$ and $s_j$ in a graph, if $\mathrm{cvNum}(s_i, s_j) \ge 1$, then $s_i$ and $s_j$ are associated, denoted $\mathrm{incident}(s_i, s_j) = 1$; if $\mathrm{cvNum}(s_i, s_j) = 0$, then $\mathrm{incident}(s_i, s_j) = 0$, meaning that $s_i$ and $s_j$ are not associated. Formally:
$$\mathrm{incident}(s_i, s_j) = \begin{cases} 1 & \text{if } \mathrm{cvNum}(s_i, s_j) \ge 1 \\ 0 & \text{if } \mathrm{cvNum}(s_i, s_j) = 0 \end{cases}$$
where $\mathrm{cvNum}(s_i, s_j)$ denotes the number of vertices shared by $s_i$ and $s_j$, and $i \ne j$;
Association degree based on the number of associated structures: given a graph g containing N basic structures, the association degree of the i-th basic structure $s_i$ is:
$$\mathrm{sNum\_CD}(s_i) = \sum_{j=1,\, j \ne i}^{N} \mathrm{incident}(s_i, s_j)$$
where $1 \le i \le N$, from which it follows that $\mathrm{sNum\_CD}(s_i) \le N - 1$. If a basic structure s is associated with k basic structures, then its association degree is $\mathrm{sNum\_CD}(s) = k$;
According to the above definitions, the event data snapshot graph is converted into a basic structure sequence ordered by association degree.
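The conversion from extracted structures to a ranked sequence can be sketched as below. The encoding is an assumption made for illustration: each structure is a pair of a one-letter type code ("C" for ring, "S" for star, "L" for line) and its vertex set, a star's vertex set includes its core, and ties in association degree keep their input order. The function names are hypothetical.

```python
def association_degrees(structures):
    """structures: list of (type_code, vertex_set).
    sNum_CD(s_i) counts the other structures sharing at least one vertex with s_i."""
    degrees = []
    for i, (_, si) in enumerate(structures):
        count = sum(
            1 for j, (_, sj) in enumerate(structures)
            if j != i and si & sj  # cvNum(s_i, s_j) >= 1
        )
        degrees.append(count)
    return degrees

def structure_sequence(structures):
    """Order structures by decreasing association degree and emit their type codes."""
    degrees = association_degrees(structures)
    order = sorted(range(len(structures)), key=lambda i: -degrees[i])
    return "".join(structures[i][0] for i in order)
```

The resulting string is what the similarity query in the next step compares, so the most strongly associated structures occupy the leading positions.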
3) Similarity query algorithm for structural feature sequences of graph data
The present invention computes the similarity between a source string S and a target string T by edit distance. The edit distance is the number, or cost, of the minimum edit operations required to transform S into T, where an edit operation deletes, inserts, or replaces the character at some position of the string. Each operation has an associated cost, and the cost of a given sequence of operations equals the sum of the costs of the individual operations in the sequence.
In the basic structure sequence of an event data snapshot graph, the earlier a structure is ranked, the larger its association degree and hence its importance, and the more likely it is to represent the main features of the graph. The first structure in the sequence is the most important in the graph, so editing it should also cost the most. A monotonically decreasing exponential function $f(x) = a^{-x}$ is therefore defined as the cost of each single-character edit operation;
Sequence edit distance similarity: given a sequence database $Set = \{s_1, s_2, \ldots, s_n\}$, a query sequence qStr, and an edit distance threshold $\tau$, the sequence similarity query returns all sequences $s_i$ in Set that satisfy $\mathrm{SED}(qStr, s_i) < \tau$;
Given a query sequence, the edit distance between each sequence in the sequence database and the query sequence is computed with the string edit distance, and the result returns all sequences in the database whose edit distance to the query sequence is less than the given cost threshold $\tau$. The pseudocode of the SED algorithm based on the weighted cost function is as follows:
The time complexity of the edit distance algorithm is O(mn) and its space complexity is O(mn); if the order of the edit operations need not be recorded, the space complexity is O(min(m, n)), where m and n are the lengths of the source string S and the target string T, respectively.
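A position-weighted edit distance of this kind can be sketched with the standard dynamic-programming recurrence. The details below are assumptions, since the text does not fix them: x is taken as the 0-based position of the edited character, deletions are charged at the source position, insertions at the target position, substitutions at the smaller of the two, and a = 2. The names `sed` and `similar_sequences` are hypothetical.

```python
def sed(source, target, a=2.0):
    """Weighted edit distance with per-operation cost f(x) = a ** (-x), a > 1."""
    m, n = len(source), len(target)

    def cost(x):
        return a ** (-x)

    # dp[i][j]: minimum cost of transforming source[:i] into target[:j].
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + cost(i - 1)
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + cost(j - 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = source[i - 1] == target[j - 1]
            substitute = 0.0 if same else cost(min(i, j) - 1)
            dp[i][j] = min(
                dp[i - 1][j] + cost(i - 1),      # delete source[i-1]
                dp[i][j - 1] + cost(j - 1),      # insert target[j-1]
                dp[i - 1][j - 1] + substitute,   # match or replace
            )
    return dp[m][n]

def similar_sequences(database, query, tau, a=2.0):
    """Return every stored sequence whose weighted distance to the query is below tau."""
    return [s for s in database if sed(query, s, a) < tau]
```

For example, `sed("SCL", "SCC")` is 0.25 while `sed("SCL", "LCL")` is 1.0, so a change in the leading (most important) structure is penalised four times as heavily as a change in the third position. The table uses O(mn) space; keeping only two rows would give the O(min(m, n)) variant mentioned above.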
Algorithm implementation example
When a data anomaly occurs in the sensor network (or a user issues a query), the event region is determined, the event graph is built, the event pattern graph database is queried, and the event type is determined from the returned result. The method for determining the event region and the event graph query method are introduced below:
Determining the event region: in the first case, the user issues a query command and the event region is specified by the user; in the second case, a data anomaly occurs in the wireless sensor network, and the region containing the sensor nodes with anomalous data together with the other associated sensor nodes constitutes the event region.
Event graph query: given an event graph, find the event pattern graphs in the event pattern graph database that are similar to it, and return the result.
Based on the above analysis, the concrete DSG algorithm is as follows:
Experimental performance analysis
Event scenarios
Three specific events are defined according to practical application backgrounds: detecting a heat source in a region, detecting water flow in a region, and detecting a high-oxygen zone in a region. The scenarios of these three events are introduced below:
Event scenario 1: based on the application background of fire prevention, a scenario of detecting a heat source in a region is designed. In this scenario a heat source simulates the cause of a fire, so the effectiveness of detecting the heat source in the region approximates the effectiveness of fire detection in a real environment.
Event scenario 2: based on the application background of detecting water seepage or water ingress in tunnels, a scenario of detecting water flow in a region is designed. The effectiveness of detecting water flow in the event region approximates the effectiveness of tunnel seepage or water ingress detection in a real environment.
Event scenario 3: based on the application background of detecting high-oxygen-content zones in coal mines, a scenario of detecting a high-oxygen zone in a region is designed. The effectiveness of detecting the high-oxygen zone in the event region approximates the effectiveness of high-oxygen-content zone detection in a real environment.
The experiments use actually monitored heat source temperature data, relative humidity data of regions with flowing water, and coal mine oxygen concentration data to synthesize the simulated data required. Simulation experiments are carried out for each of the three specific events, and the event detection performance is evaluated.
In the simulation experiments, the wireless sensor network consists of 256 nodes distributed over a 16 × 16 spatial area, and this area represents the event region.
Event pattern graph description
Event pattern graph 1 is an abstract description of fire snapshot data at a given moment. The temperature is highest at the heat source, and the surrounding temperature decreases as distance increases; Fig. 2 approximately depicts this data state. Event pattern graph 1 approximately describes the data pattern of a fire with few nodes and directed edges.
Event pattern graph 2 is an abstract description of snapshot data of a region with flowing water at a given moment. Along the edge nearest the water flow the relative humidity is highest and approximately equal everywhere, and in the area farther from the water flow the relative humidity is lower and also approximately equal; Fig. 3 approximately depicts this data state. In the figure, the data of the 4 upper nodes are approximately equal, as are the data of the 4 lower nodes, so the nodes within each group are connected by bidirectional directed edges. The data of the 4 lower nodes are higher than those of the 4 upper nodes, which is represented by unidirectional edges pointing from the nodes with lower data to the nodes with higher data. Event pattern graph 2 thus describes the data pattern of a region with flowing water using few nodes and directed edges.
Event pattern graph 3 is an abstract description of snapshot data of the high-oxygen-content distribution in a coal mine at a given moment. The oxygen concentration in the high-oxygen zone is higher and essentially uniform, and the surrounding oxygen concentration is lower; Fig. 4 approximately depicts this data state. In the figure, the data of the 6 upper vertices are all approximately equal, so they are connected to one another by bidirectional directed edges; the oxygen concentration at the two lower vertices is lower, so unidirectional edges point from them to the vertices with higher oxygen concentration. Event pattern graph 3 thus approximately describes the data pattern of the high-oxygen-content distribution.
The three event pattern graphs above each represent the data pattern of one specific event, but they can also represent the data patterns of other events. For example, event pattern graph 1 can also describe the data pattern of a gas leak: the gas concentration is highest at the center of the leak and decreases gradually around it, becoming lower the farther from the center, so event pattern graph 1 approximately describes the data pattern of a gas leak event. This shows that event pattern graphs are general and can describe the data patterns of complex events.
Experiments
Experiment 1: simulation of detecting a heat source event in a region. First, a data simulator generates 120 sets of per-node snapshot data for normal conditions, where a normal condition is a data state with no heat source in the region, including environments with stable temperature and environments with large temperature variation. Then the data simulator generates 120 sets of per-node snapshot data containing a heat source. The center temperature of the simulated heat source is generated randomly, the center coordinates of the heat source are chosen randomly within the event region, and the outward radiation range of the heat source varies, for example with wind, without wind, or with obstacles near the heat source. Parameters of the simulated data: the normal temperature range is [0 °C, 40 °C], the heat source center temperature range is [50 °C, 100 °C], and the overall temperature data width is 100 °C. In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern for detecting a heat source in the region is event pattern graph 1.
Fig. 5 shows the event detection performance when nodes are merged under different temperature relative errors. As the figure shows, when nodes are merged with a small temperature relative error, the resulting event graph cannot correctly distinguish the normal state of a region from the presence of a heat source, and normal states are falsely reported as containing a heat source. This is because with a small relative error few nodes are merged and many temperature levels remain between nodes, so the main trend of rising temperature is not reflected effectively, causing a large number of false alarms. When nodes are merged with a larger temperature relative error, the normal state and the presence of a heat source can be distinguished effectively, but some heat sources are missed. With a larger relative error more nodes are merged, and the main trend of rising temperature is reflected effectively. In this case, for a normal region, the temperature states in the resulting event graph are essentially uniform with no rising trend, whereas for a region containing a heat source the main rising trend is retained in the event graph, so the two situations can be distinguished effectively.
However, for some regions containing a heat source, the center temperature of the heat source is relatively low and the temperature rise is not pronounced, so the temperature states in the resulting event graph change little and cannot effectively reflect the main rising trend. These cases are missed, and recall declines.
Experiment 2: simulation of detecting a water flow event in a region. First, a data simulator generates 120 sets of per-node snapshot data for normal conditions, where a normal condition is a data state with no water flow in the region, including environments with stable relative humidity and environments with large relative humidity variation. Then the data simulator generates 120 sets of per-node snapshot data containing water flow. The simulated water flows into the event region from different directions and, within the event region, flows out in a straight line, flows out along a bend, or does not flow out because the flow is cut off. Parameters of the simulated data: the normal relative humidity range is [30, 80], the relative humidity near the water flow is higher than normal by [5, 15], and the overall relative humidity data width is 50, where relative humidity is expressed in percent (%). In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern of the water flow event in the region is event pattern graph 2.
Fig. 6 shows the event detection performance when nodes are merged under different humidity relative errors. As the figure shows, when nodes are merged with a small humidity relative error, the resulting event graph essentially does not contain the defined water flow pattern graph, whether the state is normal or water flow is present. This is because both the normal data and the data near the water flow vary slightly; with a small humidity relative error the nodes cannot be merged effectively, so the resulting event graph essentially does not contain the water flow pattern graph and a large number of events are missed. As the humidity relative error increases, recall begins to rise, because the nodes near the water flow are merged effectively and reflect the water flow pattern. Precision, however, first falls and then rises. Under normal conditions the variation in relative humidity is concentrated in the range [2, 3], which corresponds to a relative error in the range [4%, 6%]; at these settings the event graphs of normal states also contain the water flow pattern graph, causing many false alarms on normal states. As the humidity relative error increases further, the normal variation in relative humidity is smoothed out and the false alarm rate decreases. When the humidity relative error becomes larger still, the data across the region tend to become uniform, water flow events can no longer be detected, and missed detections increase.
Experiment 3: simulation of detecting a high-oxygen zone in a region. First, a data simulator generates 120 sets of per-node snapshot data for normal conditions, where a normal condition is a data state with no high-oxygen zone in the region, including environments with stable oxygen concentration and environments with large oxygen concentration variation. Then the data simulator generates 120 sets of per-node snapshot data containing a high-oxygen zone. The center coordinates of the simulated high-oxygen zone are chosen randomly within the event region, and the area of the zone varies. Parameters of the simulated data: the normal oxygen content range in the region is [15, 18], the oxygen content range of the high-oxygen zone is [18, 21], and the overall oxygen content data width is 6, where oxygen content is expressed in %. In the node association graph of the whole event region, each node is associated with its 8 surrounding nodes. The data pattern of the high-oxygen zone event in the region is event pattern graph 3.
Fig. 7 shows the event detection performance when nodes are merged under different oxygen content relative errors. As the figure shows, event recall rises gradually as the oxygen content relative error increases. This is because more data are merged, the values of neighboring nodes become more nearly equal, and the data of the high-oxygen zone become uniform and clearly higher than those of the normal area, matching the event pattern of a high-oxygen zone, so the event is detected effectively. When the oxygen content relative error is large, however, the data of the high-oxygen zone and the normal area become uniform, the two can no longer be told apart, and recall declines. As the oxygen content relative error increases, event precision remains at a high level, because when nodes are merged under normal conditions it is difficult to form the event pattern of a high-oxygen zone, in which one region has uniform data that are higher than the surrounding data; the false alarm rate is therefore low and precision is high.
The experiments above show that, when the anomaly detection method based on data snapshot graphs builds the event graph, the data error range used for node merging directly affects event detection performance. Setting the data merging error range reasonably is therefore the key factor in achieving efficient event detection.
The anomaly detection method based on data snapshot graphs models the snapshot data of the wireless sensor network as a graph, generating a data snapshot graph that abstractly describes the data characteristics of an event in the sensor network. The experiments show that, when modeling snapshot data as a graph, reasonably selecting the data error range for node merging is the key factor in graph modeling of the data.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the present invention.

Claims (1)

1. An anomaly detection method based on a data snapshot graph, characterized in that it comprises the following steps:
(1) collecting and preprocessing the detection data in the currently monitored region of the wireless sensor network, and determining the event-related region;
(2) obtaining the data set related to the current event, abstracting the event data set with a graph model, and converting the event data set into an event data snapshot graph;
(3) using a graph similarity search algorithm based on structural association degree to query the event pattern graph database for event pattern graphs similar to the event data snapshot graph of the current event, and determining the type of the current event;
the event pattern graph database is a set of event pattern graphs, and each event pattern graph is an event data snapshot graph that abstractly describes an event type;
the event pattern graph is obtained from domain expert knowledge or from data analysis, and is an event graph based on a data snapshot; the data snapshot is the data set of all nodes in the sensor network at the time the event occurs, and the event graph built from this data set is the snapshot graph at the event time and also the event pattern graph of the event;
the graph similarity search algorithm based on structural association degree specifically extracts basic structures from the graph data, converts the graph data into a basic structure sequence by the degree of association between basic structures, and thereby converts the graph similarity query problem into a sequence similarity query problem;
in step (1), the node association graph is built based on the physical association and data association of the sensor nodes, and the event region is determined from the node association graph; the node association graph includes the global node association graph and subgraphs of the global node association graph, and is built as follows:
the node association graph at time t is formally expressed as:
$G_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $f_v: V \to ID$ is the vertex labeling function; graph vertices correspond one-to-one to sensor nodes, each node of the wireless sensor network forming one vertex of the node association graph;
let $d(v_i)_t$ denote the monitoring data of vertex $v_i$ at time t; the edge set E is constructed as follows: for any two vertices $v_1, v_2 \in V$, if the sensor nodes corresponding to $v_1$ and $v_2$ are single-hop communication neighbors, or if they are communication neighbors within k hops and there exist functions $f_1$ and $f_2$ such that $f_1(d(v_1)_t) = f_2(d(v_2)_t)$, then there is an edge $(v_1, v_2) \in E$;
the event-related region is determined as follows: at the time t of event detection, for any vertex $v_i \in V$, if $|d(v_i)_{t-1} - d(v_i)_t| \,/\, |d(v_i)_{t-1} + d(v_i)_t| \ge e$, then $v_i$ is an event-related vertex, and the region containing all event-related vertices at time t is the event-related region, where the constant e is a preset value;
after the event boundary is determined, the resulting node association graph is a subgraph of the global node association graph, defined as follows:
$Ge_t = \langle V, E, ID, f_v \rangle$
where V is the vertex set of the graph, comprising all event-related vertices; E is the edge set of the graph; ID is the set of vertex identifiers; $f_v: V \to ID$ is the vertex labeling function; graph vertices correspond one-to-one to sensor nodes;
in step (2), the event data set is abstracted with a graph model and converted into an event data snapshot graph; the data snapshot and the event data snapshot graph are as follows:
21) the wireless sensor network data snapshot S is defined as follows:
for a wireless sensor network N with k nodes $\{n_1, n_2, \ldots, n_k\}$, the data snapshot of N at time t is the set $\{d(n_1)_t, d(n_2)_t, \ldots, d(n_k)_t\}$;
22) the event data snapshot graph $Gs_t$ at time t is computed from the node association graph $Ge_t$ at time t according to the data correlation between nodes, and is formally expressed as:
$Gs_t = \langle V, E, ID, DV, f_v, g_v \rangle$
where $DV = \{d(v_i)_t\}$ is the set of monitored values at time t of all sensor nodes in the event region, and $g_v: V \to DV$ is the data mapping function of the vertices;
the vertex set of the event data snapshot graph $Gs_t$ is identical to that of the node association graph $Ge_t$ and comprises all sensor nodes in the event region;
the edge set $E(Gs_t)$ of the event data snapshot graph $Gs_t$ is constructed as follows:
a) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_1)_t - d(v_2)_t) / (d(v_1)_t + d(v_2)_t) > e$, then there is a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$;
b) for any edge $(v_1, v_2) \in E(Ge_t)$, if $(d(v_2)_t - d(v_1)_t) / (d(v_1)_t + d(v_2)_t) > e$, then there is a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$;
c) for any edge $(v_1, v_2) \in E(Ge_t)$, if $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then there are both a directed edge $\langle v_2, v_1 \rangle \in E(Gs_t)$ and a directed edge $\langle v_1, v_2 \rangle \in E(Gs_t)$;
where the constant e is a preset value; the event data snapshot graph is a directed graph that describes the data state of each node in the wireless sensor network event region and the relations between those data states;
23) the event data snapshot graph is simplified by merging sensor nodes, according to the following rules:
a) the data of merged nodes must be approximately equal: for $v_1, v_2 \in V(Gs_t)$, if $\langle v_1, v_2 \rangle \in E(Gs_t)$ and $|d(v_1)_t - d(v_2)_t| / (d(v_1)_t + d(v_2)_t) < e$, then $v_1$ and $v_2$ are merged into one new node;
b) when two or more approximately equal nodes are merged into a new node, all edges associated with those nodes are attached to the new node;
In described step (3), figure Similarity algorithm based on structure connection degree is specially, it is primarily based on structure connection degree and extracts the architectural feature sequence of diagram data, similar for diagram data inquiry is converted into the inquiry of architectural feature sequence similarity, then in event schema chart database, the event schema figure similar to event data snapshot plotting is searched, it is judged that the type of current event; Detailed process includes into lower step:
31) the basic structures of graph data are defined as the ring structure, the star structure and the linear structure, and the three basic structures are defined as follows:
Ring structure: a series of vertices in the graph forms a closed loop whose number of edges is greater than or equal to 3; the ring structure is denoted cycle(s), s = {v | v ∈ V ∧ the nodes v constitute a ring}, wherein the closed loop must not nest other rings, i.e. the closed loop is a simple ring;
Star structure: a core vertex v0 in the graph is connected to several other vertices, no two of those other vertices are connected to each other, and degress(v0) ≥ 3; the star structure is denoted star(v0, s), s = {v | v0, v ∈ V ∧ v is a neighbor of v0}, where degress(v0) denotes the degree of node v0;
Linear structure: a structure formed by a string of vertices connected end to end; the linear structure is denoted line(s), s = {v | v ∈ V ∧ degress(v) ≤ 2}, where degress(v) denotes the degree of node v;
32) the basic structures are extracted as follows:
1. all ring structures in the graph are first found by depth-first traversal with backtracking;
2. any two of these ring structures A and B are compared; if A is a subset of B, i.e. ring structure B contains ring structure A, then ring structure B is deleted;
3. step 2 is repeated until no ring structure contains another ring structure, which yields the ring structures of all simple rings;
4. the degree of each vertex in the graph is calculated, and each vertex whose degree is greater than or equal to 3 is taken as a star structure;
5. the degree of each vertex in the graph is calculated; if the degree of a vertex equals 1 and the degree of its adjacent vertex is less than or equal to 2, traversal continues through the adjacent vertices until a vertex with degree greater than 2 is reached, thereby forming a linear structure;
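The five extraction steps can be illustrated with the following Python sketch. It assumes that the structures are extracted from the undirected view of the graph, that vertices are comparable integers, and that a ring is identified by its vertex set; the function names (`adjacency`, `find_rings`, `find_stars`, `find_lines`) are hypothetical. Following the definition of line(s), the vertex of degree greater than 2 that stops the traversal is not included in the line.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Adj = Dict[int, Set[int]]


def adjacency(edges: List[Tuple[int, int]]) -> Adj:
    """Build an undirected adjacency map from an edge list."""
    adj: Adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def find_rings(adj: Adj) -> Set[FrozenSet[int]]:
    """Steps 1-3: enumerate cycles by DFS with backtracking, keep simple rings."""
    cycles: Set[FrozenSet[int]] = set()

    def dfs(start: int, current: int, path: List[int], on_path: Set[int]) -> None:
        for nxt in adj[current]:
            if nxt == start and len(path) >= 3:
                cycles.add(frozenset(path))  # closed loop with >= 3 edges
            elif nxt not in on_path and nxt > start:
                path.append(nxt)
                on_path.add(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()  # backtrack
                on_path.remove(nxt)

    for s in adj:
        dfs(s, s, [s], {s})
    # Delete every ring whose vertex set strictly contains another ring.
    return {c for c in cycles if not any(o < c for o in cycles)}


def find_stars(adj: Adj) -> List[Tuple[int, FrozenSet[int]]]:
    """Step 4: every vertex of degree >= 3 is the core of a star structure."""
    return [(v, frozenset(nb)) for v, nb in sorted(adj.items()) if len(nb) >= 3]


def find_lines(adj: Adj) -> List[List[int]]:
    """Step 5: walk from each degree-1 vertex while the degree stays <= 2."""
    lines: List[List[int]] = []
    ends_used: Set[int] = set()
    for v in sorted(adj):
        if len(adj[v]) != 1 or v in ends_used:
            continue
        path = [v]
        prev, cur = v, next(iter(adj[v]))
        while len(adj[cur]) <= 2:
            path.append(cur)
            onward = [n for n in adj[cur] if n != prev]
            if not onward:  # reached the other degree-1 end of the line
                ends_used.add(cur)
                break
            prev, cur = cur, onward[0]
        if len(path) >= 2:
            lines.append(path)
    return lines


# A triangle 1-2-3 with a tail 3-4-5-6 hanging off vertex 3.
g = adjacency([(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6)])
rings, stars, lines = find_rings(g), find_stars(g), find_lines(g)
```

For this small graph the sketch finds one ring {1, 2, 3}, one star centred on vertex 3 with neighbours {1, 2, 4}, and one line 6, 5, 4. Exhaustive cycle enumeration is exponential in the worst case, so it is only practical here because simplified event graphs are expected to be small.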
33) the method for extracting the structural feature sequence of graph data based on structure connection degree is as follows:
Because the basic structures differ in importance, the sequence of basic structures is ordered by importance, so that the graph-structured data is converted into a sequence of basic structures; the importance of each structure is measured by the degree of association between structures:
Association: for any two basic structures s_i and s_j in a graph, if cvNum(s_i, s_j) ≥ 1, then structure s_i and structure s_j are associated, denoted incident(s_i, s_j) = 1; if cvNum(s_i, s_j) = 0, then incident(s_i, s_j) = 0, meaning that structure s_i and structure s_j are not associated; the association is formally defined as:
$$\mathrm{incident}(s_i, s_j) = \begin{cases} 1 & \text{if } \mathrm{cvNum}(s_i, s_j) \geq 1 \\ 0 & \text{if } \mathrm{cvNum}(s_i, s_j) = 0 \end{cases}$$
Wherein cvNum(s_i, s_j) denotes the number of vertices shared by structure s_i and structure s_j, and i ≠ j;
Degree of association based on the number of associated structures: given a graph g that contains N basic structures, the degree of association of the i-th basic structure s_i is:
$$\mathrm{sNum\_CD}(s_i) = \sum_{j=1,\, j \neq i}^{N} \mathrm{incident}(s_i, s_j)$$
Wherein: 1 ≤ i ≤ N and sNum_CD(s_i) ≤ (N-1); if a basic structure s is associated with k basic structures, then the degree of association of this basic structure is sNum_CD(s) = k;
According to the above definitions, the event data snapshot graph is converted into a sequence of basic structures ordered by degree of association;
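The definitions of cvNum, incident and sNum_CD, and the resulting ordering, can be sketched as below. Several details are assumptions made only for illustration: each basic structure is represented as a pair of a one-letter kind code ("C" for ring, "S" for star, "L" for line) and its vertex set, a star's vertex set includes its core vertex, and ties in the degree of association are broken alphabetically by kind code. The text does not specify the encoding or a tie-breaking rule.

```python
from typing import FrozenSet, List, Tuple

# (kind code, vertex set); the one-letter codes are an assumed encoding.
Structure = Tuple[str, FrozenSet[int]]


def cv_num(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """Number of vertices shared by two basic structures."""
    return len(a & b)


def incident(a: FrozenSet[int], b: FrozenSet[int]) -> int:
    """1 if the two structures share at least one vertex, else 0."""
    return 1 if cv_num(a, b) >= 1 else 0


def association_degrees(structs: List[Structure]) -> List[int]:
    """sNum_CD for every structure: how many other structures it touches."""
    return [
        sum(incident(vi, vj) for j, (_, vj) in enumerate(structs) if j != i)
        for i, (_, vi) in enumerate(structs)
    ]


def structure_sequence(structs: List[Structure]) -> str:
    """Order structures by descending sNum_CD and emit their kind codes."""
    degrees = association_degrees(structs)
    order = sorted(
        range(len(structs)),
        key=lambda i: (-degrees[i], structs[i][0]),  # tie-break: assumption
    )
    return "".join(structs[i][0] for i in order)


example = [
    ("C", frozenset({1, 2, 3})),      # ring 1-2-3
    ("S", frozenset({3, 1, 2, 4})),   # star centred on 3
    ("L", frozenset({6, 5, 4})),      # line 6-5-4
]
degrees = association_degrees(example)
sequence = structure_sequence(example)
```

In this example the star shares vertices with both the ring and the line, so it has the highest degree of association and comes first in the sequence.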
34) the structural feature sequence similarity query algorithm for graph data proceeds as follows:
The similarity between a source string S and a target string T is calculated by edit distance; the edit distance is the minimum number, or minimum cost, of edit operations required to transform S into T, where an edit operation deletes, inserts or replaces the character at some position of the string; each transformation operation has an associated cost, and the cost of a given sequence of transformation operations equals the sum of the costs of the individual operations in the sequence;
In the basic structure sequence of an event data snapshot graph, the earlier a structure is ranked, the larger its structure connection degree and hence its importance, and the more likely it is that this structure represents the main feature of the graph; the first structure in the sequence is the most important in the graph, so the cost of editing it should also be the largest; accordingly, a monotonically decreasing exponential function f(x) = a^(-x) is defined as the cost function for each single-character change;
Sequence edit distance similarity: given a sequence database Set = {s1, s2, …, sn}, a query sequence qStr and an edit distance threshold τ, the sequence similarity query returns all sequences s_i in the sequence database Set that satisfy SED(qStr, s_i) < τ; SED denotes the string edit distance algorithm;
Given a query sequence, the string edit distance algorithm computes the edit distance between each sequence in the sequence database and the query graph sequence, and the result returns all sequences in the sequence database whose edit distance to the query sequence is less than the given cost threshold τ.
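A position-weighted edit distance of this kind can be sketched with the usual dynamic-programming table. The sketch makes assumptions the text leaves open: positions are counted from 0 so that the first structure costs a^0 = 1, deletions and replacements are charged at the position in the source sequence, insertions at the position in the target sequence, and a = 2 is used as the base (any a > 1 gives a decreasing cost). The names `sed` and `similarity_query` are hypothetical.

```python
from typing import List


def edit_cost(position: int, a: float = 2.0) -> float:
    """Cost f(x) = a^(-x) of editing the character at a 0-based position."""
    return a ** (-position)


def sed(source: str, target: str, a: float = 2.0) -> float:
    """Position-weighted edit distance from source to target."""
    m, n = len(source), len(target)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + edit_cost(i - 1, a)  # delete source[i-1]
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + edit_cost(j - 1, a)  # insert target[j-1]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = source[i - 1] == target[j - 1]
            dp[i][j] = min(
                dp[i - 1][j] + edit_cost(i - 1, a),      # deletion
                dp[i][j - 1] + edit_cost(j - 1, a),      # insertion
                dp[i - 1][j - 1]
                + (0.0 if same else edit_cost(i - 1, a)),  # match or replace
            )
    return dp[m][n]


def similarity_query(database: List[str], query: str, tau: float) -> List[str]:
    """Return every sequence whose distance to the query is below tau."""
    return [s for s in database if sed(query, s) < tau]


patterns = ["SCL", "SCC", "CCL", "LL"]
matches = similarity_query(patterns, "SCL", tau=0.5)
```

With these choices, changing the last structure of "SCL" costs 0.25 while changing the first costs 1, so "SCC" falls under the threshold 0.5 and "CCL" does not.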
CN201310549381.4A 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure Expired - Fee Related CN103561420B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310549381.4A CN103561420B (en) 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure

Publications (2)

Publication Number Publication Date
CN103561420A CN103561420A (en) 2014-02-05
CN103561420B true CN103561420B (en) 2016-06-08

Family

ID=50015537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310549381.4A Expired - Fee Related CN103561420B (en) 2013-11-07 2013-11-07 Method for detecting abnormality based on data snapshot figure

Country Status (1)

Country Link
CN (1) CN103561420B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104536996B (en) * 2014-12-12 2017-12-12 南京理工大学 Calculate node method for detecting abnormality under a kind of homogeneous environment
CN107225571B (en) * 2017-06-07 2020-03-31 纳恩博(北京)科技有限公司 Robot motion control method and device and robot
CN107704332B (en) * 2017-09-28 2021-06-15 努比亚技术有限公司 Screen freezing solution method, mobile terminal and computer readable storage medium
CN109902564B (en) * 2019-01-17 2021-04-06 杭州电子科技大学 Abnormal event detection method based on structural similarity sparse self-coding network
CN114365505A (en) * 2019-11-07 2022-04-15 阿里巴巴集团控股有限公司 Data-driven object graph for data center monitoring
CN115551060B (en) * 2022-10-20 2023-11-17 浙江瑞邦科特检测有限公司 Low-power-consumption data monitoring method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7734451B2 (en) * 2005-10-18 2010-06-08 Honeywell International Inc. System, method, and computer program for early event detection
CN102291739A (en) * 2011-08-16 2011-12-21 哈尔滨工业大学 Method for detecting wireless sensor network sparse events based on compressed sensing and game theory
CN102665253A (en) * 2012-04-20 2012-09-12 山东大学 Event detection method on basis of wireless sensor network
CN102724686A (en) * 2012-05-17 2012-10-10 北京交通大学 Event detection mechanism applicable to wireless sensor network
CN103179602A (en) * 2013-03-15 2013-06-26 无锡清华信息科学与技术国家实验室物联网技术中心 Method and device for detecting abnormal data of wireless sensor network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
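Research on network anomaly detection methods based on a data stream model; Wang Yuqin; Journal of Weifang University; 20060731; Vol. 6 (No. 4); full text *
Energy-optimized anomaly detection algorithm in wireless sensor networks; Lü Jianhua et al.; Journal of Nanjing University of Aeronautics and Astronautics; 20110731; Vol. 43; full text *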

Also Published As

Publication number Publication date
CN103561420A (en) 2014-02-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160608

Termination date: 20191107