CN102411687A - Deep learning detection method of unknown malicious codes - Google Patents

Info

Publication number
CN102411687A
Authority
CN
China
Prior art keywords
node
pond
input
space
htm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011103735580A
Other languages
Chinese (zh)
Other versions
CN102411687B (en)
Inventor
李元诚
樊庆君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
North China Electric Power University
Original Assignee
North China Electric Power University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by North China Electric Power University filed Critical North China Electric Power University
Priority to CN201110373558.0A priority Critical patent/CN102411687B/en
Publication of CN102411687A publication Critical patent/CN102411687A/en
Application granted granted Critical
Publication of CN102411687B publication Critical patent/CN102411687B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a deep learning detection method for unknown malicious code, belonging to the technical field of information security. The method comprises the following steps: first, extracting feature vectors of the files in a training set using byte-level n-grams; second, constructing an HTM (Hierarchical Temporal Memory) network structure and determining the input data length of each bottom-layer node of the HTM structure; third, taking the feature vectors as input and using the HTM algorithm to carry out sequence-pattern learning, training, and classification inference; fourth, extracting feature vectors of the files in a test set using byte-level n-grams; fifth, inputting these feature vectors into the trained HTM network for sequence recognition, so as to determine whether the files in the test set contain malicious code. The method has relatively strong noise resistance and fault tolerance and adapts well. It also improves the recognition capability and recognition rate of malicious code detection, enabling accurate detection of newly emerging malicious code.

Description

Deep learning detection method of unknown malicious codes
Technical field
The invention belongs to the technical field of information security, and in particular relates to a deep learning detection method for unknown malicious code.
Background art
With the continuous development of computer and network technology, computers have become indispensable tools in daily life. To gain economic or political advantage, or to take personal revenge, organizations and individuals make extensive use of malicious code for illegal activities. As a result, new kinds of malicious code emerge constantly, the techniques they employ grow ever more advanced, and their capacity to spread, cause harm, and hide keeps increasing. Although detection techniques for malicious code are also developing, detection technology and detection capability still lag behind the development of malicious code; detecting unknown malicious code in particular poses a great challenge.
At present there are two main techniques for detecting malicious code: pattern matching based on signatures, and detection based on behavioral rules of malicious code.
Signature-based pattern matching compares the signature of the file under inspection against the malicious-code signature strings in a signature database; a successful match indicates that the file contains malicious code, and otherwise the file is considered clean. This technique requires analysts to discover and obtain a malicious code sample promptly and to extract a signature that uniquely identifies it. The signature must also be added to the signature database in time, so that the malicious code can be detected before it spreads widely and breaks out. The technique is unsuitable for malicious code that uses polymorphic or metamorphic techniques, and for malicious code that spreads rapidly, is highly destructive, and is narrowly targeted. Detection based on behavioral rules detects malicious code according to common behavioral rules of malicious code predefined by experts. Its main principle is that running malicious code is usually accompanied by behaviors such as changes to user privileges, registry modifications, opening of network ports, abnormal network communication, or some particular sequence of system operations. This technique suffers from serious lag: especially as computers run ever faster, by the time malicious behavior is detected it has often already caused irreparable damage to the system. Both techniques are after-the-fact: they can detect only known malicious code, or detect malicious code only after it has executed, by which time the damage has been done.
Summary of the invention
To address the above defects, the invention discloses a deep learning detection method for unknown malicious code, comprising the following steps:
1) extracting the feature vectors of the training-set files using byte-level n-grams;
2) constructing the HTM network structure and determining the input data length of each bottom-layer node of the HTM structure;
3) taking the feature vectors as input and using the HTM algorithm to carry out sequence-pattern learning, training, and classification inference;
4) extracting the feature vectors of the files in the test set using byte-level n-grams;
5) inputting the feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
Step 2) specifically comprises the following steps:
21) selecting an HTM network model with F layers and specifying that every node except those in the bottom layer has M child nodes;
22) truncating the file feature vector into segments of length $l = L / M^{F-1}$, and using the segments in order as the input samples of the bottom-layer nodes of the HTM network.
Step 3) specifically comprises the following steps:
31) taking the truncated file feature vectors as input to the HTM network; the bottom-layer nodes enter the learning phase, which lasts until the spatial poolers of all bottom-layer nodes have finished learning spatial patterns and their temporal poolers have finished learning temporal groups;
32) after the learning phase of step 31) ends, the bottom-layer nodes enter the inference phase; each new input sample is passed through inference at a bottom-layer node and then output to that node's parent in the next layer of the HTM network; the outputs of child nodes that share the same parent are concatenated and become the input for the learning phase of the next layer's node, which enters the learning phase and repeats the node learning process of step 31);
33) repeating the process of step 32) until the nodes of all layers of the HTM network have completed sequence-pattern learning.
Step 31) specifically comprises the following steps:
311) the binary sequences presented to a node are fed into its spatial pooler, which uses the maximum-distance parameter D to learn clusters of these sequences; using D, the spatial pooler stores a subset of the input patterns, called cluster centers; over time the number of new sequence patterns the spatial pooler produces per unit time decreases, and clustering stops when the number of new cluster centers in a time period falls below a configured threshold;
312) the spatial pooler outputs the learned sequence patterns to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
Step 32) specifically comprises the following steps:
321) using the formula
$P(e^- \mid c_i) = \gamma \prod_{k=1}^{M} \mathrm{input}(m_k^i)$
to compute the probability distribution of the input sequence $e^-$ over the spatial patterns $c_i$ of the node's spatial pooler, which after normalization serves as the output of the spatial pooler, where
[formula image not reproduced in the source text]
denotes the non-zero spatial patterns, M is the number of child nodes of the node, and $e^-$ is the input sequence to be recognized, coming from the bottom layer;
322) based on the output y of the spatial pooler, using the formula $\lambda(i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j)$ to compute the output of the temporal pooler, where $N_c$ is both the length of the vector y and the number of spatial patterns in the spatial pooler, and λ has length $N_g$.
The beneficial effects of the invention are as follows. It introduces the HTM algorithm, a new artificial-intelligence deep learning algorithm that imitates the structure and working principles of the human neocortex. The algorithm adopts a hierarchical tree network structure and applies the Bayesian-network principles of continuous information sharing between nodes and belief propagation, converting a complex problem into pattern matching and prediction. It requires no complicated preprocessing of the input data, has strong noise resistance and fault tolerance, and adapts well. In addition, it can learn new input patterns while performing inference with the existing model, which improves the recognition capability and recognition rate of malicious code detection and achieves accurate detection of newly emerging malicious code.
Description of drawings
Fig. 1 is a schematic diagram of the process of the detection method for unknown malicious code;
Fig. 2a is a schematic diagram of the HTM hierarchical tree structure model;
Fig. 2b is a schematic diagram of the spatial pooler and temporal pooler of node k;
Fig. 3 is a schematic diagram of the learning process of one node during HTM training;
Fig. 4 is a schematic diagram of the belief-propagation computation at node k during HTM inference.
Embodiment
The preferred embodiment is described in detail below with reference to the accompanying drawings. It should be emphasized that the following description is merely exemplary and is not intended to limit the scope of the invention or its applications.
The approach of the invention is as follows: a set of files containing malicious code serves as the training sample; byte-level n-grams are used for feature selection on the training-set files, so that each file corresponds to a feature vector; and the feature vectors are used as input to the HTM algorithm to train the HTM network. Finally, feature selection is applied to an unknown file to produce its feature vector, which is fed to the trained HTM network for pattern recognition to decide whether the file contains malicious code.
As shown in Fig. 1, the detection method for unknown malicious code comprises the following steps:
1) Extract the feature vectors of the training-set files using byte-level n-grams.
Standard datasets intended for malicious code detection can be downloaded from the Internet, and the training set is built by selecting files from such a dataset according to a specific rule, for example by malicious code type.
A byte-level n-gram is obtained by sliding a window of n bytes over a binary byte stream or text; each resulting token is n bytes long. For example, for the text "abcdef", the 2-gram sequence is ab, bc, cd, de, ef, and the 3-gram sequence is abc, bcd, cde, def.
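The sliding-window extraction can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation; the function name `byte_ngrams` is hypothetical, and the window is assumed to advance one byte at a time, as in the "abcdef" example.

```python
def byte_ngrams(data: bytes, n: int) -> list[bytes]:
    """Slide an n-byte window over `data`, advancing one byte per step."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Inputs shorter than n bytes yield an empty list.
    return [data[i:i + n] for i in range(len(data) - n + 1)]


if __name__ == "__main__":
    print([g.decode() for g in byte_ngrams(b"abcdef", 2)])
    print([g.decode() for g in byte_ngrams(b"abcdef", 3)])
```

Working on `bytes` rather than `str` keeps the same function usable for binary files as well as text.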
Take a file whose content is "abcd" as an example. Its 2-gram sequence is ab, bc, cd, so the file has three attributes, and the vector formed by them, {ab, bc, cd}, represents the file.
Quantizing each attribute yields the feature vector of the file. For the vector {ab, bc, cd} above, let a = 1, b = 2, c = 3, and d = 4 be the positions of the letters in the alphabet, and quantize each attribute as the sum of its letters' positions. Then ab quantizes to 3, bc to 5, and cd to 7, and the vector {3, 5, 7} is the feature vector of the file.
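The position-sum quantization of the "abcd" example can be reproduced as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the rule is implemented only for lowercase ASCII letters, because the text does not say how other byte values would be quantized.

```python
def alphabet_position(byte: int) -> int:
    """1-based alphabet position of a lowercase ASCII letter (a -> 1, b -> 2)."""
    if not ord("a") <= byte <= ord("z"):
        # Assumption: only the letter case from the worked example is covered.
        raise ValueError("only lowercase ASCII letters are handled in this sketch")
    return byte - ord("a") + 1


def quantize_ngram(gram: bytes) -> int:
    """Quantize one n-gram as the sum of its letters' alphabet positions."""
    return sum(alphabet_position(b) for b in gram)


def feature_vector(data: bytes, n: int = 2) -> list[int]:
    """Extract byte-level n-grams and quantize each one."""
    grams = [data[i:i + n] for i in range(len(data) - n + 1)]
    return [quantize_ngram(g) for g in grams]


if __name__ == "__main__":
    print(feature_vector(b"abcd"))  # [3, 5, 7]
```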
2) Construct the HTM network structure and determine the input data length of each bottom-layer node. Fig. 2a shows a 3-layer tree structure model in which every node except the bottom-layer nodes has two child nodes. Fig. 2b shows the internal structure of a single node k of the HTM structure, which consists of a spatial pooler and a temporal pooler. Step 2) specifically comprises the following steps:
21) Select an HTM network structure with F layers and specify that every node except those in the bottom layer has M child nodes.
As shown in Fig. 2a, with F = 3 and M = 2 the L3, L2, and L1 layers of the HTM network have 1, 2, and 4 nodes respectively.
22) Truncate the file feature vector into segments of length $l = L / M^{F-1}$ and use the segments in order as the input samples of the bottom-layer nodes, where L is the length of the file feature vector extracted with byte-level n-grams.
Suppose the file feature vector is {1, 2, 3, 4, 5, 6, 7, 8}, with length L = 8. Then l = 2, and the input samples of the bottom-layer nodes are {1, 2}, {3, 4}, {5, 6}, and {7, 8}.
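The layer sizes and the segment length $l = L / M^{F-1}$ can be checked with a short Python sketch. The function names are hypothetical, and rejecting a feature vector whose length is not divisible by the number of bottom-layer nodes is an assumption; the text does not say how a remainder would be handled.

```python
def nodes_per_layer(num_layers: int, children_per_node: int) -> list[int]:
    """Node counts from the top layer down to the bottom layer."""
    return [children_per_node ** depth for depth in range(num_layers)]


def split_for_bottom_nodes(features: list[int], num_layers: int,
                           children_per_node: int) -> list[list[int]]:
    """Cut the feature vector into one segment of length l per bottom node."""
    num_bottom = children_per_node ** (num_layers - 1)
    if len(features) % num_bottom != 0:
        # Assumption: uneven lengths are rejected rather than padded.
        raise ValueError("feature length must be divisible by M**(F-1)")
    seg_len = len(features) // num_bottom  # l = L / M**(F-1)
    return [features[i * seg_len:(i + 1) * seg_len] for i in range(num_bottom)]


if __name__ == "__main__":
    print(nodes_per_layer(3, 2))                             # [1, 2, 4]
    print(split_for_bottom_nodes(list(range(1, 9)), 3, 2))   # [[1, 2], [3, 4], [5, 6], [7, 8]]
```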
3) Taking the feature vectors as input, use the HTM algorithm to carry out binary sequence-pattern learning, training, and classification inference.
Training in this stage proceeds layer by layer: after the bottom layer finishes learning, its nodes enter the inference phase when new input arrives, and the output of inference serves as the input to the learning phase of the next layer's nodes. Within a single node, likewise, the temporal pooler begins temporal grouping only after the spatial pooler has finished training on sequence patterns.
Step 3) specifically comprises the following steps:
31) As shown in Fig. 3, the truncated file feature vectors are taken as input to the HTM network and the bottom-layer nodes enter the learning phase, which lasts until the spatial poolers of all bottom-layer nodes have finished learning spatial patterns and their temporal poolers have finished learning temporal groups.
Step 31) in turn comprises the following steps:
311) The binary sequence patterns of the truncated file feature vectors are fed into the spatial poolers of the bottom-layer nodes, and each spatial pooler uses the maximum distance D to learn clusters of these sequences. Using D, the spatial pooler stores a subset of the input binary sequence patterns, called cluster centers. Over time the number of new sequence patterns the spatial pooler produces per unit time decreases, and clustering stops when the number of new cluster centers within one time period T falls below a configured threshold. The time period T may be any non-zero value, and the threshold is a non-zero integer. To improve learning efficiency, T and the threshold are usually set to small values (for example, T = 5 s and threshold = 1).
D is the minimum Euclidean distance at which a binary sequence pattern is considered different from the existing cluster centers. For each input binary sequence pattern, the spatial pooler checks whether a cluster center already exists within Euclidean distance D. There are two cases: if one exists, nothing changes; if none exists, the new binary sequence pattern is added to the list of cluster centers. The Euclidean distance is computed as follows: for $x, y \in \mathbb{R}^N$, the Euclidean distance between x and y is $\left(\sum_{i=1}^{N} (x_i - y_i)^2\right)^{1/2}$.
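The two-case rule for cluster centers can be illustrated with a small Python sketch. The function names are hypothetical, "within distance D" is read as "less than or equal to D", and the stopping rule based on the period T and the threshold is left out for brevity; all of these are assumptions of the sketch rather than details confirmed by the text.

```python
import math


def euclidean(x: list[float], y: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def learn_cluster_centers(patterns: list[list[float]],
                          max_distance: float) -> list[list[float]]:
    """Store a pattern as a new center only if no center lies within D."""
    centers: list[list[float]] = []
    for pattern in patterns:
        if not any(euclidean(pattern, c) <= max_distance for c in centers):
            centers.append(list(pattern))  # case 2: no nearby center exists
        # case 1: a nearby center exists, so nothing changes
    return centers


if __name__ == "__main__":
    samples = [[1, 2], [1, 2.5], [5, 6], [5, 6.2], [9, 9]]
    print(learn_cluster_centers(samples, max_distance=1.0))
```

A larger D merges more patterns into each center and yields fewer centers; a smaller D stores more of the input patterns verbatim.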
312) The spatial pooler outputs the learned binary sequence patterns to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
Step 312) specifically comprises the following steps:
3121) When the temporal pooler receives input from the spatial pooler, it builds a time-adjacency matrix from the temporal association of the binary sequence patterns; once the matrix is formed, it must be partitioned into groups. In HTM, a greedy algorithm performs this temporal grouping.
3122) Find the most connected cluster point that is not yet in any group. The most connected cluster point is simply the one whose row in the time-adjacency matrix has the largest sum.
3123) Select the $N_{top}$ cluster points most strongly connected to the cluster point found in step 3122), where $N_{top}$ is a specified parameter; the temporal pooler adds the selected cluster points to the current group.
3124) For each cluster point X newly added to the group, repeat step 3123). The grouping stops automatically once the $N_{top}$ nearest neighbours of every such X already belong to the group. If the group size approaches a set value (the maximum group size) and the grouping has not yet stopped automatically, the grouping is terminated.
3125) The resulting set of cluster points is added to the temporal pooler as a new group. Return to step 3122) until all cluster points have been grouped.
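Steps 3121) to 3125) can be sketched as a greedy grouping over a time-adjacency matrix. This is an illustrative reading, not the patent's code: the function names are hypothetical, the adjacency matrix is assumed to be symmetric transition counts between consecutive cluster-center indices, neighbours already assigned to a group are skipped, and `max_group_size` stands in for the maximum value mentioned in step 3124).

```python
from collections import deque


def time_adjacency(sequence: list[int], num_centers: int) -> list[list[int]]:
    """Count how often two different cluster centers occur consecutively."""
    t = [[0] * num_centers for _ in range(num_centers)]
    for a, b in zip(sequence, sequence[1:]):
        if a != b:
            t[a][b] += 1
            t[b][a] += 1  # assumption: adjacency is treated as symmetric
    return t


def greedy_temporal_groups(adjacency: list[list[int]], n_top: int,
                           max_group_size: int) -> list[list[int]]:
    """Greedily partition cluster centers into temporal groups."""
    n = len(adjacency)
    unassigned = set(range(n))
    groups = []
    while unassigned:
        # 3122) seed: unassigned point whose row sum is largest
        seed = max(sorted(unassigned), key=lambda i: sum(adjacency[i]))
        group = [seed]
        unassigned.remove(seed)
        queue = deque([seed])
        while queue and len(group) < max_group_size:
            x = queue.popleft()
            # 3123) the n_top points most strongly connected to x
            neighbours = sorted(
                (j for j in range(n) if j != x and adjacency[x][j] > 0),
                key=lambda j: -adjacency[x][j])[:n_top]
            for j in neighbours:
                if j in unassigned and len(group) < max_group_size:
                    group.append(j)
                    unassigned.remove(j)
                    queue.append(j)  # 3124) repeat for each newly added point
        groups.append(group)  # 3125) store the finished group
    return groups


if __name__ == "__main__":
    seq = [0, 1, 0, 1, 0, 1, 2, 3, 2, 3, 2, 3]
    adj = time_adjacency(seq, num_centers=4)
    print(greedy_temporal_groups(adj, n_top=1, max_group_size=10))
```

In the demo sequence, centers 0 and 1 alternate, then centers 2 and 3 alternate, so the two pairs end up in separate groups.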
32) After the learning phase of step 31) ends, the bottom-layer nodes enter the inference phase. Each new sample (binary sequence pattern) is passed through inference at a bottom-layer node (the child node) and output to its parent node in the next layer of the HTM network. The outputs of child nodes that share the same parent are concatenated and become the input for the parent's learning phase; the parent enters the learning phase and repeats the node learning process of step 31).
Fig. 4 is a schematic diagram of the inference phase that an input binary sequence pattern undergoes at a node. Step 32) specifically comprises the following steps:
321) Compute the probability distribution $P(e^- \mid c_i)$ of the input binary sequence pattern over the spatial patterns of the spatial pooler; after normalization it serves as the output vector y of the spatial pooler.
The spatial patterns generated in the spatial pooler from the input binary sequence patterns during the learning phase are the cluster centers; the i-th is denoted $c_i$. At a node in the inference phase, the probability distribution $P(e^- \mid c_i)$ of the bottom-layer input sequence $e^-$ with respect to the i-th cluster center is a variable that can be computed as:
$P(e^- \mid c_i) = \gamma \prod_{k=1}^{M} \mathrm{input}(m_k^i) \quad (1)$
In formula (1), γ is a proportionality constant, and the i-th cluster center is expressed as
[formula image not reproduced in the source text]
where
[formula image not reproduced in the source text]
denotes the non-zero spatial patterns, M is the number of child nodes of the node, and $e^-$ is the input sequence to be recognized, coming from the bottom layer. The term input(·) denotes the node's input: if the node is a bottom-layer node, it is the node's input binary sequence pattern; otherwise it is the temporal-pooler output probability distribution from the node's child nodes, that is,
[formula image not reproduced in the source text]
(for the computation of $P(e^- \mid g_i)$ see formula (4)). The probability for every cluster center $c_i$ is computed as $P(e^- \mid c_i)$ and then normalized into the vector element y(i), so that y(i) is proportional to $P(e^- \mid c_i)$, written $y(i) \propto P(e^- \mid c_i)$. All y(i) together form the output vector of the node's spatial pooler, $y = [y(1), y(2), \ldots, y(N_c)]$, where $N_c$ is the number of spatial patterns in the spatial pooler. All $P(e^- \mid c_i)$ together form $P(e^- \mid C) = [P(e^- \mid c_1), P(e^- \mid c_2), \ldots, P(e^- \mid c_{N_c})]$, so y is proportional to $P(e^- \mid C)$, written $y \propto P(e^- \mid C)$.
322) Based on the output vector y of the spatial pooler, compute the output of the temporal pooler.
The temporal pooler performs inference using the belief propagation principle. As shown in Fig. 4, the output of the spatial pooler is the vector y of length $N_c$ (the number of spatial patterns in the spatial pooler), whose i-th element corresponds to the i-th cluster center $c_i$
[formula image not reproduced in the source text]
These cluster centers are represented as the vector
[formula image not reproduced in the source text]
(of length M), where r denotes the sub-group index of these cluster centers. The i-th element of y is computed as:
$y(i) = \alpha_1 \prod_{j=1}^{M} \lambda^{m_i}\left(r^{m_j}\right) \quad (2)$
In formula (2), $\alpha_1$ is an arbitrary scaling constant, usually set to a fixed value to avoid numerical underflow, and M is the number of child nodes;
[formula image not reproduced in the source text]
denotes the binary sequence pattern from child node $m_i$, and a further symbol (not reproduced in the source text) denotes the sub-group index of the i-th cluster center from child node $m_i$.
According to formula (1) and the description of step 321), at node k the vector $y_k$ is proportional to $P(e^- \mid C_k)$, i.e. $y_k \propto P(e^- \mid C_k)$, where $y_k$ and $P(e^- \mid C_k)$ are the instances of y and $P(e^- \mid C)$ at node k.
The temporal pooler computes its output from the input it receives from the spatial pooler. The output is λ, of length $N_g$ (the number of temporal groups in the temporal pooler), $\lambda = [\lambda(1), \lambda(2), \ldots, \lambda(N_g)]$, and its i-th element is computed as:
$\lambda(i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j) \quad (3)$
In formula (3), $P(c_j \mid g_i)$ is the conditional probability of spatial pattern $c_j$ given temporal group $g_i$, and y(j) is the j-th element of the spatial pooler output y, with j ranging from 1 to $N_c$. Note that $y(j) \propto P(e^- \mid c_j)$ and that
$P(e^- \mid g_i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, P(e^- \mid c_j) \quad (4)$
where $P(e^- \mid g_i)$ is the probability of the bottom-layer input sequence $e^-$ given temporal group $g_i$, and $P(e^- \mid G_k)$ is the vector formed by all $P(e^- \mid g_i)$ at node k.
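The step that links formulas (3) and (4) can be written out explicitly. The constant $\kappa$ below is introduced only for this illustration; it stands for the normalization factor of y, assumed positive and independent of j.

```latex
\begin{aligned}
y(j) &= \kappa \, P(e^- \mid c_j), \qquad \kappa > 0, \\
\lambda(i) &= \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j)
            = \kappa \sum_{j=1}^{N_c} P(c_j \mid g_i)\, P(e^- \mid c_j)
            = \kappa \, P(e^- \mid g_i).
\end{aligned}
```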
Therefore $\lambda(i) \propto P(e^- \mid g_i)$ holds for all i; at node k, $\lambda_k$ is proportional to $P(e^- \mid G_k)$, i.e. $\lambda_k \propto P(e^- \mid G_k)$. The output of the temporal pooler is the output of the node.
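Formula (3) is a matrix-vector product, which the following Python sketch makes concrete. The function names and the example numbers are hypothetical; `p_c_given_g[i][j]` is assumed to hold $P(c_j \mid g_i)$, and the final normalization is an optional convenience rather than a step stated in the text.

```python
def temporal_pooler_output(y: list[float],
                           p_c_given_g: list[list[float]]) -> list[float]:
    """lambda(i) = sum over j of P(c_j | g_i) * y(j), one entry per group."""
    for row in p_c_given_g:
        if len(row) != len(y):
            raise ValueError("each row needs one entry per spatial pattern")
    return [sum(p * v for p, v in zip(row, y)) for row in p_c_given_g]


def normalize(vector: list[float]) -> list[float]:
    """Scale a non-negative vector so that its entries sum to 1."""
    total = sum(vector)
    if total == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [v / total for v in vector]


if __name__ == "__main__":
    y = [0.5, 0.3, 0.2]                  # spatial pooler output, N_c = 3
    p = [[0.5, 0.5, 0.0],                # P(c_j | g_1)
         [0.0, 0.0, 1.0]]                # P(c_j | g_2)
    lam = temporal_pooler_output(y, p)   # length N_g = 2
    print(lam, normalize(lam))
```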
33) Repeat the process of step 32) until the nodes of all layers of the HTM network have completed learning of the binary sequence patterns.
As in step 32), after one layer finishes training, its nodes switch to the inference phase, and the nodes of the next layer (parent nodes) learn sequence patterns using the outputs of the previous layer's nodes (child nodes) as input.
4) Extract the feature vectors of the files in the test set using byte-level n-grams.
As in step 1), byte-level n-grams are used to extract the feature vectors of the test-set files. The test set can be chosen from malicious code test datasets available on the Internet.
5) Input the feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
Following the inference process of step 322), all nodes of every layer of the HTM structure are in the inference phase. Pattern inference is performed on the feature vectors extracted in step 4), using spatial-pooler sequence-pattern inference and temporal-pooler group inference. The output vector λ of the top node is the output pattern vector of the whole HTM network, and the output probability $P(e^- \mid G_k)$ of the top node's temporal pooler is the malicious code match rate. When this probability is large enough, for example above a threshold set to 85%, the input file is considered to contain malicious code; otherwise it is considered free of malicious code.
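The final decision reduces to a threshold test on the top node's output. The sketch below assumes that the match rate is the largest entry of the top node's output vector, which the text does not state explicitly; the function names are hypothetical, and only the 85% figure comes from the description.

```python
MALICIOUS_THRESHOLD = 0.85  # example threshold given in the description


def top_match_rate(top_output: list[float]) -> float:
    """Assumption: the match rate is the strongest top-level group belief."""
    if not top_output:
        raise ValueError("top-node output is empty")
    return max(top_output)


def is_malicious(top_output: list[float],
                 threshold: float = MALICIOUS_THRESHOLD) -> bool:
    """Flag the file when the match rate is strictly above the threshold."""
    return top_match_rate(top_output) > threshold


if __name__ == "__main__":
    print(is_malicious([0.05, 0.92, 0.03]))  # True
    print(is_malicious([0.40, 0.35, 0.25]))  # False
```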
The invention uses a sample set of malicious files as the training set, trains the HTM network with the HTM pattern-recognition algorithm, and then uses the network to perform pattern recognition and classification inference on unknown files to determine whether they are malicious. For feature extraction, a byte-level n-gram algorithm extracts a large number of file feature attributes. For pattern recognition and classification learning, the HTM algorithm is introduced. This is a new artificial-intelligence algorithm that imitates the structure and working principles of the human neocortex; it applies the Bayesian-network principles of continuous information sharing between nodes and belief propagation to convert a complex problem into pattern matching and prediction. Training extracts the spatial sequence patterns and temporal groups of the samples, and belief propagation aggregates and classifies the local pattern groups of each layer to obtain the overall pattern groups. In the recognition phase, malicious code samples are identified by matching against the sequence patterns learned at each layer. Thanks to its good noise resistance, fault tolerance, adaptability, and self-learning ability, the HTM algorithm can effectively improve the recognition rate.
The above is merely a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the claims.

Claims (5)

1. A deep learning detection method for unknown malicious code, characterized in that it comprises the following steps:
1) extracting the feature vectors of the training-set files using byte-level n-grams;
2) constructing the HTM network structure and determining the input data length of each bottom-layer node of the HTM structure;
3) taking the feature vectors as input and using the HTM algorithm to carry out sequence-pattern learning, training, and classification inference;
4) extracting the feature vectors of the files in the test set using byte-level n-grams;
5) inputting the feature vectors into the trained HTM network for sequence recognition, to determine whether the files in the test set contain malicious code.
2. The deep learning detection method for unknown malicious code according to claim 1, characterized in that step 2) specifically comprises the following steps:
21) selecting an HTM network model with F layers and specifying that every node except those in the bottom layer has M child nodes;
22) truncating the file feature vector into segments of length $l = L / M^{F-1}$, and using the segments in order as the input samples of the bottom-layer nodes of the HTM network.
3. The deep learning detection method for unknown malicious code according to claim 1, characterized in that step 3) specifically comprises the following steps:
31) taking the truncated file feature vectors as input to the HTM network; the bottom-layer nodes enter the learning phase, which lasts until the spatial poolers of all bottom-layer nodes have finished learning spatial patterns and their temporal poolers have finished learning temporal groups;
32) after the learning phase of step 31) ends, the bottom-layer nodes enter the inference phase; each new input sample is passed through inference at a bottom-layer node and then output to that node's parent in the next layer of the HTM network; the outputs of child nodes that share the same parent are concatenated and become the input for the learning phase of the next layer's node, which enters the learning phase and repeats the node learning process of step 31);
33) repeating the process of step 32) until the nodes of all layers of the HTM network have completed sequence-pattern learning.
4. The deep learning detection method for unknown malicious code according to claim 3, characterized in that step 31) specifically comprises the following steps:
311) the binary sequences presented to a node are fed into its spatial pooler, which uses the maximum distance D to learn clusters of these sequences; using D, the spatial pooler stores a subset of the input patterns, called cluster centers; over time the number of new sequence patterns the spatial pooler produces per unit time decreases, and clustering stops when the number of new cluster centers in a time period falls below a configured threshold;
312) the spatial pooler outputs the learned sequence patterns to the temporal pooler, which groups them according to their temporal adjacency; the grouping computation ends once all sequence patterns have been grouped.
5. The deep learning detection method for unknown malicious code according to claim 3, characterized in that step 32) specifically comprises the following steps:
321) using the formula
$P(e^- \mid c_i) = \gamma \prod_{k=1}^{M} \mathrm{input}(m_k^i)$
to compute the probability distribution of the input sequence $e^-$ over the spatial patterns $c_i$ of the node's spatial pooler, which after normalization serves as the output of the spatial pooler, where
[formula image not reproduced in the source text]
denotes the non-zero spatial patterns, M is the number of child nodes of the node, and $e^-$ is the input sequence to be recognized, coming from the bottom layer;
322) based on the output y of the spatial pooler, using the formula $\lambda(i) = \sum_{j=1}^{N_c} P(c_j \mid g_i)\, y(j)$ to compute the output of the temporal pooler, where $N_c$ is both the length of the vector y and the number of spatial patterns in the spatial pooler, and λ has length $N_g$.
CN201110373558.0A 2011-11-22 2011-11-22 Deep learning detection method of unknown malicious codes Expired - Fee Related CN102411687B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110373558.0A CN102411687B (en) 2011-11-22 2011-11-22 Deep learning detection method of unknown malicious codes

Publications (2)

Publication Number Publication Date
CN102411687A (en) 2012-04-11
CN102411687B (en) 2014-04-23

Family

ID=45913758

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110373558.0A Expired - Fee Related CN102411687B (en) 2011-11-22 2011-11-22 Deep learning detection method of unknown malicious codes

Country Status (1)

Country Link
CN (1) CN102411687B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101304409A (en) * 2008-06-28 2008-11-12 华为技术有限公司 Method and system for detecting malicious code
CN101976313A (en) * 2010-09-19 2011-02-16 四川大学 Frequent subgraph mining based abnormal intrusion detection method
CN102142068A (en) * 2011-03-29 2011-08-03 华北电力大学 Method for detecting unknown malicious code
US8056136B1 (en) * 2010-11-01 2011-11-08 Kaspersky Lab Zao System and method for detection of malware and management of malware-related information
CN102243699A (en) * 2011-06-09 2011-11-16 深圳市安之天信息技术有限公司 Malicious code detection method and system

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544392B (en) * 2013-10-23 2016-08-24 电子科技大学 Deep learning based medical gas identifying method
CN103544392A (en) * 2013-10-23 2014-01-29 电子科技大学 Deep learning based medical gas identifying method
CN104715190B (en) * 2015-02-03 2018-02-06 中国科学院计算技术研究所 Method and system for monitoring program execution path on basis of deep learning
CN104715190A (en) * 2015-02-03 2015-06-17 中国科学院计算技术研究所 Method and system for monitoring program execution path on basis of deep learning
CN105205396A (en) * 2015-10-15 2015-12-30 上海交通大学 Detecting system for Android malicious code based on deep learning and method thereof
US10503903B2 (en) 2015-11-17 2019-12-10 Wuhan Antiy Information Technology Co., Ltd. Method, system, and device for inferring malicious code rule based on deep learning method
WO2017084586A1 (en) * 2015-11-17 2017-05-26 武汉安天信息技术有限责任公司 Method, system, and device for inferring malicious code rule based on deep learning method
CN109416719A (en) * 2016-04-22 2019-03-01 谭琳 Method for determining defects and vulnerabilities in software code
CN106096415A (en) * 2016-06-24 2016-11-09 康佳集团股份有限公司 Malicious code detection method and system based on deep learning
CN106096415B (en) * 2016-06-24 2019-05-21 康佳集团股份有限公司 Malicious code detection method and system based on deep learning
CN107066302B (en) * 2017-04-28 2019-11-05 北京邮电大学 Defect inspection method, device and service terminal
CN107066302A (en) * 2017-04-28 2017-08-18 北京邮电大学 Defect inspection method, device and service terminal
CN107688822A (en) * 2017-07-18 2018-02-13 中国科学院计算技术研究所 New category recognition method based on deep learning
WO2019025385A1 (en) * 2017-08-02 2019-02-07 British Telecommunications Public Limited Company Detecting malicious configuration change for web applications
US11860994B2 (en) 2017-12-04 2024-01-02 British Telecommunications Public Limited Company Software container application security
CN108183902A (en) * 2017-12-28 2018-06-19 北京奇虎科技有限公司 Malicious website identification method and device
CN108183902B (en) * 2017-12-28 2021-10-22 北京奇虎科技有限公司 Malicious website identification method and device
TWI689831B (en) * 2018-02-05 2020-04-01 香港商阿里巴巴集團服務有限公司 Word vector generating method, device and equipment
CN109858251A (en) * 2019-02-26 2019-06-07 哈尔滨工程大学 Malicious code classification detection method based on Bagging ensemble learning algorithm
CN109858251B (en) * 2019-02-26 2023-02-10 哈尔滨工程大学 Malicious code classification detection method based on Bagging ensemble learning algorithm
CN111901282A (en) * 2019-05-05 2020-11-06 四川大学 Method for generating malicious code flow behavior detection structure

Also Published As

Publication number Publication date
CN102411687B (en) 2014-04-23

Similar Documents

Publication Publication Date Title
CN102411687B (en) Deep learning detection method of unknown malicious codes
CN105824802A (en) Method and device for acquiring vectorized representation of knowledge graph
CN109561084A (en) URL parameter outlier detection method based on LSTM autoencoder network
CN105095494B (en) Method for testing a classified data set
CN105868108A (en) Instruction-set-irrelevant binary code similarity detection method based on neural network
CN104268629B (en) Complex network community detecting method based on prior information and network inherent information
CN107145943A (en) Method for detecting weak signals in a chaotic background with an echo state network based on an improved teaching-learning optimization algorithm
CN105678401A (en) Global optimization method based on strategy-adaptive differential evolution
Ochoa et al. Recent advances in fitness landscape analysis
CN109740057A (en) A kind of strength neural network and information recommendation method of knowledge based extraction
CN111400713B (en) Malicious software population classification method based on operation code adjacency graph characteristics
Pal et al. Deep learning for network analysis: problems, approaches and challenges
Xiao et al. Network security situation prediction method based on MEA-BP
CN112231775B (en) Hardware Trojan horse detection method based on Adaboost algorithm
Andreassen et al. Parameter estimation using neural networks in the presence of detector effects
CN115617395A (en) Intelligent contract similarity detection method fusing global and local features
Zhao et al. On inferring training data attributes in machine learning models
CN101853202B (en) Test case autogeneration method based on genetic algorithm and weighted matching algorithm
CN106874762A (en) Android malicious code detecting method based on API dependence graphs
CN113904844A (en) Intelligent contract vulnerability detection method based on cross-modal teacher-student network
CN104899283A (en) Frequent sub-graph mining and optimizing method for single uncertain graph
Weihong et al. Optimization of BP neural network classifier using genetic algorithm
Li et al. A community clustering algorithm based on genetic algorithm with novel coding scheme
Xiao et al. A locating method for reliability-critical gates with a parallel-structured genetic algorithm
CN109859062A (en) Community discovery analysis method combining deep sparse autoencoder and quasi-Newton method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140423

Termination date: 20141122

EXPY Termination of patent right or utility model