CN100520767C - Method and system for judging article importance in network, and sliding window - Google Patents

Method and system for judging article importance in network, and sliding window

Publication number
CN100520767C
Authority
CN
China
Prior art keywords
word
article
moving window
importance
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2007101052978A
Other languages
Chinese (zh)
Other versions
CN101071419A (en
Inventor
董亮
邵荣防
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tencent Computer Systems Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CNB2007101052978A priority Critical patent/CN100520767C/en
Publication of CN101071419A publication Critical patent/CN101071419A/en
Priority to PCT/CN2008/070600 priority patent/WO2008145031A1/en
Application granted granted Critical
Publication of CN100520767C publication Critical patent/CN100520767C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis

Abstract

The invention relates to a method for judging the importance of an article on a network. The method comprises: sliding a preset sliding window over the article, starting from the beginning of the article, the sliding window collecting the words it slides over without repetition; when the number of words collected by the sliding window reaches a preset number, recording the number of words slid over, resetting the starting point, and continuing to slide until the entire article has been slid over; and obtaining the largest of the recorded numbers of words slid over, and judging the importance of the article according to that value. The invention also relates to a system for judging the importance of an article on a network, and to a sliding window. The invention effectively identifies articles whose overall vocabulary is rich but whose local vocabulary is poor, which is convenient for users.

Description

Method and system for judging article importance on a network, and sliding window
Technical field
The present invention relates to the field of network retrieval, and in particular to a method and system for judging the importance of an article on a network, and to a sliding window.
Background art
Retrieving related articles on a network by keyword is one of the important ways of sharing network resources. Because network resources are very abundant, a search keyword often corresponds to a large number of articles. The network system therefore needs to judge the importance of each article, so that relatively important articles are displayed first in the search results and relatively unimportant articles are displayed later. Users then read the more important articles first, which saves their time.
At present, the general approach is to judge an article's importance by the richness of its vocabulary. If an article's vocabulary is rich, the article is taken to have substance and is regarded as important. Conversely, if only a few words are repeated throughout the article or in part of it, the vocabulary is poor, the article is taken to be empty talk, and it is regarded as unimportant. The prior art judges the importance of an article with a method based on word frequency statistics.
Referring to Fig. 1, which is a flowchart of an existing method for judging article importance on a network, the specific steps are as follows.
Step S101: separate adjacent words in the sentences of the article with spaces.
For example:
Article before word segmentation: watched a TV drama today, and was moved by a boy in the drama ...
Article after word segmentation: watched a TV drama today, and was moved by a boy in the drama ...
Step S102: count the word frequency of each word, that is, the number of times the word occurs in the article.
For example:
today, 5 times; watch, 35 times; , 100 times; TV drama, 10 times; ...
Step S103: determine whether the numbers of occurrences of the above words satisfy preset conditions. If they do not, the article is considered relatively important; if they do, the article is considered relatively unimportant.
The preset conditions may be:
1) the total number of words is less than 5;
2) the frequency of the single most frequent word is greater than 30% of the total word frequency, or the frequency of the most frequent 5% of words is greater than 50% of the total word frequency, or the frequency of the most frequent 20% of words is greater than 80% of the total word frequency;
3) the average word frequency exceeds 5.
For example, if the average word frequency of the words in the above article exceeds 5, the article is considered relatively unimportant.
The above method judges an article's importance by counting the frequency of each word in the article. Word frequency, however, reflects a global characteristic of the article and cannot reflect its local characteristics. Articles with poor local vocabulary are mostly impractical, unimportant articles. If an article's overall vocabulary is rich but its local vocabulary is poor, the existing method easily misjudges it as an important article. The existing method therefore cannot effectively judge articles whose overall vocabulary is rich but whose local vocabulary is poor, which inconveniences users.
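The word-frequency check of steps S101 to S103 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the prior-art implementation: the function name `is_unimportant_by_word_frequency` is hypothetical, the article is assumed to be already segmented into a list of words, "total number of words" is read as the number of distinct words, and the article is treated as unimportant when any one of the three conditions holds (the text does not say whether one or all must hold).

```python
from collections import Counter


def is_unimportant_by_word_frequency(words):
    """Apply the three preset conditions to a segmented article.

    Assumption: satisfying ANY condition marks the article as unimportant.
    """
    total = len(words)
    if total == 0:
        return True  # assumption: an empty article carries no content

    # Word frequencies, most frequent first.
    freqs = sorted(Counter(words).values(), reverse=True)
    distinct = len(freqs)

    def top_share(fraction):
        # Share of all occurrences taken by the most frequent `fraction` of words.
        k = max(1, int(distinct * fraction))
        return sum(freqs[:k]) / total

    # 1) Fewer than 5 words (read here as distinct words; an assumption).
    too_few_words = distinct < 5
    # 2) A few words dominate the total word frequency.
    dominated = (
        freqs[0] / total > 0.30
        or top_share(0.05) > 0.50
        or top_share(0.20) > 0.80
    )
    # 3) Average word frequency exceeds 5.
    high_average = total / distinct > 5

    return too_few_words or dominated or high_average
```

Every quantity here is computed over the whole article, so a passage of heavy local repetition inside an otherwise varied article can leave all three conditions unsatisfied.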
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for judging article importance on a network that can effectively identify articles whose overall vocabulary is rich but whose local vocabulary is poor, which is convenient for users.
Another object of the present invention is to provide a system for judging article importance on a network that can effectively identify articles whose overall vocabulary is rich but whose local vocabulary is poor, which is convenient for users.
The present invention also provides a sliding window for traversing an article on a network, which can effectively obtain parameters related to the richness of the article's vocabulary.
The present invention relates to a method for judging article importance on a network, comprising: sliding a preset sliding window starting from the beginning of the article, the sliding window collecting the words it slides over without repetition; when the number of words collected by the sliding window reaches a predetermined number, recording the number of words slid over, resetting the starting point, and continuing to slide until the entire article has been slid over; and obtaining the largest value among the numbers of words slid over recorded by the sliding window, and judging the importance of the article according to that largest value.
Preferably, the method further comprises: separating adjacent words in the article with spaces.
Preferably, the sliding window collects the words it slides over without repetition by the following step: the sliding window judges whether each word it slides over repeats a word already collected and, if not, collects the word.
Preferably, the starting point is reset by the following steps: setting the last word collected by the sliding window as the starting point; and clearing the words collected by the sliding window.
Preferably, the importance of the article is judged according to the largest value obtained by the following step: comparing the largest value with a predetermined value and, if it is smaller, determining that the article is an important article.
Preferably, the importance of the article is judged according to the largest value obtained by the following step: taking the largest value as the weight of the article's importance, and determining the degree of importance of the article according to the size of the weight.
The present invention also relates to a system for judging article importance on a network, comprising a sliding window, a maximum value acquiring unit and an importance judging unit, the sliding window comprising a word collecting unit, a word recording unit and a starting unit:
the starting unit is configured to control the sliding window to begin sliding from the beginning of the article;
the word collecting unit is configured to collect the words slid over without repetition and, when the number of collected words reaches a predetermined number, to send a start message to the starting unit and the word recording unit; the starting unit resets the starting point and starts the sliding window sliding again, until the entire article has been slid over;
the word recording unit is configured to record the number of words the sliding window has slid over;
the maximum value acquiring unit is configured to obtain the largest value among the numbers of words slid over recorded by the word recording unit;
the importance judging unit is configured to judge the importance of the article according to the largest value obtained.
Preferably, the system further comprises a word segmentation unit configured to separate adjacent words in the article with spaces.
The present invention also relates to a sliding window, comprising a word collecting unit, a word recording unit and a starting unit:
the starting unit is configured to control the sliding window to begin sliding from the beginning of the article;
the word collecting unit is configured to collect the words slid over without repetition and, when the number of collected words reaches a predetermined number, to send a start message to the starting unit and the word recording unit; the starting unit resets the starting point and starts the sliding window sliding again, until the entire article has been slid over;
the word recording unit is configured to record the number of words the sliding window has slid over.
Preferably, the sliding window further comprises a left boundary and a right boundary. During sliding, the right boundary moves right from the starting point. When the number of words collected by the word collecting unit reaches the predetermined number, the right boundary stops moving and the left boundary moves right until exactly one word lies between the left and right boundaries.
Compared with prior art, the present invention has the following advantages:
The present invention slides a preset sliding window starting from the beginning of the article. The sliding window collects the words it slides over without repetition. When the number of collected words reaches a predetermined number, the number of words slid over is recorded, the starting point is reset, and sliding continues. The sliding window repeats this process until the entire article has been slid over. The largest of the recorded numbers is then obtained, and the importance of the article is judged according to its size. If an article's overall vocabulary is rich but its local vocabulary is poor, some part of the article contains many repeated words. When the sliding window slides over that part, because it collects words without repetition and the number of words to collect is fixed, it must slide over more words there than elsewhere. The number of words slid over that is recorded for that slide is therefore relatively large and can serve as the basis for judging the article's importance. In this way, the present invention uses a value that reflects the vocabulary-poor part of the article as the basis for judging its importance.
Compared with the prior-art method of simply counting word frequencies in the article, the present invention can effectively judge articles whose overall vocabulary is rich but whose local vocabulary is poor, which is convenient for users.
Description of drawings
Fig. 1 is a flowchart of an existing method for judging article importance on a network;
Fig. 2 is a flowchart of the method for judging article importance on a network provided by the first embodiment of the present invention;
Fig. 3 is a flowchart of the method for judging article importance on a network provided by the second embodiment of the present invention;
Fig. 4 is a flowchart of the method for judging article importance on a network provided by the third embodiment of the present invention;
Fig. 5 is a schematic diagram of the system for judging article importance on a network provided by the fourth embodiment of the present invention;
Fig. 6 is a schematic diagram of the system for judging article importance on a network provided by the fifth embodiment of the present invention;
Fig. 7 is a structural diagram of the sliding window provided by the sixth embodiment of the present invention.
Detailed description of the embodiments
To make the above objects, features and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and specific embodiments.
The present invention slides a preset sliding window starting from the beginning of the article. The sliding window collects the words it slides over without repetition. When the number of collected words reaches a predetermined number, the number of words slid over is recorded, the starting point is reset, and sliding continues. The sliding window repeats this process until the entire article has been slid over. The largest of the recorded numbers is then obtained, and the importance of the article is judged according to its size.
Referring to Fig. 2, which is a flowchart of the method for judging article importance on a network provided by the first embodiment of the present invention, the specific steps are as follows.
Step S201: slide a preset sliding window starting from the beginning of the article.
A sliding window is set up. The sliding window comprises a left boundary, a right boundary and a database; the database stores, without repetition, each word between the left and right boundaries. The database can hold at most a predetermined number of words. The predetermined number is preferably 6.
The sliding window starts sliding with the first word of the article as the starting point. During sliding, the left boundary of the sliding window stays fixed and the right boundary slides to the right.
Step S202: the sliding window collects the words it slides over without repetition.
During sliding, the sliding window collects the words it slides over without repetition, that is, the collected words do not repeat one another. The sliding window stores the collected words in the database.
Step S203: when the number of words collected by the sliding window reaches the predetermined number, record the number of words slid over.
As the sliding window slides, the number of words it has collected keeps increasing. When that number reaches the predetermined number, the total number of words the sliding window has slid over since the starting point is recorded.
Step S204: reset the starting point and continue sliding, until the entire article has been slid over.
The sliding window resets the starting point, clears the collected words, and continues sliding. The new starting point may be the last word collected by the sliding window, the word after or before that word, or a word several positions before it.
When the number of collected words again reaches the predetermined number, the sliding window again records the total number of words slid over in this pass, clears the collected words, resets the starting point again, and continues sliding, until the entire article has been slid over.
When the number of collected words reaches the predetermined number, the right boundary stops moving and the left boundary moves right to the position of the word that serves as the new starting point.
Step S205: obtain the largest value among the numbers recorded by the sliding window.
After the sliding window has slid over the whole article, the numbers of words slid over that it recorded each time are obtained, and the largest of them is extracted.
Step S206: judge the importance of the article according to the size of this value.
The value obtained serves as the basis for judging the article's importance: the larger the value, the lower the article's importance; the smaller the value, the higher the article's importance.
If an article's overall vocabulary is rich but its local vocabulary is poor, some part of the article contains many repeated words. When the sliding window slides over that part, because it collects words without repetition and the number of words to collect is fixed, it must slide over more words there than elsewhere. The number of words slid over that is recorded for that slide is therefore relatively large and can serve as the basis for judging the article's importance. In this way, the present invention uses a value that reflects the vocabulary-poor part of the article as the basis for judging its importance, can effectively judge articles whose overall vocabulary is rich but whose local vocabulary is poor, and is convenient for users.
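Steps S201 to S205 can be sketched in Python as follows. This is a minimal sketch rather than the patented implementation: the function names are hypothetical, the article is assumed to be already segmented into a list of words, and the new starting point is taken to be the last collected word, which is only one of the options the description allows. The description does not say what happens when the window never collects the predetermined number of words; returning 0 in that case is an assumption.

```python
def recorded_window_lengths(words, limit=6):
    """Return the number of words slid over each time `limit` distinct words are collected."""
    if limit < 2:
        # With limit 1 the window would restart on the same word forever.
        raise ValueError("limit must be at least 2")
    lengths = []
    left = 0            # left boundary: index of the current starting point
    right = 0           # right boundary: index of the word being slid over
    collected = set()   # the window's "database" of distinct words
    while right < len(words):
        collected.add(words[right])   # a repeated word leaves the set unchanged
        if len(collected) == limit:
            lengths.append(right - left + 1)   # words slid over in this pass
            left = right                       # last collected word is the new start
            collected.clear()
        else:
            right += 1
    return lengths


def peak_window_length(words, limit=6):
    """Largest recorded count; 0 if nothing was recorded (an assumption)."""
    return max(recorded_window_lengths(words, limit), default=0)


varied = "a b c d e f g h i j k".split()
repetitive = "a b c x x x x x x d e f g".split()
print(recorded_window_lengths(varied))      # [6, 6]
print(recorded_window_lengths(repetitive))  # [11]
```

The repeated word `x` forces the window to slide over 11 words before it has collected 6 distinct ones, which is the effect the method relies on.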
Before the sliding window begins to slide, the present invention may separate adjacent words in the sentences of the article with spaces, so that the sliding window can easily identify words while sliding. The present invention may also judge the importance of the article by setting a predetermined value.
Referring to Fig. 3, which is a flowchart of the method for judging article importance on a network provided by the second embodiment of the present invention, the specific steps are as follows.
Step S301: separate adjacent words in the article with spaces.
For example, the first sentence of the article is "the flowers are also beautiful, the grass is also lush, the scenery here is nice". After adjacent words are separated with spaces, it becomes "flower also beautiful grass also lush here scenery nice".
Step S302: slide a preset sliding window starting from the beginning of the article.
The sliding window comprises a left boundary "[", a right boundary "]" and a database; the database stores, without repetition, each word between the left and right boundaries. The database can hold at most a predetermined number of words. The predetermined number is 6.
For example, the sliding window is now at "[] flower also beautiful grass also lush here scenery nice". When sliding begins, the right boundary "]" of the sliding window starts to move right.
Step S303: the sliding window collects the words it slides over without repetition.
For example, when the sliding window has moved to "[flower] also beautiful grass also lush here scenery nice", the word "flower" is judged not to repeat any word already collected in the database, so "flower" is collected. When the sliding window has moved to "[flower also beautiful grass also] lush here scenery nice", the word "also" is judged to repeat the word "also" already in the database, so it is not collected again.
Step S304: when the number of words collected by the sliding window reaches the predetermined number, record the number of words slid over.
For example, when the sliding window has moved to "[flower also beautiful grass also lush here] scenery nice", the number of collected words is 6, which reaches the preset value. The number of words slid over is then recorded: the 7 words "flower, also, beautiful, grass, also, lush, here", so the value 7 is recorded.
Step S305: set the last word collected by the sliding window as the starting point, clear the words collected by the sliding window, and continue sliding, until the entire article has been slid over.
For example, with the word "here" as the starting point, the left boundary of the sliding window moves right up to the right boundary, and the right boundary continues to slide right. The sliding window is then at "flower also beautiful grass also lush [] here scenery nice".
When the number of collected words again reaches the predetermined number, the sliding window again records the total number of words slid over in this pass, clears the collected words, and starts sliding again, until the entire article has been slid over.
Step S306: obtain the largest value among the numbers recorded by the sliding window.
After the sliding window has slid over the whole article, the numbers of words slid over that it recorded each time are obtained, and the largest of them is extracted.
For example, the first recorded number is 7, the second is 8, the third is 12, and so on. By comparison, 12 is the largest, so 12 is taken as the maximum word count.
Step S307: compare this value with a predetermined value; if it is smaller, determine that the article is an important article.
For example, the predetermined value is 16. Comparing 12 with 16 gives 12 < 16, so the article is a relatively important article.
This embodiment uses a predetermined value to decide whether an article is important or unimportant, so the importance of the article can be judged directly, which is convenient and practical.
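The threshold comparison of step S307 can be sketched as follows. This is an illustrative sketch under stated assumptions: `peak_window_length` and `is_important` are hypothetical names, the tokens are English glosses of the example sentence, and the new starting point is the last collected word as in step S305. Only the opening sentence is processed here, so the peak is the first recorded value, 7, rather than the value 12 obtained for the whole example article.

```python
def peak_window_length(words, limit=6):
    """Largest number of words slid over while collecting `limit` distinct words."""
    if limit < 2:
        raise ValueError("limit must be at least 2")
    peak = 0            # stays 0 if the window never fills (an assumption)
    start = 0           # index of the current starting point
    position = 0        # index of the word being slid over
    collected = set()   # distinct words collected since the starting point
    while position < len(words):
        collected.add(words[position])
        if len(collected) == limit:
            peak = max(peak, position - start + 1)
            start = position      # step S305: restart at the last collected word
            collected.clear()
        else:
            position += 1
    return peak


def is_important(words, limit=6, threshold=16):
    """Step S307: the article is important if the peak is below the predetermined value."""
    return peak_window_length(words, limit) < threshold


tokens = "flower also beautiful grass also lush here scenery nice".split()
print(peak_window_length(tokens))  # 7
print(is_important(tokens))        # True
```

The defaults 6 and 16 are the example values given in this embodiment; both are tunable parameters rather than fixed parts of the method.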
The present invention may record the number of words slid over by calculating the length of the sliding window. It may also sort the retrieved articles by the maximum word count obtained for each, so that the articles are arranged in order of importance. The length of the sliding window is the number of words contained between its left and right boundaries.
Referring to Fig. 4, which is a flowchart of the method for judging article importance on a network provided by the third embodiment of the present invention, the specific steps are as follows.
Step S401: separate adjacent words in the article with spaces.
For example, a passage in the middle of the article reads "today, good happiness was good glad, and is very glad, more than happy". After adjacent words are separated with spaces, it becomes "today is good glad good glad, very the happiness more than happy".
Step S402: slide a preset sliding window starting from the beginning of the article.
The sliding window comprises a left boundary "[", a right boundary "]" and a database; the database stores, without repetition, each word between the left and right boundaries. The database can hold at most a predetermined number of words. The predetermined number is 6.
For example, the sliding window is now at "[] today is good glad good glad, very the happiness more than happy". When sliding begins, the right boundary "]" of the sliding window starts to move right.
At this point, the length of the sliding window is 0 and the number of words it has collected is 0.
Step S403: the sliding window collects the words it slides over without repetition, and at the same time records the number of words it has slid over.
For example, when the sliding window has moved to "[today] is good glad good glad, very the happiness more than happy", the word "today" is judged not to repeat any word already collected in the database, so "today" is collected. When the sliding window has moved to "[today, good happiness was good] happiness, very happiness more than happy", the word "good" is judged to repeat the word "good" already in the database, so it is not collected again.
At this point, the length of the sliding window is 5 and the number of words it has collected is 4.
Step S404: when the number of words collected by the sliding window reaches the predetermined number, record the number of words slid over.
For example, the sliding window has moved to "[today is good glad good glad, very the happiness more than happy]". At this point, the length of the sliding window is 10 and the number of words it has collected is 6.
Step S405: set the last word collected by the sliding window as the starting point, clear the words collected by the sliding window, and continue sliding, until the entire article has been slid over.
For example, the left boundary of the sliding window moves right up to the right boundary, and the right boundary continues to slide right. When the number of collected words again reaches the predetermined number, the sliding window again records the total number of words slid over in this pass, clears the collected words, and starts sliding again, until the entire article has been slid over.
Step S406: obtain the largest value among the numbers recorded by the sliding window.
After the sliding window has slid over the whole article, the numbers of words slid over that it recorded each time are obtained, and the largest of them is extracted.
For example, the first recorded sliding window length is 10, the second is 11, the third is 18, and so on. By comparison, 18 is the largest, so 18 is taken as the maximum word count.
Step S407: take this value as the weight of the article's importance, and determine the degree of importance of the article according to the size of the weight.
For example, 18 is taken as the weight of this article's importance and compared with the weights of other articles, and the articles are sorted by weight.
By sorting by weight, this embodiment places the relatively most important article first and arranges the rest in order of importance, which is very convenient for users. Moreover, this embodiment does not require a preset value for deciding whether an article is important, so it can reflect the article's importance more objectively.
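The ranking of step S407 can be sketched as a sort on the per-article weights. This is a minimal sketch: `rank_by_weight` is a hypothetical name, and apart from the weight 18 taken from the example above, the article titles and weights are invented for illustration. The sort is ascending on the assumption, following the first embodiment, that a smaller maximum means a more important article.

```python
def rank_by_weight(weight_by_article):
    """Return article titles ordered from most to least important.

    A smaller weight (largest recorded window length) is assumed to mean
    higher importance, so the sort is ascending.
    """
    return sorted(weight_by_article, key=lambda title: weight_by_article[title])


# Hypothetical search results with their weights.
weights = {"article A": 18, "article B": 7, "article C": 12}
print(rank_by_weight(weights))  # ['article B', 'article C', 'article A']
```

Because only the relative order of the weights matters, no threshold has to be chosen, which matches the advantage this embodiment claims.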
Based on the above method for judging article importance on a network, the present invention provides a system for judging article importance on a network. The system can effectively identify articles whose overall vocabulary is rich but whose local vocabulary is poor, which is convenient for users.
Referring to Fig. 5, which is a schematic diagram of the system for judging article importance on a network provided by the fourth embodiment of the present invention, the system comprises a sliding window 51, a maximum value acquiring unit 52 and an importance judging unit 53. The sliding window 51 comprises a word collecting unit 511, a word recording unit 512 and a starting unit 513.
The starting unit 513 controls the sliding window 51 to begin sliding from the beginning of the article.
The word collecting unit 511 collects the words slid over without repetition and, when the number of collected words reaches a predetermined number, sends a start message to the starting unit 513 and the word recording unit 512. The starting unit 513 resets the starting point and starts the sliding window 51 again, until the entire article has been slid over. The predetermined number is preferably 6.
The word recording unit 512 records the number of words the sliding window 51 has slid over.
The maximum value acquiring unit 52 obtains the largest value among the numbers recorded by the word recording unit 512 and sends it to the importance judging unit 53.
The importance judging unit 53 judges the importance of the article according to the size of the value obtained.
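One way to picture how the units of Fig. 5 cooperate is the following Python sketch. It is not the patented system: the class and method names are hypothetical, the "start message" is reduced to a plain branch inside one loop, the new starting point is assumed to be the last collected word, and the threshold 16 is borrowed from the second embodiment. A maximum of 0 when nothing was recorded is also an assumption.

```python
class SlidingWindow:
    """Stand-in for sliding window 51 and its three units."""

    def __init__(self, limit=6):
        if limit < 2:
            raise ValueError("limit must be at least 2")
        self.limit = limit
        self.collected = set()   # role of word collecting unit 511
        self.records = []        # role of word recording unit 512
        self.start = 0           # starting point kept by starting unit 513

    def traverse(self, words):
        """Slide over the whole article and return the recorded counts."""
        self.collected.clear()
        self.records.clear()
        self.start = 0
        position = 0
        while position < len(words):
            self.collected.add(words[position])
            if len(self.collected) == self.limit:
                # "Start message": record the count, then restart the window.
                self.records.append(position - self.start + 1)
                self.start = position
                self.collected.clear()
            else:
                position += 1
        return list(self.records)


class MaximumValueAcquiringUnit:
    """Stand-in for unit 52."""

    def acquire(self, records):
        return max(records, default=0)   # 0 when nothing was recorded (assumption)


class ImportanceJudgingUnit:
    """Stand-in for unit 53, using a threshold comparison."""

    def __init__(self, threshold=16):
        self.threshold = threshold

    def judge(self, peak):
        return peak < self.threshold


def judge_article(words):
    """Wire the three stand-ins together for one segmented article."""
    records = SlidingWindow().traverse(words)
    peak = MaximumValueAcquiringUnit().acquire(records)
    return ImportanceJudgingUnit().judge(peak)
```

For instance, `judge_article("a b c d e f g h".split())` returns `True` because the only recorded count is 6, while an article that opens with twenty repetitions of one word followed by five new words yields a count of 25 and returns `False`.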
To make it easier for the word collecting unit 511 and the word recording unit 512 to collect and record words, the present invention may use a word segmentation unit to separate the words in the article.
Referring to Fig. 6, which is a schematic diagram of the system for judging article importance on a network provided by the fifth embodiment of the present invention, the system comprises a sliding window 51, a maximum value acquiring unit 52, an importance judging unit 53 and a word segmentation unit 54. The sliding window 51 comprises a word collecting unit 511, a word recording unit 512 and a starting unit 513.
The word segmentation unit 54 separates adjacent words in the article with spaces.
The roles and functions of the sliding window 51, the maximum value acquiring unit 52 and the importance judging unit 53 in this embodiment are the same as in the embodiment shown in Fig. 5 and are not repeated here.
By traversing the entire article with the sliding window, the present invention can effectively obtain parameters related to the richness of the article's vocabulary.
Referring to Fig. 7, which is a structural diagram of the sliding window 51 provided by the sixth embodiment of the present invention, the sliding window comprises a word collecting unit 511, a word recording unit 512 and a starting unit 513.
The starting unit 513 controls the sliding window 51 to begin sliding from the beginning of the article.
The word collecting unit 511 collects the words slid over without repetition and, when the number of collected words reaches a predetermined number, sends a start message to the starting unit 513 and the word recording unit 512. The starting unit 513 takes the last word collected by the word collecting unit 511 as the starting point and restarts the sliding window 51, until the entire article has been slid over. The predetermined number is preferably 6.
The word recording unit 512 records the number of words the sliding window 51 has slid over.
The sliding window 51 further comprises a left boundary and a right boundary. During sliding, the right boundary moves right from the starting point. When the number of words collected by the word collecting unit 511 reaches the predetermined number, the right boundary stops moving and the left boundary moves right until only one word lies between the left and right boundaries.
The method, system and sliding window for judging article importance on a network provided by the present invention have been described in detail above. Specific examples have been used herein to explain the principle and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific embodiments and the scope of application. In summary, this description should not be construed as limiting the present invention.

Claims (10)

1. A method for judging article importance on a network, characterized by comprising:
Using a preset sliding window that begins to slide with the start of the article as the starting point, the sliding window collecting, without repetition, the words it slides over;
When the number of words collected by the sliding window reaches a predetermined number, recording the number of words slid over, resetting the starting point, and continuing to slide until the entire article has been slid over;
Obtaining the maximum value among the numbers of words slid over that were recorded for the sliding window, and judging the importance of the article according to the obtained maximum value.
2. The method of claim 1, characterized by further comprising:
Separating adjacent words in the article with spaces.
3. The method of claim 1, characterized in that the sliding window collects the words it slides over without repetition by the following step:
The sliding window judges whether each word it slides over duplicates a word already collected, and if not, collects the word.
4. The method of claim 1, characterized in that the starting point is reset by the following steps:
Setting the last word collected by the sliding window as the starting point;
Clearing the words collected by the sliding window.
5. The method of any one of claims 1 to 4, characterized in that the importance of the article is judged according to the obtained maximum value by the following step:
Comparing the obtained maximum value with a predetermined value, and if it is less than the predetermined value, determining that the article is an important article.
6. The method of any one of claims 1 to 4, characterized in that the importance of the article is judged according to the obtained maximum value by the following step:
Using the obtained maximum value as a weight of the article's importance, and determining the degree of importance of the article according to the magnitude of the weight.
7. A system for judging article importance on a network, characterized by comprising: a sliding window, a maximum value acquisition unit and an importance judgment unit, the sliding window comprising a word collection unit, a word recording unit and a start unit, wherein:
The start unit is used to control the sliding window to begin sliding with the start of the article as the starting point;
The word collection unit is used to collect, without repetition, the words slid over, and, when the number of collected words reaches a predetermined number, to send start information to the start unit and the word recording unit; the start unit resets the starting point and starts the sliding window so that it continues to slide until the entire article has been slid over;
The word recording unit is used to record the number of words that the sliding window has slid over;
The maximum value acquisition unit is used to obtain the maximum value among the numbers of words slid over that were recorded by the word recording unit;
The importance judgment unit is used to judge the importance of the article according to the obtained maximum value.
8. The system of claim 7, characterized by further comprising a word separation unit, used to separate adjacent words in the article with spaces.
9. A sliding window, characterized in that the sliding window comprises a word collection unit, a word recording unit and a start unit, wherein:
The start unit is used to control the sliding window to begin sliding with the start of the article as the starting point;
The word collection unit is used to collect, without repetition, the words slid over, and, when the number of collected words reaches a predetermined number, to send start information to the start unit and the word recording unit; the start unit resets the starting point and starts the sliding window so that it continues to slide until the entire article has been slid over;
The word recording unit is used to record the number of words that the sliding window has slid over.
10. The sliding window of claim 9, characterized in that the sliding window further comprises a left boundary and a right boundary; during sliding, the right boundary moves rightward from the starting point, and when the number of words collected by the word collection unit reaches the predetermined number, the right boundary stops moving and the left boundary moves rightward until one word is contained between the left and right boundaries.
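The judgment step of claims 1, 5 and 6 can be illustrated with a short Python sketch that takes the per-pass word counts recorded for the window as its input. It is a sketch under stated assumptions, not the claimed implementation: the function names, the article names and the threshold of 10 are hypothetical, and the ranking rule (a smaller weight ranks as more important, by analogy with claim 5) is an assumption, since claim 6 does not state the direction.

```python
def largest_count(counts):
    """Claim 1: the maximum among the recorded numbers of words slid over."""
    if not counts:
        raise ValueError("no window counts were recorded")
    return max(counts)


def is_important(counts, predetermined_value):
    """Claim 5: the article is important if the maximum is below the value."""
    return largest_count(counts) < predetermined_value


def rank_by_weight(counts_by_article):
    """Claim 6: use the maximum as the importance weight of each article.

    Assumption: a smaller weight means a more important article, so the
    returned names are ordered from most to least important.
    """
    return sorted(counts_by_article,
                  key=lambda name: largest_count(counts_by_article[name]))


# Hypothetical recorded counts for two articles.
recorded = {"spam": [40, 12], "essay": [6, 7, 6]}
THRESHOLD = 10  # hypothetical predetermined value, not given in the patent

flags = {name: is_important(c, THRESHOLD) for name, c in recorded.items()}
order = rank_by_weight(recorded)
```

The intuition is that an article padded with repeated words forces the window to slide over many words before it collects the predetermined number of distinct ones, which produces a large maximum.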
CNB2007101052978A 2007-05-31 2007-05-31 Method and system for judging article importance in network, and sliding window Active CN100520767C (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CNB2007101052978A CN100520767C (en) 2007-05-31 2007-05-31 Method and system for judging article importance in network, and sliding window
PCT/CN2008/070600 WO2008145031A1 (en) 2007-05-31 2008-03-27 Method and system for judging of the inportance of article, and sliding window

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2007101052978A CN100520767C (en) 2007-05-31 2007-05-31 Method and system for judging article importance in network, and sliding window

Publications (2)

Publication Number Publication Date
CN101071419A CN101071419A (en) 2007-11-14
CN100520767C true CN100520767C (en) 2009-07-29

Family

ID=38898646

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2007101052978A Active CN100520767C (en) 2007-05-31 2007-05-31 Method and system for judging article importance in network, and sliding window

Country Status (2)

Country Link
CN (1) CN100520767C (en)
WO (1) WO2008145031A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100520767C (en) * 2007-05-31 2009-07-29 Tencent Technology (Shenzhen) Co., Ltd. Method and system for judging article importance in network, and sliding window
CN100545847C (en) * 2007-09-25 2009-09-30 Tencent Technology (Shenzhen) Co., Ltd. A kind of method and system that blog articles is sorted
CN103336771B (en) * 2013-04-02 2016-12-28 Jiangsu University Data similarity detection method based on sliding window

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5576954A (en) * 1993-11-05 1996-11-19 University Of Central Florida Process for determination of text relevancy
JPH1049549A (en) * 1996-05-29 1998-02-20 Matsushita Electric Ind Co Ltd Document retrieving device
CN1818908A (en) * 2006-03-16 2006-08-16 董崇军 Feedbakc information use of searcher in search engine
CN100520767C (en) * 2007-05-31 2009-07-29 Tencent Technology (Shenzhen) Co., Ltd. Method and system for judging article importance in network, and sliding window

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
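Optimized Bayesian mail filtering algorithm based on sliding window. Lan Ya, Wu Yu, et al. Journal of Chongqing Institute of Posts and Telecommunications (Natural Science Edition), Vol. 18, No. 4. 2006
Optimized Bayesian mail filtering algorithm based on sliding window. Lan Ya, Wu Yu, et al. Journal of Chongqing Institute of Posts and Telecommunications (Natural Science Edition), Vol. 18, No. 4. 2006 *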

Also Published As

Publication number Publication date
CN101071419A (en) 2007-11-14
WO2008145031A1 (en) 2008-12-04

Similar Documents

Publication Publication Date Title
CN101246499B (en) Network information search method and system
CN105183897A (en) Method and system for ranking video retrieval
CN101155182A (en) Garbage information filtering method and apparatus based on network
CN104516903A (en) Keyword extension method and system and classification corpus labeling method and system
CN102915335A (en) Information associating method based on user operation record and resource content
CN101393555A (en) Rubbish blog detecting method
CN103593371A (en) Method and device for recommending search keywords
CN103812877B (en) Data compression method based on Bigtable distributed memory system
CN102955812B (en) A kind of method of index building storehouse, device and querying method and device
CN103186556A (en) Method for obtaining and searching structural semantic knowledge and corresponding device
CN101706790A (en) Clustering method of WEB objects in search engine
CN100520767C (en) Method and system for judging article importance in network, and sliding window
CN107463711A (en) A kind of tag match method and device of data
CN109918448A (en) A kind of cloud storage data classification method based on user behavior
CN103559185A (en) Method for parsing and storing test data documents
CN102982099A (en) Personalized concurrent word segmentation processing system and processing method thereof
CN103034656B (en) Chapters and sections content layered approach and device, article content layered approach and device
CN107451120B (en) Content conflict detection method and system for open text information
CN112200209A (en) Poor user identification method based on day-to-day power consumption
CN102314464B (en) Lyrics searching method and lyrics searching engine
CN103853771A (en) Search result pushing method and search result pushing system
CN101102316A (en) A method and system for removing duplicate webpages
CN107133321B (en) Method and device for analyzing search characteristics of page
CN103688256A (en) Method, device and system for determining video quality parameter based on comment
CN102467537B (en) The method and apparatus deleting vocabulary

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20151223

Address after: Floors 5-10, Fiyta Building, South Road, High-tech Zone, Nanshan District, Shenzhen, Guangdong 518057

Patentee after: Shenzhen Tencent Computer System Co., Ltd.

Address before: Room 410, East, Building 2, SEG Science and Technology Park, Zhenxing Road, Futian District, Shenzhen, Guangdong 518044

Patentee before: Tencent Technology (Shenzhen) Co., Ltd.