CN104361115A - Entry weight definition method and device based on co-clicking - Google Patents

Publication number
CN104361115A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410718382.1A
Other languages
Chinese (zh)
Other versions
CN104361115B (en)
Inventor
邹启波
周连强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Qizhi Software Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd and Qizhi Software Beijing Co Ltd
Priority to CN201410718382.1A
Publication of CN104361115A
Application granted
Publication of CN104361115B
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Abstract

An embodiment of the invention provides a method and device for determining entry weights based on co-clicking. The method comprises: obtaining, from search log data, the set of input queries that correspond to a co-clicked uniform resource locator (URL); segmenting each query in the query set to obtain multiple basic terms; and counting how often each term occurs in the query set and deriving the entry weight of each term from that frequency. The method and device obtain co-click-based entry weights accurately, support core-word extraction from input queries and document ranking, and overcome the shortcomings of the existing TF-IDF technique, thereby improving the accuracy of search results.

Description

Entry weight determination method and device based on co-clicking
Technical field
The present invention relates to the field of information retrieval technology, and in particular to a method and device for obtaining entry weights based on co-clicking.
Background art
With the rapid development of networks and information technology, the amount of information on the network has grown explosively, so obtaining the right information quickly and correctly from such massive data has become the key problem of current search engine technology. User input, however, varies widely: people differ in education and cultural background, so they phrase the same question very differently. It is therefore necessary to score the weight of each term in a user's input, which is important for tasks such as query core-word extraction and document ranking.
The current TF-IDF (Term Frequency-Inverse Document Frequency) technique assesses how important a word is to one document in a document set or corpus. It is a commonly used weighting technique in information retrieval and text mining. It describes the weight of a term at the document level, but it is independent of context.
For example, across different queries the weight of the same word can differ significantly because the context or semantic background differs. Suppose one query is "Beijing Forbidden City ticket" and another is "high-speed rail from Beijing to Wuhan". The word "Beijing" appears in both queries, but its importance to the search results of the two queries is certainly different. The existing TF-IDF technique cannot describe this situation, which introduces errors into the final search results.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a co-click-based entry weight determination method and device that overcome, or at least partially solve, those problems.
A co-click-based entry weight determination method comprises:
Obtaining, based on search log data, the set of input queries corresponding to a co-clicked uniform resource locator (URL);
Segmenting each query in the query set to obtain multiple basic terms;
Counting the frequency with which each term occurs in the query set, and obtaining the entry weight of each term according to how high that frequency is.
The present invention also provides a co-click-based entry weight determination device, comprising:
A query set acquisition unit, configured to obtain, based on search log data, the set of input queries corresponding to a co-clicked uniform resource locator (URL);
A word segmentation unit, configured to segment each query in the query set obtained by the query set acquisition unit, to obtain multiple basic terms;
An entry weight acquisition unit, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set, and to obtain the entry weight of each term according to how high that frequency is.
As can be seen from the above, the method and device obtain co-click-based entry weights accurately, play an important role in core-word extraction from input queries and in document ranking, and overcome the shortcomings of the existing TF-IDF technique, thereby improving the accuracy of search results.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art from the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the invention. Throughout the drawings, the same reference symbols denote the same parts. In the drawings:
Fig. 1 is a schematic flowchart of the co-click-based entry weight determination method provided by an embodiment of the invention;
Fig. 2 is a schematic diagram of the enumeration process provided by an embodiment of the invention;
Fig. 3 is a schematic flowchart of retrieval according to user input provided by an embodiment of the invention;
Fig. 4 is a schematic structural diagram of the co-click-based entry weight determination device provided by an embodiment of the invention;
Fig. 5 is another schematic structural diagram of the co-click-based entry weight determination device provided by an embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed completely to those skilled in the art.
The entry weight acquisition method provided by an embodiment of the invention is described below with reference to the drawings. Fig. 1 is a schematic flowchart of the co-click-based entry weight determination method provided by the embodiment. The method comprises:
Step 11: based on search log data, obtain the set of input queries corresponding to a co-clicked uniform resource locator (URL).
In this step, the log data may be stored on a back-end search server.
Here, the input queries corresponding to a co-clicked URL are the queries for which users clicked the same URL. Such queries can be regarded as potentially synonymous: their core should remain stable, and only the wording changes. For example, "how much is a Beijing Forbidden City ticket", "how much is a Forbidden City ticket", "Beijing Forbidden City ticket" and "Forbidden City ticket price" all ask about Forbidden City tickets. As another example, take the queries {"360 search", "360 search website", "360", "360 search engine", "360 search web address"}: if users clicked the URL www.so.com for each of them, this group of queries is also regarded as co-clicked.
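Step 11 can be illustrated with a minimal Python sketch. It assumes the search log has already been reduced to (query, clicked URL) pairs; that record format, the function name `co_click_query_sets`, and the second URL in the sample data are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


def co_click_query_sets(log_records):
    """Group queries by the URL that was clicked for them.

    log_records: iterable of (query, clicked_url) pairs (assumed log format).
    Returns a dict mapping each URL to the set of queries that led to it.
    """
    query_sets = defaultdict(set)
    for query, clicked_url in log_records:
        query_sets[clicked_url].add(query)
    return dict(query_sets)


# Sample records; only www.so.com appears in the source text.
log_records = [
    ("360 search", "www.so.com"),
    ("360 search website", "www.so.com"),
    ("360", "www.so.com"),
    ("Forbidden City ticket price", "tickets.example.com"),
]
query_sets = co_click_query_sets(log_records)
```

Storing each group as a set deduplicates repeated queries; whether repeated queries should instead be counted multiple times is a design choice the text leaves open.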
Step 12: segment each query in the query set to obtain multiple basic terms.
In this step, the specific segmentation rules and approach may follow existing word segmentation techniques. For example, each query in the query set may be segmented based on n-grams: multi-order enumeration generates multiple gram fragments, and the basic terms of those gram fragments are obtained.
For example, let Q = {T1, T2, T3, ..., Tn}. The order of the n-grams can be preset before enumeration, and the grams are then enumerated one by one. Preferably, embodiments of the invention use grams of order 1 to 4; the enumeration process is shown in Fig. 2. With 1st- to 4th-order enumeration, 1-grams to 4-grams are enumerated starting from the head (T1), which yields multiple gram fragments.
For example, enumerating Q = {a, b, c, d} up to order 4 generates the following gram fragments:
1st-order grams: a, b, c, d;
2nd-order grams: ab, bc, cd;
3rd-order grams: abc, bcd;
4th-order gram: abcd.
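The enumeration above can be sketched in Python as follows. The function name `enumerate_grams` and the choice to represent each gram as a tuple of terms are assumptions made for illustration.

```python
def enumerate_grams(terms, max_order=4):
    """Enumerate all contiguous grams of order 1..max_order, lowest order first."""
    grams = []
    for order in range(1, max_order + 1):
        # A gram of this order can start at any position that leaves room for it.
        for start in range(len(terms) - order + 1):
            grams.append(tuple(terms[start:start + order]))
    return grams


grams = enumerate_grams(["a", "b", "c", "d"])
```

For the four-term example this yields ten grams: four of order 1, three of order 2, two of order 3, and one of order 4. Queries shorter than `max_order` simply produce no grams for the orders they cannot fill.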
Step 13: count the frequency with which each term occurs in the query set, and obtain the entry weight of each term according to how high that frequency is.
In this step, the entry weight may be obtained as follows: take the occurrence count of the most frequent term as the denominator and the number of times each term occurs in the query set as the numerator; the resulting ratio is the entry weight of that term.
For example, if each query is segmented based on n-grams to obtain the basic terms of multiple gram fragments, then for each gram the number of times each of its terms occurs in the query set is counted separately. Suppose the gram is "360 search". The query set is traversed and the count is incremented by 1 for each occurrence until the traversal ends. Suppose the final counts are that the term "360" occurs 5 times in the query set and the term "search" occurs 4 times. Following the method above, the ratios are "1, 0.8".
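The ratio described in Step 13 can be sketched as below. The sketch assumes the queries are already segmented into lists of terms and that every occurrence of a term counts once; the function name `term_weights` and the sample segmentation are illustrative.

```python
from collections import Counter


def term_weights(segmented_queries):
    """Weight of a term = its occurrence count / count of the most frequent term."""
    counts = Counter(term for query in segmented_queries for term in query)
    if not counts:
        return {}
    top = max(counts.values())
    return {term: count / top for term, count in counts.items()}


# The co-clicked "360" queries from Step 11, segmented by hand for illustration.
query_set = [
    ["360", "search"],
    ["360", "search", "website"],
    ["360"],
    ["360", "search", "engine"],
    ["360", "search", "web address"],
]
weights = term_weights(query_set)
```

Here "360" occurs 5 times and "search" 4 times, so their weights are 5/5 and 4/5, matching the "1, 0.8" of the example.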
The values "360 search: 1, 0.8" above are statistics obtained for one particular query in the query set. Across the whole query set (which contains an enormous number of varied queries), the same method yields several groups of values for "360 search" (values similar to "1, 0.8"). Averaging these values for the gram over the whole query set gives the entry weight of each term in the gram "360 search".
In a specific implementation, after the entry weight of each term is obtained, a weight dictionary may also be formed from the terms and their corresponding entry weights. The weight dictionary contains many records similar to "360 search: 1, 0.8" and is used for lookup.
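One way to picture the averaging and the resulting weight dictionary is the sketch below. It assumes each observation is a (gram, weights) pair produced by the counting step, with one weight per term of the gram; the function name `build_weight_dictionary` and the second observation's values are hypothetical.

```python
from collections import defaultdict


def build_weight_dictionary(observations):
    """Average, position by position, all weight tuples observed for each gram."""
    sums = {}
    counts = defaultdict(int)
    for gram, weights in observations:
        if len(gram) != len(weights):
            raise ValueError("expected one weight per term in the gram")
        if gram not in sums:
            sums[gram] = [0.0] * len(gram)
        for position, weight in enumerate(weights):
            sums[gram][position] += weight
        counts[gram] += 1
    return {
        gram: tuple(total / counts[gram] for total in totals)
        for gram, totals in sums.items()
    }


observations = [
    (("360", "search"), (1.0, 0.8)),  # values from the example in the text
    (("360", "search"), (1.0, 0.6)),  # made-up second observation
    (("360",), (1.0,)),
]
weight_dict = build_weight_dictionary(observations)
```

With these inputs the record for "360 search" averages to roughly (1.0, 0.7). A plain in-memory dict is used only for illustration; the text allows any storage.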
In addition, once the weight dictionary has been formed, retrieval can be performed according to user input and results output. As shown in Fig. 3, the retrieval process comprises:
Step 31: receive a query input by the user and segment it to obtain multiple terms.
The segmentation method is as described in the embodiment above.
Step 32: look up the weight dictionary to obtain the entry weight of each term.
Further, if step 31 performs segmentation based on n-grams and obtains the basic terms of multiple gram fragments, then for each term the grams that the term hits are used to look up the weight dictionary, obtaining the entry weight of the term in each of the hit grams.
Specifically, the weight dictionary stores multiple grams and the entry weight of each term in each gram. An example fragment of the weight dictionary is:
360: 1;
360 search: 1, 0.8;
search: 0.8.
In this fragment, "360", "360 search" and "search" are grams, and the numbers after each gram are the entry weights of the terms in that gram. For example, in "360 search" the entry weight of "360" is 1 and the entry weight of "search" is 0.8.
The weight dictionary may be stored in a database or in some other way; the embodiments of the invention do not limit this.
In a specific implementation, each term obtained by segmenting the query may hit one or more grams. Looking up the hit grams in the weight dictionary therefore yields the weight of the term in each gram it hits.
Suppose the term to be looked up in the weight dictionary is "360". It hits the two grams "360" and "360 search" in the weight dictionary, giving two entry weights: 1 and 1.
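The lookup in Step 32 can be sketched as follows, using the three-record dictionary fragment from the text. The sketch assumes a "hit" means any stored gram that contains the term, and it scans the dictionary linearly for clarity; the function name `lookup_term_weights` is hypothetical, and a real system would more likely use an index from terms to grams.

```python
def lookup_term_weights(term, weight_dict):
    """Collect the weight of `term` from every gram in the dictionary that contains it."""
    hits = []
    for gram, weights in weight_dict.items():
        for position, gram_term in enumerate(gram):
            if gram_term == term:
                hits.append(weights[position])
    return hits


# The dictionary fragment from the text, keyed by tuples of terms.
weight_dict = {
    ("360",): (1.0,),
    ("360", "search"): (1.0, 0.8),
    ("search",): (0.8,),
}
hits = lookup_term_weights("360", weight_dict)
```

For "360" this returns the two weights 1 and 1, one from the gram "360" and one from "360 search".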
Because the weight dictionary contains an enormous number of grams and corresponding entry weights, each term obtained by segmenting the user's query yields several entry weights. The final entry weight of each term can then be calculated with either of the following two formulas:
Formula one: $$score = \frac{W_1 X_1 + W_2 X_2 + \dots + W_m X_m}{m}, \quad W_1 < W_2 < \dots < W_m$$
Formula two: $$score = \frac{X_1 + X_2 + \dots + X_m}{m}$$
In formula one, $score$ is the final entry weight calculated for the term, $X_1$ to $X_m$ are the entry weights of the term in the hit grams, as returned by the weight dictionary lookup, and $W_1$ to $W_m$ are the weights assigned to those looked-up entry weights.
Formula two calculates the entry weight of the term as an arithmetic mean; $score$ and $X_1$ to $X_m$ have the same meaning as in formula one.
It should be noted that these two formulas are not the only way to implement the invention; they are just one implementation of an embodiment. A skilled person may adapt the formulas to business needs, for example by adding parameters or multiplying values, and such variants still fall within the scope of the invention.
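Both formulas translate directly into code. In the sketch below the function names and the sample weights are illustrative; the text does not say how the W values are chosen or how the X values are ordered against them, so both are left to the caller.

```python
def weighted_score(xs, ws):
    """Formula one: (W1*X1 + ... + Wm*Xm) / m, with W1 < W2 < ... < Wm."""
    if not xs or len(xs) != len(ws):
        raise ValueError("xs and ws must be non-empty and of equal length")
    if any(a >= b for a, b in zip(ws, ws[1:])):
        raise ValueError("ws must be strictly increasing, as the formula states")
    return sum(w * x for w, x in zip(ws, xs)) / len(xs)


def mean_score(xs):
    """Formula two: arithmetic mean of the looked-up entry weights."""
    if not xs:
        raise ValueError("xs must be non-empty")
    return sum(xs) / len(xs)


# The two weights found for "360" in the dictionary fragment above.
final_mean = mean_score([1.0, 1.0])
# Hypothetical W values, chosen only to satisfy W1 < W2.
final_weighted = weighted_score([1.0, 1.0], [0.5, 1.5])
```

Note that formula one divides by m rather than by the sum of the W values, so the result stays on the same scale as the X values only when the W values average to 1.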
As an example, suppose the query input by the user is "360 search web address". After segmentation it contains three terms, one of which is "360". For this term, the weight dictionary is looked up. Suppose the grams it hits are: 360, 360 search, 360 search engine, 360 search engine web address, and 360 search website. Then "360" has one entry weight in each of these 5 grams, 5 in total. Taking the weighted average of these 5 entry weights gives the final entry weight of the term "360" in the user's query.
Step 33: compare the entry weight of each term with a preset weight threshold, use the terms whose entry weight is greater than or equal to the threshold as search keywords, and output the corresponding search results.
In this step, when the entry weight of each term is compared with the preset weight threshold, terms whose entry weight is below the threshold can be ignored. This helps core-word extraction from the input query and document ranking, and improves the accuracy of search results.
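The threshold comparison of Step 33 amounts to a simple filter. In this sketch the function name `select_keywords`, the scores, and the threshold value of 0.5 are all made up for illustration; the patent does not state a threshold.

```python
def select_keywords(term_scores, threshold):
    """Keep terms whose final entry weight is at least the threshold."""
    return [term for term, score in term_scores.items() if score >= threshold]


# Hypothetical final entry weights for the three terms of one user query.
term_scores = {"360": 1.0, "search": 0.8, "web address": 0.3}
keywords = select_keywords(term_scores, threshold=0.5)
```

With these numbers "360" and "search" are kept as search keywords and "web address" is ignored.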
Based on the above method, an embodiment of the invention also provides a co-click-based entry weight determination device. Fig. 4 is a schematic structural diagram of the device, which comprises:
A query set acquisition unit 41, configured to obtain, based on search log data, the set of input queries corresponding to a co-clicked uniform resource locator (URL);
A word segmentation unit 42, configured to segment each query in the query set obtained by the query set acquisition unit, to obtain multiple basic terms;
An entry weight acquisition unit 43, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set, and to obtain the entry weight of each term according to how high that frequency is.
Fig. 5 is another schematic structural diagram of the co-click-based entry weight determination device provided by an embodiment of the invention. Referring to Fig. 5, in a specific implementation the device may further comprise:
A weight dictionary unit 44, configured to form a weight dictionary from the terms and their corresponding entry weights;
A user input receiving unit 45, configured to receive a query input by the user and segment it to obtain multiple terms;
An entry weight query unit 46, configured to look up the weight dictionary unit and obtain the entry weight of each term obtained by the user input receiving unit;
A search result output unit 47, configured to compare the entry weight of each term obtained by the entry weight query unit with a preset weight threshold, use the terms whose entry weight is greater than or equal to the threshold as search keywords, and output the corresponding search results.
In a specific implementation, the word segmentation unit 42 may further comprise:
A word segmentation module 421, configured to segment each query in the query set based on n-grams, to obtain the basic terms of multiple gram fragments.
The entry weight acquisition unit 43 may further comprise:
A weight calculation module 431, configured to take the occurrence count of the most frequent term as the denominator and calculate the entry weight of each term from its occurrence count.
The specific implementation of each unit in the device is as described in the method embodiment above.
In summary, the method and device provided by the embodiments of the invention obtain co-click-based entry weights accurately, play an important role in core-word extraction from input queries and in document ranking, and overcome the shortcomings of the existing TF-IDF technique, thereby improving the accuracy of search results.
Numerous specific details are described in the specification provided here. It is understood, however, that embodiments of the invention can be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into it, with each claim standing on its own as a separate embodiment of the invention.
In addition, those skilled in the art will understand that, although some embodiments described here include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the search system according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference symbols placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The use of the words first, second, third and so on does not indicate any order; these words may be interpreted as names.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the invention and their technical equivalents, the invention is also intended to include them.

Claims (10)

1. A co-click-based entry weight determination method, characterized by comprising:
Obtaining, based on search log data, the set of input queries corresponding to a co-clicked uniform resource locator (URL);
Segmenting each query in the query set to obtain multiple basic terms;
Counting the frequency with which each term occurs in the query set, and obtaining the entry weight of each term according to how high that frequency is.
2. The method of claim 1, characterized in that, after the entry weight of each term is obtained, the method further comprises:
Forming a weight dictionary from the terms and their corresponding entry weights.
3. The method of any one of claims 1-2, characterized in that the method further comprises:
Receiving a query input by the user, and segmenting the query to obtain multiple terms;
Looking up the weight dictionary to obtain the entry weight of each term;
Comparing the entry weight of each term with a preset weight threshold, using the terms whose entry weight is greater than or equal to the weight threshold as search keywords, and outputting the corresponding search results.
4. The method of any one of claims 1-3, characterized in that, when the entry weight of each term is compared with the preset weight threshold, terms whose entry weight is less than the weight threshold are ignored.
5. The method of any one of claims 1-4, characterized in that the log data is stored on a back-end search server.
6. The method of any one of claims 1-5, characterized in that segmenting each query in the query set to obtain multiple basic terms specifically comprises:
Segmenting each query in the query set based on n-grams, to obtain the basic terms of multiple gram fragments.
7. The method of any one of claims 1-6, characterized in that the n-gram is a 4th-order gram.
8. The method of any one of claims 1-7, characterized in that obtaining the entry weight of each term according to how high the frequency of occurrence is specifically comprises:
Taking the occurrence count of the most frequent term as the denominator, and calculating the entry weight of each term from its occurrence count.
9. A co-click-based entry weight determination device, characterized in that the device comprises:
A query set acquisition unit, configured to obtain, based on search log data, the set of input queries corresponding to a co-clicked uniform resource locator (URL);
A word segmentation unit, configured to segment each query in the query set obtained by the query set acquisition unit, to obtain multiple basic terms;
An entry weight acquisition unit, configured to count the frequency with which each term obtained by the word segmentation unit occurs in the query set, and to obtain the entry weight of each term according to how high that frequency is.
10. The device of claim 9, characterized in that the device further comprises:
A weight dictionary unit, configured to form a weight dictionary from the terms and their corresponding entry weights.
CN201410718382.1A 2014-12-01 2014-12-01 Entry weight determination method and device based on co-clicking Expired - Fee Related CN104361115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410718382.1A CN104361115B (en) 2014-12-01 2014-12-01 Entry weight determination method and device based on co-clicking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410718382.1A CN104361115B (en) 2014-12-01 2014-12-01 Entry weight determination method and device based on co-clicking

Publications (2)

Publication Number Publication Date
CN104361115A true CN104361115A (en) 2015-02-18
CN104361115B CN104361115B (en) 2018-07-27

Family

ID=52528375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410718382.1A Expired - Fee Related CN104361115B (en) 2014-12-01 2014-12-01 Entry weight determination method and device based on co-clicking

Country Status (1)

Country Link
CN (1) CN104361115B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183912A (en) * 2015-10-12 2015-12-23 北京百度网讯科技有限公司 Abnormal log determination method and device
CN105488209A (en) * 2015-12-11 2016-04-13 北京奇虎科技有限公司 Method and device for analyzing word weight
CN105528441A (en) * 2015-12-22 2016-04-27 北京奇虎科技有限公司 Automatic marking based head word extracting method and device
CN105528430A (en) * 2015-12-10 2016-04-27 北京奇虎科技有限公司 Method and device for determining weights of search terms
CN106919649A (en) * 2017-01-19 2017-07-04 北京奇艺世纪科技有限公司 A kind of method and device of entry weight calculation
CN106919603A (en) * 2015-12-25 2017-07-04 北京奇虎科技有限公司 The method and apparatus for calculating participle weight in query word pattern
CN108804511A (en) * 2018-04-20 2018-11-13 北京奇艺世纪科技有限公司 Method, apparatus and electronic equipment are recalled in a kind of search
CN108897736A (en) * 2018-06-20 2018-11-27 大连诺道认知医学技术有限公司 Document sort method and device based on Paper Rank algorithm
CN109815396A (en) * 2019-01-16 2019-05-28 北京搜狗科技发展有限公司 Search term Weight Determination and device
CN110147421A (en) * 2019-05-10 2019-08-20 腾讯科技(深圳)有限公司 A kind of target entity link method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080065617A1 (en) * 2005-08-18 2008-03-13 Yahoo! Inc. Search entry system with query log autocomplete
CN102043845A (en) * 2010-12-08 2011-05-04 百度在线网络技术(北京)有限公司 Method and equipment for extracting core keywords based on query sequence cluster
US20110137886A1 (en) * 2009-12-08 2011-06-09 Microsoft Corporation Data-Centric Search Engine Architecture
CN103150409A (en) * 2013-04-08 2013-06-12 深圳市宜搜科技发展有限公司 Method and system for recommending user search word
CN103425687A (en) * 2012-05-21 2013-12-04 阿里巴巴集团控股有限公司 Retrieval method and system based on queries

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105183912A (en) * 2015-10-12 2015-12-23 北京百度网讯科技有限公司 Abnormal log determination method and device
CN105183912B (en) * 2015-10-12 2019-03-01 北京百度网讯科技有限公司 Abnormal log determination method and device
CN105528430B (en) * 2015-12-10 2019-05-31 北京奇虎科技有限公司 Method and device for determining weights of search terms
CN105528430A (en) * 2015-12-10 2016-04-27 北京奇虎科技有限公司 Method and device for determining weights of search terms
CN105488209A (en) * 2015-12-11 2016-04-13 北京奇虎科技有限公司 Method and device for analyzing word weight
CN105488209B (en) * 2015-12-11 2019-06-07 北京奇虎科技有限公司 Method and device for analyzing word weight
CN105528441A (en) * 2015-12-22 2016-04-27 北京奇虎科技有限公司 Automatic marking based head word extracting method and device
CN106919603A (en) * 2015-12-25 2017-07-04 北京奇虎科技有限公司 Method and device for calculating word segmentation weight in query word mode
CN106919603B (en) * 2015-12-25 2020-12-04 北京奇虎科技有限公司 Method and device for calculating word segmentation weight in query word mode
CN106919649A (en) * 2017-01-19 2017-07-04 北京奇艺世纪科技有限公司 Entry weight calculation method and device
CN106919649B (en) * 2017-01-19 2020-06-26 北京奇艺世纪科技有限公司 Entry weight calculation method and device
CN108804511A (en) * 2018-04-20 2018-11-13 北京奇艺世纪科技有限公司 Search recall method and device and electronic equipment
CN108804511B (en) * 2018-04-20 2022-04-22 北京奇艺世纪科技有限公司 Search recall method and device and electronic equipment
CN108897736A (en) * 2018-06-20 2018-11-27 大连诺道认知医学技术有限公司 Document sorting method and device based on Paper Rank algorithm
CN108897736B (en) * 2018-06-20 2022-04-12 大连诺道认知医学技术有限公司 Document sorting method and device based on Paper Rank algorithm
CN109815396A (en) * 2019-01-16 2019-05-28 北京搜狗科技发展有限公司 Search term weight determination method and device
CN110147421A (en) * 2019-05-10 2019-08-20 腾讯科技(深圳)有限公司 Target entity linking method, device, equipment and storage medium
CN110147421B (en) * 2019-05-10 2022-06-21 腾讯科技(深圳)有限公司 Target entity linking method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN104361115B (en) 2018-07-27

Similar Documents

Publication Publication Date Title
CN104361115A (en) Entry weight definition method and device based on co-clicking
US8560513B2 (en) Searching for information based on generic attributes of the query
US7949643B2 (en) Method and apparatus for rating user generated content in search results
CN104376115A (en) Fuzzy word determining method and device based on global search
US10565253B2 (en) Model generation method, word weighting method, device, apparatus, and computer storage medium
CN103377226A (en) Intelligent search method and system thereof
CN103886092A (en) Method and device for providing terminal failure problem solutions
CN103942264A (en) Method and device for pushing webpages containing news information
CN101661490B (en) Search engine, client thereof and method for searching page
CN103631889A (en) Image recognizing method and device
CN115905489A (en) Method for providing bid and bid information search service
CN104778232B (en) Searching result optimizing method and device based on long query
US20160154886A1 (en) Accounting for authorship in a web log search engine
CN111435406A (en) Method and device for correcting database statement spelling errors
Juan An effective similarity measurement for FAQ question answering system
CN110705285B (en) Government affair text subject word library construction method, device, server and readable storage medium
CN104462556A (en) Method and device for recommending question and answer page related questions
CN105528441A (en) Automatic marking based head word extracting method and device
CN105159899A (en) Searching method and searching device
CN103631890A (en) Method and device for mining image principal information
CN110737757B (en) Method and apparatus for generating information
CN111737428B (en) Target material matching method, device, equipment and readable storage medium
CN109299382B (en) Recommendation method and system for character data and computer storage medium
CN105528430A (en) Method and device for determining weights of search terms
CN117033552A (en) Information evaluation method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180727

Termination date: 20211201