CN104464733A - Multi-scene managing method and device of voice conversation - Google Patents


Publication number
CN104464733A
Authority
CN
China
Prior art keywords
scene
demand information
characteristic
feature vector
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410590076.4A
Other languages
Chinese (zh)
Other versions
CN104464733B (en
Inventor
陈洪亮
汪冠春
吴华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201410590076.4A
Publication of CN104464733A
Application granted
Publication of CN104464733B
Legal status: Active
Anticipated expiration


Abstract

The invention provides a multi-scene management method and device for voice dialogue. The method comprises: obtaining the demand information input by a user from text information, where the text information is obtained by performing text recognition on the user's voice information; obtaining, according to the demand information, at least one score respectively corresponding to at least one scene in a scene set; and determining the scene switching action to be performed according to the at least one score, and presenting the voice content corresponding to the scene after switching. The method and device solve well the problem of switching among multiple scenes during a voice dialogue between the user and a dialogue system.

Description

Multi-scene management method and device for voice dialogue
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a multi-scene management method and device for voice dialogue.
Background
With the development of speech recognition technology and mobile Internet technology, the advantages of voice input on mobile terminals have become increasingly apparent. Major Internet companies have each released voice dialogue systems that use natural, low-cost voice input to understand what the user needs and to solve problems for the user.
During speech recognition, a dialogue may span multiple scenes and multiple domains, so the decision problem that arises in multi-domain, multi-turn dialogue must be solved. The prior art manages multiple scenes in one of two ways. The first is rule-based: a series of rules is formulated to manage switching between scenes. The second is based on a classification model: the model predicts the next action to perform from the current system state.
The rule-based method requires rule authors with good background knowledge. As the number of factors involved in the rules grows, the processing logic becomes complicated and the results fall short of optimal. Rule-based multi-scene management also does not incorporate user feedback, so it has no view of how users actually use the system, and the decision action it finally generates may not be the most reasonable one.
Summary of the invention
Embodiments of the present invention provide a multi-scene management method and device for voice dialogue that manage switching among multiple scenes effectively.
To achieve the above object, embodiments of the present invention adopt the following technical solutions:
A multi-scene management method for voice dialogue, the method comprising:
obtaining the demand information input by a user from text information, wherein the text information is obtained by performing text recognition on the voice information of the user;
obtaining, according to the demand information, at least one score respectively corresponding to at least one scene in a scene set;
determining the scene switching action to be performed according to the at least one score, and presenting the voice content corresponding to the scene after switching.
A multi-scene management device for voice dialogue, the device comprising:
a first acquisition module, configured to obtain the demand information input by a user from text information, wherein the text information is obtained by performing text recognition on the voice information of the user;
a second acquisition module, configured to obtain, according to the demand information, at least one score respectively corresponding to at least one scene in a scene set;
a switching module, configured to determine the scene switching action to be performed according to the at least one score, and to present the voice content corresponding to the scene after switching.
The multi-scene management method and device for voice dialogue provided by embodiments of the present invention obtain the demand information input by the user from the text information and obtain at least one scene in the scene set according to that demand information, so as to provide the user with voice content that is to be presented and fits the user's demand. This solves well the problem of switching among multiple scenes during a voice dialogue between the user and the dialogue system.
Brief description of the drawings
Fig. 1 is a schematic diagram of the scene relationships in the multi-scene management to which embodiments of the present invention apply.
Fig. 2 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment One of the present invention.
Fig. 3 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment Two of the present invention.
Fig. 4 is a schematic diagram of the scene switching actions in Embodiment Two of the present invention.
Fig. 5 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment Three of the present invention.
Fig. 6 is a schematic structural diagram of the multi-scene management device for voice dialogue provided by Embodiment Four of the present invention.
Fig. 7 is a schematic structural diagram of the multi-scene management device for voice dialogue provided by Embodiment Five of the present invention.
Detailed description
The multi-scene management method and device for voice dialogue provided by embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Multi-turn dialogue, as used in the embodiments of the present invention, means helping the user complete a specific demand or task through multiple turns of interaction between the user and a terminal. In the embodiments, multi-turn dialogue is divided by domain (or scene) and may be task-oriented multi-turn dialogue. Fig. 1 is a schematic diagram of the scene relationships in the multi-scene management to which the embodiments apply. As shown in Fig. 1, the embodiments may include multiple scenes, for example a travel scene, a dining scene, a weather scene and a stock scene. As a further example, the travel scene may be divided into multiple sub-scenes: an air ticket scene, a train ticket scene and a hotel scene.
Specifically, the embodiments use the following dialogue as an example:
User: An air ticket to Qingdao.
Mobile terminal: OK, an air ticket to Qingdao. When would you like to leave?
User: Leaving tomorrow.
Mobile terminal: OK, an air ticket to Qingdao for tomorrow. Where will you depart from?
User: Beijing.
At this point in the dialogue, the mobile terminal presents the relevant air ticket information to the user.
User: Is there good Sichuan food in Qingdao?
(Speech recognition returned "Is there good sour Sichuan food?")
Mobile terminal: Do you want to continue booking the air ticket?
User: No.
Mobile terminal: Did you ask whether there is good sour Sichuan food?
User: I am going to Qingdao tomorrow.
At this point in the dialogue, the mobile terminal presents the corresponding Sichuan restaurants in Qingdao to the user.
User: Restaurants near Qingdao Airport.
Mobile terminal: I found the following restaurants for you (and presents the corresponding results).
User: Will it rain in Qingdao tomorrow?
Mobile terminal: Showers in Qingdao tomorrow, temperature 16 to 20 degrees Celsius, east wind force 5 to 6.
Embodiment one:
Fig. 2 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment One of the present invention. As shown in Fig. 2, this embodiment comprises the following steps:
Step 101: obtain the demand information input by the user from text information, where the text information is obtained by performing text recognition on the user's voice information.
Step 102: obtain at least one scene in the scene set according to the demand information, and obtain at least one feature vector respectively corresponding to the at least one scene.
Step 103: compute the inner product of each feature vector with its corresponding weight vector, giving at least one inner product.
Step 104: determine the scene switching action to be performed according to the at least one inner product, and present the voice content corresponding to the scene after switching.
In step 101, the user's voice information is converted to text information by speech recognition. According to one embodiment of the invention, the user's demand information is obtained from the recognized text information. For example, the user says "An air ticket to Qingdao"; after this voice information is recognized as text, the demand information input by the user is obtained as "air ticket".
In step 102, at least one scene in the scene set is obtained according to the demand information obtained in step 101. In one embodiment, the at least one scene may be determined from the context information of the voice dialogue. The scene set consists of the scenes preset in the dialogue system (for example, the travel scene, dining scene, weather scene and stock scene shown in Fig. 1). Specifically, given the demand information "air ticket" obtained in step 101, the travel scene in the scene set can be obtained (the travel scene may further include sub-scenes such as the air ticket scene, train ticket scene and hotel scene), and the demand information corresponds more specifically to the air ticket sub-scene of the travel scene. In one embodiment, at least one feature vector corresponding to the travel scene can be obtained from the voice information. For example, in the voice information "An air ticket to Qingdao", the terms "go", "Qingdao" and "air ticket" form the features of the voice information, and quantizing these features yields the feature vector. Here the feature vector specifically includes the destination (Qingdao) and the air ticket (the demand information). The feature vector in the embodiments may also include, without limitation, the departure place, date, seat type and departure time. In one embodiment, the departure place, destination and date are required information, while the seat type and departure time are optional. Representing scenes with such feature vectors gives the embodiments good generalization ability and avoids the prior-art need to label data and retrain the corresponding model every time a new scene is added.
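As a rough illustration of how the features described above could be quantized, the sketch below encodes each slot as a filled or unfilled indicator. The slot names, their order and the binary encoding are assumptions made for this example; the patent does not specify how the features are quantized.

```python
from typing import Dict, List, Optional

# Hypothetical slot layout for the air ticket scene (names are illustrative).
REQUIRED_SLOTS = ["departure", "destination", "date"]
OPTIONAL_SLOTS = ["seat_type", "departure_time"]


def ticket_feature_vector(slots: Dict[str, Optional[str]], demand: str) -> List[float]:
    """Quantize the slots heard so far into a fixed-length vector.

    Each slot contributes 1.0 when it has a value and 0.0 otherwise; the
    last component marks whether the demand information is "air ticket".
    """
    vector = [1.0 if slots.get(name) else 0.0
              for name in REQUIRED_SLOTS + OPTIONAL_SLOTS]
    vector.append(1.0 if demand == "air ticket" else 0.0)
    return vector


# "An air ticket to Qingdao": only the destination and the demand are known.
example = ticket_feature_vector({"destination": "Qingdao"}, "air ticket")
```

Because the vector length is fixed by the slot list rather than by the scene, the same scoring code can be reused when a slot is filled in a later turn.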
In step 103, the inner product of each feature vector obtained in step 102 with its corresponding weight vector is computed, giving at least one inner product (for example, A_1, A_2, A_3, ..., A_n, where n is the number of inner products). The weight vector is the weight vector of the corresponding scene feature, obtained by training on collected corpus data. Those skilled in the art will understand that the embodiments use the inner product as the score only by way of example, and that the specific way the inner product is computed does not limit the embodiments.
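The scoring in step 103 can be sketched as a plain inner product per scene. The scene names and the numeric values below are made up for illustration; in the patent the weight vectors come from training on collected corpus data.

```python
from typing import Dict, List


def inner_product(feature: List[float], weight: List[float]) -> float:
    """Return the inner product of a feature vector and its weight vector."""
    if len(feature) != len(weight):
        raise ValueError("feature and weight vectors must have the same length")
    return sum(f * w for f, w in zip(feature, weight))


def score_scenes(features: Dict[str, List[float]],
                 weights: Dict[str, List[float]]) -> Dict[str, float]:
    """Score every candidate scene with its own weight vector."""
    return {scene: inner_product(vec, weights[scene])
            for scene, vec in features.items()}


# Illustrative values only.
features = {"air_ticket": [1.0, 0.0, 1.0], "dining": [0.0, 1.0, 0.0]}
weights = {"air_ticket": [0.5, 0.25, 2.0], "dining": [1.0, 0.75, 0.5]}
scores = score_scenes(features, weights)
```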
In step 104, the scene switching action to be performed is determined according to the at least one inner product obtained in step 103, and the voice content corresponding to the scene after switching is presented. According to one embodiment of the invention, in step 104 the inner products are sorted to find the maximum value, the scene switching action corresponding to that inner product is taken as the decision action for the corresponding scene, and it is fed back to the user as voice content. In one embodiment, the feature vectors of the scenes corresponding to the user's demand information "air ticket" are obtained and their inner products are computed as A_1, A_2, A_3 and A_4. If sorting shows that the maximum inner product is A_2, the voice content corresponding to A_2 (for example, "OK, an air ticket to Qingdao. When would you like to leave?") is output to the user.
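A minimal sketch of the selection in step 104 follows: sort the scores, take the largest, and look up its voice content. The score values and the reply table are hypothetical and only mirror the A_1 to A_4 example above.

```python
from typing import Dict, Tuple


def pick_best(scores: Dict[str, float]) -> Tuple[str, float]:
    """Sort the scores in descending order and return the top entry."""
    if not scores:
        raise ValueError("at least one score is required")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[0]


# Illustrative scores standing in for A_1 .. A_4.
scores = {"A_1": 0.2, "A_2": 0.9, "A_3": 0.4, "A_4": 0.1}
# Hypothetical mapping from the winning score to its voice content.
replies = {"A_2": "OK, an air ticket to Qingdao. When would you like to leave?"}

best_name, best_score = pick_best(scores)
reply = replies.get(best_name, "")
```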
The multi-scene management method for voice dialogue provided by this embodiment obtains the demand information input by the user from the text information and obtains at least one scene in the scene set according to it, so as to provide a scene switching action that is to be performed and fits the user's demand, and presents the voice content corresponding to the scene after switching. This solves well the problem of switching among multiple scenes of a voice dialogue in a dialogue system. In addition, representing scenes by feature vectors gives the dialogue system good generalization ability, so new scenes can be added to the system quickly and multi-scene switching can be managed effectively. The system can also take full account of how users actually use it and provide the most reasonable action decision, which improves the user experience.
Embodiment two:
Fig. 3 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment Two of the present invention, and Fig. 4 is a schematic diagram of the scene switching actions in Embodiment Two. As shown in Fig. 3, this embodiment comprises the following steps:
Step 201: obtain the demand information input by the user from text information, where the text information is obtained by performing text recognition on the user's voice information.
Step 202: perform scene classification on the voice dialogue according to the demand information obtained in step 201, giving at least one scene in the scene set to which the demand information applies.
Step 203: perform scene feature extraction on the demand information according to the at least one scene obtained in step 202, giving at least one feature vector respectively corresponding to the at least one scene.
Step 204: compute the inner product of each feature vector with its corresponding weight vector, giving at least one inner product.
Step 205: sort the at least one inner product and obtain the maximum value among all inner products.
Step 206: perform the scene switching action on the demand information according to the scene feature corresponding to the maximum value, and present the speech response corresponding to the scene after switching.
For step 201, refer to the description of step 101 in Embodiment One; it is not described again here.
In step 202, scene classification is performed on the voice dialogue according to the demand information obtained in step 201, giving at least one applicable scene in the scene set. For example, if the demand information input by the user is "Qingdao" and "air ticket", the voice dialogue can be classified into the air ticket sub-scene of the travel scene. After classification yields one or more scenes, step 203 performs scene feature extraction on the demand information according to each scene and obtains the corresponding feature vector.
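The patent does not say how the classifier in step 202 works. Purely to make the input and output concrete, the sketch below uses the scene hierarchy of Fig. 1 and a hypothetical keyword table; a real system would more likely use a trained classifier.

```python
from typing import Dict, List, Optional, Tuple

# Scene hierarchy from Fig. 1: top-level scenes and the sub-scenes of travel.
SCENE_TREE: Dict[str, List[str]] = {
    "travel": ["air_ticket", "train_ticket", "hotel"],
    "dining": [],
    "weather": [],
    "stock": [],
}

# Hypothetical keyword table, an assumption made for this example.
KEYWORDS: Dict[str, str] = {
    "air ticket": "air_ticket",
    "train ticket": "train_ticket",
    "hotel": "hotel",
    "restaurant": "dining",
    "rain": "weather",
}


def parent_of(scene: str) -> Optional[str]:
    """Return the top-level scene that contains a sub-scene, if any."""
    for parent, children in SCENE_TREE.items():
        if scene in children:
            return parent
    return None


def classify(demand_terms: List[str]) -> List[Tuple[Optional[str], str]]:
    """Return (parent scene, scene) pairs for every demand term that matches."""
    matches = []
    for term in demand_terms:
        scene = KEYWORDS.get(term)
        if scene is not None:
            matches.append((parent_of(scene), scene))
    return matches


# Demand information "Qingdao", "air ticket" from the example above.
result = classify(["Qingdao", "air ticket"])
```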
For step 203 and step 204, refer to step 102 and step 103 in Embodiment One; they are not described again here.
In step 205, the at least one inner product obtained in step 204 is sorted to find the maximum value. For example, the feature vectors of the scenes corresponding to the user's demand information "air ticket" are obtained, their inner products are computed as A_1, A_2 and A_3, and sorting shows that the maximum inner product is A_2.
In step 206, a speech response that fits the demand information is produced according to the scene feature corresponding to the maximum value, and the voice content is fed back to the user; Fig. 4 illustrates the scene switching actions of Embodiment Two. For example, if the voice content corresponding to the maximum inner product A_2 from step 205 is "OK, an air ticket to Qingdao. When would you like to leave?", this voice content is fed back to the user during the voice dialogue.
Those skilled in the art will understand that, in practice, the scenes that are set up and learned cannot be exhaustive, so scene features outside the preset scenes (out-of-scene features) may occur. According to one embodiment of the invention, the feature vector of a scene confirmation action is generated from the out-of-scene feature and at least one scene feature, and this feature vector is one of the at least one feature vector. Further, if the maximum value obtained in step 205 corresponds to a single scene feature in the scene set, the demand information is responded to according to that scene feature; if the maximum value corresponds to two or more feature vectors in the scene set, the demand information is clarified according to those feature vectors; and if the maximum value corresponds to an out-of-scene feature together with a scene feature in the scene set, the out-of-scene feature and the scene feature are confirmed with the user.
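The three-way rule above can be sketched as follows. The `Candidate` record and the returned action strings are hypothetical; they only show how the kind of feature vector behind the maximum score selects between responding, clarifying and confirming.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    scenes: Tuple[str, ...]  # in-scene features behind this feature vector (at least one)
    out_of_scene: bool       # True when an out-of-scene feature is also involved
    score: float             # inner product with the matching weight vector


def choose_action(candidates: List[Candidate]) -> str:
    """Pick the candidate with the maximum score and map it to an action."""
    best = max(candidates, key=lambda c: c.score)
    if best.out_of_scene:
        # Out-of-scene feature plus a scene feature: ask the user to confirm.
        return "confirm(" + best.scenes[0] + ")"
    if len(best.scenes) >= 2:
        # Two or more scene features: ask the user to clarify between them.
        return "clarify(" + ", ".join(best.scenes) + ")"
    # A single scene feature: respond within that scene.
    return "respond(" + best.scenes[0] + ")"


# Illustrative scores only.
candidates = [
    Candidate(("air_ticket",), False, 0.4),
    Candidate(("air_ticket", "dining"), False, 0.7),
    Candidate(("air_ticket",), True, 0.9),
]
action = choose_action(candidates)
```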
During scene clarification, the difference between the scene vectors corresponding to two or more scene features is obtained, the exponential of the difference is computed, and the feature vector for clarifying the two scenes is determined from the result. For example, given the feature vectors f_1 and f_2 of two scenes, the difference f_1-f_2 of the two scene features is computed, followed by the corresponding exponential e^(f_1-f_2), where e is the natural constant; other values may of course be used as the base. The feature vector for clarifying the two scenes is determined from this exponential. Specifically, the inner product of the clarification feature vector and the weight vector for scene clarification is computed to give the score for clarifying these two scenes, and when this score is the maximum, the two scenes are clarified.
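A small sketch of the clarification feature e^(f_1-f_2) follows. Applying the exponential element by element is an assumption made here, since the text does not say how the exponential of a vector difference is taken; the weight values are likewise illustrative.

```python
import math
from typing import List


def clarify_feature(f_1: List[float], f_2: List[float]) -> List[float]:
    """Element-wise e^(f_1 - f_2) for two scene feature vectors."""
    if len(f_1) != len(f_2):
        raise ValueError("both scene feature vectors must have the same length")
    return [math.exp(a - b) for a, b in zip(f_1, f_2)]


def clarify_score(f_1: List[float], f_2: List[float], weight: List[float]) -> float:
    """Inner product of the clarification feature with the clarification weights."""
    feature = clarify_feature(f_1, f_2)
    return sum(x * w for x, w in zip(feature, weight))


# Two identical scene vectors give a difference of zero, so every component is e^0 = 1.
same = clarify_feature([1.0, 0.0], [1.0, 0.0])
score = clarify_score([1.0, 0.0], [1.0, 0.0], [0.5, 0.25])
```

With this encoding, two scenes whose feature vectors are close produce components near 1, so the learned clarification weights decide how strongly similarity between scenes favors asking the user to choose.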
For example, in the multi-turn voice dialogue above, the mobile terminal recognizes the user's utterance "Is there good Sichuan food in Qingdao?" as "Is there good sour Sichuan food?". Working from this text and applying Embodiment Two, the mobile terminal combines the context information and the parsing information, adopts scene confirmation when performing the scene switching action, and presents the corresponding speech response "Do you want to continue booking the air ticket?" so that the user can confirm the scene.
Further, after the user answers "No", the mobile terminal again combines the context information and the parsing information, adopts scene clarification when performing the scene switching action, and presents the corresponding speech response "Did you ask whether there is good sour Sichuan food?" so that the user can clarify the scene.
Embodiment three:
Fig. 5 is a schematic flow chart of the multi-scene management method for voice dialogue provided by Embodiment Three of the present invention. This embodiment takes a mobile terminal performing the multi-scene management method as its example. As shown in Fig. 5, the embodiment comprises the following steps:
In the offline learning process of step 501, multiple scene targets can be set during crowd testing, and users carry out multi-turn voice interaction with the mobile terminal, so that the mobile terminal acquires a certain stochastic decision-making ability. The crowd-test data is one basis of the mobile terminal's training data in this embodiment and enables online prediction based on that training data.
In the online learning process of step 502, if the voice dialogue involves multiple turns (that is, the user and the mobile terminal have exchanged speech several times), the context information and parsing information of the user and the mobile terminal can be collected to obtain feature vectors that represent the feature state of a scene, and the reinforcement learning model computes the inner product of each feature vector with its weight vector. Through this process the embodiment can maximize the global gain; in multiple groups of comparison experiments, the results all exceeded those of rule-based multi-scene management in the prior art. In addition, by selecting feature vectors that are independent of the scene domain and using them to represent scene features, the embodiment covers essentially all the factors relevant to scene switching and improves generalization. Fig. 4 illustrates the feature vectors.
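The patent does not state the update rule used by the reinforcement learning model. Purely as an illustration of learning a weight vector from user feedback, the sketch below applies a generic gradient step for a linear scorer; the function name, the learning rate and the numeric reward signal are all assumptions, not the patent's training procedure.

```python
from typing import List


def update_weights(weight: List[float], feature: List[float],
                   reward: float, learning_rate: float = 0.1) -> List[float]:
    """Move the weights so the inner product gets closer to the observed reward."""
    predicted = sum(w * f for w, f in zip(weight, feature))
    error = reward - predicted
    return [w + learning_rate * error * f for w, f in zip(weight, feature)]


# One feedback step: the prediction starts at 0 and the user feedback is worth 1.
new_weight = update_weights([0.0, 0.0], [1.0, 2.0], reward=1.0, learning_rate=0.5)
```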
In the scene switching action of step 503, the embodiment uses the four classes of action shown in Table 1 as examples, including but not limited to: presenting out-of-scene content (present(NULL)), presenting a scene (present(d)), scene confirmation (confirm(d)) and clarification between scenes (clarify(d1, d2)). Scene confirmation and scene clarification strengthen human-machine interaction throughout the multi-turn dialogue.
Table 1
In the action selection process of step 503, the reinforcement learning model trained and optimized in step 502 is used to predict, from the current user's demand information, which class of action in Table 1 to perform.
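The four action classes named for step 503 can be written down as a small data type. The class and function names below are hypothetical; only the action labels present(NULL), present(d), confirm(d) and clarify(d1, d2) come from the text.

```python
from enum import Enum
from typing import Tuple


class ActionType(Enum):
    PRESENT_NULL = "present(NULL)"   # present out-of-scene content
    PRESENT = "present(d)"           # present scene d
    CONFIRM = "confirm(d)"           # confirm scene d with the user
    CLARIFY = "clarify(d1, d2)"      # clarify between scenes d1 and d2


# Number of scene arguments each action takes.
ARITY = {
    ActionType.PRESENT_NULL: 0,
    ActionType.PRESENT: 1,
    ActionType.CONFIRM: 1,
    ActionType.CLARIFY: 2,
}


def render(action: ActionType, scenes: Tuple[str, ...] = ()) -> str:
    """Format a concrete action, checking the number of scene arguments."""
    if len(scenes) != ARITY[action]:
        raise ValueError(f"{action.name} takes {ARITY[action]} scene(s)")
    if action is ActionType.PRESENT_NULL:
        return "present(NULL)"
    return f"{action.name.lower()}({', '.join(scenes)})"


clarify_text = render(ActionType.CLARIFY, ("air_ticket", "dining"))
```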
Through the above process, user feedback can be fully used, and the action with the greatest long-term gain for the user can be predicted. In addition, because the feature vectors use features that are independent of any specific scene, new scene features can be introduced quickly, which gives the scheme good extensibility.
Embodiment four:
Fig. 6 is a schematic structural diagram of the multi-scene management device for voice dialogue provided by Embodiment Four of the present invention. As shown in Fig. 6, this embodiment comprises:
a first acquisition module 41, configured to obtain the demand information input by a user from text information, wherein the text information is obtained by performing text recognition on the voice information of the user;
a second acquisition module 42, configured to obtain, according to the demand information, at least one score respectively corresponding to at least one scene in the scene set;
a switching module 43, configured to determine the scene switching action to be performed according to the at least one score, and to present the voice content corresponding to the scene after switching.
The second acquisition module 42 comprises:
a first acquiring unit 421, configured to obtain at least one scene in the scene set according to the demand information, and to obtain at least one feature vector respectively corresponding to the at least one scene;
a second acquiring unit 422, configured to compute the score of each feature vector with its corresponding weight vector, giving at least one score.
Further, the first acquiring unit comprises:
a scene classification subunit (not shown), configured to perform scene classification on the voice dialogue according to the demand information, giving at least one scene in the scene set to which the demand information applies;
a feature extraction subunit (not shown), configured to perform scene feature extraction on the demand information according to the at least one scene, giving at least one feature vector respectively corresponding to the at least one scene.
For a detailed description of this embodiment and its advantageous effects, refer to the related description and advantageous effects in Embodiment One above; they are not repeated here.
Embodiment five:
Fig. 7 is a schematic structural diagram of the multi-scene management device for voice dialogue provided by Embodiment Five of the present invention. As shown in Fig. 7, if an out-of-scene feature is also obtained from the demand information, this embodiment further comprises:
a third acquisition module 44, configured to obtain, from the at least one scene feature, the feature vector of a scene confirmation action according to the out-of-scene feature and the at least one scene feature.
The switching module 43 comprises:
a sorting unit 431, configured to sort the at least one score and obtain the maximum value among all scores;
a determining unit 432, configured to determine the scene switching action to be performed according to the scene feature corresponding to the maximum value, and to present the voice content of the scene feature corresponding to the maximum value.
Further, the determining unit comprises:
a first response subunit (not shown), configured to respond to the demand information according to a scene feature if the maximum value corresponds to that single scene feature in the scene set;
a second response subunit (not shown), configured to clarify the demand information according to two or more feature vectors if the maximum value corresponds to those feature vectors in the scene set;
a third response subunit (not shown), configured to confirm the out-of-scene feature and the scene feature in the scene set if the maximum value corresponds to the out-of-scene feature and that scene feature.
Further, the third response subunit (not shown) comprises:
a difference obtaining subunit, configured to obtain the difference between the scene vectors corresponding to the two or more scene features;
a clarification subunit, configured to compute the exponential of the difference and to determine, from the result, to clarify the two or more scene features.
Further, the device also comprises:
a fourth acquisition module 45, configured to obtain the target feature of the at least one scene during crowd testing, and to perform multi-turn voice training on the target feature by means of a statistical model;
a fifth acquisition module 46, configured to obtain the initial value of the weight vector once the statistical model has stochastic decision-making ability.
For a detailed description of this embodiment and its advantageous effects, refer to the related description and advantageous effects in Embodiment Two above; they are not repeated here.
In summary, the embodiments of the present invention make full use of user feedback and can predict the action with the greatest long-term gain for the user. In addition, because the feature vectors use features that are independent of any specific scene, new scene features can be introduced quickly, which gives the scheme good extensibility.
The above is only a specific implementation of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims (16)

1. many scene managements method of voice dialogue, is characterized in that, described method comprises:
From text message, obtain the demand information of user's input, wherein, described text message carries out text identification and obtains from the voice messaging of described user;
According to each at least one score value self-corresponding of at least one scene in described demand information scene;
Determine the scene switching action for performing according at least one score value described, and show the voice content corresponding with the scene after switching.
2. method according to claim 1, is characterized in that, the described step according to each at least one score value self-corresponding of at least one scene in described demand information acquisition scene comprises:
Obtain at least one scene in scene according to described demand information, and obtain at least one proper vector corresponding respectively with at least one scene described;
Obtain the inner product of at least one proper vector described and each self-corresponding weight vectors, obtain at least one inner product, described inner product is as score value.
3. method according to claim 2, is characterized in that, described at least one scene obtained according to described demand information in scene, and the step obtaining at least one proper vector corresponding respectively with at least one scene described comprises:
According to described demand information, scene classification is carried out to described voice dialogue, obtain described demand information at least one scene in the scene that is suitable for;
According at least one scene described, scene characteristic extraction is carried out to described demand information, obtain at least one proper vector corresponding respectively with at least one scene described.
4. method according to claim 2, is characterized in that, if also get feature scene from described demand information, described method also comprises:
From at least one scene characteristic described, the proper vector that scene confirms action is obtained according to the outer feature of described scene and at least one scene characteristic described.
5. The method according to claim 2, characterized in that the method further comprises:
During crowd testing, obtaining target features of the at least one scene, and performing multi-round voice training on the target features by means of a statistical model;
When the statistical model has stochastic decision capability, obtaining the initial value of the weight vectors.
6. The method according to any one of claims 1-5, characterized in that the step of determining, according to the at least one score value, the scene switching action to be executed comprises:
Sorting the at least one score value to obtain the maximum value among all score values;
Determining the scene switching action to be executed according to the scene feature corresponding to the maximum value, and presenting the voice content of the scene feature corresponding to the maximum value.
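The sort-and-select step of claim 6 can be sketched as follows. The function name `pick_top_scene` and the scene-name keys are hypothetical; the sketch assumes the score values are held in a dictionary keyed by scene.

```python
from typing import Dict, Tuple


def pick_top_scene(scores: Dict[str, float]) -> Tuple[str, float]:
    """Sort the score values in descending order and return the best (scene, score) pair."""
    if not scores:
        raise ValueError("no score values to sort")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[0]
```

Sorting the whole list mirrors the wording of the claim; when only the maximum is needed, a single pass with `max` would give the same result.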
7. The method according to claim 6, characterized in that the step of responding to the demand information according to the scene feature corresponding to the maximum value comprises:
If the maximum value corresponds to one scene feature within the scenes, responding to the demand information according to that scene feature;
If the maximum value corresponds to two or more feature vectors within the scenes, clarifying the demand information according to the two or more feature vectors;
If the maximum value corresponds to the out-of-scene feature and a scene feature within the scenes, confirming the out-of-scene feature and the scene feature within the scenes.
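The three branches of claim 7 can be read as a small decision function. The sketch below is one possible interpretation, not the patent's implementation: it assumes that the maximum value "corresponds to" every candidate whose score ties with it (within a tolerance), and it uses a hypothetical `out_of_scene` flag to mark an out-of-scene feature. The names `Candidate` and `choose_action` are likewise invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str
    score: float
    out_of_scene: bool = False  # hypothetical flag marking an out-of-scene feature


def choose_action(candidates: List[Candidate], tol: float = 1e-9) -> Tuple[str, List[str]]:
    """Map the top-scoring candidates to respond / clarify / confirm."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    top = max(c.score for c in candidates)
    # Assumption: the maximum corresponds to every candidate tied with it.
    best = [c for c in candidates if top - c.score <= tol]
    in_scene = [c.name for c in best if not c.out_of_scene]
    outside = [c.name for c in best if c.out_of_scene]
    if outside and in_scene:
        return "confirm", outside + in_scene   # out-of-scene plus in-scene feature
    if len(in_scene) >= 2:
        return "clarify", in_scene             # two or more in-scene features
    if len(in_scene) == 1:
        return "respond", in_scene             # exactly one in-scene feature
    return "unspecified", outside              # case not covered by the claim
```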
8. The method according to claim 7, characterized in that the step of clarifying the demand information according to the two or more scene features comprises:
Obtaining the difference between the scene vectors corresponding to the two or more scene features;
Performing an exponential operation on the difference, and determining, according to the result of the exponential operation, whether to clarify the two or more scene features.
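Claim 8 does not state the exact formula, so the following Python sketch fills the gaps with explicit assumptions: the "difference" is taken as the Euclidean distance between two scene vectors, the "exponential operation" as `exp(-distance)`, and clarification is triggered when that value reaches a hypothetical `threshold`. None of these choices is confirmed by the patent text.

```python
import math
from typing import Sequence


def should_clarify(
    vec_a: Sequence[float],
    vec_b: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the patent
) -> bool:
    """Decide whether two scene vectors are close enough to need clarification."""
    distance = math.dist(vec_a, vec_b)   # assumed form of the "difference"
    closeness = math.exp(-distance)      # assumed form of the "exponential operation"
    return closeness >= threshold
```

Under these assumptions, identical vectors give a closeness of 1.0 and always trigger clarification, while vectors that are far apart give a value near 0 and do not.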
9. A multi-scene management device for voice dialogue, characterized in that the device comprises:
A first acquisition module, configured to obtain demand information input by a user from text information, wherein the text information is obtained by performing text recognition on the voice information of the user;
A second acquisition module, configured to obtain, according to the demand information, at least one score value corresponding respectively to at least one scene among the scenes;
A switching module, configured to determine, according to the at least one score value, a scene switching action to be executed, and to present the voice content corresponding to the scene after switching.
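The three modules of claim 9 form a simple pipeline. The sketch below wires them together as interchangeable callables; the class name `MultiSceneManager`, its field names, and the `handle` method are hypothetical, and speech recognition is assumed to have already produced the input text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class MultiSceneManager:
    extract_demand: Callable[[str], str]              # first acquisition module
    score_scenes: Callable[[str], Dict[str, float]]   # second acquisition module
    switch_scene: Callable[[Dict[str, float]], str]   # switching module

    def handle(self, recognized_text: str) -> str:
        """Run one dialogue turn: demand information, then scores, then the switched scene."""
        demand = self.extract_demand(recognized_text)
        scores = self.score_scenes(demand)
        return self.switch_scene(scores)
```

Keeping each module as a separate callable matches the claim's division of responsibilities and lets each stage be replaced or tested on its own.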
10. The device according to claim 9, characterized in that the second acquisition module comprises:
A first acquiring unit, configured to obtain at least one scene among the scenes according to the demand information, and to obtain at least one feature vector corresponding respectively to the at least one scene;
A second acquiring unit, configured to compute the inner product of each of the at least one feature vector with its corresponding weight vector, to obtain at least one score value.
11. The device according to claim 10, characterized in that the first acquiring unit comprises:
A scene classification subunit, configured to perform scene classification on the voice dialogue according to the demand information, to obtain at least one scene, among the scenes, to which the demand information applies;
A feature extraction subunit, configured to perform scene feature extraction on the demand information according to the at least one scene, to obtain at least one feature vector corresponding respectively to the at least one scene.
12. The device according to claim 11, characterized in that, if an out-of-scene feature is also obtained from the demand information, the device further comprises:
A third acquisition module, configured to obtain, from the at least one scene feature, a feature vector of a scene confirmation action according to the out-of-scene feature and the at least one scene feature.
13. The device according to claim 9, characterized in that the device further comprises:
A fourth acquisition module, configured to obtain target features of the at least one scene during crowd testing, and to perform multi-round voice training on the target features by means of a statistical model;
A fifth acquisition module, configured to obtain the initial value of the weight vectors when the statistical model has stochastic decision capability.
14. The device according to any one of claims 9-13, characterized in that the switching module comprises:
A sorting unit, configured to sort the at least one score value to obtain the maximum value among all score values;
A determining unit, configured to determine the scene switching action to be executed according to the scene feature corresponding to the maximum value, and to present the voice content of the scene feature corresponding to the maximum value.
15. The device according to claim 14, characterized in that the determining unit comprises:
A first response subunit, configured to respond to the demand information according to a scene feature if the maximum value corresponds to that one scene feature within the scenes;
A second response subunit, configured to clarify the demand information according to two or more feature vectors if the maximum value corresponds to the two or more feature vectors within the scenes;
A third response subunit, configured to confirm the out-of-scene feature and the scene feature within the scenes if the maximum value corresponds to the out-of-scene feature and a scene feature within the scenes.
16. The device according to claim 15, characterized in that the third response subunit comprises:
A difference obtaining subunit, configured to obtain the difference between the scene vectors corresponding to the two or more scene features;
A clarification subunit, configured to perform an exponential operation on the difference, and to determine, according to the result of the exponential operation, whether to clarify the two or more scene features.
CN201410590076.4A 2014-10-28 2014-10-28 Multi-scene management method and device for voice dialogue Active CN104464733B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410590076.4A CN104464733B (en) 2014-10-28 2014-10-28 Multi-scene management method and device for voice dialogue

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410590076.4A CN104464733B (en) 2014-10-28 2014-10-28 Multi-scene management method and device for voice dialogue

Publications (2)

Publication Number Publication Date
CN104464733A true CN104464733A (en) 2015-03-25
CN104464733B CN104464733B (en) 2019-09-20

Family

ID=52910681

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410590076.4A Active CN104464733B (en) 2014-10-28 2014-10-28 Multi-scene management method and device for voice dialogue

Country Status (1)

Country Link
CN (1) CN104464733B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060271351A1 (en) * 2005-05-31 2006-11-30 Danilo Mirkovic Dialogue management using scripts
CN101067930A (en) * 2007-06-07 2007-11-07 深圳先进技术研究院 Intelligent audio frequency identifying system and identifying method
CN103187058A (en) * 2011-12-28 2013-07-03 上海博泰悦臻电子设备制造有限公司 Speech conversational system in vehicle
CN103456301A (en) * 2012-05-28 2013-12-18 中兴通讯股份有限公司 Ambient sound based scene recognition method and device and mobile terminal
CN102831446A (en) * 2012-08-20 2012-12-19 南京邮电大学 Image appearance based loop closure detecting method in monocular vision SLAM (simultaneous localization and mapping)
CN103413549A (en) * 2013-07-31 2013-11-27 深圳创维-Rgb电子有限公司 Voice interaction method and system and interaction terminal

Cited By (75)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11671920B2 (en) 2007-04-03 2023-06-06 Apple Inc. Method and system for operating a multifunction portable electronic device using voice-activation
US11900936B2 (en) 2008-10-02 2024-02-13 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US11321116B2 (en) 2012-05-15 2022-05-03 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11862186B2 (en) 2013-02-07 2024-01-02 Apple Inc. Voice trigger for a digital assistant
US11557310B2 (en) 2013-02-07 2023-01-17 Apple Inc. Voice trigger for a digital assistant
US11699448B2 (en) 2014-05-30 2023-07-11 Apple Inc. Intelligent assistant for home automation
US11810562B2 (en) 2014-05-30 2023-11-07 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US11838579B2 (en) 2014-06-30 2023-12-05 Apple Inc. Intelligent automated assistant for TV user interactions
US11842734B2 (en) 2015-03-08 2023-12-12 Apple Inc. Virtual assistant activation
US11947873B2 (en) 2015-06-29 2024-04-02 Apple Inc. Virtual assistant for media playback
US11954405B2 (en) 2015-09-08 2024-04-09 Apple Inc. Zero latency digital assistant
US11853536B2 (en) 2015-09-08 2023-12-26 Apple Inc. Intelligent automated assistant in a media environment
US11550542B2 (en) 2015-09-08 2023-01-10 Apple Inc. Zero latency digital assistant
US11809483B2 (en) 2015-09-08 2023-11-07 Apple Inc. Intelligent automated assistant for media search and playback
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
CN108369574A (en) * 2015-09-30 2018-08-03 苹果公司 Intelligent device identification
CN108369574B (en) * 2015-09-30 2021-06-11 苹果公司 Intelligent device identification
US11809886B2 (en) 2015-11-06 2023-11-07 Apple Inc. Intelligent automated assistant in a messaging environment
CN108475404A (en) * 2016-01-25 2018-08-31 索尼公司 Communication system and communication control method
US11295736B2 (en) 2016-01-25 2022-04-05 Sony Corporation Communication system and communication control method
CN105913039A (en) * 2016-04-26 2016-08-31 北京光年无限科技有限公司 Visual-and-vocal sense based dialogue data interactive processing method and apparatus
CN105913039B (en) * 2016-04-26 2020-08-18 北京光年无限科技有限公司 Interactive processing method and device for dialogue data based on vision and voice
CN106023991A (en) * 2016-05-23 2016-10-12 丽水学院 Handheld voice interaction device and interaction method orienting to multi-task interaction
CN106023991B (en) * 2016-05-23 2019-12-03 丽水学院 Handheld voice interaction device and interaction method oriented to multi-task interaction
US11657820B2 (en) 2016-06-10 2023-05-23 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US11749275B2 (en) 2016-06-11 2023-09-05 Apple Inc. Application integration with a digital assistant
US11809783B2 (en) 2016-06-11 2023-11-07 Apple Inc. Intelligent device arbitration and control
CN106528522A (en) * 2016-08-26 2017-03-22 南京威卡尔软件有限公司 Scenarized semantic comprehension and dialogue generation method and system
CN108628908A (en) * 2017-03-24 2018-10-09 北京京东尚科信息技术有限公司 Method, device and electronic equipment for classifying user question-answer boundaries
CN108628908B (en) * 2017-03-24 2021-02-26 北京京东尚科信息技术有限公司 Method, device and electronic equipment for classifying user question-answer boundaries
CN107169034A (en) * 2017-04-19 2017-09-15 畅捷通信息技术股份有限公司 Multi-round human-computer interaction method and system
CN107169034B (en) * 2017-04-19 2020-08-04 畅捷通信息技术股份有限公司 Multi-round human-computer interaction method and system
CN107292696A (en) * 2017-04-27 2017-10-24 深圳虫门科技有限公司 Automobile intelligent purchase guiding system and implementation method
US11467802B2 (en) 2017-05-11 2022-10-11 Apple Inc. Maintaining privacy of personal information
US11862151B2 (en) 2017-05-12 2024-01-02 Apple Inc. Low-latency intelligent automated assistant
US11837237B2 (en) 2017-05-12 2023-12-05 Apple Inc. User-specific acoustic models
US11538469B2 (en) 2017-05-12 2022-12-27 Apple Inc. Low-latency intelligent automated assistant
US11580990B2 (en) 2017-05-12 2023-02-14 Apple Inc. User-specific acoustic models
CN107026940A (en) * 2017-05-18 2017-08-08 北京神州泰岳软件股份有限公司 Method and apparatus for determining session feedback information
CN107133349A (en) * 2017-05-24 2017-09-05 北京无忧创新科技有限公司 Dialogue robot system
CN107133349B (en) * 2017-05-24 2018-02-23 北京无忧创新科技有限公司 Dialogue robot system
CN109086282A (en) * 2017-06-14 2018-12-25 杭州方得智能科技有限公司 Method and system for multi-round dialogue with multi-task driving capability
CN107357855B (en) * 2017-06-29 2018-06-08 北京神州泰岳软件股份有限公司 Intelligent question answering method and device supporting scene association
CN107357855A (en) * 2017-06-29 2017-11-17 北京神州泰岳软件股份有限公司 Intelligent question answering method and device supporting scene association
CN107437143A (en) * 2017-07-26 2017-12-05 携程计算机技术(上海)有限公司 Tourism ordering method, system, equipment and storage medium based on virtual customer service
CN107644641B (en) * 2017-07-28 2021-04-13 深圳前海微众银行股份有限公司 Dialog scene recognition method, terminal and computer-readable storage medium
CN107644641A (en) * 2017-07-28 2018-01-30 深圳前海微众银行股份有限公司 Dialog scene recognition method, terminal and computer-readable storage medium
US11487364B2 (en) 2018-05-07 2022-11-01 Apple Inc. Raise to speak
US11900923B2 (en) 2018-05-07 2024-02-13 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11907436B2 (en) 2018-05-07 2024-02-20 Apple Inc. Raise to speak
US11630525B2 (en) 2018-06-01 2023-04-18 Apple Inc. Attention aware virtual assistant dismissal
US11360577B2 (en) 2018-06-01 2022-06-14 Apple Inc. Attention aware virtual assistant dismissal
CN108874967B (en) * 2018-06-07 2023-06-23 腾讯科技(深圳)有限公司 Dialogue state determining method and device, dialogue system, terminal and storage medium
CN108874967A (en) * 2018-06-07 2018-11-23 腾讯科技(深圳)有限公司 Dialogue state determining method and device, dialogue system, terminal and storage medium
CN109086419A (en) * 2018-08-07 2018-12-25 广州小鹏汽车科技有限公司 Social communication method and system based on scene and voice distribution
CN109086419B (en) * 2018-08-07 2020-11-13 广州小鹏汽车科技有限公司 Social communication method and system based on scene and voice distribution
US11893992B2 (en) 2018-09-28 2024-02-06 Apple Inc. Multi-modal inputs for voice commands
US11783815B2 (en) 2019-03-18 2023-10-10 Apple Inc. Multimodality in digital assistant systems
CN109754806A (en) * 2019-03-21 2019-05-14 问众智能信息科技(北京)有限公司 Multi-round dialogue processing method, device and terminal
US11705130B2 (en) 2019-05-06 2023-07-18 Apple Inc. Spoken notifications
US11675491B2 (en) 2019-05-06 2023-06-13 Apple Inc. User configurable task triggers
US11888791B2 (en) 2019-05-21 2024-01-30 Apple Inc. Providing message response suggestions
US11790914B2 (en) 2019-06-01 2023-10-17 Apple Inc. Methods and user interfaces for voice-based control of electronic devices
CN110837543A (en) * 2019-10-14 2020-02-25 深圳和而泰家居在线网络科技有限公司 Conversation interaction method, device and equipment
CN110880324A (en) * 2019-10-31 2020-03-13 北京大米科技有限公司 Voice data processing method and device, storage medium and electronic equipment
CN111161739A (en) * 2019-12-28 2020-05-15 科大讯飞股份有限公司 Speech recognition method and related product
CN111161739B (en) * 2019-12-28 2023-01-17 科大讯飞股份有限公司 Speech recognition method and related product
CN113450786A (en) * 2020-03-25 2021-09-28 阿里巴巴集团控股有限公司 Network model obtaining method, information processing method, device and electronic equipment
US11914848B2 (en) 2020-05-11 2024-02-27 Apple Inc. Providing relevant data items based on context
US11924254B2 (en) 2020-05-11 2024-03-05 Apple Inc. Digital assistant hardware abstraction
US11765209B2 (en) 2020-05-11 2023-09-19 Apple Inc. Digital assistant hardware abstraction
US11838734B2 (en) 2020-07-20 2023-12-05 Apple Inc. Multi-device audio adjustment coordination
US11696060B2 (en) 2020-07-21 2023-07-04 Apple Inc. User identification using headphones
US11750962B2 (en) 2020-07-21 2023-09-05 Apple Inc. User identification using headphones
CN115083412A (en) * 2022-08-11 2022-09-20 科大讯飞股份有限公司 Voice interaction method and related device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104464733B (en) 2019-09-20

Similar Documents

Publication Publication Date Title
CN104464733A (en) Multi-scene managing method and device of voice conversation
CN105487663B (en) Intention recognition method and system for intelligent robots
CN109918673A (en) Semantic referee method, device, electronic equipment and computer readable storage medium
CN110209789B (en) Multi-modal dialog system and method for guiding user attention
CN111339306A (en) Classification model training method, classification device, classification equipment and medium
CN110704641A (en) Ten-thousand-level intention classification method and device, storage medium and electronic equipment
CN110781663B (en) Training method and device of text analysis model, text analysis method and device
EP3825862A2 (en) Method and apparatus of recommending information based on fused relationship network, and device and medium
CN110245348A (en) Intention recognition method and system
KR102502985B1 (en) Method for recommending object, neural network and training method thereof, device, and medium
CN109408710A (en) Search result optimization method, device, system and storage medium
CN116108169B (en) Hot wire work order intelligent dispatching method based on knowledge graph
CN114548298B (en) Model training method, traffic information processing method, device, equipment and storage medium
CN113449084A (en) Relationship extraction method based on graph convolution
CN111368040B (en) Dialogue processing method, model training method and related equipment
CN110263232B (en) Hybrid recommendation method based on extensive learning and deep learning
CN116522905B (en) Text error correction method, apparatus, device, readable storage medium, and program product
CN117216212A (en) Dialogue processing method, dialogue model training method, device, equipment and medium
CN106844732A (en) Method for automatically acquiring session context labels that cannot be collected directly
CN112818689A (en) Entity identification method, model training method and device
CN116957128A (en) Service index prediction method, device, equipment and storage medium
CN115688758A (en) Statement intention identification method and device and storage medium
CN113312445B (en) Data processing method, model construction method, classification method and computing equipment
CN113344121B (en) Method for training a sign classification model and sign classification
CN113886543A (en) Method, apparatus, medium, and program product for generating an intent recognition model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant