WO1993021586A1 - Autonomous learning and reasoning agent - Google Patents

Autonomous learning and reasoning agent

Info

Publication number
WO1993021586A1
Authority
WO
WIPO (PCT)
Prior art keywords
case
environment
cases
response
message
Prior art date
Application number
PCT/US1993/003557
Other languages
French (fr)
Inventor
Bradley Paul Allen
Original Assignee
Inference Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inference Corporation filed Critical Inference Corporation
Publication of WO1993021586A1 publication Critical patent/WO1993021586A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models


Abstract

A software agent (101) which performs autonomous learning in a real-world environment (102), implemented in a case-based reasoning system and coupled to a sensor (109) for gathering information (104) from, and to an effector for manipulating (105), its environment. A case base (127) which is tuned in response to an evaluation of how well the agent is operating in that environment. A memory of cases, the contents of that memory being determined by a genetic technique, including producing cases which may never have been encountered in the environment, evaluating cases in response to a history of previous matches and in response to an external stimulus, and selecting a limited set of cases which provides a preferred model of the environment.

Description

DESCRIPTION
Title of the Invention
Autonomous Learning And Reasoning Agent
Cross-Reference To Related Application
This application is a continuation-in-part of copending application Serial No. 07/664,561, filed March 4, 1991 in the name of inventors Bradley P. Allen and S. Daniel Lee and titled "CASE-BASED REASONING SYSTEM", hereby incorporated by reference as if fully set forth herein.
Background of the Invention
1. Field of the Invention
This invention relates to case-based reasoning and to a case-based reasoning system which performs autonomous learning in a real-world environment.
2. Description of Related Art
While computers are capable of tremendous processing power, their ability to use that processing power for reasoning about complex problems has so far been limited. Generally, before a computer can be used to address a complex problem, such as one which requires the attention of a human expert, it has been necessary to distill the knowledge of that expert into a set of inferential rules (a "rule base") which allow an automated processor to reason in a limited field of application. While this method has been effective in some cases, it has the natural drawback that it often requires a substantial amount of time and effort, by both computer software engineers and experts in the particular field of application, to produce a useful product.
Moreover, rule-based systems of this type present a difficult programming task. Unlike more prosaic programming tasks, constructing a rule base is sometimes counterintuitive, and may be beyond the ability of many application programmers. And once a rule-based system has been constructed based on the knowledge of a human expert, it may be difficult to accommodate changes in the field of operation in which the processor must operate. Such changes might comprise advances in knowledge about the application field, additional tasks which are intended for the processor, or changes in or discoveries about the scope of the application field.
One proposed method of the prior art is to build automated reasoning systems which operate by reference to a set of exemplar cases (a "case base"), to which the facts of a particular situation (the "problem") may be matched. The processor may then perform the same action for the problem as in the exemplar case. While this proposal has been well-received, case-based systems of this type may still require a substantial amount of human effort to identify exemplar cases and present a processor with sufficient information that cases may be matched and acted upon. For example, it may be necessary to deduce or supply extensive information about a complex environment so as to determine a preferred set of exemplar cases.
A parent copending application, Serial No. 07/664,561, filed March 4, 1991, discloses inventions in which a case-based reasoning system is smoothly integrated into a rule-based reasoning system, and in which an automated reasoning system may dynamically adapt a case base to problems which it encounters. An aspect of the inventions disclosed in that application also includes a technique in which a system may be set to work with a limited case base, and may solicit human advice for treatment of new problems which are not already well-treated by the case base, thus learning how to do its job on a dynamic basis.
Another copending application, Serial No. , Lyon & Lyon Docket No. 195/018, filed the same day as this application, discloses inventions in which a machine learning system may operate in conjunction with a relational database system, and particularly in which a machine learning system may operate in conjunction with a relational database system with an SQL interface. This allows the machine learning system to use the high-speed searching power of computer systems which have been designed for use with relational database systems with an SQL interface, and allows the machine learning system to be smoothly integrated into computer systems which have relational databases, even if those databases were not designed to work with learning or reasoning systems of any kind.
It would be advantageous if an automated reasoning system could dynamically create its own case base in response to problems which it encounters, thus learning how to do its job on a dynamic basis and without substantial human intervention, or at least with only occasional human intervention. Limited intervention would allow an automated reasoning system to examine a larger set of cases and to determine a preferred set of exemplar cases without an external agent, such as a human operator, having to deduce or supply extensive information about a complex environment.
It would also be advantageous if an automated reasoning system could operate autonomously in a complex environment, possibly with external intervention such as positive or negative reinforcing stimuli. External stimuli might be in response to a result of the system's attempts to manipulate its environment, or might be provided by an external agent, such as a human operator. Accordingly, it is an object of the invention to provide an automated reasoning system which does not require intervention for every case.
Summary of the Invention
The invention provides a software agent which performs autonomous learning in a real-world environment. The autonomous agent may learn by reinforcement (including positive and negative, and delayed and sporadic, reinforcement), in addition to learning by example and learning by being told what to do. In a preferred embodiment, the autonomous agent may be implemented in a case-based reasoning system, which may be coupled to a sensor for gathering information from, and to an effector for manipulating, its environment (which may comprise a software environment, a physical environment, or some combination thereof). In addition to gathering a case base of experience in its environment, the autonomous agent may tune that case base in response to an evaluation of how well it is operating in that environment. The evaluation may be its own, or may be in response to a stimulus such as a reward or punishment. In addition to taking action on a problem based on its case base, the autonomous agent may take action to gather information so as to determine which cases are most appropriate to that problem. In a preferred embodiment, the autonomous agent may comprise a memory of cases, the contents of that memory being determined by a genetic technique for producing, evaluating and selecting cases. New cases may be produced by inspection of scenarios from the environment, by mutation of old cases, or by combining or selecting features from old cases; thus the memory may comprise cases the autonomous agent has never encountered in the environment. The stored cases may be evaluated in response to a history of previous matches and in response to an external stimulus, and evaluations (such as measures of accuracy and utility) may be associated with stored cases. The contents of the memory may be limited to a set of cases which provides a preferred model of the environment, such as those cases which have the better evaluations.
In a preferred embodiment, the autonomous agent may comprise a selection technique based on multiple factors, such as match quality, case accuracy, or case utility. The selection technique may also induce experimentation by the autonomous agent, such as by employing a random or pseudorandom effect in selecting cases. The selection technique may also distinguish between those actions which solve problems and those actions which gather further information so as to better solve problems. Multiple autonomous agents may form a collective entity and may cooperate to select an action to be performed by that collective entity.
Brief Description of the Drawings
Figure 1 shows a block diagram of an autonomous agent embedded in a complex environment.
Figure 2 shows a block diagram of a behavior module of an autonomous agent.
Figure 3 shows a data flow diagram of a genetic technique for producing, evaluating and selecting cases.
Figure 4 shows a data flow diagram of a technique for selecting cases in a memory.
Figure 5 shows a block diagram of an intelligent office equipment device including an autonomous agent.
Figure 6 shows a block diagram of a customer service system with a help-desk system including an autonomous agent.
Figure 7 shows a block diagram of a knowledge discovery system including an autonomous agent.
Appendix A shows an example software environment and autonomous agent for distinguishing between classes of irises.
Description of the Preferred Embodiment
An embodiment of this invention may be used together with inventions which are disclosed in a copending application titled "MACHINE LEARNING WITH A RELATIONAL DATABASE", application Serial No. , Lyon & Lyon Docket No. 195/018, filed the same day in the name of the same inventor, hereby incorporated by reference as if fully set forth herein.
Figure 1 shows a block diagram of an autonomous agent embedded in a complex environment. A software agent 101 may be embedded in an environment 102 so that the agent 101 may receive a stimulus 103 from the environment 102 by receiving a stimulus message 104, and may perform an action 105 which affects the environment 102 by sending an action message 106. In a preferred embodiment, the stimulus message 104 and the action message 106 may each comprise a manipulable software object 107 and may transmit that software object 107 from source to destination. Software objects 107 may comprise data elements 108 and relations to other software objects 107 as is well known in the art. Object-oriented systems are more fully described in "Object-Oriented
Design With Applications" by Grady Booch, published by Benjamin/Cummings Publishing, Redwood City, California (1991), hereby incorporated by reference as if fully set forth herein.
In a preferred embodiment, the environment 102 may comprise either a software environment or a physical environment, or some combination thereof. For example, the environment 102 may comprise a physical room; the agent 101 may receive a LIGHTS stimulus message 104 telling if a set of lights in the room is off or on; and the agent 101 may send a LIGHTS-ON action message 106 to turn on the lights. Alternatively, the environment 102 may comprise a graphic database; the agent 101 may receive a PICTURE stimulus message 104 telling about a picture in the database; and the agent 101 may send an ADD-PROPERTY action message 106 to add a property value (e.g., "author = Vermeer") to the picture.
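To make the message-passing concrete, the following Python sketch models a stimulus message and an action message as simple software objects for the lights example above; the class names and fields are assumptions made for illustration and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class StimulusMessage:
    """A manipulable software object carrying a stimulus from the environment (illustrative)."""
    kind: str                                   # e.g. "LIGHTS"
    data: Dict[str, Any] = field(default_factory=dict)

@dataclass
class ActionMessage:
    """A manipulable software object carrying an action back to the environment (illustrative)."""
    kind: str                                   # e.g. "LIGHTS-ON"
    data: Dict[str, Any] = field(default_factory=dict)

# The agent is told the room lights are off, and responds by turning them on.
stimulus = StimulusMessage(kind="LIGHTS", data={"state": "off"})
if stimulus.kind == "LIGHTS" and stimulus.data["state"] == "off":
    action = ActionMessage(kind="LIGHTS-ON")
    print(action)
```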
A set of receptors 109 may receive the stimulus message 104 and generate a features message 110. In a preferred embodiment, the receptors 109 may map a set of aspects or features 111 of the environment 102 (especially those relating to external stimuli) into a set of feature objects 112, which may comprise software objects 107 which describe the aspects or features 111 of the environment 102. For example, the environment 102 may comprise an oil refinery and the feature objects 112 may comprise temperature and pressure values at various positions in the refinery.
The set of receptors 109 may also generate a reward message 113, which may comprise a scalar-valued reinforcement 114 which measures what success the agent 101 is having with the environment 102. In a preferred embodiment, the value of the reinforcement 114 may be determined by the receptors 109, or may be externally supplied, such as from a human operator 115. For example, the environment 102 may comprise a loan portfolio database and the reinforcement 114 may measure loan performance, or the environment 102 may comprise a music audio database and the reinforcement 114 may be entered by the human operator 115 after listening to selections made by the agent 101.
A set of proprioceptors 116 may generate a motives message 117, which may comprise software objects 107 which describe goals or needs of the agent 101. In a preferred embodiment, the proprioceptors 116 may map aspects or features of the environment 102 (especially those relating to internal states of the agent 101 itself) into those goals or needs. The goals or needs of the agent 101 itself are thus a kind of feature 111 which the agent 101 may consider, and are similar to the features 111 which are reported by the receptors 109. For example, the environment 102 may comprise a piece of office equipment such as a photocopier and the motives message 117 may comprise internally generated features 111 such as the need for new ink or paper. The proprioceptors 116 may be altered by a designer so as to change the essential goals of the agent 101.
A behavior module 118 may receive the motives message 117, the features message 110 and the reward message 113, and may generate a queries message 119, which may comprise software objects 107 which describe a request 120 by the agent 101 for further information, and a commands message 121, which may comprise software objects 107 which describe a command 122 by the agent 101 to affect the environment 102. In a preferred embodiment, the behavior module 118 may comprise one like that disclosed with figure 2.
A set of effectors 123 may receive the queries message 119 and the commands message 121 and generate the action message 106. In a preferred embodiment, the action message 106 may be coupled to the environment 102 (e.g., a physical device or another software element) and may cause an effect in the environment 102. For example, the environment 102 may comprise a chess program and the action message 106 may direct the chess program to make a particular move on behalf of the agent 101. In a preferred embodiment, the software agent 101 may be implemented with an automated processor 124, which may execute a software inference engine 125 for reasoning using a set of cases 126 in a case base 127 and a set of rules 128 in a rule base 129. In a preferred embodiment, the processor 124 may comprise a system having a processor, memory comprising a stored program, memory comprising data, and input/output devices, as is well known in the art. The operation and software structures of this system are described herein in terms of their functions and at a level of detail which would be clear to those of ordinary skill in the art. It would be clear to anyone of ordinary skill in the art, after perusal of the specification, drawings and claims herein, that modification and/or programming (using known programming techniques) of a processor of known design to achieve these functions would be a straightforward task and would not require undue experimentation. In a preferred embodiment, the processor 124 may comprise an IBM-compatible PC configured to be able to execute the Microsoft Windows 3.0 and DOS 3.1 software, and having a hard disk drive, a mouse, and a VGA display. At least a 286 processor with four megabytes of memory is preferred; a 386 processor with eight megabytes of memory is more preferred. The Microsoft Windows 3.0 software is preferably executed in 386 enhanced mode.
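The components of Figure 1 can be read as a perceive-act loop: receptors turn a stimulus into features and a reward, proprioceptors supply motives, the behavior module emits queries and commands, and effectors turn those into an action. The sketch below is a minimal, hypothetical Python wiring of that loop under the assumption that each component is a plain callable; the names, signatures, and the toy lights demo are illustrative only, not the disclosed implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

Features = Dict[str, Any]          # features message 110
Motives = Dict[str, Any]           # motives message 117
Reward = float                     # scalar-valued reinforcement 114

def run_agent(
    receive_stimulus: Callable[[], Dict[str, Any]],                   # stimulus message 104
    receptors: Callable[[Dict[str, Any]], Tuple[Features, Reward]],
    proprioceptors: Callable[[], Motives],
    behavior: Callable[[Motives, Features, Reward], Tuple[List[str], List[str]]],
    effectors: Callable[[List[str], List[str]], Dict[str, Any]],      # action message 106
    send_action: Callable[[Dict[str, Any]], None],
    steps: int = 10,
) -> None:
    """One possible perceive-act loop for the agent of Figure 1 (illustrative only)."""
    for _ in range(steps):
        stimulus = receive_stimulus()
        features, reward = receptors(stimulus)       # features message + reward message
        motives = proprioceptors()                   # internal goals or needs
        queries, commands = behavior(motives, features, reward)
        action = effectors(queries, commands)        # queries/commands become an action message
        send_action(action)

# A toy wiring: the lights are always reported off, and the agent always turns them on.
run_agent(
    receive_stimulus=lambda: {"kind": "LIGHTS", "state": "off"},
    receptors=lambda s: ({"lights": s["state"]}, 0.0),
    proprioceptors=lambda: {"goal": "room lit"},
    behavior=lambda m, f, r: ([], ["LIGHTS-ON"] if f["lights"] == "off" else []),
    effectors=lambda q, c: {"commands": c, "queries": q},
    send_action=print,
    steps=1,
)
```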
Figure 2 shows a block diagram of a behavior module of an autonomous agent.
A memory module 201 may receive the motives message 117, the features message 110 and the reward message 113, and may generate a set of new cases 202, a set of case update messages 203, and a set of case retrieval messages 204. The new cases 202 may be transmitted to a case database 205 for storage, and to a case index 206 for indexing. In a preferred embodiment, aspects of a case-based reasoning system like that disclosed in parent copending application, Serial No. 07/664,561, filed March 4, 1991, consistent with the functions disclosed herein, may be used for case indexing and matching. The case database 205 may store the actual cases 126; the case index 206 may comprise a mapping from the features 111 to the cases 126 themselves. In a preferred embodiment, the case database 205 may comprise a limited number of stored cases 126, the contents of the case database 205 being determined by a genetic technique for producing, evaluating and selecting the cases 126, such as a genetic technique like that disclosed with figure 3. Thus, while the case database 205 might store less than all of the cases 126 which the autonomous agent 101 has encountered, it might maintain a set of cases 126 which provides a preferred model of the environment, such as those cases 126 which allow the agent 101 to distinguish variant problem scenarios and to act autonomously and intelligently in those variant problem scenarios. The case index 206 may comprise an index of the cases stored in the case database 205 (e.g., a set of case identifiers 207), organized so that cases 126 may be matched and retrieved, and may respond to the case retrieval message 204 by providing a matches message 208. The matches message 208 may comprise the case identifiers 207, and other information which a selector 209 may require.
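One way the case index 206 might be realized is as a mapping from attribute-value features to sets of case identifiers 207, so that a case retrieval message can be answered with a matches message counting shared features. The Python sketch below rests on that assumption; it is an illustration only, not the indexing and matching method of the parent application.

```python
from collections import defaultdict
from typing import Dict, Hashable, Set, Tuple

Feature = Tuple[str, Hashable]     # an attribute-value pair, e.g. ("lights", "off")

class CaseIndex:
    """Maps features 111 to the identifiers of cases that mention them (case index 206, assumed layout)."""

    def __init__(self) -> None:
        self._by_feature: Dict[Feature, Set[int]] = defaultdict(set)

    def add_case(self, case_id: int, features: Dict[str, Hashable]) -> None:
        for pair in features.items():
            self._by_feature[pair].add(case_id)

    def retrieve(self, features: Dict[str, Hashable]) -> Dict[int, int]:
        """Return a matches message: case identifier -> number of matching features."""
        hits: Dict[int, int] = defaultdict(int)
        for pair in features.items():
            for case_id in self._by_feature.get(pair, ()):
                hits[case_id] += 1
        return dict(hits)

index = CaseIndex()
index.add_case(1, {"lights": "off", "door": "closed"})
index.add_case(2, {"lights": "on"})
print(index.retrieve({"lights": "off", "door": "closed"}))   # {1: 2}
```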
The selector 209 may receive the matches message 208 and may also receive a cases message 210 from the case database 205, and may generate the queries message 119 and the commands message 121. In a preferred embodiment, the selector 209 may employ a technique for selecting cases like that disclosed with figure 4. The cases message 210 may comprise data from the cases 126, including evaluation data such as accuracy and utility values.
Figure 3 shows a data flow diagram of a genetic technique for producing, evaluating and selecting cases. The case database 205 may comprise its current set of cases 126, each of which may comprise a features element 301, which may generally indicate when the case 126 may be useful, and an action element 302, which may indicate the action 105 to take and an evaluation 303 of that action 105. In a preferred embodiment, the evaluation 303 may comprise an accuracy value 304, which may indicate how "good" the case 126 generally is when used, and a utility value 305, which may indicate how "often" the case 126 generally is usable. For example, if the environment 102 is a carpenter's bench, the utility value 305 of a hammer may be high, because it is often usable, even though its accuracy value 304 indicates that, even when the hammer is usable, it is not always the best choice. Similarly, the utility value 305 of a plane may be low, because it is only used in specialized situations, even though its accuracy value 304 indicates that whenever it is usable, it is the correct tool to choose. An evaluation module 306 may receive the reward message 113 and a history message 307 (indicating the history of matches) , and may operate on the cases 126 to adjust their evaluations 303, particularly their accuracy values 304 and their utility values 305. The evaluation module 306 may respond to the reward message 113 by altering the utility values 305 of the cases 126 to "reinforce" those cases 126 which correspond to the action which resulted in the reinforcement 114. Thus, rewards are "credited" to the cases 126 which earned them. Moreover, the evaluation module 306 may also alter the utility values 305 of those cases 126 which correspond to the action just previous to the reinforced action as well. Thus, rewards are also credited to the cases 126 which "led up to" them.
In a preferred embodiment, the evaluation module 306 may alter the utility value 305 of each case 126 by adding the reinforcement 114 and the utility value 305 of the case 126 which is the "best match" for the next action. Thus:
utility value (time t) = reinforcement + utility value (time t+1)
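Read as an update rule, this credits the case matched at time t with the current reinforcement plus the utility of the best-matching case at time t+1, so reward also flows back to the cases which "led up to" it. A minimal Python sketch of that rule follows, assuming a simple case record with accuracy and utility fields; the data layout and the in-place assignment are assumptions for illustration rather than the disclosed implementation.

```python
from dataclasses import dataclass

@dataclass
class Case:
    features: dict
    action: str
    accuracy: float = 0.5   # how "good" the case tends to be when it is used
    utility: float = 0.0    # how "often" / how profitably the case tends to be usable

def credit_reward(matched_at_t: Case, best_match_at_t1: Case, reinforcement: float) -> None:
    """utility(t) = reinforcement + utility(t+1), applied to the case matched at time t."""
    matched_at_t.utility = reinforcement + best_match_at_t1.utility

previous = Case({"lights": "off"}, "LIGHTS-ON")
current = Case({"lights": "on"}, "WAIT", utility=0.8)
credit_reward(previous, current, reinforcement=1.0)
print(previous.utility)   # 1.8
```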
A reproduction module 308 may operate on the cases 126 to adjust their features elements 301 or the action 105 in their action elements 302, by one or more of several techniques. For example, the reproduction module 308 may create and delete cases 126. The reproduction module 308 may create cases 126, for example by inserting new cases 126 into the case database 205 as they are encountered. One method for such instance-based learning is disclosed in parent copending application, Serial No. 07/664,561, filed March 4, 1991. The reproduction module 308 may delete cases 126, for example when their accuracy values 304 or their utility values 305 fall below a threshold.
The reproduction module 308 may also make new cases 126 from the old cases 126 in the case database 205. The reproduction module 308 may mutate cases 126, for example by altering one or more features 111 in one case 126 in the case database 205. The reproduction module 308 may also cross-pollinate cases 126, for example by selecting some features 111 from one old case 126 and some features 111 from another old case 126 to create one or more new cases 126.
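The reproduction step can be pictured as a handful of genetic operators over stored cases: pruning poorly evaluated cases, mutating one feature of an old case, and cross-pollinating features from two parents. The Python sketch below, borrowing the carpenter's-bench flavor of the earlier example, is a hedged illustration; the specific operators, the pruning threshold, and the reset of a new case's evaluation are assumptions, not the patented technique.

```python
import copy
import random
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Case:
    features: Dict[str, str]
    action: str
    accuracy: float = 0.5
    utility: float = 0.0

def prune(cases: List[Case], threshold: float = 0.1) -> List[Case]:
    """Delete cases whose accuracy or utility has fallen below a threshold."""
    return [c for c in cases if c.accuracy >= threshold and c.utility >= threshold]

def mutate(case: Case, feature_values: Dict[str, List[str]]) -> Case:
    """Create a new case by altering one feature of an old case."""
    child = copy.deepcopy(case)
    attribute = random.choice(list(child.features))
    child.features[attribute] = random.choice(feature_values[attribute])
    child.accuracy, child.utility = 0.5, 0.0     # the new case must earn its own evaluation
    return child

def cross_pollinate(a: Case, b: Case) -> Case:
    """Create a new case by taking some features from each of two old cases."""
    features = {attr: random.choice((a.features.get(attr), b.features.get(attr)))
                for attr in set(a.features) | set(b.features)}
    features = {attr: value for attr, value in features.items() if value is not None}
    return Case(features=features, action=random.choice((a.action, b.action)))

hammer = Case({"task": "drive nail", "surface": "wood"}, "USE-HAMMER", accuracy=0.7, utility=0.9)
plane = Case({"task": "smooth edge", "surface": "wood"}, "USE-PLANE", accuracy=0.95, utility=0.2)
print(mutate(hammer, {"task": ["drive nail", "smooth edge"], "surface": ["wood", "metal"]}))
print(cross_pollinate(hammer, plane))
```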
The operation of the evaluation module 306 and the reproduction module 308 serves to continually review and update the selection of cases 126 in the case database 205, so that the case database 205 is continually altered into a progressively better set of cases 126. The genetic technique is influenced by the reward message 113, which provides a form of evolutionary pressure towards selection of those sets of cases 126 in the case database 205 which maximize rewards.
Figure 4 shows a data flow diagram of a technique for selecting cases in a memory.
A matching module 401 may review the cases 126 in the case database 205 and may generate a match table 402. In a preferred embodiment, a technique for selecting cases 126 may employ techniques for matching features 111 such as attribute-value pairs, and for generating the match table 402, like those disclosed in parent copending application, Serial No. 07/664,561, filed March 4, 1991, consistent with the functions disclosed herein. The match table 402 may comprise a set of cases 126 (or indices of cases 126), each having a match quality value 403, the accuracy value 304 of that case 126, and the utility value 305 of that case 126.
A randomization module 404 may choose one of the cases 126 based on a random or pseudorandom effect. As used herein, "random" effects include pseudorandom effects and related methods which may be used to achieve similar results. In a preferred embodiment, the randomization module 404 may choose one of the cases 126 in the match table 402 with a probability of choice for each case 126 which is linearly proportional to its accuracy value 304. The cases 126 in the match table 402 are therefore chosen for matching based on their match quality values 403, but they are selected for action based on their accuracy values 304. This allows the technique for selection to encourage experimentation by the agent 101.
It would be clear to one of ordinary skill in the art, after perusal of the specification, drawings and claims herein, that the randomization module 404 may employ other and further techniques for choosing one of the cases 126 in the match table 402. For example, another measure associated with the cases 126 (such as their utility value 305), or a combination of values associated with the cases 126 might be employed in place of the accuracy value 304. Moreover, the probability of choice may be other than linearly proportional to the accuracy value 304. It would be clear to one of ordinary skill in the art that such other and further techniques would be workable, and are within the scope and spirit of the invention.
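Taking Figures 3 and 4 together, matching fills a table of candidate cases scored by match quality, and the case that acts is then drawn at random with probability linearly proportional to its accuracy value, which is what encourages experimentation. The Python sketch below assumes a particular match-quality measure (the fraction of the problem's attribute-value pairs a case shares) and uses roulette-wheel selection on accuracy; as the text notes, other measures and other probability shapes would work equally well.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Case:
    features: Dict[str, str]
    action: str
    accuracy: float
    utility: float

def match_table(problem: Dict[str, str], cases: List[Case], top_n: int = 5) -> List[Tuple[float, Case]]:
    """Score every case by the fraction of the problem's attribute-value pairs it shares (assumed measure)."""
    scored = []
    for case in cases:
        shared = sum(1 for attr, value in problem.items() if case.features.get(attr) == value)
        quality = shared / max(len(problem), 1)
        scored.append((quality, case))
    scored.sort(key=lambda entry: entry[0], reverse=True)
    return scored[:top_n]                      # (match quality, case) rows of the match table

def select_for_action(table: List[Tuple[float, Case]]) -> Case:
    """Choose among the matched cases with probability linearly proportional to accuracy."""
    cases = [case for _, case in table]
    weights = [case.accuracy for case in cases]
    return random.choices(cases, weights=weights, k=1)[0]

cases = [
    Case({"lights": "off", "door": "closed"}, "LIGHTS-ON", accuracy=0.9, utility=0.6),
    Case({"lights": "off", "door": "open"}, "CLOSE-DOOR", accuracy=0.4, utility=0.3),
    Case({"lights": "on", "door": "open"}, "WAIT", accuracy=0.7, utility=0.1),
]
table = match_table({"lights": "off", "door": "closed"}, cases)
print(select_for_action(table).action)
```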
Figure 5 shows a block diagram of an intelligent office equipment device including an autonomous agent.
A device 501, such as a photocopier or printer, may be coupled to a device driver interface 502, which may be coupled to a control processor 503 including the autonomous agent 101. The device 501 may of course be one of many different devices, such as a fax machine, a modem, or a telephone.
The device driver interface 502 provides the stimulus message 104 to the control processor 503, which delivers the stimulus message 104 to the autonomous agent 101. The autonomous agent 101 generates the action message 106, which the control processor 503 delivers to the device driver interface 502. The motives message 117 may reflect a set of goals, such as eliminating or reducing the need for user involvement in maintenance of the device 501, and anticipating failures of the device 501 so that preventative maintenance may be performed so as to reduce failures and to increase mean-time-between-failure. The features message 110 may comprise sensor readings from the device 501, and commands from users and dealers. The action message 106 may comprise orders to the device driver interface 502 to alter settings on the device 501, and may also comprise messages to users and dealers such as diagnostic and repair messages. The reward message 113 may comprise a negative reinforcement 114 whenever the device 501 fails.
Figure 6 shows a block diagram of a customer service system with a help-desk system including an autonomous agent.
A user 601, such as a human being, may access the system by means of a telephone connection 602. A help-desk system 603 may attempt to respond to inquiries and resolve problems, with the aid of the autonomous agent 101. The help-desk system 603 provides the stimulus message 104 to the autonomous agent 101. The autonomous agent 101 generates the action message 106 for the help-desk system 603.
The help-desk system 603, its purpose and operation, may be like that shown in parent copending application, Serial No. 07/664,561, filed March 4, 1991. It may receive touch-tone commands from the user 601 and may generate voice messages by means of voice response units; receiving touch-tone commands and generating voice messages are well known in the art. The motives message 117 may reflect a set of goals, such as responding to user inquiries with appropriate information and properly resolving user problems, so as to provide an intelligent help-desk system 603 which focusses on the user problem and responds to that problem, and which adapts to user behavior. The features message 110 may comprise touch-tone commands from the user 601, data indicating voice response unit messages given to the user 601, and state information about the telephone connection 602. The action message 106 may comprise commands to the help-desk system 603 or commands to a telephone system 604 controlling the telephone connection 602 (e.g., to transfer the telephone connection 602 to a human operator).
The help-desk system 603 may consider that it has resolved the user problem when it reaches the end of its response or when it has transferred the telephone connection 602 to a human operator. The reward message 113 may comprise a positive reinforcement 114 when the help-desk system 603 resolves the user problem, and a negative reinforcement 114 whenever the telephone connection 602 is broken by the user 601 prior to that.
Figure 7 shows a block diagram of a knowledge discovery system including an autonomous agent.
A knowledge database 701 may be coupled to a database interface 702, which may be coupled to a knowledge processor 703 including the autonomous agent 101. The knowledge database 701 may of course be one of many different types of databases, such as an object-oriented database or a relational database.
The database interface 702 provides the stimulus message 104 to the knowledge processor 703, which delivers the stimulus message 104 to the autonomous agent 101. The autonomous agent 101 generates the action message 106, which the knowledge processor 703 delivers to the database interface 702.
The motives message 117 may reflect a set of goals, such as discovering useful knowledge or patterns in the data in the knowledge database 701, so as to allow the autonomous agent 101 to generalize from data and to capture useful statements about regularities in the knowledge database 701. The features message 110 may comprise statements about data in the knowledge database 701, such as schema definitions or database tables. The action message 106 may comprise orders to the database interface 702 to alter or interrogate the knowledge database 701, such as database queries and transactions. The reward message 113 may comprise a positive reinforcement 114 whenever the autonomous agent 101 discovers "useful" information, for some predetermined definition of "useful" (e.g., an approval message from a human operator) .
Appendix A (pages - ) shows an example software environment and autonomous agent for distinguishing between classes of irises. The example software environment comprises a "trainer" for the autonomous agent 101 which determines when the autonomous agent 101 has made a correct determination and provides an appropriate reward message 113. Some exemplary data statements are also included.
Alternative Embodiments
While preferred embodiments are disclosed herein, many variations are possible which remain within the concept and scope of the invention, and these variations would become clear to one of ordinary skill in the art after perusal of the specification, drawings and claims herein.

Claims
1. An autonomous learning and reasoning software agent.
2. A software agent as in claim 1, comprising means for responding to reinforcement from an environment.
3. A software agent as in claim 1, comprising a case base of experience in an environment.
4. A software agent as in claim 1, comprising a case base of experience in an environment which has been altered in response to an evaluation of said experience.
5. A software agent as in claim 4, wherein said evaluation may comprise a self-generated evaluation or a reinforcement from said environment.
6. An autonomous software agent, comprising a sensor for gathering information from an environment; an effector for manipulating said environment; and means for performing autonomous learning in response to reinforcement from said environment.
7. An autonomous software agent as in claim 6, wherein said environment comprises a software environment or a physical environment.
8. An autonomous software agent as in claim 6, wherein said environment comprises a carpenter's bench, a chess program, a customer service system, a graphic database, a help-desk system, a knowledge discovery system, a loan portfolio database, a music audio database, an oil refinery, a piece of office equipment, or a physical room.
9. An autonomous software agent as in claim 6, wherein said agent is implemented in a case-based reasoning system.
10. An autonomous software agent for operating in an environment, comprising a sensor for gathering information from said environment; an effector for manipulating said environment; a case base having a plurality of exemplar cases; an inference engine capable of performing case-based reasoning steps on said cases; and means for altering said case base in response to reinforcement from said environment.
11. An autonomous software agent as in claim 10, wherein said reinforcement comprises an evaluation of said agent.
12. An autonomous software agent for operating in an environment, comprising a sensor for gathering information from said environment; an effector for manipulating said environment; a case base having a plurality of exemplar cases, said case base having a memory of cases, at least part of said memory being determined by a genetic technique; an inference engine capable of performing case-based reasoning steps on said cases; and means for altering said case base in response to reinforcement from said environment.
13. An autonomous software agent as in claim 12, wherein said reinforcement comprises an evaluation of said agent.
14. An autonomous software agent as in claim 12, wherein said memory has a predetermined maximum size.
15. An autonomous software agent as in claim 12, wherein said memory comprises a case which has never been encountered in said environment.
16. An autonomous software agent as in claim 12, wherein said genetic technique comprises the steps of generating a case which has not been encountered in said environment; evaluating a case in response to (a) a set of matching cases, or (b) said reinforcement; and selecting a limited set of cases which provides a preferred model of said environment.
17. An autonomous software agent as in claim 12, comprising means for selecting a case for case-based reasoning based on a plurality of measures.
18. An autonomous software agent as in claim 17, wherein said plurality of measures comprises match quality, case accuracy, or case utility.
19. An autonomous software agent as in claim 12, comprising means for selecting a case for case-based reasoning which may induce experimentation by the autonomous agent.
20. An autonomous software agent as in claim 19, wherein said means for selecting a case comprises means for generating a random or pseudorandom effect.
21. An autonomous software agent as in claim 19, wherein said means for selecting a case comprises means for inducing experimentation by said agent.
22. An autonomous software agent as in claim 19, wherein said means for selecting a case comprises means for selecting an action for the purpose of solving a problem and means for selecting an action for the purpose of gathering further information so as to better solve problems.
23. An autonomous software agent as in claim 12, comprising means for cooperating with a second autonomous software agent in selecting an action to be performed.
24. An autonomous software agent for operating in an environment, comprising a receptor coupled to said environment and generating a features message; an inference engine coupled to said receptor, said inference engine comprising a case base, means for selecting a set of matching cases from said case base, means for altering said case base by means of a genetic technique in response to said features message and in response to a reinforcement from said environment, and means for generating a commands message in response to said set of matching cases; and an effector coupled to said environment and operating on said environment in response to said commands message.
25. An autonomous software agent as in claim 24, wherein said means for altering comprises means for generating a case which has not been encountered in said environment; means for evaluating a case in response to (a) a set of matching cases, or (b) said reinforcement; and means for selecting a limited set of cases which provides a preferred model of said environment.
26. An autonomous software agent as in claim 24, wherein said means for selecting a set of matching cases from said case base comprises a random effect or a pseudorandom effect.
27. A case-based reasoning system, comprising a case base having a plurality of exemplar cases; and an inference engine capable of performing case-based reasoning steps on said cases; said case base having been constructed substantially by a genetic technique.
28. A case-based reasoning system as in claim 27, wherein said system is capable of operating in an environment, and wherein said genetic technique comprises the steps of generating a case which has not been encountered in said environment; evaluating a case in response to (a) a set of matching cases, or (b) a reinforcement received from said environment; and selecting a limited set of cases which provides a preferred model of said environment.
29. A case-based reasoning system as in claim 28, wherein said step of generating comprises the steps of inspecting a scenario encountered in said environment; altering a case in said case base to form a new case; or combining a first aspect of a first case in said case base with a second aspect of a second case in said case base to form a new case.
30. A case-based reasoning system, comprising a case base having a plurality of exemplar cases; and an inference engine capable of performing case-based reasoning steps on said cases; said case base including a case which has not been encountered in an environment and which has not been entered from an external interface.
31. A case-based reasoning system, comprising a case base having a plurality of exemplar cases; and an inference engine capable of performing case-based reasoning steps on said cases; said case base comprising a predetermined maximum number of cases and having been constructed substantially by a genetic technique.
32. A case-based reasoning system as in claim 31, comprising means for selecting a set of matching cases from said case base having a random effect or a pseudorandom effect.
33. A case-based reasoning system as in claim 32, wherein said means for selecting a set of matching cases from said case base applies a random effect or a pseudorandom effect to a measure of match quality, case accuracy, or case utility.
34. A case-based reasoning system as in claim 32, wherein said means for selecting a set of matching cases from said case base is more likely to select a case with a greater measure of match quality, case accuracy, or case utility.
35. A case-based reasoning system as in claim 32, wherein said means for selecting a set of matching cases from said case base has a likelihood for selecting a first case over a second case in linear proportion to a ratio of a measure of case accuracy of said first case over said second case.
36. A method of operating a software agent, comprising the step of performing autonomous learning.
37. A method as in claim 36, comprising the step of collecting a case base of experience in an environment.
38. A method as in claim 37, comprising the step of altering said case base in response to an evaluation of said experience.
39. A method as in claim 36, comprising the step of responding to reinforcement from an environment.
40. A method of operating an autonomous software agent in an environment, comprising the steps of generating a features message in response to said environment; altering a case base of experience in said environment by means of a genetic technique in response to said features message and in response to a reinforcement from said environment; selecting a set of matching cases from said case base; generating a commands message in response to said set of matching cases; and operating on said environment in response to said commands message.
41. A method as in claim 40, wherein said step of altering comprises generating a case which has not been encountered in said environment; evaluating a case in response to (a) a set of matching cases, or (b) said reinforcement; and selecting a limited set of cases which provides a preferred model of said environment.
42. A method of operating a case-based reasoning system, comprising the steps of constructing a case base having a plurality of exemplar cases substantially by a genetic technique; and performing case-based reasoning steps on said cases.
43. A method as in claim 42, wherein said genetic technique comprises the steps of generating a case which has not been encountered in an environment; evaluating a case in response to (a) a set of matching cases, or (b) a reinforcement received from said environment; and selecting a limited set of cases which provides a preferred model of said environment.
44. A method as in claim 43, wherein said step of generating comprises the steps of inspecting a scenario encountered in said environment; altering a case in said case base to form a new case; or combining a first aspect of a first case in said case base with a second aspect of a second case in said case base to form a new case.
45. In a case-based reasoning system, a method of altering a case base having a plurality of exemplar cases; said method comprising generating a case which has not been encountered in said environment; evaluating a case in response to (a) a set of matching cases, or (b) a reinforcement received from said environment; and selecting a limited set of cases which provides a preferred model of said environment.
46. In a case-based reasoning system, a case base which has been selected by a genetic technique.
47. A case base as in claim 46, said genetic technique comprising generating a case which has not been encountered in said environment; evaluating a case in response to (a) a set of matching cases, or (b) a reinforcement received from said environment; and selecting a limited set of cases which provides a preferred model of said environment.
48. In a case-based reasoning system, a case base having a plurality of exemplar cases, substantially all said cases comprising a set of matchable features; an action to be taken when said case is selected; and a measure of value for said case.
49. A case base as in claim 48, wherein said measure of value comprises match quality, case accuracy, or case utility.
50. A case base as in claim 48, wherein substantially all said cases comprise a second measure of value for said case.
51. A system for operating in an environment, comprising a receptor coupled to said environment and generating a features message; a first inference engine coupled to said features message and a reinforcement from said environment, for generating a commands message; a second inference engine coupled to said features message and said reinforcement from said environment, for altering said commands message; and an effector coupled to said environment and operating on said environment in response to said commands message.
52. A system for operating in an environment, comprising a receptor coupled to said environment and generating a features message; a first inference engine comprising a first case base, first means for selecting a first set of matching cases from said first case base, means for altering said first case base by means of a genetic technique in response to said features message and in response to a reinforcement from said environment, and means for generating a commands message in response to said first set of matching cases; a second inference engine comprising a second case base, second means for selecting a set of matching cases from said second case base, means for altering said second case base by means of a genetic technique in response to said commands message, in response to said features message and in response to said reinforcement from said environment, and means for altering said commands message in response to said second set of matching cases; and an effector coupled to said environment and operating on said environment in response to said commands message.
53. A method for operating in an environment, comprising generating a features message in response to said environment, said features message having a set of features; first matching said set of features to a first case base; generating a commands message in response to said first step of matching and a reinforcement from said environment; second matching said set of features to a second case base; altering said commands message in response to said second step of matching and said reinforcement from said environment; and operating on said environment in response to said commands message.
54. A method for operating in an environment, comprising generating a features message in response to said environment; selecting a first set of matching cases from a first case base; altering said first case base by means of a genetic technique in response to said features message and in response to a reinforcement from said environment; generating a commands message in response to said first set of matching cases; selecting a second set of matching cases from a second case base; altering said second case base by means of a genetic technique in response to said commands message, in response to said features message and in response to said reinforcement from said environment; altering said commands message in response to said second set of matching cases; and operating on said environment in response to said commands message.
PCT/US1993/003557 1992-04-15 1993-04-14 Autonomous learning and reasoning agent WO1993021586A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US86992692A 1992-04-15 1992-04-15
US07/869,926 1992-04-15

Publications (1)

Publication Number Publication Date
WO1993021586A1 true WO1993021586A1 (en) 1993-10-28

Family

ID=25354463

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1993/003557 WO1993021586A1 (en) 1992-04-15 1993-04-14 Autonomous learning and reasoning agent

Country Status (2)

Country Link
AU (1) AU4286893A (en)
WO (1) WO1993021586A1 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5832467A (en) * 1995-09-27 1998-11-03 U.S. Philips Corporation Behavior prediction for rule-based data processing apparatus
EP0911741A1 (en) * 1997-08-29 1999-04-28 Sony France S.A. Hardware or software architecture adapted to develop conditioned reflexes
EP0976039A2 (en) * 1997-03-21 2000-02-02 International Business Machines Corporation Apparatus and method for communicating between an intelligent agent and client computer process using disguised messages
EP1010051A2 (en) * 1997-03-21 2000-06-21 International Business Machines Corporation Apparatus and method for optimizing the performance of computer tasks using intelligent agent with multiple program modules having varied degrees of domain knowledge
US6192354B1 (en) 1997-03-21 2001-02-20 International Business Machines Corporation Apparatus and method for optimizing the performance of computer tasks using multiple intelligent agents having varied degrees of domain knowledge
US6401080B1 (en) 1997-03-21 2002-06-04 International Business Machines Corporation Intelligent agent with negotiation capability and method of negotiation therewith
WO2003067494A1 (en) * 2000-12-01 2003-08-14 Neal Solomon Demand-initiated intelligent negotiation agents in a distributed system
US7246315B1 (en) 2000-05-10 2007-07-17 Realtime Drama, Inc. Interactive personal narrative agent system and method
US10606536B2 (en) 2018-08-17 2020-03-31 Bank Of America Corporation Intelligent systematic physical document fulfillment system
US11025641B2 (en) 2018-08-21 2021-06-01 Bank Of America Corporation System for optimizing access control for server privilege
US11087323B2 (en) 2018-08-21 2021-08-10 Bank Of America Corporation Exposure based secure access system
US11361330B2 (en) 2018-08-22 2022-06-14 Bank Of America Corporation Pattern analytics system for document presentment and fulfillment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5136686A (en) * 1990-03-28 1992-08-04 Koza John R Non-linear genetic algorithms for solving problems by finding a fit composition of functions
US5224206A (en) * 1989-12-01 1993-06-29 Digital Equipment Corporation System and method for retrieving justifiably relevant cases from a case library

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5224206A (en) * 1989-12-01 1993-06-29 Digital Equipment Corporation System and method for retrieving justifiably relevant cases from a case library
US5136686A (en) * 1990-03-28 1992-08-04 Koza John R Non-linear genetic algorithms for solving problems by finding a fit composition of functions

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
PROCEEDINGS OF THE WORKSHOP ON CASE-BASED REASONING; 10-13 May 1988; PHYLLIS KOTON; "Reasoning About Evidence in Causal Explanations", pages 260-270. *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5832467A (en) * 1995-09-27 1998-11-03 U.S. Philips Corporation Behavior prediction for rule-based data processing apparatus
EP0976039A2 (en) * 1997-03-21 2000-02-02 International Business Machines Corporation Apparatus and method for communicating between an intelligent agent and client computer process using disguised messages
EP1010051A2 (en) * 1997-03-21 2000-06-21 International Business Machines Corporation Apparatus and method for optimizing the performance of computer tasks using intelligent agent with multiple program modules having varied degrees of domain knowledge
US6085178A (en) * 1997-03-21 2000-07-04 International Business Machines Corporation Apparatus and method for communicating between an intelligent agent and client computer process using disguised messages
US6192354B1 (en) 1997-03-21 2001-02-20 International Business Machines Corporation Apparatus and method for optimizing the performance of computer tasks using multiple intelligent agents having varied degrees of domain knowledge
US6401080B1 (en) 1997-03-21 2002-06-04 International Business Machines Corporation Intelligent agent with negotiation capability and method of negotiation therewith
US7908225B1 (en) 1997-03-21 2011-03-15 International Business Machines Corporation Intelligent agent with negotiation capability and method of negotiation therewith
US7386522B1 (en) 1997-03-21 2008-06-10 International Business Machines Corporation Optimizing the performance of computer tasks using intelligent agent with multiple program modules having varied degrees of domain knowledge
EP0976039A4 (en) * 1997-03-21 2005-07-20 Ibm Apparatus and method for communicating between an intelligent agent and client computer process using disguised messages
EP1010051A4 (en) * 1997-03-21 2006-02-01 Ibm Apparatus and method for optimizing the performance of computer tasks using intelligent agent with multiple program modules having varied degrees of domain knowledge
EP0911741A1 (en) * 1997-08-29 1999-04-28 Sony France S.A. Hardware or software architecture adapted to develop conditioned reflexes
US6490570B1 (en) 1997-08-29 2002-12-03 Sony Corporation Hardware or software architecture adapted to develop conditioned reflexes
US7246315B1 (en) 2000-05-10 2007-07-17 Realtime Drama, Inc. Interactive personal narrative agent system and method
GB2390194A (en) * 2000-12-01 2003-12-31 Neal Solomon Demand-initiated intelligent negotiation agents in a distributed system
WO2003067494A1 (en) * 2000-12-01 2003-08-14 Neal Solomon Demand-initiated intelligent negotiation agents in a distributed system
US10606536B2 (en) 2018-08-17 2020-03-31 Bank Of America Corporation Intelligent systematic physical document fulfillment system
US11025641B2 (en) 2018-08-21 2021-06-01 Bank Of America Corporation System for optimizing access control for server privilege
US11087323B2 (en) 2018-08-21 2021-08-10 Bank Of America Corporation Exposure based secure access system
US11361330B2 (en) 2018-08-22 2022-06-14 Bank Of America Corporation Pattern analytics system for document presentment and fulfillment

Also Published As

Publication number Publication date
AU4286893A (en) 1993-11-18

Similar Documents

Publication Publication Date Title
US5586218A (en) Autonomous learning and reasoning agent
US7437703B2 (en) Enterprise multi-agent software system with services able to call multiple engines and scheduling capability
Marcus Automating knowledge acquisition for expert systems
Green Theorem proving by resolution as a basis for question-answering systems
US5581664A (en) Case-based reasoning system
Moukas Amalthaea information discovery and filtering using a multiagent evolving ecosystem
Martín‐Bautista et al. A fuzzy genetic algorithm approach to an adaptive information retrieval agent
CN100375083C (en) Automatic establishing of system of context information providing configuration
Gargano et al. Data Mining‐a powerful information creating tool
EP1643325B1 (en) Directory structure in distributed data driven architecture environment
WO1993021586A1 (en) Autonomous learning and reasoning agent
EP0205873A2 (en) Method for processing an expert system rulebase segmented into contextual units
Fan et al. Adaptive agents for information gathering from multiple, distributed information sources
US20070112609A1 (en) Methods and apparatus to incorporate user feedback during planning
US20070011125A1 (en) Inference machine
Coury et al. Supervisory control and the design of intelligent user interfaces
AU6661998A (en) Adaptive object-oriented optimization software system
Michie Problem decomposition and the learning of skills
Mozaffari et al. Feedback control loop design for workload change detection in self-tuning NoSQL wide column stores
Meystel et al. The challenge of intelligent systems
Liu et al. An agent for intelligent model management
Jelassi MCDM: From ‘Stand-Alone’Methods to Integrated and Intelligent DSS
Landrin-Schweitzer et al. Introducing lateral thinking in search engines
Greenwood et al. Separating the art and science of simulation optimization: a knowledge-based architecture providing for machine learning
Stotts et al. New developments in fuzzy logic computers

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AT AU BB BG BR CA CH CZ DE DK ES FI GB HU JP KP KR KZ LK LU MG MN MW NL NO NZ PL PT RO RU SD SE SK UA VN

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase