US8108188B2 - Enumerated linear programming for optimal strategies - Google Patents

Enumerated linear programming for optimal strategies

Info

Publication number
US8108188B2
Authority
US
United States
Prior art keywords
leader
follower
mixed
strategy
given
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US12/261,616
Other versions
US20100114541A1 (en
Inventor
Daniel P. Johnson
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honeywell International Inc
Original Assignee
Honeywell International Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honeywell International Inc
Priority to US12/261,616 (US8108188B2)
Assigned to HONEYWELL INTERNATIONAL INC. Assignment of assignors interest (see document for details). Assignor: JOHNSON, DANIEL P.
Priority to EP09173990A (EP2182474A3)
Publication of US20100114541A1
Application granted
Publication of US8108188B2
Expired - Fee Related
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/006 Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Definitions

  • FIG. 1 is a block diagram of one embodiment of a system that is operable to define and solve leader-follower games using the techniques described in connection with FIG. 2 .
  • FIG. 2 is a flow diagram of one embodiment of a method of solving a leader-follower problem.
  • FIG. 1 is a block diagram of one embodiment of a system 100 that is operable to define and solve leader-follower games using the techniques described below in connection with FIG. 2 .
  • the system 100 outputs a mixed strategy for the leader (also referred to here as a “mixed leader strategy”).
  • each mixed leader strategy comprises, for each of the set of leader actions, a respective fraction of occasions that the leader is to choose that respective leader action.
  • the leader-follower game comprises a defender-attacker game in which the leader comprises the defender and the follower comprises the attacker.
  • System 100 is used to output a mixed defender strategy that indicates how limited defense resources 102 available to the defender can be deployed within a given security environment 104 to defend against attacks from the attacker.
  • the limited defense resources 102 comprise a group of vehicles (for example, trucks, sea ships, or airplanes) that travel together in a convoy for mutual support and defense. Each convoy takes a particular path that travels through a set of locations 106.
  • the system 100 is used to output a suggested convoy schedule that indicates, for each possible path the convoy could take, a respective fraction of occasions that the defender is to choose that path.
  • Although the embodiment shown in FIG. 1 is described here as being used to output a convoy schedule, it is to be understood that in other embodiments, other types of defensive strategies are output. More generally, the systems and techniques described here can be used to develop strategies for dealing with leader-follower games that can be modeled as Stackelberg games for which the actions by the leader and the follower can be enumerated. For example, in other embodiments, strategies relating to defending critical infrastructure resources such as electric power grids, subways, or airports are output.
  • the system 100 is described here as being implemented as software 108 that executes on one or more computers 110 (though it is to be understood that the system 100 can be implemented using various combinations of hardware and software).
  • the software 108 is executed by at least one programmable processor 112 (for example, at least one general-purpose microprocessor or central processor) included in the computer 110 .
  • the software 108 comprises a set of program instructions embodied on a storage medium from which at least a portion of the program instructions are read by the programmable processor 112 for execution thereby.
  • the program instructions when executed by the programmable processor 112 , carry out at least a portion of the functionality described here as being performed by the system 100 .
  • the processor 112 includes or is communicatively coupled to at least one data storage device 114 for storing such program instructions and/or data used during execution of the software 108 .
  • suitable data storage devices 114 include any suitable form of volatile memory (such as random-access memory and registers included within programmable processors) and/or non-volatile memory (such as nonvolatile RAM memory, magnetic disc drives, and optical disc drives). Although only a single data storage device is shown in FIG. 1, it is to be understood that multiple data storage devices can be used.
  • One or more interfaces 116 are included in the system 100 to capture information related to the security environment 104 and/or other input used by the processor 112 (and the software 108 executed thereon). Moreover, although only a single interface 116 is shown in FIG. 1 , it is to be understood that multiple interfaces (for example, different types of interfaces) can be used.
  • one or more interfaces 116 are used to communicatively couple one or more input devices 118 to the processor 112 .
  • a user is able to provide input to the processor 112 (and the software 108 executing thereon) using such input devices 118 .
  • the input devices 118 comprise a keyboard and a pointing device (such as a mouse or a touch-pad).
  • the computer 110 includes one or more interfaces by which external input devices are communicatively coupled to the computer 110 .
  • the keyboard and the pointing device are integrated into the computer 110 .
  • a keyboard and/or pointing device external to the portable computer can also be communicatively coupled to the computer 110 .
  • Input to the processor 112 (and the software 108 executing thereon) can be supplied in other ways, for example, from a network (such as a local or wide area network), one or more data files or other data stores (such as databases), or “real time” data from sensors (or other sources of data relating to the security environment 104 ).
  • the software 108 comprises a front end 120 and a back end 122 .
  • the front end 120 comprises a front-end interface 124 via which a user is able to enter a payoff table (also referred to here as a “reward matrix” or “game matrix”).
  • the payoff table specifies the leader payoff and the follower payoff for each combination of a possible leader action and a possible follower action.
  • the leader actions are the possible convoy paths and the follower actions are the possible locations for attacks on a convoy.
  • the payoff table in such an embodiment, identifies the payoff to the defender and the payoff to the attacker for each combination of a convoy path and a location for an attack on the convoy.
  • the front-end interface 124 is implemented as a user interface (displayed on a display device 132 (described below)) via which a user can manually input such data using an input device 118.
  • the front-end interface 124 is used to receive such data in other ways (for example, by receiving a file in which such data is entered).
  • the back end 122 comprises an enumerated linear programming (LP) module 128 to generate a set of optimal mixed leader strategies for the leader-follower problem using the method described below in connection with FIG. 2 .
  • each mixed leader strategy comprises, for each of the set of leader actions, a respective fraction of occasions that the leader is to choose that respective leader action. More specifically, each mixed leader strategy indicates, for each possible path for the convoy, a respective fraction of occasions that the defender is to take that path.
  • the front end 120 comprises a schedule generator 130 .
  • the schedule generator 130 generates a suggested convoy schedule from the front end's knowledge of the defense resources 102 and the security environment 104 combined with an optimal mixed leader strategy output by the back end 122 .
  • This suggested convoy schedule can be adjusted manually as necessary to produce a finalized schedule.
  • At least a portion of the suggested convoy schedule is communicated (via an appropriate interface 116 ) to a display device 132 for display thereon (for example, using a web browser or other user interface mechanism).
  • a display device 132 may be local to the system 100 (for example, where a video monitor is coupled directly to a video port of the computer 110 used to implement the system 100) or may be remote to the system 100 (for example, where the display device 132 is a part of or connected to a client computer that remotely accesses the one or more computers 110 used to implement the system 100 over a network such as the Internet using appropriate client and server software, in which case the interface 116 comprises an appropriate network interface).
  • At least a portion of such an output may be communicated via an appropriate interface 116 to an interface 134 associated with the defense resources 102 in order to cause a defense-related action to be taken (for example, communicating a command to a system or device within the defense resources 102 over a network or other communication link that causes the system or device to take some action based on the command).
  • the output generated by the system 100 is used to (among other things) change the state of the interface 116 (for example, changing the state of the various signals that make up the interface) and any device or interface communicatively coupled thereto (such as display device 132 and interface 134 ).
  • FIG. 2 is a flow diagram of one embodiment of a method 200 of solving a leader-follower problem.
  • Method 200 is based on a novel approach to solving such problems whose running time is provably polynomial in the size of the normal form Stackelberg game. This approach is based on the following.
  • the convex set of all legitimate mixed defensive strategies is defined as:
  • The inner optimization problem in Equation (1) can then be pulled out to define the optimal attack function, which computes the best attack given a mixed defender strategy:
  • Equation (10) can be redefined as:
  • each $X_j$ must be a linear polytope defined by:
  • $$X_j = \Bigl\{ x \in X : \forall j',\ \sum_i c_{ij'}\, x_i \le \sum_i c_{ij}\, x_i \Bigr\} \tag{14}$$
  • the problem expressed in Equation (1) can be re-expressed in the form:
  • As a result of Equation (15), we can rewrite Equation (1) as an equivalent problem:
  • the particular embodiment of method 200 shown in FIG. 2 is described here as being implemented in the software 108 of system 100 of FIG. 1 .
  • the leader-follower game comprises a defender-attacker game and the system 100 is used to output a suggested convoy schedule that indicates, for each possible path the convoy could take, a respective fraction of occasions that the convoy (that is, the defender) is to choose that path.
  • Method 200 comprises receiving an expression of the leader-follower problem as a normal form Stackelberg game (block 202 ).
  • a normal form Stackelberg game that models the defender-attacker problem addressed in this embodiment is received in the form of a payoff table using the front-end interface 124 .
  • a payoff table specifies the leader payoff and the follower payoff for each combination of a leader action and a follower action.
  • the payoff table that is received using the front-end interface 124 specifies the payoff to the defender and the payoff to the attacker for each combination of a convoy path and a location for an attack on the convoy.
  • the payoff table shown in Table 1 is converted into the normal form Stackelberg game as shown above in connection with Equation (1).
  • Method 200 further comprises formulating a linear programming (LP) problem for each possible follower action (block 204 ).
  • For each possible follower action (also referred to here as the “provoked follower action”), a respective LP problem is formulated that is to be solved by optimizing the leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that provoked follower action.
  • Each such LP problem is formulated by specifying an objective function that expresses the leader payoff for a given mixed leader strategy and the provoked follower action and by specifying a constraint that requires the follower payoff for a given mixed leader strategy and the provoked follower action to be greater than or equal to the follower payoff for that given mixed leader strategy and every other possible follower action.
  • the enumerated LP module 128 formulates each such LP problem from the payoff table received via the front-end interface 124.
  • the objective function for each such LP problem is formulated as follows: $$\max_{x} \sum_i R_{i\hat{\jmath}}\, x_i \tag{17}$$
  • Equation (17) is formulated as a polynomial including a term for each leader action in a given mixed leader strategy that is the product of the leader payoff $R_{i\hat{\jmath}}$ corresponding to the combination of that leader action and the provoked follower action and a variable that is the fraction $x_i$ of occasions that that leader action is to occur in the given mixed leader strategy.
  • Equation (18) requires the follower payoff for a given mixed leader strategy and the provoked follower action to be greater than or equal to the follower payoff for that given mixed leader strategy and every other possible follower action.
  • Equation (19) requires that the sum of all the fractions of occasions that the leader actions in the given mixed leader strategy are to occur equal 1.
  • Equation (20) requires that, for each leader action in a given mixed leader strategy, the fraction of occasions that that leader action is to occur is not negative.
  • Method 200 further comprises solving each of the set of LP problems (block 206 ). Because each of the set of LP problems is a polynomial LP problem, conventional LP algorithms and techniques can be used to solve such LP problems in a reasonable amount of time for real world applications.
  • Method 200 further comprises selecting an overall optimal mixed leader strategy from the optimal mixed leader strategies determined by solving the set of enumerated LP problems (block 208 ). The particular optimal mixed leader strategy that has the highest payoff to the leader is selected.
  • the enumerated LP module 128 solves each of the set of LP problems and selects the overall optimal mixed leader strategy from the set of optimal mixed leader strategies that result from solving the set of LP problems.
  • Method 200 further comprises generating an output derived from the optimal mixed leader strategies determined by solving the LP problems (block 210) and outputting the output by changing a physical state associated with an interface (block 212).
  • the schedule generator 130 generates a suggested convoy schedule from the front end's knowledge of the defense resources 102 and the security environment 104 combined with the optimal mixed leader strategy output by enumerated LP module 128 .
  • the suggested convoy schedule is, for example, displayed on display device 132 and/or communicated via an appropriate interface 116 in the system 100 to a system or device within the defense resources 102 (via an interface 134 included therein).
  • the methods and techniques described here may be implemented in digital electronic circuitry, or with a programmable processor (for example, a special-purpose processor or a general-purpose processor such as a computer), firmware, software, or in combinations of them.
  • Apparatus embodying these techniques may include appropriate input and output devices, a programmable processor, and a storage medium tangibly embodying program instructions for execution by the programmable processor.
  • a process embodying these techniques may be performed by a programmable processor executing a program of instructions to perform desired functions by operating on input data and generating appropriate output.
  • the techniques may advantageously be implemented in one or more programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
  • a processor will receive instructions and data from a read-only memory and/or a random access memory.
  • Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and DVD disks. Any of the foregoing may be supplemented by, or incorporated in, specially-designed application-specific integrated circuits (ASICs).
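The partition argument that the list above gives for method 200 can be sketched as a short derivation. This is one reading that is consistent with Equation (14) and with the prose descriptions of Equations (17) to (20); the symbols $\hat{\jmath}(x)$ and $j'$, the displayed forms, and the tie-breaking convention (the follower breaks ties in the leader's favour) are assumptions, not the patent's own typesetting.

```latex
\begin{align*}
X &= \Bigl\{ x : \sum_i x_i = 1,\ x_i \ge 0 \ \ \forall i \Bigr\}
  && \text{(all mixed leader strategies)} \\
\hat{\jmath}(x) &= \arg\max_{j} \sum_i c_{ij}\, x_i
  && \text{(follower's best response to } x\text{)} \\
X_j &= \Bigl\{ x \in X : \sum_i c_{ij'}\, x_i \le \sum_i c_{ij}\, x_i \ \ \forall j' \Bigr\}
  && \text{(strategies that provoke } j\text{)} \\
\max_{x \in X} \sum_i R_{i\hat{\jmath}(x)}\, x_i
  &= \max_{j}\ \max_{x \in X_j} \sum_i R_{ij}\, x_i
  && \text{(one LP per follower action)}
\end{align*}
```

Each inner maximization on the right-hand side has a linear objective and linear constraints, which is why it can be handed to an ordinary LP solver.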

Abstract

One embodiment is directed to an approach to solving a leader-follower problem in which a leader has a set of leader actions and a follower has a set of follower actions. The approach includes receiving an expression of the leader-follower problem as a normal form Stackelberg game. The approach further includes, for each possible follower action, solving a linear program (LP) problem to determine a respective optimal mixed leader strategy, wherein the LP problem optimizes a leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that respective follower action. The approach further includes generating an output derived from the optimal mixed leader strategies, and outputting the output by changing a physical state associated with an interface.

Description

BACKGROUND
One class of problem involving multiple-adversarial agents (such as security problems) can be formulated as a defender-attacker game in which a first intelligent agent (also referred to here as the “defender”) will use limited defensive resources to protect against an attack by a second intelligent adversary (also referred to here as the “attacker”) that uses limited offensive resources.
The game theoretic features of such a game are that the attacker gets to observe the defender's mixed strategy over a period of observation before committing to an attack. The attacker chooses a means of attack without knowing the defender's strategy for the day of the attack. The rewards are not zero-sum or symmetric. For example, a convoy protection problem can be formulated as a defender-attacker game in which the defender varies its choice of possible routes and the attacker chooses an ambush site based on its observations of the routes chosen by the defender.
Such defender-attacker games are often modeled as Stackelberg games. The Decomposed Optimal Bayesian Stackelberg Solver (DOBSS) is described in Paruchuri, P., J. Pearce, and S. Kraus, “Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games”, Proc. of 7th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2008), ed. by Berger, Burg, and Nishiyama, May 12-16, Estoril, Portugal, 2008, which is referred to here as the “Paruchuri Article” and which is hereby incorporated herein by reference. In the DOBSS approach, a mixed-integer linear program (MILP) is formulated for a defender-attacker game for which the various attacks and defenses can be enumerated. The MILP problem is solved to find optimal mixed strategies for the defender-attacker game.
For example, Table 1 shows a normal form Stackelberg game for a simple leader-follower problem in which the leader comprises a defender and the follower comprises an attacker.
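TABLE 1
Simple Normal Form Game
(rows: defender actions; columns: attacker actions; each cell lists the defender's reward, then the attacker's reward)
          C        D
A       2, 1     4, 0
B       1, 0     3, 2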
Where:
    • $i \in I$: Defender's strategy
    • $j \in J$: Attacker's strategy
    • $R_{ij}$: Reward to defender under defense $i$ and attack $j$
    • $c_{ij}$: Reward to attacker under defense $i$ and attack $j$
    • $x_i \in [0,1]$: Fraction of occasions that defender will choose $i$
    • $q_j \in \{0,1\}$: $q_j = 1$ if chosen attack is $j$, else $q_j = 0$
In this example, the defender chooses a “mixed” defender strategy. As used herein, a “strategy” refers to a set of “actions” taken by a particular agent. For example, a mixed leader strategy refers to a set of actions taken by the leader, where the various leader actions need not all be the same. More formally, the mixed strategy is expressed as $0 \le x_i \le 1$.
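As a small illustration of what "fraction of occasions" means in practice, the sketch below draws one defender action per occasion according to the fractions $x_i$. The function name `sample_actions`, the example fractions, and the use of seeded random sampling are assumptions made for illustration; the source does not say how individual occasions are assigned.

```python
import random
from collections import Counter

def sample_actions(mixed_strategy, occasions, seed=0):
    """Draw one leader action per occasion according to the fractions x_i."""
    actions = list(mixed_strategy)
    weights = [mixed_strategy[a] for a in actions]
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return rng.choices(actions, weights=weights, k=occasions)

# Hypothetical mixed strategy over the defender actions A and B of Table 1.
x = {"A": 2 / 3, "B": 1 / 3}
draws = sample_actions(x, occasions=3000)

# Empirical frequencies approach the fractions x_i as occasions grow.
freq = {a: n / len(draws) for a, n in Counter(draws).items()}
print(freq)
```

Over many occasions the observed frequencies approach the chosen fractions, which is the quantity an observing attacker can estimate.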
The defender tries to maximize its payoff, knowing that the attacker will choose its best attack against whatever mixed defense the defender picks. This can be formalized in Equation (1).
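$$
\begin{aligned}
\max_{x}\quad & \sum_i \sum_j R_{ij}\, x_i\, q_j \\
\text{s.t.}\quad & q = \arg\max_{q} \sum_i \sum_j c_{ij}\, x_i\, q_j \\
& \sum_i x_i = 1 \\
& \sum_j q_j = 1 \\
& \forall i:\ 0 \le x_i \le 1 \\
& \forall j:\ q_j \in \{0, 1\}
\end{aligned}
\tag{1}
$$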
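The two-level structure of Equation (1) can be made concrete by evaluating it directly on Table 1. The sketch below is a brute-force illustration, not any of the solution methods discussed in this document: it computes the attacker's best response to a given $x$ and scans a coarse grid of mixed strategies. The helper names and the grid are hypothetical, and ties in the attacker's payoff are assumed to be broken in the defender's favour.

```python
from fractions import Fraction as F

# Table 1: rows are defender actions A, B; columns are attacker actions C, D.
R = [[2, 4], [1, 3]]  # R[i][j]: reward to defender
c = [[1, 0], [0, 2]]  # c[i][j]: reward to attacker

def follower_payoffs(x):
    """Attacker's expected payoff for each pure attack j against mixed defense x."""
    return [sum(c[i][j] * x[i] for i in range(len(x))) for j in range(len(c[0]))]

def leader_payoff(x, j):
    """Defender's expected payoff when the attacker plays j against x."""
    return sum(R[i][j] * x[i] for i in range(len(x)))

def best_response(x):
    """Inner problem of Equation (1); ties go to the defender (assumption)."""
    pay = follower_payoffs(x)
    top = max(pay)
    candidates = [j for j, p in enumerate(pay) if p == top]
    return max(candidates, key=lambda j: leader_payoff(x, j))

def defender_value(x):
    """Outer objective of Equation (1) for a fixed mixed defense x."""
    return leader_payoff(x, best_response(x))

# Scan mixed strategies x = (x1, 1 - x1) on a grid of twelfths.
grid = [F(k, 12) for k in range(13)]
best_x1 = max(grid, key=lambda x1: defender_value([x1, 1 - x1]))
print(best_x1, defender_value([best_x1, 1 - best_x1]))
```

A grid scan only works for tiny games and only finds the optimum if it happens to lie on the grid; it is shown here to make the objective tangible.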
Using the approach described in the Paruchuri Article, it is possible to convert Equation (1) into a Mixed Integer Quadratic Program (MIQP) as shown in Equation (2) by taking advantage of standard optimality conditions for the inner optimization problem.
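$$
\begin{aligned}
\max_{x,\,q,\,a}\quad & \sum_i \sum_j R_{ij}\, x_i\, q_j \\
\text{s.t.}\quad & \forall j:\ 0 \le \Bigl( a - \sum_i c_{ij}\, x_i \Bigr) \le (1 - q_j)\, M \\
& \sum_i x_i = 1 \\
& \sum_j q_j = 1 \\
& \forall i:\ 0 \le x_i \le 1 \\
& \forall j:\ q_j \in \{0, 1\} \\
& 0 \le a \le M
\end{aligned}
\tag{2}
$$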
where M is a large constant such that:
$$
M \gg \max_j \sum_i c_{ij} \tag{3}
$$
A dual objective $a$ of the inner problem can be defined for Equation (2), which means that for a solution $(x, q, a)$ we will have $a = \max_j \sum_i c_{ij}\, x_i$.
Using the approach described in the Paruchuri Article, the problem defined in Table 1 can be expressed using Equation (2) to get the system of equations as follows (where M=3 to satisfy Equation (3)).
$$
\begin{aligned}
\max_{x_1,\,x_2,\,q_1,\,q_2,\,a}\quad & 2 x_1 q_1 + 4 x_1 q_2 + x_2 q_1 + 3 x_2 q_2 \\
\text{s.t.}\quad & 0 \le a - x_1 \le 3\,(1 - q_1) \\
& 0 \le a - 2 x_2 \le 3\,(1 - q_2) \\
& x_1 + x_2 = 1 \\
& q_1 + q_2 = 1 \\
& 0 \le x_1, x_2 \le 1 \\
& q_1, q_2 \in \{0, 1\} \\
& 0 \le a \le 3
\end{aligned}
\tag{4}
$$
The optimal solution to Equation (4) is $x_1 = 2/3$; $x_2 = 1/3$; $q_1 = 0$; $q_2 = 1$; $a = 2/3$. The strategy results in a payoff to the attacker of 2/3 and a payoff to the defender of 11/3.
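The stated solution can be checked by substituting it into the constraints and objective of Equation (4). The sketch below does this with exact rational arithmetic; it only verifies feasibility and the two payoffs and does not by itself prove optimality. The variable names are chosen here for illustration.

```python
from fractions import Fraction as F

M = 3                        # big-M constant used in Equation (4)
x1, x2 = F(2, 3), F(1, 3)    # defender's mixed strategy
q1, q2 = 0, 1                # attacker's chosen attack (second column)
a = F(2, 3)                  # value of the attacker's best response

# Each entry mirrors one constraint line of Equation (4).
constraints = [
    0 <= a - x1 <= M * (1 - q1),
    0 <= a - 2 * x2 <= M * (1 - q2),
    x1 + x2 == 1,
    q1 + q2 == 1,
    0 <= x1 <= 1 and 0 <= x2 <= 1,
    q1 in (0, 1) and q2 in (0, 1),
    0 <= a <= M,
]
feasible = all(constraints)

attacker_payoff = max(x1, 2 * x2)  # max_j sum_i c_ij x_i for Table 1
defender_payoff = 2 * x1 * q1 + 4 * x1 * q2 + x2 * q1 + 3 * x2 * q2
print(feasible, attacker_payoff, defender_payoff)
```

Both attacker payoffs equal 2/3 at this point, so the attacker is indifferent between the two attacks, and the big-M constraint for the chosen attack is tight.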
The next stage in the approach described in the Paruchuri Article is to linearize the MIQP of Equation (2). This is done by defining the variables {zij} where:
$$
z_{ij} = x_i\, q_j \tag{5}
$$
$$
x_i = \sum_j z_{ij} \tag{6}
$$
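The substitution in Equations (5) and (6) can be illustrated numerically. The sketch below builds $z$ from a given $x$ and a one-hot $q$, recovers $x$ from the row sums of $z$, and confirms that the bilinear objective of Equation (2) equals the linear objective in $z$. The values of `x` and `q` are the ones stated for Table 1; the variable names are illustrative.

```python
from fractions import Fraction as F

R = [[2, 4], [1, 3]]       # defender rewards from Table 1
x = [F(2, 3), F(1, 3)]     # mixed defender strategy
q = [0, 1]                 # one-hot attacker choice (sums to 1)

# Equation (5): z_ij = x_i * q_j
z = [[x[i] * q[j] for j in range(2)] for i in range(2)]

# Equation (6): x_i = sum_j z_ij, which holds because sum_j q_j = 1
x_back = [sum(row) for row in z]

# The objective is bilinear in (x, q) but linear in z.
bilinear = sum(R[i][j] * x[i] * q[j] for i in range(2) for j in range(2))
linear = sum(R[i][j] * z[i][j] for i in range(2) for j in range(2))
print(z, x_back, bilinear, linear)
```

Because exactly one $q_j$ is 1, only one column of $z$ is non-zero, and that column is a copy of $x$.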
The resulting Mixed Integer Linear Program is as follows:
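$$
\begin{aligned}
\max_{z,\,q,\,a}\quad & \sum_i \sum_j R_{ij}\, z_{ij} \\
\text{s.t.}\quad & \forall j:\ 0 \le \Bigl( a - \sum_i c_{ij} \Bigl( \sum_h z_{ih} \Bigr) \Bigr) \le (1 - q_j)\, M \\
& \sum_i \sum_j z_{ij} = 1 \\
& \sum_j q_j = 1 \\
& \forall i:\ \sum_j z_{ij} \le 1 \\
& \forall j:\ q_j \le \sum_i z_{ij} \le 1 \\
& \forall i, j:\ 0 \le z_{ij} \le 1 \\
& \forall j:\ q_j \in \{0, 1\} \\
& 0 \le a \le M
\end{aligned}
\tag{7}
$$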
where M is a large constant satisfying Equation (3).
The problem defined in Table 1 can be expressed using Equation (7) to get the following system of equations (where M=3 to satisfy Equation (3)).
$$
\begin{aligned}
\max_{z_{11},\,z_{12},\,z_{21},\,z_{22},\,q_1,\,q_2,\,a}\quad & 2 z_{11} + 4 z_{12} + z_{21} + 3 z_{22} \\
\text{s.t.}\quad & 0 \le a - z_{11} - z_{12} \le 3\,(1 - q_1) \\
& 0 \le a - 2 z_{21} - 2 z_{22} \le 3\,(1 - q_2) \\
& z_{11} + z_{12} + z_{21} + z_{22} = 1 \\
& q_1 + q_2 = 1 \\
& z_{11} + z_{12} \le 1 \\
& z_{21} + z_{22} \le 1 \\
& q_1 \le z_{11} + z_{21} \le 1 \\
& q_2 \le z_{12} + z_{22} \le 1 \\
& 0 \le z_{11}, z_{12}, z_{21}, z_{22} \le 1 \\
& q_1, q_2 \in \{0, 1\} \\
& 0 \le a \le 3
\end{aligned}
\tag{8}
$$
The optimal solution to Equation (8) is $z_{11} = z_{21} = 0$, $z_{12} = 2/3$, $z_{22} = 1/3$, $q_1 = 0$, $q_2 = 1$, $a = 2/3$. The strategy results in a payoff to the attacker of 2/3 and a payoff to the defender of 11/3.
However, solving the MILP problem that results from the DOBSS approach described in the Paruchuri Article can be cumbersome and computationally intensive for real-world applications.
SUMMARY
One embodiment is a method of solving a leader-follower problem in which a leader has a set of leader actions and a follower has a set of follower actions. The method includes receiving an expression of the leader-follower problem as a normal form Stackelberg game. The method further includes, for each possible follower action, solving a linear program (LP) problem to determine a respective optimal mixed leader strategy, wherein the LP problem optimizes a leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that respective follower action. The method further includes generating an output derived from the optimal mixed leader strategies, and outputting the output by changing a physical state associated with an interface.
Another embodiment is a system for solving a leader-follower problem in which a leader has a set of leader actions and a follower has a set of follower actions. The system includes at least one programmable processor and at least one interface to receive information about the leader-follower problem. The programmable processor is configured to execute software that is operable to cause the system to receive an expression of the leader-follower problem as a normal form Stackelberg game. The software is further operable to cause the system to, for each possible follower action, solve a linear program (LP) problem to determine a respective optimal mixed leader strategy, wherein the LP problem optimizes a leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that respective follower action. The software is further operable to cause the system to generate an output derived from the optimal mixed leader strategies, and output the output by changing a physical state associated with the interface.
Another embodiment is a program product for solving a leader-follower problem in which a leader has a set of leader actions and a follower has a set of follower actions. The program-product includes a processor-readable medium on which program instructions are embodied. The program instructions are operable, when executed by at least one programmable processor included in a device, to cause the device to receive an expression of the leader-follower problem as a normal form Stackelberg game. The program instructions are further operable, when executed by at least one programmable processor included in the device, to cause the device to, for each possible follower action, solve a linear program (LP) problem to determine a respective optimal mixed leader strategy, wherein the LP problem optimizes a leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that respective follower action. The program instructions are further operable, when executed by at least one programmable processor included in the device, to cause the device to generate an output derived from the optimal mixed leader strategies, and output the output by changing a physical state associated with an interface.
The details of various embodiments of the claimed invention are set forth in the accompanying drawings and the description below. Other features and advantages will become apparent from the description, the drawings, and the claims.
DRAWINGS
FIG. 1 is a block diagram of one embodiment of a system that is operable to define and solve leader-follower games using the techniques described in connection with FIG. 2.
FIG. 2 is a flow diagram of one embodiment of a method of solving a leader-follower problem.
Like reference numbers and designations in the various drawings indicate like elements.
DETAILED DESCRIPTION
FIG. 1 is a block diagram of one embodiment of a system 100 that is operable to define and solve leader-follower games using the techniques described below in connection with FIG. 2. The system 100 outputs a mixed strategy for the leader (also referred to here as a “mixed leader strategy”). In the particular embodiment described, each mixed leader strategy comprises, for each of the set of leader actions, a respective fraction of occasions that the leader is to choose that respective leader action.
In the particular embodiment described here in connection with FIGS. 1 and 2, the leader-follower game comprises a defender-attacker game in which the leader comprises the defender and the follower comprises the attacker. System 100 is used to output a mixed defender strategy that indicates how limited defense resources 102 available to the defender can be deployed within a given security environment 104 to defend against attacks from the attacker.
In this particular embodiment, the limited defense resources 102 comprise a group of vehicles (for example, trucks, sea ships, or air planes) that travel together in a convoy for mutual support and defense. Each convoy takes a particular path that travels through a set of locations 106. The system 100, in such an embodiment, is used to output a suggested convoy schedule that indicates, for each possible path the convoy could take, a respective fraction of occasions that the defender is to choose that path.
Although the embodiment shown in FIG. 1 is described here as being used to output a convoy schedule, it is to be understood that in other embodiments, other types of defensive strategies are output. More generally, the systems and techniques described here can be used to develop strategies for dealing with leader-follower games that can be modeled as Stackelberg games for which the actions by the leader and the follower can be enumerated. For example, in other embodiments, strategies relating to defending critical infrastructure resources such as electric power grids, subways, or airports are output.
The system 100 is described here as being implemented as software 108 that executes on one or more computers 110 (though it is to be understood that the system 100 can be implemented using various combinations of hardware and software). In the particular embodiment shown in FIG. 1, the software 108 is executed by at least one programmable processor 112 (for example, at least one general-purpose microprocessor or central processor) included in the computer 110. The software 108 comprises a set of program instructions embodied on a storage medium from which at least a portion of the program instructions are read by the programmable processor 112 for execution thereby. The program instructions, when executed by the programmable processor 112, carry out at least a portion of the functionality described here as being performed by the system 100.
In such an embodiment, the processor 112 includes or is communicatively coupled to at least one data storage device 114 for storing such program instructions and/or data used during execution of the software 108. Examples of suitable data storage devices 114 include any suitable form of volatile memory (such as random-access memory and registers included within programmable processors) and/or non-volatile memory (such as nonvolatile RAM memory, magnetic disc drives, and optical disc drives). Although only a single data storage device is shown in FIG. 1, it is to be understood that multiple data storage devices can be used.
One or more interfaces 116 are included in the system 100 to capture information related to the security environment 104 and/or other input used by the processor 112 (and the software 108 executed thereon). Moreover, although only a single interface 116 is shown in FIG. 1, it is to be understood that multiple interfaces (for example, different types of interfaces) can be used.
For example, one or more interfaces 116 are used to communicatively couple one or more input devices 118 to the processor 112. A user is able to provide input to the processor 112 (and the software 108 executing thereon) using such input devices 118. In the embodiment shown in FIG. 1, the input devices 118 comprise a keyboard and a pointing device (such as a mouse or a touch-pad). In some implementations of the embodiment shown in FIG. 1, the computer 110 includes one or more interfaces by which external input devices are communicatively coupled to the computer 110. In other implementations (for example, where the computer 110 comprises a portable computer), the keyboard and the pointing device are integrated into the computer 110. In some of those implementations, a keyboard and/or pointing device external to the portable computer can also be communicatively coupled to the computer 110.
Input to the processor 112 (and the software 108 executing thereon) can be supplied in other ways, for example, from a network (such as a local or wide area network), one or more data files or other data stores (such as databases), or “real time” data from sensors (or other sources of data relating to the security environment 104).
In the particular embodiment shown in FIG. 1, the software 108 comprises a front end 120 and a back end 122. The front end 120 comprises a front-end interface 124 via which a user is able to enter a payoff table (also referred to here as a “reward matrix” or “game matrix”). The payoff table specifies the leader payoff and the follower payoff for each combination of a possible leader action and a possible follower action. In the particular embodiment described here in connection with FIGS. 1 and 2, the leader actions are the possible convoy paths and the follower actions are the possible locations for attacks on a convoy. The payoff table, in such an embodiment, identifies the payoff to the defender and the payoff to the attacker for each combination of a convoy path and a location for an attack on the convoy.
In one implementation, the front-end interface 124 is implemented as a user interface (displayed on a display device 132, described below) via which a user can manually input such data using an input device 118. In other implementations, the front-end interface 124 is used to receive such data in other ways (for example, by receiving a file in which such data is entered).
The back end 122 comprises an enumerated linear programming (LP) module 128 to generate a set of optimal mixed leader strategies for the leader-follower problem using the method described below in connection with FIG. 2.
In the particular embodiment described here in connection with FIG. 1, each mixed leader strategy comprises, for each of the set of leader actions, a respective fraction of occasions that the leader is to choose that respective leader action. More specifically, each mixed leader strategy indicates, for each possible path for the convoy, a respective fraction of occasions that the defender is to take that path.
In the particular embodiment shown in FIG. 1, the front end 120 comprises a schedule generator 130. The schedule generator 130 generates a suggested convoy schedule from the front end's knowledge of the defense resources 102 and the security environment 104 combined with an optimal mixed leader strategy output by the back end 122. This suggested convoy schedule can be adjusted manually as necessary to produce a finalized schedule.
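The source does not specify how the schedule generator 130 turns a mixed strategy into a concrete schedule. One simple possibility, offered here only as an assumption, is to draw a path independently for each occasion with probability equal to its fraction in the mixed strategy. The function name `sample_schedule` and the path labels are hypothetical.

```python
import random

def sample_schedule(paths, strategy, n_occasions, seed=None):
    """Draw one path per occasion, weighted by the mixed leader strategy.

    This is an illustrative sampling scheme, not the patent's schedule generator.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.choices(paths, weights=strategy, k=n_occasions)

paths = ["path_A", "path_B"]   # hypothetical convoy paths
strategy = [2 / 3, 1 / 3]      # fraction of occasions for each path
schedule = sample_schedule(paths, strategy, n_occasions=9, seed=0)
print(schedule)
```

Random draws keep the schedule unpredictable to an observer, at the cost that short schedules only approximate the target fractions.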
In one implementation of the system 100, at least a portion of the suggested convoy schedule is communicated (via an appropriate interface 116) to a display device 132 for display thereon (for example, using a web browser or other user interface mechanism). Such display device 132 may be local to the system 100 (for example, where a video monitor is coupled directly to a video port of the computer 110 used to implement the system 100) or may be remote to the system 100 (for example, where the display device 132 is a part of or connected to a client computer that remotely accesses the one or more computers 110 used to implement the system 100 over a network such as the Internet using appropriate client and server software, in which case the interface 116 comprises an appropriate network interface). In another example, at least a portion of such an output may be communicated via an appropriate interface 116 to an interface 134 associated with the defense resources 102 in order to cause a defense-related action to be taken (for example, communicating a command to a system or device within the defense resources 102 over a network or other communication link that causes the system or device to take some action based on the command). More generally, it should be understood that the output generated by the system 100 is used to (among other things) change the state of the interface 116 (for example, changing the state of the various signals that make up the interface) and any device or interface communicatively coupled thereto (such as display device 132 and interface 134).
FIG. 2 is a flow diagram of one embodiment of a method 200 of solving a leader-follower problem. Method 200 is based on a new and novel approach to solving such problems which is provably polynomial in the size of the normal form Stackelberg game. This approach is based on the following. The convex set of all legitimate mixed defensive strategies is defined as:
$$X = \Big\{ x : \sum_i x_i = 1,\; \forall i\; 0 \le x_i \Big\} \tag{9}$$
The inner optimization problem in Equation (1) can then be pulled out to define the optimal attack function, which computes the best attack given a mixed defender strategy:
$$q(x) = \arg\max_q \Big\{ \sum_i \sum_j c_{ij} x_i q_j : \sum_j q_j = 1,\; \forall j\; 0 \le q_j \Big\} \tag{10}$$
It can be shown that, for $x \in X$, there is an optimal $q(x)$ that is one for some $\hat{j}$ and zero for all other $j$. Thus, the following can be defined:
$$\hat{q}(x) = \arg\max_j \Big\{ \sum_i c_{ij} x_i \Big\} \tag{11}$$
And, without loss of generality, Equation (10) can be redefined as:
$$
q(x)_j =
\begin{cases}
0 & \text{if } j \ne \hat{q}(x) \\
1 & \text{if } j = \hat{q}(x)
\end{cases}
\tag{12}
$$
Also, X is then decomposed into those subregions that result in a particular attack response:
$$X_j = \{ x \in X : \hat{q}(x) = j \} \tag{13}$$
Note that by definition of Equation (11), each Xj must be a linear polytope defined by:
$$X_j = \Big\{ x \in X : \forall \hat{j}\; \sum_i c_{i\hat{j}} x_i \le \sum_i c_{ij} x_i \Big\} \tag{14}$$
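The best-response function of Equation (11) and the region test of Equation (14) can be sketched in a few lines. The helper names below are hypothetical, indices are zero-based, the attacker payoffs are assumed to be the coefficients appearing in Equation (4), and `x` is assumed to already be a valid mixed strategy in X. Ties are broken toward the lowest index, which is a choice of this sketch and not something the source prescribes.

```python
from fractions import Fraction as F

def follower_payoffs(C, x):
    """Expected follower payoff sum_i c_ij * x_i for each pure follower action j."""
    return [sum(C[i][j] * x[i] for i in range(len(x))) for j in range(len(C[0]))]

def best_response(C, x):
    """Equation (11): index of a follower action that maximizes its expected payoff."""
    payoffs = follower_payoffs(C, x)
    return max(range(len(payoffs)), key=payoffs.__getitem__)

def in_region(C, x, j):
    """Equation (14): True if action j is at least as good as every alternative."""
    payoffs = follower_payoffs(C, x)
    return all(p <= payoffs[j] for p in payoffs)

C = [[F(1), F(0)], [F(0), F(2)]]   # attacker payoffs c_ij (assumed from Equation (4))
print(best_response(C, [F(1), F(0)]))
print(best_response(C, [F(1, 2), F(1, 2)]))
print(in_region(C, [F(2, 3), F(1, 3)], 0), in_region(C, [F(2, 3), F(1, 3)], 1))
```

Because Equation (14) uses a weak inequality, a strategy at which the attacker is indifferent belongs to more than one region, as the last line illustrates.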
Therefore, the problem expressed in Equation (1) can be re-expressed in the form:
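$$
\max_x \Big\{ \sum_{ij} R_{ij} x_i q(x)_j : x \in \bigcup_j X_j \Big\}
\;\equiv\;
\max_{x, \hat{j}} \Big\{ \sum_{ij} R_{ij} x_i q(x)_j : x \in X_{\hat{j}} \Big\}
\;\equiv\;
\max_{x, j} \Big\{ \sum_i R_{ij} x_i : x \in X_j \Big\}
\tag{15}
$$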
where the last equivalence comes from Equation (12).
As a result of Equations (15), we can rewrite Equation (1) as an equivalent problem:
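$$
\begin{aligned}
\max_{\hat{j}} \max_x\quad & \sum_i R_{i\hat{j}} x_i \\
\text{s.t.}\quad & \forall j:\; \sum_i c_{ij} x_i \le \sum_i c_{i\hat{j}} x_i \\
& \sum_i x_i = 1 \\
& \forall i:\; 0 \le x_i
\end{aligned}
\tag{16}
$$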
As a result, the original leader-follower problem of Equation (1) can be rewritten as |J| linear programs, each with |I| variables, and |J|+|I|+1 constraints.
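As a worked illustration, Equation (16) can be instantiated for the two-action example, assuming the payoffs are the coefficients that appear in Equation (4), namely $R_{11}=2$, $R_{12}=4$, $R_{21}=1$, $R_{22}=3$ and $c_{11}=1$, $c_{22}=2$, $c_{12}=c_{21}=0$. Each choice of $\hat{j}$ gives one LP (the trivial constraint for $j=\hat{j}$ is omitted):

$$
\begin{aligned}
\hat{j} = 1:\quad & \max_x\; 2x_1 + x_2 \quad \text{s.t.}\quad 2x_2 \le x_1,\; x_1 + x_2 = 1,\; x_1, x_2 \ge 0 \\
\hat{j} = 2:\quad & \max_x\; 4x_1 + 3x_2 \quad \text{s.t.}\quad x_1 \le 2x_2,\; x_1 + x_2 = 1,\; x_1, x_2 \ge 0
\end{aligned}
$$

Substituting $x_2 = 1 - x_1$, the first LP maximizes $1 + x_1$ over $2/3 \le x_1 \le 1$, giving $x = (1, 0)$ with value $2$. The second maximizes $3 + x_1$ over $0 \le x_1 \le 2/3$, giving $x = (2/3, 1/3)$ with value $11/3$. The outer maximization selects $\hat{j} = 2$, which agrees with the solution reported for Equations (4) and (8).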
This approach is used in method 200. The particular embodiment of method 200 shown in FIG. 2 is described here as being implemented in the software 108 of system 100 of FIG. 1. In this embodiment, the leader-follower game comprises a defender-attacker game and the system 100 is used to output a suggested convoy schedule that indicates, for each possible path the convoy could take, a respective fraction of occasions that the convoy (that is, the defender) is to choose that path.
Method 200 comprises receiving an expression of the leader-follower problem as a normal form Stackelberg game (block 202). In the particular embodiment described here in connection with FIGS. 1 and 2, a normal form Stackelberg game that models the defender-attacker problem addressed in this embodiment is received in the form of a payoff table using the front-end interface 124. A payoff table specifies the leader payoff and the follower payoff for each combination of a leader action and a follower action. In this embodiment, the payoff table that is received using the front-end interface 124 specifies the payoff to the defender and the payoff to the attacker for each combination of a convoy path and a location for an attack on the convoy.
For example, for the example leader-follower game described above in connection with Table 1, the payoff table shown in Table 1 is converted into the normal form Stackelberg game as shown above in connection with Equation (1).
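One way to hold such a payoff table in software is a mapping from (leader action, follower action) pairs to (leader payoff, follower payoff) pairs, which is then split into the two matrices used by the equations. This is only a sketch: the function name, the action labels, and the dictionary layout are hypothetical, and the numbers are the coefficients that appear in Equation (4).

```python
def payoff_table_to_matrices(table, leader_actions, follower_actions):
    """Split a payoff table into leader matrix R and follower matrix C.

    table maps (leader_action, follower_action) -> (leader_payoff, follower_payoff).
    """
    R = [[table[(i, j)][0] for j in follower_actions] for i in leader_actions]
    C = [[table[(i, j)][1] for j in follower_actions] for i in leader_actions]
    return R, C

leader_actions = ["path_1", "path_2"]         # hypothetical convoy paths
follower_actions = ["attack_1", "attack_2"]   # hypothetical attack locations
table = {
    ("path_1", "attack_1"): (2, 1),
    ("path_1", "attack_2"): (4, 0),
    ("path_2", "attack_1"): (1, 0),
    ("path_2", "attack_2"): (3, 2),
}
R, C = payoff_table_to_matrices(table, leader_actions, follower_actions)
print(R, C)
```

A missing table entry raises `KeyError`, which makes an incomplete payoff table fail early instead of producing a silently wrong game.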
Method 200 further comprises formulating a linear programming (LP) problem for each possible follower action (block 204). For each possible follower action (also referred to here as the “provoked follower action”), a respective LP problem is formulated that is to be solved by optimizing the leader payoff for a given mixed leader strategy and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that provoked follower action. Each such LP problem is formulated by specifying an objective function that expresses the leader payoff for a given mixed leader strategy and the provoked follower action and by specifying a constraint that requires the follower payoff for a given mixed leader strategy and the provoked follower action to be greater than or equal to the follower payoff for that given mixed leader strategy and every other possible follower action.
In the particular embodiment described here in connection with FIGS. 1 and 2, the enumerated LP module 128 formulates each such LP problem from the payoff table received via the front-end interface 124. The objective function for each such LP problem is formulated as follows:
$$\max_x \sum_i R_{i\hat{j}} x_i \tag{17}$$
In other words, Equation (17) is formulated as a polynomial including a term for each leader action in a given mixed leader strategy that is the product of the leader payoff R corresponding to the combination of that leader action and the provoked follower action and a variable that is the fraction xi of occasions that that leader action is to occur in the given mixed leader strategy.
The constraints for each such LP problem are formulated as follows:
$$\forall j:\; \sum_i c_{ij} x_i \le \sum_i c_{i\hat{j}} x_i \tag{18}$$

$$\sum_i x_i = 1 \tag{19}$$

$$\forall i:\; 0 \le x_i \tag{20}$$
In other words, Equation (18) requires the follower payoff for a given mixed leader strategy and the provoked follower action to be greater than or equal to the follower payoff for that given mixed leader strategy and every other possible follower action.
Equation (19) requires that the sum of all the fractions of occasions that the leader actions in the given mixed leader strategy are to occur equal 1.
Equation (20) requires that, for each leader action in a given mixed leader strategy, the fraction of occasions that that leader action is to occur is not negative.
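The formulation step of block 204 can be sketched as a function that assembles the data of Equations (17) through (20) for one provoked follower action. The function name and the dictionary keys are illustrative choices of this sketch and do not come from the source; indices are zero-based, and the row for the provoked action itself is dropped because it reads 0 ≤ 0.

```python
def build_lp(R, C, j_hat):
    """Assemble one LP of the enumeration for provoked follower action j_hat."""
    n_i, n_j = len(R), len(R[0])
    return {
        # Equation (17): maximize sum_i R[i][j_hat] * x_i
        "objective": [R[i][j_hat] for i in range(n_i)],
        # Equation (18), rewritten as sum_i (c_ij - c_ij_hat) * x_i <= 0 for j != j_hat
        "A_ub": [[C[i][j] - C[i][j_hat] for i in range(n_i)]
                 for j in range(n_j) if j != j_hat],
        "b_ub": [0] * (n_j - 1),
        # Equation (19): sum_i x_i = 1
        "A_eq": [[1] * n_i],
        "b_eq": [1],
        # Equation (20): x_i >= 0, with no upper bound needed
        "bounds": [(0, None)] * n_i,
    }

R = [[2, 4], [1, 3]]   # leader payoffs (assumed from Equation (4))
C = [[1, 0], [0, 2]]   # follower payoffs (assumed from Equation (4))
lps = [build_lp(R, C, j) for j in range(len(R[0]))]
print(lps[1])
```

The layout follows the common "maximize c·x subject to A_ub·x ≤ b_ub, A_eq·x = b_eq" convention so that the data could be handed to whichever LP solver an implementation uses; many solvers minimize, in which case the objective would be negated.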
Method 200 further comprises solving each of the set of LP problems (block 206). Because each of the set of LP problems is a polynomial LP problem, conventional LP algorithms and techniques can be used to solve such LP problems in a reasonable amount of time for real world applications.
Method 200 further comprises selecting an overall optimal mixed leader strategy from the optimal mixed leader strategies determined by solving the set of enumerated LP problems (block 208). The particular optimal mixed leader strategy that has the highest payoff to the leader is selected.
In the particular embodiment described here in connection with FIGS. 1 and 2, the enumerated LP module 128 solves each of the set of LP problems and selects the overall optimal mixed leader strategy from the set of optimal mixed leader strategies that result from solving the set of LP problems.
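Blocks 206 and 208 can be sketched end to end for small games. The source relies on conventional LP algorithms; to stay self-contained, the sketch below instead enumerates the vertices of each feasible region with exact fractions, which is exponential in general and is a stand-in chosen only for illustration. All function names are hypothetical, indices are zero-based, and the payoffs are assumed from the coefficients of Equation (4).

```python
from fractions import Fraction
from itertools import combinations

def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; return None if singular."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def solve_lp_for(R, C, j_hat):
    """Best (value, x) over the region provoking j_hat, or None if it is empty."""
    n_i, n_j = len(R), len(R[0])
    # Inequalities g . x <= 0: rows of Equation (18), then -x_i <= 0 for Equation (20).
    ineq = [[C[i][j] - C[i][j_hat] for i in range(n_i)]
            for j in range(n_j) if j != j_hat]
    ineq += [[-1 if k == i else 0 for k in range(n_i)] for i in range(n_i)]
    best = None
    # A vertex makes n_i - 1 inequalities tight together with Equation (19).
    for tight in combinations(ineq, n_i - 1):
        x = solve_square([[1] * n_i] + list(tight), [1] + [0] * (n_i - 1))
        if x is None or any(sum(g[i] * x[i] for i in range(n_i)) > 0 for g in ineq):
            continue  # singular system or infeasible point
        value = sum(R[i][j_hat] * x[i] for i in range(n_i))
        if best is None or value > best[0]:
            best = (value, x)
    return best

def enumerated_lp(R, C):
    """Solve one LP per follower action and keep the highest leader payoff."""
    results = []
    for j in range(len(R[0])):
        sol = solve_lp_for(R, C, j)
        if sol is not None:
            results.append((sol[0], j, sol[1]))
    return max(results, key=lambda t: t[0])

R = [[2, 4], [1, 3]]
C = [[1, 0], [0, 2]]
value, j_best, x_best = enumerated_lp(R, C)
print(value, j_best, x_best)
```

On the two-action example this selects the second follower action with the strategy (2/3, 1/3) and leader payoff 11/3, in agreement with the MIQP and MILP solutions. A production implementation would replace `solve_lp_for` with a call to a standard LP solver to keep the polynomial running time the text describes.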
Method 200 further comprises generating an output derived from the optimal mixed leader strategies determined by solving the LP problems (block 210) and outputting the output by changing a physical state associated with an interface (block 212). In the particular embodiment described here in connection with FIGS. 1 and 2, the schedule generator 130 generates a suggested convoy schedule from the front end's knowledge of the defense resources 102 and the security environment 104 combined with the optimal mixed leader strategy output by the enumerated LP module 128. As noted above, the suggested convoy schedule is, for example, displayed on display device 132 and/or communicated via an appropriate interface 116 in the system 100 to a system or device within the defense resources 102 (via an interface 134 included therein).
The methods and techniques described here may be implemented in digital electronic circuitry, or with a programmable processor (for example, a special-purpose processor or a general-purpose processor such as a computer), firmware, software, or in combinations of them. Apparatus embodying these techniques may include appropriate input and output devices, a programmable processor, and a storage medium tangibly embodying program instructions for execution by the programmable processor. A process embodying these techniques may be performed by a programmable processor executing a program of instructions to perform desired functions by operating on input data and generating appropriate output. The techniques may advantageously be implemented in one or more programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and DVD disks. Any of the foregoing may be supplemented by, or incorporated in, specially-designed application-specific integrated circuits (ASICs).
A number of embodiments of the invention defined by the following claims have been described. Nevertheless, it will be understood that various modifications to the described embodiments may be made without departing from the spirit and scope of the claimed invention. Accordingly, other embodiments are within the scope of the following claims.

Claims (20)

1. A method of solving a leader-follower problem, the method comprising:
receiving, with a programmable processor, an expression of the leader-follower problem as a normal form Stackelberg game;
solving, with the programmable processor, an enumerated linear program (LP) problem, wherein solving the enumerated LP program comprises:
determining a respective mixed leader strategy for each possible follower action of the normal form Stackelberg game, and
optimizing a leader payoff for a given mixed leader strategy of the determined mixed leader strategies and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke the given fixed follower action;
generating, with the programmable processor, an output derived from optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region; and
outputting, with the programmable processor, the output by changing a physical state associated with an interface.
2. The method of claim 1, further comprising selecting the given mixed leader strategy as an optimal mixed leader strategy from the determined mixed leader strategies based on the optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region, wherein generating an output derived from optimization comprises generating an output indicative of the selected optimal mixed leader strategy.
3. The method of claim 1, wherein the enumerated LP problem comprises:
an objective function that expresses the leader payoff for each possible follower action and the determined respective mixed leader strategy, and
a constraint that requires a follower payoff for the given mixed leader strategy and the given fixed follower action to be greater than or equal to a follower payoff for the given mixed leader strategy and every other possible follower action.
4. The method of claim 1, wherein receiving the expression of the leader-follower problem as a normal form Stackelberg game comprises receiving a reward matrix that specifies a leader payoff and a follower payoff for a given leader action and a given follower action.
5. The method of claim 1, wherein the leader-follower problem comprises a defender-attacker game in which the leader comprises the defender and the follower comprises the attacker.
6. The method of claim 1, further comprising defending at least one of an electric power grid, a subway, an airport, and a convoy based on the output.
7. The method of claim 1, wherein, in the leader-follower problem, the leader has a set of leader actions and the follower has a set of follower actions, and wherein each leader action comprises movement along a convoy path that specifies a set of locations that a convoy travels and each follower action specifies movement to a particular location of the set of locations to attack at the particular location.
8. The method of claim 1, wherein, in the leader-follower problem, the leader has a set of leader actions and the follower has a set of follower actions, and wherein each mixed leader strategy comprises, for each particular leader action of the set of leader actions, a fraction of occasions that the leader chooses the particular leader action.
9. A system for solving a leader-follower problem, the system comprising:
at least one programmable processor;
at least one interface, wherein the at least one programmable processor is configured to receive information about the leader-follower problem via the at least one interface,
wherein the at least one programmable processor is configured to:
receive an expression of the leader-follower problem as a normal form Stackelberg game;
solve an enumerated linear program (LP) problem, wherein the at least one programmable processor solves the enumerated LP program by at least:
determining a respective mixed leader strategy for each possible follower action of the normal form Stackelberg game, and
optimizing a leader payoff for a given mixed leader strategy of the determined mixed leader strategies and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke that respective the given fixed follower action,
generate an output derived from the optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region, and
output the output by changing a physical state associated with the interface.
10. The system of claim 9, wherein the at least one programmable processor is further configured to select the given mixed leader strategy as an optimal mixed leader strategy from the determined mixed leader strategies based on the optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region, wherein generating an output derived from optimization comprises generating an output indicative of the selected optimal mixed leader strategy.
11. The system of claim 9, wherein the enumerated LP problem comprises:
an objective function that expresses the leader payoff for each possible follower action and the determined respective mixed leader strategy, and
a constraint that requires a follower payoff for the given mixed leader strategy and the given fixed follower action to be greater than or equal to a follower payoff for the given mixed leader strategy and every other possible follower action.
12. The system of claim 9, wherein the at least one programmable processor receives the expression of the leader-follower problem as a normal form Stackelberg game by receiving a reward matrix that specifies a leader payoff and a follower payoff for a given leader action and a given follower action.
13. The system of claim 9, wherein the leader-follower problem comprises a defender-attacker game in which the leader comprises the defender and the follower comprises the attacker.
14. The system of claim 9, wherein the at least one programmable processor is configured to execute software, wherein the software comprises a front end and back end,
wherein the front end comprises a front-end interface configured to receive a reward payoff that comprises the expression of the leader-follower problem as the normal form Stackelberg game, and
wherein the back end comprises an enumerated LP module to solve the LP problem for each possible follower action.
15. The system of claim 9, wherein the at least one interface comprises:
a first interface configured to communicatively couple the system to an input device, and
a second interface configured to communicatively couple the system to a display device, wherein the at least one programmable processor changes the physical state associated with the at least one interface by at least displaying a portion of the output on the display device.
16. A program product for solving a leader-follower problem, the program-product comprising a non-transitory processor-readable medium on which program instructions are embodied, wherein the program instructions are operable, when executed by at least one programmable processor included in a device, to cause the device to:
receive an expression of the leader-follower problem as a normal form Stackelberg game;
solve an enumerated linear program (LP) problem, wherein solving the enumerated LP problem comprises:
determining a respective mixed leader strategy for each possible follower action of the normal form Stackelberg game, and
optimizing a leader payoff for a given mixed leader strategy of the determined mixed leader strategies and a given fixed follower action over a feasible region that includes only mixed leader strategies that provoke the given fixed follower action;
generate an output derived from the optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region; and
output the output by changing a physical state associated with an interface.
17. The program product of claim 16, wherein the program instructions are further operable, when executed by at least one programmable processor included in a device, to cause the device to select the given mixed leader strategy as an optimal mixed leader strategy from the determined mixed leader strategies based on the optimization of the leader payoff for the given mixed leader strategy and the given fixed follower action over the feasible region, wherein the program instructions cause the device to generate an output generating an output derived from optimization by at least generating an output indicative of the selected optimal mixed leader strategy.
18. The program product of claim 16, wherein the enumerated LP problem comprises:
an objective function that expresses the leader payoff for each possible follower action and the determined respective mixed leader strategy, and
a constraint that requires a follower payoff for the given mixed leader strategy and the given fixed follower action to be greater than or equal to a follower payoff for the given mixed leader strategy and every other possible follower action.
19. The program product of claim 16, wherein the program instructions are further operable, when executed by the at least one programmable processor included in the device, to cause the device to receive the expression of the leader-follower problem as a normal form Stackelberg game by receiving a reward matrix that specifies a leader payoff and a follower payoff for a given leader action and a given follower action.
20. The program product of claim 16, wherein the leader-follower problem comprises a defender-attacker game in which the leader comprises the defender and the follower comprises the attacker.

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/261,616 US8108188B2 (en) 2008-10-30 2008-10-30 Enumerated linear programming for optimal strategies
EP09173990A EP2182474A3 (en) 2008-10-30 2009-10-23 Enumerated linear programming for optimal strategies

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/261,616 US8108188B2 (en) 2008-10-30 2008-10-30 Enumerated linear programming for optimal strategies

Publications (2)

Publication Number Publication Date
US20100114541A1 US20100114541A1 (en) 2010-05-06
US8108188B2 (en) 2012-01-31

Family

ID=41668414

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/261,616 Expired - Fee Related US8108188B2 (en) 2008-10-30 2008-10-30 Enumerated linear programming for optimal strategies

Country Status (2)

Country Link
US (1) US8108188B2 (en)
EP (1) EP2182474A3 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8364511B2 (en) * 2007-10-15 2013-01-29 University Of Southern California Agent security via approximate solvers
US8545332B2 (en) * 2012-02-02 2013-10-01 International Business Machines Corporation Optimal policy determination using repeated stackelberg games with unknown player preferences
US20130273514A1 (en) * 2007-10-15 2013-10-17 University Of Southern California Optimal Strategies in Security Games
US20130318615A1 (en) * 2012-05-23 2013-11-28 International Business Machines Corporation Predicting attacks based on probabilistic game-theory

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8224681B2 (en) * 2007-10-15 2012-07-17 University Of Southern California Optimizing a security patrolling strategy using decomposed optimal Bayesian Stackelberg solver
WO2013176784A1 (en) * 2012-05-24 2013-11-28 University Of Southern California Optimal strategies in security games
CN104506288A (en) * 2015-01-23 2015-04-08 重庆邮电大学 Probability network code re-transmission method based on Stackelberg game
EP3671692A1 (en) * 2018-12-19 2020-06-24 Ningbo Geely Automobile Research & Development Co. Ltd. Time for passage of a platoon of vehicles
AU2019100368B4 (en) * 2019-01-25 2019-11-28 Norman BOYLE A driverless impact attenuating traffic management vehicle
WO2020162343A1 (en) * 2019-02-04 2020-08-13 日本電気株式会社 Vehicle management device, vehicle management method, and storage medium having program stored therein
CN111475821B (en) * 2020-01-17 2023-04-18 吉林大学 Block chain consensus mechanism method based on file storage certification
CN111546850A (en) * 2020-03-31 2020-08-18 重庆交通大学 Vehicle body height and vehicle attitude coordination control method based on hybrid logic dynamic model
JP7343438B2 (en) * 2020-04-02 2023-09-12 トヨタ自動車株式会社 Autonomous vehicles and autonomous vehicle operation management devices
US11443636B2 (en) * 2020-05-06 2022-09-13 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and methods of platoon leadership as a service
US11869361B2 (en) * 2021-04-01 2024-01-09 Gm Cruise Holdings Llc Coordinated multi-vehicle routing
CN114267168B (en) * 2021-12-24 2023-03-21 北京航空航天大学 Formation resource allocation method applied to urban expressway environment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7315801B1 (en) * 2000-01-14 2008-01-01 Secure Computing Corporation Network security modeling system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8224681B2 (en) * 2007-10-15 2012-07-17 University Of Southern California Optimizing a security patrolling strategy using decomposed optimal Bayesian Stackelberg solver
US8195490B2 (en) * 2007-10-15 2012-06-05 University Of Southern California Agent security via approximate solvers

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Jain et al., "Bayesian Stackelberg Games and their Application for Security at Los Angeles International Airport", "ACM SIGecom Exchanges", 2008, vol. 7, No. 2, Publisher: ACM.
Paruchuri et al., "Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games", "Proceedings of 7th International Conference on Autonomous Agents and Multiagent Systems", May 2008, pp. 895-902, Publisher: International Foundation for Autonomous Agents and Multiagent Systems.
Paruchuri, Praveen, "Keep the Adversary Guessing: Agent Security by Policy Randomization", "Dissertation", May 2007, pp. 1-119, Publisher: University of Southern California, Published in: California.
Pita et al., "Deployed ARMOR Protection: The Application of a Game Theoretic Model for Security at the Los Angeles International Airport", "Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems", 2008, pp. 125-132, Publisher: International Foundation for Autonomous Agents and Multiagent Systems.
Tambe et al., "Security via Strategic Randomization", Dec. 31, 2007, pp. 1-4, Publisher: USC Create Homeland Security Center, Published in: California.

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8364511B2 (en) * 2007-10-15 2013-01-29 University Of Southern California Agent security via approximate solvers
US20130273514A1 (en) * 2007-10-15 2013-10-17 University Of Southern California Optimal Strategies in Security Games
US8545332B2 (en) * 2012-02-02 2013-10-01 International Business Machines Corporation Optimal policy determination using repeated stackelberg games with unknown player preferences
US20130318615A1 (en) * 2012-05-23 2013-11-28 International Business Machines Corporation Predicting attacks based on probabilistic game-theory
US8863293B2 (en) * 2012-05-23 2014-10-14 International Business Machines Corporation Predicting attacks based on probabilistic game-theory

Also Published As

Publication number Publication date
EP2182474A2 (en) 2010-05-05
US20100114541A1 (en) 2010-05-06
EP2182474A3 (en) 2012-01-25

Similar Documents

Publication Publication Date Title
US8108188B2 (en) Enumerated linear programming for optimal strategies
US8224681B2 (en) Optimizing a security patrolling strategy using decomposed optimal Bayesian Stackelberg solver
Chadès et al. Optimization methods to solve adaptive management problems
US8364511B2 (en) Agent security via approximate solvers
Ahner et al. Optimal multi-stage allocation of weapons to targets using adaptive dynamic programming
US8545332B2 (en) Optimal policy determination using repeated stackelberg games with unknown player preferences
US11348272B2 (en) Vegetation index calculation apparatus, vegetation index calculation method, and computer readable recording medium
US11688077B2 (en) Adaptive object tracking policy
Mabrok et al. Category theory as a formal mathematical foundation for model-based systems engineering
US9426170B2 (en) Identifying target customers to stem the flow of negative campaign
US20210012191A1 (en) Performing multivariate time series prediction with three-dimensional transformations
US7730000B2 (en) Method of developing solutions for online convex optimization problems when a decision maker has knowledge of all past states and resulting cost functions for previous choices and attempts to make new choices resulting in minimal regret
US20170206560A1 (en) Commercial message planning assistance system and sales prediction assistance system
Barr et al. Stone Soup open source framework for tracking and state estimation: enhancements and applications
El Ghaoui et al. Robust solutions to markov decision problems with uncertain transition matrices
Rhinehart et al. Intrinsic control of variational beliefs in dynamic partially-observed visual environments
Nilim et al. Robust markov decision processes with uncertain transition matrices
US20220221287A1 (en) Moving number estimating device, moving number estimating method, and moving number estimating program
Ziel Smoothed bernstein online aggregation for day-ahead electricity demand forecasting
Santos Jr et al. Capturing a Commander's decision making style
US20230351281A1 (en) Information processing device, machine learning method, and information processing method
McEneaney et al. Value-based control of the observation-decision process
Gorgan et al. Grid based environment application development methodology
Zhang et al. A Game-Theoretic Framework for AI Governance
CN114936263A (en) Retrieval method, device, equipment and medium for spatial target orbit forecast data

Legal Events

Date Code Title Description
AS Assignment

Owner name: HONEYWELL INTERNATIONAL INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:JOHNSON, DANIEL P.;REEL/FRAME:021764/0636

Effective date: 20081030

CC Certificate of correction
REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20160131