Dev Diary #37 - Behavior of Actors 🧠
What's happening / TLDR: Developer diaries introduce details of Espiocracy - a Cold War strategy game in which you play as an intelligence agency. You can catch up with the most important dev diary (The Vision) and find out more on the Steam page.
---
Actors - influential individuals and organizations, capable of changing history - are the main building blocks of the historical simulation in Espiocracy. Today we'll explore their AI.
This is a fascinating engineering problem! We want:
- Believable, interesting, sometimes cunning behavior
- via ~50 adjustable and moddable actions
- for 1500-2500 (!) actors active at any point
- in a real-time-ish game with a very limited computational budget per actor
- without employing a team of AI programmers to craft and maintain large behavior trees or complex state machines
Playing to our strengths, we have:
- Many historical samples. One of the best sources is the CIA archive of Presidential Daily Briefs, which contains thousands of reports on precisely described political activities.
- Unlimited ability to run millions of simulations offline (= during development, before the release), to collect rich statistical data on the effects of any behavior in the game world.
- An audience of players in 2023. Most of them have decent GPUs, 60% have 6+ CPU cores, 93% clock them at 2.3+ GHz, and 69% use 16+ GB of RAM.
If you've been anywhere near computer science in the last few years, you know where this leads: machine learning. The game does not have the computational budget for glamorous deep neural networks, but it has enough for two efficient models (regression & gradient boosting) at the heart of actor behavior. They do all the heavy lifting (training) before the game is installed on your computer and then use this compressed knowledge (inference) to control the decisions of actors during gameplay.
Naturally, these models are embedded into a wider solution that starts off at an unusual place.
[h2]Motivation and Goals of Actors[/h2]
All living organisms have to maintain homeostasis, an internal balance of parameters such as temperature; otherwise, they will die. Peter Sterling and Joseph Eyer proposed that evolution invented brains to switch from reactive to predictive homeostasis (allostasis). After all, organisms cannot react to fatal disruptions - but they can try to predict them. This point of view was picked up in the last decade by the theory of constructed emotion, in which Lisa Feldman Barrett posits that human emotions are predictions, not reactions. Instead of merely responding emotionally to external stimuli, our brains may build internal models of the future and give them meaning that is perceived, i.a., as emotions. Returning to the example of temperature regulation: we evolved from cells that simply swim away from the cold into cavemen who can be happy at the end of a severe winter, while it's still cold, because we understand seasons and know that soon we'll be embraced by warm spring.
Remarkably, this approach to cognition is inversely mirrored by the critique of AI in games. AI is stupid when it can be easily duped (cheesed) by any ploy to which a mere reaction comes too late, when it has no sense of self-preservation, cannot predict the consequences of its actions, remains predictable instead of predictive, or does not creatively prepare for the future.
Therefore, actors in Espiocracy are first and foremost allostatic forecasters.
(To be clear, we are not discussing the AI of rival players competing with a human player - that will be described in a different dev diary. This one is solely about actors, autonomous NPC-like entities.)
On the most basic level (lizard brain), actor behavior is guided by three needs and goals:
- Survival - staying alive or not dissolved
- Growth - increasing influence (which factors in wealth, number of members, etc.)
- Impact - changing the world according to ideologies, views, traits, opportunities, etc.

These are precisely and consistently quantifiable parameters. For instance, the need for survival increases when an actor's life is threatened by enemies, diseases, or the possibility of imprisonment followed by execution. The three parameters together (squared and multiplied by personal weights) contribute to the actor's overall level of motivation. Sufficiently high motivation pushes the actor to launch actions that will meet the needs and goals which caused motivation to increase. If you see remnants of a perceptron here, you're dead right.
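To make the arithmetic concrete, here's a minimal sketch of that calculation in Python - the names, values, and activation threshold are illustrative assumptions, not the actual implementation:
[code]
# Minimal sketch of the motivation "perceptron" described above.
# All names and the threshold are assumptions for illustration.

NEEDS = ("survival", "growth", "impact")

def motivation(needs, weights):
    # Each need/goal level is squared and scaled by a personal weight,
    # then summed into the actor's overall motivation.
    return sum(weights[n] * needs[n] ** 2 for n in NEEDS)

actor_needs = {"survival": 0.8, "growth": 0.3, "impact": 0.1}
actor_weights = {"survival": 1.0, "growth": 0.6, "impact": 0.4}

if motivation(actor_needs, actor_weights) > 0.5:  # assumed threshold
    print("Motivated enough to act on the needs that drove the increase.")
[/code]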
[h2]Predictive AI 1: Acting on Needs and Goals[/h2]
Actions available to actors range from universal (such as fleeing the country) to type-specific (e.g. armed organizations able to raid a place) or even role-specific (a member of government, defined in the constitution, able to propose launching a war). The first example, running away, is easy to understand in the context of motivation. An actor highly worried about survival may flee the country. Once the actor is away from the threat, the need for survival does indeed decrease. Note that actions do not directly satisfy needs or goals - fleeing didn't grant a flat +20 to survival; instead, it worked through the decoupled mechanics that underlie the calculation of survival. (This is the critical ingredient that allows the mechanisms described below to work.)
Many actions, however, are more complicated than a fight-or-flight response.
In real life, this is where we would do the thinking - use domain knowledge and an internal model of the world to predict what a particular action would achieve. Generally, AI programmers replicate this process in code, but Espiocracy, already reconciled with Sutton's bitter lesson, throws a lot of cheap compute at the problem: data about the consequences of actions are collected from many simulation runs (actions or action combinations paired with the corresponding changes of goals and a context vector), then used to train simple regression models (one per action), which regularly infer predicted need-goal satisfaction values for every available action.
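A rough sketch of that pipeline, using scikit-learn stand-ins (the diary doesn't name the actual library, and the data layout is assumed):
[code]
# Offline: one simple regression model per action, trained on samples
# logged from simulation runs. Shapes and names are assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

def train_action_models(samples):
    """samples: {action: (contexts, deltas)} where contexts is (n, features)
    and deltas is (n, 3) - changes to survival, growth, impact."""
    return {action: LinearRegression().fit(X, y)
            for action, (X, y) in samples.items()}

def predict_satisfaction(models, context):
    # Online (inference): predict how much each available action would
    # change the actor's needs and goals in the current context.
    x = np.asarray(context).reshape(1, -1)
    return {action: model.predict(x)[0] for action, model in models.items()}
[/code]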

For the simplest example of an action, we're essentially gathering data on how often (and how much) fleeing the country actually saves an actor's life. Things get interesting when we start to compare actions (as a kind of internal model of the world!): in this case, an actor fighting for survival may plead for help, attack the perceived threat head-on, or resign from the role. This choice is influenced by availability (closed and well-guarded borders make fleeing much harder), traits (some actors will never flee), personal weights of needs and goals, the competence parameter (less competent actors make less optimal decisions), and a small injection of randomness.
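A hedged sketch of how such a comparison might look in code - the scoring formula and noise model are assumptions, not the game's actual math:
[code]
# Hypothetical action selection: weight predicted need-goal deltas by
# personal weights, filter by availability, and inject noise scaled by
# (in)competence so less competent actors pick less optimal actions.
import random

def choose_action(predictions, personal_weights, available, competence):
    """predictions: {action: {need: predicted delta}}; competence in [0, 1]."""
    scores = {}
    for action, deltas in predictions.items():
        if not available.get(action, True):  # e.g. closed borders block fleeing
            continue
        base = sum(personal_weights[n] * d for n, d in deltas.items())
        scores[action] = base + random.uniform(-1.0, 1.0) * (1.0 - competence)
    return max(scores, key=scores.get) if scores else None
[/code]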
Computationally, a lot depends on the context vector and the frequency of inference. Currently, these are tied to a regional level (a few countries), which means that action-need predictions are recalculated when, i.a., a war erupts in the region. I would worry more about it (and massage features in the vector or build a tree of models tailored to actor types) if I didn't have two other major components of the predictive AI.
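For illustration, region-scoped caching with event-triggered re-inference could look like this (the event list and structure are invented for the sketch):
[code]
# Sketch of region-scoped inference: predictions are cached per region
# and recomputed only when a significant event invalidates the cache.
SIGNIFICANT_EVENTS = {"war_erupts", "revolution", "regime_change"}  # assumed

class RegionalPredictions:
    def __init__(self, predict_fn):
        self.predict_fn = predict_fn  # e.g. the per-action regressions above
        self.cache = {}               # region -> {action: predicted deltas}

    def on_event(self, region, event):
        if event in SIGNIFICANT_EVENTS:
            self.cache.pop(region, None)  # force re-inference on next query

    def get(self, region, context):
        if region not in self.cache:
            self.cache[region] = self.predict_fn(context)
        return self.cache[region]
[/code]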
[h2]Predictive AI 2: Simulating the Future[/h2]
Our brains can play out not only the consequences of an action - we can also predict what will happen without our intervention. More than simple extrapolation, we can sense dramatic shifts, such as the consequences of losing an election.
The game regularly prepares simplified simulations of the future at different levels of detail, on four timescales: 2 weeks forward, 2 months, 2 years, and 2 decades. Simulations focus only on the most important events (such as political changes) and approximate their influence on actor goals. Then individual actors - depending on their influence, competence, position, and access to knowledge - can tap into the information about changes to their future needs and goals. This creates lovely emergent motivation: for instance, a political leader anticipating his future loss is prompted to prioritize growth over impact or even survival before an election.
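A toy version of the four horizons and the access-gated "tap in" - the gating formula here is purely an assumption:
[code]
# Toy sketch of the four forecast horizons and access-gated "tapping in".
# The access formula is an assumption; the real gating is more complex.
from datetime import timedelta

HORIZONS = (timedelta(weeks=2), timedelta(weeks=8),
            timedelta(days=2 * 365), timedelta(days=20 * 365))

def tap_in(actor, forecasts):
    """forecasts: per-horizon dicts of predicted need/goal changes.
    Returns only the horizons this actor can actually see."""
    access = actor["influence"] * actor["competence"]  # assumed gating
    visible = []
    for i, (horizon, forecast) in enumerate(zip(HORIZONS, forecasts)):
        if access >= 0.2 * (i + 1):  # deeper futures require more access
            visible.append((horizon, forecast))
    return visible
[/code]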

Of course, the difference between natural prediction and clairvoyant cheating AI lies in the details of implementation. For this reason, simulations are generally about the overt world (e.g. no equivalent of the intelligence mechanic from OpenXcom), and "tapping in" is slowly growing into a complex algorithm (but in an intelligence game, it's more interesting to write code about access to information than about directly scripted behavior!).
[h2]Predictive AI 3: Deeper Actions[/h2]
It's no coincidence that there are three goals and three predictive components. They were aligned in their developmental origin: survival with satisfaction, growth with simulation - and now it's time for impact: deep actions.
Actors intentionally change (impact) history via action-target pairs chosen by the second model, trained to spot changes in alternate histories (instead of changes to needs and goals). Collected data about actions (enriched with flexible vectorized targets) is paired with influence on the State Power Index, both short-term and long-term. In the search for a directional parameter that broadly captures all significant events, from writing influential books to winning the space race, SPI turned out to be a robust-enough approximation. It also ties this more complex model (at the moment, a gradient boosting machine) to the logical goal of making actors try to improve the position of their country relative to other countries (it's not all roses: subverting another country also increases your SPI).
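As a sketch with scikit-learn stand-ins (the diary doesn't name a library, and the feature layout is assumed):
[code]
# Sketch of the deeper-action model: gradient boosting trained to map
# (action, vectorized target) features to State Power Index changes.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def train_spi_models(features, spi_deltas):
    """features: (n, d) action-target vectors; spi_deltas: (n, 2) with
    short- and long-term SPI change. One single-output GBM per horizon."""
    return [GradientBoostingRegressor().fit(features, spi_deltas[:, i])
            for i in range(spi_deltas.shape[1])]

def rank_action_targets(models, candidates):
    # Score each candidate (action, target-vector) pair by its predicted
    # combined short- and long-term SPI impact and pick the best action.
    X = np.asarray([vec for _, vec in candidates])
    scores = sum(m.predict(X) for m in models)
    return candidates[int(np.argmax(scores))][0]
[/code]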

Not all actors will prioritize this kind of impact, due to the weighted nature of needs and goals. Expanding on the tailored approach of all the other, shallower actions (which takes traits and competence into account), here the model chooses actions from a list generated by the actor's ideologies, views, past experiences, and friendships or conflicts with other actors.
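A hypothetical sketch of how that candidate list might be filtered before the model ranks it - every field and rule below is invented for illustration:
[code]
# Hypothetical candidate generation for deep actions: the list is drawn
# from ideologies, views, and relationships, then handed to the SPI
# model for ranking. All fields here are invented examples.
def candidate_deep_actions(actor, action_pool):
    candidates = []
    for action in action_pool:
        if action.get("ideology") and action["ideology"] not in actor["ideologies"]:
            continue  # ideologically alien actions never make the list
        if action.get("target") in actor["friends"]:
            continue  # don't move against friendly actors
        candidates.append(action)
    return candidates
[/code]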
[h2]Final Remarks[/h2]
After staring into the abyss of predictive AI, we'll look into other avenues of actor activity (such as reactions and storytelling features) in the next dev diary, on March 10th.
If you're not already wishlisting Espiocracy, consider doing it:
https://store.steampowered.com/app/1670650/Espiocracy/
There is also a small community around Espiocracy:

---
"Physical concepts are free creations of the human mind, and are not, however they may seem, uniquely determined by the external world. In our endeavor to understand reality we are somewhat like a man trying to understand the mechanism of a closed watch. He sees the face and the moving hands, even hears its ticking, but he has no way to open the case. If he is ingenious he may form some picture of a mechanism which could be responsible for all of the things he observes, but he may never be quite sure his picture is the only one which could explain his observations." - Albert Einstein & Leopold Infeld, 1938