There are two fundamentally different ways for an organisation to learn about strategy. The first is by studying history: what worked before, what failed, what patterns emerge from accumulated experience. The second is by creating internal simulation environments where strategies can be tested against plausible competitors and futures before they are deployed. Most organisations rely almost exclusively on the first approach. The most sophisticated rely on both, in concert.
The limitation of historical learning is that it offers only one data source: what actually happened. An organisation can study its past decisions, analyse outcomes, and extract lessons. Yet past experience captures only a narrow slice of possibility. It shows what the organisation did, and what resulted. It cannot easily show what would have happened had different choices been made, what happens if competitors react differently than they have before, or what emerges when multiple uncertainties combine in novel ways.
This is where self-play learning enters. Self-play is a technique for learning through competition or interaction with versions of oneself. A team proposes a strategic move; another team (or a past version of the same team) challenges it; outcomes are examined. The hypothesis embedded in the strategy is tested not against theory but against plausible alternatives. The process accelerates learning because it generates learning experiences far faster than waiting for market outcomes to accumulate.
Consider a simple example. An organisation is considering whether to enter a new market. Historical learning would examine past market entries: which succeeded, which failed, what conditions predicted success. This is valuable but limited. Self-play learning would construct a simulation where one team proposes the entry strategy and another team, briefed on the external environment and playing as a credible competitor, responds. How would established competitors actually react? What moves are available to them? What happens to pricing, customer loyalty, and profitability when they counterattack? The simulation runs not once but many times, with different competitor responses, different market conditions, different execution challenges. Each iteration generates data that historical analysis alone cannot provide.
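The repeated-runs idea can be sketched as a minimal Monte Carlo simulation: each run draws one competitor response and records its effect on entry profitability. Every response type, probability, and payoff number below is an illustrative assumption, not real data.

```python
import random

# Hypothetical wargame sketch: simulate a market entry many times,
# varying how the incumbent competitor responds on each run.
COMPETITOR_RESPONSES = {
    "ignore":    {"prob": 0.2, "margin_hit": 0.0},  # incumbent does nothing
    "price_cut": {"prob": 0.5, "margin_hit": 0.3},  # moderate counterattack
    "price_war": {"prob": 0.3, "margin_hit": 0.6},  # aggressive counterattack
}

def simulate_entry(rng, base_profit=100.0):
    """One run: draw a competitor response and apply its margin impact."""
    draw = rng.random()
    cumulative = 0.0
    for name, resp in COMPETITOR_RESPONSES.items():
        cumulative += resp["prob"]
        if draw < cumulative:
            break  # this response occurred; the last entry is the fallback
    return name, base_profit * (1.0 - resp["margin_hit"])

def run_simulations(n_runs=10_000, seed=42):
    """Many runs: average entry profit under each competitor response."""
    rng = random.Random(seed)
    profits = {}
    for _ in range(n_runs):
        response, profit = simulate_entry(rng)
        profits.setdefault(response, []).append(profit)
    return {r: sum(p) / len(p) for r, p in profits.items()}

if __name__ == "__main__":
    for response, avg in run_simulations().items():
        print(f"{response:9s} avg profit: {avg:6.1f}")
```

A real wargame would replace the random draw with a human team playing the competitor, but the structure is the same: vary the response, observe the outcome, accumulate data no single historical case provides.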
The scientific foundation for this approach is robust. AlphaZero, the general-purpose game-playing system, learned to master chess, shogi, and Go through self-play. Starting from random moves and knowing only the rules of each game, it improved by playing against copies of itself. With each iteration, the system identified weaknesses in its current strategy by discovering positions where it could be defeated. It adapted, tested the adaptation against itself, and iterated. Within hours of training, it surpassed the strongest chess engines, which had been hand-crafted and refined by humans over decades. The system did not learn from studying historical games. It learned from testing hypotheses against itself.
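The core loop can be illustrated far more modestly than AlphaZero itself. The toy below (an assumption for teaching, not the actual algorithm) applies self-play to the classic "guess two-thirds of the average" game: the player repeatedly computes the best response to a frozen copy of its own current strategy and adopts it, ratcheting toward the game's equilibrium at zero.

```python
# Toy self-play sketch, not AlphaZero: learn by best-responding to a
# frozen copy of one's own current strategy. In the two-player game,
# whoever guesses closest to two-thirds of the average of both guesses
# wins, so the best response to a guess s is roughly s / 2.

def best_response(opponent_guess: float, candidates: range) -> int:
    """Pick the candidate closest to 2/3 of the average, assuming the
    opponent (a copy of our current strategy) plays opponent_guess."""
    def distance(x):
        target = (2.0 / 3.0) * (x + opponent_guess) / 2.0
        return abs(x - target)
    return min(candidates, key=distance)

def self_play(initial_guess=100, rounds=20):
    """Each round: play against a frozen copy of the current strategy,
    find where it loses, and adopt the best response as the new strategy."""
    strategy = initial_guess
    history = [strategy]
    for _ in range(rounds):
        strategy = best_response(strategy, range(0, 101))
        history.append(strategy)
    return history

if __name__ == "__main__":
    print(self_play())  # guesses shrink toward the equilibrium at 0
```

The point of the sketch is the loop structure, not the game: each iteration exposes a weakness in the current strategy (it can be undercut) and the fix becomes the next strategy to be attacked.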
The gap between this algorithm and how organisations operate is instructive. Most organisations do not have mechanisms for systematic self-play. Strategy is debated in conference rooms based on past cases and executive intuition. Decisions are then executed in the market. By the time feedback arrives, the investments are committed and change is costly. In contrast, organisations that build simulation and wargaming infrastructure into their strategic process can test strategies repeatedly before commitment. They can stress-test assumptions. They can discover vulnerabilities early, when they are still correctable.
Wargaming is perhaps the most practical form of organisational self-play. A wargame is a structured simulation where teams propose strategic moves and defend them with reasoning, whilst other teams play as competitors or market forces and respond. A neutral facilitator adjudicates outcomes, often informed by probability, expert knowledge, or randomness. What emerges is learning about the dynamics of strategic interaction: how competitors are likely to respond, what trade-offs are inherent in different approaches, where assumptions prove fragile when challenged by intelligent opposition.
The power of wargaming lies not in prediction but in perspective-taking and assumption-surfacing. When a manager proposes a strategy and another manager, briefed as the competitor, responds with an unexpected move, the simulation creates immediate discomfort. It reveals that assumptions that felt certain were actually contingent. It exposes blind spots. It teaches humility about how the world actually works versus how we imagine it works. This learning is felt viscerally in ways that reading case studies never achieves.
Research on wargaming confirms these benefits. Organisations that conduct strategy simulations before committing to major investments report faster decision-making when crises arrive, deeper understanding of competitive dynamics, and higher-quality strategic thinking among participants. The effect persists: managers who have experienced wargaming continue thinking more dynamically about strategy long after the simulation ends. They become more attuned to weak signals. They more readily challenge their own assumptions. They prepare contingency plans for possibilities they had not previously considered.
The mechanics of effective self-play are deceptively simple. First, frame the strategic question clearly: what decision must we make? What are we genuinely uncertain about? Second, construct scenarios around key uncertainties rather than around the most likely future. Third, assign teams to different roles: one team proposes the strategy, another plays as a credible competitor or external force, a third observes and records learning. Fourth, compress time: decisions that would take months in reality can be played out in hours. Fifth, iterate: run the simulation multiple times with variations. Sixth, capture learning in real time: what did we assume that proved wrong? What surprised us? How does this change our strategic thinking?
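The six steps above can be sketched as a bookkeeping scaffold. Real wargames are run by people, so this only shows how the roles and captured learning might be organised; all names and content below are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Scenario:
    question: str        # step 1: the decision to be made
    uncertainties: list  # step 2: what the scenario varies

@dataclass
class Iteration:
    competitor_response: str
    outcome: str
    lessons: list = field(default_factory=list)  # step 6: capture learning

def run_wargame(scenario, competitor_responses, adjudicate):
    """Steps 3-5: one team proposes, a 'competitor' team responds, a
    facilitator adjudicates; iterating quickly compresses time."""
    log = []
    for response in competitor_responses:  # step 5: iterate with variations
        outcome = adjudicate(scenario, response)
        log.append(Iteration(response, outcome))
    return log

# Usage with made-up content:
scenario = Scenario(
    question="Should we enter the adjacent market next year?",
    uncertainties=["incumbent pricing response", "regulatory timing"],
)
log = run_wargame(
    scenario,
    competitor_responses=["match our price", "lock in key customers"],
    adjudicate=lambda s, r: f"contested entry given '{r}'",
)
for it in log:  # step 6, in real time: record what each variation taught
    it.lessons.append("assumption challenged by: " + it.competitor_response)
```

The value of even this trivial scaffold is the discipline it encodes: every iteration forces an explicit competitor response, an adjudicated outcome, and a recorded lesson.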
The constraint is not method but imagination and discipline. Organisations often default to scenarios that are too similar to the current state, or that lack enough friction to test assumptions meaningfully. A simulated competitor that merely mimics current behaviour teaches little. A competitor that recognises the weaknesses in the organisation's proposed move and attacks them systematically creates genuine learning. This requires either domain expertise in the role being played, or willingness to let others play that role credibly.
The connection between self-play learning and organisational culture is important. Self-play requires psychological safety. Teams must be willing to propose strategies they are not certain about, knowing they will be challenged. They must be willing to discover that their assumptions are flawed. Organisations with cultures of blame and defensiveness cannot do this well. They cannot separate challenging a strategy from attacking the person who proposed it. In these environments, wargaming becomes political theatre. In cultures that distinguish between holding ideas lightly and defending them rigorously, self-play becomes powerful.
There is also a connection to continuous improvement. An organisation that runs a wargame once, learns something valuable, then reverts to normal operations captures only a fraction of potential benefit. An organisation that integrates wargaming into its rhythm, that revisits major strategic questions annually or quarterly, that runs simulations whenever significant uncertainty exists or major decisions loom, develops a muscle memory for strategic thinking. Teams become more skilful at the practice. The quality of simulation improves. The learning deepens.
The limitation is that self-play captures the interactions within a defined system. It can test strategic moves against plausible competitors and scenarios, but it cannot imagine disruptions that lie outside the frame entirely. A wargame designed around traditional competitors may not anticipate a technology disruption from outside the industry. Scenario planning and environmental scanning remain essential complements to self-play. The most robust strategic thinking combines learning from the past, simulation of plausible futures, and continuous sensing of weak signals from the edges.
Yet for the specific challenge of testing strategies under realistic competitive and market conditions, self-play offers something historical analysis cannot: the ability to compress experience. An organisation can live through months or years of strategic interaction in days. It can test dozens of hypotheses, each generating learning. It can identify the specific conditions under which a strategy becomes vulnerable, and adjust before commitment. This accelerates the learning cycle and builds strategic resilience.
The shift from relying solely on historical learning to incorporating systematic self-play represents a maturation of how organisations approach strategy. It acknowledges the limits of hindsight and the value of rigorous foresight. It treats the future not as something to be predicted but as something to be rehearsed, tested, and iteratively refined. Organisations that master this practice move strategy from something that happens once a year in planning cycles to something that is continuously practised, tested, and learned from. The result is not perfect prediction. It is better preparation, faster adaptation, and strategic thinking that has been validated in simulation before being deployed in the market.
Links: Matrix Games aid scenario analysis, Multiple futures, Informed decision-making needs narratives and scenarios