# Reviewing Prediction Machines: Cutting Through the AI Hype
Prediction Machines: The Simple Economics of Artificial Intelligence. Ajay Agrawal, Joshua Gans, and Avi Goldfarb. Boston, MA: Harvard Business Review Press, 2018.
There is no shortage of opinions about artificial intelligence (AI). Scour the blogs, and you’re bound to find references to both its promises and its perils. Frequently, the predictions are Janus-faced. Artificial intelligence will eliminate human jobs, and artificial intelligence will create human jobs. It heralds a new industrial revolution, and its impact will be constrained by its significant limitations. Such conflicting rhetoric appears in the military sphere, too. Will artificial intelligence lead to a post-strategy era, or is it but a military enabler? Will artificial intelligence lift the fog of battle or cloud our understanding of battle? Will it lead to a more humane battlefield or eliminate humanity from the battlefield altogether?
The cacophony can be deafening. Russian President Vladimir Putin’s pronouncement in 2017 that “[the leader in artificial intelligence] will be the ruler of the world,” or even Google CEO Sundar Pichai’s 2016 announcement of a new AI-first strategy, amplifies the discord by injecting anxiety into the current frenzy. Little wonder, then, that some have started applying the faux-wisdom of Will Ferrell’s Talladega Nights character Ricky Bobby to artificial intelligence, frantically proclaiming, “If you’re not first, you’re last.” But these trite soundbites and taglines do little to resolve the fundamental questions: What is artificial intelligence, and how will it affect us?
Ajay Agrawal, Joshua Gans, and Avi Goldfarb offer a valuable, fresh perspective, cutting through the hype in their recent Prediction Machines: The Simple Economics of Artificial Intelligence. All three authors are economists at the University of Toronto’s Rotman School of Management, and they all have experience nurturing artificial intelligence start-ups at the Creative Destruction Lab. For this trio, the key to understanding artificial intelligence is to reduce it to a simple supply-and-demand curve.
Artificial intelligence is about generating predictions. “[It] takes information you have, often called ‘data,’ and uses it to generate information you don’t have.” Historically, collecting and parsing data, constructing models, and employing the resident statistical expertise to offer intelligible interpretations demanded significant resources. But what happens if the cost of prediction falls substantially? According to the law of supply and demand, cheaper equals more; hence, “Cheaper prediction will mean more predictions.” Understanding the implications of “more predictions” is the challenge.
Agrawal, Gans, and Goldfarb tackle it admirably. The book is organized into 19 brief chapters, grouped into five parts. Because it’s written primarily for a business audience, the discussion focuses on identifying emerging opportunities in artificial intelligence that “are likely to deliver the highest return on investment.” Hence, examples in the text tend to focus on C-suite strategies and business applications such as reducing customer churn or predicting credit card fraud.
Nonetheless, by successfully reframing artificial intelligence tools as cheap prediction machines, the trio of economists offer several critical insights that are as applicable to military deliberations as they are to discussions in the boardroom. I summarize three below: predictions will be situation-specific, predictions will sometimes be wrong, and decisions will still require human judgement.
Predictions Will Be Situation-Specific
Unlike guesses, predictions require data. More data provide more opportunities to discover critical linkages, generating better predictions. In the past, analytic techniques such as multivariate regression constrained the amount of data that could be scoured for correlations. Consequently, these techniques relied on an analyst’s intuition or hypothesis, and they functioned only as an average, potentially never actually yielding a correct answer. Not so with modern techniques in artificial intelligence, techniques feasting on the immense data sets and complex interactions that would otherwise overwhelm classic statistical models. Data have therefore been likened to “the new oil”: without them, the machine of artificial intelligence would grind to a halt. But, as Agrawal, Gans, and Goldfarb remind us, not all data are created equal.
Data must be tailored to the task at hand. Asking an artificial intelligence to predict whether the pixels in an image (information we know) correspond to a cat (information we don’t know) will not necessarily help when trying to predict if another group of pixels in another image corresponds to a vehicle-borne improvised explosive device. Similarly, training a system to play Go and then asking it to play the much simpler game of Tic-Tac-Toe would still cause it to crash. Despite the common objectives of recognizing a specified object or winning a game, the data are not interchangeable: each data set serves only the prediction for which it was collected.
Regrettably, Agrawal, Gans, and Goldfarb may contribute to some of the confusion by occasionally oversimplifying the prediction problem. For example, in their discussion of autonomous driving, the authors identify only a single necessary prediction, “What would a human do?” While framing the problem this way may help an engineer move beyond a rules-based programming decision tree, to be relevant the prediction demands additional nuance. For example, “What would a human do if a truck pulled out in front of him or her?” Only then can the data be searched for similar situations to generate a usable prediction. Without the nuance, the data collected by Tesla of humans driving their electric vehicles could be deemed equally applicable to soldiers driving their tanks on a battlefield.
Not only are data specific to the prediction, but the problems to which we can apply artificial intelligence are also situation-specific. Building on Donald Rumsfeld’s oft-repeated taxonomy of known knowns, known unknowns, and unknown unknowns, the trio of economists add another category: unknown knowns. For Agrawal, Gans, and Goldfarb, known knowns represent a sweet spot for artificial intelligence—the data are rich and we are confident in the predictions. In contrast, neither known unknowns nor unknown unknowns are suitable for artificial intelligence. In the former, there are insufficient data to generate a prediction—perhaps the event is too rare, as may often be the case for military planning and deliberations. In the latter, the requirement for a prediction isn’t even specified, a situation described by Taleb’s black swan. In the final case of unknown knowns, the data may be plentiful and we may be confident in the prediction, but the answer can be very wrong due to unrecognized gaps in the data set, such as omitted variables and counterfactuals that can contribute to problems of reverse causality.
Consequently, current artificial intelligence prediction machines represent “point solutions.” They are optimized for known known situations with plentiful data relevant to specific, understood workflows. To understand how an artificial intelligence tool may function within a specific workflow, the authors introduce the useful concept of a canvas that helps “decompose tasks in order to understand the potential role of a prediction machine,” the importance and availability of data to support it, and the desired outcome. The most important element of the artificial intelligence canvas, though, is the core prediction. Its identification and accurate specification for the task at hand are essential. Otherwise, the entire artificial intelligence strategy can be derailed.
Predictions Will Sometimes Be Wrong
The tools of artificial intelligence rely on available data to generate a prediction. Agrawal, Gans, and Goldfarb identify three types of necessary data: training, input, and feedback. The tool is developed using training data and fed input data to generate its prediction. Feedback data from the generated prediction are then used to further improve the algorithm.
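The three roles of data can be made concrete with a toy example. The following is a minimal sketch, not anything taken from the book: the `FrequencyPredictor` class, its method names, and the transaction labels are all hypothetical, and a simple frequency count stands in for a real learning algorithm.

```python
from collections import Counter, defaultdict


class FrequencyPredictor:
    """Toy prediction machine (hypothetical): predicts the label seen most often."""

    def __init__(self):
        # Maps each feature to a tally of the labels observed with it.
        self.counts = defaultdict(Counter)

    def train(self, examples):
        """Training data: (feature, label) pairs used to build the model."""
        for feature, label in examples:
            self.counts[feature][label] += 1

    def predict(self, feature):
        """Input data: a feature whose label we do not yet know."""
        seen = self.counts.get(feature)
        if not seen:
            return None  # no relevant data for this situation, so no prediction
        return seen.most_common(1)[0][0]

    def feedback(self, feature, actual_label):
        """Feedback data: the observed outcome, folded back into the model."""
        self.counts[feature][actual_label] += 1


machine = FrequencyPredictor()
machine.train([("large purchase abroad", "fraud"), ("small local purchase", "ok")])

first = machine.predict("large purchase abroad")   # based on training data alone

# Real-world use reveals the early prediction was too pessimistic.
machine.feedback("large purchase abroad", "ok")
machine.feedback("large purchase abroad", "ok")
second = machine.predict("large purchase abroad")  # revised after feedback

unseen = machine.predict("tank on a battlefield")  # outside the data: None
```

In this sketch the prediction for the same input changes once feedback accumulates, and an input unlike anything in the data yields no prediction at all.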
More and richer training data generally contribute to better predictions, but collecting data can be resource intensive, constraining the data available for initial training. Feedback data fill the gap, allowing the prediction machine to continue learning. But that feedback data must come from use in the real world. Consequently, the predictions of artificial intelligence are more likely to be wrong when the tool is first fielded. “Determining what constitutes good enough [for initial release] is a critical decision.” What is the acceptable error rate, and who makes that determination?
Even if data are plentiful and the algorithm is refined, flawed data will still yield incorrect predictions. Additionally, it’s important to remember that all data are vulnerable to manipulation, which would significantly degrade the tools of artificial intelligence. For example, feeding corrupt input data into a prediction machine could crash an artificial intelligence tool. Alternatively, the input data could be subtly altered such that an artificial intelligence tool will continue to function while generating bad predictions. By altering just a few pixels in ways imperceptible to the human eye, researchers at the Massachusetts Institute of Technology successfully fooled one of Google’s object recognition tools into predicting an image of four machine guns was actually a helicopter. Similarly, feedback data can be manipulated to alter the performance of an artificial intelligence tool, as was observed in Microsoft’s failed Twitter chatbot, Tay. Training data introduce their own vulnerabilities into artificial intelligence: an adversary can interrogate the algorithm, bombarding it with input data while monitoring the output in order to reverse-engineer the prediction machine. Once the inner workings are understood, the tool becomes susceptible to additional manipulation.
Detecting flawed predictions, either due to inadequate learning or adversarial data manipulation, poses a significant challenge. It’s impossible to open the “black box” of an artificial intelligence and identify “what causes what.” While DARPA is trying to resolve this shortcoming, presently the only way to validate whether the predictions are accurate is to study the generated predictions. Agrawal, Gans, and Goldfarb suggest constructing a hypothesis to test for flawed predictions and hidden biases, and then feeding select input data into the prediction machine to test the hypothesis. However, since “we are most likely to deploy prediction machines in situations where prediction is hard,” the authors acknowledge that hypothesis testing of these complex predictions may prove exceptionally difficult. This challenge may be further exacerbated in military-specific scenarios due to the lethal outcomes that often characterize perplexing military problems.
Decisions Based on Predictions Will Still Require Human Judgement
For all the promise of more, better, and cheaper predictions, the decisions based on those predictions will still require human judgement. In fact, just as the value of creamer rises with the value of coffee (they are economic complements), we can expect the value of human judgement to rise as predictions generated by artificial intelligence become more prevalent.
Predictions are but an input into eventual decisions and associated actions. I could estimate the likelihood of my car breaking down in the next six months, an unexpected overseas relocation, or the costs of my kids’ college education (that I predict I’ll have to help finance), but these predictions may not alter my decision to purchase a new car, because they don’t determine the value I’ve assigned to the outcome of driving a new car. The process of assigning that value—the associated reward or payoff—is a distinctly human judgement, and one that varies among individuals.
In the past, these prediction and judgement inputs into our decisions were obscured because we often performed both simultaneously in our head. However, the outsourcing of the prediction function to the new tools of artificial intelligence forces us “to examine the anatomy of a decision” and acknowledge the distinction.
Herein lies an essential point and the crux of the argument put forward by Agrawal, Gans, and Goldfarb. For all the popular talk of artificial intelligence displacing humans, the three economists assert “prediction machines are a tool for humans,” and “humans are needed to weigh outcomes and impose judgement.” Humans decide what constitutes a best outcome based on the predictions. Moreover, more predictions will yield more payoffs for humans to evaluate and more decisions for humans to make.
Occasionally, an appropriate payoff based on a prediction generated by artificial intelligence can be predetermined and the resulting decision coded into the machine. In these cases, because the prediction dictates the decision, the task itself is ripe for automation. But more often, situations are complex and prediction is hard. As identified above, these are the situations where we are most likely to introduce prediction machines, and the residual uncertainty of the prediction can actually necessitate greater human judgement because the prediction, even if generated through artificial intelligence, may not always be correct. Thus, rather than eliminating the human, artificial intelligence often places an even greater imperative on the human remaining within the system.
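The separation of prediction from judgement can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not the authors' own formulation: the function names, the probabilities, and the payoff values are all hypothetical, and expected payoff is just one simple way to combine the two inputs.

```python
def expected_payoffs(prediction, payoffs):
    """Weigh each action's payoffs by the predicted probability of each outcome."""
    return {
        action: sum(prediction[outcome] * value for outcome, value in by_outcome.items())
        for action, by_outcome in payoffs.items()
    }


def decide(prediction, payoffs):
    """Pick the action with the highest expected payoff."""
    scores = expected_payoffs(prediction, payoffs)
    return max(scores, key=scores.get)


# Prediction (from the machine): probability of each outcome. Hypothetical numbers.
prediction = {"fraud": 0.1, "legitimate": 0.9}

# Judgement (from a human): the payoff of each action under each outcome.
lenient = {
    "approve": {"fraud": -100.0, "legitimate": 5.0},
    "decline": {"fraud": 0.0, "legitimate": -20.0},
}
# A second human weighs the cost of approving a fraudulent charge far more heavily.
cautious = {
    "approve": {"fraud": -1000.0, "legitimate": 5.0},
    "decline": {"fraud": 0.0, "legitimate": -20.0},
}

lenient_choice = decide(prediction, lenient)    # "approve"
cautious_choice = decide(prediction, cautious)  # "decline"
```

The prediction is identical in both cases; only the human-assigned payoffs differ, and they alone flip the decision. Once a payoff table like this is fixed in advance, the decision follows mechanically from the prediction, which is the situation the text describes as ripe for automation.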
Still, the fact that human skills remain essential to the process does not necessarily dictate the same humans be retained in the process. In their final chapter assessing the broader societal impacts of a future dominated by artificial intelligence, Agrawal, Gans, and Goldfarb conclude “the key policy question isn’t about whether AI will bring benefits but about how those benefits will be distributed.” As these tools become more prevalent, individuals will have to learn new skills, and in the process income inequality may be temporarily exacerbated. “Reward function engineers,” those who understand “both the objectives of the organization and the capabilities of the machines” and who can therefore provide the necessary judgement to help guide decisions based on the various predictions, will likely flourish. It’s likely that within the military, strategic and operational planners, as well as subject matter experts, will serve as these essential reward function engineers.
Our current and near-future artificial intelligence tools are idiot savants. Give them a problem and data for which they are trained, and they will perform remarkably; give them a problem for which they are ill-equipped, and they will fail stupendously. It doesn’t matter if the tool is designed for business or national defense.
Too often in the public discourse, artificial intelligence is portrayed as magical fairy dust that should be applied liberally to our most challenging problems. Agrawal, Gans, and Goldfarb’s Prediction Machines dismisses this fallacy. Although written for a business audience, its insights are not confined to the boardroom. Prediction Machines provides a compelling, fresh perspective to help us understand what artificial intelligence is and its potential impact on our world. The text is essential reading for those grappling to make sense of the field.
For Agrawal, Gans, and Goldfarb, artificial intelligence is simply a prediction machine: it uses information we possess to generate information we do not possess. This simple realization immediately refocuses contemporary discussions and guides fruitful development of artificial intelligence. It underscores the situation-specific nature of its data and tools. It discloses its fallibility. And it reveals the role of predictions in our decision process, not as determinants but rather as inputs that must be evaluated according to our uniquely human judgement. According to the three economists, that is the “most significant implication of prediction machines”: they “increase the value of judgement.”
Those humans and their judgement may not always be apparent once the tools of artificial intelligence are released into the wild. But they are there. And it is our challenge to seek out and locate those humans, because it is they, not the machines, who determine what’s best for all of us.
Steven Fino is an officer in the United States Air Force. He is the author of Tiger Check: Automating the US Air Force Fighter Pilot in Air-to-Air Combat, 1950-1980. The opinions expressed here are his own and do not reflect the official position of the U.S. Air Force, the Department of Defense, or the U.S. Government.
This article appeared originally at Strategy Bridge.
 Agrawal, Ajay, Joshua Gans, and Avi Goldfarb, Prediction Machines: The Simple Economics of Artificial Intelligence (Boston: Harvard Business Review Press, 2018), 24.
 Ibid., 14.
 Ibid., 3.
 Ibid., 33-34.
 Ibid., 43.
 Ibid., 14.
 Ibid., 59.
 Ibid., 60.
 Ibid., 62-63.
 Ibid., 130.
 Ibid., 134.
 Ibid., 43.
 Ibid., 185.
 Ibid., 200.
 Ibid., 204.
 Ibid., 203-4.
 Ibid., 197.
 Ibid., 197-98. “Some in the computer science community call this ‘AI neuroscience.’”
 Ibid., 200.
 Ibid., 15, 19-20.
 Ibid., 74.
 Ibid., 94.
 Ibid., 83.
 Ibid., 91.
 Ibid., 149. The authors also provide a useful example of bank tellers and ATMs, pp. 171-72.
 Mindell, David A., Our Robots, Ourselves: Robotics and the Myths of Autonomy (New York: Viking, 2015), 10.
 Agrawal et al., 151.
 Ibid., 213. Italics in original.
 Ibid., 214.
 Ibid., 18.
 Mindell, 13, 15. Mindell similarly challenges his readers to ask, “Where are the people? Which people are they? What are they doing? When are they doing it? … How does human experience change? And why does it matter?” when investigating autonomy. Italics in original.