Comments on Michael Jordan's Essay "Artificial Intelligence: The revolution hasn't happened yet"

Emmanuel Candès, John Duchi, Chiara Sabatti
Stanford University

March 2019

We praise Jordan for bringing much-needed clarity about the current status of Artificial Intelligence (AI), what it currently is and what it is not, as well as for explaining the challenges lying ahead and outlining what is missing and remains to be done. Jordan makes several claims supported by a list of talking points that we hope will reach a wide audience; ideally, that audience will include academic, university, and governmental leaders, at a time when significant resources are being allocated to AI for research and education.

The importance of clarity

Jordan makes the point of being precise about the history of the term AI, and distinguishes several activities taking place under the AI umbrella term. Is it all right to use AI as a label for all of these different activities? Jordan seems to think it is not, and we agree. To begin with, words are not simple aseptic names; they matter, and they convey meaning (as any branding expert knows). To quote Heidegger: "Man acts as though he were the shaper and master of language, while in fact language remains the master of man." In this instance, we believe that mislabeling generates confusion, which has consequences for research and educational programming. Mislabeling and lack of historical knowledge obscure the areas in which we must educate students.

Jordan argues "that most of what is being called AI today, particularly in the public sphere, is what has been called Machine Learning (ML) for the past several decades." This is a fair point. Now, what has made ML so successful? What are the disciplines supporting ML and providing a good basis to understand the challenges, open problems, and limitations of current techniques? A quick look at major machine learning textbooks reveals that they all begin with a treatment of what one might term basic statistical tools (linear models, generalized linear models, logistic regression), as well as a treatment of cross-validation, overfitting, and related statistical concepts; we also find chapters on probability theory and probabilistic modeling. How about engineering disciplines? Clearly, progress in optimization, particularly in convex optimization, has fueled ML algorithms for the last two decades. When we think about setting up educational programs, clarity means recognizing that statistical, probabilistic, and algorithmic reasoning have been successful, and that it is crucial for us to train researchers in these disciplines to make further progress and understand the limits of current tools.
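As a concrete illustration of the point about textbooks, here is a minimal sketch, in Python with scikit-learn (our choice of library and synthetic data; the essay names neither), of the statistical workflow those opening chapters teach: a logistic regression, fit by solving a convex optimization problem, with cross-validation used to estimate out-of-sample accuracy and guard against overfitting.

```python
# A minimal sketch (our illustration): the statistical core of a typical
# ML-textbook opening chapter. A linear model is fit by convex optimization
# and evaluated with cross-validation to guard against overfitting.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data standing in for any real dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

model = LogisticRegression()  # fitting it solves a convex optimization problem

# In-sample accuracy alone can be optimistic (overfitting) ...
in_sample = model.fit(X, y).score(X, y)

# ... so 5-fold cross-validation estimates out-of-sample performance instead.
cv_scores = cross_val_score(model, X, y, cv=5)

print(f"in-sample accuracy:       {in_sample:.3f}")
print(f"cross-validated accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

Everything here is estimation and honest assessment of generalization; nothing in it is "intelligence" in the everyday sense.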
At the research level, different fields of research (e.g., optimization, control, statistics) use similar tools. These research communities, however, have distinct intellectual agendas and work on very different problems; by all being in "AI," we obscure what progress is missing and what still remains to be solved, making it harder for institutions and society to choose how to invest wisely and effectively in research.

Mislabeling also hides the fact that a self-driving car requires more than just a good vision system: it will require roads and all kinds of additional infrastructure. Mislabeling hides the fact that, even when we write that an "artificial intelligence" system recommends a diet [6], it is not AI that performs a study of gut microbiomes, measures their variety, evaluates insulin and sugar responses to different foods, or even fits the model, which, in this case, is a gradient-boosted decision tree [7]. Mislabeling also hides that machine learning should not be an end in itself: just getting people what they want faster (better ads, better search results, better movies, algorithms for more addictive "handles" in songs) does not make us better. What would make us better is a deep investment in real-world problems and collaboration between methods scientists (ML researchers) and domain scientists, for instance, studying the persistent degradation of our oceans and recommending actions, or investigating susceptibility to and effective treatments for opioid addiction.

An important confusion Jordan addresses is the sense of over-achievement that the use of the term AI conveys. Bluntly, we do not have intelligent machines. We have many unsolved problems. We particularly applaud the recognition that much progress is needed in terms of "inferring and representing causality." This is an area where the ingredients that have made AI very successful (trillions of examples, immense compute power, and fairly narrow tasks) have limited applicability. To recognize whether or not a cat is in an image, the machine does not reason; rather, it does (sophisticated) pattern matching. Pearl describes "the ability of imagining things that are not there" as a distinctive characteristic of human reasoning, and he sees this counterfactual reasoning as the foundation of the ability to think causally; this is absent from the current predictive machine learning toolbox.

The role statistics can play

In contrast, counterfactual reasoning and imagining what is not there (yet might be) are not foreign to statistics. Statistics has grappled for many years with the challenge of searching for causal relations: emphasizing (sometimes stiflingly) how these cannot be deduced from simple association, developing randomized trial frameworks, and introducing the idea of "confounders." Consider the Neyman-Rubin potential outcomes model, which effectively asks: what would have been my response, had I taken the treatment? Or consider the statistical approaches to estimating the unseen number of species, or the "dark figure" of unrecorded victims of a certain crime. More generally, the foundations of statistical inference are built precisely on the ability to imagine the sample values one might obtain if one were to repeat an experiment or a data collection procedure.
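To fix ideas, here is a minimal rendering of the potential outcomes model just mentioned (the notation below is ours, not the essay's):

```latex
% Potential outcomes (Neyman-Rubin), in our notation.
% Each unit i has two potential responses, only one of which is observed:
%   Y_i(1): the response had unit i taken the treatment;
%   Y_i(0): the response had it not.
% With a treatment indicator W_i \in \{0,1\}, the observed response is
\[
  Y_i^{\mathrm{obs}} = W_i \, Y_i(1) + (1 - W_i) \, Y_i(0),
\]
% and the unit-level causal effect is the contrast
\[
  \tau_i = Y_i(1) - Y_i(0).
\]
% One of the two terms in \tau_i is always counterfactual ("what would
% have been my response, had I taken the treatment?"); randomized
% assignment of W_i is what allows the average effect E[\tau_i] to be
% estimated from observed data alone.
```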
Recognizing how statistics incorporates this fundamental characteristic of human intelligence makes us think about its potential in accompanying the development of our data-laden society; we enumerate a few directions in which we think statistical reasoning is likely to be fruitful.

1. Robustness: As systems based on data interface more and more with the world, it is important that we build them to be robust. It is not sufficient to achieve reasonable performance on a hold-out dataset; we would like to retain predictive power when circumstances are subject to reasonable changes. (Think of high-profile failures: in 2015, software engineer Jacky Alciné pointed out that the image recognition algorithms in Google Photos were classifying his black friends as "gorillas.") Statistical reasoning and tools (for example: can we have "good enough" performance 99% of the time? can we be confident in our predictions, and how confident?) will be important.

2. Validity of algorithmic inferences: Algorithmic techniques to infer patterns and structure have had exceptional success recently in many areas of practical value. They can also be important,