























































Title: Autonomous Experimentation Systems for Materials Development: A Community Perspective

Authors: Eric Stach,^1,2 Brian DeCost,^3 A. Gilad Kusne,^3 Jason Hattrick-Simpers,^3 Keith A. Brown,^4 Kristofer G. Reyes,^5 Joshua Schrier,^6 Simon Billinge,^7,8 Tonio Buonassisi,^9 Ian Foster,^10,11,12 Carla P. Gomes,^13 John M. Gregoire,^14 Apurva Mehta,^15 Joseph Montoya,^16 Elsa Olivetti,^17 Chiwoo Park,^18 Eli Rotenberg,^19 Semion K. Saikin,^20 Sylvia Smullin,^21 Valentin Stanev,^22 Benji Maruyama^23,*
research objective using AI and machine learning (ML). While AI and ML are often used interchangeably, we will use the broader term AI to emphasize algorithms used for decision-making in experiments; ML will refer to a subset of methods that include interpolation, classification, and statistical inference. Previous efforts to speed research include high-throughput and combinatorial (HT/Combi) approaches,^21 integrated computational materials engineering (ICME),^22 and the use of AI and ML methods to mine existing databases to identify potential compounds and processes.^23-26 While these efforts are powerful for exploring materials parameter spaces and producing and analyzing large amounts of data, they have low iteration rates (related to the “Analysis Bottleneck”^27), where interpreting results and planning further iterations are the rate-limiting factors. In contrast, AE systems can execute tens or hundreds of iterations without human intervention, making exceptional speed and high fidelity in research results possible. Indeed, the value proposition of AE lies in the advantages of the autonomous iterative loop; when properly designed, the loop can advance research progress much faster than current methods, make better use of human researcher time and effort, allow for novel unanticipated findings, and enable a better understanding of a system, all while expending fewer resources. Highly autonomous systems also allow experiments to be performed remotely,^28 making AE highly accessible to the broad community. The intent of this review is to inform the broad materials community about the current status and future directions of materials AE from researchers active in the area. After presenting some concepts to help the general readership appreciate AE campaigns, we will briefly look at previous attempts to speed research.
We will then illustrate the state of the art of AE systems for materials using select examples, describe how AI technologies are being applied to materials AE, and consider the impact of AE on materials research. Lastly, we will set out a future vision for how to expand and exploit AE.
2. MORE ABOUT AUTONOMOUS EXPERIMENTATION

For those less familiar with autonomy research, we will explain additional concepts to complement the AE campaign described in the Introduction and Figure 1; further background theory can be found in the literature.^11,19,25,29 We will also look at how AE can enhance the efforts of human researchers.

2.1 Campaign Objective

The campaign objective is the goal of the iterative search process, which comprises a series of experiments termed an experimental campaign. The campaign objective is designed by human researchers in the first step of developing an AE system. In its most basic form, the objective can be the optimization of a property,^30 testing a hypothesis,^31 or the prediction of a result (e.g., in an early campaign, an AE system was tasked with closely predicting the growth rate of carbon nanotubes using prior experiments).^10

2.2 Analysis and Understanding (Knowledge Representation)
In our AE schema, the initial results and raw data from experiments are analyzed, or processed into information that can be exploited for decision-making. This could be, for example, translating force–displacement data to a material stiffness. During analysis, AI and other statistical methods may be used to identify trends or anomalies in data, categorize regions where experiments are prone to fail, detect fundamentally different system responses, or build beliefs into hyperparameters for models. The results of the analyses are captured into a knowledge representation, which is a machine-interpretable model of the information gained from past experiments, including the mapping from inputs to outputs.^20,32 As the campaign advances, the internal knowledge representation, capturing understanding of the system under study, evolves to include newly observed data. The term "knowledge representation" is used for both the model of understanding and for how new data is captured by the model. The difference between the experimental results and the expected results based on the knowledge-representation model of the previous loop can be thought of as a feedback signal, and it can be used as the basis for training subsequent models.^29,33

2.3 Design of the Planning Algorithms

The design of the planning algorithms requires careful consideration of the design policy, which is directly related to the field of optimal experimental design.^29,34,35 Throughout the execution of a campaign, the task of achieving the research objective (such as minimizing a response) is often in tension with resolving the uncertainties inherent in the AE system's knowledge representation. This tension, also known as the exploration–exploitation dilemma in the AI community,^25,36-39 fundamentally arises from the limited and uncertain knowledge the autonomous system has about the physical system under study.
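The tension between exploiting current knowledge and reducing uncertainty can be made concrete with a small sketch. The snippet below is an illustration only: it assumes the knowledge representation already supplies a predicted mean and standard deviation for each candidate experiment, and it uses a lower-confidence-bound rule, a common heuristic that the text does not name. The candidate labels, the numbers, and the `kappa` parameter are all hypothetical.

```python
# Hypothetical beliefs about a response we want to minimize:
# candidate label -> (predicted mean, predictive standard deviation).
beliefs = {
    "A": (1.0, 0.1),
    "B": (1.2, 0.9),
    "C": (0.8, 0.05),
    "D": (1.5, 1.5),
    "E": (0.9, 0.3),
}


def lower_confidence_bound(mean, std, kappa):
    """Optimistic estimate of how low the response could plausibly be."""
    return mean - kappa * std


def choose_next(beliefs, kappa):
    """Return the candidate with the smallest lower confidence bound.

    kappa = 0 ignores uncertainty entirely (pure exploitation);
    larger kappa rewards uncertain candidates (more exploration).
    """
    return min(
        beliefs,
        key=lambda name: lower_confidence_bound(*beliefs[name], kappa),
    )


if __name__ == "__main__":
    print("exploit:", choose_next(beliefs, kappa=0.0))
    print("explore:", choose_next(beliefs, kappa=2.0))
```

With `kappa=0.0` the rule picks the candidate with the lowest predicted mean; with `kappa=2.0` it picks the most uncertain candidate, even though that candidate's predicted mean is the worst of the five.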
The system may choose to perform experiments that are more tailored to reducing overall uncertainties and to searching for new minima (exploration), or it may choose to perform experiments near minima predicted based on current knowledge, uncertainties in that knowledge notwithstanding (exploitation). A balance between these two modes is often more efficient than the decoupled alternative, in which the response function is learned globally prior to optimization.^30 For an AE system to be autonomous, the planning algorithms should be able to function at a certain depth of intelligence; while a simple home thermostat can act on its own, its degree of intelligence is limited. Profound AI for AE can include logical reasoning, independent hypothesis generation and testing, understanding by analogy, the ability to extrapolate concepts, and the ability to design experiments to discern complex relationships efficiently and effectively, among myriad possible outcomes. Because of this versatility, decision authority can be delegated to the AI planner, making the iterative research loop possible. Additionally, the planning and analysis algorithms should be able to integrate contextual information and experimental uncertainties, e.g., intrinsic variability in the materials phenomena themselves, noise from the feedback characterization tools, or the influence of exogenous parameters we do not control or measure.

2.4 Human–Machine Teaming and Deciding on the Decision-Maker
performed; (ii) modeling and simulation as a substitute for slow and costly experiments; and (iii) data science methods to extract information from simulation and experimental data. The groundwork for these developments was laid in part by the Materials Genome Initiative (MGI),^4,45,46 as well as by similar initiatives worldwide.^47,48 We will briefly review the state of each of these technology areas to clarify their contribution to AE.

3.1 From HT/Combi Experimentation to AE

Traditional HT/Combi experiments expedite materials science discovery by parallelizing materials synthesis, processing, and characterization.^49 A typical HT/Combi experiment starts with the automated synthesis of a set of 10^1–10^2 samples, in which some combination of composition, microstructure, and processing has been systematically varied to cover the entire parameter space of interest. This library of samples is then screened either in parallel or serially using a set of automated measurement tools. HT/Combi experimental campaigns are typically limited to one or a few iterations of libraries. Some representative recent examples^50,51 of this for materials are reviewed by Green et al.^49 Historically, HT experimentation (HTE) hardware development has focused on increasing the number of experimental results per unit time and decreasing the cost per experiment. This makes sense in a non-autonomous (“open-loop”) scenario, where the goal is either to obtain the composition–processing–structure–property linkages by an exhaustive experimental search or to generate a sufficiently large dataset that can be used post hoc to determine these linkages and provide information on regions of optimal performance for subsequent study. The shift towards autonomous approaches may eliminate the need for many experiments and instead favor faster turnaround for smaller batches of targeted experiments, as new results can be incorporated into the experiment planning.
In a non-iterative (“open-loop”) system, AI can be used to intelligently guide an automated characterization tool to subsample a pre-deposited compositional spread library, realizing a 2× to 10× decrease in the number of samples required to extract information from the system.^49 Some relevant examples of the trend towards lower-throughput, low-latency, small-batch laboratory automation include 3D-printed carousels for performing iterative syntheses of gold nanoparticles to obtain a desired spectrum,^16 dexterous, free-roaming robot chemists that synthesize and characterize small batches of photocatalysts,^52 one-at-a-time synthesis and optoelectronic characterization of perovskite thin films,^14 and the iterative synthesis of perovskite nanocrystals.^53 Microfluidic flow chemistry targeting nanocrystalline materials is especially amenable to this type of approach, as the products can be observed in iteratively changed conditions.^54-56 While AE has certainly built upon the techniques used in traditional combinatorial experiments, AE campaigns can be designed to adapt the experimental sampling as needed; replicates can be made where uncertainties are high, while redundant information can be minimized.

3.2 From Modeling and Simulation to AE

Physics-based modeling is a mature field, and it is now widely accepted that simulations can identify possible materials of interest.^57 This is exemplified in national efforts, such as the MGI,^4,45,46 as well as by large-scale computational materials database/repositories,^58 such as the
Materials Project,^46 AflowLib,^59 Open Quantum Materials Database,^60 the Harvard Clean Energy Project (for solar materials),^61 and the NOMAD repository.^62 Rich toolsets have been developed for facilitating large-scale computation and data archiving, such as ChemML^63 and Atomate.^64 Whereas past efforts have focused on making predictions that are subsequently tested in the laboratory, autonomy enables the incorporation of this information into the ongoing experimental process. That is, simulations are used to select better experiments, and simultaneously incoming experimental data are used to select more informative simulations, in a closed-loop process. A notable recent example of this idea is in the use of density functional theory (DFT) alloy thermodynamics as a probabilistic constraint in the (experimental) Bayesian optimization (BO) of perovskite alloys for structure and stability.^65 More examples of these techniques in the context of AE campaigns will be presented in Current and Related AI Technologies for Materials AE.

3.3 From Data Science Methodologies to AE

The use of ML and AI methods for materials applications is now well established and is the topic of recent reviews.^24,40,66-70 Their use in accelerating tasks in materials research can be broadly classified as learning to “see” (e.g., spectral interpretation), learning to “estimate” (e.g., surrogate models for predicting outcomes), and learning to “search” (e.g., optimization).^71 Many ML predictions of new materials and properties have been confirmed experimentally.^72,73 In addition to the use of these methods on simulation and experimental data, they have been used to process other sources of information, such as the natural language text descriptions of synthesis conditions and properties in published papers^65 and structured data showing the relationships between known materials.^74 In addition to mere prediction, ML approaches can play a role in facilitating human understanding.
Relevant examples include the use of machine-learned natural language models to provide automated summarization of material properties,^75 collaborative human–algorithm optimization approaches,^43 and explainable AI (XAI) methods.^76,77 ML and AI methods provide a necessary foundation for the planning and analysis algorithms of AE systems.

4. STATE OF THE ART THROUGH A SELECTION OF AE EXAMPLES

AE for materials is a quickly developing field with new systems coming online with increasing frequency. In order to separate the abstract capabilities of the continually evolving robotic systems from the discrete achievements, we will view this progress through the lens of a selection of completed AE research campaigns (see Table 1). One overarching theme to note is that reports of fully autonomous systems are often closely preceded by related advances in hardware automation, in ML-driven experimental planning, or in both, but without full autonomy. These related non-autonomous advances along with certain efforts towards AE will also be included to better illustrate the current state of materials AE development.

4.1 The First Reported AE System for Materials Development, ARES

Soon after realizing an automated system to map reaction conditions for carbon nanotube (CNT) growth,^78 Nikolaev et al. reported the first AE system for materials development (see Table 1, Study A).^10 Using in-situ Raman spectroscopy to monitor CNT growth,^79 their Autonomous
the parameter space while random sampling only explored 22%. Perhaps more importantly, the algorithmic sampling achieved the same performance in 128 experiments as the random sampling achieved in 1000.

4.4 AE for Additive Manufacturing: Bayesian Optimization versus Grid-Based Exploration

Building on the trend of introducing new categories of experiments in an autonomous context while benchmarking against traditional techniques, Gongora et al.^30 developed BEAR, a robotic manufacturing and testing system to autonomously optimize the toughness of additively manufactured components (BEAR = Bayesian Experimental Autonomous Researcher; see Figure 3 and Table 1, Study D). As part of the initial demonstration to study components defined by four geometric parameters, the authors included an explicit comparison between experimental campaigns guided by BO and those guided by grid-based exploration, revealing the time- and cost-efficiency of AE. What the grid-based system achieved in about a month, the Bayesian system accomplished in just 12 h; after 24 h, the Bayesian method produced a higher toughness performance than that achieved by the month-long grid-based search. They have now extended their work to include finite-element modeling of the physical response, successfully increasing the toughness by another 30% (see Figure 3, Conclude panel).^87

4.5 AE for Thin Films

There has been a sustained effort by multiple research groups to develop AE to synthesize and study functional thin films for energy applications. Once again, examples in automation and HTE came first. In 2019, Sun et al. developed an HT process that allowed the synthesis and characterization of 75 unique compositions of perovskite-inspired inorganic films over a span of two months.^88 Following these results, Langner et al. developed a robotic system to synthesize polymer blends for organic photovoltaics and to study degradation in a totally automated fashion, at ~300 samples per day.
The resulting large dataset in a four-dimensional parameter space of compositional blends was used to simulate autonomous campaigns, which suggested that a self-driving laboratory could achieve equivalent performance in this space with 32 times fewer experiments.^89 A fully autonomous realization of functional films was published shortly thereafter by MacLeod et al., in which they reported a robotic system moving between synthesis, processing, and multiple characterization stations (Table 1, Study E). By guiding this system with BO through two 35-sample experimental campaigns, they optimized the hole mobility of an organic semiconductor film. Significantly, they also identified a region that exhibits a previously unknown local maximum in mobility.^14

4.6 AE for Quantum Dots

In addition to films, quantum dots (QDs) have been the subject of advances in both automation and, recently, autonomy. As far back as 2010, HT synthesis had been applied to map the synthetic parameter space corresponding to QDs.^90 Efforts to screen QDs continue with recent reports on metal–halide QDs.^91 Recently, the concept of automated QD synthesis was combined with an ML-guided experimental planner to realize an artificial chemist for optimizing QD synthesis (Table 1, Study F).^55 This system utilized flow reactors to study a variety of decision-making policies in a BO framework. Further, the study showed that learning can be accelerated by at least two-fold when the knowledge of one set of precursors was transferred to a different set of precursors.

4.7 Developments in Characterization and Analytical Methods in Efforts towards AE

In some cases, efforts towards autonomy in the study of complex properties involve innovative approaches to assess properties. Kirman et al. employed optical observation of crystallization to identify novel perovskites.^92 HT experiments were made possible by using instrumentation developed for protein crystallography studies. ML was applied both to optically analyze samples to evaluate crystallization and to build a predictive model of whether samples would crystallize. Independently, Li et al. also combined robotic synthesis with ML-based experimental selection for perovskite synthetic studies.^93 While their analysis involved a number of manual steps including visual inspection, their experimental selection leveraged a previously developed experimental planner termed ESCALATE (Experiment Specification, Capture And Laboratory Automation Technology).^94 Efforts towards materials AE need not originate from a synthetic viewpoint; the active guidance of analytical systems can itself accelerate the characterization process. For instance, Noack et al. demonstrated how a kriging-based approach could accelerate X-ray scattering experiments by selecting the parameters of subsequent experiments.^95 This approach was experimentally validated through a set of campaigns, each with 600 experiments, on a sample composed of nanoparticles; a reduction in error was observed when the system was guided by active learning (AL), where the ML model’s uncertainty and expected value are used to select new data points. This study highlights a challenge inherent to benchmarking experimental-learning-based studies; comparisons can only be made to previously reported experiments.
More recently, real-time control over X-ray measurements was combined with synthetic capabilities by Rakita et al. to dynamically adjust the redox state of compounds in solution.^96 While this approach only featured a single dimension of control (the presence of reducing or oxidizing agents), it is a promising example of how synthesis and characterization can be combined in an autonomous fashion.

4.8 AE and Materials Discovery

Efforts towards AE in materials science have also led to the discovery of new materials. Combining HTE and ML, Ren et al. discovered a new metallic glass using an iterative approach and an ML model for experimental selection.^97 Many important materials properties are intimately tied to the structure. As such, learning the relationship between the structure of a material and how it is formed (i.e., a phase map) can serve as a blueprint for guiding materials discovery and optimization. Kusne et al.^12 developed CAMEO (Closed-loop, Autonomous system for Materials Exploration and Optimization), an AE system that maximizes overall knowledge of the composition–structure relationship (Table 1, Study H). By controlling synchrotron X-ray diffraction measurements and exploiting phase-map knowledge, they identified a novel phase-change material, which has recently attracted attention in the electronics industry.^98 Further, recent reports of AE systems using first-principles simulation provide more evidence that this approach is amenable to the rapid discovery of novel materials formulations.^18
making them useful to closed-loop techniques. For example, in BEAR (Figure 3 and Table 1, Study D),^30,87 the mechanical performance of a manufactured structure is viewed as an experimental response function over such structures and is modeled as a random function using a Gaussian process (GP) model.^32 Used with the expected improvement (EI) policy, in which sampling is pursued at the point most likely to maximize improvement of a value, this GP model is used to select the next structure to test.^104 GP models with the EI policy or similar modeling and policy choices are attractive because of the modeling and computational ease. The GP model allows the specification of the assumed structure, such as smoothness, of the response function without being overly restrictive. However, in many materials systems, such assumptions are not globally accurate. The archetypal example of this is critical phenomena. Critical regions of experiment space (e.g., delineating regimes of pressure or temperature) result in responses that change rapidly or discontinuously, which cannot be properly modeled using off-the-shelf GP models. This is not isolated to the use of GP models in BO. Many AL and DOE methods ultimately rely on similar types of generic models. For example, uncertainty-based methods^105-107 often rely on GP or linear models to model responses. Another feature not immediately captured with off-the-shelf methods is the fact that experiments often yield several types of responses, e.g., various characterizations, experiment failure, experimental time or cost, or an uncontrolled factor, such as laboratory humidity. More complex models are needed to properly capture the relationships between the different responses, as well as the uncertainties between these relationships. A joint description capturing a variety of measurable responses and phenomena may not be easy to work with.
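The GP-plus-EI selection step described above can be sketched in a few lines. This is a minimal, standard-library illustration under stated assumptions, not the BEAR implementation: it assumes a one-dimensional design variable, a zero-mean GP prior with a squared-exponential kernel (whose `length` parameter encodes the smoothness assumption), and nearly noise-free measurements. The observations, the candidate grid, and every function name are hypothetical.

```python
import math


def rbf(x1, x2, length=0.3, variance=1.0):
    """Squared-exponential kernel; `length` encodes the assumed smoothness."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gp_posterior(X, y, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, solve(K, y)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, math.sqrt(max(var, 0.0))


def expected_improvement(mean, std, best):
    """Expected improvement over `best` when maximizing the response."""
    if std == 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf


def select_next(X, y, candidates):
    """Pick the candidate with the largest expected improvement."""
    best = max(y)
    return max(
        candidates,
        key=lambda x: expected_improvement(*gp_posterior(X, y, x), best),
    )


# Hypothetical observations: design parameter -> measured response.
X_obs = [0.1, 0.4, 0.9]
y_obs = [0.2, 0.8, 0.3]
candidates = [i / 20 for i in range(21)]

if __name__ == "__main__":
    print("next design to test:", select_next(X_obs, y_obs, candidates))
```

Because the kernel imposes one global length scale, this sketch shares the limitation the text points out: a response that changes discontinuously in some region would be smoothed over.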
An alternative direction is to utilize an ensemble of more traditional models, each offering simple estimates of the functions of interest; however, the lack of formalism makes inference and predictions more difficult. For example, Powell and Reyes and co-workers^108,109 describe methods for using an ensemble of physics-based kinetic models to represent beliefs on experimental responses. Other models such as ensembles of neural networks^55 can directly offer multivariate predictions for several types of responses, in which correlations between outputs are emergent rather than having to explicitly couple them statistically. Such networks have already been used in experimental science and control settings.^110 In a broader context, ensemble-based methods could allow us to use a variety of different types of models in a single decision-making framework. Here, methods such as Bayesian hypothesis testing,^111 model averaging,^112 multi-fidelity modeling,^113,114 strategies for multi-fidelity optimization with variable dimensional hierarchical models,^113,115 and multi-information source optimization (MISO),^116 offer potential avenues for more robust modeling and decision-making.

5.2 Reinforcement Learning (RL)

Closely related to closed-loop techniques, such as BO, are reinforcement learning (RL)^29 and optimal control.^35 Markov decision processes (MDPs), a core RL framework, model generic states of a closed-loop campaign, stochastic transitions between states upon taking experimental actions, and rewards or costs incurred when making such transitions,^117 offering a more fluent way of modeling many aspects of materials research. Through RL, MDPs allow an agent to make more operational considerations. RL decisions are obtained by estimating expected future cumulative rewards incurred when pursuing a particular branch of an experimental campaign.
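As a rough illustration of estimating expected cumulative rewards in an MDP, the sketch below runs value iteration on a toy campaign. Everything in it is hypothetical and chosen for readability: the three states, the two actions, the transition probabilities, the per-experiment cost of 1, the payoff of 10 for reaching the goal, and the discount factor. Real campaigns would have far larger state spaces and unknown transition probabilities.

```python
# transitions[state][action] = list of (probability, next_state, reward).
# Each experiment costs 1; reaching "done" pays 10 (so the net reward is 9).
transitions = {
    "start": {
        "explore": [(0.6, "promising", -1.0), (0.4, "start", -1.0)],
        "exploit": [(0.1, "done", 9.0), (0.9, "start", -1.0)],
    },
    "promising": {
        "explore": [(1.0, "promising", -1.0)],
        "exploit": [(0.7, "done", 9.0), (0.3, "promising", -1.0)],
    },
    "done": {},  # terminal state: no further experiments
}


def q_value(transitions, values, state, action, gamma):
    """Expected discounted cumulative reward of taking `action` in `state`."""
    return sum(
        p * (reward + gamma * values[next_state])
        for p, next_state, reward in transitions[state][action]
    )


def value_iteration(transitions, gamma=0.95, tol=1e-9):
    """Iterate the Bellman optimality update until the values stop changing."""
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            if not actions:
                continue
            best = max(
                q_value(transitions, values, state, a, gamma) for a in actions
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            return values


def greedy_policy(transitions, values, gamma=0.95):
    """For each non-terminal state, pick the action with the highest value."""
    return {
        state: max(
            actions,
            key=lambda a: q_value(transitions, values, state, a, gamma),
        )
        for state, actions in transitions.items()
        if actions
    }


if __name__ == "__main__":
    values = value_iteration(transitions)
    print(values)
    print(greedy_policy(transitions, values))
```

In this toy model the computed policy explores from the initial state and exploits once a promising region has been found, because an early exploit attempt rarely succeeds while still incurring the experiment cost.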
Many such techniques do so by approximating a value function (i.e., a measure of how “good” states are) or a policy function (i.e., the expected best action we can take to transition to high-value states). As with BO, learning such functions can be done with generic black-box models or with more problem-specific models that use probabilistic beliefs on response functions, experimental failure, costs, or rewards obtained.

5.3 Deep Learning (DL)

Regardless of the type of modeling, approximating the functions needed to execute decision-making in RL generally requires a significant computational investment. The coupling of deep learning (DL)^118 with RL, so-called deep reinforcement learning (DRL),^119 to calculate DL-model surrogates of value or policy functions may prove useful here. DL models are trained against a large number of state/value pairs. This can be done offline, by considering a large number of potential states a campaign can be in and assuming that a representative set of potential states can be simulated. While this methodology proved successful in the case of AlphaGo^120 and other cases,^121 it remains to be seen whether something similar can be applied in the context of AE. In general, DL methods are also proving useful outside the context of predicting value or policy functions. They work well by self-discovering latent and predictive features from raw, often high-dimensional data.^122 Despite impressive results in many problems, the direct use of DL in materials AE is limited due to the high data requirements needed to train models. Requiring large sets of representative data is somewhat antithetical to the intelligent and nuanced exploration of experiment space discussed above. There are, however, opportunities for this powerful technique to be used inside the closed loop when simulations and physical models are used to generate synthetic data for offline pre-training of the DL model.
DL can also be used to autonomously analyze rich characterization data, such as microscopy or tomography data, and possibly map such data into signals that the autonomous agent can use to close the loop. Current examples of this use in non-autonomous settings include DL for optimal microscopy,^123 cryo-electron microscopy,^124 and atom probe tomography.^125

5.4 Transfer Learning (TL)

The lack of data is frequently encountered in autonomous research and generally prohibits the outright use of larger DL models. To mitigate this, transfer learning (TL) can be used to leverage existing data of previously studied, related materials systems. One way to do this is with deep transfer learning (DTL).^126 Above, we discussed pre-training a DL model in a way similar to what would be encountered during the online execution of the closed loop. In DTL, a DL model is trained using data obtained from a separate task, often in an unsupervised manner, resulting in a learned latent representation of some material in general. Then, within the closed loop, the model is trained from latent representation features to a material property of interest. Pre-training the mapping from material to latent features reduces the data requirements needed to learn the mapping to the property of interest. Alternatively, one can use adjacent data to build more informative priors for BO models used in closed-loop design. This is the perspective taken by Roy and Kaelbling^127 and applied, for example, to building Bayesian priors for the tribological
phase-change memory material using only one-tenth the number of measurements required by the standard (non-autonomous) grid-based approach. While autonomous systems promise to more reliably perform optimal experiments towards an objective, some have expressed concern that robots will ignore results that are outside the objective but are nonetheless interesting and that they will miss serendipitous and synergistic unanticipated results that a human would naturally recognize.^138 In the future, serendipity-awareness may be incorporated into autonomous research algorithms.^86

6. POTENTIAL IMPACT OF AE ON MATERIALS SCIENCE RESEARCH

6.1 Increased Speed and Decreased Cost

AE promises to disrupt the current research enterprise and investment structure by increasing returns on capital and skilled labor. While it is difficult to quantify the rate of research progress, it is a function of iteration time, number of iterations needed to find a solution, and the unit cost per iteration, all of which are expected to improve as experimental hardware is automated and as closed-loop iterative algorithms are implemented and improved. Early demonstrations of AE, such as the ARES,^10 BEAR,^30 and CAMEO^12 AE systems discussed above, have already demonstrated orders-of-magnitude reductions in iteration time and number of iterations needed to discover and characterize novel functional materials.^12 AE can also achieve better research outcomes than current processes in terms of parameters, such as materials performance or fidelity of characterization.^12,30 The exponential increase in research-progress speed enabled by AE will make research more affordable. Labor dominates the cost of research, and AE can effectively multiply the productivity of an individual researcher; hundreds of experimental iterations can be done in the time and labor it previously took to do one, reducing the marginal cost of subsequent experiments and allowing us to consider the economics of the AE process.
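One simple way to make the dependence on iteration time, iteration count, and unit cost explicit is the linear model sketched below. The text only says that progress is "a function of" these quantities, so the product form, the symbols, and the reading of "about a month" as 30 days of continuous operation are all assumptions made for illustration.

```latex
% Assumed linear model: campaign time and cost scale with the number of
% iterations N, the time per iteration t_iter, and the cost per iteration c_iter.
T_{\text{campaign}} = N \, t_{\text{iter}}, \qquad
C_{\text{campaign}} = N \, c_{\text{iter}}

% Speed-up of an autonomous campaign (AE) relative to a baseline campaign.
S = \frac{T_{\text{base}}}{T_{\text{AE}}}
  = \frac{N_{\text{base}} \, t_{\text{base}}}{N_{\text{AE}} \, t_{\text{AE}}}

% Illustration with the BEAR comparison (about a month versus 12 h),
% taking one month as 30 days of continuous operation.
S \approx \frac{30 \times 24\ \text{h}}{12\ \text{h}} = 60
```

Under these assumptions, a reduction in either the number of iterations or the time per iteration multiplies directly into the overall speed-up, which is why the two improvements compound.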
In principle, AE-enabled research equipment need not be more expensive than traditional equipment; in practice, any additional capital costs are a fixed cost, and the amortized marginal cost is small because of the increased duty cycle. As research becomes more affordable, we expect it to become more accessible, just as computing power became more accessible with low-cost processors.

6.2 Future Directions: Globally Integrated AE Systems and a Rise in Citizen Science

With the increasing speed of research (per researcher), we predict the trend depicted in Figure 4; it assumes each researcher will have access to research robots, which will increase the number of researchers. We expect three phases of AE development stemming from their degree of interconnectedness. Current AE systems are stand-alone and self-contained. In 3–5 years, we anticipate a transition to locally connected systems, where multiple robots can perform mutually dependent research. In 15–20 years, we expect a network of AE systems to be globally integrated, much like the internet is today. Importantly, we envisage network effects for the globally integrated AE systems, where beyond the tipping point, the size and degree of interconnectedness greatly multiply the impact of each new research robot’s contribution to the
network. We can thus expect solutions to currently intractable problems, as a result of leveraging network effects from data sharing and interpretation and a community-driven approach to scientific investigation. Currently, only those with access to large, well-resourced laboratories are able to participate in materials research at the highest level. As a result of the decreased cost and increased accessibility of research, a potential outcome is a rise in citizen science where, as in the astronomy and high-energy physics communities, contributions to the field can be made by enthusiasts with access to data or instruments. In the future, greater access to AE will provide more people the opportunity to access research robots and be able to do meaningful research. This may take the form of remote-access "cloud labs" (e.g., Emerald Cloud Lab, Strateos); low-cost, relatively self-contained networkable benchtop equipment analogous to 3D printers (e.g., modular automated organic synthesizers,^139 ChemPuters^140); or open-access challenges where participants can propose new experiments based on collected datasets (e.g., the DARPA SD Perovskites Synthesis challenge, performed on the RAPID system).^93 With the expected increase in accessibility, AE can help address the lack of diversity in science at the earliest stages. Studies have shown that while children may lose confidence in their potential to be scientists, they do not lose their ability to do science; this was especially evident in underrepresented groups.^141 As AE progresses, it can help make scientific research more accessible and appealing to everyone, especially those at risk.

6.3 Changes in Research Strategies

As AE is applied to more types of materials systems, we must also consider the implications of AE on the design of campaign objectives and search strategies.
Human researchers design experimental campaigns to balance the likelihood of success, potential benefits of success, and explainability of outcomes. Often this takes the form of starting from known experiments^142 and making modifications one variable at a time.^143 This strategy can be effective for local optimizations, but it struggles with multiparameter problems and results in biased datasets.^144 The speed and reduced human effort of AE enable a greater diversity of experiments, and since AI/ML algorithms excel at high-dimensional search problems, they are holistic rather than reductionist. AE also increases the risk appetite per experiment. The failure of one or even several experiments does not doom a campaign. In fact, "failed experiments" can show where experiments do not work and further improve the ML model.^145 Using AE, previously intractable problems become more likely to succeed, and we can pursue more challenging, high-dimensional problems.
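As a toy illustration of the one-variable-at-a-time limitation described above, the following Python sketch compares a one-variable-at-a-time sweep with a search that varies both variables together. Everything in it is an assumption made for illustration: the two-variable `response` function, the 0 to 10 grid, and the function names are hypothetical and are not taken from any AE platform or from the cited studies. The exhaustive joint search is only a stand-in for any strategy that changes several parameters at once; it is not an AI/ML planner.

```python
from itertools import product

# Hypothetical discretized settings for each of two process variables.
GRID = range(0, 11)


def response(x, y):
    """Toy objective with a strong x-y interaction; its single peak is at (8, 8)."""
    return -50 * (x - y) ** 2 - (x + y - 16) ** 2


def ovat_search(start):
    """Sweep one variable at a time, holding the other fixed, until no sweep improves."""
    x, y = start
    improved = True
    while improved:
        improved = False
        # Sweep x with y held fixed; move only on strict improvement.
        best_x = max(GRID, key=lambda v: response(v, y))
        if response(best_x, y) > response(x, y):
            x, improved = best_x, True
        # Sweep y with x held fixed; move only on strict improvement.
        best_y = max(GRID, key=lambda v: response(x, v))
        if response(x, best_y) > response(x, y):
            y, improved = best_y, True
    return (x, y), response(x, y)


def joint_search():
    """Evaluate every combination of both variables (feasible only on a tiny grid)."""
    best = max(product(GRID, GRID), key=lambda p: response(*p))
    return best, response(*best)


if __name__ == "__main__":
    print("one variable at a time:", ovat_search((2, 2)))
    print("joint search:", joint_search())
```

Starting from (2, 2), no change to a single variable improves this toy response, so the one-variable-at-a-time search stalls at its starting point, whereas moving both variables together reaches the peak at (8, 8). On a realistic, higher-dimensional problem exhaustive enumeration is not feasible, which is where a model-guided planner would take the place of `joint_search`.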
7. INVESTMENTS FOR MATERIALS AE

To fully benefit from AE, the community must overcome significant challenges by investing in ways that make experimental hardware, software, and data sharing more suited to AE. Investments in fundamental research typically focus on addressing specific foundational questions. However, investments in AE will establish an infrastructure that will broadly enable faster research on many scientific questions as well as industry-relevant results.
machines and people, as well as the automated tools for constructing and curating these databases.^46 Additionally, efforts to develop uniform metadata descriptions, such as tracking material sources and workflow methodologies, will be needed to improve knowledge representation.

7.3 Investments in Software Infrastructure

To build automated or AE systems that can incorporate multiple commercial systems for synthesis and characterization, more open-source software, data standards, and APIs are needed. Irrespective of whether a system is fully automated, the algorithms used to direct experimental decision-making need to be both robust and flexible enough to be used on a variety of different experimental platforms. Investment is needed in the software infrastructure for materials AE. Atinary (formerly ChemOS),^159 ESCALATE,^94 LabMate.ML,^160 MAOS,^161 BlueSky,^162 and ARES™ OS^28 are examples of such efforts. However, the broader range of materials, modeling software, and experimental hardware will require further investment in software. Commercial hardware often uses proprietary software and data formats that are difficult to access or modify for incorporation into AE systems.

7.4 Collaborations as the Foundation to Future Cooperative Networks

To reach its full potential, AE requires its collaborative network to expand. As the AE infrastructure becomes more accessible with hardware, data-storage, and software investments, efforts should also be made to encourage key stakeholders in research (industry, academia, and government) to work together and take advantage of the increased accessibility. Current collaborations and partnerships will lay the foundations for the network of AE systems we anticipate in the future. Many materials and chemical corporations have HT/Combi research units that collaborate with academic researchers.
Academic research teams are also directly commercializing technologies of AE materials discovery, e.g., ML tools for materials data analysis (Citrine Informatics). The main barrier to effective partnerships between academic and industrial teams is the ownership of co-developed intellectual property. With laboratory capabilities distributed across different entities and open-sourced ML algorithms trained with proprietary data, questions about product ownership and the contributions of involved parties will be a persistent concern, requiring much legal effort to predefine the conditions of each partnership. Programs to standardize these partnerships would be helpful. Government-funded scientific user facilities (SUFs) can play a critical role in encouraging the transition from small-team independent research to cooperative scientific networks. The centralized nature of these facilities offers an opportunity to establish common data formats, data sharing policies, and new access paradigms such as multi-facility proposals. The national laboratory-scale engineering resources can be leveraged to enhance automation and to develop hardware and software standards around which large community-scale AE programs can nucleate. In addition to encouraging collaborations between academic, government, and industrial partners, investments for improving the collaborative interaction between human researchers and
the AI algorithms of AE systems are necessary. Designing effective human–machine teaming is an emerging area in autonomy and user-experience research.^163 With good teaming, humans and chess-playing computers working together outperform either humans alone or computers alone.^164 In the 2005 Freestyle Chess Tournament, a team of chess masters and a supercomputer were defeated by a team of amateur humans and desktop computers with superior teaming. There are ongoing efforts to incorporate human expertise, judgement, and prior knowledge into search and decision-making algorithms.^165 With respect to materials, only nascent efforts at teaming humans with AI exist for inorganic materials.^43

7.5 Educating the Materials–AI Workforce

We have recommended a variety of investments to help establish a new infrastructure for AE to accelerate research progress. This new infrastructure will make possible the globally integrated AE systems we expect in 15–20 years (Figure 4). The systems will be linked together over networks, where experimental, simulation, and information processing nodes combine with human direction to form autonomous "collaboratories,"^166 generating scientific knowledge at rates barely imaginable today. Changes in the workforce behind AE will be required to support this future infrastructure. The current lack of AI and autonomy expertise is a barrier to AE progress. Most individuals in our existing workforce do not have the skillset to do both materials and autonomy research, and universities are just beginning to develop curricula to address computer science and AI for materials research.
With autonomous vehicles and huge demands for AI professionals, the AE community will need substantial programmatic investments to develop a workforce of "AI natives" who are as comfortable doing closed-loop AE as they are doing materials research, as well as value propositions to attract autonomy and AI experts to materials problems.^167 Workforce development and curricular innovation are needed at all levels,^167 but one particularly pressing need is for technicians who can manage the hybrid mechanical–electrical–chemical systems. Because of similarities to workforce needs in advanced manufacturing, there may be opportunities to extend the existing efforts of community colleges.^168

8. CONCLUSION

We hope that this paper informs, sparks interest, and potentially inspires the larger community for AE systems. The first research robots are already making an impact in materials research and development. From optimizing the growth of CNTs to accelerating the understanding of composition–structure–property maps, they are revolutionizing the way scientific research is conducted. Disrupting conventional research methods, AE has demonstrated an increased rate of knowledge generation by orders of magnitude and has resulted in the discovery of new compounds. Broad deployment of AE will require substantial investment in hardware, software, and data infrastructure, as well as in education to overcome technological and workforce challenges. Integrated, online AE systems need to be made cheaper and exponentially more accessible. Upon the demonstration of a sufficient number of AE platforms, funding of large-