Introduction to Machine Learning (Lecture notes)

An introduction to machine learning, its history, and its applications. It discusses the mathematical models used in machine learning, including linear regression, logistic regression, and support vector machines, and covers the process of learning and estimating statistical models from data. The authors are affiliated with The University of Texas at Austin.

Machine Learning "Summer" School: Introduction
Pradeep Ravikumar, Peter Stone
Department of Computer Science, The University of Texas at Austin

What is Machine Learning?

Artificial Intelligence and Computer Science
• Artificial Intelligence as a field predates that of Computer Science itself
  "an ancient wish to forge the gods" — Pamela McCorduck
• Pioneering researchers in CS were also researchers in AI, e.g. Alan Turing
• Main Hypothesis: If we can build machines that can compute, we can build machines that can think!
• We can now do the former — computing machines — but not the latter — human-level intelligent machines (for general tasks). Why?
  ‣ What to compute (for general human-level intelligence) is not clear, and even when it is, it is (computationally) intractable
Machine Learning in Recent Years
• Machine Learning in recent years:
  ‣ Data-driven science — as a new fourth paradigm of scientific discovery
    (first three science paradigms: experimental, theoretical, computational)
  ‣ Data-driven engineering
  ‣ Philosophy of Data

    "If you asked me to describe the rising philosophy of the day, I'd say it is data-ism.... that data will help us do remarkable things — like foretell the future."
    .... David Brooks

• Modern Machine Learning: Mathematical models learnt from data that characterize the relationships amongst variables in the system

Machine Learning: Models
• A key infrastructural element in machine learning is a mathematical model
  ‣ A mathematical characterization of system(s) of interest (typically via random variables, in which case it is called a statistical model)

Graphical Models
[figure: bipartite graph with symptom nodes S1, S2, S3, S4 and disease nodes D1, D2, D3]
• Symptoms S1, S2: Disease D1?
• With many symptoms and diseases, deduction is difficult
• Prob (D1 | S1, S2)
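To make the Prob (D1 | S1, S2) query concrete, here is a minimal sketch that treats the symptoms as conditionally independent given the disease (a naive-Bayes simplification of the graphical model above). All probabilities in it are made-up illustrative numbers, not values from the slides.

```python
# Illustrative only: a toy model with one disease and two symptoms.
# The prior and conditional probabilities below are made-up numbers.

P_D1 = 0.01                                   # prior probability of disease D1
P_S_given_D1 = {"S1": 0.8, "S2": 0.7}         # P(symptom present | D1 = 1)
P_S_given_not_D1 = {"S1": 0.1, "S2": 0.05}    # P(symptom present | D1 = 0)

def posterior_d1(observed_symptoms):
    """P(D1 = 1 | observed symptoms), assuming symptoms are
    conditionally independent given the disease (naive Bayes)."""
    like_d1, like_not_d1 = P_D1, 1.0 - P_D1
    for s in observed_symptoms:
        like_d1 *= P_S_given_D1[s]
        like_not_d1 *= P_S_given_not_D1[s]
    return like_d1 / (like_d1 + like_not_d1)

print(posterior_d1(["S1", "S2"]))   # Prob(D1 | S1, S2) under this toy model
```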
Machine Learning: Models
• A key infrastructural element in machine learning is a mathematical model
  ‣ A mathematical characterization of system(s) of interest, typically via random variables
  ‣ the model chosen typically depends on the task at hand
• Prediction: Estimate output given input
  ‣ Called classification when output is categorical (stock price up or down)
  ‣ Called regression when output is real-valued (actual stock price value)
  ‣ Models used: Linear Regression, Logistic Regression, Support Vector Machines, …

Machine Learning
• A key infrastructural element in machine learning is a mathematical model
  ‣ A mathematical characterization of system(s) of interest, typically via random variables
• Learning: estimating a statistical model from data
  ‣ From "data" to "model"
• Inference: using the statistical model to infer properties of system(s)
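A minimal sketch of this learn-then-infer workflow, using one of the prediction models listed above (logistic regression) via scikit-learn. The synthetic dataset and all settings below are illustrative assumptions, not part of the slides.

```python
# Minimal sketch: "learning" (fit on training data) and predictive
# "inference" (predict outputs for unseen inputs) with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for real data (arbitrary sizes and seed).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression()            # a parametric statistical model
model.fit(X_train, y_train)             # learning: from "data" to "model"
preds = model.predict(X_test[:5])       # inference: predicted outputs for new inputs
accuracy = model.score(X_test, y_test)  # evaluation on held-out test data
print(preds, f"held-out accuracy: {accuracy:.2f}")
```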
Machine Learning: Learning
[diagram: Training Data → Train → Model; Test Data → Test → Model Evaluation]
• Is the model good enough? Is there enough data?
  ‣ Underfitting vs. overfitting [figure: scikit-learn.org]
• What models allow us to do is generalize from data
• Different models generalize in different ways
  ‣ Simple model: "memorize" all data points, and report nearest neighbor
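The "memorize all data points and report nearest neighbor" model fits in a few lines. The sketch below is a toy 1-nearest-neighbor classifier on made-up points, shown only to illustrate how even this simple model generalizes to new queries.

```python
# Toy 1-nearest-neighbor classifier: "memorize" the training points,
# then label a query with the label of its closest memorized point.
import numpy as np

def predict_1nn(X_train, y_train, x_query):
    distances = np.linalg.norm(X_train - x_query, axis=1)
    return y_train[np.argmin(distances)]

# Made-up training data: two small clusters labeled 0 and 1.
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])

print(predict_1nn(X_train, y_train, np.array([0.1, 0.2])))   # -> 0
print(predict_1nn(X_train, y_train, np.array([0.95, 0.9])))  # -> 1
```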
Machine Learning: Inference
• Inference: using the statistical model to infer properties of system(s)
  ‣ From "model" to "answers"
  ‣ Predictive: e.g. the most probable response value given covariates/inputs
    (What is the most probable disease given the symptoms?)
  ‣ Descriptive: e.g. groupings of variables
    (Inferring communities of users in a social network)

Computational Aspects of ML
• Mathematical part of ML: specifying Models and the resulting learning/inference Algorithms (Estimators); understanding and analyzing properties of Models and Algorithms
• Computational part of ML: performing learning/inference in a computationally tractable way
• Model Learning and Inference can be cast as an optimization problem:
  ‣ minimize/maximize an objective function with respect to some parameters
  ‣ Almost any inference task could be recast as solving an optimization problem — this is sometimes called the "variational viewpoint" of inference
[figure: objective surface drawn as a hill; the maximum is the red dot at the top of the hill]
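As a concrete instance of the optimization view, the sketch below minimizes a least-squares objective for a linear model by plain gradient descent. The synthetic data, step size, and iteration count are arbitrary assumptions for illustration.

```python
# Gradient descent on a least-squares objective f(w) = ||Xw - y||^2 / (2n):
# learning the parameters w of a linear model by minimizing an objective.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=n)    # synthetic data (assumed)

w = np.zeros(d)
step_size = 0.1
for _ in range(500):
    grad = X.T @ (X @ w - y) / n             # gradient of the objective
    w -= step_size * grad                    # step along the negative gradient

print(w)   # should be close to w_true
```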
Modern Machine Learning
• (Parametric/Nonparametric, Bayesian/Frequentist) Statistical Models
• Models under high-dimensional "Big-p" data settings
• Models for Complex Data Types
• Models with the additional facet of spatio-temporality
• Models with the additional facet of actions which feed back to the model
• Models with multiple interacting systems/agents
Mathematical Models: Parametric vs Nonparametric
• Parametric Models: "fixed-size" models that do not "grow" with the data
  ‣ More data just means you learn/fit the model better
  ‣ Example: fitting a simple line (2 params) to a bunch of one-dim. samples
• Nonparametric Models: models that grow with the data
  ‣ More data means a more complex model

Statistical Models: Bayesian vs Frequentist
• Mathematical models in ML are typically described via random variables — in which case they are also called statistical models
• Statistical models are typically specified by unknown parameters (to be learnt from data)
• Frequentist: there exists a "ground-truth" set of unknown parameters that are constant (i.e. not random)
• Bayesian: model parameters are themselves random, and typically specified by their own distribution/statistical model, with their own unknown "hyper-parameters"
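A coin-flip model makes the distinction concrete: the sketch below contrasts a frequentist point estimate of the unknown bias with a Bayesian posterior over it. The observed counts and the Beta prior's hyper-parameters are assumed for illustration.

```python
# Coin-flip model with unknown bias theta = P(heads).
# Frequentist view: theta is a fixed unknown constant; estimate it (MLE).
# Bayesian view: theta is itself random, with a prior whose hyper-parameters
# we choose; observing data yields a posterior distribution over theta.
heads, tails = 7, 3                       # made-up observed data

# Frequentist: maximum-likelihood point estimate of the constant theta.
theta_mle = heads / (heads + tails)

# Bayesian: Beta(a, b) prior (hyper-parameters a, b assumed here),
# conjugate to the Bernoulli likelihood, so the posterior is Beta too.
a, b = 2.0, 2.0                           # assumed hyper-parameters
post_a, post_b = a + heads, b + tails     # posterior is Beta(post_a, post_b)
theta_post_mean = post_a / (post_a + post_b)

print(f"MLE: {theta_mle:.3f}, posterior mean: {theta_post_mean:.3f}")
```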
Complex Data Types
• Standard Machine Learning models work with inputs and outputs that are scalars or vectors in standard Euclidean space
• But increasingly, the data types of the inputs and outputs are not so simple anymore
  ‣ Phylogenetic Trees [wikipedia]
  ‣ Parse Trees [wikipedia]
  ‣ Graph Structure [one-mind]
  ‣ Permutations/Rankings
    [figure: screenshot of web search-engine result listings, illustrating rankings]

Statistical Models for Spatio-temporal Data
• Random vectors characterizing how a system varies as a function of time (and potentially space)
  ‣ Y(t), X(s, t)
• Also called random fields
  ‣ Field in physics: some physical quantity associated with space-time

Models: With Actions
• Models for agents that take actions depending on the current state
  ‣ these actions incur rewards, and affect future states ("feedback")
• Forms the subfield of Reinforcement Learning

Reinforcement Learning
• Supervised learning is mature [WEKA]
• For agents, reinforcement learning is most appropriate
[diagram: agent-environment loop; policy π: S → A; in state s[t] the agent takes action a[t] and receives reward r[t+1]]

Lectures and speakers:
• Parametric Models: Frequentist - Ensemble Methods (Joydeep Ghosh, UT Austin)
• Parametric Models, Optimization - Evolving Neural Networks (Risto Miikkulainen, UT Austin)
• Parametric Models - Deep Learning (Yoshua Bengio, University of Montreal)
• Parametric Models: High-dimensional - Low-rank Matrices (Sujay Sanghavi, UT Austin)
• Parametric Models: Complex Data - Ranking (Ambuj Tewari, University of Michigan)
• Non-parametric Models: Bayesian - Nonparametric Bayesian Models (Peter Mueller, UT Austin)
• Models: With Actions - Multi-step Prediction (Richard Sutton, University of Alberta)
• Models: Multiple Agents - Multi-Agent Systems (Peter Stone, UT Austin)
• Optimization - Convex Optimization (Constantine Caramanis, UT Austin)
• Applications - Natural Language Processing (Raymond Mooney, UT Austin)
• Applications - Computer Vision (Kristen Grauman, UT Austin)
• Applications - Psychology (Tal Yarkoni, UT Austin)
• Applications - Robotics (Peter Stone, UT Austin)

MLSS Austin - by the numbers
• 84 from the US
• 26 International
• 9 Academics
• 15 Industry
• 86 Students
• 110* total registrants

Some Logistics
• Lectures each day in two sessions: morning session from 9:00 - 12:00, and afternoon session from 2:00 - 5:00
• Lunch each day from 12:00 - 2:00 on your own, except Friday Jan 16 when we will be providing a boxed lunch
• Poster Session (optional) from 2:00 - 5:00 on January 12th
• Evenings on your own; our volunteers will lead a night out on Saturday January 10 (optional)