Introduction to Machine Learning Actual
Examination
What is ML? ANSWER>>>> A data science technique that extracts patterns from data to
forecast future outcomes, behaviors and trends
What are some applications of ML? ANSWER>>>> Natural language processing, Computer
vision, analytics, decision making
What is AI (artificial intelligence)? ANSWER>>>> a broad term that refers to computers
thinking like humans
What is ML (Machine Learning)? ANSWER>>>> subcategory of AI that involves learning
from data without being explicitly programmed
What is DL (Deep Learning)? ANSWER>>>> subcategory of machine learning that uses a
layered neural network architecture inspired by the human brain
What are the steps of the data science process? ANSWER>>>> Collect data, prepare data, train
model, evaluate model, deploy model, retrain model
What types of data does ML deal with? ANSWER>>>> Numerical, Time-series, Categorical,
Text, Image
What are some hallmarks of tabular data? ANSWER>>>> It is arranged in a data table with
rows and columns
What do the rows in tabular data represent? ANSWER>>>> single items (entities)
What do the columns in tabular data represent? ANSWER>>>> properties of items
What is the importance of vectors in ML? ANSWER>>>> Vectors are used heavily to represent many
things. Non-numerical data types are often converted into representative
numerical vectors
What are the two main approaches of scaling data? ANSWER>>>> Standardization and
Normalization
What is standardization? ANSWER>>>> a method of scaling data to have mean = 0 and
std. deviation = 1
What is normalization? ANSWER>>>> A method of scaling data into the range [0,1]
How are values modified in standardization? ANSWER>>>> (x - mean) / std. deviation for value x
How are values modified in normalization? ANSWER>>>> (x-xmin)/(xmax-xmin) for values x
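The two scaling methods can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function names and the sample `heights` values are made up, the population standard deviation is assumed, and the code assumes the values are not all identical (otherwise it would divide by zero).

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(x - mu) / sigma for x in values]

def normalize(values):
    """Rescale values into the range [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]

heights = [150.0, 160.0, 170.0, 180.0, 190.0]  # made-up sample data
z_scores = standardize(heights)
scaled = normalize(heights)
```

After standardization the values are centered on 0; after normalization the smallest value maps to 0 and the largest to 1.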
What are the two approaches to encoding categorical data? ANSWER>>>> Ordinal encoding and
one hot encoding
What is ordinal encoding? ANSWER>>>> converting categories into numerical values, first
category is represented by 0, second by 1 ... etc.
What is one hot encoding? ANSWER>>>> each possible value for a category gets its own
column and receives either a 1 or a 0 as its value depending on whether the entity is
part of that category or not
What is the drawback of ordinal encoding? ANSWER>>>> implicitly assumes an order and
relative importance between categories even when none exists (the first category is
encoded as 0 and the second as 1, so a model may treat the second as greater than the first)
What is the drawback of one hot encoding? ANSWER>>>> large number of columns generated
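Both encodings can be sketched in a few lines of Python. The helper names and the sample `colors` column are hypothetical, and categories are numbered in order of first appearance, which is an assumption of this sketch.

```python
def ordinal_encode(values):
    """Map each distinct category to an integer in order of first appearance."""
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes))
    return [codes[v] for v in values], codes

def one_hot_encode(values):
    """Give each distinct category its own 0/1 column."""
    categories = list(dict.fromkeys(values))  # distinct values, first-appearance order
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories

colors = ["red", "green", "blue", "green"]  # made-up sample column
ordinal, mapping = ordinal_encode(colors)
one_hot, columns = one_hot_encode(colors)
```

With three distinct colors, ordinal encoding yields a single integer column, while one hot encoding yields three columns; the column count grows with the number of categories.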
How is image data represented? ANSWER>>>> Image data is represented by pixels
What is depth in terms of image data? ANSWER>>>> Depth is how many channels the data
has. RGB has depth of 3 and grayscale has depth 1
How is image data vectorized? ANSWER>>>> Each pixel value is addressed by [row, column,
color channel]. The 3-D array has size [height][width][channel depth], so a 4x4 color
image has size [4][4][3].
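As a rough illustration of the [height][width][channel depth] layout, the sketch below builds a 4x4 color image as nested Python lists. Real pipelines typically use an array library instead; the pixel values here are made up.

```python
height, width, depth = 4, 4, 3  # a 4x4 color image with 3 channels

# image[row][col][channel]; every pixel starts as black (0, 0, 0)
image = [[[0 for _ in range(depth)] for _ in range(width)] for _ in range(height)]
image[0][1] = [255, 0, 0]  # set the pixel at row 0, column 1 to pure red

shape = (len(image), len(image[0]), len(image[0][0]))
flat = [v for row in image for pixel in row for v in pixel]  # flattened 1-D vector
```

Flattening the 4x4x3 structure gives a vector of 48 numbers, which is one way such an image can be fed to a model that expects a flat input.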
When it comes to image preprocessing, what are some of the transformations used?
ANSWER>>>> Rotation, cropping, resizing, denoising, centering, normalizing, making
the aspect ratio uniform
How do you make image data have uniform aspect ratio? ANSWER>>>> make sure it's a square
matrix
How do you normalize image data? ANSWER>>>> Subtract the mean pixel value in a channel
from each pixel value in that channel
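Per-channel mean subtraction might look like the following sketch. The function name and the tiny 2x2 image are hypothetical, and the image is assumed to be a nested list indexed as [row][column][channel].

```python
def subtract_channel_means(image):
    """Center each channel by subtracting that channel's mean pixel value."""
    depth = len(image[0][0])
    n_pixels = len(image) * len(image[0])
    means = [
        sum(pixel[c] for row in image for pixel in row) / n_pixels
        for c in range(depth)
    ]
    centered = [
        [[pixel[c] - means[c] for c in range(depth)] for pixel in row]
        for row in image
    ]
    return centered, means

# made-up 2x2 image with 3 channels
tiny = [[[10, 0, 100], [20, 0, 100]],
        [[30, 0, 100], [40, 0, 100]]]
centered, means = subtract_channel_means(tiny)
```

After this step each channel has a mean of zero across the image.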
How do you normalize text data? ANSWER>>>> Transform it into canonical form. Multiple
spellings are reduced to a single spelling (colour becomes color), and different forms
are reduced to a single form ('is, am, are' all become 'be')
What is lemmatization when it comes to text data? ANSWER>>>> reduces multiple inflections
to a single dictionary form
What are stop words? ANSWER>>>> high-frequency words (such as 'the' or 'is') that are
unwanted during text analysis
What does it mean to tokenize a string? ANSWER>>>> either split each string of text into a
list of smaller parts (tokens) or split a sentence into separate keywords
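Tokenization and stop word removal can be sketched as below. The regular expression and the tiny `STOP_WORDS` set are illustrative assumptions; real stop word lists are much longer and depend on the task.

```python
import re

STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to"}  # tiny illustrative list

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]

tokens = tokenize("The colour of the sky is blue.")
keywords = remove_stop_words(tokens)
```

The first step turns the sentence into a list of lowercase words; the second keeps only the words that carry content.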
How do you vectorize text data? ANSWER>>>> Identify the particular features of the text
that are relevant to the task, then extract those features in a numerical form that is
accessible to an ML algorithm, via TF-IDF or word embedding
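What is TF-IDF and how does it work? ANSWER>>>> TF-IDF stands for term frequency-inverse
document frequency. It weights a term by how often it appears in a document and discounts
terms that appear in many documents, so it assigns less importance to common words or
words that contain less information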
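The weighting can be sketched with one common textbook variant: tf = count / document length and idf = log(N / document frequency). Libraries often use smoothed variants, so treat the exact formula, the function name, and the sample documents as assumptions.

```python
import math

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = {}
    for doc in docs:
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1  # documents containing term
    weights = []
    for doc in docs:
        scores = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)           # term frequency in this document
            idf = math.log(n_docs / doc_freq[term])   # rarer terms get a larger idf
            scores[term] = tf * idf
        weights.append(scores)
    return weights

# made-up, already tokenized documents
docs = [["the", "cat", "sat"], ["the", "dog", "barked"], ["the", "cat", "purred"]]
weights = tf_idf(docs)
```

In this sample, "the" appears in every document and gets weight 0, while "sat" appears in only one document and outweighs "cat", which appears in two.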
How does feature extraction for text data work? ANSWER>>>> Vectors can be visualized on a
graph, the distance between two vectors is used to assess similarity in meaning
or some connection.
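One common way to compare two text vectors is cosine similarity, used here in place of a raw distance as an assumption of this sketch. The three 3-dimensional vectors are invented for illustration and are not real embeddings.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# made-up vectors standing in for word embeddings
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
banana = [0.1, 0.05, 0.95]

related = cosine_similarity(king, queen)
unrelated = cosine_similarity(king, banana)
```

Vectors pointing in nearly the same direction score close to 1, so `related` comes out much higher than `unrelated`.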
Identify the pipeline for text data ANSWER>>>> Preprocessing and normalizing,
tokenization, stop word removal etc. -> feature extraction and vectorization
(TF-IDF, GloVe, Word2Vec) -> feed the vectorized documents and labels into the model
and train the model
What is the statistical perspective of ML? ANSWER>>>> y= f(x), output is dependent as a
function of the input and you are looking to find the function
What is the computer science perspective of ML? ANSWER>>>> output = Program(input features).
Data inputs (input features) are used to train a model to find the correct outputs
(sometimes given): use the input features to create a program that can generate the
desired output
What are notebooks? ANSWER>>>> Documenting tool that others can use to reproduce
experiments, it's a combination of runnable code, output, formatted text and
visualizations that is made up of one or more cells that allow execution of
individual code snippets and chunks. Output of each cell can be saved and viewed
by others
Explain how each of the following components helps you perform ML runs in Azure ML: [Notebooks, automated ML, designer, datasets, experiments, models, endpoints, compute, datastores]
ANSWER>>>> Notebooks - Sample notebooks and user files loaded inside of compute
instances
Automated ML - Can automate intensive tasks that rapidly iterate over many
combinations of algorithms, hyperparameters to find the best model based on the
chosen metric
- Create new runs and view previous runs in the Automated ML tab
Designer - Drag-and-drop tool that lets you create ML models without any code
- Has templates and can view drafts
Datasets - Create datasets from local files, datastores, etc
Experiments - Helps organize runs
- All runs must be associated with an experiment, can view all runs related to an
experiment
Models - Models are produced by runs in Azure ML, all models created in Azure or
trained outside of Azure are accessible here
Endpoints - Exposes real-time endpoints for scoring or pipelines for advanced
automation
Compute - Designated compute resource where you run training script or host
service deployment
- Manage compute instance, training cluster, inference cluster, attached compute
Datastores - Attached storage account in which you can store datasets
What is the difference between a model and an algorithm? ANSWER>>>> Models are specific representations learned from data; algorithms are the processes of learning. Model = Algorithm(data)
What is a linear regression model? ANSWER>>>> It predicts a variable y from an input variable x and assumes a simple linear relationship
What are the forms for simple and multiple linear regressions? ANSWER>>>> Simple: Y = B0 + B1X
Multiple: Y = B0 + B1X1 + B2X2 + ... + BnXn
What does training a linear regression model entail? ANSWER>>>> Finding the coefficients that best represent the input variables, minimizing the error between the line of best fit and each data point
How do you prepare data for a linear regression? ANSWER>>>> Check the linear assumption, remove noise, remove collinearity, fit a Gaussian distribution, rescale inputs
What is the formula to determine Root Mean Squared Error (RMSE)? ANSWER>>>> RMSE = sqrt(sum((predicted - actual)^2) / (# of datapoints))
What is a learning function used for? ANSWER>>>> To learn a useful transformation of the input data that gets us closer to our expected output
What is irreducible error? ANSWER>>>> The ever-present error in a predicted value because it is predicted from a limited dataset
What is irreducible error caused by? ANSWER>>>> The data collection process:
- not enough data
- not enough data features
What is the difference between irreducible error and model error? ANSWER>>>> Model error is how different the predictions are from the actual output; it can be reduced by refining the model learning process
What is a parametric function? ANSWER>>>> Parametric functions simplify the mapping to a known functional form; the general form is known, and learning computes the coefficients and constants
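Training a simple linear regression (Y = B0 + B1X) and scoring it with RMSE, the square root of the mean squared difference between predictions and actual values, can be sketched as follows. Ordinary least squares is assumed as the fitting method, and the function names and sample data are made up.

```python
import math

def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares estimates of B0 and B1 for Y = B0 + B1*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

def rmse(predicted, actual):
    """Square root of the mean squared difference between predictions and targets."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(actual))

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # made-up data lying exactly on y = 1 + 2x
b0, b1 = fit_simple_linear_regression(xs, ys)
predictions = [b0 + b1 * x for x in xs]
error = rmse(predictions, ys)
```

Because the sample points lie exactly on a line, the fit recovers B0 = 1 and B1 = 2 and the RMSE is zero; noisy data would leave a positive RMSE.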
What is a non-parametric function? ANSWER>>>> No assumptions are made regarding the form of the mapping between input and output data; the relationship is free-form and can have any functional form
What are the benefits and limitations of parametric functions? ANSWER>>>> Benefits: simpler, easier to understand and interpret, faster learning from data, less training data required to learn the mapping function
Limitations: highly constrained to the specific form of the function, limited complexity, poor fit in practice, not everything fits the underlying mapping function
What are the benefits and limitations of non-parametric functions? ANSWER>>>> Benefits: highly flexible, can fit a large # of functional forms, makes no assumptions on the underlying function, high performance in the prediction models produced
Limitations: more training data needed, slower to train, generally has many parameters, risks overfitting the training data
What are some properties of classical ML? ANSWER>>>>
- based on classical mathematical algorithms
- more suitable for small data
- easier to interpret outcomes
- cheaper to perform
- can run on low-end machines
- doesn't require large amounts of computational power
- difficult to learn large datasets
- requires feature engineering
- difficult to learn complex features
What are some properties of deep learning? ANSWER>>>>
- based on neural networks
- suitable for high complexity problems
- better accuracy than classical ML
- better support for big data
- complex features can be learned
- difficult to explain trained data
- requires significant computational power
What is supervised learning and what approaches does it entail? ANSWER>>>> Learns from both inputs and expected outputs; datasets are labeled
Approaches: classification, regression, similarity learning, feature learning, anomaly detection
Passive process where learning is performed without any actions that could influence the data
What is unsupervised learning and what approaches does it entail? ANSWER>>>> Learns from data that only contains inputs (unlabeled); finds hidden structures and relationships in data to train the model
Approaches: clustering, feature learning, anomaly detection
Passive process where learning is performed without any actions that could influence the data
What is reinforcement learning and what approaches does it entail? ANSWER>>>> Learns how an agent should take action in an environment to maximize a reward function
Approach: Markov Decision Process
Active process where the actions of the agent influence the data observed in the future, hence influencing its own potential future states
What distinguishes Classification algorithms? ANSWER>>>> outputs are categorical
What distinguishes Regression algorithms? ANSWER>>>> outputs are numerical and continuous
What distinguishes Clustering algorithms? ANSWER>>>> They find inherent groups or clusters in the data and assign entities to each cluster/group
What are feature learning algorithms? ANSWER>>>> Algorithms in which features are discovered or learned from the data
What is anomaly detection? ANSWER>>>> Algorithms used to detect abnormal data in a set of normal data
What is Bias? ANSWER>>>> Simplifying assumptions made by a model that make the target function easier to learn
Why is high bias bad? ANSWER>>>> Bias measures how inaccurate a model prediction is in comparison to the true output, so more bias = less accurate
High bias = more assumptions + potentially missing important relationships between features and output; can cause underfitting
What is Variance? ANSWER>>>> The amount the estimate of the target function will change if different training data is used
Why is high variance bad? ANSWER>>>> High variance suggests that the algorithm learns the random noise instead of the output, which causes overfitting
What is the tradeoff between bias and variance? ANSWER>>>> Inverse relationship: models that are very complex usually have low bias and high variance
Low complexity models usually have low variance and high bias
Error is lowest when variance and bias are balanced
What kind of bias / variance do parametric algorithms usually have? ANSWER>>>> High bias, low variance
What kind of bias / variance do non-parametric algorithms usually have? ANSWER>>>> High variance, low bias
What is Overfitting? ANSWER>>>> When the model fits the training data very well but fails to generalize to new data; it is "memorizing" the data and not adapting well to new data
What is Underfitting? ANSWER>>>> When the model neither fits the training data well nor generalizes to new data
How would you prevent overfitting? ANSWER>>>>
- K-fold cross-validation
- Simplifying the model
- More data
- Reducing dimensionality
- Stopping training early when performance stops improving
What is K-fold cross-validation? ANSWER>>>> Splits the initial training data into k subsets and trains the model k times, each time holding out a different subset for validation. Used to reduce overfitting
What is the Markov Decision Process? ANSWER>>>> Mathematical process to model decision making in situations where outcomes are partly random and partly under the control of a decision maker. Used in reinforcement learning
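The index bookkeeping behind K-fold cross-validation can be sketched as below. The function name is hypothetical, and the sketch assumes contiguous folds with no shuffling; each fold serves once as the held-out validation set while the remaining folds form the training set.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_splits(n_samples=10, k=5))
```

A model would be trained once per `(train, validation)` pair and the k validation scores averaged, so every sample is used for validation exactly once.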