Machine Learning: Concepts and Techniques, Exercises of Financial Economics

A concise overview of key concepts and techniques in machine learning. It covers supervised and unsupervised learning, machine learning algorithms such as support vector machines (SVMs) and decision trees, and related concepts such as high-dimensionality, feature selection, and feature engineering. The document also touches on the application of machine learning in bioinformatics and genomics.

Typology: Exercises

2024/2025

Available from 02/26/2025

patrick-maina-2

Machine Learning
Machine learning ✔✔a form of artificial intelligence in which a program learns from
patterns in data instead of being explicitly programmed for the task
Supervised learning ✔✔type of machine learning in which the response variable is
known for the training cases (the data is labeled)
Unsupervised learning ✔✔type of machine learning in which the response variable is
unknown (the data is unlabeled)
R ✔✔a programming language for statistical computing that is often used for machine learning
WEKA ✔✔a graphical machine learning program (workflows can be assembled visually
instead of being written as code)
High-dimensionality ✔✔a high-dimensionality problem is a machine-learning problem in
which the number of dimensions (features) is much greater than the number of cases
Clustering ✔✔part of unsupervised learning in which similar cases are grouped
together
Classification ✔✔the process of using machine learning to assign cases to classes based
on patterns found in data (example: classifying tumors as malignant or benign;
classifying emails as spam or not spam)
SVM ✔✔(Support Vector Machine) a machine learning algorithm that computes a
hyperplane that separates different classes of data points (example: an SVM could be
used to compute a hyperplane that separates data points that represent cancer from
those that represent not cancer)
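
A minimal sketch of how a hyperplane, once computed, separates two classes. The weights `w` and offset `b` below are hypothetical, hand-picked values rather than the result of training; an actual SVM would learn them from labeled data, which this sketch does not attempt.

```python
def side_of_hyperplane(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane
    w . x + b = 0 the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical hyperplane in two dimensions (assumed values, not learned).
w = [1.0, -1.0]
b = 0.0

print(side_of_hyperplane(w, b, [3.0, 1.0]))  # 1
print(side_of_hyperplane(w, b, [1.0, 3.0]))  # -1
```

Each side of the hyperplane corresponds to one class, for example +1 for "cancer" and -1 for "not cancer".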
OVA ✔✔(One vs. All) a way of using SVMs in multi-class problems - builds one SVM
per class, each comparing that class to the rest of the classes (example: in a problem
regarding the diagnosis of 14 different cancers, 14 SVMs would be built such as breast
cancer vs. everything else, prostate cancer vs. everything else, etc.)
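
The one-classifier-per-class idea can be sketched as follows. The three classifiers and their parameters are hypothetical, and combining them by choosing the class with the highest score is an assumption here (a common convention), since the card itself does not say how the individual answers are combined.

```python
def score(w, b, x):
    """Signed score of point x for one class-vs-rest classifier."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_all(classifiers, x):
    """classifiers maps a class name to its (w, b) pair.
    Return the class whose classifier gives x the highest score."""
    return max(classifiers, key=lambda name: score(*classifiers[name], x))


# One hypothetical classifier per class: 3 classes, so 3 classifiers.
classifiers = {
    "class_a": ([1.0, 0.0], 0.0),
    "class_b": ([0.0, 1.0], 0.0),
    "class_c": ([-1.0, -1.0], 0.0),
}

print(predict_one_vs_all(classifiers, [2.0, 1.0]))    # class_a
print(predict_one_vs_all(classifiers, [-1.0, -2.0]))  # class_c
```

With 14 cancer types the dictionary would hold 14 entries, one per "this cancer vs. everything else" classifier.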
Decision tree ✔✔a machine learning algorithm that builds a tree of tests on feature
values; following the branches that match a specific case leads to the class it falls into
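
A small sketch of how a finished decision tree assigns a class to a case. The tree below is hand-built and hypothetical (the feature names and thresholds are invented for illustration); a decision tree algorithm would normally choose the tests from training data.

```python
# Hypothetical hand-built tree for the spam example; not learned from data.
tree = {
    "feature": "contains_link",
    "threshold": 0.5,
    "below": {"label": "not spam"},
    "above": {
        "feature": "exclamation_marks",
        "threshold": 3,
        "below": {"label": "not spam"},
        "above": {"label": "spam"},
    },
}


def classify(node, case):
    """Walk from the root to a leaf, taking the 'above' branch when the
    case's feature value exceeds the node's threshold, else 'below'."""
    while "label" not in node:
        branch = "above" if case[node["feature"]] > node["threshold"] else "below"
        node = node[branch]
    return node["label"]


print(classify(tree, {"contains_link": 1, "exclamation_marks": 5}))  # spam
print(classify(tree, {"contains_link": 0, "exclamation_marks": 9}))  # not spam
```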

Multi-class ✔✔a classification problem with more than 2 classes
Binary class ✔✔a classification problem with exactly 2 classes
Gene expression ✔✔(from Wikipedia) process by which information from a gene is used in the synthesis of a functional gene product. Products are often proteins, but in non-protein-coding genes such as ribosomal RNA (rRNA), transfer RNA (tRNA) or small nuclear RNA (snRNA) genes, the product is a functional RNA.
Gene expression levels ✔✔(from Wikipedia) the level at which a particular gene is expressed within a cell, tissue or organism. Ideally, expression is measured by detecting the final gene product (for many genes this is the protein); however, it is often easier to detect one of the precursors, typically mRNA, and infer the gene expression level from it.
DNA microarray ✔✔technology used to determine gene expression levels
Poorly differentiated cancer ✔✔a cancer whose cells look so unlike normal tissue that its origin is difficult to determine
LOOCV / cross-validation ✔✔cross-validation: randomly splitting the data into n groups, training the model on n-1 groups and testing it on the remaining group, with a different group held out every time; LOOCV (leave-one-out cross-validation) is the special case in which n = the number of cases
Feature selection ✔✔normally used in high-dimensionality problems to pick the features that play a bigger role in making a prediction
Feature engineering ✔✔transforming features so that a model can capture relationships, often non-linear ones, that the raw features do not expose (examples: log, squaring, etc.; doubling, by contrast, is only a linear rescaling)
S2N ✔✔(signal-to-noise ratio) how strong the signal is relative to the noise in the data; if the data is really noisy (many missing values, nonsensical values, etc.) the signal is hard to recover and the S2N is low; clean data gives a high S2N
Synonyms for feature ✔✔attribute, predictor, variable, independent variable, dimension
Synonyms for response variable ✔✔prediction, signal, dependent variable
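
The cross-validation procedure in the LOOCV card can be sketched as a function that only produces the train/test splits. The function name `cross_validation_splits` and its parameters are hypothetical, and the fixed random seed is an assumption made so the example is repeatable; fitting and scoring a model on each split is left out.

```python
import random


def cross_validation_splits(n_cases, n_groups, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per group.
    Cases are shuffled, dealt into n_groups groups, and each group is
    held out for testing exactly once."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    groups = [indices[i::n_groups] for i in range(n_groups)]
    for i, test in enumerate(groups):
        train = [idx for j, group in enumerate(groups) if j != i for idx in group]
        yield train, test


# 3-group cross-validation on 6 cases: train on 4, test on 2, three times.
splits = list(cross_validation_splits(n_cases=6, n_groups=3))

# LOOCV: as many groups as cases, so each test set holds a single case.
loocv = list(cross_validation_splits(n_cases=6, n_groups=6))
```

Setting `n_groups` equal to `n_cases` gives LOOCV, which is often the practical choice when there are very few cases.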