












Naïve Bayes Classifier; Multilayer Perceptron (MLP): Architecture and Working; Overfitting and Pruning Techniques in Decision Trees; Advantages and Disadvantages of Artificial Neural Networks (ANN); Dimensionality Reduction: Process in Machine Learning; Gaussian Mixture Models (GMM): Concept
Typology: Exams













1 : Write a short note on Naïve Bayes classifier.
Naïve Bayes Classifier (Short Note)
The Naïve Bayes classifier is a popular supervised machine learning algorithm used for classification tasks. It is based on Bayes' Theorem, which calculates the probability of a class given some input features.
Key Idea
The algorithm assumes that all features are independent of each other given the class label. This assumption is called “naïve,” because in real-world data, features are often related—but the model still performs well in many cases.
Formula
It is based on:
P(C|X) = [P(X|C) · P(C)] / P(X)
Where:
• P(C|X): Posterior probability (probability of the class given the input)
• P(X|C): Likelihood
• P(C): Prior probability of the class
• P(X): Evidence
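The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a production classifier: the toy spam data, the two yes/no features, and the function names are all assumptions made for the example, and probabilities are estimated by simple counting without smoothing.

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[c][i][v] = number of class-c samples whose i-th feature is v
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts

def posterior(features, class_counts, feature_counts):
    """Return P(C|X) for every class C under the naive independence assumption."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                               # prior P(C)
        for i, value in enumerate(features):
            score *= feature_counts[c][i][value] / n_c    # one factor of P(X|C)
        scores[c] = score                                 # P(X|C) * P(C)
    evidence = sum(scores.values())                       # P(X); zero if X is unseen in every class
    return {c: s / evidence for c, s in scores.items()}

# Hypothetical toy data: (contains "offer", contains "meeting") -> label
samples = [("yes", "no"), ("yes", "no"), ("no", "no"),
           ("no", "yes"), ("yes", "yes"), ("no", "no")]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

class_counts, feature_counts = train_naive_bayes(samples, labels)
result = posterior(("yes", "no"), class_counts, feature_counts)
print(result)
```

Because no smoothing is applied here, a feature value never seen with a class drives that class's score to zero; for instance, the query ("yes", "yes") gives spam a posterior of exactly 0.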
Types of Naïve Bayes
Advantages
• Simple and easy to implement
• Works well with large datasets
• Fast and efficient
• Performs well in text classification problems
Disadvantages
• Assumes feature independence (not always realistic)
• Zero probability problem (can be handled using smoothing techniques)
Applications
• Spam email detection
• Sentiment analysis
• Document classification
• Medical diagnosis
2 : Describe the architecture and working of a Multilayer Perceptron.
Multilayer Perceptron (MLP): Architecture and Working
A Multilayer Perceptron (MLP) is a type of artificial neural network used for classification and regression tasks. It is a feedforward neural network, meaning data flows in one direction, from input to output.
🔷 Architecture of MLP
An MLP consists of multiple layers of neurons:
1. Input Layer
This layer receives the input data (features). Each neuron represents one feature of the dataset. It does not perform computation; it only passes data to the next layer.
2. Hidden Layer(s)
One or more layers between input and output. Each neuron performs computation using weights, bias, and activation functions. These layers help the model learn complex patterns.
3. Output Layer
Produces the final result (prediction). The number of neurons depends on the problem (e.g., one for regression, multiple for classification).
🔷 Components of MLP
• Weights (W): Control the importance of each input
• Bias (b): Shifts the weighted sum before the activation function is applied
• Activation Function: Introduces non-linearity
o Examples: ReLU, Sigmoid Function, Softmax Function
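A forward pass through the layers and components described above might look like the following sketch. The layer sizes (2 inputs, 2 hidden neurons, 1 output) and every weight and bias value are hand-picked assumptions for illustration; a real MLP would learn them during training.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One layer: each neuron computes activation(w . x + b)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]

def mlp_forward(x):
    # Hypothetical parameters: one row of weights per neuron.
    hidden_w = [[1.0, -1.0], [0.5, 0.5]]
    hidden_b = [0.0, -0.5]
    output_w = [[1.0, -2.0]]
    output_b = [0.0]
    hidden = dense(x, hidden_w, hidden_b, relu)            # hidden layer
    return dense(hidden, output_w, output_b, sigmoid)      # output layer

# The input layer only passes the features through.
prediction = mlp_forward([2.0, 1.0])
print(prediction)
```

With these values both hidden neurons output 1.0, so the output neuron sees a weighted sum of -1.0 and the sigmoid maps it to roughly 0.27.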
3 : What is Overfitting? Discuss overfitting and pruning techniques in Decision Trees.
🔷 What is Overfitting?
Overfitting is a condition in machine learning where a model learns the training data too well, including noise and unnecessary details, resulting in poor performance on new (unseen) data.
In the context of decision trees, overfitting happens when the tree becomes too deep and complex, capturing every small variation in the training dataset.
🔸 Signs of Overfitting:
• Very high accuracy on training data
• Low accuracy on test data
• Complex tree with many branches
🔷 Overfitting in Decision Trees
Decision trees tend to overfit because they:
• Split data repeatedly until all data points are perfectly classified
• Create very specific rules that do not generalize well
For example, a tree may create unnecessary splits based on minor differences in data, which are not useful for prediction.
🔷 Pruning Techniques in Decision Trees
Pruning is the process of reducing the size of a decision tree to improve its generalization and avoid overfitting.
🔸 Types of Pruning:
1. Pre-Pruning (Early Stopping)
• Stops the tree from growing fully during training
• Conditions to stop splitting:
o Maximum depth limit
o Minimum number of samples per node
o Minimum information gain
Advantages:
• Faster and simpler
• Prevents overly complex trees early
Disadvantages:
May stop too early → underfitting
2. Post-Pruning (Backward Pruning)
First builds a full tree, then removes unnecessary branches
Common Methods:
• Reduced Error Pruning: Replaces a subtree with a leaf if doing so does not reduce accuracy on validation data
• Cost Complexity Pruning (used in algorithms like CART): Balances tree size and accuracy by adding a penalty for complexity
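The trade-off behind cost complexity pruning can be illustrated with a small sketch. It assumes the commonly used penalized cost R_alpha(T) = R(T) + alpha * (number of leaves), where R(T) is the tree's error rate; the two candidate trees and their numbers are invented for the example.

```python
def cost_complexity(error_rate, n_leaves, alpha):
    """Penalized cost: error rate plus alpha times the number of leaves."""
    return error_rate + alpha * n_leaves

def pick_tree(candidates, alpha):
    """Return the name of the candidate tree with the lowest penalized cost."""
    return min(candidates, key=lambda name: cost_complexity(*candidates[name], alpha))

# Hypothetical candidates: name -> (error rate, number of leaves)
candidates = {
    "full tree": (0.02, 20),
    "pruned tree": (0.05, 5),
}

small_penalty = pick_tree(candidates, alpha=0.001)  # complexity is cheap
large_penalty = pick_tree(candidates, alpha=0.01)   # complexity is expensive
print(small_penalty, "|", large_penalty)
```

With a small alpha the larger tree wins because its lower error outweighs the penalty; raising alpha makes the pruned tree preferable.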
🔷 Benefits of Pruning
• Reduces overfitting
• Improves model generalization
• Makes the tree simpler and more interpretable
4 : What are the advantages of using Artificial Neural Networks (ANN)? (Give the advantages and disadvantages of neural networks.)
Artificial Neural Networks (ANN): Advantages & Disadvantages
Artificial Neural Networks are computational models inspired by the human brain and are widely used in machine learning and deep learning.
🔷 Advantages of ANN
1. Ability to Learn Complex Patterns
4. Risk of Overfitting
They may memorize training data instead of generalizing to new data.
5. Difficult to Tune
Choosing the right number of layers, neurons, and parameters can be challenging.
6. Dependency on Hardware
Performance depends heavily on GPUs and computing resources.
5 : What is the process of Dimensionality Reduction?
Dimensionality Reduction: Process in Machine Learning
Dimensionality Reduction is the process of reducing the number of input features (variables) in a dataset while preserving as much important information as possible. It helps simplify models, reduce computation, and improve performance.
🔷 Process of Dimensionality Reduction
1. Data Collection
Gather the dataset with many features (high-dimensional data).
2. Data Preprocessing
• Handle missing values
• Normalize or standardize data
• Remove irrelevant or duplicate features
3. Feature Selection or Extraction
There are two main approaches:
🔹 (A) Feature Selection
• Select only the most important features
• Methods include:
o Filter methods (correlation, chi-square test)
o Wrapper methods
o Embedded methods
👉 Keeps original features but reduces their number
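A correlation-based filter method, one of the options listed above, could be sketched as follows. The column names, the data, and the 0.5 threshold are assumptions for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold):
    """Keep the features whose absolute correlation with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical dataset: three candidate features and a target score
columns = {
    "study_hours": [1, 2, 3, 4, 5],
    "shoe_size": [7, 9, 8, 9, 7],
    "constant": [1, 1, 1, 1, 1],
}
target = [35, 45, 55, 65, 75]

selected = select_features(columns, target, threshold=0.5)
print(selected)
```

The selected features are a subset of the original columns; nothing is transformed, only dropped.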
🔹 (B) Feature Extraction
• Transform existing features into a new lower-dimensional space
• Important techniques include:
• Principal Component Analysis
o Converts features into principal components based on variance
• Linear Discriminant Analysis
o Maximizes class separability
👉 Creates new features from old ones
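As a rough illustration of feature extraction, the sketch below finds the first principal component of a small dataset and projects the data onto it. It uses power iteration on the covariance matrix, which is a choice made here for brevity rather than something the notes prescribe, and the data values are hypothetical.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (direction of maximum variance, data projected onto it)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and normalize.
    # Assumes the start vector is not orthogonal to the top eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    projected = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projected

# Hypothetical 2-D data lying along the line y = x
rows = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
component, projected = first_principal_component(rows)
print(component, projected)
```

The two original features are replaced by a single new feature (the projection), which here keeps all of the variance because the points lie on one line.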
4. Model Training
Train the machine learning model using reduced features
5. Evaluation
• Compare performance before and after dimensionality reduction
• Check accuracy, efficiency, and overfitting
🔷 Benefits of Dimensionality Reduction
• Reduces computation time
• Removes noise and redundancy
• Helps avoid overfitting
• Improves visualization (2D/3D plots)
🔷 Importance of Confusion Matrix
• Gives detailed insight into model performance
• Helps identify types of errors
• Useful for imbalanced datasets
• Helps in improving model accuracy
7 : What is the role of back-propagation in Neural Networks?
Role of Backpropagation in Neural Networks
Backpropagation (short for backward propagation of errors) is a key algorithm used to train artificial neural networks. It helps the network learn by adjusting its weights and biases to minimize prediction error.
🔷 What Backpropagation Does
Backpropagation works together with Gradient Descent to reduce the error (loss) between the predicted output and the actual output.
🔷 Role of Backpropagation
1. Error Calculation
After forward propagation, the network produces an output. The difference between predicted and actual output is calculated using a loss function.
2. Error Propagation
The error is propagated backward from the output layer to hidden layers. It determines how much each neuron contributed to the error.
3. Gradient Computation
Backpropagation computes gradients (partial derivatives) of the loss with respect to weights. It uses the chain rule from calculus.
4. Weight and Bias Update
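Weights and biases are updated to reduce error, typically with the gradient descent rule w ← w − η · ∂L/∂w and b ← b − η · ∂L/∂b, where η is the learning rate.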
5. Iterative Learning
This process is repeated over many epochs until the error is minimized.
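The five steps above can be traced in a deliberately tiny network: one input, one hidden neuron, and one output neuron, all with sigmoid activations and a squared-error loss. The starting weights, learning rate, and single training example are assumptions chosen to keep the sketch short.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(x, target, learning_rate=0.5, epochs=200):
    # Hypothetical starting parameters for a 1-1-1 network
    w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0
    losses = []
    for _ in range(epochs):
        # Forward propagation
        h = sigmoid(w1 * x + b1)        # hidden activation
        y = sigmoid(w2 * h + b2)        # network output

        # 1. Error calculation
        losses.append(0.5 * (y - target) ** 2)

        # 2. Error propagation with the chain rule
        delta_out = (y - target) * y * (1.0 - y)        # dL/dz at the output
        delta_hidden = delta_out * w2 * h * (1.0 - h)   # dL/dz at the hidden neuron

        # 3. Gradients of the loss with respect to each parameter
        grad_w2, grad_b2 = delta_out * h, delta_out
        grad_w1, grad_b1 = delta_hidden * x, delta_hidden

        # 4. Gradient descent update
        w2 -= learning_rate * grad_w2
        b2 -= learning_rate * grad_b2
        w1 -= learning_rate * grad_w1
        b1 -= learning_rate * grad_b1
    # 5. The loop above repeats the process for every epoch
    return losses

losses = train(x=1.0, target=1.0)
print(losses[0], losses[-1])
```

The recorded loss shrinks over the epochs, which is the observable effect of the weight updates.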
🔷 Importance of Backpropagation
• Enables neural networks to learn from data
• Improves model accuracy over time
• Makes training of deep networks possible
• Efficiently updates all weights in the network
🔷 Simple Example
If a neural network predicts a value incorrectly, backpropagation:
8 : Discuss the concept of Gaussian Mixture Models (GMM)
Gaussian Mixture Models (GMM): Concept
A Gaussian Mixture Model (GMM) is a probabilistic model used to represent data as a combination of multiple Gaussian (normal) distributions. It assumes that the dataset is generated from several underlying distributions, each representing a cluster.
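As a small sketch of this idea in one dimension, the mixture density is a weighted sum of Gaussian densities, and the probability that a point belongs to each component follows from Bayes' rule. The two components and their weights, means, and standard deviations are hypothetical values, not fitted parameters.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def mixture_density(x, components):
    """p(x) = sum over components of weight * N(x | mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def responsibilities(x, components):
    """Probability that x was generated by each component."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)
    return [v / total for v in weighted]

# Hypothetical mixture: (weight, mean, standard deviation) per component
components = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]

resp_near_first = responsibilities(0.5, components)
resp_midpoint = responsibilities(2.0, components)
print(resp_near_first, resp_midpoint)
```

Unlike a hard cluster assignment, each point receives a probability for every component; a point midway between the two means is split evenly.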
🔷 Disadvantages
• Computationally expensive
• May converge to local optima
• Requires specifying number of components (K)
🔷 Applications
• Image segmentation
• Speech recognition
• Anomaly detection
• Customer segmentation
9 : Explain Random Forest algorithm with example
Random Forest Algorithm (Explanation with Example)
Random Forest is a powerful supervised machine learning algorithm used for classification and regression. It is an ensemble learning method, meaning it combines multiple models (decision trees) to improve accuracy and reduce overfitting.
🔷 Basic Idea
Instead of using a single decision tree, Random Forest builds many decision trees and combines their results:
• For classification → uses majority voting
• For regression → uses average of outputs
🔷 How Random Forest Works
1. Bootstrapping (Sampling)
Random subsets of training data are created (with replacement). Each subset is used to train a separate decision tree.
2. Random Feature Selection
At each split in a tree, only a random subset of features is considered. This makes trees less correlated and improves performance.
3. Tree Construction
Multiple decision trees are built independently using different data samples.
4. Aggregation (Final Prediction)
• Combine predictions from all trees:
o Classification → Majority vote
o Regression → Average value
🔷 Example
Suppose we want to predict whether a student will pass or fail based on:
• Study hours
• Attendance
• Previous marks
Step 1:
Create multiple datasets from the original data.
Step 2:
Train different decision trees:
• Tree 1 → Predicts “Pass”
• Tree 2 → Predicts “Fail”
• Tree 3 → Predicts “Pass”
Step 3:
Final Prediction (Majority Voting):
👉 Pass (since most trees predicted “Pass”)
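Steps 1 and 3 of this example can be sketched in Python. Tree training (Step 2) is omitted, so the three predictions are simply copied from the example above; the student records and the random seed are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement from the original data."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the most common prediction (ties go to the first one seen)."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical records: (study hours, attendance %, previous marks)
students = [(2, 60, 40), (5, 80, 65), (8, 95, 85), (1, 50, 35), (6, 90, 70)]

# Step 1: create one bootstrapped dataset per tree
rng = random.Random(0)
samples = [bootstrap_sample(students, rng) for _ in range(3)]

# Step 2 (not shown): each tree is trained on its own sample.
tree_predictions = ["Pass", "Fail", "Pass"]

# Step 3: aggregate by majority voting
final_prediction = majority_vote(tree_predictions)
print(final_prediction)
```

For regression, the last step would average the trees' numeric outputs instead of counting votes.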
🔸 Types of Points:
• Core Point: Has at least MinPts points within distance ε
• Border Point: Lies within ε of a core point but has fewer than MinPts neighbors itself
• Noise Point: Neither core nor border (outlier)
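The three point types can be identified directly from the definitions. The sketch below assumes Euclidean distance and counts a point as its own neighbor (conventions differ between implementations); the sample points and parameter values are hypothetical.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point as 'core', 'border', or 'noise'."""
    def neighbors(i):
        # Indices of all points within eps of point i, including i itself
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    neighborhoods = [neighbors(i) for i in range(len(points))]
    core = {i for i, nbrs in enumerate(neighborhoods) if len(nbrs) >= min_pts}

    labels = []
    for i, nbrs in enumerate(neighborhoods):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs):
            labels.append("border")   # near a core point, too few neighbors itself
        else:
            labels.append("noise")
    return labels

# Hypothetical data: a small square, one point hanging off it, one far away
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (8.0, 8.0)]
labels = classify_points(points, eps=1.0, min_pts=3)
print(labels)
```

Here the four corners of the square are core points, (2.0, 1.0) is a border point attached to the corner (1.0, 1.0), and (8.0, 8.0) is noise.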
🔸 Working of DBSCAN:
🔸 Advantages:
• Detects clusters of arbitrary shapes
• Handles noise effectively
• No need to specify number of clusters
🔸 Disadvantages:
• Sensitive to ε and MinPts values
• Not suitable for datasets with varying densities
🔷 2. OPTICS (Ordering Points To Identify the Clustering Structure)
🔸 Concept
OPTICS is an extension of DBSCAN that handles varying density clusters more effectively.
🔸 Key Idea:
Instead of directly forming clusters, OPTICS creates an ordering of points based on their density and stores:
• Reachability distance
• Core distance
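The two stored quantities can be computed as in the sketch below. It follows a common definition in which the core distance of a point is the distance to its MinPts-th nearest point (counting the point itself) and the reachability distance of o from p is the larger of p's core distance and the actual distance between them. The counting convention, the 1-D sample points, and the parameter values are assumptions.

```python
import math

def core_distance(points, i, eps, min_pts):
    """Distance from point i to its min_pts-th nearest point (itself included),
    or None if fewer than min_pts points lie within eps."""
    within = sorted(d for d in (math.dist(points[i], q) for q in points) if d <= eps)
    if len(within) < min_pts:
        return None
    return within[min_pts - 1]

def reachability_distance(points, o, p, eps, min_pts):
    """Reachability of point o from point p, or None if p is not a core point."""
    core = core_distance(points, p, eps, min_pts)
    if core is None:
        return None
    return max(core, math.dist(points[p], points[o]))

# Hypothetical 1-D data: three nearby points and one distant point
points = [(0.0,), (1.0,), (2.0,), (10.0,)]

core_of_first = core_distance(points, 0, eps=3.0, min_pts=3)
core_of_outlier = core_distance(points, 3, eps=3.0, min_pts=3)
reach_1_from_0 = reachability_distance(points, 1, 0, eps=3.0, min_pts=3)
print(core_of_first, core_of_outlier, reach_1_from_0)
```

Although point 1 is only 1.0 away from point 0, its reachability distance from point 0 is 2.0, because it cannot be smaller than point 0's core distance.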
🔸 Working of OPTICS:
🔸 Advantages:
• Handles varying densities better than DBSCAN
• Produces detailed cluster structure
• No strict need to choose ε beforehand
🔸 Disadvantages:
• More complex than DBSCAN
• Higher computational cost
11 : What is the role of Activation Functions in Artificial Neural Networks? Explain the different types with examples.
Role of Activation Functions in Artificial Neural Networks (ANNs)
Activation functions are mathematical functions applied to a neuron’s weighted sum of inputs (plus bias) to produce its output. They determine how strongly a neuron is activated and introduce non-linearity into the network.
• Smooth curve
• Interpretable as probability
Disadvantages:
Vanishing gradient problem
2. Tanh (Hyperbolic Tangent)
Formula: f(x) = tanh(x)
Output range: (-1, 1)
Example: Used in hidden layers for centered data
Advantages:
Zero-centered output
Disadvantages:
Still suffers from vanishing gradients
3. ReLU (Rectified Linear Unit)
Formula: f(x) = max(0, x)
Example: Widely used in deep learning models like image recognition
Advantages:
• Simple and fast
• Reduces vanishing gradient problem
Disadvantages:
“Dead neuron” problem (the output and gradient are zero for negative inputs, so a neuron stuck there stops learning)
4. Leaky ReLU
Formula: f(x) = x if x > 0; f(x) = 0.01x if x ≤ 0
Example: Used to fix ReLU’s dead neuron issue
5. Softmax Function
Converts outputs into probabilities that sum to 1
Example: Used in multi-class classification (e.g., digit recognition)
6. Linear Activation Function
Formula: f(x) = x
Example: Used in regression problems (predicting continuous values)
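The activation functions discussed above are short enough to write out directly. This is a plain-Python sketch for single values (and a list, for softmax); the sigmoid is written in its standard form 1 / (1 + e^(-x)), and the sample inputs are arbitrary.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))        # output in (0, 1)

def tanh(x):
    return math.tanh(x)                      # output in (-1, 1), zero-centered

def relu(x):
    return max(0.0, x)                       # zero for negative inputs

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x         # small slope for negative inputs

def softmax(values):
    # Subtracting the maximum avoids overflow without changing the result
    shifted = [v - max(values) for v in values]
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]         # probabilities that sum to 1

def linear(x):
    return x                                 # identity, used for regression outputs

for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("linear", linear)]:
    print(name, fn(-2.0), fn(2.0))
print("softmax", softmax([1.0, 2.0, 3.0]))
```

Comparing the outputs at -2.0 shows the practical difference between ReLU, which returns 0, and Leaky ReLU, which returns a small negative value.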