Lab 14 Lecture Notes, Lecture notes of Environmental Science

Lab 14 Lecture notes for GEOL4342

Typology: Lecture notes

2025/2026

Uploaded on 05/05/2026

mustufa-khan-1
Lab 14: Convolutional Neural Network (CNN)
Instructor: Rashik Islam
Table of Contents
14.1 CNN (Convolutional Neural Network)
14.2 Key Components of a CNN
14.3 Simple Example of Using a CNN
14.4 Key Hyperparameters in CNN
14.5 Practice Code
14.6 Overfitting
14.1 CNN (Convolutional Neural Network)
A Convolutional Neural Network (CNN) is a type of deep neural network that is especially
powerful for processing data that has a grid-like topology, such as images. CNNs are primarily
used in the field of computer vision, where they excel at tasks like image recognition, image
classification, and object detection.
14.2 Key Components of a CNN
  • Input Layer: This is where the image or other input data enters the network.
  • Convolutional Layers: These layers are the core building blocks of a CNN. Each layer slides a number of small filters over the input. At every position, the filter values are multiplied element-wise with the patch of input beneath them and the products are summed, which gives one value of a feature map. During training, the filters learn to detect important features like edges, colors, and textures.
  • Activation Function: Typically, a nonlinear activation function like ReLU (Rectified Linear Unit) is applied after each convolution operation to introduce nonlinearity into the model, helping it learn more complex patterns.
  • Pooling Layers: Following the convolutional layers, pooling layers (such as max pooling or average pooling) reduce the height and width of each feature map while retaining the most important information. This makes the neural network more efficient and less prone to overfitting.
  • Fully Connected Layers: After several convolutional and pooling layers, the high-level reasoning in the neural network is done via fully connected layers. Neurons in a fully connected layer have connections to all activations in the previous layer, as seen in regular neural networks. Their activations can hence be computed with a matrix multiplication followed by a bias offset.
  • Output Layer: The final layer is typically a softmax layer that outputs one probability for each class in the training dataset. The class with the highest probability is the prediction.
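The convolution, activation, and pooling steps listed above can be sketched in a few lines. This is a minimal illustration in plain Python, not how Keras implements these layers: the function names, the toy 5x5 image, and the hand-made edge filter are all assumptions made for the example (in a real CNN the filter values are learned).

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise multiply the patch with the filter, then sum.
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Replace every negative value with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 block."""
    pooled = []
    for r in range(0, len(feature_map) - 1, 2):
        pooled.append([
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, len(feature_map[0]) - 1, 2)
        ])
    return pooled


# Toy 5x5 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 1, 1, 1]] * 5
# Hypothetical hand-made filter that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
activated = relu(features)              # negatives clipped to zero
pooled = max_pool_2x2(activated)        # 2x2 summary

print(features[0])  # [0, 2, 0, 0]: the edge sits between columns 1 and 2
print(pooled)       # [[2, 0], [2, 0]]
```

The feature map is largest exactly where the dark-to-bright edge sits, and pooling keeps that strong response while shrinking the map from 4x4 to 2x2.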
14.3 Simple Example of Using a CNN

Let's consider a simple example: identifying whether an image contains a cat or a dog.

14.3.1 Input: You input an image into the CNN.

14.3.2 Feature Learning:

  • Convolutional Layer: Detects edges and shapes in the image.
  • Activation Function: Applies ReLU to add non-linearity, helping to differentiate complex patterns.
  • Pooling Layer: Reduces the size of the feature maps, focusing on the most important features.

14.3.3 Classification:

  • Fully Connected Layer: Takes the high-level features identified by the convolutional and pooling layers and learns which features are most important for distinguishing between cats and dogs.
  • Output Layer: Uses a softmax function to classify the image as either 'cat' or 'dog'.
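To make the output layer concrete, here is a small sketch of the softmax function applied to two raw scores, one per class. The scores are made-up values chosen for illustration; in a real network they would come from the last fully connected layer.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the result.
    shifted = [s - max(scores) for s in scores]
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["cat", "dog"]
scores = [2.0, 0.5]  # hypothetical raw outputs for one image

probs = softmax(scores)
predicted = class_names[probs.index(max(probs))]

print([round(p, 3) for p in probs])  # [0.818, 0.182]
print(predicted)                     # cat
```

The network does not output a hard label. It outputs a probability for each class, and the prediction is simply the class with the highest probability.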

    from tensorflow.keras.datasets import cifar10
    from tensorflow.keras.utils import to_categorical
    from tensorflow.keras import layers, models
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    from tensorflow.keras.callbacks import EarlyStopping
    import matplotlib.pyplot as plt

2. Load and prepare the data

    # Load and prepare CIFAR-10 data
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    x_train = x_train.astype('float32') / 255
    x_test = x_test.astype('float32') / 255
    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)

Explanation of data preparation:

    (x_train, y_train), (x_test, y_test) = cifar10.load_data()

Imagine you have a huge photo album (CIFAR-10) with lots of small pictures. These are split into two parts: one is for teaching your computer what each picture is (training set), and the other is for checking if the computer has learned correctly (testing set). So, we just divided our photo album into:

  • Teaching Photos (x_train): These are the actual photos.
  • Answers for Teaching Photos (y_train): These are the correct names for each photo.
  • Quiz Photos (x_test): These are new photos to test the computer.
  • Answers for Quiz Photos (y_test): These are the correct names for the quiz photos.

    x_train = x_train.astype('float32') / 255
    x_test = x_test.astype('float32') / 255

Here, we put the pictures into a numeric range that suits training. Originally, each color value in the photos is an integer from 0 to 255. Dividing by 255 rescales those values to between 0 and 1, because neural networks train more stably when their inputs are small and on a consistent scale.

    y_train = to_categorical(y_train, 10)
    y_test = to_categorical(y_test, 10)

Imagine you have a quiz paper with checkboxes. Each picture has a set of 10 checkboxes, and only one of them should be marked as the correct answer. Initially, the answers were written as numbers between 0 and 9, but now we convert them into these checkboxes, marking the right answer for each picture (this is known as one-hot encoding). This makes it easy for the computer to match the photo it sees with the right checkbox.

In sum, in this stage, we took a big collection of pictures, made sure they're in a format the computer likes, and then organized the answers into a clear checkbox format. This makes everything ready for the computer to start learning and later to take a test to show how much it has learned.
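The two preparation steps can be reproduced by hand on a few values. The sketch below uses plain Python lists instead of the real CIFAR-10 arrays. The helper names `scale_pixels` and `one_hot` are made up for this illustration, and `one_hot` only mimics what `to_categorical` does for a single label.

```python
def scale_pixels(pixels):
    """Rescale 0-255 color values to the range 0-1."""
    return [p / 255 for p in pixels]


def one_hot(label, num_classes):
    """Turn an integer label into a 'checkbox' vector with a single 1."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


pixels = [0, 51, 255]      # three example color values
scaled = scale_pixels(pixels)
print(scaled)              # [0.0, 0.2, 1.0]

label = 3                  # in CIFAR-10, class 3 is 'cat'
encoded = one_hot(label, 10)
print(encoded)             # 1.0 at index 3, 0.0 everywhere else
```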

3. Define the CNN model

    # Define the CNN model architecture
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D(2, 2),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D(2, 2),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(10, activation='softmax')
    ])

Explanation

Starting the Model Building Process:

    model = models.Sequential([

Think of this like starting a recipe. You're beginning to put together a list of steps (layers) the model will use to process images.

First Set of Layers - Spotting Features:

    layers.Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),

These layers are like the model's eyes. They look for basic patterns in the images using small windows (3x3), and each layer learns 32 such filters. padding='same' adds a border of zeros around the picture so that the edges are not ignored and the output keeps the same height and width as the input. 'relu' replaces negative responses with zero, so only positive evidence for a feature is passed on.

the last 'Dense' layer sorts it all into 10 categories (like sorting socks into 10 different drawers), with 'softmax' turning the 10 outputs into probabilities that sum to 1, so that the category with the highest probability becomes the prediction for each image.
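It can help to trace how the image shape changes through the model defined above. The sketch below redoes that arithmetic in plain Python, assuming stride 1 for the convolutions and the layer sizes from the model code. The helper functions are written for this illustration and are not part of Keras. In practice, `model.summary()` reports the same information.

```python
def conv_output_size(size, kernel, padding):
    """Height/width after a stride-1 convolution."""
    if padding == "same":
        return size              # zero padding keeps the size unchanged
    return size - kernel + 1     # 'valid': the window must fit entirely


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


size, channels = 32, 3           # CIFAR-10 images are 32x32 RGB
shapes = []
total_params = 0

for filters in (32, 64):         # the two Conv-Conv-Pool blocks
    total_params += conv_params(3, channels, filters)
    size, channels = conv_output_size(size, 3, "same"), filters
    shapes.append((size, size, channels))

    total_params += conv_params(3, channels, filters)
    size = conv_output_size(size, 3, "valid")
    shapes.append((size, size, channels))

    size = size // 2             # 2x2 max pooling halves height and width
    shapes.append((size, size, channels))

flattened = size * size * channels
total_params += dense_params(flattened, 512) + dense_params(512, 10)

print(shapes)
# [(32, 32, 32), (30, 30, 32), (15, 15, 32), (15, 15, 64), (13, 13, 64), (6, 6, 64)]
print(flattened)      # 2304
print(total_params)   # 1250858
```

Dropout and Flatten have no trainable parameters. Most of the parameters sit in the first Dense layer, which connects 2304 flattened values to 512 units.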

4. Model Compilation

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

The Blueprint for Improvement - Optimizer:

    optimizer='adam'

The optimizer is like a personal trainer for the model. It decides how the model's weights are adjusted after each batch so that the loss goes down. 'adam' is a popular choice because it's like a smart trainer that adapts the step size for each weight as training proceeds, which often helps the model learn faster and more reliably.

Measuring Mistakes - Loss Function:

    loss='categorical_crossentropy'

The loss function measures how wrong the model's predictions are. Rather than simply counting wrong answers, it scores how much probability the model assigned to the correct class. 'categorical_crossentropy' is used when you have several categories (like dog, cat, boat, etc.) with one-hot labels, and it's very harsh: it gives a high penalty when the model is confident but wrong, encouraging it to be more careful with its guesses.

Keeping Score - Metrics:

    metrics=['accuracy']

Metrics are like a scoreboard showing how well the model is doing. 'Accuracy' is the score we're most interested in: it tells us the fraction of pictures the model is labeling correctly. It's a straightforward way for us to see how our model is performing on the task we've given it.

So, to sum up, with model.compile(), we're setting up the rules of the game: how to train (optimizer), how to score the training (loss), and how to know how well the model is doing (metrics). It's the prep before the actual learning starts!

5. Data Augmentation

    # Set up data augmentation
    datagen = ImageDataGenerator(
        rotation_range=15,
        width_shift_range=0.1,
        height_shift_range=0.1,
        horizontal_flip=True,
        fill_mode='nearest'
    )
    # fit() is only required for feature-wise statistics (e.g. featurewise_center);
    # it is harmless with the settings used here
    datagen.fit(x_train)

This section of the code introduces a concept called data augmentation, which is a way to create new training examples from the existing ones by applying random transformations. This helps the model learn from more varied data without having to collect new images. Let's go through each part:

Explanation:

Creating a Data Generator:

    datagen = ImageDataGenerator(

Think of datagen as a machine that can take your original images and make altered copies of them. These copies have slight variations, which helps train the model to recognize the objects in your images under different conditions.

Setting the Transformations:

    rotation_range=15, width_shift_range=0.1, height_shift_range=0.1, horizontal_flip=True, fill_mode='nearest'

Here's what each setting does:

  • rotation_range=15: This rotates the image up to 15 degrees left or right. Imagine tilting your head slightly—it's the same idea; the object looks a bit different, but you can still tell what it is.
  • width_shift_range=0.1 and height_shift_range=0.1: These settings slide the images a little bit horizontally or vertically (by 10% of the total width or height). It’s like when you take a photo and the subject isn’t exactly in the center.
  • horizontal_flip=True: This randomly flips the image as if looking at it in a mirror, but only left-to-right. It's good for when the orientation doesn't change the meaning of the picture (like a cat facing left is still a cat when facing right).
  • fill_mode='nearest': When we rotate or shift the image, some empty pixels appear at the edges. 'nearest' fills these in by repeating the value of the closest pixel from the original image. So it's like guessing what lies just beyond the border of a photo by extending the colors at its edge.
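Two of these transformations are easy to reproduce by hand on a tiny image. The sketch below is a simplified, deterministic illustration in plain Python. The function names are made up for the example, and the real `ImageDataGenerator` applies such transformations randomly and with fractional shifts.

```python
def horizontal_flip(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]


def shift_right_nearest(image, shift):
    """Shift pixels right by `shift` columns.

    The columns vacated on the left are filled with the nearest
    original pixel, which mimics fill_mode='nearest'.
    """
    width = len(image[0])
    return [[row[max(c - shift, 0)] for c in range(width)] for row in image]


image = [[1, 2, 3],
         [4, 5, 6]]

flipped = horizontal_flip(image)
shifted = shift_right_nearest(image, 1)

print(flipped)  # [[3, 2, 1], [6, 5, 4]]
print(shifted)  # [[1, 1, 2], [4, 4, 5]]
```

In the shifted image, the first column is repeated to fill the gap that opens on the left, and the last column falls off the right edge.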

was performing its best on the validation data. It's like a video game where you can revert to your last best checkpoint if you take a wrong turn.

7. Model Training

    # Train the model
    history = model.fit(datagen.flow(x_train, y_train, batch_size=64),
                        epochs=50,
                        validation_data=(x_test, y_test),
                        callbacks=[early_stopping])

This code is where the training wheels come off, and we really start teaching the model with the help of all the tools we've set up. Let's break it down:

Recording the Journey - Storing the Training Progress:

    history = model.fit(

This starts the training process, and 'history' is like a diary that records how well the model did after each epoch, storing the loss and accuracy on both the training and validation data.

The Augmented Training Data - Getting Creative with Images:

    datagen.flow(x_train, y_train, batch_size=64),

Here, datagen.flow takes our training images and labels and applies the data augmentation transformations we set up earlier, creating those altered images on the fly. It also groups them into small batches (64 images at a time), which the model will use to update its understanding step by step.

Setting the Time - How Long to Train:

    epochs=50,

An 'epoch' is one complete pass through all the training images. '50 epochs' means the model will go through all the pictures up to 50 times, each time refining its understanding. But remember our early stopping? If the model stops improving, it won't actually go all the way to 50.

Checking the Answers - Validation Data:

    validation_data=(x_test, y_test),

While training, the model also checks itself at the end of every epoch against a set of images it hasn't learned from (here, the test set). It's like taking a practice quiz at the end of each study session to see how much it has really learned. Note that because early stopping picks the best epoch using these same images, the final test accuracy will be slightly optimistic; holding out a separate validation split from the training data avoids this.

The Smart Assistant - Early Stopping Callback:

    callbacks=[early_stopping]

'Callbacks' are like helpers watching over the training process. Our early stopping callback is on the lookout for when the model stops getting better at the practice quizzes. If it sees no improvement for a set number of epochs (our patience setting), it'll stop the training and revert to the best weights it found.

So, in summary, this code starts the training process, continually tries to improve the model by showing it randomly altered versions of the training images, tests it after every epoch, and has a smart system in place to stop training if it's no longer beneficial. The model gets better and better in a controlled way, without overdoing it.
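The early stopping rule described above can be sketched without any training at all, by replaying a list of validation losses. The loss values and the patience of 3 below are made-up numbers for illustration, and the function is a simplified stand-in for the callback, not the Keras implementation.

```python
def run_early_stopping(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the waiting counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch  # give up and revert to the best

    return len(val_losses), best_epoch    # ran through every epoch


# Hypothetical validation losses, one per epoch.
val_losses = [1.20, 0.95, 0.90, 0.92, 0.91, 0.93, 0.94]

stopped_at, best_epoch = run_early_stopping(val_losses, patience=3)
print(stopped_at)   # 6
print(best_epoch)   # 3
```

Training stops after epoch 6 because epochs 4, 5, and 6 all failed to beat the loss from epoch 3. Reverting to the best checkpoint means keeping the weights from epoch 3.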

8. Model Evaluation

    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
    print(f"Test accuracy: {test_acc}")

This snippet of code is like the final exam for the model, where we see how well it has learned to recognize and classify images based on what it saw during training.

The Final Exam:

    test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)

  - model.evaluate is the function used to test the model. You can think of it as the model taking a final exam using the test images (x_test) and their correct answers (y_test).
  - test_loss is the average loss (here, categorical cross-entropy) over the test images. Lower is better: it means the model assigns high probability to the correct answers.
  - test_acc is the test accuracy, which tells us the fraction of images the model classified correctly. Higher is better: you want your model to get as many right as possible!

The verbose parameter in Keras controls the amount of information that's displayed during training or evaluation. Here's what the different settings mean:

  - verbose=0: Silent mode, no information is shown.
  - verbose=1: Progress bar, which gives a visual representation of the progress.
  - verbose=2: One line per epoch; instead of the progress bar, you get a simpler, more concise output of the progress, typically showing the main metrics after each epoch.

So, when you set verbose=2 in the model.evaluate() function, the output will be minimal and typically include just the loss and other metrics for the test data, without showing a progress bar.

Announcing the Score:

    print(f"Test accuracy: {test_acc}")

This prints the final test accuracy as a fraction between 0 and 1.
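To see what the two numbers returned by the evaluation mean, the sketch below computes an average categorical cross-entropy loss and an accuracy by hand for four made-up predictions over three classes. The data and the helper names are hypothetical. The point is only to show how the loss and the accuracy are derived from predicted probabilities.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Loss for one sample: -log of the probability given to the true class."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


def argmax(values):
    """Index of the largest value."""
    return values.index(max(values))


# One-hot true labels and hypothetical predicted probabilities.
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
predictions = [
    [0.90, 0.05, 0.05],   # confident and right
    [0.20, 0.70, 0.10],   # right
    [0.10, 0.30, 0.60],   # right
    [0.05, 0.90, 0.05],   # confident but wrong
]

losses = [categorical_crossentropy(t, p) for t, p in zip(labels, predictions)]
mean_loss = sum(losses) / len(losses)
correct = sum(argmax(t) == argmax(p) for t, p in zip(labels, predictions))
accuracy = correct / len(labels)

print([round(value, 3) for value in losses])  # [0.105, 0.357, 0.511, 2.996]
print(round(mean_loss, 3))                    # 0.992
print(accuracy)                               # 0.75
```

The single confident but wrong prediction contributes far more to the loss than the three correct ones combined, which is the harsh penalty described for categorical cross-entropy.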

  • Regular Check-ups: Use techniques like dropout (taking breaks) or regularization (not focusing too much on any single topic) to ensure the model learns the subject broadly.
  • Change Study Habits: Early stopping is like telling the student to stop studying if they aren’t improving on new practice questions anymore.
  • Broaden the Study Material: Using cross-validation is akin to practicing with a broader set of questions to ensure the student understands the subject well, not just specific questions.

What is Underfitting?

Imagine if a student only briefly glances over their study materials before an exam. They grasp only the basic concepts but miss out on the details and complexities needed to answer more challenging questions effectively. In machine learning, underfitting happens when a model fails to capture important aspects of the data because it's too simplistic.

Recognizing Underfitting

Here are some signs that a model might be underfitting:

  • High Training Loss: The model doesn't perform well even on the training data, indicating it hasn't learned the basic patterns.
  • Poor Validation Performance: The model also performs poorly on new, unseen data (validation set). Unlike overfitting, where the model does well on training data but poorly on validation data, underfitting means the model does poorly on both.
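The contrast between overfitting and underfitting can be turned into a rough rule of thumb applied to the final training and validation losses. The sketch below is only an illustration: the function name and both thresholds are arbitrary assumptions, and sensible values depend on the dataset and the loss function.

```python
def diagnose_fit(train_loss, val_loss, high_loss=1.0, max_gap=0.3):
    """Rough diagnosis from final losses (thresholds are hypothetical)."""
    if train_loss >= high_loss:
        # Poor even on the data it trained on: too simple or undertrained.
        return "underfitting"
    if val_loss - train_loss > max_gap:
        # Good on training data but much worse on unseen data.
        return "overfitting"
    return "reasonable fit"


print(diagnose_fit(train_loss=1.5, val_loss=1.6))  # underfitting
print(diagnose_fit(train_loss=0.2, val_loss=0.9))  # overfitting
print(diagnose_fit(train_loss=0.4, val_loss=0.5))  # reasonable fit
```

In practice these signs are usually read off the loss curves over all epochs rather than from two final numbers.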

    # Plot training and validation loss
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(loc='upper right')
    plt.show()

Exercise 14: Develop a Convolutional Neural Network that can accurately classify images of fashion items from the Fashion-MNIST dataset.

    from tensorflow.keras.datasets import fashion_mnist

For visualization after validation:

    import matplotlib.pyplot as plt
    import numpy as np
    from tensorflow.keras.datasets import fashion_mnist

    # Load Fashion-MNIST dataset (only the test split is needed here)
    (_, _), (x_test, y_test) = fashion_mnist.load_data()

    # Normalize and prepare images
    x_test = x_test.astype('float32') / 255.
    x_test = x_test.reshape((x_test.shape[0], 28, 28, 1))

    # Assuming you have a trained model named 'model'

    # Class names in Fashion-MNIST
    class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
                   'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

    # Select a few images and labels from the test set
    num_images = 10
    test_images = x_test[:num_images]
    # load_data() returns integer labels (0-9), so no argmax is needed here;
    # np.argmax(..., axis=1) would only apply to one-hot encoded labels
    test_labels = y_test[:num_images]

    # Get model predictions
    predictions = model.predict(test_images)
    predicted_labels = np.argmax(predictions, axis=1)

    # Function to plot images, true labels, and model's predictions
    def plot_images(images, true_labels, predicted_labels):
        plt.figure(figsize=(15, 5))
        for i in range(len(images)):
            plt.subplot(2, 5, i + 1)
            plt.imshow(images[i].reshape(28, 28), cmap='gray')
            plt.title(f"Actual: {class_names[true_labels[i]]}\nPredicted: {class_names[predicted_labels[i]]}")
            plt.xticks([])
            plt.yticks([])
        plt.show()

    # Call the function with the images and labels
    plot_images(test_images, test_labels, predicted_labels)