Spam Detection Algorithm Analysis

CAP6135 Final Project Proposal – Joe LaFata and Alex Wade

Overview

Spam is currently a major problem on the internet. It wastes bandwidth, storage space, and time, and is a nuisance in general. Most e-mail providers now have a mechanism in place to filter out much of the spam that reaches their servers. For our CAP6135 project, we plan to study in depth the spam detection algorithms that e-mail providers can use. We plan to write a modular framework that runs multiple spam detection algorithms on sample e-mail messages and reports performance metrics for each. We will test multiple algorithms and combinations of algorithms and analyze the reported metrics.

Purpose

The purpose of this project is to provide performance comparisons for a few well-known spam detection algorithms. This information is useful in the fight against the growing volume of spam. Internet e-mail providers can use it to help determine the best techniques for fighting spam. The scope of this project could potentially be expanded to include the production of new or hybrid spam detection algorithms and/or a spam filtering service that can run on an e-mail server or client.

Goals

The primary goal of this project is to produce a flexible framework that supports multiple spam detection modules. The framework should allow each algorithm to be trained on known spam and non-spam e-mail messages and then executed on a different set of input e-mails. It should collect performance metrics on each algorithm, including statistics on accuracy, required training time, and processing time. To acquire training and test messages, we will either find a public spam database or devise a message harvesting technique.
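The framework described above could take a shape like the following minimal sketch. It is only an illustration under stated assumptions: the names SpamDetector, KeywordDetector, Metrics, and evaluate are hypothetical and do not come from the proposal, the keyword module is a toy placeholder rather than one of the algorithms to be studied, and splitting accuracy into false positives and false negatives is one possible choice of accuracy statistics.

```python
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass


class SpamDetector(ABC):
    """Hypothetical module interface (illustrative names, not from the proposal)."""

    @abstractmethod
    def train(self, messages, labels):
        """Learn from a list of message strings; labels are True for spam."""

    @abstractmethod
    def is_spam(self, message):
        """Return True if the message is classified as spam."""


@dataclass
class Metrics:
    accuracy: float
    false_positives: int
    false_negatives: int
    training_seconds: float
    processing_seconds: float


class KeywordDetector(SpamDetector):
    """Toy placeholder module: flags words seen only in training spam."""

    def __init__(self):
        self.spam_words = set()

    def train(self, messages, labels):
        spam, ham = set(), set()
        for text, label in zip(messages, labels):
            (spam if label else ham).update(text.lower().split())
        self.spam_words = spam - ham

    def is_spam(self, message):
        return any(word in self.spam_words for word in message.lower().split())


def evaluate(detector, train_set, test_set):
    """Train on one labeled set, classify a separate set, and time both phases."""
    train_msgs, train_labels = zip(*train_set)
    start = time.perf_counter()
    detector.train(list(train_msgs), list(train_labels))
    training_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = [detector.is_spam(text) for text, _ in test_set]
    processing_seconds = time.perf_counter() - start

    actual = [label for _, label in test_set]
    false_pos = sum(p and not a for p, a in zip(predictions, actual))
    false_neg = sum(a and not p for p, a in zip(predictions, actual))
    correct = len(actual) - false_pos - false_neg
    return Metrics(correct / len(actual), false_pos, false_neg,
                   training_seconds, processing_seconds)


if __name__ == "__main__":
    train = [("win free money now", True), ("meeting at noon", False)]
    test = [("free money", True), ("lunch at noon", False)]
    print(evaluate(KeywordDetector(), train, test))
```

Because every module exposes the same two methods, the same evaluate call can be repeated for each algorithm and the resulting Metrics records compared side by side.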

Another goal of this project is to learn how several spam detection algorithms work. We will implement at least three of these algorithms as modules for our framework. Each algorithm will be analyzed, and all of them will be compared against one another.

If time permits, there are several choices for additional goals. Because we are taking a modular approach to building our system, it will be easy to add new spam detection algorithms. If possible, we will implement additional spam detection algorithms or create hybrids that combine two or more of the algorithms we implemented. Another potential goal is to create a service that runs on an e-mail server and performs spam filtering. A similar option is to write a spam filtering plug-in for an e-mail client such as Thunderbird, Evolution, or Outlook. These goals are optional and will be attempted depending on how long the primary goals take to implement and how long the additional goals themselves would take.
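One simple way to combine two or more algorithms is a majority vote. The sketch below is an assumption about how such a hybrid might look, not the method the proposal commits to. MajorityVoteDetector and ContainsWord are hypothetical names, and the train and is_spam methods are an assumed module interface; ContainsWord is only a stand-in for a real detection module.

```python
class ContainsWord:
    """Stand-in module (hypothetical): flags messages containing one fixed word."""

    def __init__(self, word):
        self.word = word

    def train(self, messages, labels):
        pass  # nothing to learn in this placeholder

    def is_spam(self, message):
        return self.word in message.lower()


class MajorityVoteDetector:
    """Hypothetical hybrid: spam only when more than half of the wrapped modules agree."""

    def __init__(self, detectors):
        self.detectors = list(detectors)

    def train(self, messages, labels):
        # Each wrapped module is trained on the same labeled messages.
        for detector in self.detectors:
            detector.train(messages, labels)

    def is_spam(self, message):
        votes = sum(1 for d in self.detectors if d.is_spam(message))
        return votes * 2 > len(self.detectors)


if __name__ == "__main__":
    hybrid = MajorityVoteDetector(
        [ContainsWord("free"), ContainsWord("winner"), ContainsWord("urgent")]
    )
    print(hybrid.is_spam("Free prize for the winner"))
    print(hybrid.is_spam("free lunch on Friday"))
```

Since the hybrid exposes the same methods as the modules it wraps, it could be measured by the framework like any single algorithm. Other combination rules, such as weighting each module's vote, would fit the same structure.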
