Data Mining vs. OLAP: Differences & Importance in Business Intelligence - Prof. Stephen Lo | Study notes Management Information Systems

The Lowdown on Data Mining

by Evan Levy

Data mining is a high-yield but complex form of knowledge discovery. Before you even think

about using this technology, make sure you know what it is, what it takes, and what you need.

In its 1997-1998 study of data mining market trends, META Group claimed that nearly 80 percent of

companies interviewed expected data mining to be a critical success factor by 1999. More recently,

Forrester Research weighed in on data mining, claiming that, while many companies were still

evaluating the technology, most planned on using it by 2001. Other analysts and independent research

firms polling companies to find out who’s doing what in the data mining space are finding that the

common denominator is intention, not practice. Is this because companies are solidifying their

infrastructures first? Are companies too intimidated to admit they have no intention of doing data mining

at all? Or is there still a pervasive misunderstanding of what data mining really is–and isn’t?

I recently spoke at a database marketing conference on this point. The title of my presentation was "Data

Mining in the Real World," and the room was brimming with both technicians and marketers. When I

got to the part of the presentation that discussed the differences between data mining and OLAP, I

noticed a guy a few rows from the front. He had stopped taking notes and had put down his pen. After

the presentation, he buttonholed me, taking me to task for my definition of data mining.

At first, I figured he worked for an OLAP vendor, one of the many who had labeled its multidimensional

analysis or query generation tool as a data mining product. But after listening to his harangue for a few

minutes, I was able to piece together that he was a data analyst for a marketing organization and had

been telling everyone that his company was doing data mining. I had burst his bubble by classifying his

cherished "data mining" tool as a simple OLAP application, and I had clearly called into question his

status as a knowledge worker.

My point is not that OLAP is less valuable than data mining, but that they are two separate breeds of

analysis with entirely different objectives, not to mention tools, skill sets, and implementation methods.

UNDERSTANDING THE PLAYERS

Most people wouldn’t use a spreadsheet tool to write a book. Even a crack statistician wouldn’t use SAS

to fill out an expense report. Different software tools exist to tackle different business functions, just as

different decision-support tools exist because there are different classes of questions. The major classes

of decision support are:

Canned reports. This is the most basic type of decision support, if not the most pervasive. Nearly every

data warehouse starts out by generating reports. The delivery of timely, accurate reports containing

business information is incredibly valuable–especially in places where this data never before existed.

Such an application focuses on well-defined, well-understood business questions. It also allows users to

gradually change their businesses to leverage this new information.

Ad hoc querying. Submitting free-form or ad hoc questions to the database is the next logical step in the
