Ch. 4 - Predictive Analytics I: Data Mining Process, Methods, and Algorithms Questions and Answers


Typology: Exams

2025/2026

Available from 03/17/2026

coco-store-1
predictive analytics in law enforcement - Answer 1. policing with less
2. new thinking on cold cases
3. the big picture starts small
4. success brings credibility
5. just for the facts
6. safer streets for smarter cities
not new - Answer Although the term "data mining" is relatively new, the
ideas behind it are ___ ___.
statistical analysis; artificial intelligence - Answer Many of the techniques
used in data mining have their roots in traditional ___________ ________ and
__________ ____________ (AI).
data mining (DM) - Answer - exciting technology
- imperative and common practice for a vast majority of organizations
- a strategic weapon for companies to compete with the giants of Amazon,
Capital One, and Marriott
Why is data mining gaining attention? - Answer 1. competition at the
global scale
2. recognition of the value in data sources
3. availability of quality data
- a large portion of "understanding the customer" can come from analyzing
the vast amount of data that a company routinely collects
- this has helped Amazon and many other successful businesses
4. decrease in cost
- the cost of data storage has plummeted recently, making data mining
feasible for more firms

data mining (DM) - Answer the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases
valid - Answer the discovered patterns should hold true on new data
novel - Answer previously unknown patterns are discovered
potentially useful - Answer results should lead to some business benefit
knowledge mining - Answer another name for "data mining"
data mining tools - Answer - use mathematical techniques for extracting hidden patterns for predictive purposes
- use patterns in data to develop mathematical rules for predicting outcomes for future observations
- are commonly used to identify customer buying patterns to increase sales and for fraud detection, among other things
- in data mining, classification models help in prediction
- a data mining study is specific to addressing a well-defined business task, and different business tasks require different sets of data
data - Answer the most critical ingredient for DM which may include soft/unstructured data
end user - Answer The data miner is often an ___ ____.
creative thinking - Answer Striking it rich requires ________ ________.
parallel processing - Answer Because of the large amounts of data and massive search efforts, it is sometimes necessary to use ________ __________ for data mining.
patterns - Answer DM extracts ________ from data.

banking and other financial - Answer - automate the loan application process
- optimize cash reserves with forecasting
retailing and logistics - Answer - optimize inventory levels at different locations
- improve the store layout and sales promotions
manufacturing and maintenance - Answer - predict/prevent machinery failures
- discover novel patterns to improve product quality
brokerage and securities trading - Answer - predict changes on certain bond prices
- forecast the direction of stock fluctuations
insurance - Answer - identify and prevent fraudulent claim activities
- determine optimal rate plans
most common standard processes (in order of effectiveness) - Answer 1. CRISP-DM (Cross-Industry Standard Process for Data Mining)
2. SEMMA (Sample, Explore, Modify, Model, and Assess)
3. KDD (Knowledge Discovery in Databases)
CRISP-DM - Answer a more comprehensive, common, and standardized data mining process
CRISP-DM - Answer Composed of six consecutive phases:
1. Business Understanding
2. Data Understanding
3. Data Preparation (data pre-processing)
4. Model Building
5. Testing and Evaluation
6. Deployment (use for prediction)
(Steps 1, 2, and 3 account for 85% of total project time.)
The process is highly repetitive and experimental.
classification - Answer - most frequently used DM method for real-world problems
- learn from past data, classify new data
qualitative; categorical - Answer Classification: The data is ___________. The output variable is ___________ (nominal or ordinal in nature).
nominal data - Answer has finite non-ordered values
EX: yes/no (Y/N); gender (M/F); ethnic groups (a choice from a list of groups)
ordinal data - Answer has finite ordered values
EX: store location is good, fair, bad; customer credit rating is 0-Bad, 1-Fair, 2-Excellent
classification - Answer Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?
clustering - Answer Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features? (EX: market segmentation)
clustering - Answer partitions a collection of things into segments whose members share similar characteristics
assessment methods for classification - Answer - predictive accuracy - hit rate
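As a rough illustration of the "predictive accuracy - hit rate" card above, the hit rate can be computed as the share of cases where the predicted class matches the actual class. This is only a minimal sketch: the function name hit_rate and the five credit ratings are made up for the example and do not come from the chapter.

```python
def hit_rate(actual, predicted):
    """Fraction of cases where the predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("need at least one case")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


# Hypothetical ordinal credit ratings (0-Bad, 1-Fair, 2-Excellent) for five customers.
actual = [0, 1, 2, 2, 1]
predicted = [0, 1, 1, 2, 1]

# Four of the five predictions match the actual rating.
print(hit_rate(actual, predicted))
```

In practice the hit rate would be measured on data the model did not see during training, which is what the "valid" card means by patterns holding true on new data.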
- learns the clusters of things from past data, then assigns new instances
- in marketing, it is also known as "segmentation"
association rule mining - Answer - a very popular DM method in business
- finds interesting relationships (affinities) between variables (items or events)
market basket analysis - Answer Because of its successful application to retail business problems, association rule mining is commonly called ______ ______ ________.
- often used as an example to describe DM to ordinary people, such as the famous "relationship between diapers and beers!"
market basket analysis - Answer identify strong relationships among different products (or services) that are usually purchased together (show up in the same basket together, either a physical basket at a store or a virtual basket at an e-commerce Web site)
association rule mining and market basket analysis - Answer - INPUT: the simple point-of-sale transaction data
- OUTPUT: most frequent affinities among items
EX: according to the transaction data... "Customers who bought a laptop computer and antivirus software also bought an extended service plan 70% of the time."
How do you use such a pattern/knowledge?
- put the items next to each other
- promote the items as a package
- place items far apart from each other!
business; medicine - Answer A representative application of association rule mining includes popular uses in ________ and in ________.
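The "70% of the time" figure in the laptop example above is what is usually called the confidence of a rule: among the baskets that contain the left-hand items, the share that also contain the right-hand item. A minimal sketch follows, assuming transactions are stored as Python sets; the five baskets and the function names are hypothetical, so the resulting number is not the 70% from the card.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))


def support(transactions, items):
    """Share of all transactions that contain every item in `items`."""
    if not transactions:
        raise ValueError("need at least one transaction")
    return count_containing(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Among baskets with the antecedent, the share that also hold the consequent."""
    base = count_containing(transactions, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical point-of-sale data (the INPUT in the card above).
baskets = [
    {"laptop", "antivirus", "service plan"},
    {"laptop", "antivirus", "service plan"},
    {"laptop", "antivirus"},
    {"laptop", "mouse"},
    {"antivirus"},
]

# Three baskets hold laptop + antivirus; two of those also hold the service plan.
rule_conf = confidence(baskets, {"laptop", "antivirus"}, {"service plan"})
print(f"confidence: {rule_conf:.0%}")
print(f"support: {support(baskets, {'laptop', 'antivirus', 'service plan'}):.0%}")
```

Support is reported alongside confidence because a rule seen in only a handful of baskets can show high confidence by chance.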

apriori algorithm - Answer most commonly used algorithm to discover association rules (most commonly used for association rule mining)
- given a set of itemsets, the algorithm attempts to find subsets that are common to at least a minimum number of itemsets
- uses a bottom-up approach
- widely used for data mining
association - Answer 1. finds the commonly co-occurring groupings of things, such as
a) tells you what products your customers are most likely to purchase at the same time
b) market basket analysis
examples:
- beer and diapers
- comprehensive automobile insurance and health insurance
- online books, online music, and podcasts
association rule mining - Answer - a very popular DM method in business
- finds interesting relationships (affinities) between variables (items or events)
- also known as "market basket analysis"
- helps understand the purchase behavior of a buyer in the retail business
- often used as an example to describe DM to ordinary people, such as the famous "relationship between diapers and beers!"
- finds an affinity of two products to be commonly together in a shopping cart
predictions - Answer tell the nature of future occurrences of certain events based on what has happened in the past, such as
a) predicting the winner of the Super Bowl (classification) or
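The bottom-up idea in the apriori card above can be sketched as a level-wise search: start with single items that appear in at least a minimum number of baskets, then grow them one item at a time, dropping any candidate that has an infrequent subset. This is a simplified teaching sketch in the spirit of Apriori, not an optimized or reference implementation; the function name, the min_count parameter, and the four baskets are assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Level-wise (bottom-up) search for itemsets found in at least
    `min_count` transactions. Returns {itemset: count}."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: single items that are frequent on their own.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = count(itemset)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must already be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and count(c) >= min_count
        }
    return frequent


# Hypothetical baskets, echoing the beer-and-diapers example.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "wipes"},
    {"beer", "chips"},
]

result = frequent_itemsets(baskets, min_count=2)
for itemset, n in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), n)
```

With these baskets, the three-item candidate beer + diapers + chips is pruned without being counted, because diapers + chips appears in only one basket.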

private and personal - Answer Data that is collected, stored, and analyzed in data mining is often _______ and ________.
privacy - Answer One way to accomplish _______ and protection of individuals' rights when data mining is by de-identification of the customer records prior to applying data mining applications, so that the records cannot be traced to an individual. (Third-party providers of publicly available datasets protect the anonymity of the individuals in the data set primarily by removing identifiers such as names and social security numbers.)
data mining myths - Answer 1. provides instant solutions and crystal-ball predictions
2. is not yet viable for business applications
3. requires a separate, dedicated database
4. can only be done by those with advanced degrees
5. is only for large firms that have lots of customer data
6. is another name for the good-old statistics
data mining truths - Answer 1. if using a mining analogy, "knowledge mining" would be a more appropriate term than "data mining"
2. the cost of data storage has plummeted recently, making data mining feasible for more firms
3. understanding customers better has helped Amazon and others become more successful; the understanding comes primarily from analyzing the vast data amounts routinely collected
4. parallel processing is sometimes used for data mining because of the massive data amounts and search efforts involved
5. the number of users of free/open source data mining software now exceeds that of users of commercial software versions