



A concise overview of key concepts for the Certified Analytics Professional (CAP) exam. It covers business problem framing, the analytics process model (aligned with CRISP-DM), stakeholder identification, and various analytical techniques. It also covers data types, levels of measurement, descriptive and prescriptive statistics, and methods for evaluating an analysis, including statistical significance and effect size, as well as hypothesis testing and measures of central tendency and variance. It serves as a quick reference to the fundamental principles and methodologies of analytics, useful for exam preparation and practical application.
Typology: Exams




CAP Domain I: Business Problem Framing correct answer The first and most important step of an analytics project. The process of defining the business problem, identifying stakeholders, determining whether the problem has an analytics solution, refining the problem statement and constraints, defining the set of business benefits, and obtaining stakeholder agreement.
Analytics Process Model correct answer Aligns with the CRISP-DM model (Cross-Industry Standard Process for Data Mining). Encapsulates the major components of conducting any analytical research project. The model is iterative and has 6 phases: understand the organization, understand the data, prepare the data, analyze and interpret the data, evaluate the analysis, and communicate and deploy the results.
CAP Domain I: Business Problem Statement correct answer Starts by describing a business opportunity, threat, or issue in broad terms.
CAP Domain I: How to frame a business problem correct answer The 5 Ws: Who are the stakeholders? What problem/function is the project to solve? Where does the problem occur? When does the problem occur? Why does the problem occur?
CAP Domain I: Stakeholders correct answer Anyone affected by the project or who is most critical to the long-term success of the project.
CAP Domain III: Analytics correct answer The scientific process of transforming data into insight for making better decisions (INFORMS).
CAP Domain I: 5 Whys correct answer An iterative process of discovery through repeatedly asking "why"; used to explore the cause-and-effect relationships underlying and/or leading to the problem.
CAP Domain II: Problem amenable to analytics solution correct answer Ask: Does the answer lie within the organization's control? Does the requisite data exist, or can it be obtained? Can the likely problem be solved and/or modeled? Can the organization accept and deploy the answer?
CAP Domain II: Analytics Problem Framing correct answer Requires the analyst to: reformulate the problem statement as an analytics problem; develop a proposed set of drivers and relationships to outputs; state the set of assumptions related to the problem; define key metrics of success; obtain stakeholder agreement on the approach.
CAP Domain II: Decomposition correct answer The act of breaking down a higher-level requirement into multiple lower-level requirements.
CAP Domain II: Quality Function Deployment (QFD) correct answer A rigorous process that maps the translation of requirements from one level to the next.
CAP Domain II: Kano's Requirements Model correct answer One of the best-known models for decomposition. It distinguishes between unexpected customer delights, known customer requirements, and customer must-haves that are not explicitly stated. The analyst must capture the entire context, including "exciting" requirements, "normal" requirements, and "expected" requirements.
CAP Domain II: Anchoring correct answer The tendency of people to hang on to views that they have seen and held before, even if they are incorrect. This is a danger to avoid when developing a proposed set of drivers and relationships to outputs.
CAP Domain II: Define key metrics of success correct answer Ensure that all facets of the business problem are incorporated in the metrics.
CAP Domain III: Hard data correct answer Data obtained by scientific observation and measurement. Analysis typically uses this type of data.
CAP Domain III: Soft data correct answer Data gleaned from interviews, reflecting opinions and preferences. This data has to be converted into scientific data.
CAP Domain III: Conjoint measurement or analysis correct answer An approach for converting soft data to hard data. It posits that the behavior of an actual individual can be described by an artificial individual whose preferences are described by a utility function. The utility function for various outcomes is first specified as a parametric function of the observable attributes of each item. Individuals are then asked either to specify which hypothetical alternatives they prefer or to rank different items in order of preference, which determines the parameters of the utility function. The parameters are calibrated to minimize the disparity between what the individual prefers and what the model predicts.
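
The calibration step in conjoint analysis can be illustrated with a small sketch. It assumes a linear utility function over two made-up attributes (price and battery life) and a logistic pairwise-choice model fitted by gradient descent. The source does not prescribe an estimation method, so the model, the data, and the names `choices`, `utility`, `prob_prefers_a`, and `fit` are all hypothetical.

```python
import math

# Hypothetical stated preferences. Each alternative is described by two
# observable attributes: (price in hundreds of dollars, battery life in hours).
# Each record is (alternative A, alternative B, 1 if A was preferred else 0).
choices = [
    ((3.0, 10.0), (5.0, 10.0), 1),  # same battery, cheaper one preferred
    ((6.0, 12.0), (4.0, 12.0), 0),
    ((4.0, 14.0), (4.0, 8.0), 1),   # same price, longer battery preferred
    ((5.0, 6.0), (5.0, 11.0), 0),
    ((3.0, 8.0), (6.0, 9.0), 1),
    ((7.0, 15.0), (4.0, 13.0), 0),
]

def utility(weights, attrs):
    """Assumed parametric utility: a weighted sum of the attributes."""
    return sum(w * x for w, x in zip(weights, attrs))

def prob_prefers_a(weights, a, b):
    """Modelled probability that A is preferred over B (logistic in the utility gap)."""
    gap = utility(weights, a) - utility(weights, b)
    return 1.0 / (1.0 + math.exp(-gap))

def fit(records, steps=2000, rate=0.05):
    """Calibrate the weights by gradient descent on the log loss, i.e. reduce
    the disparity between stated and predicted preferences."""
    weights = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for a, b, chose_a in records:
            error = prob_prefers_a(weights, a, b) - chose_a
            for i in range(len(weights)):
                grad[i] += error * (a[i] - b[i])
        weights = [w - rate * g / len(records) for w, g in zip(weights, grad)]
    return weights

weights = fit(choices)
print("price weight:", weights[0], "battery weight:", weights[1])
```

With this made-up data the fitted price weight comes out negative and the battery weight positive, so the artificial individual prefers cheaper items with longer battery life, matching the stated choices.
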
Data are numerical and represent measurements made along a continuous scale, e.g. money, temperature, age.
Selecting Analytical Techniques - Inferential statistics correct answer Designed to identify whether two groups differ in terms of some measure. Z and t tests in their various forms, ANOVA, binomial proportions, and others allow us to compare two or more groups and determine whether a statistically significant difference exists.
Evaluating the Analysis - statistical significance correct answer Provides a level of confidence that the results produced by the technique are not simply due to random variation in the data and statistical error. I.e., we are saying that we have reasonable evidence to believe that the estimate is not zero in the population, or that a relationship or difference does exist.
Evaluating the Analysis - effect size correct answer Gives an indication of the size of the relationship found using a technique. By contrast, statistical significance tells us whether we have reason to believe a relationship exists.
CAP Domain III - Descriptive Statistics - Measures of Central Tendency correct answer The idea of central tendency is to describe the average, middle, or center of the data. Includes the mean and the median.
CAP Domain III - Measures of Variation - Range correct answer The range tells us how wide the spread of the data is. It is the difference between the largest value and the smallest value in the data.
CAP Domain III - Measures of Variance - Variance correct answer Variance measures how much the data vary from their center, or mean. It is calculated by measuring the distance of each data point from the mean, then averaging the squared differences. Steps: 1. Calculate the mean of the data; 2. Calculate each data point's deviation by subtracting the mean from it; 3. Square each of the deviations calculated in Step 2; 4. Add up all of the squared deviations; 5. Divide by the number of data points in the data set. A larger variance indicates greater dispersion of the data. The standard deviation is usually given more weight than the variance because the variance is difficult to interpret.
CAP Domain III - Measures of Variance - Standard Deviation correct answer The positive square root of the variance. In Excel: =SQRT(the variance). The smaller the standard deviation, the less the data are spread.
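
The five variance steps, together with the range, standard deviation, median, and mode, can be sketched in a few lines of Python. The data set and the helper name `population_variance` are made up for illustration. The helper divides by the number of data points, as in Step 5, which gives the population variance.

```python
import math
from statistics import median, mode

# Hypothetical data set, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

def population_variance(values):
    center = sum(values) / len(values)          # Step 1: mean of the data
    deviations = [x - center for x in values]   # Step 2: deviation of each point
    squared = [d ** 2 for d in deviations]      # Step 3: square each deviation
    total = sum(squared)                        # Step 4: add up the squares
    return total / len(values)                  # Step 5: divide by the count

data_range = max(data) - min(data)   # largest value minus smallest value
variance = population_variance(data)
std_dev = math.sqrt(variance)        # positive square root of the variance

print("mean:", sum(data) / len(data))
print("median:", median(data))
print("mode:", mode(data))
print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

For this data the mean is 5, the variance is 4, and the standard deviation is 2. When the data are a sample rather than a whole population, the sum of squares is commonly divided by n - 1 instead of n, which is what `statistics.variance` does.
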
CAP Domain III - Measures of Central Tendency - Mode correct answer The value that occurs most often in the data. Useful with categorical variables, where the median is not meaningful.
Skewness correct answer In a normal distribution, the mean and the median are the same number, while in a skewed distribution they differ. A left-skewed (negatively skewed) distribution will have the mean to the left of the median. A right-skewed (positively skewed) distribution will have the mean to the right of the median. With too much skewness, many statistical methods won't work.
Hypothesis testing correct answer Used to judge whether a population exhibits a certain characteristic by measuring only a sample of that population, for example to show that the mean of a population differs from some proposed value. I.e., hypothesis testing helps us determine whether sets of data are different enough. A theory must be stated as two opposing hypotheses constructed to contain all possibilities. Null hypothesis = H0: the statement to disprove. Alternate hypothesis = HA: the statement to prove.
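
As a rough illustration of hypothesis testing, and of the difference between statistical significance and effect size, the sketch below runs a one-sample t test by hand. The sample, the proposed mean of 50, and the names `one_sample_t` and `T_CRITICAL` are assumptions made for this example. The critical value 2.262 is the two-sided 5% point of the t distribution with 9 degrees of freedom, taken from a standard t table. Cohen's d is used here as one common effect size measure; the cards above do not name a specific one.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of 10 measurements, made up for illustration.
sample = [52, 55, 51, 53, 54, 56, 50, 53, 52, 54]
proposed_mean = 50.0  # H0: the population mean equals 50; HA: it does not

# Two-sided critical value for alpha = 0.05 with n - 1 = 9 degrees of freedom.
T_CRITICAL = 2.262

def one_sample_t(values, mu0):
    """Return the t statistic and Cohen's d for a one-sample test against mu0."""
    n = len(values)
    s = stdev(values)                   # sample standard deviation (divides by n - 1)
    diff = mean(values) - mu0
    t_stat = diff / (s / math.sqrt(n))  # significance: is the difference real?
    cohens_d = diff / s                 # effect size: how large is the difference?
    return t_stat, cohens_d

t_stat, cohens_d = one_sample_t(sample, proposed_mean)
reject_null = abs(t_stat) > T_CRITICAL

print("t statistic:", round(t_stat, 3))
print("Cohen's d:", round(cohens_d, 3))
print("reject H0 at the 5% level:", reject_null)
```

Here the t statistic is about 5.2, well beyond the critical value, so H0 is rejected; Cohen's d of about 1.6 says the difference is also large relative to the spread of the data.
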