Download SPSS tutorial on cluster analysis (.pdf) and more Lecture notes Marketing in PDF only on Docsity!
SPSS TutorialSPSS Tutorial
AEB 37 / AE 802
Marketing Research Methods
Week 7
Cluster analysisCluster analysis
Lecture / Tutorial outline
- Cluster analysis
- Example of cluster analysis
- Work on the assignment
Cluster AnalysisCluster Analysis andand
marketing research marketing research
- Market segmentation. E.g. clustering of
consumers according to their attribute
preferences
- Understanding buyers behaviours.
Consumers with similar
behaviours/characteristics are clustered
- Identifying new product opportunities.
Clusters of similar brands/products can help
identifying competitors / market opportunities
- Reducing data. E.g. in preference mapping
Steps to conduct aSteps to conduct a Cluster Analysis Cluster Analysis
1. Select a distance measure
2. Select a clustering algorithm
3. Determine the number of clusters
4. Validate the analysis
Defining distance: theDefining distance: the
Euclidean distance Euclidean distance
D
ij
distance between cases i and j
x
ki
value of variable X
k
for case j
Problems:
• Different measures = different weights
• Correlation between variables (double
counting)
Solution: Principal component analysis
n
ij ki kj
k
D x x
Clustering proceduresClustering procedures
- Hierarchical procedures
- Agglomerative (start from n clusters,
to get to 1 cluster)
- Divisive (start from 1 cluster, to get to
n cluster)
- Non hierarchical procedures
AgglomerativeAgglomerative
clustering clustering
- Linkage methods
- Single linkage (minimum distance)
- Complete linkage (maximum distance)
- Average linkage
- Ward’s method
- Compute sum of squared distances within clusters
- Aggregate clusters with the minimum increase in the overall sum of squares
- Centroid method
- The distance between two clusters is defined as the difference between the centroids (cluster averages)
KK--means clusteringmeans clustering
- The number k of cluster is fixed
- An initial set of k “seeds” (aggregation centres) is provided
- First k elements
- Other seeds
- Given a certain treshold, all units are assigned to the nearest cluster seed
- New seeds are computed
- Go back to step 3 until no reclassification is necessary Units can be reassigned in successive steps ( optimising partioning )
Suggested approachSuggested approach
1. First perform a hierarchical
method to define the number of
clusters
2. Then use the k - means procedure
to actually form the clusters
Defining the number ofDefining the number of clusters: elbow rule (1) clusters: elbow rule (1) Agglomeration Schedule 4 7 .015 0 0 4 6 10 .708 0 0 5 8 9 .974 0 0 4 4 8 1.042 1 3 6 1 6 1.100 0 2 7 4 5 3.680 4 0 7 1 4 3.492 5 6 8 1 11 6.744 7 0 9 1 2 8.276 8 0 10 1 12 8.787 9 0 11 1 3 11.403 10 0 0 Stage 1 2 3 4 5 6 7 8 9 10 11 Cluster 1 Cluster 2 Cluster Combined CoefficientsCluster 1 Cluster 2 Stage Cluster First Appears Next Stage Stage Number of clusters 0 12 1 11 2 10 3 9 4 8 5 7 6 6 7 5 8 4 9 3 10 2 11 1
n
Validating theValidating the
analysis analysis
- Impact of initial seeds / order of
cases
- Impact of the selected method
- Consider the relevance of the
chosen set of variables
SPSS ExampleSPSS Example
Agglomeration Schedule 3 6 .026 0 0 8 2 5 .078 0 0 7 4 9 .224 0 0 5 1 7 .409 0 0 6 4 10 .849 3 0 8 1 8 1.456 4 0 7 1 2 4.503 6 2 9 3 4 9.878 1 5 9 1 3 18.000 7 8 0 Stage 1 2 3 4 5 6 7 8 9 Cluster 1 Cluster 2 Cluster Combined Coefficients Cluster 1 Cluster 2 Stage Cluster First Appears Next Stage
Number of clusters: 10 – 6 = 4
Component
-1.5 -1.0 -.5 0.0 .5 1.0 1.5 2.
Component
.
-. -1. -1. -2.
Cluster Number of Ca
4 3 2 1 LUCY JULIA FRED ARTHUR JENNIFER THOMAS MATTHEW NICOLE PAMELA JOHN