John Wiley & Sons, Inc.
New York • Chichester • Weinheim • Brisbane • Singapore • Toronto
Wiley Computer Publishing
Ralph Kimball
Margy Ross
The Data Warehouse
Toolkit
Second Edition
The Complete Guide to
Dimensional Modeling
Acknowledgments
First of all, we want to thank the thousands of you who have read our Toolkit
books, attended our courses, and engaged us in consulting projects. We have
learned as much from you as we have taught. As a group, you have had a
profoundly positive impact on the data warehousing industry. Congratulations!
This book would not have been written without the assistance of our business
partners. We want to thank Julie Kimball of Ralph Kimball Associates for her
vision and determination in getting the project launched. While Julie was the
catalyst who got the ball rolling, Bob Becker of DecisionWorks Consulting
helped keep it in motion as he drafted, reviewed, and served as a general
sounding board. We are grateful to them both for their enormous help.
We wrote this book with a little help from our friends, who provided input or
feedback on specific chapters. We want to thank Bill Schmarzo of
DecisionWorks, Charles Hagensen of Attachmate Corporation, and Warren
Thornthwaite of InfoDynamics for their counsel on Chapters 6, 7, and 16, respectively.
Bob Elliott, our editor at John Wiley & Sons, and the entire Wiley team have
supported this project with skill, encouragement, and enthusiasm. It has been
a pleasure to work with them. We also want to thank Justin Kestelyn,
editor-in-chief at Intelligent Enterprise, for allowing us to adapt materials from
several of Ralph’s articles for inclusion in this book.
To our families, thanks for being there for us when we needed you and for
giving us the time it took. Spouses Julie Kimball and Scott Ross and children Sara
Hayden Smith, Brian Kimball, and Katie Ross all contributed a lot to this book,
often without realizing it. Thanks for your unconditional support.
Introduction
The data warehousing industry certainly has matured since Ralph Kimball
published the first edition of The Data Warehouse Toolkit (Wiley) in 1996. Although
large corporate early adopters paved the way, since then, data warehousing
has been embraced by organizations of all sizes. The industry has constructed
thousands of data warehouses. The volume of data continues to grow as we
populate our warehouses with increasingly atomic data and update them with
greater frequency. Vendors continue to blanket the market with an
ever-expanding set of tools to help us with data warehouse design, development,
and usage. Most important, armed with access to our data warehouses,
business professionals are making better decisions and generating payback on
their data warehouse investments.
Since the first edition of The Data Warehouse Toolkit was published,
dimensional modeling has been broadly accepted as the dominant technique for data
warehouse presentation. Data warehouse practitioners and pundits alike have
recognized that the data warehouse presentation must be grounded in
simplicity if it stands any chance of success. Simplicity is the fundamental key that
allows users to understand databases easily and software to navigate
databases efficiently. In many ways, dimensional modeling amounts to holding the
fort against assaults on simplicity. By consistently returning to a
business-driven perspective and by refusing to compromise on the goals of user
understandability and query performance, we establish a coherent design that
serves the organization’s analytic needs. Based on our experience and the
overwhelming feedback from numerous practitioners from companies like
your own, we believe that dimensional modeling is absolutely critical to a
successful data warehousing initiative.
Dimensional modeling also has emerged as the only coherent architecture for
building distributed data warehouse systems. When we use the conformed
dimensions and conformed facts of a set of dimensional models, we have a
practical and predictable framework for incrementally building complex data
warehouse systems that have no center.
For all that has changed in our industry, the core dimensional modeling
techniques that Ralph Kimball published six years ago have withstood the test of
time. Concepts such as slowly changing dimensions, heterogeneous products,
factless fact tables, and architected data marts continue to be discussed in data
warehouse design workshops around the globe. The original concepts have
been embellished and enhanced by new and complementary techniques. We
decided to publish a second edition of Kimball’s seminal work because we felt
that it would be useful to pull together our collective thoughts on dimensional
modeling under a single cover. We have each focused exclusively on decision
support and data warehousing for over two decades. We hope to share the
dimensional modeling patterns that have emerged repeatedly during the
course of our data warehousing careers. This book is loaded with specific,
practical design recommendations based on real-world scenarios.
The goal of this book is to provide a one-stop shop for dimensional modeling
techniques. True to its title, it is a toolkit of dimensional design principles and
techniques. We will address the needs of those just getting started in
dimensional data warehousing, and we will describe advanced concepts for those of
you who have been at this a while. We believe that this book stands alone in its
depth of coverage on the topic of dimensional modeling.
Intended Audience
This book is intended for data warehouse designers, implementers, and
managers. In addition, business analysts who are active participants in a
warehouse initiative will find the content useful.
Even if you’re not directly responsible for the dimensional model, we believe
that it is important for all members of a warehouse project team to be
comfortable with dimensional modeling concepts. The dimensional model has an
impact on most aspects of a warehouse implementation, beginning with the
translation of business requirements, through data staging, and finally, to the
unveiling of a data warehouse through analytic applications. Due to the broad
implications, you need to be conversant in dimensional modeling regardless
of whether you are responsible primarily for project management, business
analysis, data architecture, database design, data staging, analytic
applications, or education and support. We’ve written this book so that it is accessible
to a broad audience.
For those of you who have read the first edition of this book, some of the
familiar case studies will reappear in this edition; however, they have been updated
significantly and fleshed out with richer content. We have developed vignettes
for new industries, including health care, telecommunications, and electronic
commerce. In addition, we have introduced more horizontal, cross-industry
case studies for business functions such as human resources, accounting,
procurement, and customer relationship management.