

Typology: Essays (university)


Introduction to Data Mining and Modelling
Subject: Management Paper 1

The important processes that need to be clearly delineated for data processing, analysis and modelling are:

Data model: what data will be available and how will it flow?
Data gathering: how will data be gathered, in both physical and technological terms?
Data gathered: what data will be gathered?
Data types: what types of data will be gathered?
Data formatting: how will data be held?
Data warehousing: where will data be held?
Data mining: how will we retrieve data from the warehouse?
Information modelling: how will we create models, and of what?
Information access: how will we access the information models and reports?
Presentation and reporting: what will we report on?

Most companies want to understand essential information about customers at every point of contact, for example:

Lifetime value
Cross-sell and upgrade potential
Acquisition cost
Channel preferences
Loyalty/retention
Purchase behaviour patterns
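As a rough illustration of how two of the customer measures listed above (channel preferences and purchase behaviour) might be derived from data gathered at each point of contact, here is a minimal Python sketch. The contact log, its field layout and the function name summarise_contacts are assumptions invented for this example; a real warehouse would hold far more detail.

```python
from collections import Counter, defaultdict

# Hypothetical contact log, invented for illustration only:
# (customer id, channel used, purchase amount; 0.0 means no purchase).
contacts = [
    ("C1", "web", 40.0),
    ("C1", "store", 0.0),
    ("C1", "web", 25.0),
    ("C2", "phone", 10.0),
    ("C2", "phone", 0.0),
]


def summarise_contacts(log):
    """Aggregate a contact log into one summary record per customer."""
    channel_counts = defaultdict(Counter)
    total_spend = defaultdict(float)
    purchase_counts = defaultdict(int)
    for customer, channel, amount in log:
        channel_counts[customer][channel] += 1
        if amount > 0:
            total_spend[customer] += amount
            purchase_counts[customer] += 1
    return {
        customer: {
            # Channel preference: the channel this customer used most often.
            "preferred_channel": counts.most_common(1)[0][0],
            # Purchase behaviour: how many contacts led to a purchase, and for how much.
            "purchases": purchase_counts[customer],
            "total_spend": total_spend[customer],
        }
        for customer, counts in channel_counts.items()
    }


summary = summarise_contacts(contacts)
```

The summary holds aggregated figures rather than the source records, so it would need to be rebuilt whenever the underlying contact data changes.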
Much of the information they need will have different frequencies of change, refreshment or occurrence, and it will be kept for various periods. In some cases, aggregated data may be kept instead of source data. All of these factors affect the data modelling exercise and therefore the eventual modelling software requirements.

Turning the data into useful information requires:

Identifying the issue(s)
Assembling the data set(s)
Building models
Verifying models
Interpreting the results
Automating the delivery

Thereafter, modelling tools and techniques need to be used. These are often divided into two groups: theory driven and data driven. Theory driven modelling (hypothesis testing) attempts to substantiate or disprove preconceived ideas. Theory driven modelling tools require the user to specify most of the model based on prior knowledge and then test whether the model is valid. Data driven modelling tools automatically create the model based on the patterns they find in the data. This model must also be tested before it can be accepted as valid.
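The contrast between the two groups can be sketched in a few lines of Python. This is an illustrative sketch only, not a description of any particular modelling tool. The spend figures are invented, and the two techniques (a permutation test for the theory driven case, a single best-split search for the data driven case) are assumptions chosen for brevity.

```python
import random
from statistics import mean

# Hypothetical monthly spend figures, invented for illustration only.
members = [52, 61, 58, 66, 70, 64]
non_members = [31, 28, 40, 35, 30, 38]


def permutation_p_value(group_a, group_b, trials=2000, seed=0):
    """Theory driven: test the preconceived idea that group_a spends more.

    Returns the share of random relabellings whose mean difference is at
    least as large as the observed one (a one-sided permutation test).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            at_least_as_large += 1
    return at_least_as_large / trials


def best_split(values):
    """Data driven: find the threshold that best separates the values into
    two groups, with no labels supplied, by minimising the total
    within-group squared error."""
    ordered = sorted(values)
    best_sse, best_threshold = None, None
    for i in range(1, len(ordered)):
        left, right = ordered[:i], ordered[i:]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((v - left_mean) ** 2 for v in left)
               + sum((v - right_mean) ** 2 for v in right))
        if best_sse is None or sse < best_sse:
            best_sse = sse
            best_threshold = (ordered[i - 1] + ordered[i]) / 2
    return best_threshold


def split_accuracy(threshold, high_group, low_group):
    """Check a discovered threshold against known labels before accepting it."""
    correct = (sum(v > threshold for v in high_group)
               + sum(v <= threshold for v in low_group))
    return correct / (len(high_group) + len(low_group))


p_value = permutation_p_value(members, non_members)
threshold = best_split(members + non_members)
accuracy = split_accuracy(threshold, members, non_members)
```

In the first function the user supplies the hypothesis and the data only confirm or reject it. In the second, the pattern (a spend threshold of 46.0 for this data) comes from the data alone, and split_accuracy then plays the role of the validation step that a data driven model needs before it is accepted.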