


















Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Your company is currently working in [Assumed Domain] for 2 years. For a new, young company, the competition in the market is very high. Therefore, the Board of Director has decided to apply Business Intelligence to improve the company business process by making better decisions. The Board of Directors assigns a small group including you in Research & Development Department to study business intelligence to apply for the company in the coming years. You need to research about business processes and decision support processes in the company and identify the types of data (unstructured, semi-structured or structured) generated by these processes with examples. You also need to research about current software used in the business process or decision support process and evaluate these usages (benefits and drawbacks). Next you need to understand the types of support for decision-making at different levels (operational, tactical and strategic) within the company and study which business
Typology: Cheat Sheet
1 / 26
This page cannot be seen from the preview
Don't miss anything!



















Qualification BTEC Level 5 HND Diploma in Computing Unit number and title Unit 14: Business Intelligence Submission date 04/03/2023 Date Received 1st submission Re-submission Date Date Received 2nd submission Student Name Nguyen Dan Que Student ID GCS Class GCS0905A Assessor name Nguyen Xuan Sam Student declaration I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism. I understand that making a false declaration is a form of malpractice.
Student’s signature Grading grid P1 P2 M1 M2 D1 D
Student Name/ID Number Nguyen Dan Que/GCS
Unit Number and Title 14: Business Intelligence Academic Year 2019 - 2020 Unit Tutor Assignment Title Number & Assignment 1: Discover business process and BI technologies
Issue Date Submission Date IV Name & Date
Submission Format The submission is in the form of a Microsoft® PowerPoint® style presentation to be presented to your colleagues. The presentation can include links to performance data with additional speaker notes and a bibliography using the Harvard referencing system. T copy. You are required to make effective use of headings, bullet points and subsections,he presentation slides for the findings should be submitted with speaker notes as one as appropriate. Your research should be referenced using the Harvard referencing The recommended word limit is 500 words, including speaker notes, although you will not system. be penalised for exceeding the total word limit.
Unit Learning Outcomes LO1 making. Discuss business processes and the mechanisms used to support business decision- LO2 Compare the tools and technologies associated with business intelligence functionality Assignment Brief Your company is currently working in [Assumed Domain] for 2 years. For a new, young company, the competition in the market is ver decided to apply Business Intelligence to improve the company business process byy high. Therefore, the Board of Director has making better decisions. The Board of Directors assigns Department to study business intelligence to apply for the company in the coming year a small group including you in Research & Developments. You need to research about business processes and decision support processes in the company and identify the types of data generated by these processes with examples (unstructured, semi. You also need to research about current-structured or structured) software used in the business process or decision support process and evaluate these usages (benefits and drawbacks) Next you need to understand the types of support for decision. -making at different levels (operational, tactical and strategic) within the company and study which business
intelligence features can help on that types of support. Study the information systems or technologies (of BI) can be used in this case, compare and co which should be used. ntrast them to conclude Your group needs to present the research results to the board in a presentation of 30 minutes.
Learning Outcomes and Assessment Criteria Pass Merit Distinction
LO1 Discuss business processes and the mechanisms used to support business decision-making
D1 Evaluate the benefits and drawbacks of using application software as a mechanism for business processing.
P1 Examine, using examples, the terms ‘Business Process’ and ‘Supporting Processes’.
M1 Differentiate between unstructured and semi-structured data within an organisation.
LO2 intelligence functionality Compare the tools and technologies associated with business^ D2 contrast a range of^ Compare and information systems and technologies that can be used to support organisations at operational, tactical and strategic levels.
P2 support available for business Compare the types of decision levels within an organisation.-making at varying
M2 the key features of business Justify, with specific examples, intelligence functionality.
Assignment 1 answers
1. Introduction
1. 1 Overview of problems Nowadays, the progress of the economy and human needs are constantly changing, buying a house is no exception. However, the prices of houses are not stable and are always affected by many different factors. On the other hand, the needs of home buyers are difficult to meet. Therefore, the purpose of the model to predict the house price is based on many factors such as architecture, location, quality, ...
Figure 1: Housing price factors In machine learning, we can predict the trend of price based on available information, this is applicable in most of the applications around us such as online shopping, entertainment, ... Similarly, this is also applied to price prediction.
1. 2 Motivations With the increasing demand for housing, meeting the needs of customers is also becoming increasingly difficult. When deciding to buy a house, customers always want to find a house with a convenient location, reasonable spaciousness, and many other requirements, but especially, the price of the house must be cheap or at least reasonable. Applying machine learning to house price prediction will significantly reduce the pressure on real estate. At the
same time, customers will also find it easier to find a suitable home for them at a reasonable price. In this report, I will predict home prices in King County, United State.
1.3 Objectives In this report, there will be some objectives that we’re focusing on:
Table 1 : First 5 rows of data from dataset
2.2.3 Dataset description As mentioned above, this dataset contains 21 features:
3. Proposed model
3.1 Correlation A correlational research strategy looks into correlations between variables without allowing the researcher to control or manipulate any of them. (Bhandari P, 2021) A correlation is a measurement of the intensity and/or direction of a link between two (or more) variables. A correlation’s direction might be either positive or negative. (Bhandari P,
Table 2: Correlation explanation Positive correlation Both variables change in the same way. Negative correlation Variables shift in opposing directions. Zero correlation There is no connection between the variables.
Figure 2 : correlation coefficient There are also many types of correlation coefficients, the most common one is Pearson’s due to its strong inferences. (Bhandari P, 2021)
𝑟 = (^) √[𝑛Σ𝑥 2 𝑛 (^) −Σ 𝑥𝑦(Σ𝑥^ −) 2 (]Σ(𝑥𝑛)Σ(𝑦Σ 2 𝑦 (^) −) (Σ𝑦) (^2) ) (^) (1)
Rxy is strength of the correlation between variables x and y n is sample size ∑ is sum of what follows
Β 1 X 1 is the regression coefficient (Β 1 ) of the first independent variable (X 1 ) (a.k.a. the effect that increasing the value of the independent variable has on the predicted y value) … is do the same for however many independent variables you are testing Βn Xn is the regression coefficient of the last independent variable e is model error (a.k.a. how much variation there is in our estimate of ) (Bevans R, 2020)
3.4 R-squared and adjusted R-squared R-squared The coefficient of determination (R²) measures how well a statistical model predicts an outcome. (Turney S, 2022) The lowest possible value of R² is 0 and the highest possible value is 1. Put simply, the better a model is at making predictions, the closer its R² will be to 1. (Turney S, 2022) You can choose one out of two formulas to calculate the coefficient of determination: Formula 1 𝑅^2 = (𝑟)^2 ( 4 )
r is Pearson correlation coefficient Formula 2
𝑅^2 = 1 − 𝑅𝑆𝑆 𝑇𝑆𝑆^ (^5 )
RSS is sum of squared residuals TSS is total sum of squares Adjusted R-squared The Adjusted Coefficient of Determination (Adjusted R-squared) is a Coefficient of Determination modification that takes the number of variables in a data set into account. It also penalizes you for points that do not correspond to the model. (Vogt, 2005)
n is sample size k is number of independent variables R^2 is coefficient of determination
3.5 Model estimation To measure model accuracy, we use Mean absolute (MAE), Mean square error (MSE) and Root mean square error (RMSE). Mean absolute error (MAE) The degree of inaccuracy in your measurements is expressed as absolute error. It represents the discrepancy between the measured and "actual" values.
n is the number of errors, Σ is summation symbol (which means “add them all up”), |xi – x| is the absolute errors. Mean square error (MSE) The degree of inaccuracy in statistical models is measured by mean squared error (MSE). The average squared difference between the observed and expected values is calculated. When there is no error in a model, the MSE is 0. When model inaccuracy rises, so does its value. The mean squared deviation is another name for the mean squared error (MSD). (Frost J)
𝑀𝑆𝐸 = Σ(𝑦𝑖^ 𝑛−^ ŷ i)^2 (^8 )
4. Simulating scenario and results
4.1 Package installation Firstly, I downloaded Anaconda navigator on https://www.anaconda.com. After downloaded and run the program, you’ll see the main page as figure 3 below.
Figure 3: Anaconda navigator For this work, I’m using Jupyter lab on Anaconda. JupyterLab is the latest web-based interactive development environment for notebooks, code, and data. First, the two libraries needed for data exploration and analysis are Pandas and Numpy. The matplotlib and seaborn libraries are used in visualization. The statsmodels libraries is used for modeling. Seaborn is used for creating heatmap.
Figure 4: Added libraries
4.2 Correlation For the correlation of this dataset, I used the heat map below:
Figure 5: Heatmap I collected some high correlation pairs as well as some features I think are important when looking for a house:
Figure 7 : OLS regression result of bedrooms Looking at the result above, the R-squared = 0.095 means the number of bedrooms affect only 9.5% of the house price. The figure below show the relationship between number of bedrooms and the house prices.
Figure 8 : visualized model of bedrooms
The regression line shows that the relationship between number of bedrooms and the house prices is linear positive. As the number of bedrooms increases, house prices will also increase. However, the price of most houses with 5 rooms is not too different from that of houses with 4 rooms. On the other hand, most houses with 3, 4, and 5 bedrooms have approximately the same price. There are even bigger houses with lower prices. When buying a house, customers are often afraid of the problem that if the house has many bedrooms, the price will often be expensive. But based on the model, it can be seen that the number of rooms does not affect the price of the house too much. Not much influence does not mean that it is not an issue that needs attention. I think this feature still need to be focused on when buying a house. The formula used for this model: 𝑦ℎ𝑎𝑡 = 1. 217 𝑒 + 05 ∗ 𝑥 − 1. 298 𝑒 + 05
4.3.2 Prices and square feet living
Figure 9 : OLS regression result for sqft_living