Profitability Analysis Using Sales Data, Assignments of Advanced Data Analysis


Typology: Assignments

2025/2026

Available from 06/02/2026

EXAM-I-NATION

1. A retail company collects daily sales data from its stores. Recently, the
management noticed that profits are decreasing even though sales volume
appears to be increasing.
As a Data Analyst, you are asked to investigate the issue using the company’s
dataset, which contains:
  • Product ID
  • Product Category
  • Units Sold
  • Selling Price
  • Cost Price
  • Store Location
  • Date of Sale
Tasks:
1. Identify the key metrics you would calculate to analyze profitability.
2. Describe the steps you would take to clean and prepare the dataset
before analysis.
3. Suggest two possible reasons why profits might be decreasing despite
higher sales.
4. Recommend two data visualizations that would help management
understand the problem.
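
As an illustration of Task 1, the following minimal sketch computes revenue, cost, gross profit, and profit margin per month. The two sample rows, the field names, and the helper `monthly_profitability` are hypothetical and are not taken from the company's dataset; they only mirror the columns listed above and show how units sold can rise while profit falls.

```python
from collections import defaultdict

# Hypothetical rows for illustration only (not real company data).
# Field names are assumptions that mirror the columns in the assignment.
sales = [
    {"product_id": "P1", "category": "Grocery", "units_sold": 100,
     "selling_price": 10.0, "cost_price": 6.0,
     "store": "North", "date": "2025-01-15"},
    {"product_id": "P1", "category": "Grocery", "units_sold": 150,
     "selling_price": 8.0, "cost_price": 6.5,
     "store": "North", "date": "2025-02-15"},
]


def monthly_profitability(rows):
    """Aggregate units, revenue, cost, gross profit, and margin by month."""
    totals = defaultdict(lambda: {"units": 0, "revenue": 0.0, "cost": 0.0})
    for row in rows:
        month = row["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        bucket = totals[month]
        bucket["units"] += row["units_sold"]
        bucket["revenue"] += row["units_sold"] * row["selling_price"]
        bucket["cost"] += row["units_sold"] * row["cost_price"]

    report = {}
    for month, bucket in sorted(totals.items()):
        profit = bucket["revenue"] - bucket["cost"]
        margin = profit / bucket["revenue"] if bucket["revenue"] else 0.0
        report[month] = {
            "units": bucket["units"],
            "revenue": bucket["revenue"],
            "gross_profit": profit,
            "margin": margin,
        }
    return report


report = monthly_profitability(sales)
for month, metrics in report.items():
    print(month, metrics)
```

In this made-up example, units and revenue both increase from January to February, yet gross profit drops because the selling price fell while the cost price rose. The same grouping could be done by Product Category or Store Location to see where the margin is shrinking.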
2. You are a data analyst at a real estate company. The company wants to
predict house prices based on historical data. You have been given a dataset
with information about houses, but the data contains missing values and
categorical features. Your task is to preprocess the data so that it can be used
for machine learning models.
Dataset:
House ID   Size (sqft)   Bedrooms   Age (years)   Location   Price ($)
1          2000          3          10            A          500,000


House ID   Size (sqft)   Bedrooms   Age (years)   Location   Price ($)
2          1500          2          5             B          350,
3          2500          4          20            C          600,
4          NaN           3          15            B          400,
5          1800          NaN        8             A          450,

Assignment Tasks (provide statistical representations)

1. Show the calculations for missing-value replacement using mean and median imputation. Compare the results: which method is better for this dataset, and why?
2. Standardize the numerical features (Size, Bedrooms, Age) using Z-score normalization.
3. Detect potential outliers in the Price column using the Z-score.
4. Convert the Location column into a one-hot encoding.
5. Standardize the Size, Bedrooms, and Age columns using Z-score normalization. Show all calculations.
6. Scale the Size column using min-max normalization. Show the formula and results, and compare the difference between Z-score and min-max scaling. In which situations is each method better?
7. Create a new feature called Price per Sqft: Price per Sqft = Price / Size. Show the calculated values for all houses.
8. Suppose some houses have very high Age values compared to others. Suggest a transformation to reduce skewness in the Age column (e.g., log transformation). Show an example calculation.
9. Why is data preprocessing important before training a machine learning model? Give at least 2 reasons.
10. If you trained a model without scaling numerical features, what problems
11. Suggest a machine learning algorithm suitable for this problem. Justify your answer.
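
The preprocessing steps in the tasks above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not a model answer. The prices for houses 2 to 5 are truncated in the source ("350,", "600,", and so on) and are assumed here to be in thousands of dollars, by analogy with house 1 (500,000). The Z-score uses the population standard deviation, which the assignment does not specify. All helper names (`impute`, `z_scores`, `min_max`, `one_hot`) are hypothetical.

```python
import math
import statistics

# Rows from the assignment table; None marks the NaN entries.
# ASSUMPTION: the truncated prices of houses 2-5 are read as thousands of
# dollars, by analogy with house 1 (500,000).
houses = [
    {"id": 1, "size": 2000, "bedrooms": 3, "age": 10, "location": "A", "price": 500_000},
    {"id": 2, "size": 1500, "bedrooms": 2, "age": 5, "location": "B", "price": 350_000},
    {"id": 3, "size": 2500, "bedrooms": 4, "age": 20, "location": "C", "price": 600_000},
    {"id": 4, "size": None, "bedrooms": 3, "age": 15, "location": "B", "price": 400_000},
    {"id": 5, "size": 1800, "bedrooms": None, "age": 8, "location": "A", "price": 450_000},
]


def impute(values, strategy):
    """Replace None with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def z_scores(values):
    """z = (x - mean) / std, with the population std (an assumption)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """x' = (x - min) / (max - min), mapping values into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """One indicator column per category, in sorted category order."""
    categories = sorted(set(labels))
    return [[int(label == c) for c in categories] for label in labels]


sizes = [h["size"] for h in houses]
size_mean = impute(sizes, "mean")      # Task 1: mean imputation
size_median = impute(sizes, "median")  # Task 1: median imputation
bedrooms = impute([h["bedrooms"] for h in houses], "median")
ages = [h["age"] for h in houses]

size_z = z_scores(size_median)                              # Tasks 2 and 5
price_z = z_scores([h["price"] for h in houses])            # Task 3
size_scaled = min_max(size_median)                          # Task 6
location_encoded = one_hot([h["location"] for h in houses])  # Task 4
price_per_sqft = [h["price"] / s for h, s in zip(houses, size_median)]  # Task 7
age_log = [math.log1p(a) for a in ages]                     # Task 8: log(1 + Age)

print("Size (mean-imputed):  ", size_mean)
print("Size (median-imputed):", size_median)
print("Size z-scores:        ", [round(z, 3) for z in size_z])
print("Price z-scores:       ", [round(z, 3) for z in price_z])
print("Size min-max:         ", [round(v, 3) for v in size_scaled])
print("Location one-hot:     ", location_encoded)
print("Price per sqft:       ", [round(p, 2) for p in price_per_sqft])
print("log1p(Age):           ", [round(a, 3) for a in age_log])
```

With the four observed Size values (2000, 1500, 2500, 1800), mean imputation fills the gap with 1950 and median imputation with 1900; for Bedrooms both methods give 3. The downstream steps here use the median-imputed Size, which is one reasonable choice for a small sample but not the only one.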