



A comprehensive exam covering various topics in probability distributions, optimization models, and simulation analysis. It includes questions on modeling scenarios using Poisson and Weibull distributions, handling missing data in classification models, formulating mathematical constraints for diet optimization problems, analyzing simulation output, adjusting simulation estimates, classifying optimization problems, and understanding queuing models and Markov chain analysis. The exam also explores customer retention approaches, multi-armed bandit methods, and variable selection techniques in regression models.
Typology: Exams




The number of people clicking an online banner ad each hour can be modeled using a Poisson distribution. The Poisson distribution is appropriate for modeling the number of events occurring in a fixed interval of time or space, when the events occur independently and at a constant average rate.
The time from when a generator is turned on until it fails can be modeled using a Weibull distribution. The Weibull distribution is commonly used to model the lifetime or failure time of various components and systems.
The number of hits to a real estate website each minute can be modeled using a Poisson distribution. The Poisson distribution is appropriate for modeling the number of events occurring in a fixed interval of time or space, when the events occur independently and at a constant average rate.
The number of people entering a grocery store each minute can be modeled using a Poisson distribution. The Poisson distribution is appropriate for modeling the number of events occurring in a fixed interval of time or space, when the events occur independently and at a constant average rate.
The time between hits on a real estate website can be modeled using an Exponential distribution. The Exponential distribution is commonly used to model the time between independent events occurring at a constant average rate.
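The link between the Exponential and Poisson answers above can be illustrated with a small simulation. This is a minimal sketch, not part of the exam: the rate of 4 hits per minute, the function name, and the seed are assumptions chosen for illustration. It generates arrivals with exponential gaps and counts how many fall in each one-minute interval; for a Poisson count, the mean and variance should both come out close to the rate.

```python
import random

def counts_per_interval(rate, n_intervals, seed=0):
    """Simulate arrivals with exponential gaps and count arrivals per unit interval."""
    rng = random.Random(seed)
    counts = [0] * n_intervals
    t = rng.expovariate(rate)          # time of the first arrival
    while t < n_intervals:
        counts[int(t)] += 1            # the arrival falls in interval floor(t)
        t += rng.expovariate(rate)     # exponential gap to the next arrival
    return counts

# Assumed rate: 4 hits per minute, observed over 20,000 minutes.
counts = counts_per_interval(rate=4.0, n_intervals=20000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
```

Both `mean` and `var` should land near the assumed rate, which is the signature of Poisson-distributed counts.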
Handling Missing Data in Classification Models
If school ratings cannot be reasonably well-predicted from the other factors, and new schools built due to recent population growth can be reasonably well-classified using the other factors, then Model 5 would be the recommended approach. Model 5 uses a categorical variable to identify neighborhoods with missing data, and then classifies the reason for the missing data (population growth or other reason).
Model 5 would be recommended in the situation where ratings cannot be well-predicted, and reasons for building schools can be well-classified. In this case, the categorical variable approach in Model 5 can effectively handle the missing data by distinguishing between neighborhoods with new schools due to population growth and those with new schools for other reasons.
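The categorical-variable idea can be sketched as follows. The exam does not show how Model 5 is implemented, so everything here is an assumption for illustration: the field names `rating` and `reason`, the helper name, and the choice of 0.0 as a placeholder value that is only meaningful together with the flag.

```python
def add_missing_indicator(rows, field, reason_field):
    """Return copies of the rows with a categorical flag describing a missing value."""
    out = []
    for row in rows:
        new_row = dict(row)
        if row.get(field) is None:
            # The category records why the value is missing (assumed labels).
            reason = row.get(reason_field) or "other"
            new_row[field + "_status"] = "missing_" + reason
            new_row[field] = 0.0  # placeholder; interpreted only via the flag
        else:
            new_row[field + "_status"] = "present"
        out.append(new_row)
    return out

# Hypothetical neighborhoods: one rated school, two unrated ones.
neighborhoods = [
    {"rating": 7.5, "reason": None},
    {"rating": None, "reason": "population_growth"},
    {"rating": None, "reason": None},
]
flagged = add_missing_indicator(neighborhoods, "rating", "reason")
statuses = [row["rating_status"] for row in flagged]
```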
Mathematical Constraints for Diet Optimization Problems
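The mathematical constraint that corresponds to the English sentence "If any amount of cheese sauce is eaten, then its binary variable $y_{\text{cheese sauce}}$ must be 1" is: $x_{\text{cheese sauce}} \le M\,y_{\text{cheese sauce}}$, where $M$ is a sufficiently large constant. Any positive amount $x_{\text{cheese sauce}}$ then forces $y_{\text{cheese sauce}} = 1$.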
The mathematical constraint that corresponds to the English sentence "Neither peanut butter nor cheese sauce can be eaten" is: $y_{\text{peanut butter}} + y_{\text{cheese sauce}} = 0$
The mathematical constraint that corresponds to the English sentence "Unless peanut butter is eaten, no amount of broccoli can be eaten" is: $x_{\text{broccoli}} \le M\,y_{\text{peanut butter}}$
The mathematical constraint that corresponds to the English sentence "Either peanut butter or cheese sauce, but not both, must be eaten" is: $y_{\text{peanut butter}} = 1 - y_{\text{cheese sauce}}$
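To see how these constraints behave on a concrete diet, the sketch below evaluates each one for a candidate solution. It is an illustration only: the value of the large constant `M`, the dictionary keys, and the sample amounts are assumptions, and the linking constraint for cheese sauce is taken in its standard form, amount at most M times the binary variable.

```python
M = 1000.0  # assumed "large enough" constant for the linking constraints

def check_constraints(x, y):
    """Evaluate each diet constraint for amounts x and binary indicators y."""
    return {
        # Eating any cheese sauce forces its binary variable to 1.
        "cheese_link": x["cheese_sauce"] <= M * y["cheese_sauce"],
        # Neither peanut butter nor cheese sauce is eaten.
        "neither": y["peanut_butter"] + y["cheese_sauce"] == 0,
        # No broccoli unless peanut butter is eaten.
        "broccoli_needs_pb": x["broccoli"] <= M * y["peanut_butter"],
        # Exactly one of peanut butter and cheese sauce is eaten.
        "exactly_one": y["peanut_butter"] == 1 - y["cheese_sauce"],
    }

# Hypothetical diet: cheese sauce and broccoli, no peanut butter.
x = {"cheese_sauce": 2.0, "broccoli": 1.0, "peanut_butter": 0.0}
y = {"cheese_sauce": 1, "peanut_butter": 0}
result = check_constraints(x, y)
```

For this diet the linking and exactly-one constraints hold, while the "neither" and broccoli constraints are violated.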
Adjusting Simulation Estimates
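If the simulated wait times are, on average, 50% lower than the actual wait times, scaling all estimates up by a factor of 1/0.50 (that is, 2) makes the average simulation estimate match the average actual wait time. A gap this large also indicates that the simulation is a poor match to reality, so the simulation itself should be investigated rather than relying on the scaled output alone.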
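The arithmetic behind the factor can be checked in a few lines. The 12-minute actual wait is a made-up number for illustration; only the 50% gap comes from the text. Matching the averages this way says nothing about whether the rest of the simulated distribution is realistic.

```python
def scale_factor(simulated_mean, actual_mean):
    """Factor that makes the simulated average equal the actual average."""
    return actual_mean / simulated_mean

actual_mean = 12.0                          # hypothetical actual average wait (minutes)
simulated_mean = actual_mean * (1 - 0.50)   # simulation runs 50% lower
factor = scale_factor(simulated_mean, actual_mean)
```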
Classifying Optimization Problems
The optimization problem described is a Linear program, as it involves minimizing a linear objective function subject to linear constraints with non-negative variables.
The optimization problem described is an Integer program, as it involves minimizing a linear objective function subject to linear constraints with binary (0-1) variables.
The optimization problem described is a General non-convex program, as the objective function involves a non-linear (sinusoidal) term.
The optimization problem described is a Convex program, as the objective function involves the absolute value of a linear expression, which is a convex function.
The optimization problem described is a General non-convex program, as the constraints involve a quadratic expression in the variables.
The optimization problem described is a Convex quadratic program, as the objective function is a convex quadratic expression and the constraints are linear.
The optimization problem described is a Linear program, as it involves minimizing a linear objective function subject to linear constraints with non-negative variables.
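The difference between the absolute-value and sinusoidal objectives can be probed numerically with a midpoint test: a convex function never lies above the average of its endpoint values at the midpoint. The sketch below is an illustration, not a proof. The example functions and the grid are assumptions, and finding no violation on a grid does not establish convexity, whereas a single violation does establish non-convexity.

```python
import math

def violates_midpoint_convexity(f, points):
    """Return True if some pair (a, b) has f((a+b)/2) > (f(a) + f(b)) / 2."""
    for a in points:
        for b in points:
            mid = (a + b) / 2
            # Small tolerance guards against floating-point round-off.
            if f(mid) > (f(a) + f(b)) / 2 + 1e-12:
                return True
    return False

grid = [i / 10 for i in range(-50, 51)]
# Absolute value of a linear expression (assumed example: |2x - 3|).
abs_violates = violates_midpoint_convexity(lambda x: abs(2 * x - 3), grid)
# Sinusoidal term.
sin_violates = violates_midpoint_convexity(math.sin, grid)
```

The absolute-value function passes the check on this grid, while the sine function fails it.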
Queuing Model and Markov Chain Analysis
The queuing model would be expected to show that wait times are high at both busy and non-busy times. In response, the supermarket has decided to open an express checkout line whenever 5 or more people are waiting, and to keep it open until nobody is left waiting.
The supermarket would like to model this new process with a Markov chain, where each state represents the number of people waiting (e.g., 0 people waiting, 1 person waiting, etc.). However, the transition probabilities from a state like '3 people waiting' depend on how many lines are currently open, and therefore depend on whether the system was more recently in the state '5 people waiting' or '0 people waiting'. This means the process is not memoryless, so the Markov chain model would not be well-defined.
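The dependence on history can be made visible with a toy simulation of the express-line rule. The dynamics are assumptions chosen only for illustration: each step is either an arrival or a service completion, with arrivals less likely relative to completions while the express line is open. The sketch records, for each queue length, which express-line statuses were observed at that length.

```python
import random

def express_status_by_queue_length(steps=20000, seed=1):
    """Map each queue length to the set of express-line statuses seen at that length."""
    rng = random.Random(seed)
    waiting, express_open = 0, False
    seen = {}
    for _ in range(steps):
        # Assumed dynamics: the queue drains faster while the express line is open.
        p_arrival = 0.3 if express_open else 0.6
        if rng.random() < p_arrival:
            waiting += 1
        elif waiting > 0:
            waiting -= 1
        # Rule from the text: open at 5 or more waiting, close when nobody waits.
        if waiting >= 5:
            express_open = True
        elif waiting == 0:
            express_open = False
        seen.setdefault(waiting, set()).add(express_open)
    return seen

seen = express_status_by_queue_length()
```

In this sketch the state "3 people waiting" occurs with the express line both open and closed, so the number waiting alone does not determine the transition probabilities. Adding the express-line status to the state is one way to restore the Markov property.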
For a Markov chain to be an appropriate model, the process must be memoryless. This is the case only if the arrivals follow the Poisson distribution and the checkout times follow the Exponential distribution.
Retail Customer Retention Approaches
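The retailer is testing two different customer retention approaches, Option A and Option B. The data shows that Option A has a lower customer loss rate (9.7%) than Option B (10.4%), with 95% confidence intervals of 7.9%-11.5% and 8.5%-12.3%, respectively. Option A therefore has the better point estimate, but the two intervals overlap substantially (each estimate lies inside the other option's interval), so the data does not clearly establish that Option A is the better approach.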
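A quick check of the reported intervals helps when reading this comparison. The sketch below uses only the numbers quoted above; the helper name is an assumption. It tests whether the two 95% intervals overlap and whether each point estimate falls inside the other option's interval.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (lo, hi) and b = (lo, hi) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

option_a_ci = (7.9, 11.5)    # 95% CI for Option A loss rate, in percent
option_b_ci = (8.5, 12.3)    # 95% CI for Option B loss rate, in percent
overlap = intervals_overlap(option_a_ci, option_b_ci)
a_inside_b = option_b_ci[0] <= 9.7 <= option_b_ci[1]    # A's estimate within B's interval
b_inside_a = option_a_ci[0] <= 10.4 <= option_a_ci[1]   # B's estimate within A's interval
```

All three checks come out true, which is why the observed gap between the two loss rates should be read cautiously.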
Later, the retailer developed 7 new options and used a multi-armed bandit approach, where each option is chosen with probability proportional to its likelihood of being the best. The data shows the customer loss rate, mean customer order value, and median customer order value for each option.
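Choosing each option with probability equal to its chance of being the best is what Thompson sampling does, so it serves here as a minimal sketch of the idea. The exam does not say which bandit algorithm the retailer used. The Beta(1, 1) priors, the retention rates, and the function names below are assumptions for illustration.

```python
import random

def thompson_choose(retained, lost, rng):
    """Sample a retention rate per option from its Beta posterior; pick the largest."""
    draws = [rng.betavariate(r + 1, l + 1) for r, l in zip(retained, lost)]
    return max(range(len(draws)), key=draws.__getitem__)

def run_bandit(true_retention, rounds=5000, seed=0):
    """Simulate the bandit and return how often each option was chosen."""
    rng = random.Random(seed)
    k = len(true_retention)
    retained, lost = [0] * k, [0] * k
    for _ in range(rounds):
        i = thompson_choose(retained, lost, rng)
        if rng.random() < true_retention[i]:   # customer retained
            retained[i] += 1
        else:                                  # customer lost
            lost[i] += 1
    return [r + l for r, l in zip(retained, lost)]

# Seven hypothetical options; the last one has the best retention rate.
pulls = run_bandit([0.90, 0.88, 0.85, 0.89, 0.86, 0.80, 0.97])
```

As evidence accumulates, most customers are assigned to the option that appears best, while weaker options still receive occasional trials.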
Lasso regression usually selects the fewest variables among the four models.
Reasons for Limiting the Number of Factors in a Model
The main reasons for using techniques like stepwise regression, lasso, etc. to limit the number of factors in a model are:
To find a simpler model.
To avoid overfitting when there isn't enough data to support a model with many factors.
Finding a more complex model is not a valid reason for using these techniques.
Modeling Process Steps
The steps in the modeling process, in order, are:
1. Impute missing data values and scale data.
2. Remove outliers.
3. Fit a lasso regression model on all variables.
4. Fit linear regression, regression tree, and random forest models using variables chosen by the lasso regression model.
5. Pick the model to use based on performance on a different data set.
6. Test the model on another different set of data to estimate quality.
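The last steps rely on three separate data sets: one to fit the models, one to pick among them, and one to estimate the quality of the chosen model. A minimal sketch of such a split is shown below. The 60/20/20 proportions, the function name, and the seed are assumptions, not something the exam specifies.

```python
import random

def three_way_split(rows, frac_train=0.6, frac_val=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * frac_train)
    n_val = int(len(rows) * frac_val)
    train = rows[:n_train]                   # fit lasso and the candidate models
    val = rows[n_train:n_train + n_val]      # pick the model to use
    test = rows[n_train + n_val:]            # estimate quality of the chosen model
    return train, val, test

train, val, test = three_way_split(range(100))
```

Keeping the test set untouched until the end is what makes its quality estimate trustworthy.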
Appropriate Models for Different Situations
Estimate the number of workers required at a call center: Queuing model
Find the best airline flight schedule given uncertain delays: Stochastic optimization
Find sets of terrorists with a lot of communication: Louvain algorithm
Determine the number of tables for a restaurant: Queuing model
Compare median temperatures between July and August: Non-parametric test