ISYE 6501 FINAL EXAM 2025/2026 ACTUAL COMPLETE REAL EXAM QUESTIONS AND CORRECT ANSWERS (VERIFIED ANSWERS) BRAND NEW !!ALREADY GRADED A+.

Available from 04/23/2026

brian-mugo-2

A common rule of thumb is to stop branching if a leaf would contain less than 5% of the data points. Why not keep branching and allow models to find very close fits to each very small subset of data? - ANSWER-Fitting to very small subsets of data will cause overfitting. With too few data points, the models will fit to random patterns as well as real ones

True or False: When using a random forest model, it's easy to interpret how its results are determined. - ANSWER-False. Unlike a model like regression where we can show the result as a simple linear combination of each attribute times its regression coefficient, in a random forest model there are so many different trees used simultaneously that it's difficult to interpret exactly how any factor or factors affect the result.

what is forward selection - ANSWER-we start with no factors. At each step we select the best new factor and, if it is good enough (by R^2, AIC, or p-value), add it to our model and refit the model with the current set of factors. We repeat until no remaining factor is good enough. Then at the end we remove any factors that fail a final threshold

what is backward elimination - ANSWER-we start with all factors and find the worst one, judged against a supplied threshold (for example p = 0.15). If it fails the threshold, we remove it and start the process over. We do that until we have the number of factors that we want, and then we remove the factors that fail a second, stricter threshold (for example p = 0.05) and fit the model with the remaining set of factors

what is stepwise regression - ANSWER-it is a combination of forward selection and backward elimination. We can start with either all factors or no factors, and at each step we remove or add a factor. As we go through the procedure, after adding each new factor and again at the end, we immediately eliminate any factors that no longer appear good enough.

what type of algorithms are stepwise selection? - ANSWER- Greedy algorithms - at each step they take one thing that looks best
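
The greedy idea can be sketched in a few lines of Python. This is only an illustration of "take the one thing that looks best at each step": the `score` callback stands in for whatever quality measure is used (R^2, AIC, and so on), and `toy_score`, `TOY_VALUE`, and `min_gain` are hypothetical names invented for the example, not part of the course material.

```python
from typing import Callable, FrozenSet, Iterable, List


def forward_select(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    min_gain: float = 0.01,
) -> List[str]:
    """Greedy forward selection: keep adding the factor that improves `score` most."""
    remaining = list(candidates)
    chosen: List[str] = []
    best = score(frozenset())
    while remaining:
        # Score every model that adds exactly one more factor.
        gains = {f: score(frozenset(chosen + [f])) - best for f in remaining}
        top = max(gains, key=gains.get)
        if gains[top] < min_gain:
            break  # no remaining factor is good enough, so stop
        chosen.append(top)
        remaining.remove(top)
        best += gains[top]
    return chosen


# Hypothetical stand-in for a model-quality measure: each factor adds a fixed amount.
TOY_VALUE = {"age": 0.40, "income": 0.25, "zip": 0.004}


def toy_score(factors: FrozenSet[str]) -> float:
    return sum(TOY_VALUE[f] for f in factors)
```

With the toy measure, `forward_select(TOY_VALUE, toy_score)` picks "age" first, then "income", and stops because "zip" adds less than `min_gain`. A greedy search like this never revisits earlier choices, which is why it can miss the best overall combination.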

what is LASSO - ANSWER-a variable selection method where the coefficients are chosen to minimize the squared error, subject to the constraint that the sum of their absolute values does not exceed a certain threshold t
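
Written as an optimization problem, the LASSO description above takes roughly the following form. The notation is assumed here for illustration: n data points, m factors, responses y_i, data values x_ij, and coefficients a_0, ..., a_m.

```latex
\min_{a_0,\dots,a_m} \; \sum_{i=1}^{n} \Bigl( y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij} \Bigr)^2
\quad \text{subject to} \quad \sum_{j=1}^{m} |a_j| \le t
```

A smaller t forces more coefficients to exactly zero, which is what makes this a variable selection method. The data are usually scaled first so that the constraint treats all coefficients comparably.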

well on other data because they fit more to random effects than you'd like and appear to have a better fit

What are the pros and cons of LASSO and elastic net - ANSWER-They are slower to compute than stepwise methods but help make models that make better predictions

Which two methods does elastic net look like it combines, and what are its advantages and downsides? - ANSWER-Ridge Regression and LASSO.

Advantages: variable selection benefits of LASSO and predictive benefits of Ridge Regression.

Disadvantages: Arbitrarily rules out some correlated variables like LASSO (we don't know which of the correlated variables should have been left out); Underestimates coefficients of very predictive variables like Ridge Regression
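
One common way to write the combination is to constrain a weighted mix of the LASSO term (absolute values) and the ridge term (squares). The mixing weight \lambda and the other symbols below are notation assumed for this sketch, not taken from the exam text.

```latex
\min_{a_0,\dots,a_m} \; \sum_{i=1}^{n} \Bigl( y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij} \Bigr)^2
\quad \text{subject to} \quad
\lambda \sum_{j=1}^{m} |a_j| + (1-\lambda) \sum_{j=1}^{m} a_j^2 \le t,
\qquad 0 \le \lambda \le 1
```

Setting \lambda = 1 recovers the LASSO constraint and \lambda = 0 recovers the ridge constraint.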

What are some downsides of surveys? - ANSWER-Even if you have what appears to be a representative sample in simple ways, maybe it isn't in more complex ways.

If we're testing to see whether red cars sell for higher prices than blue cars, we need to account for the type and age of the cars in our data set. This is called: - ANSWER-Controlling

what is a blocking factor - ANSWER-a source of variability that is not of primary interest to the experimenter

what is an example of a blocking factor - ANSWER-The type of car (sports car or family car) is a blocking factor because it could account for some of the difference between red cars and blue cars: sports cars are more likely to be red. If we account for that difference, we can reduce the variability in our estimates.

Under what conditions should you run A/B tests - ANSWER-When you can collect data quickly, when the data is representative, and when the amount of data is small compared to the whole population

Do you have to decide the sample size ahead of time for A/B tests? - ANSWER-no, and we can run the hypothesis test anytime we want

What is full factorial design - ANSWER-you test every combination and then use ANOVA to determine importance of each factor

being best and start assigning new tests according to those probabilities. We keep testing multiple alternatives; so, we're still doing exploration. But we make it more likely to pick the best ones so we're also doing exploitation

What are some of the parameters in the multi-armed bandit approach - ANSWER-the number of tests between recalculating probabilities; how to update the probabilities; and how to pick an alternative to test based on probabilities and/or expected values. For updating we can use Bayesian updates or estimate from the observed distribution
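
A minimal sketch of one way to set those parameters, assuming each alternative either succeeds or fails on a test. The probabilities are updated with a Bayesian Beta posterior, recalculated every `batch` tests, and new tests are assigned in proportion to each alternative's estimated probability of being best. The function names, the Beta(1, 1) prior, and the default values are assumptions made for this illustration.

```python
import random
from typing import List, Sequence


def prob_best(wins: Sequence[int], losses: Sequence[int],
              rng: random.Random, draws: int = 200) -> List[float]:
    """Monte Carlo estimate of the probability that each alternative is the best."""
    k = len(wins)
    counts = [0] * k
    for _ in range(draws):
        # One plausible success rate per alternative, drawn from its Beta posterior.
        samples = [rng.betavariate(1 + wins[i], 1 + losses[i]) for i in range(k)]
        counts[samples.index(max(samples))] += 1
    return [c / draws for c in counts]


def run_bandit(true_rates: Sequence[float], total_tests: int,
               batch: int = 50, seed: int = 0) -> List[int]:
    """Return how many tests each alternative received."""
    rng = random.Random(seed)
    k = len(true_rates)
    wins, losses, pulls = [0] * k, [0] * k, [0] * k
    probs = [1.0 / k] * k  # start with no information: all equally likely
    for t in range(total_tests):
        if t > 0 and t % batch == 0:
            probs = prob_best(wins, losses, rng)  # recalculate every `batch` tests
        arm = rng.choices(range(k), weights=probs)[0]
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:  # simulated outcome of the test
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls
```

For example, `run_bandit([0.05, 0.10, 0.30], 2000)` should send most of the tests to the third alternative while still occasionally exploring the other two. In a real experiment the simulated outcome line would be replaced by the observed result of each test.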

What are common reasons that data sets are missing values? - ANSWER-

  • a person accidentally types in the wrong value
  • a person did not want to reveal the true value
  • an automated system did not work correctly to record the value

What are some examples of why there might be bias in missing data - ANSWER-

  • Income: people with higher incomes are less likely to omit this answer
  • Radar gun: a car that passes the radar gun very slowly might be treated as an anomaly and its speed might not be recorded in the system
  • Heart transplants: If there's a variable "date of death" it will be missing for patients still living and thus the missing data will naturally include more successful transplant cases

What are three ways of dealing with missing data - ANSWER-discard the data, use categorical variables to indicate missing data, or estimate the missing values (imputation). Only the first two avoid imputation

What are the pros and cons of throwing away missing data - ANSWER-Pros: not potentially introducing errors; easy to implement

Cons: don't want to lose too many data points; potential for censored or biased missing data

What is the categorical variable approach - ANSWER-If the data is categorical, we just add another category, "missing". With quantitative variables, you include interaction variables between the categorical (missing-indicator) variable and the other variables.

Why wouldn't you want to fill in missing quantitative variables with 0 - ANSWER-It can lead to problems if some types of data points are more likely than others to have missing data. The coefficients of the other variables might be pulled in one direction or another to try to account for the missing data

with: it's less accurate on average but has more accurate variability

When should you not use imputation? - ANSWER-When more than 5% of the data is missing per factor

what is the binomial distribution - ANSWER-the probability of getting x successes out of n independent identically distributed Bernoulli (p) trials; count of successful coin flips in n trials

What happens when n is big for binomial distribution - ANSWER-it converges to normal distribution
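
The convergence can be checked numerically. The sketch below compares the binomial probabilities with a normal density that has the same mean n*p and standard deviation sqrt(n*p*(1-p)); the choice n = 1000, p = 0.5 is an arbitrary example.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent Bernoulli(p) trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of the normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


n, p = 1000, 0.5
mean, sd = n * p, math.sqrt(n * p * (1 - p))

# Largest pointwise difference between the two curves over all possible counts.
largest_gap = max(
    abs(binomial_pmf(x, n, p) - normal_pdf(x, mean, sd)) for x in range(n + 1)
)
```

For large n the value of `largest_gap` is tiny compared with the peak probability, which is what "converges to the normal distribution" means in practice. The approximation is weaker when n is small or p is close to 0 or 1.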

what is a Bernoulli distribution - ANSWER-it's like flipping a coin. It can be used to model a single event and is most useful when we put many of them together

what are some examples of a geometric distribution - ANSWER- How many interviews until first job offer; how many hits until a baseball bat breaks

what is a geometric distribution? - ANSWER-How many Bernoulli trials until ...; it is the probability of having x Bernoulli(p) failures until the first success, or of having x Bernoulli(p) successes until the first failure

In a geometric distribution, which value is raised to a power? - ANSWER-The probability of the thing you're counting (the X in "how many X until something")

What assumptions does a geometric distribution make? - ANSWER-Each Bernoulli trial is independent and identically distributed
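
As a small worked example, here is the geometric probability when counting failures before the first success. The failure probability (1 - p) is the value raised to the power x, because failures are what is being counted. The function names are chosen for this sketch only.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(exactly x failures before the first success), with success probability p."""
    return (1 - p) ** x * p


def expected_failures(p: float, terms: int = 5000) -> float:
    """Approximate the mean number of failures by summing x * P(X = x)."""
    return sum(x * geometric_pmf(x, p) for x in range(terms))
```

For instance, if each interview leads to a job offer with probability 0.25, `geometric_pmf(3, 0.25)` is the chance of exactly three rejections before the first offer, and `expected_failures(0.25)` is close to (1 - p) / p = 3.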

what is the Poisson distribution good at modeling - ANSWER- random arrivals

what does the Poisson distribution assume - ANSWER-arrivals are independent and identically distributed

If arrivals are Poisson, what type of distribution is the interarrival time? - ANSWER-exponential

If the interarrival time is exponential, what type of distribution is the number of arrivals? - ANSWER-Poisson
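
The link between the two can be seen in a short simulation: generate exponential interarrival times and count how many arrivals land in each unit of time. The counts should then look Poisson, whose mean and variance both equal the arrival rate. The function name and the default seed are assumptions for this sketch.

```python
import random
from typing import List


def arrivals_per_interval(rate: float, intervals: int, seed: int = 0) -> List[int]:
    """Count arrivals in each unit-length interval, given exponential interarrival times."""
    rng = random.Random(seed)
    counts = [0] * intervals
    t = rng.expovariate(rate)  # time of the first arrival
    while t < intervals:
        counts[int(t)] += 1          # the arrival falls in interval [int(t), int(t) + 1)
        t += rng.expovariate(rate)   # exponential gap until the next arrival
    return counts
```

With `rate=4.0` and a few thousand intervals, both the sample mean and the sample variance of the counts should come out near 4, which is the signature of a Poisson distribution.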

what is the difference between Weibull and geometric distribution - ANSWER-Weibull: time between failures; geometric: number of tries between failures

If the data fits exponential distribution is it memoryless? - ANSWER-Yes

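If data is memoryless, is it exponential? - ANSWER-yes, for continuous data

Which distributions are memoryless - ANSWER-exponential (continuous) and geometric (discrete); Poisson arrivals are memoryless in the sense that their interarrival times are exponential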

Can a distribution not be memoryless and still be exponential - ANSWER-no
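
Memorylessness means that having already waited s time units does not change the distribution of the remaining wait: P(X > s + t | X > s) = P(X > t). The check below uses the exponential survival function exp(-rate * t), with a Weibull (shape 2) for contrast. The parameter values are arbitrary examples.

```python
import math


def exp_survival(t: float, rate: float) -> float:
    """P(X > t) for an exponential distribution."""
    return math.exp(-rate * t)


def weibull_survival(t: float, scale: float, shape: float) -> float:
    """P(X > t) for a Weibull distribution (shape = 1 reduces to exponential)."""
    return math.exp(-((t / scale) ** shape))


def conditional_survival(survival, s: float, t: float) -> float:
    """P(X > s + t | X > s) for any survival function of one argument."""
    return survival(s + t) / survival(s)


expo = lambda t: exp_survival(t, rate=0.3)
weib = lambda t: weibull_survival(t, scale=1.0, shape=2.0)

exp_is_memoryless = math.isclose(conditional_survival(expo, 5.0, 2.0), expo(2.0))
weibull_is_memoryless = math.isclose(conditional_survival(weib, 5.0, 2.0), weib(2.0))
```

Here `exp_is_memoryless` is `True` and `weibull_is_memoryless` is `False`: with shape 2 the failure rate grows over time, so how long the item has already survived matters.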

what are deterministic simulations - ANSWER-same inputs give the same outputs

what are stochastic simulations? - ANSWER-when there is randomness

what are continuous-time simulations? - ANSWER-When changes happen continuously. Example: chemical processes, propagations

What are discrete-event simulations - ANSWER-changes happen at discrete time points. Example: in a call center simulation, the state changes when someone calls or when a worker finishes talking to someone.
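
A minimal sketch of the call center example, assuming a single worker, exponential times between calls, and exponential call lengths. Nothing happens between events: the simulation jumps from one arrival or finish to the next. All names and parameters here are invented for the illustration.

```python
import heapq
import random
from collections import deque


def simulate_call_center(arrival_rate: float, service_rate: float,
                         n_calls: int, seed: int = 0) -> float:
    """Single-worker call center; returns the average time callers wait on hold."""
    rng = random.Random(seed)
    events = []  # priority queue of (time, kind), processed in time order
    t = 0.0
    for _ in range(n_calls):
        t += rng.expovariate(arrival_rate)
        heapq.heappush(events, (t, "arrival"))

    waiting = deque()  # arrival times of callers on hold, first come first served
    busy = False
    total_wait = 0.0
    while events:
        now, kind = heapq.heappop(events)
        if kind == "arrival":
            waiting.append(now)
        else:  # "finish": the worker has just ended a call
            busy = False
        if not busy and waiting:
            arrived = waiting.popleft()
            total_wait += now - arrived
            busy = True
            # Schedule the event at which this call ends.
            heapq.heappush(events, (now + rng.expovariate(service_rate), "finish"))
    return total_wait / n_calls
```

Because the model is stochastic, a single run is only one replication. Running it with several seeds and comparing the averages against real data gives a better picture than any one run.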

what are the elements of a simulation model? - ANSWER-entities, modules, actions, resources, decision points, and statistical tracking

what are entities - ANSWER-things that move through the simulation (bags, people, etc)

what are modules - ANSWER-parts of process (queues, storage, etc)

what are replications - ANSWER-the number of runs of a simulation

Why is it important to validate a simulation by comparing to real data as much as possible? - ANSWER-If the simulation isn't a good reflection of reality, then any insights we gain from studying the simulation might not be applicable in reality

what do prescriptive simulations answer - ANSWER-what-if questions

what's an example of heuristic optimization - ANSWER-what's the best buffer size to have at each step in the process

What is the difference between statistical software and optimization software? - ANSWER-Statistical software can both build and solve regression models. Optimization software only solves models; human experts are required to build optimization models.

What are the three main components of the optimization models? - ANSWER-variables, constraints, objective function

In a queuing model, given a service time, an arrival rate, and the number of helpers/servers, how do you determine if you have enough people - ANSWER-if service_time * arrival_rate > servers, wait time is high; else wait time is low
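
The rule of thumb compares the offered load (arrival rate multiplied by the average service time, both in the same time units) with the number of servers. A small sketch, with function names invented for the example:

```python
def offered_load(arrival_rate: float, service_time: float) -> float:
    """Average amount of work arriving per unit time, measured in 'servers' worth'."""
    return arrival_rate * service_time


def has_enough_servers(arrival_rate: float, service_time: float, servers: int) -> bool:
    """True when the offered load does not exceed the number of servers."""
    return offered_load(arrival_rate, service_time) <= servers
```

For example, 10 customers per hour with a half-hour average service time is an offered load of 5, so 4 servers cannot keep up and 6 can. In practice waits grow quickly as the load approaches the number of servers, so some slack is usually wanted rather than an exact match.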

what are variables - ANSWER-decisions to be made

what are constraints - ANSWER-restrictions on variable values

Why do we need constraints? - ANSWER-Optimization solvers only look at the math. They don't look at the words of what we want the variables to mean. So if we don't use constraints to explicitly tell the solver how variables should be related, the solver will go happily ahead and find a mathematical solution telling our candidate to visit each state i 12 times, but that each v_id (the indicator for visiting state i on day d) should be 0.

what is an objective function? - ANSWER-The objective function is a measure of the quality of a set of values for the variables, which we're trying to maximize or minimize

what is the solution - ANSWER-values for each variable

what is a feasible solution - ANSWER-variable values that satisfy all constraints

what is an optimal solution - ANSWER-feasible solution with the best objective value
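
The three terms can be seen in a tiny made-up problem: maximize 4x + 3y subject to x + 2y <= 10 and 3x + y <= 15, with x and y integers from 0 to 10. The numbers are invented for this illustration, and brute-force enumeration is used only because the problem is so small; real solvers use far more efficient methods.

```python
from itertools import product
from typing import Optional, Tuple


def solve_by_enumeration() -> Optional[Tuple[int, int, int]]:
    """Return (objective value, x, y) of the best feasible solution, or None."""
    best = None
    for x, y in product(range(11), repeat=2):   # every solution: a value for each variable
        feasible = (x + 2 * y <= 10) and (3 * x + y <= 15)
        if not feasible:
            continue                            # violates a constraint, so skip it
        value = 4 * x + 3 * y                   # objective function
        if best is None or value > best[0]:
            best = (value, x, y)                # best feasible solution seen so far
    return best
```

Every (x, y) pair is a solution, the pairs that pass both checks are feasible solutions, and the returned pair is the optimal solution: x = 4, y = 3 with objective value 25.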

What do binary variables do - ANSWER-They allow for more complex models

what is the objective function in linear regression - ANSWER- trying to minimize the squared error

what are the statistical variables and constants in linear regression - ANSWER-variables: the data; constants: the coefficients

what are the variables and constants in optimization model for linear regression - ANSWER-the data is the constant and the coefficients are the variables
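
Seen as an optimization model, fitting a line means searching over the coefficients while the data stay fixed. The sketch below uses the standard closed-form least-squares solution for one predictor; the function names and the example data are made up for illustration.

```python
from typing import Sequence, Tuple


def sum_squared_error(a0: float, a1: float,
                      xs: Sequence[float], ys: Sequence[float]) -> float:
    """Objective function: total squared error of the line y = a0 + a1 * x."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Coefficients (a0, a1) that minimize the sum of squared errors."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    a0 = mean_y - a1 * mean_x
    return a0, a1
```

For the data xs = [0, 1, 2, 3], ys = [1, 3, 5, 7], `fit_line` recovers the line y = 1 + 2x with zero error. On noisy data, nudging either coefficient away from the fitted value can only increase `sum_squared_error`, which is what it means for the fit to be optimal.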

what are the variables in k-means clustering - ANSWER-the coordinates of the cluster centers and, for each point, whether it is part of a certain cluster

What are the constraints in k-means clustering - ANSWER-each data point is assigned to a cluster

What is the objective function in k-means - ANSWER-minimize total distance from data points to their cluster centers
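
Putting the variables, constraints, and objective together gives a formulation along these lines. The symbols are assumed here for illustration: x_i is data point i, z_k is the center of cluster k, and y_ik is 1 if point i is assigned to cluster k and 0 otherwise.

```latex
\min_{z,\,y} \; \sum_{i=1}^{n} \sum_{k=1}^{K} y_{ik} \, \lVert x_i - z_k \rVert
\quad \text{subject to} \quad
\sum_{k=1}^{K} y_{ik} = 1 \;\; \text{for each } i,
\qquad y_{ik} \in \{0, 1\}
```

The constraint forces each point into exactly one cluster. Because the y_ik are binary and multiply the distances, the problem is not convex, which is why the usual k-means heuristic can stop at a local optimum.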

What is the order of optimization problems from fastest to slowest to solve? - ANSWER-linear programs, convex quadratic programs, convex programs, integer programs, general non-convex programs

are convex optimization problems guaranteed to find optimal solution - ANSWER-yes

are non-convex optimization problems guaranteed to find an optimal solution - ANSWER-No; they might converge to an infeasible solution or to a local optimum

what is a general non-convex program - ANSWER-Optimization problem is not convex

what is a linear program? - ANSWER-f(x) is a linear function; constraint set X is defined by linear equations and inequalities

what is a convex quadratic program - ANSWER-f(x) is a convex quadratic function. Minimize f(x) or maximize -f(x). Constraint set X is defined by linear equations and inequalities

what is constraint set X defined by in linear programs - ANSWER-linear equations and inequalities

what is a convex optimization problem - ANSWER-objective f(x) is concave (if maximizing) or convex (if minimizing). Constraint set X is a convex set

what is constraint set X in a convex optimization program - ANSWER-a convex set

what is an integer program - ANSWER-a linear program plus some (or all) variables restricted to take only integer values; variables could be binary (either 0 or 1)

what are the basic steps to solve an optimization problem - ANSWER-1) Initialization: pick values for all the variables (they may be simple, bad, and not satisfy all of the constraints) 2) find an improving direction t and make a change in that