ML4T Midterm 1 Solutions, Summer 2025, Georgia Institute of Technology, Computer Science Exams

Format (all 3): 25 multiple-select questions, 5 sub-answers each (A-E), blank exam + answer key. Scoring: 1 point per correctly marked/unmarked sub-answer

Typology: Exams

2025/2026

Uploaded on 03/04/2026

margaret-collins

ML4T Midterm 1 Solutions
Georgia Institute of Technology
Summer 2025
25 Multiple-Select Questions | 125 Sub-Answers
Each question: 5 sub-answers, check all that apply
CS 7646 Machine Learning for Trading
Scoring: 1 point per correctly marked/unmarked sub-answer
Warning: All 5 checked or all 5 unchecked = 0 for that question

EXAM QUESTIONS

Q1. Under the weak form of EMH, which strategies should fail to consistently beat the market?

A. Moving average crossover strategies that rely only on past price patterns.
B. Momentum trading using only historical price and volume data.
C. Fundamental analysis based on publicly available earnings reports.
D. Strategies using private insider information about upcoming mergers.
E. Any strategy based solely on technical analysis of price charts.

Q2. Stock A has daily volatility of 1.5%. Stock B has daily volatility of 2.5%. Which has higher annualized volatility, and by how much?

A. Annualized vol A = 1.5% * sqrt(252) = ~23.8%.
B. Annualized vol B = 2.5% * sqrt(252) = ~39.7%.
C. Stock B's annualized volatility is roughly 1.67x Stock A's.
D. Annualized volatility = daily volatility * 252 (multiply, not square root).
E. Higher volatility always implies lower expected returns.
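Worked check for Q2: the sketch below applies the square-root-of-time rule to both stocks. It assumes 252 trading days per year and uncorrelated daily returns; the helper name `annualize_vol` is illustrative, not from the exam.

```python
import math

TRADING_DAYS = 252  # assumed number of trading days per year

def annualize_vol(daily_vol, periods=TRADING_DAYS):
    """Scale a per-day volatility to an annual one using sqrt(periods)."""
    return daily_vol * math.sqrt(periods)

vol_a = annualize_vol(0.015)
vol_b = annualize_vol(0.025)
ratio = vol_b / vol_a  # sqrt(252) cancels, leaving 2.5 / 1.5

print(f"A: {vol_a:.1%}  B: {vol_b:.1%}  ratio: {ratio:.2f}")
```

Because both stocks are scaled by the same factor, the ratio of annualized volatilities equals the ratio of daily volatilities.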

Q3. An asset plots above the Security Market Line (SML). What does this mean for an investor?

A. The asset has positive alpha; it outperforms what CAPM predicts for its beta.
B. The asset is potentially undervalued relative to its systematic risk.
C. The SML plots expected return vs. beta (systematic risk only).
D. The SML plots expected return vs. total risk (including unsystematic risk).
E. An asset above the SML has negative alpha and is overvalued.

EXAM QUESTIONS (cont.)

Q7. What makes a short squeeze dangerous for short sellers, and what triggers it?

A. Rising prices force short sellers to buy back shares to cover, which pushes prices even higher.
B. Heavily shorted stocks are most vulnerable because many sellers may need to cover simultaneously.
C. Margin calls can force short sellers to close positions even if they want to hold.
D. Short squeezes only happen to stocks with low short interest.
E. Short sellers can hold indefinitely without any risk of forced liquidation.

Q8. You set a trailing stop at 5% below the current high of $100. The stock rises to $120, then starts falling. At what price does the stop trigger?

A. The trailing stop adjusts upward to 5% below $120 = $114.
B. If the stock drops from $120 to $114, the stop triggers and a market sell order is placed.
C. Trailing stops help lock in gains during uptrends.
D. The stop stays at $95 (5% below the original $100) regardless of how high the price goes.
E. A trailing stop guarantees you sell at exactly $114.
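Worked check for Q8: a minimal sketch of how a trailing stop ratchets upward with the running high. The price path and the function name `trailing_stop_exit` are hypothetical; only the $100 start, the $120 peak, and the 5% trail come from the question.

```python
def trailing_stop_exit(prices, trail_pct):
    """Return (index, stop_level) for the first price at or below the stop, else None.

    trail_pct is given in whole percent (5 means 5%).
    """
    high = prices[0]
    for i, price in enumerate(prices):
        high = max(high, price)                  # the stop only moves up, never down
        stop = high * (100 - trail_pct) / 100
        if price <= stop:
            return i, stop
    return None

# Hypothetical path: starts at 100, peaks at 120, then falls.
path = [100, 110, 120, 117, 114, 110]
result = trailing_stop_exit(path, 5)
print(result)
```

The function only reports where the stop level is hit. In practice a triggered stop becomes a market order, so the fill price can differ from the stop level.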

Q9. Company X does a 1-for-4 reverse split. Before the split, the stock is at $5 with 40M shares outstanding. What changes after the split?

A. The stock price becomes $5 * 4 = $20.
B. Shares outstanding become 40M / 4 = 10M.
C. Historical adjusted prices increase by 4x to maintain consistent returns.
D. The total market cap changes because of the reverse split.
E. Reverse splits increase the total number of shares outstanding.
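Worked check for Q9: the arithmetic of a 1-for-4 reverse split, as a small sketch. The function name `reverse_split` is illustrative, and transaction details such as fractional-share handling are ignored.

```python
def reverse_split(price, shares, ratio):
    """Apply a 1-for-`ratio` reverse split and return (new_price, new_shares)."""
    return price * ratio, shares / ratio

price, shares = 5.0, 40_000_000
new_price, new_shares = reverse_split(price, shares, 4)

cap_before = price * shares
cap_after = new_price * new_shares

print(new_price, new_shares, cap_before == cap_after)
```

Price and share count move in opposite directions by the same factor, so their product (market cap) is unchanged by the split itself.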

EXAM QUESTIONS (cont.)

Q10. What is the difference between np.array slicing (a[1:5]) and fancy indexing (a[[0,2,4]]) in terms of memory?

A. Basic slicing creates a view: modifying the slice modifies the original array.
B. Fancy indexing creates a copy: modifying the result does not affect the original.
C. Understanding views vs. copies prevents subtle bugs in data processing pipelines.
D. np.copy(a) creates a view, not an independent copy.
E. All NumPy operations always create independent copies in memory.
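Worked check for Q10: a short sketch contrasting a basic slice, fancy indexing, and `np.copy`. The array contents are arbitrary; `np.shares_memory` is used only to confirm which results alias the original buffer.

```python
import numpy as np

a = np.arange(10)

s = a[1:5]          # basic slice: a view onto a's buffer
s[0] = 99           # writes through, so a[1] is now 99

f = a[[0, 2, 4]]    # fancy indexing: a new array holding copied data
f[0] = -1           # a[0] is unchanged

c = np.copy(a)      # independent copy
c[9] = 1234         # a[9] is unchanged

print(a[:3], a[9])
print(np.shares_memory(a, s), np.shares_memory(a, f), np.shares_memory(a, c))
```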

Q11. How would you compute daily returns and their 20-day rolling mean in Pandas, and which operations are involved?

A. Daily returns: df['price'].pct_change() or df / df.shift(1) - 1.
B. 20-day rolling mean: df['returns'].rolling(20).mean().
C. DatetimeIndex enables natural time-based slicing like df['2024-01':'2024-06'].
D. df.shift(1) moves data backward in time, not forward.
E. Pandas cannot compute rolling statistics on time series data.
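Worked check for Q11: a sketch of the Pandas operations named in the options, run on a synthetic price series. The series (a steady 1% gain per business day) and all variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data: 60 business days, price grows 1% per day.
dates = pd.date_range("2024-01-01", periods=60, freq="B")
prices = pd.Series(100 * 1.01 ** np.arange(60), index=dates, name="price")

returns = prices.pct_change()                # first value is NaN
returns_alt = prices / prices.shift(1) - 1   # same calculation written out
rolling_mean = returns.rolling(20).mean()    # NaN until 20 returns are available

january = prices.loc["2024-01"]              # partial-string slicing on a DatetimeIndex

# shift(1) places each day's value on the NEXT row, so every row sees the prior day's price.
assert prices.shift(1).iloc[1] == prices.iloc[0]
```

Since the first return is NaN, the first valid 20-day rolling mean appears on the 21st row.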

Q12. If you backward-fill missing prices before forward-filling, and then train a trading model on the result, what's the risk?

A. Look-ahead bias: backward fill uses future prices to fill past gaps.
B. Your model might appear to predict future prices but is actually leaking future information.
C. Forward-fill first is safer because it only uses past information available at each point.
D. The order of fill operations doesn't matter for model training.
E. Backward fill is always safe because it only uses neighboring values.
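Worked check for Q12: a small sketch showing that the order of the two fills changes the data. The price series is invented, with one leading gap and one interior gap.

```python
import numpy as np
import pandas as pd

# Made-up prices: a leading gap (index 0) and an interior gap (indices 2-3).
prices = pd.Series([np.nan, 10.0, np.nan, np.nan, 13.0, 14.0])

bfill_first = prices.bfill().ffill()   # interior gap takes the FUTURE price 13.0
ffill_first = prices.ffill().bfill()   # interior gap takes the PAST price 10.0

print(bfill_first.tolist())
print(ffill_first.tolist())
```

With forward-fill first, only gaps at the very start of the series (which have no past value) are back-filled. With backward-fill first, every gap is filled from a later date, which is where the look-ahead comes from.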

EXAM QUESTIONS (cont.)

Q16. How does leaf_size in a decision tree relate to the bias-variance tradeoff?

A. Smaller leaf_size = more complex tree = lower bias but higher variance.
B. Larger leaf_size = simpler tree = higher bias but lower variance.
C. leaf_size=1 means each leaf can contain a single sample, maximizing overfitting risk.
D. Increasing leaf_size always worsens both training and test performance.
E. Decision trees require feature scaling to properly select split points.

Q17. If you build a Random Forest with 100 trees and p=20 features, approximately how many features does each split consider?

A. sqrt(p) = sqrt(20) ~ 4-5 features per split (common heuristic for classification; p/3 is the usual heuristic for regression).
B. This random feature restriction decorrelates the trees, improving the ensemble.
C. Each tree also trains on a different bootstrap sample of the data.
D. Each split considers all 20 features, not a random subset.
E. Random Forests cannot work with more than 10 features.

Q18. Gradient Boosting fits new trees to residuals. Why is using shallow trees (not deep ones) important?

A. Shallow trees (stumps or depth 2-3) act as weak learners that slowly correct errors.
B. Using deep trees in boosting leads to severe overfitting because each tree tries to fix all remaining errors at once.
C. XGBoost and LightGBM are popular gradient boosting implementations.
D. Boosting always prevents overfitting regardless of tree depth or number of rounds.
E. Boosting models are trained in parallel, not sequentially.

EXAM QUESTIONS (cont.)

Q19. You have a prior belief that a coin is fair (P(heads)=0.5), but you observe 90 heads in 100 flips. Using Bayes' Theorem, how does the posterior behave?

A. The posterior shifts strongly toward P(heads)=0.9 due to overwhelming data evidence.
B. The likelihood P(D|theta) heavily favors theta near 0.9 given 90/100 heads.
C. With less data (e.g., 9 heads in 10 flips), the prior would have more influence on the posterior.
D. The prior completely overrides the data, so the posterior remains at 0.5.
E. P(D|theta) is the posterior probability, not the likelihood.
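Worked check for Q19: a sketch using a conjugate Beta prior, for which the posterior mean has a closed form. The question does not say how strong the "fair coin" belief is, so the Beta(10, 10) prior below (centered on 0.5, worth roughly 20 pseudo-flips) is purely an assumption for illustration.

```python
def posterior_mean(heads, flips, prior_a, prior_b):
    """Mean of the Beta posterior after `heads` successes in `flips` Bernoulli trials."""
    return (prior_a + heads) / (prior_a + prior_b + flips)

# Assumed prior: Beta(10, 10), symmetric around 0.5.
many_flips = posterior_mean(90, 100, 10, 10)   # 100 / 120
few_flips = posterior_mean(9, 10, 10, 10)      # 19 / 30

print(round(many_flips, 3), round(few_flips, 3))
```

Both data sets show 90% heads, but the larger sample pulls the posterior mean much further from 0.5. A stronger or weaker prior would change the exact numbers, not the direction of the effect.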

Q20. R-squared on your test set is -0.3. What does this tell you about your model?

A. The model is worse than simply predicting the mean of the target variable.
B. Negative R-squared means SS_residual > SS_total: the model's errors exceed total variance.
C. This can happen when the model overfits training data and fails on new data.
D. Negative R-squared is mathematically impossible.
E. Negative R-squared means the model is performing well but in the opposite direction.
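Worked check for Q20: a minimal sketch of R-squared computed as 1 - SS_residual / SS_total, applied to toy data. The numbers are invented to make the negative case easy to verify by hand.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_mean = [2.5, 2.5, 2.5, 2.5]    # always predict the mean of y_true
y_bad = [4.0, 3.0, 2.0, 1.0]     # predictions that run the wrong way

print(r_squared(y_true, y_mean), r_squared(y_true, y_bad))
```

Predicting the mean gives exactly 0, so any model whose squared error is larger than that baseline lands below 0.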

Q21. RMSE is 5.2 and MAE is 3.8 on the same dataset. Why is RMSE larger, and what does this imply?

A. RMSE penalizes large errors more due to squaring, so a few big errors inflate it.
B. The gap suggests the presence of some large outlier errors in the predictions.
C. RMSE is in the same units as the target variable, making it interpretable.
D. MAE penalizes large errors more heavily than RMSE.
E. Lower RMSE always means the model will be profitable for live trading.
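Worked check for Q21: a sketch comparing RMSE and MAE on two invented error vectors with the same mean absolute error. The values are chosen for easy hand verification and are not related to the 5.2 and 3.8 in the question.

```python
import math

def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)

even = [3.0, -3.0, 3.0, -3.0]    # every error has the same magnitude
spiky = [1.0, -1.0, 1.0, -9.0]   # same MAE, but one large miss

print(mae(even), rmse(even))
print(mae(spiky), round(rmse(spiky), 3))
```

When all errors are the same size, RMSE equals MAE. Concentrating the same total absolute error into one large miss leaves MAE unchanged and pushes RMSE up.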

EXAM QUESTIONS (cont.)

Q25. Walk-forward analysis uses a fixed sliding window, while expanding-window CV keeps adding data. When would you prefer walk-forward?

A. When the market regime changes over time (non-stationarity) and old data becomes irrelevant.
B. Walk-forward discards old data, focusing the model on the most recent patterns.
C. Expanding window is better when you believe historical patterns remain relevant indefinitely.
D. Walk-forward uses future data in the training window.
E. Random shuffling is better than both methods for financial time series.
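Worked illustration for Q25: a sketch that generates index ranges for both schemes, following the question's definitions (fixed sliding window vs. growing window). The function name `time_splits` and the window sizes are hypothetical.

```python
def time_splits(n, train_size, test_size, expanding=False):
    """Yield (train_range, test_range) pairs; the test block always follows the train block."""
    end = train_size
    while end + test_size <= n:
        train_start = 0 if expanding else end - train_size
        yield range(train_start, end), range(end, end + test_size)
        end += test_size

walk = [(list(tr), list(te)) for tr, te in time_splits(10, 4, 2)]
grow = [(list(tr), list(te)) for tr, te in time_splits(10, 4, 2, expanding=True)]

for tr, te in walk:
    print("walk-forward train", tr, "test", te)
for tr, te in grow:
    print("expanding    train", tr, "test", te)
```

In both schemes every training index precedes every test index, so neither uses future data. The only difference is whether the oldest observations are dropped as the window advances.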

ANSWER KEY

Answers follow on the next pages

ANSWER KEY (cont.)

Q4. Why does adding negatively correlated assets to a portfolio provide the greatest diversification benefit?

A. Negative correlation means when one asset falls, the other tends to rise, reducing overall variance.
B. Portfolio variance formula includes covariance terms: sigma_p^2 = sum_i sum_j (w_i * w_j * sigma_ij).
C. Diversification can eliminate unsystematic risk but not systematic risk.
D. Two perfectly correlated assets provide the same diversification benefit as negatively correlated ones.
E. Portfolio variance depends only on individual asset variances, not covariances.
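Worked check for Q4: the two-asset case of the portfolio variance formula, evaluated at three correlations. The 50/50 weights and 20% volatilities are invented inputs, and `two_asset_vol` is an illustrative name.

```python
import math

def two_asset_vol(w1, s1, s2, rho):
    """Portfolio volatility for two assets with weights w1 and 1 - w1 and correlation rho."""
    w2 = 1 - w1
    var = w1**2 * s1**2 + w2**2 * s2**2 + 2 * w1 * w2 * rho * s1 * s2
    return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding error

vols = {rho: two_asset_vol(0.5, 0.20, 0.20, rho) for rho in (1.0, 0.0, -1.0)}

for rho, vol in vols.items():
    print(f"rho = {rho:+.1f}  portfolio vol = {vol:.4f}")
```

At rho = +1 the portfolio is exactly as volatile as each asset (no diversification benefit); as rho falls the covariance term shrinks and then turns negative, lowering portfolio volatility.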

Q5. A hedge fund had $200M AUM at the start of the year, earned 20%, but the high-water mark from a prior peak is $220M. How much performance fee is collected?

A. Year-end AUM = $200M * 1.20 = $240M.
B. Performance fee applies only to gains above the high-water mark: $240M - $220M = $20M subject to the fee.
C. Performance fee = 20% * $20M = $4M (not 20% of total $40M profit).
D. The fund collects 20% * $40M = $8M because the high-water mark is irrelevant.
E. The high-water mark resets to zero at the beginning of each year.
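Worked check for Q5: a sketch of the high-water-mark calculation. It assumes the 20% performance fee rate used in the options, ignores any management fee, and works in millions of dollars; `performance_fee` is an illustrative name.

```python
def performance_fee(start_aum, gross_return, high_water_mark, fee_rate=0.20):
    """Return (end_aum, fee), charging the fee only on value above the high-water mark."""
    end_aum = start_aum * (1 + gross_return)
    gain_above_hwm = max(end_aum - high_water_mark, 0.0)
    return end_aum, fee_rate * gain_above_hwm

end_aum, fee = performance_fee(200.0, 0.20, 220.0)   # amounts in $ millions
print(round(end_aum, 2), round(fee, 2))
```

If the fund had finished the year below $220M, `gain_above_hwm` would be zero and no performance fee would be charged.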

Q6. In an order book, the best bid is $49.90 and the best ask is $50.10. A market buy order for 500 shares arrives. What happens?

A. The order immediately executes against the best ask at $50.10 (or possibly worse if size exceeds that level).
B. Market orders guarantee execution but not the exact price.
C. A limit buy at $50.00 would NOT execute because the ask is $50.10.
D. The market buy order matches against the bid side at $49.90.
E. Market orders are held in the book until a matching limit order arrives.
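Worked illustration for Q6: a toy sketch of a market buy order walking up the ask side of the book. Only the $50.10 best ask and the 500-share order come from the question; the sizes at each level and the second price level are hypothetical, as is the function name `fill_market_buy`.

```python
def fill_market_buy(asks, quantity):
    """Fill a market buy against asks sorted lowest price first.

    Returns (fills, unfilled) where fills is a list of (price, shares).
    """
    fills = []
    remaining = quantity
    for price, size in asks:
        if remaining == 0:
            break
        take = min(size, remaining)
        fills.append((price, take))
        remaining -= take
    return fills, remaining

# Hypothetical depth: 300 shares at the best ask, 400 more one level up.
asks = [(50.10, 300), (50.15, 400)]
fills, unfilled = fill_market_buy(asks, 500)
avg_price = sum(p * q for p, q in fills) / sum(q for _, q in fills)

print(fills, unfilled, round(avg_price, 4))
```

With this assumed depth the order is fully executed, but part of it fills above $50.10, which is why a market order does not guarantee the price.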

ANSWER KEY (cont.)

Q13. The EMA smoothing factor is alpha = 2/(N+1). If N=9, what is alpha, and how does EMA differ from SMA?

A. alpha = 2/(9+1) = 0.2, meaning the most recent price gets 20% weight.
B. EMA reacts faster to price changes than SMA because it overweights recent data.
C. Larger N makes alpha smaller, producing a smoother, more lagged EMA.
D. SMA and EMA always produce identical values for the same window.
E. EMA with large N is more responsive to sudden price changes than small N.
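Worked check for Q13: a sketch of the recursive EMA with alpha = 2/(N+1) next to a simple moving average. Seeding the EMA with the first price is one common convention and is an assumption here; the price series is invented.

```python
def ema(prices, n):
    """Exponential moving average with alpha = 2 / (n + 1), seeded with the first price."""
    alpha = 2 / (n + 1)
    value = prices[0]
    out = [value]
    for p in prices[1:]:
        value = alpha * p + (1 - alpha) * value
        out.append(value)
    return out

def sma(prices, n):
    """Simple moving average over each full window of n prices."""
    return [sum(prices[i - n + 1:i + 1]) / n for i in range(n - 1, len(prices))]

prices = [100.0] * 9 + [110.0]   # flat, then a jump on the last day

ema_9 = ema(prices, 9)[-1]       # alpha = 0.2
sma_9 = sma(prices, 9)[-1]
ema_3 = ema(prices, 3)[-1]       # alpha = 0.5, reacts more to the jump

print(round(ema_9, 2), round(sma_9, 2), round(ema_3, 2))
```

After the jump the 9-day EMA has moved further than the 9-day SMA, and the 3-day EMA has moved further still, consistent with a smaller N giving a larger alpha.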

Q14. In OLS regression, what does heteroscedasticity mean, and why does it violate assumptions?

A. Heteroscedasticity means the variance of residuals is not constant across predictions.
B. OLS assumes errors are homoscedastic (constant variance) and independently distributed.
C. Violating this assumption can make standard errors unreliable.
D. OLS has no assumptions about error distributions whatsoever.
E. Adding highly multicollinear features improves OLS estimates.

Q15. KNN with K=N (where N is the total number of training points) predicts what for every query?

A. The global mean of all training target values, regardless of the query point.
B. This has the highest possible bias and lowest possible variance for KNN.
C. It is the opposite extreme of K=1, which memorizes every individual point.
D. K=N produces the most flexible, adaptive predictions.
E. K=N gives different predictions for each query based on local patterns.
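Worked check for Q15: a tiny one-dimensional KNN regressor, assuming unweighted averaging of the K nearest targets. The training data and the function name `knn_predict` are made up for illustration.

```python
def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query` (1-D, unweighted)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k

train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [10.0, 20.0, 30.0, 40.0]
n = len(train_x)

for q in (1.2, 3.9):
    print(q, knn_predict(train_x, train_y, q, 1), knn_predict(train_x, train_y, q, n))
```

With K=1 each query returns the target of its single nearest neighbor, while with K=N every query returns the same value, the mean of all training targets.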

ANSWER KEY (cont.)

Q22. A model's training R-squared is 0.99 but its test R-squared is 0.15. Name three techniques that could reduce this gap.

A. Regularization (L1/L2) to penalize overly complex models.
B. Increasing training data size to give the model more examples.
C. Reducing model complexity (e.g., fewer features, larger leaf_size, lower tree depth).
D. Adding more features to the model to increase its training R-squared even further.
E. Removing the test set and evaluating only on training data.

Q23. Why is standard K-fold cross-validation with random shuffling inappropriate for financial time series?

A. Random shuffling breaks temporal order, leaking future data into training folds.
B. Financial data is non-stationary, so training and test periods must be chronological.
C. Time-series CV ensures the test fold always comes after the training period.
D. Random shuffling is the recommended approach for all types of financial data.
E. Temporal order doesn't matter because stock prices are independent across time.

Q24. Financial data has ~252 trading days per year. Why does this limited data size make ML particularly challenging?

A. Small datasets increase the risk of overfitting, especially with complex models.
B. Low signal-to-noise ratio means most patterns are noise, not true signal.
C. Simple models often outperform complex ones in finance for exactly this reason.
D. 252 days per year provides more than enough data for any ML model to learn reliably.
E. Non-stationarity means statistical properties stay constant, making prediction easy.
