NAME____________________________
ISYE 6501, Introduction to Analytics Modeling
Midterm #2 – Friday, November 8, 2019
50 minute time limit
INSTRUCTIONS
• Work alone. Do not collaborate with or copy from anyone else.
• Write all of your answers on the answer sheet.
• You may use any of the following resources:
o One sheet (both sides) of handwritten (not photocopied or scanned) notes
o Scratch paper
• If any question seems ambiguous, use the most reasonable interpretation (i.e., don’t be like Calvin).
• Good luck!
APPROXIMATE GRADING SCALE (Note that your course grade will be based on the numeric scores, not
letters – so, for example, someone who gets “B”s on the first two exams by each time scoring one
point below the “A” cutoff, and then scores way above the “A” cutoff on the last exam, would
have an exam average in the “A” range, even with two “B”s and one “A”.)
HIGH SCORE: 105
MEDIAN SCORE: 94
MEAN SCORE: 94
POINTS GRADE # of students
94-105 A 52
89-93 A-/B+ 11
80-88 B 16
76-79 B-/C+ 3
60-75 C x
50-59 D y
0-49 F z
1. (9 points) Match each of the probability distributions with a situation that it is most appropriate
for modeling.
SITUATIONS
a. Number of trucks inspected that fail an emissions test, out of 800 inspected
b. Time between bees returning to a hive
c. Number of trucks inspected before the first one is found that fails an emissions test
DISTRIBUTIONS
i. Weibull
ii. Poisson
iii. Geometric
iv. Exponential
v. Binomial
GRADING: 3 points for each correct answer
SOLUTIONS:
a. v
b. iv
c. iii
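As a rough illustration of the three matches above, the sketch below samples each situation using only Python's standard library. The failure probability (0.1), the arrival rate (2 bees per minute), and all function names are assumptions made for the example; none of them come from the exam.

```python
import random

def failures_out_of_n(n, p_fail, rng):
    """Binomial: number of failures among a fixed number n of independent inspections."""
    return sum(1 for _ in range(n) if rng.random() < p_fail)

def inspections_before_first_failure(p_fail, rng):
    """Geometric: number of passing trucks inspected before the first failure."""
    count = 0
    while rng.random() >= p_fail:
        count += 1
    return count

def time_between_arrivals(rate, rng):
    """Exponential: waiting time between consecutive arrivals at a constant rate."""
    return rng.expovariate(rate)

rng = random.Random(0)  # fixed seed so the run is repeatable
binomial_draw = failures_out_of_n(800, 0.1, rng)           # assumed 10% failure rate
geometric_draw = inspections_before_first_failure(0.1, rng)
exponential_draw = time_between_arrivals(2.0, rng)         # assumed 2 arrivals per minute
print(binomial_draw, geometric_draw, exponential_draw)
```

The binomial draw is bounded by the 800 inspections, while the geometric and exponential draws have no upper bound, which is one quick way to tell the situations apart.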
2. In a diet problem (like we saw in the lessons and homework), let x_i be the amount of food i in
the solution (x_i ≥ 0), and let M be the maximum amount that can be eaten of any food.
Suppose we added new variables y_i that are binary (i.e., they must be either 0 or 1): y_i = 1 if
food i is eaten in the solution, and y_i = 0 otherwise.
Match each English sentence with the mathematical constraint that corresponds to it. Only 3
mathematical constraints will be used; the other 5 will not.
CONSTRAINTS
i. y_{peanut butter} + y_{cheese sauce} = 0
ii. y_{peanut butter} + y_{cheese sauce} = 1
iii. y_{broccoli} ≤ y_{cheese sauce} + y_{peanut butter}
iv. y_{broccoli} + y_{cheese sauce} + y_{peanut butter} ≤ 2
v. x_{cheese sauce} ≤ M y_{cheese sauce}
vi. y_{cheese sauce} = 1
vii. x_{broccoli} ≤ M y_{peanut butter}
viii. x_{broccoli} ≥ M y_{peanut butter}
a. (3 points) Either peanut butter or cheese sauce, but not both, must be eaten.
b. (3 points) Neither peanut butter nor cheese sauce may be eaten.
c. (3 points) If broccoli is eaten, then at least one of cheese sauce and peanut butter must
be eaten.
GRADING: 3 points for each correct answer
SOLUTIONS:
a. ii
b. i
c. iii
d. (Extra credit – 5 points) Describe in words, without referring to variables, what the
constraint y_{broccoli} + y_{cheese sauce} ≤ 1 + y_{peanut butter} means.
GRADING: 5 points for a correct answer, partial credit when appropriate
SOLUTION:
There are several ways of interpreting this constraint. I accepted any of the following:
o If broccoli and cheese sauce are both eaten, then peanut butter must be eaten.
o If peanut butter isn’t eaten, then only one of broccoli and cheese sauce may be eaten.
o If peanut butter isn’t eaten, then either broccoli or cheese sauce can’t be eaten.
o If peanut butter isn’t eaten, at most one of broccoli and cheese sauce may be eaten.
o If peanut butter isn’t eaten, broccoli and cheese sauce can’t be eaten together.
o Etc.
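One way to check matches like these is to enumerate all eight combinations of the three binary indicators and confirm that each constraint holds exactly when its English sentence does. The sketch below does that for parts a through d; the short names b, c, and p (broccoli, cheese sauce, peanut butter) and the helper `matches` are my own shorthand for the example.

```python
from itertools import product

def matches(constraint, sentence):
    """True if the constraint and the sentence agree on all 8 binary assignments."""
    return all(
        constraint(b, c, p) == sentence(b, c, p)
        for b, c, p in product((0, 1), repeat=3)
    )

# b, c, p are the binary indicators for broccoli, cheese sauce, and peanut butter.
checks = {
    # a. Either peanut butter or cheese sauce, but not both, must be eaten.
    "a (ii)": matches(lambda b, c, p: p + c == 1,
                      lambda b, c, p: (p == 1) != (c == 1)),
    # b. Neither peanut butter nor cheese sauce may be eaten.
    "b (i)": matches(lambda b, c, p: p + c == 0,
                     lambda b, c, p: p == 0 and c == 0),
    # c. If broccoli is eaten, at least one of cheese sauce and peanut butter must be eaten.
    "c (iii)": matches(lambda b, c, p: b <= c + p,
                       lambda b, c, p: b == 0 or c == 1 or p == 1),
    # d. If broccoli and cheese sauce are both eaten, peanut butter must be eaten.
    "d (extra credit)": matches(lambda b, c, p: b + c <= 1 + p,
                                lambda b, c, p: not (b == 1 and c == 1) or p == 1),
}
for label, ok in checks.items():
    print(label, ok)
```

Because there are only eight assignments, exhaustive enumeration is a complete check here, not just a spot test.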
3. (6 points) Five classification models were built for predicting whether a neighborhood will soon
see a large rise in home prices, based on public elementary school ratings and other factors.
The training data set was missing the school rating variable for every new school (3% of the data
points).
Because ratings are unavailable for newly-opened schools, it is believed that locations that have
recently experienced high population growth are more likely to have missing school rating data.
o Model 1 used imputation, filling in the missing data with the average school rating from
the rest of the data.
o Model 2 used imputation, building a regression model to fill in the missing school rating
data based on other variables.
o Model 3 used imputation, first building a classification model to estimate (based on
other variables) whether a new school is likely to have been built as a result of recent
population growth (or whether it has been built for another purpose, e.g. to replace a
very old school), and then using that classification to select one of two regression
models to fill in an estimate of the school rating; there are two different regression
models (based on other variables), one for neighborhoods with new schools built due to
population growth, and one for neighborhoods with new schools built for other reasons.
o Model 4 used a binary variable to identify locations with missing information.
o Model 5 used a categorical variable: first, a classification model was used to estimate
whether a new school is likely to have been built as a result of recent population
growth; and then each neighborhood was categorized as “data available”, “missing,
population growth”, or “missing, other reason”.
If school ratings cannot be reasonably well-predicted from the other factors, but new schools
built due to recent population growth can be reasonably well-classified using the other factors,
which model would you recommend?
i. Model 1
ii. Model 2
iii. Model 3
iv. Model 4
v. Model 5
GRADING: 6 points for the correct answer
SOLUTION:
v is the correct answer. Since the problem states that school ratings cannot be predicted well
from the other factors, we can't use Models 2 or 3, which rely on regression. However, because
the problem states that new schools built due to population growth can be classified well from
the other factors, we can take advantage of that by using Model 5 instead of Models 1 or 4,
which don't use the extra information.
4. A consulting company has created a stochastic discrete-event simulation model of a large city’s
9-1-1 emergency dispatch operations, including incoming calls, dispatching of the appropriate
response (police, firefighters, paramedics, etc.), and the amount of time until the emergency
assistance arrives at the scene.
Emergency response is not first-come-first-served. When resources are limited, a more-important problem (like gunfire or a burning building) will be prioritized over a less-important
problem (like a cat stuck in a tree), and a more time-sensitive event (like a robbery in progress)
might be prioritized over a less-time-sensitive event (like the discovery of a robbery that took
place the day before).
When a new call for help comes in, the system automatically runs a simulation to quickly give an
estimate of the expected wait time until help arrives, which the operator relays to the caller.
Wait time is a combination of the time until the appropriate resource (e.g., a police unit) has no
higher-priority emergency to respond to, and the time it takes that resource to drive to the
scene.
a. (8 points) How many times does the company need to run the simulation for each new
help request (i.e., how many replications are needed)?
i. Once, because each request is unique.
ii. Once, because the outcome will be the same each replication.
iii. Many times, because of variability and randomness.
b. (8 points) The consulting company has found that simulated wait times are 25% higher
than actual wait times, on average. What would you recommend that they do?
i. Investigate to see what’s wrong with the simulation, because it’s a poor match
to reality.
ii. Scale all estimates down by a factor of 1.25 to get the average simulation estimates
to match the average wait times.
iii. Use the 25%-higher estimates, because that’s what the simulation output is.
GRADING: 8 points for each correct answer
SOLUTIONS:
a. iii
b. i
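To see why a single replication is not enough in part a, here is a minimal sketch of a toy stochastic wait-time model run many times. The distributions and their parameters (an exponential delay with a 4-minute mean until a unit is free, and a uniform 3-to-10-minute drive) are assumptions for illustration only; they are not the consulting company's model.

```python
import random
import statistics

def simulate_wait_once(rng):
    """One replication: minutes until a unit is free plus minutes of driving (toy model)."""
    time_until_unit_free = rng.expovariate(1 / 4.0)  # assumed mean of 4 minutes
    drive_time = rng.uniform(3.0, 10.0)              # assumed 3 to 10 minutes
    return time_until_unit_free + drive_time

def estimate_expected_wait(replications, seed=0):
    """Average the simulated wait over many replications and report the spread."""
    rng = random.Random(seed)
    waits = [simulate_wait_once(rng) for _ in range(replications)]
    return statistics.mean(waits), statistics.stdev(waits)

mean_wait, spread = estimate_expected_wait(10_000)
print(f"estimated wait: {mean_wait:.1f} min (std dev across replications: {spread:.1f})")
```

Individual replications differ by several minutes in this toy model, so only the average over many runs gives a usable estimate of the expected wait.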
5. (8 points) For each optimization problem, select its most precise classification. In each model, x
are the variables, all other letters refer to known data, and the values of c are all positive.
CLASSIFICATIONS
i. Linear program
ii. Convex quadratic program
iii. Convex program
iv. Integer program
v. General non-convex program
a. minimize ∑_i c_i x_i
subject to ∑_i ∑_j a_{ijk} x_i x_j ≥ b_k for all k
all x_i ≥ 0
b. minimize ∑_i c_i x_i
subject to ∑_i a_{ij} x_i ≤ b_j for all j
all x_i ∈ {0,1}
c. minimize ∑_i c_i x_i^2
subject to ∑_i a_{ij} x_i ≤ b_j for all j
all x_i ≥ 0
d. minimize ∑_i (log c_i) x_i
subject to ∑_i a_{ij} x_i ≤ b_j for all j
all x_i ≥ 0
GRADING: 2 points for each correct answer
SOLUTIONS:
a. v
b. iv
c. ii
d. i (since c_i is a constant, log c_i is also a constant, so it’s still linear in the x-variables)
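For part a, a small numeric check can show how a constraint containing products of variables can produce a non-convex feasible region. The sketch below uses one hypothetical instance of such a constraint, x1^2 + x2^2 ≥ 1; the data values and the function name `feasible` are assumptions chosen for the example.

```python
def feasible(x, a, b):
    """Check the single quadratic constraint sum_i sum_j a[i][j] * x[i] * x[j] >= b."""
    n = len(x)
    total = sum(a[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return total >= b

a = [[1.0, 0.0], [0.0, 1.0]]  # assumed data: the constraint becomes x1^2 + x2^2 >= 1
b = 1.0
p, q = (1.0, 0.0), (0.0, 1.0)
midpoint = tuple((u + v) / 2 for u, v in zip(p, q))

# Both endpoints are feasible but their midpoint is not,
# so this feasible region is not convex.
print(feasible(p, a, b), feasible(q, a, b), feasible(midpoint, a, b))
```

A convex feasible region would contain the midpoint of any two feasible points, so one counterexample like this is enough to rule out convexity for that instance.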
6. (7 points) An online retailer is testing different methods of dealing with customers who want to
return a product, to see which method minimizes the number of returns. The retailer developed
7 options, so they used a multi-armed bandit approach where each option is chosen with
probability proportional to its likelihood of being the best. Their approach was to do trials in
batches of 1000, and after each 1000 trials to remove all options that are very unlikely to be
best, and then continue testing the rest. The results after the first 1000 trials are shown below.
Option                     #1          #2          #3          #4           #5           #6           #7
Return rate                8.0%        10.0%       13.4%       19.2%        22.0%        26.0%        26.3%
95% confidence interval    3.0%-14.0%  6.0%-14.5%  9.0%-18.0%  16.8%-21.8%  15.0%-29.0%  21.0%-31.0%  22.7%-30.0%
Note: Lower return rates are better.
What should the retailer do?
i. Continue the multi-armed bandit approach using all seven options
ii. Continue the multi-armed bandit approach using only Options 1, 2, and 3
iii. Continue the multi-armed bandit approach using only Options 1 and 2
iv. Move to pure exploitation: use only Option 1
GRADING: 7 points for correct answer, partial credit as shown below
SOLUTION: ii (the confidence intervals of Options 2 and 3 overlap with that of the best option,
Option 1, while those of Options 4-7 do not); I gave partial credit of 4 points to answer iii
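The elimination step can be sketched as a simple interval-overlap check against the current best option. The rule coded here (keep an option only if its 95% confidence interval overlaps the best option's interval) is a simplified reading of "very unlikely to be best", and the function name `options_to_keep` is my own; the numbers are the ones from the question.

```python
# 95% confidence intervals for each option's return rate, in percent.
intervals = {
    1: (3.0, 14.0),
    2: (6.0, 14.5),
    3: (9.0, 18.0),
    4: (16.8, 21.8),
    5: (15.0, 29.0),
    6: (21.0, 31.0),
    7: (22.7, 30.0),
}
return_rates = {1: 8.0, 2: 10.0, 3: 13.4, 4: 19.2, 5: 22.0, 6: 26.0, 7: 26.3}

def options_to_keep(return_rates, intervals):
    """Keep every option whose interval overlaps the interval of the current best option."""
    best = min(return_rates, key=return_rates.get)  # lower return rate is better
    best_low, best_high = intervals[best]
    return [
        option
        for option, (low, high) in sorted(intervals.items())
        if low <= best_high and high >= best_low
    ]

print(options_to_keep(return_rates, intervals))  # [1, 2, 3]
```

Option 5 has the widest interval, but its lower bound (15.0%) is still above Option 1's upper bound (14.0%), so it is dropped along with Options 4, 6, and 7.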
7. A supermarket is analyzing its checkout lines, to determine how many checkout lines to have
open at each time.
At busy times (about 10% of the time), the arrival rate is 5 shoppers/minute. At other times, the
arrival rate is 2 shoppers/minute. Once a shopper starts checking out (at any time), it takes an
average of 3 minutes to complete the checkout.
a. (8 points) The first model the supermarket tries is a queuing model with 12 lines open.
What would you expect the queuing model to show?
i. Wait times are high at both busy and non-busy times.
ii. Wait times are low at both busy and non-busy times.
iii. Wait times are low at busy times and high at non-busy times.
iv. Wait times are low at non-busy times and high at busy times.
GRADING: 8 points for correct answer
SOLUTION: iv. At non-busy times, 2 shoppers/minute times 3 minutes to check out = an
average of 6 shoppers who need to be checked out simultaneously, and there are 12 lines
open. At busy times, 5 shoppers/minute times 3 minutes = 15 shoppers who need to be
checked out simultaneously, so 12 lines are not enough to keep the queue short.
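The arithmetic in this solution is the offered load (arrival rate times average service time) compared with the number of open lines. A minimal sketch, with function names of my own choosing:

```python
def offered_load(arrival_rate_per_min, mean_service_min):
    """Average number of shoppers being served at once if nobody had to wait."""
    return arrival_rate_per_min * mean_service_min

def utilization(arrival_rate_per_min, mean_service_min, lines_open):
    """Fraction of checkout capacity in use; 1 or more means the queue keeps growing."""
    return offered_load(arrival_rate_per_min, mean_service_min) / lines_open

for label, rate in (("non-busy", 2), ("busy", 5)):
    rho = utilization(rate, 3, 12)
    status = "stable" if rho < 1 else "overloaded"
    print(f"{label}: load = {offered_load(rate, 3)}, utilization = {rho:.2f} ({status})")
```

With 12 lines, utilization is 0.50 at non-busy times and 1.25 at busy times, which is why waits are low in the first case and high in the second.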
The second model the supermarket tries is a Markov chain, where each state is the number of
people waiting at the end of a 1-minute interval (e.g., 0 people waiting, 1 person waiting, etc.).
In this model, the supermarket adds a more-complex staffing rule: In addition to the default of
12 lines open, whenever at least 5 people are waiting (total across all lines), the supermarket
will open 4 new checkout lines, which remain open until no more than 1 person is waiting.
b. (5 points) Which one of the following statements about the model and the memoryless
property is true?
i. The process is definitely memoryless.
ii. The process is definitely not memoryless.
iii. We can’t tell if the process is memoryless without knowing whether arrivals are
Poisson and whether checkout times are Exponential. If we collect data and
discover that both of these are true, then the process is memoryless.
GRADING: 5 points for correct answer
SOLUTION: This question was a little ambiguous, so there were two answers that could be
correct depending on how you look at it. ii is correct if you consider one model for a full day,
so the transition probabilities will be different depending on what time of day it is (and
whether it’s busy or not). iii could be correct if you assume two different models, one for busy
times and one for non-busy times. There are other interpretations for which ii and iii are
correct, but none for which i would be correct.
8a. (8 points) Rank the following regression and variable-selection/regularization methods from
fewest variables selected to most variables selected. All four methods will be used (the bottom
contains two equivalent spaces).
METHODS
A. Simple linear regression
B. Elastic net
C. Lasso regression
D. Ridge regression
GRADING: 2 points for each correctly-placed method
SOLUTION:
o Fewest = C (Because I accidentally wrote “Simple” linear regression (which sometimes
means just one variable), I also accepted A for fewest variables selected.)
o Middle = B
o Most = A,D (in any order)
8b. (14 points) Put the following seven steps in order, from what is done first to what is done last.
1. Scale data
2. Fit lasso regression model on all variables
3. Pick model to use based on performance on a different data set
4. Test model on another different set of data to estimate model quality
5. Impute missing data values
6. Remove outliers
7. Fit linear regression, regression tree, and random forest models using variables
chosen by lasso regression
GRADING: 2 points for each correctly-placed step. This (intentionally) means that someone
who switches the order of two steps, but has all the rest correct, will earn 10 points.
However, for someone who had one step out of order which pushed many others up or down
by one slot, I capped the penalty at 4 points.
SOLUTION: 6, {1 or 5}, {1 or 5}, 2, 7, 3, 4. Note that I accepted either 1 second and 5 third, or 5
second and 1 third. However, imputing or scaling before removing outliers is not a correct
solution, because the imputation would be based on bad data, and the scaling would be
distorted by the outliers’ extreme values.
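To illustrate why outliers should be removed before scaling, the sketch below applies min-max scaling to a small made-up data set with and without one bad point. The data values, the cutoff of 100.0 used to drop the outlier, and the function name `min_max_scale` are all assumptions for the example.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest becomes 1."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Hypothetical measurements; 1000.0 stands in for a bad data point.
raw = [10.0, 12.0, 11.0, 13.0, 1000.0]

scaled_with_outlier = min_max_scale(raw)
# Assumed cutoff of 100.0, used here only to drop the bad point.
scaled_without_outlier = min_max_scale([v for v in raw if v < 100.0])

# With the outlier present, the four ordinary points are squeezed close to 0.
print([round(v, 3) for v in scaled_with_outlier])
print([round(v, 3) for v in scaled_without_outlier])
```

With the outlier included, the ordinary points all land below 0.01 on the scaled axis, so the differences between them nearly vanish; after removal they spread across the full 0-to-1 range.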
9. (10 points) For each situation, select the most appropriate model/approach for it.
MODELS/APPROACHES
i. Game theoretic analysis
ii. Louvain algorithm
iii. Non-parametric test
iv. Queuing
v. Stochastic optimization
a. What distinct sets of furniture can be identified where there are many components
shared within each set?
b. How much should a bidder in an art auction bid, given that there will be competition
from other potential buyers?
c. How many people should be working in a call center to avoid long hold times when
someone calls with a question?
d. How many flights should an airline schedule between Atlanta and Boston each day for
the next six months, given demand uncertainty?
e. Is the median of the top 100 Kenyan runners’ marathon times different from the median
of the top 100 Swedish runners’ marathon times?
GRADING: 2 points for each correct answer
SOLUTIONS:
a. ii
b. i
c. iv
d. v
e. iii