Georgia Institute of Technology - ISYE 6501 Midterm 2 Solutions

NAME____________________________

ISYE 6501, Introduction to Analytics Modeling

Midterm #2 – Friday, November 8, 2019

50 minute time limit

INSTRUCTIONS

• Work alone. Do not collaborate with or copy from anyone else.

• Write all of your answers on the answer sheet.

• You may use any of the following resources:

o One sheet (both sides) of handwritten (not photocopied or scanned) notes

o Scratch paper

• If any question seems ambiguous, use the most reasonable interpretation (i.e., don’t be like Calvin).

• Good luck!

APPROXIMATE GRADING SCALE (Note that your course grade will be based on the numeric scores, not

letters – so, for example, someone who gets “B”s on the first two exams by each time scoring one

point below the “A” cutoff, and then scores way above the “A” cutoff on the last exam, would

have an exam average in the “A” range, even with two “B”s and one “A”.)

HIGH SCORE: 105

MEDIAN SCORE: 94

MEAN SCORE: 94

POINTS GRADE # of students

94-105 A 52

89-93 A-/B+ 11

80-88 B 16

76-79 B-/C+ 3

60-75 C x

50-59 D y

0-49 F z

1. (9 points) Match each of the probability distributions with a situation that it is most appropriate

for modeling.

SITUATIONS

a. Number of trucks inspected that fail an emissions test, out of 800 inspected

b. Time between bees returning to a hive

c. Number of trucks inspected before the first one is found that fails an emissions test

DISTRIBUTIONS

i. Weibull

ii. Poisson

iii. Geometric

iv. Exponential

v. Binomial

GRADING: 3 points for each correct answer

SOLUTIONS:

a. v

b. iv

c. iii
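As a rough illustration of the three matches, the sketch below evaluates the corresponding probability functions using only the Python standard library. The failure probability `p = 0.05` and the arrival rate `rate = 2.0` bees per minute are made-up values for illustration, not data from the exam.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k failures among n inspected trucks)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def geometric_pmf(k, p):
    """P(k passing trucks are inspected before the first failure)."""
    return (1 - p) ** k * p

def exponential_cdf(t, rate):
    """P(the time between consecutive arrivals is at most t)."""
    return 1 - math.exp(-rate * t)

p = 0.05     # hypothetical chance that a truck fails the test
rate = 2.0   # hypothetical bee arrivals per minute

# (a) Binomial: the mean number of failures out of 800 should be 800 * p.
expected_failures = sum(k * binomial_pmf(k, 800, p) for k in range(801))
print(expected_failures)

# (c) Geometric: chance that the very first truck inspected fails.
print(geometric_pmf(0, p))

# (b) Exponential: chance that the next bee returns within one minute.
print(exponential_cdf(1.0, rate))
```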

2. In a diet problem (like we saw in the lessons and homework), let x_i be the amount of food i in the solution (x_i ≥ 0), and let M be the maximum amount that can be eaten of any food.

Suppose we added new variables y_i that are binary (i.e., they must be either 0 or 1): if food i is eaten in the solution, then it is part of the solution (y_i = 1); otherwise, y_i = 0.

Match each English sentence with the mathematical constraint that corresponds to it. Only 3

mathematical constraints will be used; the other 5 will not.

CONSTRAINTS

i. y_{peanut butter} + y_{cheese sauce} = 0

ii. y_{peanut butter} + y_{cheese sauce} = 1

iii. y_{broccoli} ≤ y_{cheese sauce} + y_{peanut butter}

iv. y_{broccoli} + y_{cheese sauce} + y_{peanut butter} ≤ 2

v. x_{cheese sauce} ≤ M y_{cheese sauce}

vi. y_{cheese sauce} = 1

vii. x_{broccoli} ≤ M y_{peanut butter}

viii. x_{broccoli} ≥ M y_{peanut butter}

a. (3 points) Either peanut butter or cheese sauce, but not both, must be eaten.

b. (3 points) Neither peanut butter nor cheese sauce may be eaten.

c. (3 points) If broccoli is eaten, then at least one of cheese sauce and peanut butter must be eaten.

GRADING: 3 points for each correct answer

SOLUTIONS:

a. ii

b. i

c. iii
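One way to check these matches is to enumerate all eight eaten/not-eaten combinations and compare each sentence with each candidate constraint. The sketch below does this for constraints i through iv, written in terms of 0/1 indicators for broccoli (`b`), cheese sauce (`c`), and peanut butter (`p`); the short names are introduced here for illustration only.

```python
from itertools import product

# English sentences as truth conditions on (b, c, p); 1 means "eaten".
sentences = {
    "a": lambda b, c, p: (c == 1) != (p == 1),        # exactly one of the two
    "b": lambda b, c, p: c == 0 and p == 0,           # neither is eaten
    "c": lambda b, c, p: b == 0 or c == 1 or p == 1,  # broccoli implies one of them
}

# Candidate constraints i-iv on the same 0/1 indicators.
constraints = {
    "i": lambda b, c, p: p + c == 0,
    "ii": lambda b, c, p: p + c == 1,
    "iii": lambda b, c, p: b <= c + p,
    "iv": lambda b, c, p: b + c + p <= 2,
}

assignments = list(product((0, 1), repeat=3))

# A constraint matches a sentence if they agree on every assignment.
matches = {
    s_name: [
        c_name
        for c_name, constraint in constraints.items()
        if all(sentence(*a) == constraint(*a) for a in assignments)
    ]
    for s_name, sentence in sentences.items()
}
print(matches)  # {'a': ['ii'], 'b': ['i'], 'c': ['iii']}
```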

d. (Extra credit – 5 points) Describe in words, without referring to variables, what the constraint y_{broccoli} + y_{cheese sauce} ≤ 1 + y_{peanut butter} means.

GRADING: 5 points for a correct answer, partial credit when appropriate

SOLUTION:

There are several ways of interpreting this constraint. I accepted any of the following:

o If broccoli and cheese sauce are both eaten, then peanut butter must be eaten.

o If peanut butter isn’t eaten, then only one of broccoli and cheese sauce may be eaten.

o If peanut butter isn’t eaten, then either broccoli or cheese sauce can’t be eaten.

o If peanut butter isn’t eaten, at most one of broccoli and cheese sauce may be eaten.

o If peanut butter isn’t eaten, broccoli and cheese sauce can’t be eaten together.

o Etc.
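The accepted interpretations are all equivalent, which can be confirmed by brute force. The sketch below compares the extra-credit constraint with the first interpretation ("if broccoli and cheese sauce are both eaten, then peanut butter must be eaten") over every 0/1 combination; `b`, `c`, and `p` are shorthand indicators introduced here for broccoli, cheese sauce, and peanut butter.

```python
from itertools import product

violations = []
for b, c, p in product((0, 1), repeat=3):
    constraint_holds = b + c <= 1 + p
    implication_holds = (not (b == 1 and c == 1)) or p == 1
    # The constraint and the English sentence must agree on every combination.
    assert constraint_holds == implication_holds
    if not constraint_holds:
        violations.append((b, c, p))

# The only forbidden combination: broccoli and cheese sauce without peanut butter.
print(violations)  # [(1, 1, 0)]
```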

3. (6 points) Five classification models were built for predicting whether a neighborhood will soon

see a large rise in home prices, based on public elementary school ratings and other factors.

The training data set was missing the school rating variable for every new school (3% of the data

points).

Because ratings are unavailable for newly-opened schools, it is believed that locations that have

recently experienced high population growth are more likely to have missing school rating data.

o Model 1 used imputation, filling in the missing data with the average school rating from

the rest of the data.

o Model 2 used imputation, building a regression model to fill in the missing school rating

data based on other variables.

o Model 3 used imputation, first building a classification model to estimate (based on

other variables) whether a new school is likely to have been built as a result of recent

population growth (or whether it has been built for another purpose, e.g. to replace a

very old school), and then using that classification to select one of two regression

models to fill in an estimate of the school rating; there are two different regression

models (based on other variables), one for neighborhoods with new schools built due to

population growth, and one for neighborhoods with new schools built for other reasons.

o Model 4 used a binary variable to identify locations with missing information.

o Model 5 used a categorical variable: first, a classification model was used to estimate

whether a new school is likely to have been built as a result of recent population

growth; and then each neighborhood was categorized as “data available”, “missing,

population growth”, or “missing, other reason”.

If school ratings cannot be reasonably well-predicted from the other factors, but new schools

built due to recent population growth can be reasonably well-classified using the other factors,

which model would you recommend?

i. Model 1

ii. Model 2

iii. Model 3

iv. Model 4

v. Model 5

GRADING: 6 points for the correct answer

SOLUTION:

v is the correct answer. Since the problem states that school ratings cannot be predicted well from the other factors, we don’t have a good regression model, so we can’t use Models 2 or 3. However, because the problem states that new schools built due to population growth can be classified well from the other factors, we can take advantage of that by using Model 5 instead of Models 1 or 4, which don’t use the extra information.
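A minimal sketch of the categorical variable in Model 5 follows. It assumes the classifier's output is already available as a boolean per neighborhood; the function name and the example records are hypothetical.

```python
def missingness_category(rating, predicted_growth):
    """Return the Model 5 category for one neighborhood.

    rating: the school rating, or None when it is missing.
    predicted_growth: classifier output, True if the new school is
        predicted to have been built due to population growth.
    """
    if rating is not None:
        return "data available"
    if predicted_growth:
        return "missing, population growth"
    return "missing, other reason"

# Hypothetical records: (school rating or None, classifier prediction)
neighborhoods = [(7.5, False), (None, True), (None, False)]
categories = [missingness_category(r, g) for r, g in neighborhoods]
print(categories)
```

Model 4 would collapse the last two categories into a single "missing" flag, which is the extra information the solution refers to.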

4. A consulting company has created a stochastic discrete-event simulation model of a large city’s

9-1-1 emergency dispatch operations, including incoming calls, dispatching of the appropriate

response (police, firefighters, paramedics, etc.), and the amount of time until the emergency

assistance arrives at the scene.

Emergency response is not first-come-first-served. When resources are limited, a more-important problem (like gunfire or a burning building) will be prioritized over a less-important

problem (like a cat stuck in a tree), and a more time-sensitive event (like a robbery in progress)

might be prioritized over a less-time-sensitive event (like the discovery of a robbery that took

place the day before).

When a new call for help comes in, the system automatically runs a simulation to quickly give an

estimate of the expected wait time until help arrives, which the operator relays to the caller.

Wait time is a combination of the time until the appropriate resource (e.g., a police unit) has no

higher-priority emergency to respond to, and the time it takes that resource to drive to the

scene.

a. (8 points) How many times does the company need to run the simulation for each new

help request (i.e., how many replications are needed)?

i. Once, because each request is unique.

ii. Once, because the outcome will be the same each replication.

iii. Many times, because of variability and randomness.

b. (8 points) The consulting company has found that simulated wait times are 25% higher

than actual wait times, on average. What would you recommend that they do?

i. Investigate to see what’s wrong with the simulation, because it’s a poor match

to reality.

ii. Scale up all estimates by a factor of 1.25 to get the average simulation estimates

to match the average wait times.

iii. Use the 25%-higher estimates, because that’s what the simulation output is.

GRADING: 8 points for each correct answer

SOLUTIONS:

a. iii

b. i
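To see why a single replication is not enough in part (a), consider the toy sketch below. It is not the consulting company's model: the wait is simply the sum of two exponential random times with made-up means of 5 and 3 minutes. One replication gives one random draw, while many replications give an average with a measurable standard error.

```python
import math
import random
import statistics

def simulate_wait(rng):
    """One replication: time until a unit is free plus drive time (minutes)."""
    time_until_free = rng.expovariate(1 / 5.0)  # hypothetical mean of 5 minutes
    drive_time = rng.expovariate(1 / 3.0)       # hypothetical mean of 3 minutes
    return time_until_free + drive_time

def estimate_wait(replications, seed=0):
    """Average many replications; return (mean, standard error)."""
    rng = random.Random(seed)
    waits = [simulate_wait(rng) for _ in range(replications)]
    mean = statistics.mean(waits)
    std_error = statistics.stdev(waits) / math.sqrt(replications)
    return mean, std_error

single_run = simulate_wait(random.Random(1))  # one noisy draw
mean_wait, std_error = estimate_wait(10_000)  # estimate of the expected wait

print(single_run)
print(mean_wait, std_error)
```

The true expected wait in this toy model is 8 minutes; the averaged estimate should land close to it, while a single run can be far off.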

5. (8 points) For each optimization problem, select its most precise classification. In each model, x are the variables, all other letters refer to known data, and the values of c are all positive.

CLASSIFICATIONS

i. Linear program

ii. Convex quadratic program

iii. Convex program

iv. Integer program

v. General non-convex program

a. minimize ∑_i c_i x_i

   subject to ∑_i ∑_j a_{ijk} x_i x_j ≥ b_k for all k

   all x_i ≥ 0

b. minimize ∑_i c_i x_i

   subject to ∑_i a_{ij} x_i ≤ b_j for all j

   all x_i ∈ {0,1}

c. minimize ∑_i c_i x_i^2

   subject to ∑_i a_{ij} x_i ≤ b_j for all j

   all x_i ≥ 0

d. minimize ∑_i (log c_i) x_i

   subject to ∑_i a_{ij} x_i ≤ b_j for all j

   all x_i ≥ 0

GRADING: 2 points for each correct answer

SOLUTIONS:

a. v

b. iv

c. ii

d. i (since c_i is a constant, log c_i is also a constant, so it’s still linear in the x-variables)
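The reason (a) is classified as general non-convex is that constraints containing products of variables can produce a non-convex feasible region. The sketch below shows this for one hypothetical two-variable instance (the constraint x_1^2 + x_2^2 ≥ 1 with x ≥ 0, chosen here for illustration): two feasible points have an infeasible midpoint, which cannot happen in a convex set.

```python
def feasible(x1, x2):
    """Hypothetical instance of a product-of-variables constraint:
    x1*x1 + x2*x2 >= 1 with both variables nonnegative."""
    return x1 >= 0 and x2 >= 0 and x1 * x1 + x2 * x2 >= 1

point_p = (1.0, 0.0)
point_q = (0.0, 1.0)
midpoint = ((point_p[0] + point_q[0]) / 2, (point_p[1] + point_q[1]) / 2)

print(feasible(*point_p))   # True
print(feasible(*point_q))   # True
print(feasible(*midpoint))  # False: the feasible region is not convex
```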

6. (7 points) An online retailer is testing different methods of dealing with customers who want to

return a product, to see which method minimizes the number of returns. The retailer developed

7 options, so they used a multi-armed bandit approach where each option is chosen with

probability proportional to its likelihood of being the best. Their approach was to do trials in

batches of 1000, and after each 1000 trials to remove all options that are very unlikely to be

best, and then continue testing the rest. The results after the first 1000 trials are shown below.

Option   Return rate   95% confidence interval

#1       8.0%          3.0%-14.0%

#2       10.0%         6.0%-14.5%

#3       13.4%         9.0%-18.0%

#4       19.2%         16.8%-21.8%

#5       22.0%         15.0%-29.0%

#6       26.0%         21.0%-31.0%

#7       26.3%         22.7%-30.0%

Note: Lower return rates are better.

What should the retailer do?

i. Continue the multi-armed bandit approach using all seven options

ii. Continue the multi-armed bandit approach using only Options 1, 2, and 3

iii. Continue the multi-armed bandit approach using only Options 1 and 2

iv. Move to pure exploitation: use only Option 1

GRADING: 7 points for correct answer, partial credit as shown below

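SOLUTION: ii. Options 1, 2, and 3 have overlapping confidence intervals, while the intervals for Options 4 through 7 all lie above the 14.0% upper bound of Option 1’s interval. I gave partial credit of 4 points to answer iii.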
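One simple pruning heuristic that reproduces this answer is to drop any option whose confidence interval lies entirely above the interval of the current best option. This is only a sketch of that heuristic using the numbers from the question, not necessarily the exact rule the retailer used.

```python
# option number -> (return rate %, CI lower bound %, CI upper bound %)
options = {
    1: (8.0, 3.0, 14.0),
    2: (10.0, 6.0, 14.5),
    3: (13.4, 9.0, 18.0),
    4: (19.2, 16.8, 21.8),
    5: (22.0, 15.0, 29.0),
    6: (26.0, 21.0, 31.0),
    7: (26.3, 22.7, 30.0),
}

# Lower return rates are better, so the current best has the smallest rate.
best = min(options, key=lambda k: options[k][0])
best_upper = options[best][2]

# Keep an option only if its interval overlaps the best option's interval.
keep = [k for k, (_, lower, _) in options.items() if lower <= best_upper]
print(keep)  # [1, 2, 3]
```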

7. A supermarket is analyzing its checkout lines, to determine how many checkout lines to have

open at each time.

At busy times (about 10% of the time), the arrival rate is 5 shoppers/minute. At other times, the

arrival rate is 2 shoppers/minute. Once a shopper starts checking out (at any time), it takes an

average of 3 minutes to complete the checkout.

a. (8 points) The first model the supermarket tries is a queuing model with 12 lines open.

What would you expect the queuing model to show?

i. Wait times are high at both busy and non-busy times.

ii. Wait times are low at both busy and non-busy times.

iii. Wait times are low at busy times and high at non-busy times.

iv. Wait times are low at non-busy times and high at busy times.

GRADING: 8 points for correct answer

SOLUTION: iv. At non-busy times, 2 shoppers/minute times 3 minutes to check out = an

average of 6 shoppers who need to be checked out simultaneously, and there are 12 lines

open. At busy times, 5 shoppers/minute times 3 minutes = 15 shoppers who need to be

checked out simultaneously, so 12 lines are not enough to keep the queue short.
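The arithmetic in this solution is the offered load (arrival rate times mean service time) compared with the number of open lines. A short sketch, using the numbers from the question:

```python
def offered_load(arrival_rate, mean_service_time):
    """Average number of shoppers who need to be in service at once."""
    return arrival_rate * mean_service_time

lines_open = 12
service_minutes = 3.0

non_busy_load = offered_load(2.0, service_minutes)  # 6 shoppers
busy_load = offered_load(5.0, service_minutes)      # 15 shoppers

# Utilization below 1 means the lines can keep up; above 1 the queue grows.
non_busy_utilization = non_busy_load / lines_open
busy_utilization = busy_load / lines_open

print(non_busy_utilization)  # 0.5
print(busy_utilization)      # 1.25
```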

The second model the supermarket tries is a Markov chain, where each state is the number of

people waiting at the end of a 1-minute interval (e.g., 0 people waiting, 1 person waiting, etc.).

In this model, the supermarket adds a more-complex staffing rule: In addition to the default of

12 lines open, whenever at least 5 people are waiting (total across all lines), the supermarket

will open 4 new checkout lines, which remain open until no more than 1 person is waiting.

b. (5 points) Which one of the following statements about the model and the memoryless

property is true?

i. The process is definitely memoryless.

ii. The process is definitely not memoryless.

iii. We can’t tell if the process is memoryless without knowing whether arrivals are

Poisson and whether checkout times are Exponential. If we collect data and

discover that both of these are true, then the process is memoryless.

GRADING: 5 points for correct answer

SOLUTION: This question was a little ambiguous, so there were two answers that could be

correct depending on how you look at it. ii is correct if you consider one model for a full day,

so the transition probabilities will be different depending on what time of day it is (and

whether it’s busy or not). iii could be correct if you assume two different models, one for busy

times and one for non-busy times. There are other interpretations for which ii and iii are

correct, but none for which i would be correct.
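One further reading that supports ii, offered here as an assumption of this sketch rather than something stated in the solution, is that the staffing rule itself depends on history: with 2 to 4 people waiting, the extra lines are open only if the queue previously reached 5. The hypothetical function below tracks that rule over a sequence of queue lengths.

```python
def lines_open(waiting_history, base_lines=12, extra_lines=4):
    """Number of open lines after observing a sequence of queue lengths.

    Extra lines open when at least 5 people are waiting and stay open
    until no more than 1 person is waiting.
    """
    extra_open = False
    for waiting in waiting_history:
        if waiting >= 5:
            extra_open = True
        elif waiting <= 1:
            extra_open = False
    return base_lines + (extra_lines if extra_open else 0)

rising = [0, 2, 3]   # queue grew to 3 without ever reaching 5
falling = [6, 4, 3]  # queue reached 6, then shrank to 3

# Both histories end with 3 people waiting, but the staffing differs.
print(lines_open(rising))   # 12
print(lines_open(falling))  # 16
```

Under this reading, the number of people waiting is not enough on its own to determine the next transition unless the state also records whether the extra lines are open.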

8a. (8 points) Rank the following regression and variable-selection/regularization methods from

fewest variables selected to most variables selected. All four methods will be used (the bottom

contains two equivalent spaces).

METHODS

A. Simple linear regression

B. Elastic net

C. Lasso regression

D. Ridge regression

GRADING: 2 points for each correctly-placed method

SOLUTION:

o Fewest = C (Because I accidentally wrote “Simple” linear regression (which sometimes

means just one variable), I also accepted A for fewest variables selected.)

o Middle = B

o Most = A,D (in any order)
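The ranking can be illustrated in the special case of an orthonormal design matrix, where each method has a closed-form, coefficient-by-coefficient solution (assuming the objective is half the sum of squared errors plus the penalty). Lasso soft-thresholds the least-squares coefficients, ridge only shrinks them, and elastic net does some of each. The coefficients, `lam`, and `alpha` below are made-up values.

```python
def soft_threshold(b, t):
    """Shrink b toward zero by t, setting it to exactly zero if |b| <= t."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0

ols = [3.0, -1.5, 0.8, 0.3, -0.1]  # hypothetical least-squares coefficients
lam = 1.0                          # hypothetical penalty strength
alpha = 0.5                        # hypothetical elastic net mix (1 = lasso, 0 = ridge)

lasso = [soft_threshold(b, lam) for b in ols]
elastic_net = [soft_threshold(b, alpha * lam) / (1 + (1 - alpha) * lam) for b in ols]
ridge = [b / (1 + lam) for b in ols]

def count_selected(coefficients):
    """Number of variables with a nonzero coefficient."""
    return sum(1 for b in coefficients if b != 0.0)

print(count_selected(lasso))        # 2
print(count_selected(elastic_net))  # 3
print(count_selected(ridge))        # 5
print(count_selected(ols))          # 5
```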

8b. (14 points) Put the following seven steps in order, from what is done first to what is done last.

1. Scale data

2. Fit lasso regression model on all variables

3. Pick model to use based on performance on a different data set

4. Test model on another different set of data to estimate model quality

5. Impute missing data values

6. Remove outliers

7. Fit linear regression, regression tree, and random forest models using variables

chosen by lasso regression

GRADING: 2 points for each correctly-placed step. This (intentionally) means that someone

who switches the order of two steps, but has all the rest correct, will earn 10 points.

However, for someone who had one step out of order which pushed many others up or down

by one slot, I capped the penalty at 4 points.

SOLUTION: 6, {1 or 5}, {1 or 5}, 2, 7, 3, 4. Note that I accepted either 1 second and 5 third, or 5 second and 1 third. However, imputing or scaling before removing outliers is not a correct solution, because the imputation would be based on bad data, and the scaled range would be stretched too far by the outliers.
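The effect of leaving outliers in can be seen on a tiny made-up data set. In the sketch below, min-max scaling and mean imputation are used as simple stand-ins for the scaling and imputation steps, and the cutoff of 100 is a crude hypothetical outlier rule.

```python
import statistics

def min_max_scale(values):
    """Scale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [10.0, 12.0, 14.0, 16.0, 1000.0]     # hypothetical data; 1000.0 is an outlier
cleaned = [v for v in raw if v < 100.0]    # crude hypothetical outlier rule

# Scaling first: the four ordinary points are squeezed into a tiny range.
scaled_with_outlier = min_max_scale(raw)
print(max(scaled_with_outlier[:4]))        # about 0.006

# Removing the outlier first: the ordinary points use the full range.
scaled_clean = min_max_scale(cleaned)
print(max(scaled_clean))                   # 1.0

# Mean imputation is distorted in the same way.
print(statistics.mean(raw))                # 210.4
print(statistics.mean(cleaned))            # 13.0
```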

9. (10 points) For each situation, select the most appropriate model/approach for it.

MODELS/APPROACHES

i. Game theoretic analysis

ii. Louvain algorithm

iii. Non-parametric test

iv. Queuing

v. Stochastic optimization

a. What distinct sets of furniture can be identified where there are many components

shared within each set?

b. How much should a bidder in an art auction bid, given that there will be competition

from other potential buyers?

c. How many people should be working in a call center to avoid long hold times when

someone calls with a question?

d. How many flights should an airline schedule between Atlanta and Boston each day for

the next six months, given demand uncertainty?

e. Is the median of the top 100 Kenyan runners’ marathon times different from the median

of the top 100 Swedish runners’ marathon times?

GRADING: 2 points for each correct answer

SOLUTIONS:

a. ii

b. i

c. iv

d. v

e. iii
