Qns Exam2

Qn 1

The marketing manager at Tom Thumb wants to investigate the sales data for the company's new products
across its branches over the last six months. She is interested in finding the factors that drive sales
units in order to optimize the store's revenue, and she collected the data below for each product ID.
Tom Thumb hires you as a data scientist to analyze this sales data set. When you open the data set, you
see the following structure:

Product ID   Sales unit (I)   Price (II)       Month (III)   Likability (IV)
1            High             Regular price    January       4.2
2            High             Promoted price   January       4.7
3            Low              Regular price    March         3.1
4            Average          Promoted price   February      3.7
⋮            ⋮                ⋮                ⋮             ⋮

Which type of model is best suited to analyze sales units as a function of price? Sales unit is either
high, average, or low.

• Linear regression.
• Binary logit.
• Multinomial logit.
• A&C.

Qn 2
After receiving the above data set, you decide to talk to the IT group to see whether you can obtain more
data on sales in Tom Thumb's branches during the last six months. The IT group provides you with the
following table:

Product ID   Sales unit (I)   Price (II)       Month (III)   Likability (IV)
1            4561             Regular price    January       4.2
2            5671             Promoted price   January       4.7
3            1408             Regular price    March         3.1
4            2550             Promoted price   February      3.7

The IT group tells you that High and Low sales mean that the sales unit is greater than 4000 units and
less than 2000 units, respectively; otherwise, it is coded as Average. The new table gives the absolute
sales units. They also tell you that Price is coded as Regular price when there is no discount and as
Promoted price otherwise. Unfortunately, absolute prices are not available due to a technical issue. The
Tom Thumb manager wants you to predict how many units of a product they can sell if they always use
Promoted prices. Which type of model can be used to answer the manager's request?

• Linear regression.
• Multinomial Probit.
• Multinomial logit.
• A&C.
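
The coding rule described by the IT group can be sketched as a small Python function. This is only an illustration of the thresholds stated in the question; the function name `code_sales_unit` is a made-up helper, not part of any Tom Thumb system.

```python
def code_sales_unit(units: int) -> str:
    """Map absolute sales units to the High/Average/Low coding from the question."""
    if units > 4000:   # strictly greater than 4000 units
        return "High"
    if units < 2000:   # strictly less than 2000 units
        return "Low"
    return "Average"   # everything else

# Absolute sales units from the IT group's table, keyed by product ID.
sales_units = {1: 4561, 2: 5671, 3: 1408, 4: 2550}
coded = {pid: code_sales_unit(u) for pid, u in sales_units.items()}
print(coded)
```

Applied to the four listed products, the rule reproduces the categories shown in the table of Qn 1.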

Qn 3
An advertising firm wants to understand the relationship between the number of clicks on an online
advertising banner and the number of times the ad has been loaded on pages. Which type of model
can be considered to answer the above ___ issue?

• Linear Regression
• Binary Probit
• Binary Logit
• Ordered Logit

Qn 4
The above advertising firm also wants to understand which of two online advertising banners a subject
will click, based on their two different designs. Which type of model is best suited to answer the
above ___ issue?

• Linear Regression
• Binary Probit
• Binary Logit
• B and C

Qn 5
The above advertising firm wants to understand which of several online advertising banners a subject
will click, based on their designs and sizes. The marketing team considered two types of design
(Normal vs. Fancy) and two types of size (Small vs. Large). In total, the marketing team provides
2 × 2 = 4 advertising banners. After collecting the data, which type of model can be used to analyze
it?

• Linear Regression
• Multinomial Probit
• Multinomial Logit
• B and C
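
The 2 × 2 design in this question can be enumerated with a short Python sketch. The labels come from the question; the variable names are illustrative only.

```python
from itertools import product

designs = ["Normal", "Fancy"]
sizes = ["Small", "Large"]

# Every combination of design and size is one banner, i.e. one choice alternative.
banners = list(product(designs, sizes))
for number, (design, size) in enumerate(banners, start=1):
    print(f"Banner {number}: {design} design, {size} size")
```

Each subject therefore chooses among four unordered alternatives rather than two.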

Qn 6
Amazon wants to know the effect of the observed average price of product i on its customer rating. On
the Amazon website, each product can be rated 1, 2, 3, 4, or 5 stars. Suppose Amazon wants to
study

Rating_i = f(price_i) + e

Which type of model is most appropriate to estimate the above effect?

• Multinomial Probit
• Ordered Probit
• Multinomial Logit
• Linear Regression

Qn 7
Tax fraud is an important economic issue in every country. The IRS (the U.S. tax system) wants to predict
U.S. tax revenue over the next ten years based on its database of people's income in past years. Clearly,
every citizen prefers to pay lower income tax. Suppose the IRS wants to study

PaidTAXAmount_i = f(income_i, age_i, race_i) + e

where i denotes person i in the income database. Which type of model is most appropriate to predict
U.S. tax revenue in the following years?

• Multinomial Probit, because of the violation of the IIA property: high-income people are similar to
each other.
• Tobit regression
• Truncated linear regression
• Truncated linear regression in which income_i^2 must be included to capture the non-linear trend,
i.e., high-income people want to pay less tax

Qn 8
The Chicago police want to understand the important factors behind a high crime rate in a neighborhood,
based on its demographics. They decide to use their historical database, which records all crimes
committed in a neighborhood by adults over 18 years old, according to U.S. criminal law. Suppose the
Chicago police want to study

CrimeRate_i = f(income_i, age_i, race_i, education_i) + e

where i denotes neighborhood i in the crime database. Which type of model is most appropriate to
predict the crime rate in Chicago?

• Linear regression
• Tobit Regression
• Truncated Regression
• Heckman Two-Steps Selection Model.

Qn 9
A large U.S. bank is interested in predicting the loan default rate. A loan default means that a customer
did not pay back the full amount of the loan on time. The bank decides to use its database on loan
status over the past 20 years. It observes a dependent variable, Default, which is 0 if the subject did
not pay back on time; otherwise, it denotes the amount of the loan that was paid back. Consider
the following systematic relationship

Default_i = f(income_i, age_i, creditscore_i) + e

where i denotes customer i in the database. Which type of model is most appropriate to predict
Default in the future?

• Truncated regression
• Tobit Regression
• Heckman Two-Steps Selection Model
• B and C
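
The way the dependent variable Default is recorded can be sketched in Python. The loan records below are hypothetical values invented for illustration, and `default_outcome` is a made-up helper name.

```python
def default_outcome(paid_on_time: bool, amount_repaid: float) -> float:
    """Return 0 if the customer did not pay back on time, else the amount paid back."""
    return amount_repaid if paid_on_time else 0.0

# Hypothetical records: (paid back on time?, amount paid back in dollars).
loans = [(True, 12000.0), (False, 3000.0), (True, 8500.0), (False, 0.0)]

observed = [default_outcome(paid, amount) for paid, amount in loans]
share_at_zero = observed.count(0.0) / len(observed)
print(observed, share_at_zero)
```

Under this coding the observed values are either exactly 0 or a positive amount, so a cluster of observations sits at 0.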

Qn 10
What is the definition of endogeneity in a regression model Y = β_1 X_1 + … + β_p X_p + ε?

• There is an unobserved variable Z that is correlated with at least one of our predictors X_j
• There is an unobserved variable Z that is correlated with all of our predictors X_1, …, X_p
• There exists an omitted variable Z that is correlated with Y
• There exists a very high level of correlation between at least two predictors X_j and X_j' such that
corr(X_j, X_j') > 0.9

Qn 11
A retailer wants to understand the relationship between the observed price in store and customers'
decision to “Buy” or “Not to Buy”. Using its customer dataset, the retailer ran a logistic analysis as
follows.

Prob(Y = “Buy”) = exp(β_0 + β_1 Price) / (1 + exp(β_0 + β_1 Price))

The estimation result is provided in the table below.

Parameters   DF   Estimate   Std. error   Pr > ChiSq
Intercept    1    4          0.6979       <0.0001
Price        1    -2         0.4669       <0.0001

Which of the following statements is true?

• If price goes up by $x, the probability of buying will drop by 2x%.
• If price goes up by $x, the ratio of odds of buying to not buying will decrease by a factor of exp(-2x)
• If price goes up by $x, the ratio of odds of buying to not buying will increase by a factor of exp(-2x)
• If price goes up by $x, the log of the ratio of odds of buying to not buying will decrease by a factor
of exp(-2x)
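
The fitted logit model above can be evaluated numerically. The sketch below plugs the estimates from the table (β_0 = 4, β_1 = -2) into the stated formula; the starting price and the increase `x` are hypothetical values chosen only for illustration.

```python
import math

B0, B1 = 4.0, -2.0  # intercept and price coefficient from the estimation table

def prob_buy(price: float) -> float:
    """Probability of "Buy" under the logit formula given in the question."""
    z = B0 + B1 * price
    return math.exp(z) / (1.0 + math.exp(z))

def odds_buy(price: float) -> float:
    """Odds of buying versus not buying."""
    p = prob_buy(price)
    return p / (1.0 - p)

price, x = 1.0, 0.5  # hypothetical starting price and price increase
odds_multiplier = odds_buy(price + x) / odds_buy(price)
print(prob_buy(price), prob_buy(price + x), odds_multiplier, math.exp(B1 * x))
```

In a logit model the odds equal exp(β_0 + β_1 Price), so the last two printed numbers agree: the odds after the increase are the odds before it multiplied by exp(β_1 x).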

Qn 12
A retailer wants to understand the relationship between the observed price in store and customers'
decision to buy or not to buy. Using its customer dataset, the retailer ran a probit analysis as
follows.

Prob(Y = “Buy”) = 1 - Φ_ε(-β_0 - β_1 Price)

where Φ_ε denotes the cumulative distribution function of the normal distribution. The estimation result
is provided in the table below.

Parameters   DF   Estimate   Std. error   Pr > ChiSq
Intercept    1    4.5        0.6989       <0.0001
Price        1    -2.5       0.4899       <0.0001

Which of the following statements is true?

• If price goes up by $x, the probability of buying will drop by Ø (-2.5x)%.
• If price goes up by $x, the ratio of odds of buying to not buying will increase by a factor of exp(-2.5x)
• If price goes up by $x, the ratio of odds of buying to not buying will decrease.
• If price goes up by $x, the log of the ratio of odds of buying to not buying will decrease by a factor
of exp(-2.5x) Ø (-2.5x)
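
The model in this question can also be evaluated numerically. The sketch below uses the estimates from the table (β_0 = 4.5, β_1 = -2.5) and assumes Φ_ε is the standard normal CDF, which the question does not state explicitly; the prices are hypothetical.

```python
import math

B0, B1 = 4.5, -2.5  # intercept and price coefficient from the estimation table

def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_buy(price: float) -> float:
    """Prob(Y = "Buy") = 1 - Phi(-b0 - b1 * price), as given in the question."""
    return 1.0 - normal_cdf(-B0 - B1 * price)

# The same hypothetical $0.50 increase applied at two different starting prices.
drop_from_1_0 = prob_buy(1.0) - prob_buy(1.5)
drop_from_1_5 = prob_buy(1.5) - prob_buy(2.0)
print(drop_from_1_0, drop_from_1_5)
```

The two printed drops differ, which illustrates that the change in probability for a given price increase depends on the starting price.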

Qn 13
Consider the following table, which shows the results of a logistic model estimated on a conjoint study
of car design. In this study, designers showed participants different car profiles built from the
following attributes: the number of seats, the cargo space, the engine, and the price of the car.

• Either 2, 4, or 6 seats in a car
• Either 2ft or 3ft cargo space in a car
• Either a gas, hybrid, or electrical engine in a car
• Either a 30K, 40K, or 60K price for a car

Moreover, the designers asked about each participant's marital status. Treat “marital status” as a
dummy variable that is 0 if the participant is married and 1 otherwise, i.e., the participant is
single, divorced, or separated. In the following table, assume that a car with 2 seats, 2ft cargo
space, and a gas engine is the reference category for the categorical variables seat, cargo, and
engine, respectively. Specifically, the researchers estimated the following latent utility model
using a multinomial logit approach:
U_ij = β·price_j + β_{4 seats}·I_{4 seats} + β_{6 seats}·I_{6 seats} + β_{3ft cargo}·I_{3ft cargo}
       + β_{electrical}·I_{electrical} + β_{hybrid}·I_{hybrid}
       + I_{marital status} * (β’·price_j + β’_{4 seats}·I_{4 seats} + β’_{6 seats}·I_{6 seats}
       + β’_{3ft cargo}·I_{3ft cargo} + β’_{electrical}·I_{electrical} + β’_{hybrid}·I_{hybrid}) + ε

If the price of a car goes up by 20K in the market, based on the above table, which of the following
statements is true?

• The demand for that car will drop in the market.
• The demand of married people will drop more than that of other people.
• The demand of married people will drop less than that of other people.
• A and B
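
The structure of the latent utility model can be sketched in Python. Every coefficient below is a hypothetical placeholder chosen for illustration; none of them comes from the exam's estimation table, so the numbers say nothing about the correct answer. The sketch only shows how the marital-status dummy switches the interaction terms (the β’ coefficients) on and off.

```python
# HYPOTHETICAL coefficients, for illustration only (not from the exam's table).
BASE = {"price": -0.05, "4 seats": 0.40, "6 seats": 0.20,
        "3ft cargo": 0.30, "electrical": 0.10, "hybrid": 0.20}
# HYPOTHETICAL interaction (beta-prime) coefficients, active when marital status = 1.
INTERACTION = {"price": -0.01, "4 seats": -0.10, "6 seats": -0.15,
               "3ft cargo": 0.05, "electrical": 0.10, "hybrid": 0.00}

def systematic_utility(price_k, seats, cargo_ft, engine, marital_status):
    """Utility without the error term; marital_status is 0 (married) or 1 (otherwise)."""
    x = {"price": float(price_k),
         "4 seats": float(seats == 4), "6 seats": float(seats == 6),
         "3ft cargo": float(cargo_ft == 3),
         "electrical": float(engine == "electrical"),
         "hybrid": float(engine == "hybrid")}
    main = sum(BASE[name] * x[name] for name in x)
    extra = sum(INTERACTION[name] * x[name] for name in x)
    return main + marital_status * extra

# Utility loss from a 20K price increase (40K to 60K) for the reference car.
loss_married = systematic_utility(40, 2, 2, "gas", 0) - systematic_utility(60, 2, 2, "gas", 0)
loss_other = systematic_utility(40, 2, 2, "gas", 1) - systematic_utility(60, 2, 2, "gas", 1)
print(loss_married, loss_other)
```

For married participants only β·price_j matters, while for everyone else the price effect is β + β’, so the sign of the estimated β’ on price decides which group reacts more strongly.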

Qn 14

Based on the above table, which type of car is the most preferred choice for unmarried people?

• A car with 2 seats, 3ft cargo space, and an electrical engine.
• A car with 4 seats, 2ft cargo space, and a gas engine.
• A car with 2 seats, 3ft cargo space, and a gas engine.
• A car with 4 seats, 2ft cargo space, and an electrical engine.

Qn 15
Based on the above table, which type of car is the most preferred choice for married people?

• A car with 6 seats, 2ft cargo space, and a hybrid engine.
• A car with 6 seats, 3ft cargo space, and an electrical engine.
• A car with 4 seats, 3ft cargo space, and a hybrid engine.
• A car with 4 seats, 2ft cargo space, and an electrical engine.

Qn 16

Based on the above table, which of the following statements is true?

• Married people receive a higher utility from a greater number of seats than unmarried
people.
• Married people receive a higher utility from a smaller number of seats than unmarried
people.
• Married people receive a higher utility from a greater number of seats than unmarried
people if the car has a hybrid engine.
• Since we only observe their choices among alternatives and do not observe people's utility,
none of the above can be chosen.

Qn 17

Based on the above table, which of the following statements is true?

• Married people receive a higher utility from smaller cargo space than unmarried people.
• Married people receive a higher utility from larger cargo space than unmarried people.
• Married people receive a higher utility from larger cargo space than unmarried people if the
car has an electrical engine.
• Since we only observe their choices among alternatives and do not observe people's utility,
none of the above can be chosen.

Qn 18

We know that the above conjoint analysis was conducted in the southern U.S. states. According to U.S.
culture, people in the South marry at an early age. Based on the above table and Questions 13-17, which
of the following statements is true?

• The manufacturer will maximize its market share if it provides only a car that has all desired
features for married people.
• The manufacturer will maximize its market share if it provides only a car that has all desired
features for unmarried people.
• The manufacturer will maximize its market share if it provides two cars to target all customers
(married and unmarried) by providing all desired features for each group of people.
• None of the above

Qn 19
ROC Model         Area (AUC)   Standard Error   95% Wald Confidence Limits
price_Pampers     0.6639       0.0295           0.6061   0.6916
price_Huggies     0.7421       0.0313           0.7208   0.7934
display_Pampers   0.7278       0.0252           0.7184   0.7672
display_Huggies   0.5421       0.0265           0.4902   0.594
age               0.5862       0.0291           0.5291   0.6433


The above table shows the area under the ROC curve for each independent variable in the diaper product
category. The table is based on a logit model of consumers' choice between the two market leaders,
Pampers and Huggies. Which of the following statements is true based on the above table?

• The Price of Huggies is significantly the best predictor of consumers' choice in the diaper
product category.
• The Price of Huggies and the Display of Pampers are significantly the best predictors of
consumers' choice in the diaper product category.
• The Price of Huggies should be better, on average, than the Display of Pampers at predicting
consumers' choice in the diaper product category, since its AUC is larger.
• B and C
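
A Wald confidence interval of the kind reported in the table can be sketched in Python. The sketch assumes the usual form, estimate ± z × standard error with z = 1.96 for a 95% level; the helper names are made up, and checking whether two intervals overlap is only a rough screen, not a formal test of the difference between two AUCs.

```python
def wald_ci(estimate: float, std_error: float, z: float = 1.96):
    """95% Wald interval under the assumed form: estimate +/- z * standard error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

def intervals_overlap(a, b) -> bool:
    """True if the closed intervals a = (low, high) and b = (low, high) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Recompute the interval for display_Huggies from its AUC and standard error.
lower, upper = wald_ci(0.5421, 0.0265)
print(round(lower, 4), round(upper, 4))

# Confidence limits exactly as printed in the table.
price_pampers = (0.6061, 0.6916)
price_huggies = (0.7208, 0.7934)
display_pampers = (0.7184, 0.7672)
print(intervals_overlap(price_huggies, display_pampers))
print(intervals_overlap(price_huggies, price_pampers))
```

The recomputed limits for display_Huggies match the row printed in the table.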

Qn 20
Assume that you have evidence of an asymmetric switching pattern among alternatives in your data
set, i.e. there is ___ against IIA property. Which type of model should be used to capture the
asymmetric switching pattern, i.e., the violation of the IIA property?

• Ordered Logit or Probit
• Multinomial Logit or Multinomial Probit
• Multinomial Nested Logit or Multinomial Probit
• Multinomial Nested Logit
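
The IIA property mentioned in this question can be illustrated with a plain multinomial logit. In the sketch below, the alternatives A, B, and C and their utilities are hypothetical values, not data from the exam.

```python
import math

def mnl_probs(utilities: dict) -> dict:
    """Multinomial logit choice probabilities: exp(V_k) / sum over m of exp(V_m)."""
    denom = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / denom for name, v in utilities.items()}

# Hypothetical systematic utilities for the alternatives.
two_alternatives = mnl_probs({"A": 1.0, "B": 0.5})
three_alternatives = mnl_probs({"A": 1.0, "B": 0.5, "C": 0.9})

ratio_before = two_alternatives["A"] / two_alternatives["B"]
ratio_after = three_alternatives["A"] / three_alternatives["B"]
print(ratio_before, ratio_after)
```

Adding alternative C leaves the ratio P(A)/P(B) unchanged at exp(V_A - V_B): a plain multinomial logit forces C to draw from A and B in proportion to their shares, which rules out asymmetric switching.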
