ID Maxel Test
1. The average income of the incoming population for a new product is 15 million IDR. The median
income is 6 million IDR. Which of the following statements is correct?
a. The average is always higher than the median, hence everything is as expected
b. Since the average is so much higher than the median, we have people with extremely
low incomes and we should check for outliers in the data
c. Since the average is so much higher than the median, we have people with extremely high
incomes and we should check for outliers in the data
d. The average must always be lower than the median, hence there must be some mistake in the data
2. You would like to bet on who will win an upcoming football match between 2 teams. You have 3
experts giving their predictions on who will win the match. Each expert has a probability of 0.6 of
being right. You will decide who to bet on based on a poll of the experts (you will bet on the team
that 2 or 3 experts say will win). Under the assumption that the experts’ opinions are
independent, what is the probability that you will win your bet?
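The arithmetic behind question 2 can be checked by summing the binomial terms for 2 or 3 correct experts. This is a minimal sketch, assuming independence as stated in the question; the function name `majority_correct` is illustrative and not part of the test.

```python
from math import comb

def majority_correct(p: float, n: int = 3) -> float:
    """Probability that more than half of n independent experts are right."""
    need = n // 2 + 1  # smallest number of correct experts that forms a majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))

print(round(majority_correct(0.6), 3))  # 0.648
```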
3. You would like to bet on who will win an upcoming football match between 2 teams. You have 3
experts giving their predictions on who will win the match. Each expert has a probability of 0.6 of
being right. You will decide who to bet on based on a poll of the experts (you will bet on the team
that 2 or 3 experts say will win). Under the assumption that the experts’ opinions are not
independent, what is the probability that you will win your bet?
4. We rolled a 6-sided die 60 times. The number ‘2’ came up 12 times. Is the die fair or is it somehow
biased?
a. To decide whether the die is biased, you would need to choose a level of confidence and
test your hypothesis using the Binomial distribution
b. The die is biased and the number ‘2’ is more likely to come up than on a fair die
c. The die is not biased and the number ‘2’ is not more likely to come up than on a
fair die
d. To decide whether the die is biased, you would need to choose a level of confidence and
test your hypothesis using the Normal distribution
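As an illustration of what a Binomial test for question 4 could look like, the sketch below computes the one-sided probability of seeing 12 or more twos in 60 rolls of a fair die. The significance level `alpha = 0.05` and the helper name `binom_upper_tail` are assumptions for the example; the question does not specify them.

```python
from math import comb

def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n_rolls, n_twos = 60, 12
p_value = binom_upper_tail(n_twos, n_rolls, 1 / 6)  # fair die: P(2) = 1/6
alpha = 0.05  # assumed significance level, not given in the question
print(f"one-sided p-value: {p_value:.3f}, reject fair die: {p_value < alpha}")
```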
5. You have a credit portfolio that consists of only 2 products, A and B. Your analyst delivered
the following graph showing the risk of the whole portfolio (A+B), the risk of product A alone, and
the share of product A in the portfolio.
As you can see, the risk of your portfolio is growing significantly (blue line). Your manager
asks you what is going on. Why is the risk growing? What will you reply?
a. Product A has higher risk than B, and the share of A is growing significantly. This is
most probably causing the growth in risk of the whole portfolio. From the underlying data,
we could calculate exactly how the risk of product B behaves.
b. Product A has higher risk than B, and the share of A is growing significantly. This is
most probably causing the growth in risk of the whole portfolio. From the underlying data,
we can’t calculate exactly how the risk of product B behaves. We can only approximate it.
c. Some important data are missing; we need more graphs to say anything.
d. Since the risk of product A is constant, the share of product A grows, and the risk of the
portfolio grows as well, the risk of B must be growing, and thus we need to focus on
identifying why the risk of B is growing.
6. You are in a game show. There are 3 boxes. 1 contains a prize. The other 2 are empty. You have picked
one box. The host opened one box that you hadn’t picked, and it was empty. Now 2 boxes are
left. Would you like to change your box or keep the box that you chose previously?
a. Keep the previous box
b. Change the box
c. It doesn’t matter; there are only 2 boxes, so either way the chances are 50/50
d. I don’t know
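The scenario in question 6 can be enumerated exactly. The sketch below assumes the host knows where the prize is and always opens an empty box that was not picked; the question does not say this explicitly, and under a different host behaviour the result would change.

```python
from fractions import Fraction

boxes = range(3)
stay_wins = switch_wins = Fraction(0)
for prize in boxes:
    for pick in boxes:
        weight = Fraction(1, 9)  # prize location and first pick are each uniform
        # Assumption: the host knowingly opens an empty, unpicked box, so
        # switching wins exactly when the first pick was wrong.
        if pick == prize:
            stay_wins += weight
        else:
            switch_wins += weight

print(stay_wins, switch_wins)  # 1/3 2/3
```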
7. 10% of people in our database have an iPhone. We are creating a sample for a marketing
campaign where we offer a MacBook to our clients. To ensure success, we offered the campaign to
all clients who have an iPhone and to 10% of our clients without an iPhone. What is the share of
clients with an iPhone in our campaign sample?
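The share in question 7 is a ratio of two weighted groups, which the sketch below works through with exact fractions. Variable names are illustrative only.

```python
from fractions import Fraction

iphone_share = Fraction(10, 100)           # clients with an iPhone
sample_rate_no_iphone = Fraction(10, 100)  # non-iPhone clients that were sampled

iphone_in_sample = iphone_share            # all iPhone owners are included
no_iphone_in_sample = (1 - iphone_share) * sample_rate_no_iphone
share = iphone_in_sample / (iphone_in_sample + no_iphone_in_sample)
print(share)  # 10/19, roughly 52.6%
```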
8. We would like to create a model predicting a client’s probability of default based on the client’s
age and other predictors. We have drawn a graph of age against default rate, and it looks as follows:
How would you add the age predictor to a logistic regression model? Which of the following
statements is correct?
a. The predictor can be added to the model as it is, since logistic regression can handle a
non-linear relationship between the predictor and the target
b. It is impossible to add the predictor, since logistic regression cannot handle a non-linear
relationship between the predictor and the target
c. In order to add the predictor, we must apply a logit transformation to the predictor to deal
with the non-linear relationship between the predictor and the target
d. In order to add the predictor, we must group the predictor and, instead of the values of the
predictor, use the default rate or WOE of each group to deal with the non-linear
relationship between the predictor and the target
9. Your company is developing a vaccine. For 1 in 1000 patients the vaccine will not work,
and they will need to be given a significantly more expensive version (each patient can be
given only one vaccination in their life). There is no clear rule for which patients need the cheaper
or the more expensive vaccine. On historical data of 100 000 patients you developed 3 logistic
regression models predicting whether the vaccine will work on a patient, based on
demographics and blood test analysis. Model A has R-squared 0.98 and Gini 0.02. Model B has
R-squared 0.78 and Gini 0.25. Model C has R-squared 0.74 and Gini 0.30. Which model should you
use as the best model to predict which patients will likely need the more expensive
vaccine?
a. Model A
b. Model B
c. Model C
d. Any model is OK, as the sample on which the models were developed is big enough
10. Your junior colleague created a logistic regression model. He wants your feedback and asks you if
he should continue improving the model. He has 20 predictors in the model, and it was built on
a sample of 200k clients. The Gini of the model on the test sample (not used for modelling, sometimes
also called the holdout sample) is 0.97 (the highest possible Gini is 1). What will you say to your junior
colleague?
a. Congratulate the colleague and tell him not to do anything else, as he is
already very close to the theoretical maximum, so there is no point in wasting more time
on improving the model.
b. Tell the colleague that he probably overfitted the model and should rebuild the model using
fewer degrees of freedom or using cross-validation.
c. Tell the colleague that he probably used information from the future (known now but not
known at the time of decision making) or has some target leakage, as his Gini is suspiciously
high.
d. Tell the colleague that his sample is too small for logistic regression and that is why, together
with the very high Gini, the model looks overfitted. Suggest that the colleague use the Extreme
Gradient Boosting method for such a small sample.
11. Let’s imagine someone was guessing on all previous 10 questions. What would be the probability that
he has all answers correct so far?
12. All clients in our database have the status single, married or unknown. 2 400 000 clients have the status
married, 600 000 have the status unknown, and 40% of all our clients are single. How many clients do we
have in the database?
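The reasoning in question 12 is that the married and unknown groups together make up the non-single remainder of the database. A short sketch of that arithmetic, with illustrative variable names:

```python
married, unknown = 2_400_000, 600_000
single_share_pct = 40

# married + unknown account for the remaining (100 - 40)% of all clients
total = (married + unknown) * 100 // (100 - single_share_pct)
print(total)  # 5000000
```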
13. There are 40 000 people in our campaign. 9 000 people own a house, 10 000 people own a car.
25 000 do not own anything. How many people own a house and don’t own a car?
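Question 13 is an inclusion-exclusion problem. The sketch below assumes that "do not own anything" means owning neither a house nor a car; variable names are illustrative.

```python
total, house, car, neither = 40_000, 9_000, 10_000, 25_000

at_least_one = total - neither      # people owning a house, a car, or both
both = house + car - at_least_one   # inclusion-exclusion
house_only = house - both
print(house_only)  # 5000
```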
14. You are walking with your dog from place A to place B. The distance between A and B is 10 km. Your
dog moves twice as fast as you, so it keeps running ahead to B and back to you. What
distance will your dog have traveled by the time you arrive at B?
15. You and your friend started reading the same book. You read 10 pages a day while your friend reads
12 pages a day. Your friend finished the book two days earlier than you. How many pages did the
book have?
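Question 15 reduces to one linear equation in the number of pages. A minimal sketch, assuming both readers start on the same day and read at a constant rate:

```python
my_rate, friend_rate, gap_days = 10, 12, 2

# pages / my_rate - pages / friend_rate = gap_days, solved for pages
pages = gap_days * my_rate * friend_rate // (friend_rate - my_rate)
print(pages)  # 120
```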
- Good luck -