
DS1 Sample Questions Set1

This document contains 18 sample questions related to decision sciences and statistics. Some key details include:
- Question 1 asks to arrange measures of dispersion (standard deviation, range, IQR) in order of increasing robustness.
- Question 2 asks about conclusions that can be drawn about the number of candidates scoring between certain values based on mean and variance of scores in a competitive exam.
- Question 3 provides sample data on rice and wheat production across districts and asks questions related to descriptive statistics and outliers.

Uploaded by

Udaiveer Singh
Copyright © All Rights Reserved

Decision Sciences 1: Sample questions

(1) A statistical measure is considered robust if it is less affected by outliers. Arrange the three
measures of dispersion (standard deviation, range, and IQR) in increasing order of robustness,
starting with the least robust measure.

(2) In a competitive exam taken by 1 lakh candidates, the candidate scores were found to have a
mean of 50 and variance 16. What best can be concluded about the number of candidates who
scored:
a) Between 40 and 60?
b) More than 55?
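
Since question (2) gives only the mean and the variance, one distribution-free way to approach it is Chebyshev's inequality. The following is a minimal sketch under that assumption; the helper name `chebyshev_tail_bound` is illustrative and not part of the original sheet.

```python
def chebyshev_tail_bound(variance: float, distance: float) -> float:
    """Upper bound on P(|X - mean| >= distance) from Chebyshev's inequality."""
    return min(1.0, variance / distance**2)

candidates = 100_000          # 1 lakh candidates
mean, variance = 50, 16

# (a) "Between 40 and 60" means |X - 50| < 10.
at_least_inside = 1.0 - chebyshev_tail_bound(variance, 10)

# (b) X > 55 implies |X - 50| >= 5, so the two-sided bound also caps this tail.
at_most_above_55 = chebyshev_tail_bound(variance, 5)

print(f"(a) at least {at_least_inside * candidates:.0f} candidates")
print(f"(b) at most {at_most_above_55 * candidates:.0f} candidates")
```

For part (b), a one-sided inequality such as Cantelli's would give a tighter bound than the two-sided Chebyshev bound used here.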

(3) Mr. Zoobi Doobi is examining the data on district-wise (310 districts) rice and wheat production
in 2017 from a website. The column headers and data from the first few districts are shown below
(units in brackets; ha = hectare, 1 ton = 1000 kg).

Some of the summary statistics for the six variables, as computed by the “Descriptive Statistics” tool
in MS Excel, are given below:

            RICE AREA   RICE PRODUCTION   RICE YIELD    WHEAT AREA   WHEAT PRODUCTION   WHEAT YIELD
            (1000 ha)   (1000 tons)       (kg per ha)   (1000 ha)    (1000 tons)        (kg per ha)
Mean        138.63      368.77            2249.72       96.95        356.19             2342.36
Variance    27649.97    223244.3          1320845       14932.43     298236.3           2148959
Maximum     970.95      3001.78           5159.93       829.58       4131.43            5490.02
Count       310         310               310           310          310                310

On further inspection, Zoobi Doobi found that in 29 of these districts the rice area was reported as 0
(as was the rice production). However, rice yield was reported to be 0 in only 28 of these districts,
while in the other one (Tonk) the rice yield was reported as 2500. Similarly, in 53 districts the wheat
area was reported as 0 (as was the wheat production), but in only 47 of these districts was the wheat
yield also reported to be 0, while for Karimnagar and Warangal the wheat yield was reported to
be 1428.57 and for Srikakulam, Visakhapatnam, Kottayam and Ernakulam the reported wheat yield
was 1000.
a) What was the mean rice production among the districts that were reported to have positive
rice production in 2017?
b) Find the standard deviation and one more measure of dispersion (not variance) of wheat
production of all 310 districts in 2017.
c) What best can be concluded about the number of districts which had rice yield more than
5000 kg per ha in 2017?
d) Zoobi Doobi thinks that for either crop, yield values are 1000 times the ratio of the crop
production to the area (cultivated). But with the averages reported (for all districts), e.g., for
rice, 1000*368.77/138.63 gives 2660.05 and not 2249.72. Explain, clearly articulating why
either Zoobi’s calculation or his logic is wrong.
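
Parts (a), (b) and (d) of question (3) can be explored from the summary table alone. The sketch below assumes that the 29 districts with zero rice area are the only districts with zero rice production, and it uses a hypothetical two-district toy data set (not from the sheet) to illustrate why a mean of ratios need not equal a ratio of means.

```python
import math

n_districts = 310
mean_rice_production = 368.77     # 1000 tons, averaged over all 310 districts
zero_rice_districts = 29          # rice area and production both reported as 0
var_wheat_production = 298236.3   # variance from the summary table

# (a) Zero-production districts add nothing to the total, so the total is
# unchanged and only the divisor shrinks (assumption: no other zeros exist).
total_rice = mean_rice_production * n_districts
mean_rice_positive = total_rice / (n_districts - zero_rice_districts)

# (b) The standard deviation is the square root of the reported variance.
sd_wheat_production = math.sqrt(var_wheat_production)

# (d) Hypothetical toy data: per-district ratios averaged vs. ratio of averages.
area = [100.0, 300.0]
production = [100.0, 900.0]
mean_of_ratios = sum(p / a for p, a in zip(production, area)) / len(area)
ratio_of_means = (sum(production) / len(area)) / (sum(area) / len(area))

print(round(mean_rice_positive, 2), round(sd_wheat_production, 2))
print(mean_of_ratios, ratio_of_means)
```

In the toy data the two quantities come out as 2.0 and 2.5, which mirrors the kind of gap Zoobi Doobi observed between 2249.72 and 2660.05.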

(4) 𝑋 and 𝑌 are mutually exclusive events with 𝑃(𝑋) = 0.295 and 𝑃(𝑌) = 0.32 respectively.
Compute the value of 𝑃(𝑋|𝑌)𝑃(𝑌|𝑋).

(5) The temperature in New York City (measured in Celsius) during the month of July, has mean
equal to 25 and variance equal to 5, respectively. What are the mean and variance of the
temperature in New York City on those days as measured on a Fahrenheit scale? The relation of
Celsius (C) and Fahrenheit (F) is given by the following:

$$\frac{C}{5} = \frac{F - 32}{9}.$$
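
The Celsius-to-Fahrenheit relation in question (5) is a linear transformation, so the mean and variance follow from the standard rules for scale and shift. A minimal sketch (the function name `linear_transform_moments` is illustrative):

```python
def linear_transform_moments(mean, variance, scale, shift):
    """Mean and variance of scale * X + shift."""
    return scale * mean + shift, scale**2 * variance

# Rearranging C/5 = (F - 32)/9 gives F = (9/5) * C + 32.
mean_f, var_f = linear_transform_moments(mean=25, variance=5, scale=9 / 5, shift=32)
print(mean_f, var_f)
```

The shift of 32 moves the mean but leaves the variance untouched; only the scale factor 9/5 enters the variance, and it enters squared.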

(6) Suppose that in a T20 cricket match, Indian players can score (in a ball) according to the following
probability distribution:

Runs (per ball)   Probability
0                 0.3
1                 0.3
2                 0.2
3                 0.05
4                 0.1
5                 0.005
6                 0.045

a) What is the probability of scoring at most 2 runs in a ball?
b) How many runs are expected to be scored in an over? (You may assume that the over had
exactly six balls.)
c) Let us now consider that a batsman can get out in a ball with probability 0.05, and that no
run is scored in the ball that he gets out. However, if there is no wicket in a ball, then the
scoring follows the above probability distribution. Under this setup, how would your answers
to part (a) and (b) change?
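
One possible way to set up question (6) is to treat the table as a probability mass function and, for part (c), mix it with a "wicket" outcome that scores zero. The sketch assumes that all six balls of the over are bowled regardless of wickets; that assumption is not stated in the sheet.

```python
runs_pmf = {0: 0.3, 1: 0.3, 2: 0.2, 3: 0.05, 4: 0.1, 5: 0.005, 6: 0.045}
assert abs(sum(runs_pmf.values()) - 1.0) < 1e-12  # sanity check on the table

# (a) P(at most 2 runs in a ball)
p_at_most_2 = sum(p for r, p in runs_pmf.items() if r <= 2)

# (b) Expected runs per ball, then per six-ball over
mean_per_ball = sum(r * p for r, p in runs_pmf.items())
mean_per_over = 6 * mean_per_ball

# (c) With probability 0.05 the batsman is out and no run is scored;
# otherwise the original distribution applies.
p_out = 0.05
p_at_most_2_with_wicket = p_out + (1 - p_out) * p_at_most_2
mean_per_over_with_wicket = 6 * (1 - p_out) * mean_per_ball

print(p_at_most_2, mean_per_over)
print(p_at_most_2_with_wicket, mean_per_over_with_wicket)
```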

(7) The Megacycles store gets its brakes from three companies. 75% are from company A, 15% from B,
and 10% from C. The defect rates of brakes produced by these companies A, B and C are 4%, 6% and
8% respectively. If a cycle is randomly tested and the brake is found to be defective, what is the
probability that it is produced by A?
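
Question (7) is a direct application of Bayes' rule. A minimal sketch, with an illustrative helper named `posterior` that is not part of the original sheet:

```python
def posterior(priors, likelihoods):
    """Posterior P(source | evidence) from priors and P(evidence | source)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(joint.values())          # P(evidence), by total probability
    return {k: v / total for k, v in joint.items()}

supply_share = {"A": 0.75, "B": 0.15, "C": 0.10}
defect_rate = {"A": 0.04, "B": 0.06, "C": 0.08}

p_source_given_defect = posterior(supply_share, defect_rate)
print(round(p_source_given_defect["A"], 4))
```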

(8) In the Cardiology Department at Apollo Hospitals, the probability that a patient visiting Dr.
Bhagwan complains about customer service is 0.05. If on a given day, 10 patients visit Dr. Bhagwan,
what is the chance that at least 2 of them will complain about customer service?
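
If the 10 patients in question (8) are assumed to complain independently of one another (an assumption the question does not state explicitly), the number of complaints is Binomial(10, 0.05). A short sketch using the complement rule:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_patients, p_complain = 10, 0.05

# P(at least 2) = 1 - P(0) - P(1)
p_at_least_2 = 1 - binom_pmf(0, n_patients, p_complain) - binom_pmf(1, n_patients, p_complain)
print(round(p_at_least_2, 4))
```
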

(9) A treadmill stress test has the following known reliability results in the diagnosis of coronary
artery disease (CAD). 95% of people with CAD show a POSITIVE test result when they take the
treadmill stress test. However, 10% of people who do not have CAD also show a POSITIVE test result
when they take the treadmill stress test. Otherwise the test shows a NEGATIVE result. When Mr. Zoobi
Doobi visited Dr. Bhagwan for routine health check-up, the latter assigned some prior probability p,
based on some preliminary enquiry, for Mr. Zoobi Doobi having CAD. When the treadmill stress test
result for Mr. Zoobi Doobi came out as POSITIVE, the doctor concluded that Mr. Zoobi Doobi has
75% chance of having CAD. What must have been the prior probability p?
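
One way to back out the prior in question (9) is to work with odds: posterior odds equal prior odds times the likelihood ratio. The sketch below follows that route and then checks the result by running Bayes' rule forward; the variable names are illustrative.

```python
sensitivity = 0.95       # P(POSITIVE | CAD)
false_positive = 0.10    # P(POSITIVE | no CAD)
posterior = 0.75         # P(CAD | POSITIVE), as concluded by the doctor

likelihood_ratio = sensitivity / false_positive
posterior_odds = posterior / (1 - posterior)
prior_odds = posterior_odds / likelihood_ratio
prior = prior_odds / (1 + prior_odds)

# Forward check: applying Bayes' rule to the recovered prior should give 0.75.
check = sensitivity * prior / (sensitivity * prior + false_positive * (1 - prior))
print(round(prior, 4), round(check, 4))
```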

(10) A new shipment is about to be dispatched from a warehouse when the manager comes to know
that a few items in the shipment may be defective. She starts to choose items from the
shipment at random and inspects them one by one. She decides to cancel the whole shipment as
soon as she finds two defective items. Assume that each item is independently manufactured and
each one is 5% likely to be defective. What is the probability that the manager will cancel the
shipment after inspecting five items from the shipment?
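
The wording of question (10) admits two readings, and the sketch below computes both. The first assumes the shipment is cancelled exactly at the fifth inspected item (the second defective item is the fifth one); the second assumes it is cancelled at or before the fifth item. Which reading is intended is not confirmed by the sheet.

```python
from math import comb

p_defective = 0.05

# Reading 1: exactly one defective among the first four items,
# and the fifth item is the second defective one.
p_cancel_at_fifth = comb(4, 1) * p_defective * (1 - p_defective) ** 3 * p_defective

# Reading 2: at least two defectives among the first five items.
p_zero = (1 - p_defective) ** 5
p_one = comb(5, 1) * p_defective * (1 - p_defective) ** 4
p_cancel_within_five = 1 - p_zero - p_one

print(p_cancel_at_fifth, p_cancel_within_five)
```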

(11) Purna Analytics has the following two-stage process for taking up the analytics projects. The
final approval decision is taken by the data science team. If the data science team approves the project, it
is taken up; otherwise, it is rejected. However, at the first stage, each project is checked by the compliance
team. The compliance team gives a go ahead or a red flag with equal chance.
The data science team can approve projects which have received red flag from the compliance team.
It has been observed from the historical data that for projects given go ahead by the compliance
team, the data science team approves with chance 2/3, while for projects getting red flag by the
compliance team, the data science team approves with chance 1/3. (This happens for every
project.)

a) Find the probability of any project getting an approval.
b) On 7th July two projects have come to Purna. If the first one is approved, what is the chance
of the second one getting approved?
c) Continuing from part (b), if both these projects got approved, what is the chance that exactly
one of them was passed by the compliance team?
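
Question (11) can be worked with exact fractions. The sketch assumes that different projects are handled independently (suggested by "This happens for every project", but not stated outright) and reads "passed by the compliance team" as "given a go ahead".

```python
from fractions import Fraction

p_go = Fraction(1, 2)                  # compliance gives a go ahead
p_approve_given_go = Fraction(2, 3)
p_approve_given_red = Fraction(1, 3)

# (a) Total probability of approval
p_approve = p_go * p_approve_given_go + (1 - p_go) * p_approve_given_red

# (b) Under independence, the first project's outcome does not change this.
p_second_given_first = p_approve

# (c) P(go ahead | approved) for one project, then "exactly one of two".
p_go_given_approve = p_go * p_approve_given_go / p_approve
p_exactly_one_go = 2 * p_go_given_approve * (1 - p_go_given_approve)

print(p_approve, p_second_given_first, p_exactly_one_go)
```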

(12) If Rs 𝑥 is invested in mutual fund A, its worth after one year is Normally distributed with mean
1.05𝑥 and variance 0.002𝑥². If the same amount is invested in mutual fund B, its worth after one
year is Normally distributed with mean 1.06𝑥 and variance 0.003𝑥². Mary is considering two ways
in which she might want to invest her savings of Rs 1000 into these mutual funds.

• Option 1: Invest Rs 600 into mutual fund A and Rs 400 into mutual fund B.
• Option 2: Invest Rs 400 into mutual fund A and Rs 600 into mutual fund B.

a) Compute the mean and variance of the total worth of the investment after one year for each
option, assuming that the growth of the investments in the two funds has a correlation of 0.4.
b) What is the probability that after one year, the total worth of the investment under Option 1
will be worth between Rs 1100 and Rs 1500?
c) What is the probability that after one year, the total worth of the investment under Option 2
will be worth between Rs 1100 and Rs 1500?
d) What is the probability that after one year, the total worth of the investment under Option 1
will be more than the total worth of the investment under Option 2?
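
A possible setup for question (12) is sketched below. It assumes the two fund worths are jointly Normal (so that sums and differences are Normal too) and, for part (d), that both options face the same per-rupee fund outcomes, so the difference between the options reduces to 200 times the difference of the per-rupee worths. Neither assumption is spelled out in the sheet.

```python
from math import sqrt
from statistics import NormalDist

RHO = 0.4  # correlation between the two funds

def portfolio(amount_a, amount_b, rho=RHO):
    """Mean and variance of the total worth after one year."""
    mean = 1.05 * amount_a + 1.06 * amount_b
    var_a = 0.002 * amount_a**2
    var_b = 0.003 * amount_b**2
    variance = var_a + var_b + 2 * rho * sqrt(var_a * var_b)
    return mean, variance

results = {}
for name, (a, b) in {"Option 1": (600, 400), "Option 2": (400, 600)}.items():
    mean, variance = portfolio(a, b)                # part (a)
    worth = NormalDist(mean, sqrt(variance))
    p_range = worth.cdf(1500) - worth.cdf(1100)     # parts (b) and (c)
    results[name] = (mean, variance, p_range)
    print(name, round(mean, 2), round(variance, 2), round(p_range, 4))

# (d) Option 1 minus Option 2 = 200 * (R_A - R_B), with R_A, R_B the
# per-rupee worths of funds A and B (shared-outcome assumption).
diff_mean = 200 * (1.05 - 1.06)
diff_var = 200**2 * (0.002 + 0.003 - 2 * RHO * sqrt(0.002 * 0.003))
p_option1_beats_option2 = 1 - NormalDist(diff_mean, sqrt(diff_var)).cdf(0)
print(round(p_option1_beats_option2, 4))
```
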

(13) Items from a manufacturing process are subject to 3 different and independent tests: Test A,
Test B and Test C respectively. The final score of the manufacturing process is computed as the
weighted sum of the scores in the three tests, with 40% weight to the score in Test A, 40% weight to
the score in Test B and 20% weight to the score in Test C. Suppose that the results of Test A have a
mean of 59 and standard deviation of 10, and the results of Test B have a mean of 67 and a standard
deviation of 13. If the final score has a mean of 65 and standard deviation of 7, determine the mean
and standard deviation of the results from Test C.
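
Because the three tests in question (13) are independent, the mean of the weighted sum is the weighted sum of the means, and its variance is the sum of the squared weights times the variances. Solving those two equations for Test C can be sketched as:

```python
from math import sqrt

w_a, w_b, w_c = 0.4, 0.4, 0.2
mean_a, sd_a = 59, 10
mean_b, sd_b = 67, 13
mean_final, sd_final = 65, 7

# mean_final = w_a*mean_a + w_b*mean_b + w_c*mean_c
mean_c = (mean_final - w_a * mean_a - w_b * mean_b) / w_c

# sd_final^2 = w_a^2*sd_a^2 + w_b^2*sd_b^2 + w_c^2*var_c  (independence)
var_c = (sd_final**2 - w_a**2 * sd_a**2 - w_b**2 * sd_b**2) / w_c**2
sd_c = sqrt(var_c)

print(round(mean_c, 2), round(sd_c, 2))
```
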

(14) Mushfiqur wants to explore the issue of global warming with real data. So, he finds the monthly
global land average temperature data since 1998 (up until 2015) and plots it as follows.

From the plot, he concludes that the temperature does not change much, and therefore the issue of
global warming is not something to worry about. To confirm this belief, he next finds the mean
temperature for different months over the whole period and plots it against the monthly
temperatures of 2015. Here, he notices that the temperatures for 2015 are slightly higher than the
overall average in almost all cases. Confused, he turns to you for advice.
a) Based on the first plot, can we conclude that the global temperature is not rising
significantly? Justify your answer using advantages and/or disadvantages of the data and the
visualization technique.
b) Based on the second plot, can we conclude that the global temperature is rising? Justify your
answer using advantages and/or disadvantages of the data and the visualization technique.
c) From the second plot, can we conclude that the global temperature has a symmetric
distribution?
d) Mushfiqur now turns his attention to just the southern part of India, and wants to see how
the temperatures are distributed over the states of Karnataka, Kerala and Tamil Nadu. He
finds the following measures for the temperature in these three states for a common period
(given in the table below). Using the table, can you compute the mean, median and
standard deviation of the average temperature in the whole of Southern India? Mention any
assumptions you are making and whether those assumptions are valid.

                     Karnataka   Kerala   Tamil Nadu
Mean                 20.72       24.67    27.31
Median               20.15       25.75    28.72
Standard deviation   4.07        4.33     4.67

(15) When a person is tested for COVID infection, the test result is found to be either positive or
negative. A COVID infected patient can have either low viral load or high viral load. If a COVID
infected patient with low viral load is tested, there is a 60% chance that the test will show a positive
result. If a COVID infected patient with high viral load is tested, there is an 85% chance that the test will
show a positive result. If a person is not infected with COVID, there is a 99.5% chance of the test
showing a negative result.
In a country, 20% of the people being tested are actually infected with COVID. However, only 14.4%
of those being tested are found to be positive.

a) What is the chance of a COVID infected patient having low viral load?
b) If 50 COVID infected patients with low viral load are tested, how likely is it that at most 30 of
them will show positive test result?
c) If 50 COVID infected patients with high viral load are tested, how likely is it that more than
40 of them will show positive test result?
d) If 500 people from this country are tested for COVID infection, how many of them are expected
to be found positive as per the test result?
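
Question (15) combines the law of total probability with Binomial counts. The sketch below assumes that test results of different patients are independent; the helper `binom_cdf` and the variable names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(at most k successes in n independent trials with success chance p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

p_pos_low, p_pos_high = 0.60, 0.85
p_pos_uninfected = 1 - 0.995
p_infected, p_positive = 0.20, 0.144

# (a) p_positive = p_infected * P(pos | infected) + (1 - p_infected) * p_pos_uninfected,
# and P(pos | infected) = q*p_pos_low + (1 - q)*p_pos_high; solve for q.
p_pos_given_infected = (p_positive - (1 - p_infected) * p_pos_uninfected) / p_infected
q_low = (p_pos_high - p_pos_given_infected) / (p_pos_high - p_pos_low)

# (b) At most 30 positives among 50 low-viral-load patients
p_b = binom_cdf(30, 50, p_pos_low)

# (c) More than 40 positives among 50 high-viral-load patients
p_c = 1 - binom_cdf(40, 50, p_pos_high)

# (d) Expected number of positives among 500 people tested
expected_positive = 500 * p_positive

print(round(q_low, 4), round(p_b, 4), round(p_c, 4), round(expected_positive, 1))
```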

(16) The Bureau of Labour Statistics reports that the average annual salary in Bardhaman is
Rs 45000. Suppose annual salaries in the area are normally distributed, with a standard deviation of
Rs 5000. Workers are randomly sampled.

a) What is the chance of a worker’s annual income being more than Rs 55,000?
b) What is the chance that among 3 workers sampled, exactly one earns more than Rs
60,000?
c) What is the chance that among 200 workers sampled, at least 157 workers have salaries
between Rs 40,000 and Rs 50,000?
d) Suppose there is a household with 2 members, whose salaries are independently
distributed. What is the chance that the collective income of the family is more than Rs
82,929?
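
All four parts of question (16) can be set up with a Normal model plus Binomial counts. The sketch uses `statistics.NormalDist` from the Python standard library and assumes that sampled workers are independent of one another.

```python
from math import comb, sqrt
from statistics import NormalDist

salary = NormalDist(mu=45_000, sigma=5_000)

# (a) P(salary > 55,000)
p_a = 1 - salary.cdf(55_000)

# (b) Exactly one of 3 workers earns more than 60,000
p_above_60k = 1 - salary.cdf(60_000)
p_b = comb(3, 1) * p_above_60k * (1 - p_above_60k) ** 2

# (c) At least 157 of 200 workers have salaries between 40,000 and 50,000
p_mid = salary.cdf(50_000) - salary.cdf(40_000)
p_c = sum(comb(200, k) * p_mid**k * (1 - p_mid) ** (200 - k) for k in range(157, 201))

# (d) Sum of two independent salaries: means add, variances add
household = NormalDist(mu=2 * 45_000, sigma=sqrt(2) * 5_000)
p_d = 1 - household.cdf(82_929)

print(round(p_a, 4), round(p_b, 5), p_c, round(p_d, 4))
```
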

(17) A railway engineer is monitoring cracks on the railway line segment due to wear and tear. She
notices that on an average there are 2 cracks in every 5 kms of the railway line.

a) What is the probability that there are more than 10 cracks in 25 kms of the railway track?
b) What is the probability that there are fewer than 3 cracks in a 10-km stretch of the track?
c) If the engineer found a crack at the 2 km mark from a railway station, what is the probability
that no crack is detected in the next 7 kms?
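
Question (17) does not name a distribution; a common modelling choice for counts of rare events along a line is a Poisson process, and the sketch below assumes one with a rate of 2 cracks per 5 km. Under that assumption, part (c) uses the memoryless property: the crack at the 2 km mark carries no information about the next 7 km.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

rate_per_km = 2 / 5

# (a) More than 10 cracks in 25 km (expected count 10)
p_a = 1 - poisson_cdf(10, rate_per_km * 25)

# (b) Fewer than 3 cracks in 10 km (expected count 4)
p_b = poisson_cdf(2, rate_per_km * 10)

# (c) No crack in the next 7 km (expected count 2.8)
p_c = exp(-rate_per_km * 7)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```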

(18) Amazon’s last-mile distribution center works 18 hours per day and processes on average 360
packages in a day. Packages are classified as Small, Droner, and Bulky, based on two independent
parameters, their weight and largest diagonal. The weight of packages has an Exponential
distribution with a mean of 6 lbs, and the largest diagonal has a Uniform distribution between 6
and 36 inches. A package is a Droner if its largest diagonal is between 12 and 24 inches and its
weight is between 2 and 4 lbs. A package is Bulky if it is either bigger or heavier than a Droner, or
both. All other packages are considered Small. Last mile delivery of packages is carried out in “hourly
waves” (one-hour duration). For example, packages arriving between 8 am and 9 am are dispatched
for last mile delivery at 9 am. Last mile delivery is done using drones for Droner packages, vans for
Bulky packages, and bicycle carriers for Small packages.

a) If packages arriving for processing follow a Poisson process, compute the probability that at
least 15 packages need to be processed in an “hourly wave” (one-hour duration).
b) Assume that exactly 15 packages need to be processed during a particular “hourly-wave” and
each drone trip carries a single Droner package. What is the probability that they will need at
most three drone trips to process this “hourly wave”?
c) Assume that exactly 15 packages need to be processed during a particular “hourly-wave” and
each van trip can carry at most 10 Bulky packages. What is the probability that they will need
at least a second van trip to process this “hourly wave”?
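
A possible setup for question (18) is sketched below. It reads "bigger or heavier than a Droner" as a largest diagonal above 24 inches or a weight above 4 lbs, and reads "at least a second van trip" as more than 10 Bulky packages among the 15; both readings are interpretations rather than something the sheet confirms. Helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def expo_cdf(x, mean):
    """CDF of an Exponential distribution with the given mean."""
    return 1 - exp(-x / mean)

def unif_cdf(x, lo, hi):
    """CDF of a Uniform(lo, hi) distribution."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

# (a) 360 packages over 18 hours gives a rate of 20 per hourly wave.
rate_per_hour = 360 / 18
p_a = 1 - poisson_cdf(14, rate_per_hour)

# Package-type probabilities (weight and diagonal are independent).
p_droner = (unif_cdf(24, 6, 36) - unif_cdf(12, 6, 36)) * (expo_cdf(4, 6) - expo_cdf(2, 6))
p_bulky = 1 - unif_cdf(24, 6, 36) * expo_cdf(4, 6)

# (b) At most three Droner packages among 15
p_b = sum(binom_pmf(k, 15, p_droner) for k in range(0, 4))

# (c) More than 10 Bulky packages among 15
p_c = sum(binom_pmf(k, 15, p_bulky) for k in range(11, 16))

print(round(p_a, 4), round(p_droner, 4), round(p_bulky, 4), round(p_b, 4), round(p_c, 4))
```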
