
SMDM Project Report Dsba

This document contains a report on a data science project analyzing customer spending data across different regions and channels. It includes the following key points: 1) Descriptive statistics are used to identify which region and channel spent the most and least. Various item varieties are also described and compared across regions and channels. 2) Contingency tables are constructed to examine relationships between gender and other variables like graduation intention, employment, and computer ownership. Probabilities are calculated to assess independence from gender. 3) Numerical variables like salary, spending, and text messages are evaluated to determine if they follow a normal distribution. 4) Hypothesis tests are proposed to determine if the population means are equal for items A and

Uploaded by

MANOJ MISRA

__________________________

SMDM PROJECT REPORT DSBA


__________________________
Contents
Problem 1
1.1. Use methods of descriptive statistics to summarize data. Which Region and which Channel spent the most? Which Region and which Channel spent the least?
1.2. There are 6 different varieties of items that are considered. Describe and comment/explain all the varieties across Region and Channel. Provide a detailed justification for your answer.
1.3. On the basis of the descriptive measure of variability, which item shows the most inconsistent behaviour? Which item shows the least inconsistent behaviour?
1.4. Are there any outliers in the data? Back up your answer with a suitable plot/technique with the help of detailed comments.
1.5. On the basis of your analysis, what are your recommendations for the business? How can your analysis help the business to solve its problem? Answer from the business perspective.

Problem 2
2.1. For this data, construct the following contingency tables (keep Gender as the row variable).
2.1.2. Gender and Grad Intention
2.1.3. Gender and Employment
2.1.4. Gender and Computer
2.2. Assume that the sample is representative of the population of CMSU. Based on the data, answer the following questions:
2.2.1. What is the probability that a randomly selected CMSU student will be male? What is the probability that a randomly selected CMSU student will be female?
2.3. Based on the above probabilities, do you think that the column variable in each case is independent of Gender? Justify your comment in each case.
2.4. Note that there are three numerical (continuous) variables in the data set: Salary, Spending and Text Messages. For each of them, comment on whether they follow a normal distribution. Write a note summarizing your conclusions. [Recall that a symmetric histogram does not necessarily mean that the underlying distribution is symmetric.]

Problem 3
3.1. Do you think that the population means for shingles A and B are equal? Form the hypothesis and conduct the test of the hypothesis. What assumption do you need to check before the test for equality of means is performed?
3.2. What assumption about the population distribution is needed in order to conduct the hypothesis tests above?
