Quantitative Reasoning Topic

Bivariate analysis is a statistical method used to examine the relationship between two variables, identifying patterns, correlations, or associations. It includes techniques such as correlation analysis, regression analysis, and cross-tabulation, which help in exploring relationships, predicting values, and assessing associations between categorical variables. Additionally, the document discusses point estimation, confidence intervals, and hypothesis testing as essential components of inferential statistics.

Uploaded by

i233052
Copyright
© All Rights Reserved
BIVARIATE ANALYSIS

Bivariate analysis is a statistical technique that analyses two variables simultaneously to recognize the relationship between them. This type of analysis can help identify patterns, correlations, or associations between the variables. Bivariate analysis is used to:

1. Explore the relationship between two variables, such as understanding whether there is a connection between study hours and exam scores.
2. Predict the value of one variable based on the value of another, such as predicting house prices based on square footage.
3. Identify whether there is an association between categorical variables, such as gender and voting preference.

The main types of bivariate analysis are as follows:

1. Correlation Analysis: The Pearson correlation coefficient is used to measure the strength and direction of the linear relationship between two continuous variables.
2. Regression Analysis: Simple linear regression is used to model the relationship between a dependent variable and an independent variable by fitting a linear equation to the observed data.
3. Cross-Tabulation and Chi-Square Test: A contingency table is used to examine the relationship between two categorical variables by displaying the frequency distribution of the variables. The Chi-Square test is used to determine whether there is a significant association between two categorical variables.

A positive correlation would indicate that as study hours increase, exam scores tend to increase as well.

Example 36

A company with ten operating plants producing small spare parts has observed the following pattern of expenditures (Rs. 5 per 1000 units) on inspection and defective parts delivered to the customer, per thousand units.

Expenditures | 25 | 30 | 15 | 75 | 40 | 65 | 45 | 24 | 35 | 70
Defectives   | 50 | 35 | 60 | 15 | 46 | 20 | 28 | 45 | 42 | 22

Draw a scatter diagram to see how strong the relationship is between inspection expenditure and the number of faulty items delivered.
Solution

The data can be graphed using the horizontal axis for the independent variable (expenditure). The vertical axis is used for the dependent variable (defectives). So the scatter diagram is:

[Figure: scatter diagram of defectives delivered against inspection expenditure]

The figure shows a downward trend in defectives delivered as inspection expenditure increases. This is known as a negative slope or negative relationship, which is high but not perfect.

4.10 INTRODUCTION TO LINEAR REGRESSION ANALYSIS

Francis Galton (1886) introduced the term regression. Galton found that there was a tendency for tall parents to have tall children and for short parents to have short children, but that the average height of children of parents of a given height tended to move or "regress" towards the average height in the population as a whole. In other words, the height of children of unusually tall or unusually short parents tends to move toward the average height of the population. Karl Pearson collected more than a thousand records of heights of the members of family groups and confirmed Galton's Law of Universal Regression. He found that the average height of sons of a group of tall fathers was less than their fathers' height, and the average height of sons of a group of short fathers was greater than their fathers' height, thus "regressing" tall and short sons alike towards the average height of all men. In the words of Galton, this was "regression to mediocrity" in hereditary stature.

Regression analysis is concerned with the study of the dependence of a variable (called the dependent variable) on a set of other variables (called the independent or explanatory variables). The dependent variable is assumed to be a random variable, whereas the independent variables are assumed to have fixed (non-random) values. The relation between the expected value of the dependent variable and the independent variable is called a regression relation.
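The regression relation described above can be sketched numerically. The following is only an illustration, not the textbook's own solution: it fits a least-squares straight line to the Example 36 data, with expenditure as the independent variable and defectives as the dependent variable. The helper name `fit_line` is hypothetical, and the second defectives value is read as 35, which is an assumption because the scanned table is unclear at that cell.

```python
# Least-squares fit of defectives on inspection expenditure (Example 36 data).
# Assumption: the second defectives value is read as 35; the scan is unclear.
expenditure = [25, 30, 15, 75, 40, 65, 45, 24, 35, 70]
defectives = [50, 35, 60, 15, 46, 20, 28, 45, 42, 22]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(expenditure, defectives)
print(f"defectives = {intercept:.2f} + ({slope:.4f}) * expenditure")
```

Under these assumptions the fitted slope is negative, which agrees with the downward trend seen in the scatter diagram.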
Moreover, a variable whose variation we try to explain is a dependent variable, while an independent variable is used to explain that variation. A few examples are:

1) The relationship between the sales revenue of a firm and the amount spent on advertising.
2) The relationship of the performance of a stock to the current discount rate of the Federal Board of Revenue (FBR).
3) The dependence of consumption expenditure on disposable income.
4) The dependence of crop yield (say, wheat) on temperature, rainfall, amount of sunshine and fertility.

If we are studying the dependence of a variable on only a single explanatory variable, such as that of consumption expenditure on real income, such a study is known as simple, or two-variable, regression analysis. However, if we are studying the dependence of one variable on more than one explanatory variable, such as that of crop yield on rainfall, temperature, sunshine and fertilizer, it is known as multiple regression analysis. A mathematical equation that allows us to predict values of a dependent variable from known values of one or more independent variables is called a regression equation. When the dependence is represented by a straight-line equation, the regression is said to be linear. A regression equation is also defined as a mathematical equation that defines the relationship between two variables.

8.4 Correlation and Regression

Simple regression analysis shows how variables are linearly related; correlation analysis shows only the degree to which variables are linearly related. In regression analysis, a whole function is estimated (a regression equation), but correlation analysis yields only one number (an index designed to give an immediate picture of how closely two variables move together). Although correlation is a less powerful technique than regression, the two are so closely related that correlation often becomes a useful aid in interpreting regression.
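To make the contrast concrete, the sketch below computes the single number that correlation analysis yields, the Pearson correlation coefficient, for the Example 36 data. It is an illustration under stated assumptions rather than part of the original text: `pearson_r` is a hypothetical helper name, and the second defectives value is read as 35 because the scanned table is unclear there.

```python
import math

# Example 36 data. Assumption: the second defectives value is read as 35.
expenditure = [25, 30, 15, 75, 40, 65, 45, 24, 35, 70]
defectives = [50, 35, 60, 15, 46, 20, 28, 45, 42, 22]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(expenditure, defectives)
print(f"r = {r:.3f}")
```

With these values r comes out at about -0.93: strongly negative but not equal to -1, which matches the description of the relationship as high but not perfect.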
In correlation, both variables are assumed to be random, whereas in regression one variable is assumed to be random and the other non-random or fixed.

STATISTICAL MODELING AND ANALYSES - II

5.1 BASICS OF ESTIMATION

Inferential statistics is a body of methods used to draw conclusions or inferences about characteristics of a population based on sample data. It has two purposes: to make an estimate about the population (estimation) and to draw a conclusion about the population (hypothesis testing).

An estimator of a population parameter is a random variable that depends on the sample information and whose realizations provide approximations to this unknown parameter. In other words, an estimator of the population parameter is a sample statistic computed from the data. Statistical estimation is the process of finding or estimating the true but unknown value of the population parameter by using the sampling units. A numerical value of a population parameter obtained by applying a formula is called an estimate. For example, if we do not know the value of the population mean μ but, from the sample observations, have X̄ = 7.5, then 7.5 is the estimate of μ. A formula used to find the estimate of a population parameter is called the estimator. For example, to estimate the population mean μ we use X̄, so X̄ is called the estimator of μ. An estimator is a rule or method for estimating a parameter based on sample data, and it is a random variable, while an estimate is the specific numerical value obtained by applying an estimator to a given sample, and it is a fixed number.

5.2 POINT ESTIMATION

A single statistic value obtained from the sampling units that is used to estimate the value of the target parameter is called a point estimate of that parameter. For example, if we draw 100 sampling units of an object, the mean of the 100 units is called the point estimate of the mean of all the units present in that object. That is, X̄ = 7.5 is the point estimate of the population mean μ.
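The distinction between an estimator (a rule) and an estimate (a number) can be sketched in code: each function below plays the role of an estimator, and the value it returns for one particular sample is the estimate. This is an illustration only. The function names are hypothetical, the four sample values are the ones used in the chapter's later worked example, and the divisor n - 1 in the standard deviation is an assumption chosen because it reproduces that example's value of 2.58.

```python
import math


def sample_mean(sample):
    """Estimator of the population mean: a rule applied to any sample."""
    return sum(sample) / len(sample)


def sample_std(sample):
    """Estimator of the population standard deviation (divisor n - 1)."""
    mean = sample_mean(sample)
    squared_deviations = sum((x - mean) ** 2 for x in sample)
    return math.sqrt(squared_deviations / (len(sample) - 1))


def standard_error(sample):
    """Estimated standard error of the mean: S / sqrt(n)."""
    return sample_std(sample) / math.sqrt(len(sample))


# Applying each estimator to one observed sample gives fixed numbers,
# which are the point estimates.
sample = [1, 5, 7, 3]
mean_estimate = sample_mean(sample)
std_estimate = sample_std(sample)
se_estimate = standard_error(sample)
print(f"mean: {mean_estimate:.2f}, std dev: {std_estimate:.2f}, "
      f"standard error: {se_estimate:.2f}")
```

A different sample passed to the same functions would give different estimates, which is the sense in which the estimator is a random variable while each estimate is a fixed number.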
The reliability of a point estimate for a parameter is a matter of concern. For meaningful results, an inference concerning a parameter must not only consist of a point estimate but must also be accompanied by a measure of the reliability of the estimate. In other words, one must be able to state how close our estimate is likely to be to the true value of the population parameter. This can be done by using the characteristics of the sampling distribution of the statistic that was used to obtain the point estimate. This procedure is called estimation by confidence interval.

A point estimator draws inferences about a population by estimating the value of an unknown parameter using a single value. Unbiasedness is a desirable property for an estimator. An unbiased estimator is neither systematically too high nor too low compared with the corresponding population parameter, i.e. its expected value is equal to the parameter.

An estimate is a specific realization of an estimator. In other words, the actual number computed from the data is called an estimate of the population parameter. For example, the sample mean is an estimator of the population mean, and an estimate may be 60 inches. Since the sample mean X̄ is an unbiased estimator of the population mean μ, X̄ = 60 is a point estimate of the population mean.

See the following example to differentiate between an estimator and an estimate. Suppose we want to estimate the mean income of all families based on a random sample of 30 families. We can say that the estimator of the population mean is the sample mean if we base our conclusions on the sample mean income. Let the average income of the families in the sample be Rs. 50,000. Then the estimate of the population mean family income is Rs. 50,000. In this example, the parameter to be estimated is the population mean family income.
The point estimator used is the sample mean, and the resulting point estimate is Rs. 50,000.

Population parameter   | Estimator | Estimate
Mean (μ)               | X̄         | x̄
Variance (σ²)          | S²        | s²
Standard deviation (σ) | S         | s
Proportion (p)         | P̂         | p̂

Example

A sample consists of 1, 5, 7 and 3. Find the point estimate of the (i) population mean, (ii) population standard deviation, (iii) standard error of the mean.

Solution

(i) The sample mean: X̄ = (1 + 5 + 7 + 3)/4 = 16/4 = 4. Thus, the point estimate of the population mean μ is 4 and X̄ is the estimator.

(ii) The sample standard deviation:

$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{20}{3}} \approx 2.58$$

Thus, the point estimate of the population standard deviation σ is 2.58 and S is the estimator.

(iii) We know the standard error of the mean is

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Using the sample standard deviation S to estimate the standard error of the mean:

$$S.E.(\bar{X}) = S_{\bar{X}} = \frac{S}{\sqrt{n}} = \frac{2.58}{\sqrt{4}} = \frac{2.58}{2} = 1.29$$

Here S_X̄ is the estimator of σ_X̄ and 1.29 is the point estimate of the standard error of the mean.

5.4 CONFIDENCE INTERVAL ESTIMATE FOR POPULATION MEAN

A confidence interval for the population mean provides a range of values that is likely to contain the true population mean. It gives an estimate of the uncertainty around the sample mean x̄, which is used to infer the population mean. For example, a confidence interval can be used to assess the statement that the average filling content of a mineral water bottle by company A is 500 ml.

To construct a confidence interval for μ, consider whether the population from which the samples are drawn is normal or not normal. Hence we discuss the conditions about the population and the intervals: if the population is normal, then we have to see whether the population standard deviation is given or not.

Note: when we have no knowledge of the approximate value of p, let p = 0.5, because the product p(1 - p) attains its maximum value at p = 0.5.

5.13 HYPOTHESIS TESTING

When a researcher in any field sets out to test a new theory, he first sets or formulates a hypothesis or claim which he believes to be true. In statistical terms, the hypothesis that the researcher tries to formulate is called the alternative hypothesis. Against it, he formulates the null hypothesis. The null and alternative hypotheses are both stated in terms of the appropriate population parameters. They describe two states of nature about the parameter that cannot simultaneously be true. Instead of trying to show that the alternative hypothesis is true, we attempt to produce evidence to show that the null hypothesis is false. A statistical hypothesis is a statement about the numerical value of a population parameter. The null hypothesis, denoted by H₀, is usually the hypothesis that is to be tested for possible rejection. The alternative hypothesis, denoted by H₁ or Hₐ, is the hypothesis against the null hypothesis.

5.13.1 Simple Hypothesis and Composite Hypothesis

A simple hypothesis is one in which all parameters of the distribution are specified. For example, suppose the salaries of the government officers in a certain city are normally distributed with a mean of 16000 and a variance of 1000. That is, we have stated H₀: μ = 16000, which is called a simple hypothesis. Likewise, H₀: μ₁ - μ₂ = 0 is an example of a simple hypothesis.

Suppose we have the following hypotheses:

H₀: μ = 16000 and σ² < 1000, or
H₀: μ = 16000 and σ² > 1000, or
H₀: μ ≤ 16000 and σ² = 1000.

These are all composite hypotheses because they do not specify the distribution exactly. A composite hypothesis is generally of the form H₀: θ < θ₀ or θ ≥ θ₀. The concept of simple and composite hypotheses applies to both null and alternative hypotheses.

5.13.2 Test-Statistic

A test statistic is a random variable whose value is calculated from the sample data and is used in making the decision "fail to reject H₀" or "reject H₀".
The value of the calculated test statistic is used in conjunction with a decision rule to determine either "reject H₀" or "fail to reject H₀". This decision rule must be established prior to collecting the data and specifies how you will reach the decision. The numerical value obtained from the test statistic is called the test value. There are two important statistical tests or test statistics: (i) the z test, used to test for the mean of a large sample, and (ii) the t test, used for the mean of a small sample. The general formula for the test statistic is

$$\text{Test statistic} = \frac{(\text{observed value}) - (\text{expected value})}{\text{standard error}}$$

The observed value is the statistic (e.g. X̄); the expected value is the parameter (e.g. μ) that one would expect to obtain if the null hypothesis were true.

5.15 OUTLINE FOR TESTING A HYPOTHESIS

1. Specify the null and alternative hypotheses H₀ and H₁.
2. Specify the level of significance.
3. Obtain a random sample from the population of interest.
4. Apply the proper test statistic and compute its value using the sample data.
5. Specify the rejection region. This will depend on the value of α selected.
6. Make the appropriate conclusion by observing whether the computed value of the test statistic lies in the rejection region. If so, reject the null hypothesis; otherwise fail to reject H₀.

A few important points should be kept in mind.

(i) The value of α, the probability of a type I error, is specified in advance. It may be small or large; typical values are α = 0.01, 0.02, 0.05 and 0.10. For a fixed sample size, the size of the rejection region decreases as the value of α decreases.

(ii) The test statistic is standardized to provide a measure of how great its departure is from the null hypothesized value of the parameter. The general formula of the test statistic may be written as

$$\text{Test statistic} = \frac{\text{Estimate} - \text{Hypothesized value}}{\text{Standard error of estimate}} = \frac{\text{r.v.} - E(\text{r.v.})}{SD(\text{r.v.})}$$

For the mean it is written as

$$\text{Test statistic} = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}, \quad \text{where } \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

and μ₀ is the value of μ stated in the null hypothesis.

5.17 TEST OF HYPOTHESES ABOUT A POPULATION MEAN

Testing hypotheses about a population mean involves determining whether a sample mean differs significantly from a known or hypothesized population mean. The test can vary depending on whether the population standard deviation (σ) is known or unknown. When σ is known, a Z-test is typically used. This situation is more common when dealing with large populations or when historical data provides a reliable estimate of σ. When σ is unknown, a T-test is used. This test is more appropriate for smaller sample sizes (n < 30) and situations where σ is not reliably known. Thus, select the appropriate test depending on the situation, as follows:

(i) The Z-test is applied when the population variance is known and also when the sample size is large (n ≥ 30).

(ii) The T-test is used when the population variance is unknown and the sample size is small (n < 30). This test accounts for additional uncertainty by using the sample standard deviation. A one-sample T-test is applied when comparing the sample mean to a known value without considering another sample.
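The testing outline and the Z-test rule above can be sketched as a short program. This is a minimal illustration under stated assumptions, not a procedure taken from the text: the function name `z_test_mean` is hypothetical, the alternative hypothesis is assumed to be two-sided, and the numbers in the usage lines (a hypothesized mean of 500, a sample mean of 498, σ = 6, n = 36) are invented purely for demonstration. The critical value comes from the standard normal distribution via Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def z_test_mean(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided Z-test of H0: mu = mu0 when sigma is known.

    Returns the test value z, the critical value, and the decision.
    """
    # Standard error of the mean: sigma / sqrt(n).
    standard_error = sigma / math.sqrt(n)
    # Test statistic: (estimate - hypothesized value) / standard error.
    z = (sample_mean - mu0) / standard_error
    # Rejection region for a two-sided test: |z| > z_(1 - alpha/2).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    reject = abs(z) > critical
    return z, critical, reject


# Hypothetical numbers, for illustration only.
z, critical, reject = z_test_mean(sample_mean=498.0, mu0=500.0, sigma=6.0, n=36)
decision = "reject H0" if reject else "fail to reject H0"
print(f"z = {z:.2f}, critical value = {critical:.2f}, decision: {decision}")
```

When σ is unknown and the sample is small, the same structure applies with the sample standard deviation in place of σ and a critical value from the t distribution instead of the normal distribution; that variant is not shown here because the standard library has no t distribution.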
