Introduction To Statistics

The document provides an overview of statistics, including its types: descriptive and inferential statistics, along with a focus on biostatistics related to biological data. It explains measures of central tendency (mean, median, mode) and measures of dispersion (range, standard deviation), as well as correlation and its types. Additionally, it discusses Karl Pearson's Coefficient of Correlation and multiple correlation, highlighting their merits and demerits.

Uploaded by

impulzedutech

Exercise : BP608T

BIOSTATISTICS AND RESEARCH METHODOLOGY: INTRODUCTION TO STATISTICS
Introduction to Statistics
Statistics is defined as the discipline that deals with the collection, organization, analysis,
summarization, interpretation, and presentation of data.
Types of Statistics
1. Descriptive Statistics: Summary statistics that quantitatively describe or summarize features of
collected data. They characterize the data at hand without making inferences about a larger
population.
2. Inferential Statistics: The process of analyzing data to deduce properties of an underlying
probability distribution. It is used to infer properties of a population from a sample by testing
hypotheses and deriving estimates.
Biostatistics
Biostatistics is a branch of statistics that deals with data related to living organisms. It is applied in
the collection, analysis, and interpretation of biological data, particularly in health and medicine.
Steps in Biostatistics:
1. Generation of a hypothesis
2. Collection of experimental data
3. Classification of collected data
4. Categorization and analysis of collected data
5. Interpretation of data
Measures of Central Tendency
Mean
Mean is the average of a given set of numbers and is calculated by dividing the sum of the numbers
by the total count of values in the dataset.
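Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where $x_1, \dots, x_n$ are the values and $n$ is their count.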
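As a minimal Python sketch of this definition (the function name `mean` and the sample data are illustrative choices, not part of the notes):

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    if not values:
        raise ValueError("mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Illustrative dataset (the same one used in the Mode example).
data = [2, 4, 5, 5, 6, 7]
print(mean(data))  # sum is 29, count is 6
```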

Median
Median is the middle value in a dataset when arranged in ascending or descending order.
• If the number of observations is odd, the median is the middle value.
• If the number of observations is even, the median is the average of the two middle values.
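The two cases above can be sketched in Python as follows (the function name `median` is an illustrative choice; the standard library's `statistics.median` behaves the same way):

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 2, 5]))     # odd number of observations
print(median([1, 2, 3, 4]))  # even number of observations
```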

Mode
Mode is the value that appears most frequently in a dataset. Example: Given the dataset {2, 4, 5, 5,
6, 7}, the mode is 5 since it appears twice.
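A short Python sketch of the same idea, assuming that ties should all be reported (the notes do not say how ties are handled, so returning every most-frequent value is an assumption of this example):

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of an empty dataset is undefined")
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 4, 5, 5, 6, 7]))  # the dataset from the example above
```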

Measures of Dispersion
Dispersion refers to the spread or variability of data in a dataset. It helps in understanding how much
the values differ from one another.
Types of Dispersion Measures
1. Absolute Measures of Dispersion: These have the same unit as the original dataset (e.g., Range,
Standard Deviation, Quartile Deviation).
2. Relative Measures of Dispersion: These are unit-free and expressed as ratios or percentages (e.g.,
Coefficient of Range, Coefficient of Variation, Coefficient of Standard Deviation, Coefficient of
Quartile Deviation, Coefficient of Mean Deviation).
Range
Range is the difference between the highest and lowest values in a dataset.
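The sketch below computes the range and, for comparison, the coefficient of range as an example of a relative measure. The formula used for the coefficient of range, (max - min) / (max + min), is the usual textbook form and is an assumption here, since the notes name the measure without defining it. The data are illustrative.

```python
def value_range(values):
    """Absolute measure: highest value minus lowest value (same unit as the data)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure (assumed textbook form): (max - min) / (max + min)."""
    highest, lowest = max(values), min(values)
    if highest + lowest == 0:
        raise ValueError("coefficient of range is undefined when max + min is 0")
    return (highest - lowest) / (highest + lowest)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data
print(value_range(data))
print(coefficient_of_range(data))
```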
Standard Deviation (SD)
Standard Deviation (SD) measures the amount of variation or dispersion in a dataset.
• A low SD indicates that values are close to the mean.
• A high SD suggests that values are spread over a wider range.

Formula:
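For a population of N values with mean μ, the standard deviation is commonly written σ = √[Σ(x − μ)² / N]; for a sample, the divisor is usually n − 1 instead of N. The notes do not state which convention is intended, so the Python sketch below supports both through a hypothetical `sample` flag. It also shows the coefficient of variation (SD / mean × 100), assuming the population form.

```python
import math


def standard_deviation(values, sample=False):
    """Square root of the average squared deviation from the mean.

    sample=False divides by N (population form);
    sample=True divides by n - 1 (sample form).
    """
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough values")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return math.sqrt(squared_deviations / divisor)


def coefficient_of_variation(values):
    """Relative dispersion in percent; assumes the population SD and a non-zero mean."""
    mean = sum(values) / len(values)
    return standard_deviation(values) / mean * 100


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data with mean 5
print(standard_deviation(data))               # population form
print(standard_deviation(data, sample=True))  # sample form
print(coefficient_of_variation(data))
```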

Characteristics of Standard Deviation:


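1. It takes the algebraic signs of the deviations into account (they are squared rather than ignored)
and is less affected by sampling fluctuations than other measures of dispersion.
2. A small SD indicates a high probability of values being close to the mean.
3. It is independent of origin but not of scale: adding a constant to every value leaves the SD
unchanged, while multiplying every value by a constant multiplies the SD by the absolute value of
that constant.
4. SD is amenable to further algebraic treatment.
5. It is widely used as a measure of dispersion.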
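The property of being independent of origin but not of scale can be checked numerically. The sketch below uses `statistics.pstdev` (population SD) from the Python standard library; the data, the shift of 10, and the scale factor of 3 are arbitrary illustrative choices.

```python
from statistics import pstdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative data
shifted = [x + 10 for x in data]  # change of origin: add a constant
scaled = [3 * x for x in data]    # change of scale: multiply by a constant

print(pstdev(data))     # SD of the original values
print(pstdev(shifted))  # same SD: a shift does not change the spread
print(pstdev(scaled))   # three times the original SD
```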
Applications of Standard Deviation:
1. Describes the variation of a large distribution about its mean.
2. Helps judge whether an observed difference from the mean is likely due to chance or to some
special cause.
3. Helps identify errors in statistical calculations.
4. Helps determine a suitable sample size for valid conclusions.
5. Measures confidence in statistical conclusions.
6. Compares real-world data against statistical models.
Correlation
Correlation measures the relationship between two or more variables. It indicates the extent to
which they fluctuate together.
• Positive Correlation: Both variables increase or decrease together.
• Negative Correlation: One variable increases while the other decreases.
• Linear Correlation: A constant ratio exists between the changes in values of both variables.
• Non-Linear Correlation: The change in one variable does not follow a constant ratio with the other
variable.
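The four types above can be illustrated with a small Python sketch. Both helpers are hypothetical names introduced for this example: `direction` uses the sign of the summed products of deviations, and `has_constant_ratio` checks the "constant ratio" condition exactly, assuming integer data listed with distinct consecutive x values.

```python
from fractions import Fraction


def direction(xs, ys):
    """Classify co-movement by the sign of the summed products of deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if co_movement > 0:
        return "positive"
    if co_movement < 0:
        return "negative"
    return "none"


def has_constant_ratio(xs, ys):
    """True when every change in y is the same multiple of the change in x.

    Assumes integer values and distinct consecutive x values.
    """
    pairs = list(zip(xs, ys))
    ratios = {
        Fraction(y2 - y1, x2 - x1)
        for (x1, y1), (x2, y2) in zip(pairs, pairs[1:])
    }
    return len(ratios) == 1


xs = [1, 2, 3, 4]
print(direction(xs, [3, 5, 7, 9]), has_constant_ratio(xs, [3, 5, 7, 9]))    # rising straight line
print(direction(xs, [8, 6, 4, 2]), has_constant_ratio(xs, [8, 6, 4, 2]))    # falling straight line
print(direction(xs, [1, 4, 9, 16]), has_constant_ratio(xs, [1, 4, 9, 16]))  # rising curve
```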

Karl Pearson's Coefficient of Correlation
Karl Pearson's Coefficient of Correlation is a statistical measure used to determine the degree of
linear relationship between two variables. It is also called the product-moment correlation
coefficient and is denoted by 'r'.
Formula
The coefficient of correlation is given by:
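A standard form of the product-moment coefficient is r = Σ(x − x̄)(y − ȳ) / √[Σ(x − x̄)² · Σ(y − ȳ)²], where x̄ and ȳ are the means of the two variables. The Python sketch below assumes that form; the function name `pearson_r` and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation, assuming the deviation-from-mean form of the formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


xs = [1, 2, 3, 4, 5]  # illustrative data
ys = [2, 4, 5, 4, 5]
print(pearson_r(xs, ys))  # equals 6 / sqrt(60) for this data
```

The sign of the result gives the direction of the relationship and its magnitude gives the strength.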

Merits
1. It provides a precise and quantitative measure of correlation with a meaningful interpretation.
2. It indicates both the direction (positive or negative) and the strength of the correlation.
Demerits
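1. The calculation is time-consuming for large datasets.
2. The correlation coefficient always lies between -1 and +1: r = +1 indicates perfect positive
linear correlation, r = -1 perfect negative linear correlation, and r = 0 no linear correlation.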

Multiple Correlation
Multiple correlation measures the relationship between one variable and two or more other variables
taken together, so it involves three or more variables in total.
If z is the dependent variable and x and y are independent variables, the multiple correlation
formula is:
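For one dependent variable z and two independent variables x and y, the multiple correlation coefficient is usually written in terms of the three pairwise Pearson coefficients: R(z.xy) = √[(r_zx² + r_zy² − 2·r_zx·r_zy·r_xy) / (1 − r_xy²)]. The Python sketch below assumes that form; the function name and the input values are illustrative, and the three inputs are assumed to be mutually consistent correlations.

```python
import math


def multiple_correlation(r_zx, r_zy, r_xy):
    """Multiple correlation of z on x and y from pairwise correlations (assumed textbook form)."""
    if abs(r_xy) >= 1:
        raise ValueError("undefined when x and y are perfectly correlated")
    numerator = r_zx ** 2 + r_zy ** 2 - 2 * r_zx * r_zy * r_xy
    return math.sqrt(numerator / (1 - r_xy ** 2))


# Illustrative pairwise correlations (not taken from the notes).
print(multiple_correlation(r_zx=0.6, r_zy=0.5, r_xy=0.2))
```

When x and y are uncorrelated (r_xy = 0), the expression reduces to √(r_zx² + r_zy²).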
