Notes

Note

Uploaded by

Kritika Tiwari
Copyright
© All Rights Reserved

Statistics is a field of mathematics that deals with collecting, organizing, analyzing, and
interpreting data.

Data is a collection of facts or figures that can be used to answer questions or make decisions.

Statistics can be used to:

1. Describe data
2. Find patterns in data
3. Make predictions about data
4. Compare different sets of data

Statistics is used in many different fields, including:

1. Business
2. Economics
3. Science
4. Medicine
5. Sociology

In simple words, statistics is a way of making sense of data. It can be used to answer questions about
the world around us, and to make predictions about the future.

Statistics are used to summarize data, that is, to describe the data in a way that makes it easy
to understand. For example, statistics can be used to calculate the mean (average), median, and
mode of a set of data.
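
As a minimal sketch of such a summary, the three measures can be computed with Python's standard
`statistics` module. The `scores` list is invented purely for illustration and is not taken from
these notes.

```python
import statistics

# Hypothetical exam scores, invented for illustration.
scores = [70, 85, 85, 90, 100]

mean_score = statistics.mean(scores)      # arithmetic average of the values
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequently occurring value

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

Here the mean is pulled slightly above the median by the single high score of 100, which is one
reason it can be useful to report more than one measure.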

Statistics are used to find patterns in data. This means that they are used to look for relationships
between different variables. For example, statistics can be used to see if there is a correlation
between the number of hours a student studies and their grades.
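
The study-hours example can be sketched with the Pearson correlation coefficient, which is one
common way to measure a linear relationship. The function name `pearson_r` and the `hours` and
`grades` values are assumptions made up for this illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (the covariance numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: hours studied per day and the resulting grade.
hours = [1, 2, 3, 4, 5]
grades = [52, 60, 61, 70, 78]

r = pearson_r(hours, grades)
print(f"correlation between hours and grades: {r:.2f}")
```

A value of `r` close to +1 suggests that grades tend to rise with study hours in this invented
data set; it does not by itself show that studying causes the higher grades.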

Statistics are used to make predictions about data. This means that the patterns found in the data
are used to estimate what is likely to happen in the future. For example, statistics can be used to
predict the weather or the outcome of an election.

Where are statistics found? Statistics are found in many different places, including:

• Newspapers and magazines: Newspapers and magazines often publish statistics about
current events, such as the unemployment rate or the crime rate.
• Government websites: Government websites often publish statistics about a variety of
topics, such as the population, the economy, and the environment.
• Academic journals: Academic journals often publish statistics about research studies.
• Business reports: Businesses often use statistics to track their performance and to make
decisions about their products and services.

Why do I need statistics? You need statistics because they can help you to:

• Understand the world around you: Statistics can help you to understand the world around
you by giving you information about things like population trends, economic growth, and
crime rates.
• Make better decisions: Statistics can help you to make better decisions by giving you
information about the likely outcomes of different choices.
• Solve problems: Statistics can help you to solve problems by giving you information about
the causes and effects of different phenomena.

In short, statistics are a powerful tool that can be used to understand the world around us,
make better decisions, and solve problems.

Vocabulary of Statistics

• Variable is a characteristic of an item or individual. It is also called a data item.
• Population consists of all the items or individuals about which you want to make inferences.
In statistics, a population denotes a large group of elements that share at least one common
feature.
• Sample is a subset of the population selected for analysis in order to make inferences. It is a
finite subset of the population, chosen by a systematic process, that is used to find out the
characteristics of the parent set.
• Parameter is a numerical measure that describes a characteristic of a population, such as the
population mean, population standard deviation, or population variance.
• Statistic is a numerical measure that describes a characteristic of a sample, such as the
sample mean.
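
The difference between a parameter and a statistic can be sketched as follows. The population of
heights is simulated, and simple random sampling is assumed as the selection process; both are
illustrative choices, not something these notes prescribe.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights in cm of 1,000 individuals (simulated).
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameters describe the whole population.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)  # population standard deviation

# A sample is a subset of the population; here 50 items drawn at random.
sample = random.sample(population, 50)

# Statistics describe the sample and serve as estimates of the parameters.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

print(f"population mean (parameter): {population_mean:.1f}")
print(f"sample mean (statistic):     {sample_mean:.1f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. That gap
is why conclusions drawn from a sample are called inferences.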

Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of
discovering useful information, informing conclusions, and supporting decision-making.

There are four main types of data analysis:

1. Descriptive analysis: This type of analysis summarizes data and answers the question "what
happened?". It is the most basic type of data analysis and is often used to track key
performance indicators (KPIs) or to identify trends.
2. Diagnostic analysis: This type of analysis identifies the patterns of behavior that led to certain
outcomes. It is often used to troubleshoot problems or to understand why something
happened.
3. Predictive analysis: This type of analysis uses past data to predict future outcomes. It is often
used to forecast sales, identify risks, or segment customers.
4. Prescriptive analysis: This type of analysis combines the insights from descriptive, diagnostic,
and predictive analysis to recommend a course of action. It is often used to make decisions
or to solve problems.

Data analysis is a powerful tool that can be used to understand the past, predict the future, and
make better decisions. It is a complex field, but there are many resources available to help you learn
more about it.

Here are some examples of how data analysis is used in the real world:

• Sales forecasting: Companies use data analysis to forecast future sales. This helps them to
plan production, inventory, and marketing campaigns.
• Risk assessment: Banks use data analysis to assess the risk of lending money to borrowers.
This helps them to make informed decisions about who to lend money to and how much to
lend.
• Customer segmentation: Retailers use data analysis to segment their customers into different
groups. This helps them to target their marketing campaigns more effectively.
• Fraud detection: Financial institutions use data analysis to detect fraud. This helps them to
protect their customers from financial losses.

These are just a few examples of how data analysis is used in the real world. As the amount
of data available to us continues to grow, the use of data analysis is only going to become
more important.

E.g. https://github.com/rohitgit1/Twitter-Trend-Analyzer-using-NLP

Importance of Descriptive analytics

• It presents the data in a meaningful and understandable way, which allows simpler
interpretation of the data.
• Descriptive statistics provides methods, such as regression and summary statistics, that are
used to uncover insights, trends, and patterns in data.
• It helps to analyze the historical data on which strategic business decisions are based.
• It helps to describe what the data is and what it shows.
• It helps to simplify large amounts of data in a sensible way.

Data exploration:

The first step in data analysis is data exploration, in which the data is described using
statistical techniques and data visualization. It helps to determine the size, quantity, and
accuracy of the data and to understand its nature. In data exploration, one can identify the
relationships between the variables, find the structure of the data, and check for outliers and
the distribution of the data in order to find patterns. These patterns then help to gain insight
into the data.
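
A first exploration pass over a single variable might look like the sketch below. The `data`
values are invented, and the 1.5 × IQR rule used to flag outliers is one common convention
assumed here, not a method stated in these notes.

```python
import statistics

# Hypothetical daily order counts; 95 is planted as an obvious outlier.
data = [12, 15, 14, 10, 13, 16, 11, 14, 95, 13]

# Size and range of the data.
n = len(data)
smallest, largest = min(data), max(data)

# Quartiles describe how the values are distributed.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # interquartile range

# Flag values far outside the middle half of the data (1.5 * IQR rule).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"n={n}, min={smallest}, max={largest}")
print(f"quartiles: {q1}, {q2}, {q3}")
print("possible outliers:", outliers)
```

Whether a flagged value is an error or a genuine extreme observation is a judgment that still has
to be made by looking at where the data came from.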

Visual data is easier to understand and more appealing than large data sets in tabular form. It
is difficult to extract meaning from large data sets made up of rows and columns. Hence, data
visualization is a good way to gain insight into the data using graphs, columns, dimensions,
lines, etc.

Exploratory data analysis (EDA) is a statistical approach used to analyze large data sets in order
to gain insight into the patterns, trends, and relationships present in the data.

Types of Data exploration techniques:

Below are three types of Data exploration techniques:

1. Univariate Analysis

2. Bivariate Analysis

3. Multivariate Analysis

1. Univariate Analysis: Univariate analysis is the simplest form of data analysis. "Uni" stands
for one, which means that the data has only one variable. The patterns found in univariate
analysis can be described as follows:

• Frequency distribution tables: These show how often each value occurs in the data. They give a
brief idea of the data and make patterns easier to spot.
• Bar charts: These are useful for categorical or grouped data, such as data containing age
groups. They can also be used to track changes over time.
• Histogram: The data is divided into bins, each usually a range of values, and the number of
values that fall in each range is counted.
• Pie charts: These show the percentage or portion of each group relative to the entire data set.
The whole pie represents 100%, and the sizes of the slices denote the relative sizes of the groups.
• Frequency polygons: A frequency polygon is very similar to a histogram; it is used to compare
sets of data or to display a cumulative frequency distribution.
• Skewness check: Skewness is a measure of shape that describes how asymmetric a distribution is.
A common approach to checking for skewness is to plot the variable, for example as a histogram.
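
Several of the univariate techniques above can be sketched in a few lines of Python. The `ages`
list and the bin width of 10 are invented for illustration, and the skewness formula used is the
moment coefficient (third central moment divided by the 1.5 power of the second), which is one of
several definitions in use.

```python
from collections import Counter
import statistics

# Hypothetical ages of ten survey respondents.
ages = [21, 22, 22, 23, 25, 25, 25, 31, 38, 47]

# Frequency distribution table: how often each value occurs.
frequency = Counter(ages)

# Histogram: group values into bins of width 10 and count each bin.
bin_width = 10
bins = Counter((age // bin_width) * bin_width for age in ages)
for start in sorted(bins):
    print(f"{start}-{start + bin_width - 1}: {'#' * bins[start]}")

# Skewness check: positive means a longer tail on the right.
n = len(ages)
mean_age = statistics.mean(ages)
m2 = sum((a - mean_age) ** 2 for a in ages) / n  # second central moment
m3 = sum((a - mean_age) ** 3 for a in ages) / n  # third central moment
skewness = m3 / m2 ** 1.5
print(f"skewness: {skewness:.2f}")
```

The text histogram makes the shape visible at a glance: most values fall in the 20-29 bin, with a
thin tail of older ages, which matches the positive skewness value.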

2. Bivariate Analysis: Bivariate analysis is one of the simplest forms of data analysis. "Bi"
stands for two, which means that the data has exactly two variables. The patterns found in
bivariate analysis can be described as follows:

• Scatter plots: These show data as dots or points. This plot helps to easily identify whether two
variables are related. The pattern shows the type of relationship (linear or nonlinear) and its
strength.
• Regression analysis: It helps to identify the magnitude of change in the dependent
variable due to a change in the independent variable. It shows the relationship between variables.
• Linear correlation: This represents the strength of the linear relationship between two
variables. Correlation is represented by a value between -1 and +1: +1 denotes a perfect positive
correlation, -1 denotes a perfect negative correlation, and 0 denotes no linear correlation.
• Analysis of variance (ANOVA): An ANOVA test is a way to find out whether the results of an
experiment are significant by comparing the means of two or more groups. In other words, it helps
you to figure out whether to reject the null hypothesis in favor of the alternative hypothesis.
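
Two of the bivariate techniques above, regression and ANOVA, can be sketched in plain Python. The
helper names `fit_line` and `anova_f` and all data values are assumptions made up for this
illustration. The regression is an ordinary least squares fit of a straight line, and the ANOVA
part computes only the one-way F statistic, not a p-value.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical data: hours studied (independent) and grade (dependent).
hours = [1, 2, 3, 4, 5]
grades = [52, 60, 61, 70, 78]
slope, intercept = fit_line(hours, grades)
print(f"grade = {intercept:.1f} + {slope:.1f} * hours")

# Hypothetical measurements from three treatment groups.
f_stat = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F statistic: {f_stat:.1f}")
```

The slope is the estimated change in grade for each additional hour studied. To decide whether the
F statistic is significant, it would still need to be compared against the F distribution with
(k - 1, N - k) degrees of freedom, which this sketch leaves out.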

3. Multivariate Analysis: Multivariate analysis is used for analyzing data that has more than two
variables, often with more than one outcome variable. It is used to analyze more complex sets of
data. The patterns found in multivariate analysis can be described as follows:

• Principal component analysis (PCA): It is used to find the directions of greatest variability
and to reduce the dimensions of the data set.
• Correspondence analysis: It makes use of contingency tables, which represent data sets that
fall into two or more categories.
• Cluster analysis: It classifies objects into clusters so that the data within each cluster is
similar, but different from the data in other clusters.
• Multiple regression analysis: It is used where two or more explanatory variables have a linear
relationship with the dependent variable.
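
As one illustration of cluster analysis, the sketch below uses k-means, a widely used clustering
algorithm that these notes do not name specifically. The function `kmeans`, the choice of the
first k points as starting centroids, and the `points` data are all assumptions made for this
example.

```python
import math

def kmeans(points, k, iterations=10):
    """Naive k-means: the first k points are used as starting centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return labels, centroids

# Hypothetical 2-D observations forming two well-separated groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]

labels, centroids = kmeans(points, k=2)
print("cluster labels:", labels)
print("centroids:", centroids)
```

Points that share a label are closer to their own cluster's centroid than to the other one. With
real data the result depends on the starting centroids and on k, so practical implementations
usually try several initializations.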
