2.1 Introduction To Statistical Data Analysis
Course
Basics & Terminology
Venkat Reddy
Statistical Packages
SAS
R
SPSS
Stata
Minitab
Excel
SAP HANA
HP SAS
R-Hadoop
What is Statistics?
Statistics is the science of data. It involves:
Collecting
Classifying
Summarizing
Organizing, and
Interpreting
numerical information.
Examples:
Stock prices
Climatology data, such as rainfall amounts and average temperatures
Marketing information
Gambling?
Key Terms
What is Data?
Sample?
Variables
Age
Gender
Income
Sales
Cost
Types Of Variables
In causal relationships:
CAUSE -> EFFECT
independent variable -> dependent variable
E.g.: Income -> Sales
Lab
Descriptive Statistics
Measures of Variation
Variance, Standard Deviation, z-scores
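The three measures named above can be computed in a few lines. This is a minimal sketch using Python's standard `statistics` module; the `leads` values are made-up illustration data, not taken from the course dataset, and the sample (n - 1) form of the variance is assumed.

```python
import statistics

# Hypothetical sample: leads generated by five campaigns (illustrative values only)
leads = [12, 15, 9, 20, 14]

mean = statistics.mean(leads)          # arithmetic mean
variance = statistics.variance(leads)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(leads)      # sample standard deviation = sqrt(variance)

# z-score: how many standard deviations each value lies from the mean
z_scores = [(x - mean) / std_dev for x in leads]

print(f"mean={mean}, variance={variance}, std_dev={std_dev:.3f}")
print([round(z, 2) for z in z_scores])
```

A z-score of 0 means the value equals the mean; positive and negative z-scores mark values above and below it. For the population forms (dividing by n), `statistics.pvariance` and `statistics.pstdev` could be used instead.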
Lab
Run proc means on market_final_data to print the descriptive statistics:
Average leads per campaign
Average reach
The vertical with highest number of campaigns
Average budget per campaign
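The lab itself uses SAS proc means; for readers without SAS, the same four questions can be sketched in Python. The rows below are a hypothetical stand-in for market_final_data: the column names (`vertical`, `leads`, `reach`, `budget`) and every value are assumptions made for illustration, since the real layout of the dataset is not shown in these notes.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows standing in for market_final_data; the column names
# and all values are assumptions, not the real dataset.
campaigns = [
    {"vertical": "Retail",  "leads": 120, "reach": 5000, "budget": 1000},
    {"vertical": "Finance", "leads": 80,  "reach": 3000, "budget": 1500},
    {"vertical": "Retail",  "leads": 100, "reach": 4000, "budget": 500},
    {"vertical": "Travel",  "leads": 60,  "reach": 2000, "budget": 1000},
]

# Averages per campaign (one row = one campaign)
avg_leads = mean(row["leads"] for row in campaigns)
avg_reach = mean(row["reach"] for row in campaigns)
avg_budget = mean(row["budget"] for row in campaigns)

# Count campaigns per vertical and pick the most frequent one
vertical_counts = Counter(row["vertical"] for row in campaigns)
top_vertical, top_count = vertical_counts.most_common(1)[0]

print(f"Average leads per campaign:  {avg_leads}")
print(f"Average reach:               {avg_reach}")
print(f"Average budget per campaign: {avg_budget}")
print(f"Vertical with most campaigns: {top_vertical} ({top_count})")
```

Note that the averages are numeric summaries, while "the vertical with the highest number of campaigns" is a frequency count on a categorical variable, so it needs a different tool than a mean.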
-Details later
Inferential Statistics
Predictive Modeling
The science of predicting future outcomes based on historical events.
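As a rough illustration of the idea, the sketch below fits a straight line to past (income, sales) pairs and uses it to predict sales for a new income value. Simple least-squares regression is only one possible technique and is chosen here as an assumption; it is not necessarily the method covered in the session. All numbers are invented for the example.

```python
from statistics import mean

# Hypothetical historical data: income is the independent variable,
# sales the dependent variable (values are illustrative only).
income = [30, 40, 50, 60]
sales = [5, 7, 9, 11]

mean_x = mean(income)
mean_y = mean(sales)

# Least-squares slope and intercept for the line: sales = intercept + slope * income
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, sales))
sxx = sum((x - mean_x) ** 2 for x in income)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def predict_sales(new_income):
    """Predict sales for an unseen income value using the fitted line."""
    return intercept + slope * new_income

predicted = predict_sales(70)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, predicted sales={predicted:.3f}")
```

The historical pairs play the role of "historical events", and the fitted line is the model that turns a new input into a predicted outcome.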
-Details later
This presentation is just class notes. The course notes for Data Analysis Training were written by me (Venkata Reddy
Konasani) as an aid for myself. The best way to treat this is as a high-level summary; the actual session went into more
depth (explaining the examples, for instance) and contained other information. Most of this material was written as
informal notes, not intended for publication.