Unit 3 Data Analysis

This document discusses the types of variables and data used in data analysis: numeric and categorical variables, and univariate and multivariate data. It gives examples of continuous and discrete numeric variables and of nominal and ordinal categorical variables. It defines univariate data as relating to a single dimension and multivariate data as having more than one component per observation. Parameters describe characteristics of the entire population, while statistics are estimates of those parameters calculated from a sample.

Uploaded by

Shreya Yadav

Unit 3 Data Analysis: Data: Numeric, Categorical, Univariate and Multivariate

Describing Raw Data

Raw data are the records or observations that make up a sample. Depending on the nature of
the intended analysis, these data could be stored in a specialized R object, often a data frame,
possibly read in from an external file. Before summarizing or modeling the
data, it is important to clearly identify your available variables.
A variable is a characteristic of an individual in a population, the value of which can differ
between entities within that population. Variables can take on a number of forms, determined
by the nature of the values they may take.

Numeric Variables
A numeric variable is one whose observations are naturally recorded as numbers. There are
two types of numeric variables: continuous and discrete.
A continuous variable can be recorded as any value in some interval, up to any number of
decimals. For example, if you were observing rainfall amount, a value of 15 mm would make
sense, but so would a value of 15.42135 mm. Any degree of measurement precision gives a
valid observation.
A discrete variable, on the other hand, may take on only distinct numeric values—and if the
range is restricted, then the number of possible values is finite. For example, if you were
observing the number of heads in 20 flips of a coin, only whole numbers would make sense.
It would not make sense to observe 15.42135 heads; the possible outcomes are restricted to
the integers from 0 to 20 (inclusive).
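As a minimal sketch of how this distinction can show up in R, the snippet below stores a few rainfall readings as doubles and a few coin-flip counts as integers. The values and the names rainfall_mm and heads are invented for illustration and do not come from any data set in these notes.

```r
# Hypothetical observations, invented for illustration only.
rainfall_mm <- c(15, 15.42135, 0.3)   # continuous: any value in an interval
heads <- c(9L, 12L, 10L)              # discrete: heads counted in 20 coin flips

is.double(rainfall_mm)   # TRUE when stored as double-precision numbers
is.integer(heads)        # TRUE when stored as whole numbers
all(heads %in% 0:20)     # checks every count lies in the set 0, 1, ..., 20
```

Note that the storage type is only a hint: a continuous variable recorded to the nearest whole number is still continuous by nature.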

Categorical Variables
Though numeric observations are common for many variables, it’s also important to consider
categorical variables. Like some discrete variables, categorical variables may take only one
of a finite number of possibilities. Unlike discrete variables, however, categorical
observations are not always recorded as numeric values.
There are two types of categorical variables. Those that cannot be logically ranked are called
nominal. A good example of a categorical-nominal variable is gender. In most data sets, it
has two fixed possible values, male and female, and the order of these categories is irrelevant.
Categorical variables that can be naturally ranked are called ordinal. An example of a
categorical-ordinal variable would be the dose of a medicine, with the possible values low,
medium, and high. These values can be ordered in either increasing or decreasing amounts,
and the ordering might be relevant to the research.
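In R, categorical variables are usually stored as factors, and an ordinal variable can be marked as an ordered factor. The sketch below uses made-up values for the gender and dose examples above; the object names are assumptions chosen for illustration.

```r
# Hypothetical observations, invented for illustration only.
gender <- factor(c("male", "female", "female", "male"))   # nominal
dose <- factor(c("low", "high", "medium", "low"),
               levels = c("low", "medium", "high"),
               ordered = TRUE)                             # ordinal

is.ordered(gender)   # nominal factors carry no ranking
is.ordered(dose)     # ordered factors do
dose < "high"        # comparisons are defined only for ordered factors
table(dose)          # counts are reported in the declared level order
```

Declaring the levels explicitly matters here: without the levels argument, R would sort them alphabetically (high, low, medium), which is not the natural ranking.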

Once you know what to look for, identifying the types of variables in a given data set is
straightforward. Take the data frame chickwts, which is available in the automatically loaded
datasets package. At the prompt, directly entering the following gives you the first five
records of this data set.
R> chickwts[1:5,]
  weight      feed
1    179 horsebean
2    160 horsebean
3    136 horsebean
4    227 horsebean
5    217 horsebean

R’s help file (?chickwts) describes these data as comprising the weights of 71 chicks (in
grams) after six weeks, based on the type of food provided to them. Now let’s take a look at
the two columns in their entirety as vectors:

R> chickwts$weight
179 160 136 227 217 168 108 124 143 140 309 229 181 141
260 203 148 169 213 257 244 271 243 230 248 327 329 250
193 271 316 267 199 171 158 248 423 340 392 339 341
226 320 295 334 322 297 318 325 257 303 315 380 153 263
242 206 344 258 368 390 379 260 404 318 352 359 216
222 283 332

R> chickwts$feed

weight is a numeric measurement that can fall anywhere on a continuum, so this is a numeric-
continuous variable. The fact that the chick weights appear to have been rounded or recorded
to the nearest gram does not affect this definition because in reality the weights can be any
value. feed is clearly a categorical variable because it has only six possible outcomes, which
aren’t numeric. The absence of any natural or easily identifiable ordering leads to the
conclusion that feed is a categorical-nominal variable.
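The same conclusions can be checked directly in R. The following is one possible way to inspect the two columns, using only base R functions and the built-in chickwts data frame.

```r
# Inspect how R stores each column of the built-in chickwts data frame.
class(chickwts$weight)      # storage class of the weights
class(chickwts$feed)        # storage class of the feed types

levels(chickwts$feed)       # the distinct feed categories
nlevels(chickwts$feed)      # how many categories there are
is.ordered(chickwts$feed)   # whether R treats the categories as ranked
table(chickwts$feed)        # number of chicks per feed type
```

A factor that is not ordered corresponds to what these notes call a categorical-nominal variable.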

Univariate and Multivariate Data
Data that relate to only one dimension are called univariate data.
For example, the weight variable in the earlier example is univariate since each measurement
can be expressed with one component—a single number.
When it’s necessary to consider data with respect to variables that exist in more than one
dimension (in other words, with more than one component or measurement associated with
each observation), then the data are considered multivariate. Multivariate measurements are
arguably most relevant when the individual components aren’t as useful when considered on
their own (in other words, as univariate quantities) in any given statistical analysis.
An ideal example is that of spatial coordinates, which must be considered in terms of at least
two components—a horizontal x-coordinate and a vertical y-coordinate. The univariate data
alone—for example, the x-axis values only—aren’t especially useful. Consider the quakes
data set, which contains observations on 1,000 seismic events recorded off the coast of Fiji.
If you look at the first five records and read the descriptions in the help file ?quakes, you
quickly get a good understanding of what’s presented.

R> quakes[1:5,]
     lat   long depth mag stations
1 -20.42 181.62   562 4.8       41
2 -20.62 181.03   650 4.2       15
3 -26.00 184.10    42 5.4       43
4 -17.97 181.66   626 4.1       19
5 -20.42 181.96   649 4.0       11

The columns lat and long provide the latitude and longitude of the event, depth provides the
depth of the event (in kilometers), mag provides the magnitude, and stations provides the
number of observation stations that detected the event.
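As a rough illustration of the difference, the sketch below treats the location of each event as a bivariate quantity by keeping long and lat together, and contrasts it with a univariate summary of mag alone. The object name location is an assumption introduced here.

```r
# Size of the built-in quakes data frame: rows are events, columns are variables.
dim(quakes)

# Multivariate view: each location needs both coordinates to be meaningful.
location <- quakes[, c("long", "lat")]
head(location)

# Univariate view: magnitude can be summarized on its own.
summary(quakes$mag)
```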

Parameter or Statistic
Statistics as a discipline is concerned with understanding features of an overall population,
defined as the entire collection of individuals or entities of interest. The characteristics of
that population are referred to as parameters. Because researchers are rarely able to access
relevant data on every single member of the population of interest, they typically collect a
sample of entities to represent the population and record relevant data from these entities.
They may then estimate the parameters of interest using the sample data—and those
estimates are the statistics.
For example, if you were interested in the average age of women in the United States who
own cats, the population of interest would be all women residing in the United States who
own at least one cat. The parameter of interest is the true mean age of women in the United
States who own at least one cat. Of course, obtaining the age of every single female
American with a cat would be a difficult feat. A more feasible approach would be to
randomly identify a smaller number of cat-owning American women and take data from
them—this is your sample, and the mean age of the women in the sample is your statistic.
Thus, the key difference between a statistic and a parameter is whether the characteristic
refers to the sample you drew your data from or the wider population.
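The distinction can be sketched with a small simulation. Everything below is hypothetical: the population of ages is generated artificially, and its mean of 45, standard deviation of 12, and size of 100,000 are arbitrary assumptions rather than real figures about cat owners.

```r
set.seed(1)   # make the simulated values reproducible

# Simulated population of ages (purely artificial, not real data).
population_age <- rnorm(100000, mean = 45, sd = 12)
parameter <- mean(population_age)     # characteristic of the whole population

# A random sample of 200 individuals drawn from that population.
sample_age <- sample(population_age, size = 200)
statistic <- mean(sample_age)         # estimate computed from the sample

c(parameter = parameter, statistic = statistic)
```

In practice the population values are not available, so only the statistic can be computed; the simulation simply makes both quantities visible side by side.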
