Introduction To R Programming

R is an interpreted programming language used for statistical analysis and graphical display of data. It was created at the University of Auckland and is now developed by the R Core Team. R can be used for machine learning, statistics, data analysis and visualization. It has a wide range of statistical and graphical techniques built-in and can easily create functions and packages. R is open-source, platform independent and has a large user community.

Uploaded by

tapstaps902

R is an interpreted programming language created by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand. It is
currently developed by the R Core Team. R is also a software environment
for statistical analysis, graphical representation, reporting, and data
modeling. R is an implementation of the S programming language combined
with lexical scoping semantics.
R Programming Language
• R is a leading tool for machine learning, statistics, and data
analysis. Objects, functions, and packages are easy to create in R.
• It is a platform-independent language, so it runs on all major
operating systems.
• It is free, open-source software, so anyone can install it in any
organization without purchasing a license.
• R is not only a statistics package; it also integrates with other
languages such as C and C++, and it can interact with many data sources
and statistical packages.
• R has a vast community of users that keeps growing.
• R is currently one of the most requested programming languages in
the data science job market.
• It was designed by Ross Ihaka and Robert Gentleman at the
University of Auckland, New Zealand, and is currently developed by the
R Core Team.
• R is an implementation of the S programming language, combined with
lexical scoping semantics inspired by Scheme. The project was conceived
in 1992, with an initial version released in 1995 and the first stable
version (1.0.0) in 2000.
Why Use R?
• Statistical Analysis: R is designed for analysis and provides an
extensive collection of statistical and graphical techniques, making it
a preferred choice for statisticians and data analysts.
• Open Source: R is open-source software, which means it is freely
available to anyone and is supported by a vibrant community of users
and developers.
• Data Visualization: R has libraries such as ggplot2 that enable the
creation of high-quality, customizable data visualizations.
• Data Manipulation: R offers tools for data manipulation and
transformation that simplify filtering, summarizing, and transforming
data.
• Integration: R integrates easily with other programming languages
and data sources. It has connectors to various databases and can be
used in conjunction with Python, SQL, and other tools.
• Community and Packages: R has a vast ecosystem of packages that
extend its functionality and cover most analytics needs.
Features of R Programming Language
• R Packages: One of the major features of R is its wide availability
of libraries. CRAN (the Comprehensive R Archive Network) is a
repository holding more than 10,000 packages.
• Distributed Computing: Distributed computing is a model in which
components of a software system are shared among multiple computers to
improve efficiency and performance. Two packages for distributed
programming in R, ddR and multidplyr, were released in November 2015.
Features of R Programming Language

1. Open-source
R is free software that is accessible to everyone. It is also adaptable,
making it easy to integrate with different applications and processes.
Being open source does not come at the cost of quality: the language is
usable and versatile.
2. Powerful graphics
This is one of the attractive features of R. Many data scientists use R
for analysis because its static graphics produce good-quality data
visualizations. R also has comprehensive libraries for interactive
graphics that make data easier to present and analyze. From elaborate,
interactive flow diagrams to bar graphs, R has what is needed to make
data analysis straightforward.
3. Widely used
R is widely used by data scientists and business leaders across the
globe, and being open source it fosters a strong sense of community
among its users.
4. Performs complex statistical calculations
R owes much of its popularity to its ability to perform both simple and
complex mathematical and statistical calculations. It is used for
analyzing data in many industries.
5. Compatibility
R interoperates with languages such as C, C++, Java, and Python, and its
functions can be integrated into programs written in them.
These features and more make R a favourite choice of many data
scientists. Organizations such as Twitter, Google Analytics, and the BBC
are reported to use R to collect data and create clear visualizations
from which to draw inferences and business conclusions.
Basic R program
Since R is syntactically similar to other widely used languages, it is
easy to learn and code in. Programs can be written in any of the widely
used R environments such as RStudio, Rattle, or Tinn-R. After writing
the program, save the file with the extension .r (or .R). A saved script
can be run from the command line with the Rscript utility followed by
the file name. A minimal program:

# R program to print Welcome to GFG!

# Below line will print "Welcome to GFG!"
cat("Welcome to GFG!")
Output:
Welcome to GFG!

Advantages of R
• R is among the most comprehensive statistical analysis packages, and
new techniques and concepts often appear first in R.
• R is open source, so you can run it anywhere and at any time.
• R is cross-platform: it runs on GNU/Linux, Windows, and macOS.
• Everyone is welcome to contribute new packages, bug fixes, and code
enhancements.
Disadvantages of R
• The quality of some packages is less than perfect.
• R commands give little thought to memory management, so R may
consume all available memory.
• There is no vendor support, so there is nobody to complain to if
something doesn't work.
• R can be slower than other languages such as Python and MATLAB,
depending on the workload.
Applications of R
• R is used for data science. It offers a broad variety of statistics
libraries and provides an environment for statistical computing and
design.
• Many quantitative analysts use R as their programming tool, in
particular for data importing and cleaning.
• R is widely used by data analysts and research programmers, and it
is a fundamental tool in finance.
• Tech companies such as Google, Facebook, Bing, Twitter, Accenture,
and Wipro are reported to use R.
Importance of R Programming in Data Analysis :-

Six Reasons Why You Should Learn R for Data Science

So you want to learn data skills. That's great! But there are tons of
data science courses. Why should you learn R programming specifically?
Would it be better to learn Python?

If you really want to dig into that question, a side-by-side comparison
of Python and R shows how each language handles common data science
tasks. And while the bottom line is that each language has its own
strengths, and both are great choices for data science, R does have
unique strengths that are worth considering!

1. R is built for statistics.

R was originally designed by statisticians for doing statistical analysis,
and it remains the programming choice of most statisticians today. R’s
syntax makes it easy to create complex statistical models with just a few
lines of code. Since so many statisticians use and contribute to R
packages, you’re likely to be able to find support for any statistical
analysis you need to perform.

For related reasons, R is the statistical and data analysis language of
choice in many academic settings. If you aspire to work in academia, or
if you'd just like to read academic papers and then be able to dig into
the code behind them, having R programming skills can be a must.

2. R is a popular language for data science at top tech firms

Almost all of them hire data scientists who use R. Facebook, for example,
uses R to do behavioral analysis with user post data. Google uses R to
assess ad effectiveness and make economic forecasts. Twitter uses R for
data visualization and semantic clustering. Microsoft, Uber, Airbnb,
IBM, and HP all hire data scientists who can program in R.

And by the way, it’s not just tech firms: R is in use at analysis and
consulting firms, banks and other financial institutions, academic
institutions and research labs, and pretty much everywhere else data
needs analyzing and visualizing. Even the New York Times uses R!

3. Learning the data science basics is arguably easier in R.

Python may be one of the most beginner-friendly programming
languages, but once you get past the syntax, R has a big advantage: it was
designed specifically with data manipulation and analysis in mind.

Because of that, learning the core skills of data science (data
manipulation, data visualization, and machine learning) can actually be
easier in R once you've gotten through the basic fundamentals. Common
data visualization styles, for example, are straightforward to create in
R.

And of course, there's the tidyverse, a group of packages built
specifically to make data work in R quicker, easier, and more accessible.
In fact, that's really an advantage in and of itself:

4. Amazing packages that make your life easier.

Because R was designed with statistical analysis in mind, it has a
fantastic ecosystem of packages and other resources that are great for data
science. The dplyr package, for example, makes data manipulation a
breeze, and ggplot2 is a fantastic tool for data visualization.

These packages are part of the tidyverse, a growing collection of
packages maintained by RStudio, a certified B-corp that also creates a
free-to-use R environment of the same name that's perfect for data work.
These packages are powerful, easy to access, and have great
documentation.

RStudio, a company that produces some amazing R packages and the

6. Put another tool in your toolkit.

Even if you’re already a Python expert, no one language is going to be
the right tool for every job. Adding R to your repertoire will make some
projects easier – and of course, it’ll also make you a more flexible and
marketable employee when you’re looking for jobs in data science.

Even if you don't want to use R yourself, learning the basics will make it
easier for you to follow someone else's R code if you ever have to take
over a coworker's project. Being able to look at R and translate it into
Python means that the amazing resources of both languages are open to
you.

Long story short: there are lots of great reasons why you should learn R,
because it's a fantastic language for data science.

Types of Data Analysis :-

Data analysis is an aspect of data science and data analytics that is
all about analyzing data for different kinds of purposes. The data
analysis process involves inspecting, cleaning, transforming and
modeling data to draw useful insights from it.

WHAT ARE THE DIFFERENT TYPES OF DATA ANALYSIS?

Data analysis can be separated and organized into types, arranged in an
increasing order of complexity.

1. Descriptive analysis
2. Diagnostic analysis
3. Exploratory analysis
4. Inferential analysis
5. Predictive analysis
6. Causal analysis
7. Mechanistic analysis
8. Prescriptive analysis

1. DESCRIPTIVE ANALYSIS

The goal of descriptive analysis is to describe or summarize a set of data.

Here's what you need to know:

• Descriptive analysis is the very first analysis performed in the
data analysis process.
• It generates simple summaries about samples and measurements.
• It involves common descriptive statistics such as measures of
central tendency, variability, frequency and position.

Descriptive Analysis Example

Take the Covid-19 statistics page on Google, for example. The line graph
is a pure summary of the cases/deaths, a presentation and description of
the population of a particular country infected by the virus.

Descriptive analysis is the first step in analysis where you summarize and
describe the data you have using descriptive statistics, and the result is a
simple presentation of your data.
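
The measures named above (central tendency, variability, position and
frequency) map onto a handful of base R functions. The following is a
minimal sketch; the case counts are made-up numbers used only for
illustration, not real Covid-19 data.

```r
# Hypothetical sample: daily case counts for one week (made-up numbers)
cases <- c(120, 135, 150, 150, 170, 210, 180)

center   <- c(mean = mean(cases), median = median(cases))          # central tendency
spread   <- c(sd = sd(cases), min = min(cases), max = max(cases))  # variability
position <- quantile(cases, probs = c(0.25, 0.5, 0.75))            # quartiles
freq     <- table(cases)                                           # frequency of each value

print(center)
print(spread)
print(position)
print(freq)
```

For a quick overview, summary(cases) reports the minimum, quartiles,
mean and maximum in a single call.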
2. DIAGNOSTIC ANALYSIS

Diagnostic analysis seeks to answer the question "Why did this happen?"
by taking a more in-depth look at data to uncover subtle patterns.
Here's what you need to know:

• Diagnostic analysis typically comes after descriptive analysis,
taking initial findings and investigating why certain patterns in data
happen.
• Diagnostic analysis may involve analyzing other related data
sources, including past data, to reveal more insights into current data
trends.
• Diagnostic analysis is ideal for further exploring patterns in data
to explain anomalies.

Diagnostic Analysis Example

A footwear store wants to review its website traffic levels over the
previous 12 months. Upon compiling and assessing the data, the
company’s marketing team finds that June experienced above-average
levels of traffic while July and August witnessed slightly lower levels of
traffic. To find out why this difference occurred, the marketing team
takes a deeper look. Team members break down the data to focus on
specific categories of footwear. They discover that in June, pages
featuring sandals and other beach-related footwear received a high
number of views, while these numbers dropped in July and August.

Marketers may also review other factors like seasonal changes and
company sales events to see if other variables could have contributed to
this trend.

3. EXPLORATORY ANALYSIS (EDA)

Exploratory analysis involves examining or exploring data and finding
relationships between variables that were previously unknown. Here's
what you need to know:

• EDA helps you discover relationships between measures in your data.
These relationships are not evidence of causation, as captured by the
phrase "Correlation doesn't imply causation."
• It's useful for discovering new connections and forming hypotheses.
It drives design planning and data collection.

Exploratory Analysis Example

Climate change is an increasingly important topic as the global
temperature has gradually risen over the years. One example of an
exploratory data analysis on climate change involves taking the rise in
temperature from 1950 to 2020 together with the increase in human
activities and industrialization, and looking for relationships in the
data. For example, you may take the number of factories, cars on the
road and airplane flights and see how each correlates with the rise in
temperature.

Exploratory analysis explores data to find relationships between
measures without identifying the cause. It's most useful when
formulating hypotheses.
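
Looking for relationships between measures usually starts with a
correlation matrix. The sketch below uses the mtcars data set that ships
with R as a stand-in, since no climate data accompanies these notes; the
choice of columns is an assumption made for illustration.

```r
# mtcars ships with R (datasets package); it is only a stand-in here
vars <- mtcars[, c("mpg", "wt", "hp")]

# Pairwise Pearson correlations between the three measures
cor_matrix <- cor(vars)
print(round(cor_matrix, 2))

# Correlation between car weight and fuel economy
r_wt_mpg <- cor(mtcars$wt, mtcars$mpg)
print(r_wt_mpg)
```

A strong negative value here suggests a hypothesis (heavier cars use
more fuel) but, as noted above, does not by itself establish the cause.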

4. INFERENTIAL ANALYSIS

Inferential analysis involves using a small sample of data to infer
information about a larger population of data.

The goal of statistical modeling itself is all about using a small
amount of information to extrapolate and generalize information to a
larger group. Here's what you need to know:

• Inferential analysis uses data from a sample that is representative
of a population to estimate a quantity, and it attaches a measure of
uncertainty, such as a standard error or confidence interval, to the
estimate.
• The accuracy of inference depends heavily on your sampling scheme.
If the sample isn't representative of the population, the generalization
will be inaccurate. This problem is known as sampling bias.
Inferential Analysis Example

The idea of drawing an inference about the population at large from a
smaller sample is intuitive. Many statistics you see in the media and on
the internet are inferential: a prediction of an event based on a small
sample. For example, a psychological study on the benefits of sleep
might involve a total of 500 people. When the researchers followed up
with the participants, those who slept seven to nine hours reported
better overall attention spans and well-being, while those who slept
less or more than that range suffered from reduced attention spans and
energy. The 500 people in this study were just a tiny portion of the
billions of people in the world, so the result is an inference about the
larger population.

Inferential analysis uses a smaller sample to extrapolate and generalize
information about the larger group, in order to generate analysis and
predictions.
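
To make "an estimate with a measure of uncertainty" concrete, here is a
minimal sketch that simulates a sample of 500 sleep durations and
computes a 95% confidence interval for the population mean. The
simulated numbers (mean 7.5 hours, standard deviation 1) are assumptions
of this example, not results from any study.

```r
set.seed(42)  # make the simulated sample reproducible

# Hypothetical sample: hours of sleep reported by 500 participants
sleep_hours <- rnorm(500, mean = 7.5, sd = 1)

# One-sample t-test, used here only for its 95% confidence interval
result <- t.test(sleep_hours, conf.level = 0.95)

sample_mean <- mean(sleep_hours)
ci <- result$conf.int
cat("Sample mean:", round(sample_mean, 2), "\n")
cat("95% CI for the population mean:",
    round(ci[1], 2), "to", round(ci[2], 2), "\n")
```

The interval is only trustworthy if the 500 participants were sampled in
a way that represents the population, which is the point made above.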

5. PREDICTIVE ANALYSIS

Predictive analysis involves using historical or current data to find
patterns and make predictions about the future. Here's what you need to
know:

• The accuracy of the predictions depends on the input variables.
• Accuracy also depends on the types of models. A linear model might
work well in some cases, and in other cases it might not.
• Using a variable to predict another one doesn't denote a causal
relationship.

Predictive Analysis Example

The 2020 US election was a popular topic, and many prediction models
were built to predict the winning candidate. FiveThirtyEight did this to
forecast the 2016 and 2020 elections. Predictive analysis for an
election requires input variables such as historical polling data,
trends and current polling data in order to return a good prediction.
Something as large as an election wouldn't use just a linear model, but
a complex model with certain tunings to best serve its purpose.

Predictive analysis takes data from the past and present to make
predictions about the future.
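
The simplest instance of this idea is a linear model fitted to past
observations and then asked about a new input. The sketch below uses the
cars data set that ships with R (speed and stopping distance); it is a
toy illustration, far simpler than an election forecast.

```r
# cars ships with R: speed (mph) and stopping distance (ft)
model <- lm(dist ~ speed, data = cars)

# Predict the stopping distance for a chosen speed
new_data <- data.frame(speed = 21)
predicted <- predict(model, newdata = new_data)

print(coef(model))
print(predicted)
```

Whether a straight line is adequate depends on the data, which is why
the choice of model matters as much as the input variables.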

6. CAUSAL ANALYSIS

Causal analysis looks at cause-and-effect relationships between
variables and is focused on finding the cause of a correlation. Here's
what you need to know:

• To find the cause, you have to question whether the observed
correlations driving your conclusion are valid. Just looking at the
surface data won't help you discover the hidden mechanisms underlying
the correlations.
• Causal analysis is applied in randomized studies focused on
identifying causation.
• Causal analysis is the gold standard in data analysis and scientific
studies where the cause of a phenomenon is to be extracted and singled
out, like separating wheat from chaff.
• Good data is hard to find and requires expensive research and
studies. These studies are analyzed in aggregate (multiple groups), and
the observed relationships are just average effects (means) over the
whole population. This means the results might not apply to everyone.

Causal Analysis Example

Say you want to test whether a new drug improves human strength and
focus. To do that, you run a randomized controlled trial. You compare
the participants receiving the new drug against participants receiving a
placebo, using a few tests of strength and overall focus and attention.
This allows you to observe how the drug affects the outcome.

Causal analysis is about finding out the causal relationship between
variables, and examining how a change in one variable affects another.
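
A randomized trial of this kind can be sketched as a small simulation.
Everything numeric below (200 participants, a baseline score of 50, a
treatment effect of 3 points) is an assumption made for illustration and
does not come from any real study.

```r
set.seed(1)  # reproducible simulation

n <- 200
# Random assignment is what licenses a causal reading of the difference
group <- sample(rep(c("drug", "placebo"), each = n / 2))

# Simulated focus scores; the +3 treatment effect is an assumption
score <- rnorm(n, mean = 50, sd = 10) + ifelse(group == "drug", 3, 0)

# Estimated average effect: difference in group means
ate <- mean(score[group == "drug"]) - mean(score[group == "placebo"])
test <- t.test(score ~ group)

cat("Estimated average effect:", round(ate, 2), "\n")
cat("p-value:", signif(test$p.value, 3), "\n")
```

The estimate is an average over the whole sample, so, as noted above, it
need not describe the effect on any individual participant.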

7. MECHANISTIC ANALYSIS

Mechanistic analysis is used to understand the exact changes in
variables that lead to changes in other variables. Here's what you need
to know:

• It's applied in the physical or engineering sciences, in situations
that require high precision and leave little room for error, where the
only noise in the data is measurement error.
• It's designed to understand a biological or behavioral process, the
pathophysiology of a disease or the mechanism of action of an
intervention.

Mechanistic Analysis Example

Many graduate-level research projects and complex topics are suitable
examples, but to put it in simple terms, let's say an experiment is done
to simulate safe and effective nuclear fusion to power the world. A
mechanistic analysis of the study would entail a precise balance of
controlling and manipulating variables with highly accurate measures of
both variables and the desired outcomes. It's this intricate and
meticulous modus operandi toward these big topics that allows for
scientific breakthroughs and advancement of society.

Mechanistic analysis is in some ways a predictive analysis, but modified
to tackle studies that require high precision and meticulous
methodologies for physical or engineering science.

8. PRESCRIPTIVE ANALYSIS

Prescriptive analysis compiles insights from other previous data
analyses and determines actions that teams or companies can take to
prepare for predicted trends. Here's what you need to know:

• Prescriptive analysis may come right after predictive analysis, but
it may involve combining many different data analyses.
• Companies need advanced technology and plenty of resources to
conduct prescriptive analysis. AI systems that process data and adjust
automated tasks are an example of the technology required to perform
prescriptive analysis.

Prescriptive Analysis Example

Prescriptive analysis is pervasive in everyday life, driving the curated
content users consume on social media. On platforms like TikTok and
Instagram, algorithms can apply prescriptive analysis to review past
content a user has engaged with and the kinds of behaviors they
exhibited with specific posts. Based on these factors, an algorithm
seeks out similar content that is likely to elicit the same response and
recommends it on a user's personal feed.

R Data Types :-

Generally, while programming in any language, you need variables to
store information. Variables are nothing but reserved memory locations
to store values: when you create a variable, you reserve some space in
memory.

You may want to store information of various data types such as
character, integer, floating point, or Boolean. In R, a variable is not
declared with a data type. Instead, it takes the type of the object
assigned to it, and that type determines what is stored and how much
memory is used.

R has a variety of data types and object classes. You will learn much
more about these as you continue to get to know R.

Basic data types in R can be divided into the following types:

• numeric - (10.5, 55, 787)
• integer - (1L, 55L, 100L, where the letter "L" declares this as an
integer)
• complex - (9 + 3i, where "i" marks the imaginary part)
• character (a.k.a. string) - ("k", "R is exciting", "FALSE", "11.5")
• logical (a.k.a. boolean) - (TRUE or FALSE)

Important: We can use the class() function to check the data type of a
variable:
Example

# numeric
x <- 10.5
class(x)

# integer
x <- 1000L
class(x)

# complex
x <- 9i + 3
class(x)

# character/string
x <- "R is exciting"
class(x)

# logical/boolean
x <- TRUE
class(x)
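
Two details from the list above are easy to miss: a whole number written
without the "L" suffix is still numeric, and a quoted number is
character. The following sketch checks both and shows explicit
conversion with the as.*() functions; the variable names are chosen only
for this example.

```r
x <- 55        # no "L" suffix, so this is a double
y <- 55L       # integer
z <- "11.5"    # character, even though it looks like a number

print(class(x))    # "numeric"
print(typeof(x))   # "double"
print(class(y))    # "integer"

# Explicit conversion with the as.*() family
z_num <- as.numeric(z)
x_int <- as.integer(10.9)   # truncates toward zero, does not round

print(is.character(z))
print(z_num + 1)
print(x_int)
```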

Assigning variables in R :-
In computer programming, a variable is a named memory location where
data is stored. For example,
x = 13.8

Here, x is the variable where the data 13.8 is stored. Now, whenever we
use x in our program, we will get 13.8.

x = 13.8

# print variable
print(x)

Output

[1] 13.8

As you can see, when we print x we get 13.8 as output.
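
The example above uses = for assignment, while other listings in these
notes use <-. Both work at the top level of a script. A short sketch of
the assignment forms available in base R, with variable names chosen
only for illustration:

```r
x <- 13.8          # leftward arrow, the form most R style guides prefer
y = 13.8           # equals sign, also valid at the top level
13.8 -> z          # rightward arrow
assign("w", 13.8)  # assignment with the name given as a string

print(c(x, y, z, w))
print(identical(x, y))
```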

Rules to Declare R Variables

A variable can have a short name (like x and y) or a more descriptive
name (age, carname, total_volume). Rules for R variables are:
• A variable name must start with a letter or a period (.) and can be
a combination of letters, digits, periods (.) and underscores (_). If it
starts with a period, the period cannot be followed by a digit.
• A variable name cannot start with a number or an underscore (_).
• Variable names are case-sensitive (age, Age and AGE are three
different variables).
• Reserved words cannot be used as variable names (TRUE, FALSE,
NULL, if...).

# Legal variable names:

myvar <- "John"
my_var <- "John"
myVar <- "John"
MYVAR <- "John"
myvar2 <- "John"
.myvar <- "John"
# Illegal variable names:
2myvar <- "John"
my-var <- "John"
my var <- "John"
_my_var <- "John"
my_v@ar <- "John"
TRUE <- "John"
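
The rules can also be checked programmatically. Base R's make.names()
rewrites a string into a syntactically valid name, so a common idiom is
to treat a name as valid when make.names() leaves it unchanged. The
helper is_valid_name below is a hypothetical name introduced for this
sketch, not a built-in function.

```r
# A name is treated as valid when make.names() does not need to change it
is_valid_name <- function(name) {
  make.names(name) == name
}

candidates <- c("myvar", "my_var", ".myvar", "2myvar", "my-var",
                "my var", "_my_var", ".2var", "TRUE")
validity <- vapply(candidates, is_valid_name, logical(1))
print(validity)
```

The first three candidates come out TRUE and the rest FALSE, matching
the legal and illegal lists above.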

Types of R Variables

Depending on the type of data that you want to store, variables can be
divided into the following types.

1. Boolean Variables

It stores a logical value, which is either TRUE or FALSE.

Here, TRUE means yes and FALSE means no. For example,

a = TRUE

print(a)
print(class(a))

Output

[1] TRUE
[1] "logical"

Here, we have declared the boolean variable a with the value TRUE.
Boolean variables belong to the logical class so class(a) returns "logical".

2. Integer Variables

It stores numeric data without any decimal values. For example,

A = 14L

print(A)
print(class(A))

Output

[1] 14
[1] "integer"

Here, L marks the value as an integer. In R, integer variables belong to
the integer class so class(A) returns "integer".

3. Floating Point Variables

It stores numeric data with decimal values. For example,

x = 13.4

print(x)
print(class(x))

Output

[1] 13.4
[1] "numeric"

Here, we have created a floating point variable named x. You can see that
the floating point variable belongs to the numeric class.

4. Character Variables

It stores a single character. R has no separate type for single
characters, so such a value is simply a character string of length one.
For example,

alphabet = "a"

print(alphabet)
print(class(alphabet))

Output

[1] "a"
[1] "character"

Here, we have created a character variable named alphabet. Since
character variables belong to the character class, class(alphabet)
returns "character".

5. String Variables

It stores data that is composed of more than one character. We use double
quotes to represent string data. For example,

message = "Welcome to Programiz!"

print(message)
print(class(message))

Output

[1] "Welcome to Programiz!"
[1] "character"

Here, we have created a string variable named message. You can see that
the string variable also belongs to the character class.

Changing Value of Variables

Depending on the conditions or information passed into the program, you
can change the value of a variable. For example,

message = "Hello World!"
print(message)

# changing value of a variable
message <- "Welcome to Programiz!"

print(message)

Output

[1] "Hello World!"
[1] "Welcome to Programiz!"

In this program,

• "Hello World!" - initial value of message
• "Welcome to Programiz!" - changed value of message

You can see that the value of a variable can be changed anytime.
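
Reassignment can change not only the value but also the type, because in
R the type belongs to the stored object rather than to the variable
name. A minimal sketch, with the variable name value chosen only for
this example:

```r
value <- 14L
first_class <- class(value)    # "integer"

value <- 13.4                  # same name, now bound to a double
second_class <- class(value)   # "numeric"

value <- "thirteen"            # and now bound to a character string
third_class <- class(value)    # "character"

print(c(first_class, second_class, third_class))
```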

Basic Operations

Once you have a vector (or a list of numbers) in memory, most basic
operations are available. Most of them act on a whole vector and can be
used to quickly perform a large number of calculations with a single
command. One thing to note: if you perform an operation on more than one
vector, the vectors should usually all contain the same number of
entries.

Here we first define a vector which we will call "a" and will look at how
to add and subtract constant numbers from all of the numbers in the
vector. First, the vector will contain the numbers 1, 2, 3, and 4. We then
see how to add 5 to each of the numbers, subtract 10 from each of the
numbers, multiply each number by 4, and divide each number by 5.

> a <- c(1,2,3,4)
> a
[1] 1 2 3 4
> a + 5
[1] 6 7 8 9
> a - 10
[1] -9 -8 -7 -6
> a*4
[1] 4 8 12 16
> a/5
[1] 0.2 0.4 0.6 0.8
We can save the results in another vector called b:

> b <- a - 10
> b
[1] -9 -8 -7 -6
If you want to take the square root, find e raised to each number, the
logarithm, etc., then the usual commands can be used:

> sqrt(a)
[1] 1.000000 1.414214 1.732051 2.000000
> exp(a)
[1] 2.718282 7.389056 20.085537 54.598150
> log(a)
[1] 0.0000000 0.6931472 1.0986123 1.3862944
> exp(log(a))
[1] 1 2 3 4
By combining operations and using parentheses you can make more
complicated expressions:

> c <- (a + sqrt(a))/(exp(2)+1)
> c
[1] 0.2384058 0.4069842 0.5640743 0.7152175

Note that you can do the same operations with vector arguments. For
example, to add the elements in vector a to the elements in vector b use
the following command:
> a + b
[1] -8 -6 -4 -2
The operation is performed on an element by element basis. Note this is
true for almost all of the basic functions. So you can bring together all
kinds of complicated expressions:

> a*b
[1] -9 -16 -21 -24
> a/b
[1] -0.1111111 -0.2500000 -0.4285714 -0.6666667
> (a+3)/(sqrt(1-b)*2-1)
[1] 0.7512364 1.0000000 1.2884234 1.6311303
You need to be careful of one thing. When you do operations on vectors
they are performed on an element by element basis. One ramification of
this is that all of the vectors in an expression should be the same
length. If the lengths differ, R recycles the elements of the shorter
vector. When the longer length is not a multiple of the shorter one, you
get a warning message and results that are probably not what you
intended:

> a <- c(1,2,3)
> b <- c(10,11,12,13)
> a+b
[1] 11 13 15 14
Warning message:
longer object length is not a multiple of shorter object length in: a + b
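
Recycling is silent when the longer length is an exact multiple of the
shorter one, which makes mistakes harder to spot. The sketch below shows
that case and one way to guard against it; add_same_length is a
hypothetical helper written for this example, not a base R function.

```r
a <- c(1, 2, 3, 4)
b <- c(10, 20)

# b is recycled as 10, 20, 10, 20; no warning because 4 is a multiple of 2
recycled <- a + b
print(recycled)          # 11 22 13 24

# A length check makes a mismatch fail loudly instead of recycling silently
add_same_length <- function(x, y) {
  stopifnot(length(x) == length(y))
  x + y
}
print(add_same_length(a, c(10, 20, 30, 40)))   # 11 22 33 44
```

Calling add_same_length(a, b) stops with an error rather than returning
a recycled result.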

As you work in R and create new vectors it can be easy to lose track of
what variables you have defined. To get a list of all of the variables that
have been defined use the ls() command:

> ls()
[1] "a" "b" "bubba" "c" "last.warning"
[6] "tree" "trees"

Finally, you should keep in mind that the basic operations almost always
work on an element by element basis. There are rare exceptions to this
general rule. For example, if you look at the minimum of two vectors
using the min command you will get the minimum of all of the numbers.
There is a special command, called pmin, that may be the command you
want in some circumstances:

> a <- c(1,-2,3,-4)
> b <- c(-1,2,-3,4)
> min(a,b)
[1] -4
> pmin(a,b)
[1] -1 -2 -3 -4

Important: pmin computes parallel (element by element) minima and pmax
computes parallel maxima.
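
The same distinction holds for max and pmax. The sketch below reuses the
two vectors from the listing above and adds one common use of the
parallel versions, limiting values to a range; the range of -2 to 2 is
an arbitrary choice for illustration.

```r
a <- c(1, -2, 3, -4)
b <- c(-1, 2, -3, 4)

overall_max  <- max(a, b)    # single largest value across both vectors: 4
parallel_max <- pmax(a, b)   # element by element maxima: 1 2 3 4

# pmin and pmax together clamp every element into the range [-2, 2]
clamped <- pmin(pmax(a, -2), 2)

print(overall_max)
print(parallel_max)
print(clamped)               # 1 -2 2 -2
```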

