Unit3 R

The document covers statistical concepts and functions in R, including mean, median, mode, variance, covariance, and correlation, along with their respective syntax and examples. It also discusses basic data visualization techniques using various R packages and functions for creating different types of charts such as bar plots, histograms, pie charts, and scatter plots. Additionally, it introduces common probability distributions, particularly the normal distribution, and the built-in R functions for generating and analyzing these distributions.

Uploaded by

nivassivakumar887

BCA V R

UNIT-3
Statistics and Probability
Mean, Median, Mode
Mean: The mean is calculated by summing the values in a data series and
dividing by the number of values.
Syntax:
The basic syntax for calculating mean in R is −
mean(x, trim = 0, na.rm = FALSE, ...)
Following is the description of the parameters used −
• x is the input vector.
• trim is the fraction (0 to 0.5) of observations to drop from each end of the
sorted vector before the mean is computed.
• na.rm is a logical value; if TRUE, missing values (NA) are removed from the
input vector before the mean is computed.
Example:
# Create a vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)
# Find Mean.
result.mean <- mean(x)
print(result.mean)
Output:
[1] 8.22
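The trim and na.rm parameters described above are not used in the example, so here is a minimal sketch of their effect on the same vector. The variable names trimmed_mean, x_with_na and na_mean are only illustrative.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

# trim = 0.1 drops 10% of the values (here one value) from each end
# of the sorted vector, so -21 and 54 are excluded.
trimmed_mean <- mean(x, trim = 0.1)
print(trimmed_mean)   # 6.15

# A missing value makes the plain mean NA; na.rm = TRUE ignores it.
x_with_na <- c(x, NA)
print(mean(x_with_na))                     # NA
na_mean <- mean(x_with_na, na.rm = TRUE)
print(na_mean)                             # 8.22
```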
Median: The middle value of a sorted data series is called the median. The
median() function is used in R to calculate this value.
Syntax
The basic syntax for calculating median in R is −
median(x, na.rm = FALSE)
Following is the description of the parameters used −
• x is the input vector.
• na.rm is a logical value; if TRUE, missing values are removed from the input
vector.
Example
# Create the vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)
# Find the median.
median.result <- median(x)
print(median.result)
Output:
[1] 5.6

Mode: The mode is the value that occurs most often in a set of data. Unlike the
mean and median, the mode can be found for both numeric and character data.
R does not have a standard built-in function to calculate the mode.
Finding a mode is perhaps most easily achieved by using R's table function,
which gives you the frequencies you need.
Example:
R> xdata <- c(2,4.4,3,3,2,2.2,2,4)
R> xtab <- table(xdata)
R> xtab
xdata
  2 2.2   3   4 4.4
  3   1   2   1   1
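The frequency table shows that 2 occurs most often. As a small sketch (one possible approach, not a built-in function), the mode can be pulled out of the table programmatically. The name mode_values is illustrative.

```r
xdata <- c(2, 4.4, 3, 3, 2, 2.2, 2, 4)
xtab <- table(xdata)

# Keep every value whose count equals the largest count,
# so that ties (several modes) are all reported.
mode_values <- as.numeric(names(xtab)[xtab == max(xtab)])
print(mode_values)   # 2
```

For character data, the as.numeric() conversion would simply be left out.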

The min and max functions will report the smallest and largest values, with range
returning both in a vector of length 2.
R> min(xdata)
[1] 2
R> max(xdata)
[1] 4.4
R> range(xdata)
[1] 2.0 4.4
tapply() function
The tapply() function computes a statistical measure (mean, median, min, max,
etc.) or applies a user-written function to the values of a vector, separately
for each level of a factor.
Syntax: tapply(X, INDEX, FUN)
• X: the input vector or object.
• INDEX: the factor whose levels define the groups.
• FUN: the function to be applied to each group.
R> tapply(chickwts$weight, INDEX=chickwts$feed, FUN=function(x) length(x)/nrow(chickwts))
casein horsebean linseed meatmeal soybean sunflower
0.1690141 0.1408451 0.1690141 0.1549296 0.1971831 0.1690141
The round function rounds numeric output to a given number of decimal places.
R> round(table(chickwts$feed)/nrow(chickwts),digits=3)
casein horsebean linseed meatmeal soybean sunflower
0.169 0.141 0.169 0.155 0.197 0.169
Quantiles, Percentiles, and the Five-Number Summary: A quantile is a value
computed from a collection of numeric measurements that indicates an
observation's rank when compared to all the other observations. For example,
the median is itself a quantile: it gives you a value below which half of the
measurements lie, so it is the 0.5th quantile. Alternatively, quantiles can be
expressed as percentiles; this is identical but on a “percent scale” of 0 to 100.
quantile function:
Syntax: quantile(x, probs)
x: Data set
probs: Vector of probabilities between 0 and 1
Example:
R> xdata <- c(2,4.4,3,3,2,2.2,2,4)
R> quantile(xdata,probs=0.8)
80%
3.6
Summary function: The summary function reports the minimum, first quartile,
median, mean, third quartile, and maximum in a single call.
R> summary(xdata)
Min. 1st Qu. Median Mean 3rd Qu. Max.
2.000 2.000 2.600 2.825 3.250 4.400
Variance: The sample variance is the average squared distance of the
observations from their mean, computed with n − 1 in the denominator.
The standard deviation is simply the square root of the variance.
The interquartile range (IQR) measures the width of the “middle 50 percent” of
the data, that is, the distance between the first and third quartiles.
The direct R commands for computing these measures of spread are
var (variance), sd (standard deviation), and IQR (interquartile range).
R> var(xdata)
[1] 0.9078571
R> sd(xdata)
[1] 0.9528154
R> IQR(xdata)
[1] 1.25
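To connect these commands to their definitions, the following sketch recomputes each measure by hand for the same xdata and compares it with the built-in result. The names manual_var, manual_sd and manual_iqr are illustrative.

```r
xdata <- c(2, 4.4, 3, 3, 2, 2.2, 2, 4)
n <- length(xdata)

# Sample variance: sum of squared deviations divided by n - 1.
manual_var <- sum((xdata - mean(xdata))^2) / (n - 1)

# Standard deviation: square root of the variance.
manual_sd <- sqrt(manual_var)

# IQR: third quartile minus first quartile.
q <- quantile(xdata, probs = c(0.25, 0.75))
manual_iqr <- unname(q[2] - q[1])

print(c(manual_var, manual_sd, manual_iqr))
print(c(var(xdata), sd(xdata), IQR(xdata)))   # same three values
```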
Covariance and Correlation
• The covariance expresses how much two numeric variables “change
together” and the nature of that relationship, whether it is positive or
negative.
• Correlation allows you to interpret the covariance further by identifying
both the direction and the strength of any association.
R> xdata <- c(2,4.4,3,3,2,2.2,2,4)
R> ydata <- c(1,4.4,1,3,2,2.2,2,7)
R> cov(xdata,ydata)
[1] 1.479286
R> cor(xdata,ydata)
[1] 0.7713962
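The link between the two measures can be made explicit: Pearson's correlation is the covariance divided by the product of the two standard deviations. A short sketch on the same data (manual_cor is an illustrative name):

```r
xdata <- c(2, 4.4, 3, 3, 2, 2.2, 2, 4)
ydata <- c(1, 4.4, 1, 3, 2, 2.2, 2, 7)

# Correlation = covariance rescaled by both standard deviations,
# which puts the result on a unit-free scale from -1 to 1.
manual_cor <- cov(xdata, ydata) / (sd(xdata) * sd(ydata))

print(manual_cor)
print(cor(xdata, ydata))   # same value
```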
BASIC DATA VISUALIZATION
R Visualization Packages
1) plotly
The plotly package provides interactive, high-quality graphs. This package
builds on the JavaScript library plotly.js.
2) ggplot2
The ggplot2 package allows us to create graphics declaratively. It is known for
its elegant, high-quality graphs, which sets it apart from other visualization
packages.
3) tidyquant
The tidyquant package is used for carrying out quantitative financial analysis.
It fits into the tidyverse as a financial package for importing, analyzing, and
visualizing data.
4) taucharts
The taucharts library is data-focused: it provides a declarative interface for
rapid mapping of data fields to visual properties.
5) ggiraph
It is a tool that allows us to create dynamic ggplot graphs. This package allows
us to add tooltips, JavaScript actions, and animations to the graphics.
6) geofacet
This package provides geofaceting functionality for 'ggplot2'. Geofaceting
arranges a sequence of plots for different geographical entities into a grid that
preserves some of the geographical orientation.
7) googleVis
googleVis provides an interface between R and Google's charts tools. With the
help of this package, we can create web pages with interactive charts based on R
data frames.
8) RColorBrewer
This package provides color schemes for maps and other graphics, which are
designed by Cynthia Brewer.
9) dygraphs
The dygraphs package is an R interface to the dygraphs JavaScript charting
library. It provides rich features for charting time-series data in R.
10) shiny
The shiny package allows us to develop interactive and aesthetically pleasing
web apps in R. Apps can be extended with HTML widgets, CSS, and JavaScript.
barplot(): R uses the barplot() function to create bar charts. Both vertical
and horizontal bars can be drawn.
Syntax:
barplot(H, xlab, ylab, main, names.arg, col)
Parameters:
H: This parameter is a vector or matrix containing numeric values which are used
in bar chart.
xlab: This parameter is the label for x axis in bar chart.
ylab: This parameter is the label for y axis in bar chart.
main: This parameter is the title of the bar chart.
names.arg: This parameter is a vector of names appearing under each bar in bar
chart.
col: This parameter is used to give colors to the bars in the graph.
Example:
# Create the data for the chart
A <- c(17, 32, 8, 53, 1)
# Plot the bar chart
barplot(A, xlab = "X-axis", ylab = "Y-axis", main ="Bar-Chart")
Creating a Horizontal Bar Chart in R

To create a horizontal bar chart:
• Take all the parameters required to make a simple bar chart.
• Add the parameter horiz = TRUE to make the bars horizontal.
barplot(A, horiz=TRUE)
Example:
barplot(A, horiz = TRUE, xlab = "X-axis", ylab = "Y-axis", main = "Horizontal Bar Chart")

R Histogram
A histogram is a type of bar chart that shows how many values fall into each of
a set of value ranges (bins). A histogram is used to show the distribution of a
single variable, whereas a bar chart is used for comparing different entities.
Syntax
The basic syntax for creating a histogram using R is −
hist(v,main,xlab,xlim,ylim,breaks,col,border)
• v is a vector containing numeric values used in histogram.
• main indicates title of the chart.
• col is used to set color of the bars.
• border is used to set border color of each bar.
• xlab is used to give description of x-axis.
• xlim is used to specify the range of values on the x-axis.
• ylim is used to specify the range of values on the y-axis.
• breaks is used to set the number of bins or the break points between them,
which determines the width of each bar.
Example
# Creating data for the graph.
v <- c(12,24,16,38,21,13,55,17,39,10,60)
# Giving a name to the chart file.
png(file = "histogram_chart.png")
# Creating the histogram.
hist(v,xlab = "Weight",ylab="Frequency",col = "green",border = "red")
# Saving the file.
dev.off()
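To see what the breaks parameter does without opening a graphics device, hist() can be called with plot = FALSE, which returns the bin boundaries and counts instead of drawing. The sketch below uses the same vector v with explicit break points every 10 units; the choice of break points is an assumption made for illustration.

```r
v <- c(12, 24, 16, 38, 21, 13, 55, 17, 39, 10, 60)

# breaks given as a vector are used as the exact bin boundaries;
# plot = FALSE returns the histogram object without drawing it.
h <- hist(v, breaks = seq(0, 60, by = 10), plot = FALSE)

print(h$breaks)   # 0 10 20 30 40 50 60
print(h$counts)   # 1 4 2 2 0 2
```

Each bin is closed on the right by default, so the value 10 is counted in the first bin (0 to 10) and 60 in the last.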
R Pie Charts
A pie chart is a representation of values in the form of slices of a circle with
different colors. Pie charts are created with the pie() function.
Syntax:
pie(x, labels, radius, main, col, clockwise)
• x is a vector containing the numeric values used in the pie chart.
• labels is used to give description to the slices.
• radius indicates the radius of the circle of the pie chart (value between −1
and +1).
• main indicates the title of the chart.
• col indicates the color palette.
• clockwise is a logical value indicating if the slices are drawn clockwise or
anticlockwise.
Example:
# Create data for the graph.
x <- c(21, 62, 10, 53)
labels <- c("London", "New York", "Singapore", "Mumbai")
# Plot the chart.
pie(x,labels)

R - Line Chart
A line chart is a graph that connects a series of points by drawing line segments
between them. The plot() function in R is used to create the line graph.
Syntax
plot(v,type,col,xlab,ylab,main)
• v is a vector containing the numeric values.
• type takes the value "p" to draw only the points, "l" to draw only the lines
and "o" to draw both points and lines.
• xlab is the label for x axis.
• ylab is the label for y axis.
• main is the title of the chart.
• col is used to give colors to both the points and lines.
Example:
v <- c(7,12,28,3,41)
# Plot the line chart.
plot(v,type = "o")
R – Boxplots
A boxplot summarizes how the values in a data set are distributed. The
quartiles divide the data into four equal parts. The graph shows the minimum,
first quartile, median, third quartile, and maximum of the data set.
Syntax
boxplot(x, data, notch, varwidth, names, main)
• x is a vector or a formula.
• data is the data frame.
• notch is a logical value. Set as TRUE to draw a notch.
• varwidth is a logical value. Set as TRUE to draw the width of each box
proportionate to the sample size.
• names are the group labels which will be printed under each boxplot.
• main is used to give a title to the graph.
Example
data <- data.frame(Group_A=c(25,28,30,32,35,37,38,39,40,41,42),
                   Group_B=c(22,24,26,29,31,33,36,37,38,40,43))
boxplot(data, main="Boxplot of Group A and B", xlab="Groups", ylab="Values",
        col=c("lightblue","lightgreen"), border="black")
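The five values a boxplot draws can also be inspected numerically. As a sketch, fivenum() returns the minimum, lower hinge, median, upper hinge and maximum, and boxplot.stats() returns the statistics used for the box and whiskers. The example uses the Group_A values from above; the variable names are illustrative.

```r
group_a <- c(25, 28, 30, 32, 35, 37, 38, 39, 40, 41, 42)

# Minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(group_a)
print(five)    # 25.0 31.0 37.0 39.5 42.0

# The same five statistics as used when drawing the box and whiskers
# (identical here because the data contain no outliers).
stats <- boxplot.stats(group_a)$stats
print(stats)
```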

R – Scatterplots
Scatterplots show many points plotted in the Cartesian plane. Each point
represents the values of two variables. One variable is chosen in the horizontal
axis and another in the vertical axis.
Syntax
plot(x, y, main, xlab, ylab, xlim, ylim, axes)
• x is the data set whose values are the horizontal coordinates.
• y is the data set whose values are the vertical coordinates.
• main is the title of the graph.
• xlab is the label in the horizontal axis.
• ylab is the label in the vertical axis.
• xlim is the limits of the values of x used for plotting.
• ylim is the limits of the values of y used for plotting.
• axes indicates whether both axes should be drawn on the plot.
Example:
> x <- 1:10
> y <- c(2,4,5,7,8,10,11,13,14,16)
> plot(x, y, main="Scatterplot example", xlab="X-axis", ylab="Y-axis", col="blue", pch=16, xlim=c(0,11), ylim=c(0,17))

Common probability distributions

In R, the name of each distribution function starts with a one-letter prefix:
‘d’ for the probability density (or mass) function, ‘p’ for the cumulative
distribution function, ‘q’ for the inverse cumulative distribution (quantile)
function, and ‘r’ for functions that generate random values.
Normal Distribution
The normal distribution is a probability distribution used in statistics that
describes how data values are distributed. It is the most important probability
distribution in statistics because many real-world quantities are approximately
normal, for example the heights of a population, shoe sizes, and IQ scores.
The normal distribution (Gaussian distribution) is defined by its probability
density function, where μ is the population mean and σ^2 is the variance. It is
represented as N(μ, σ^2).
In R, there are 4 built-in functions for working with the normal distribution:
• dnorm()
• pnorm()
• qnorm()
• rnorm()
dnorm()
The dnorm() function in R gives the value of the density function of the
distribution. In statistics, it is given by the formula below:
f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}

Syntax :
dnorm(x, mean, sd)
where,
– x represents the vector of values at which the density is evaluated
– mean represents the mean of the distribution. Its default value is 0
– sd represents the standard deviation of the distribution. Its default value is 1
Example:
# creating a sequence of values
# between -15 to 15 with a difference of 0.1
x = seq(-15, 15, by=0.1)

y = dnorm(x, mean(x), sd(x))

# Plot the graph.
plot(x, y)

pnorm()
The pnorm() function is the cumulative distribution function, which gives the
probability that a random variable X takes a value less than or equal to x. In
statistics it is given by:
F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt

Syntax:
pnorm(q, mean, sd, lower.tail)
– q represents the vector of values (quantiles) at which the probability is
evaluated
– mean represents the mean of the distribution. Its default value is 0
– sd represents the standard deviation of the distribution. Its default value is 1
– lower.tail is a logical value indicating whether to compute the lower tail
probability P(X ≤ x). Its default value is TRUE
Example
# creating a sequence of values
# between -10 to 10 with a difference of 0.1
x <- seq(-10, 10, by=0.1)

y <- pnorm(x, mean = 2.5, sd = 2)
plot(x, y)

qnorm()
The qnorm() function is the inverse of the pnorm() function. It takes a
probability and returns the value whose cumulative probability equals it. It is
useful in finding the percentiles of a normal distribution.
Syntax:
qnorm(p, mean, sd)
– p is a vector of probabilities
– mean represents the mean of the distribution. Its default value is 0
– sd represents the standard deviation of the distribution. Its default value is 1
Example:
# Create a sequence of probability values
# incrementing by 0.02.
x <- seq(0, 1, by = 0.02)
y <- qnorm(x, mean(x), sd(x))
plot(x, y)
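The inverse relationship between pnorm() and qnorm() is easiest to see on single values. The following sketch uses the standard normal distribution (mean 0, sd 1); the variable names are illustrative.

```r
# Probability that a standard normal value is at most 1.96.
p <- pnorm(1.96)
print(p)                  # about 0.975

# qnorm() undoes pnorm(): it returns the value with that cumulative probability.
x_back <- qnorm(p)
print(x_back)             # 1.96

# Probability of falling within one standard deviation of the mean.
within_one_sd <- pnorm(1) - pnorm(-1)
print(within_one_sd)      # about 0.683

# lower.tail = FALSE gives the upper tail probability P(X > 1.96).
upper <- pnorm(1.96, lower.tail = FALSE)
print(upper)              # about 0.025
```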

rnorm()
The rnorm() function in R is used to generate a vector of normally distributed
random numbers.
Syntax:
rnorm(n, mean, sd)
– n represents the number of random values to generate
– mean represents the mean of the distribution. Its default value is 0
– sd represents the standard deviation of the distribution. Its default value is 1
Example
# Create a vector of 10000 random numbers
# with mean=90 and sd=5
x <- rnorm(10000, mean=90, sd=5)
# Create the histogram with about 50 bars
hist(x, breaks=50)

Poisson Distribution
The Poisson distribution gives the probability of a given number of events
occurring in a fixed interval of time or space, if these events occur with a
known constant mean rate and independently of the time since the last event.
The probability mass function of the Poisson distribution is:
P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}
Where:
• X is a random variable following a Poisson distribution
• k is the number of times an event occurs
• P(X = k) is the probability that an event will occur k times
• e is Euler's number (approximately 2.718)
• λ is the average number of times an event occurs
• ! is the factorial function

There are four Poisson functions available in R:

• dpois
• ppois
• qpois
• rpois
dpois()
The function dpois() calculates the probability that a Poisson random variable
takes exactly a given value, P(X = k).
Syntax:
dpois(k, lambda, log)
where,
• k: number of events that occur in an interval
• lambda: mean number of events per interval
• log: if TRUE, the function returns the logarithm of the probability
Example:
dpois(2, 3)
dpois(6, 6)

Output:
[1] 0.2240418
[1] 0.1606231

ppois()
The function ppois() calculates the probability that a Poisson random variable
is less than or equal to a given number, P(X ≤ q).
Syntax:
ppois(q, lambda, lower.tail, log.p)
where,
• q: number of events that occur in an interval
• lambda: mean number of events per interval
• lower.tail: if TRUE (the default), the left tail P(X ≤ q) is returned; if
FALSE, the right tail P(X > q) is returned
• log.p: if TRUE, the function returns the logarithm of the probability
Example:
ppois(2, 3)
ppois(6, 6)
Output:
[1] 0.4231901
[1] 0.6063028
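As a check on how dpois() and ppois() relate to the Poisson probability mass function, the sketch below recomputes dpois(2, 3) and ppois(2, 3) by hand. The names manual_pmf and manual_cdf are illustrative.

```r
lambda <- 3
k <- 2

# Probability mass function: lambda^k * e^(-lambda) / k!
manual_pmf <- lambda^k * exp(-lambda) / factorial(k)
print(manual_pmf)        # same as dpois(2, 3)

# Cumulative probability: sum of the mass at 0, 1, ..., k.
manual_cdf <- sum(dpois(0:k, lambda))
print(manual_cdf)        # same as ppois(2, 3)
```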
qpois()
The function qpois() returns quantiles of a given Poisson distribution. In
probability, quantiles are cut points that divide the range of a probability
distribution into intervals with equal probabilities. For a probability p,
qpois() returns the smallest integer k such that P(X ≤ k) ≥ p.
Syntax:
qpois(p, lambda, lower.tail, log.p)
where,
• p: vector of probabilities
• lambda: mean number of events per interval
• lower.tail: if TRUE (the default), probabilities refer to the left tail;
if FALSE, to the right tail
• log.p: if TRUE, the probabilities p are given as logarithms
Example
y <- c(.01, .05, .1, .2)
qpois(y, 2)
qpois(y, 6)
Output:
[1] 0 0 0 1
[1] 1 2 3 4

rpois()
The function rpois() is used for generating random numbers from a given
Poisson distribution.
Syntax:
rpois(n, lambda)
where,
• n: number of random numbers needed
• lambda: mean number of events per interval
Example
rpois(2, 3)
rpois(6, 6)
Output (random, so your values will differ):
[1] 2 3
[1] 6 7 6 10 9 4

Binomial Distribution
The binomial distribution model deals with finding the probability of a given
number of successes in a series of independent experiments, each of which has
only two possible outcomes. For example, a coin toss always gives a head or a
tail. The probability of getting exactly 3 heads when tossing a coin 10 times
is given by the binomial distribution.
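The coin example can be worked through directly. This sketch, assuming a fair coin (probability 0.5 of heads), computes the probability of exactly 3 heads in 10 tosses both with dbinom() and from the binomial formula; the variable names are illustrative.

```r
# Probability of exactly 3 heads in 10 tosses of a fair coin.
p_three_heads <- dbinom(3, size = 10, prob = 0.5)
print(p_three_heads)     # 0.1171875

# The same value from the formula choose(n, k) * p^k * (1 - p)^(n - k).
manual <- choose(10, 3) * 0.5^3 * (1 - 0.5)^7
print(manual)            # 0.1171875
```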

R has four built-in functions for working with the binomial distribution. They
are described below.
dbinom(x, size, prob)
pbinom(x, size, prob)
qbinom(p, size, prob)
rbinom(n, size, prob)
Following is the description of the parameters used −
• x is a vector of numbers (numbers of successes).
• p is a vector of probabilities.
• n is number of observations.
• size is the number of trials.
• prob is the probability of success of each trial.
dbinom()
This function gives the probability mass at each point.
Example
# Create a sequence of the 51 integers from 0 to 50.
x <- seq(0,50,by = 1)
# Create the binomial distribution.
y <- dbinom(x,50,0.5)
# Plot the graph for this sample.
plot(x,y)
pbinom()
This function gives the cumulative probability of an event, that is, the
probability of obtaining at most a given number of successes.
Example
# Probability of getting 26 or fewer heads from 51 tosses of a coin.
x <- pbinom(26,51,0.5)
print(x)
When we execute the above code, it produces the following result –
[1] 0.610116
qbinom()
This function takes a probability value and gives the smallest number whose
cumulative probability reaches that value.
Example
# Find the number of heads whose cumulative probability reaches 0.25
# when a coin is tossed 51 times.
x <- qbinom(0.25,51,1/2)
print(x)
When we execute the above code, it produces the following result −
[1] 23
rbinom()
This function generates the required number of random values from a binomial
distribution with a given number of trials and probability of success.
Example
# Generate 8 random values, each the number of successes in 150 trials
# with success probability 0.4.
x <- rbinom(8,150,.4)
print(x)
When we execute the above code, it produces a result like the following (the
values are random, so yours will differ) −
[1] 58 61 59 66 55 60 61 67

Continuous uniform distribution in R

A uniform distribution is a probability distribution in which every value between
an interval from a to b is equally likely to be chosen. The probability that we will
obtain a value between x1 and x2 on an interval from a to b can be found using
the formula:
P(obtain value between x1 and x2)=(x2-x1)/(b-a)
The uniform distribution has the following properties:
• The mean of the distribution is μ = (a + b) / 2
• The variance of the distribution is σ^2 = (b − a)^2 / 12
• The distribution’s standard deviation, or SD, is σ = √(σ^2)
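The probability formula and the properties above can be checked numerically. The sketch below assumes an interval from a = 0 to b = 60 and the sub-interval from 15 to 30 (values chosen only for illustration), and compares the formulas with punif() and with a large simulated sample.

```r
a <- 0; b <- 60
x1 <- 15; x2 <- 30

# P(x1 < X < x2) from the formula and from the CDF.
p_formula <- (x2 - x1) / (b - a)
p_punif <- punif(x2, min = a, max = b) - punif(x1, min = a, max = b)
print(c(p_formula, p_punif))     # 0.25 0.25

# Theoretical mean and variance.
mu <- (a + b) / 2                # 30
sigma2 <- (b - a)^2 / 12         # 300

# A large simulated sample should come close to both.
set.seed(1)
s <- runif(100000, min = a, max = b)
print(c(mean(s), var(s)))
```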
The dunif() function in R gives the density function of the uniform
distribution on the specified interval (a, b).
Syntax:
dunif(x, min = 0, max = 1, log = FALSE)

Parameter:
• x: input sequence
• min, max: range of values
• log: logical; if TRUE, the densities are returned as logarithms
A density value is returned for each element of x, so the result is a vector of
the same length as x.
Example 1:
# generating a sequence of values
x <- 5:10
print ("dunif value")

# calculating density function

dunif(x, min = 1, max = 20)
Output
[1] "dunif value"
[1] 0.05263158 0.05263158 0.05263158 0.05263158 0.05263158 0.05263158
All values are equal and this is the reason why it is called uniform distribution.
Let us plot it for a better picture.
Example 2:
min <- 0
max <- 100
# Specify x-values for dunif function
xpos <- seq(min, max, by = 0.5)
# supplying corresponding y coordinates
ypos <- dunif(xpos, min = 10, max = 80)
# plotting the graph
plot(xpos, ypos, type="o")

The punif() function in R gives the uniform cumulative distribution function,
that is, the probability of a variable X taking a value less than or equal to x
(P(X <= x)). If we need the probability P(X > x), we can calculate
1 – punif(x).

Syntax:
punif(q, min = 0, max = 1, lower.tail = TRUE)

For the uniform distribution this probability is the fraction of the interval
from min to max that lies at or below q.
Example:
min <- 0
max <- 60
# calculating punif value
punif(15, min = min, max = max)
Output
[1] 0.25
Example 2:
# Grid of X-axis values
x <- seq(-0.5, 1.5, 0.01)
# Uniform distribution between 0 and 1
plot(x, punif(x), type = "l", main = "Uniform CDF", ylab = "F(x)", lwd = 2, col = "red")

The qunif() function returns the quantile corresponding to any probability (p)
for a given uniform distribution. To use it, simply call the function with the
required parameters.

Syntax:
qunif(p, min = 0, max = 1)

Parameter :
• p – The vector of probabilities
• min, max – The limits of the distribution
Example
min <- 0
max <- 1
# Specify x-values for qunif function
xpos <- seq(min, max, by = 0.02)
# supplying corresponding y coordinates
ypos <- qunif(xpos, min = 10, max = 100)
# plotting the graph
plot(xpos,ypos)

The runif() function in R is used to generate a sequence of random numbers
following the uniform distribution.
Syntax:
runif(n, min = 0, max = 1)
Parameter:
• n = number of random samples
• min = minimum value (by default 0)
• max = maximum value (by default 1)
Example 1:
print("Random 15 numbers between 1 and 3")
runif(15, min=1, max=3)
Output (random, so your values will differ)
[1] "Random 15 numbers between 1 and 3"
[1] 1.534 1.772 1.027 1.765 2.739 1.681 1.964 2.199 1.987 1.372 2.655 2.337
2.588 1.216 2.447
Example 2:
# n = 1000
# Grid of x values for the density curve
x <- seq(-0.5, 1.5, 0.01)
hist(runif(1000), main = "n = 1000", xlim = c(-0.2, 1.25),
     xlab = "", prob = TRUE)
lines(x, dunif(x), col = "red", lwd = 2)

Bernoulli Distribution

The Bernoulli distribution is a special case of the binomial distribution where
only a single trial is performed (n = 1). It is a discrete probability
distribution for a Bernoulli trial (a trial that has only two outcomes, i.e.
either success or failure). For example, it can represent a coin toss where the
probability of getting a head is 0.5 and the probability of getting a tail is
0.5. It is the probability distribution of a random variable that takes the
value 1 with probability p and the value 0 with probability q = 1 − p.

The probability mass function f of this distribution, over possible outcomes k,
is given by:
f(k; p) = p^k (1 - p)^{1 - k}, \quad k \in \{0, 1\}
dbern()
The dbern() function, provided by the Rlab package, gives the probability mass
function of the Bernoulli distribution.
Syntax: dbern(x, prob, log = FALSE)

Parameter:
• x: vector of quantiles
• prob: probability of success on each trial
• log: logical; if TRUE, probabilities p are given as log(p)
Example:
# Importing the Rlab library
library(Rlab)
# x values for the dbern() function
x <- c(0, 1, 3, 5, 7, 10)
# Using dbern() function to obtain the corresponding Bernoulli PMF
y <- dbern(x, prob = 0.5)
# Plotting dbern values
plot(x, y, type = "o")
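Because the Bernoulli distribution is the binomial distribution with a single trial, the same quantities can be obtained in base R without the Rlab package by setting size = 1 in the binomial functions. The sketch below assumes a success probability of 0.7 purely for illustration.

```r
p <- 0.7

# Probability mass at 0 (failure) and 1 (success).
pmf <- dbinom(c(0, 1), size = 1, prob = p)
print(pmf)     # 0.3 0.7

# Cumulative probabilities P(X <= 0) and P(X <= 1).
cdf <- pbinom(c(0, 1), size = 1, prob = p)
print(cdf)     # 0.3 1.0

# Ten random Bernoulli draws (each is 0 or 1).
set.seed(42)
draws <- rbinom(10, size = 1, prob = p)
print(draws)
```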

pbern()
The pbern() function in R gives the cumulative distribution function for the
Bernoulli distribution.
Syntax: pbern(q, prob, lower.tail = TRUE, log.p = FALSE)

Parameter:
• q: vector of quantiles
• prob: probability of success on each trial
• lower.tail: logical; if TRUE (default), probabilities are P[X ≤ x],
otherwise P[X > x]
• log.p: logical; if TRUE, probabilities p are given as log(p).
Example:
# import Rlab library
library(Rlab)
# x values for the pbern() function
x <- seq(0, 10, by = 1)
# apply pbern() to x to obtain the corresponding Bernoulli CDF
y <- pbern(x, prob = 0.7)
# plot pbern values
plot(x, y, type = "o")

qbern()
qbern() gives the quantile function for the Bernoulli distribution. A quantile
function in statistical terms specifies the value of the random variable such
that the probability of the variable being less than or equal to that value
equals the given probability.

Syntax: qbern(p, prob, lower.tail = TRUE, log.p = FALSE)

Parameter:
• p: vector of probabilities.
• prob: probability of success on each trial.
• lower.tail: logical; if TRUE (default), probabilities are P[X ≤ x],
otherwise P[X > x]
• log.p: logical; if TRUE, probabilities p are given as log(p).
Example:
# import Rlab library
library(Rlab)
# x values for the qbern() function
x <- seq(0, 1, by = 0.2)
# apply qbern() to x to obtain the corresponding Bernoulli quantiles
y <- qbern(x, prob = 0.5, lower.tail = TRUE, log.p = FALSE)
# plot qbern values
plot(x, y, type = "o")

rbern()
The rbern() function in R is used to generate a vector of random numbers which
are Bernoulli distributed.

Syntax: rbern(n, prob)

Parameter:
• n: number of observations.
• prob: probability of success on each trial.
Example:
# import Rlab library
library(Rlab)
set.seed(9999)
# sample size
N <- 100
# generate random variables using the rbern() function
random_values <- rbern(N, prob = 0.5)
# print the values
print(random_values)
# histogram of the randomly drawn values
hist(random_values,breaks = 10,main = "")

Output:

[1] 0 0 0 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 0
[41] 1 0 1 0 1 1 0 1 1 0 0 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1 0 1 0 0 0 1 0 0 1 1 0 1 1 0
[81] 1 0 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 1 0 1

Student t Distribution
The t-distribution, also known as the Student's t-distribution, is a
probability distribution that is similar to the normal distribution with its
bell shape but has heavier tails. It is used for estimating population
parameters when the sample size is small or the population variance is unknown.
The dt() function in R is used to find the value of the probability density
function (pdf) of the Student’s t-distribution at a given value x.
Syntax: dt(x, df)
Parameters:
• x is the quantiles vector
• df is the degrees of freedom (the degrees of freedom determine the shape of
the distribution; as they increase, the distribution approaches the normal
distribution)
Example:
x_dt <- seq(-10, 10, by = 0.01)
y_dt <- dt(x_dt, df = 3)
plot(x_dt, y_dt)
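The effect of the degrees of freedom can be seen numerically by comparing quantiles of the t-distribution with the corresponding normal quantile. In this sketch the probability 0.975 and the df values 3, 30 and 1000 are arbitrary choices for illustration.

```r
dfs <- c(3, 30, 1000)

# 97.5th percentile of the t-distribution for each df.
t_quantiles <- qt(0.975, df = dfs)
print(t_quantiles)      # about 3.182 2.042 1.962

# 97.5th percentile of the standard normal distribution.
z_quantile <- qnorm(0.975)
print(z_quantile)       # about 1.960

# Heavier tails: far from the centre the t density is larger.
heavier_tail <- dt(4, df = 3) > dnorm(4)
print(heavier_tail)     # TRUE
```

The t quantiles shrink toward the normal quantile as df grows, which is the sense in which the t-distribution approaches the normal distribution.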

The pt() function is used to get the cumulative distribution function (CDF) of
a t-distribution.
Syntax: pt(q, df, lower.tail = TRUE)
Parameter:
• q is the quantiles vector
• df is the degrees of freedom
• lower.tail – if TRUE (default), probabilities are P[X ≤ x], otherwise
P[X > x].
Example:
x_pt <- seq(-10, 10, by = 0.01) # Specify x-values for pt function
y_pt <- pt(x_pt, df = 3) # Apply pt function
plot(x_pt, y_pt) # Plot pt values

The qt() function is used to get the quantile function, or inverse cumulative
distribution function, of a t-distribution.
Syntax: qt(p, df, lower.tail = TRUE)
Parameter:
• p is the vector of probabilities
• df is the degrees of freedom
• lower.tail – if TRUE (default), probabilities are P[X ≤ x], otherwise
P[X > x].
Example:
x_qt <- seq(0, 1, by = 0.01) # Specify x-values for qt function
y_qt <- qt(x_qt, df = 3) # Apply qt function
plot(x_qt, y_qt) # Plot qt values
The rt() function is used to generate random deviates from a Student’s
t-distribution.
Syntax: rt(n, df)
Parameter:
• n is the number of observations to generate
• df is the degrees of freedom
Example:
set.seed(91929) # Set seed for reproducibility
N <- 10000 # Specify sample size
y_rt <- rt(N, df = 3) # Draw N t-distributed values
y_rt # Print values to RStudio console
hist(y_rt, breaks = 100,main = "") # Plot of randomly drawn student t density

7 pages
Mendenhall R
No ratings yet
Mendenhall R
14 pages
Nota
No ratings yet
Nota
47 pages
Blend Astm Final Dosage Units Calculations Version 10-14-16
No ratings yet
Blend Astm Final Dosage Units Calculations Version 10-14-16
21 pages
Control Chart Constants and Formulae
No ratings yet
Control Chart Constants and Formulae
3 pages
Hands On
No ratings yet
Hands On
6 pages
STANDARD NORMAL DISTRIBUTION: Table Values Represent AREA To The LEFT of The Z Score
No ratings yet
STANDARD NORMAL DISTRIBUTION: Table Values Represent AREA To The LEFT of The Z Score
1 page
Raja Daniyal (0000242740) 8614 - Assignment 1
No ratings yet
Raja Daniyal (0000242740) 8614 - Assignment 1
30 pages
Assingnment
No ratings yet
Assingnment
2 pages
CC Unit - 3
No ratings yet
CC Unit - 3
11 pages
Minitab Statguide Time Series
No ratings yet
Minitab Statguide Time Series
72 pages
Unit4 R
No ratings yet
Unit4 R
21 pages
Skewness and Kurtosis
No ratings yet
Skewness and Kurtosis
5 pages
Methods For Describing Sets of Data
No ratings yet
Methods For Describing Sets of Data
47 pages
Void Double Double: DBL - Sort.h Histogram.h
No ratings yet
Void Double Double: DBL - Sort.h Histogram.h
7 pages
Business Moments 1
No ratings yet
Business Moments 1
9 pages
ISO Limits and Fits Table - The Right Fits and Clearance For Bearings and Seals
No ratings yet
ISO Limits and Fits Table - The Right Fits and Clearance For Bearings and Seals
9 pages
Exploring Research 9th Edition Salkind Solutions Manual PDF Download
100% (2)
Exploring Research 9th Edition Salkind Solutions Manual PDF Download
33 pages
Unit 2
No ratings yet
Unit 2
47 pages
OPIANA - MIDTERM+Problem-set-4-5-6-7-and-8 - 9-10
No ratings yet
OPIANA - MIDTERM+Problem-set-4-5-6-7-and-8 - 9-10
73 pages
4th Periodical Exam PROSTAT (Grade 11)
No ratings yet
4th Periodical Exam PROSTAT (Grade 11)
274 pages
Formulaf
No ratings yet
Formulaf
5 pages
Descriptive Statistics 50 102 New
No ratings yet
Descriptive Statistics 50 102 New
53 pages
Cost Pract1 7
No ratings yet
Cost Pract1 7
43 pages
Experiment-1: Aim: - To Calculate Mean, Median, Mode, Standard Deviation and
No ratings yet
Experiment-1: Aim: - To Calculate Mean, Median, Mode, Standard Deviation and
3 pages
Unit 4 Class, Objects And. Form Handling
No ratings yet
Unit 4 Class, Objects And. Form Handling
20 pages
GCE AS Level Representation of Data Advantages and Disadvantages of Different Representations of Data
No ratings yet
GCE AS Level Representation of Data Advantages and Disadvantages of Different Representations of Data
5 pages
Module 1 - Basic Concepts of Statistics
No ratings yet
Module 1 - Basic Concepts of Statistics
6 pages
Cross Tab Rank Spearman
No ratings yet
Cross Tab Rank Spearman
2 pages
Swe370 Data Mining
No ratings yet
Swe370 Data Mining
12 pages
R Assignment 1
No ratings yet
R Assignment 1
6 pages
Uji Reabilitas, Path Analysis, Asumsi Klasik, Sem Pls
No ratings yet
Uji Reabilitas, Path Analysis, Asumsi Klasik, Sem Pls
13 pages
Learn C++
From Everand
Learn C++
Durgesh
4.5/5 (9)
The Essential R Reference
From Everand
The Essential R Reference
Mark Gardener
No ratings yet
Graphs with MATLAB (Taken from "MATLAB for Beginners: A Gentle Approach")
From Everand
Graphs with MATLAB (Taken from "MATLAB for Beginners: A Gentle Approach")
Peter Kattan
4/5 (2)
Line Drawing Algorithm: Mastering Techniques for Precision Image Rendering
From Everand
Line Drawing Algorithm: Mastering Techniques for Precision Image Rendering
Fouad Sabry
No ratings yet

Unit3 R

Uploaded by

Unit3 R

Uploaded by
