Unit 3 R As A Set of Statistical Tables

Uploaded by

divyashree
UNIT 3: R as a set of Statistical tables

PROBABILITY AND STATISTICS

Probability and statistics are two important concepts in mathematics. Probability is
the study of chance, whereas statistics is about how we collect, handle and summarise
data using different techniques so that complicated data can be presented in an easy
and understandable way. Both topics are usually introduced in Class 10, Class 11 and
Class 12, where students meet them while preparing for school exams and competitive
examinations. Statistics is also widely applied in data science, where professionals
use it to make business predictions, such as estimating a company's future profit or
loss.

What is Probability?
Probability denotes the possibility of the outcome of any random event. The term
refers to the extent to which an event is likely to happen. For example, when we flip
a coin in the air, what is the possibility of getting a head? The answer depends on
the number of possible outcomes. Here there are two equally likely outcomes, head or
tail, so the probability of getting a head is 1/2.

Probability is the measure of the likelihood that an event will happen. When all
outcomes are equally likely, the formula for probability is given by:

P(E) = Number of Favourable Outcomes/Number of total outcomes

P(E) = n(E)/n(S)

Here,

n(E) = Number of outcomes favourable to event E

n(S) = Total number of outcomes
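
As a small illustration of this formula, the following R sketch counts outcomes for one roll of a fair die. The event chosen here (rolling an even number) and the variable names are assumptions made for the example, not part of the original notes.

```r
# Sample space S for one roll of a fair six-sided die
sample_space <- 1:6

# Event E (illustrative choice): the roll is an even number
event <- sample_space[sample_space %% 2 == 0]

# P(E) = n(E) / n(S)
p_event <- length(event) / length(sample_space)
print(p_event)  # 0.5
```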


What is Statistics?
Statistics is the study of the collection, analysis, interpretation, presentation, and
organization of data. It is a method of collecting and summarising the data. This has
many applications from a small scale to large scale. Whether it is the study of the
population of the country or its economy, stats are used for all such data analysis.

Statistics has a huge scope in many fields such as sociology, psychology, geology,
weather forecasting, etc. The data collected here for analysis could be quantitative or
qualitative. Quantitative data are of two types: discrete and continuous.
Discrete data take distinct, countable values, whereas continuous data can take any
value within a range. Many terms and formulas are used in this subject; the main ones
are described below.

Terms Used in Probability and Statistics


There are various terms used in probability and statistics, such as:

• Random Experiment
• Sample Space
• Random Variables
• Expected Value
• Independence
• Variance
• Mean

Let us discuss these terms one by one.

Random Experiment
An experiment whose result cannot be predicted until it is observed is called a random
experiment. For example, when we throw a die, the result is uncertain to us.
We can get any value from 1 to 6. Hence, this experiment is random.

Sample Space
A sample space is the set of all possible results or outcomes of a random experiment.
For example, if we throw a die, the sample space for this experiment is the set of all
possible outcomes of the throw:
Sample Space = {1, 2, 3, 4, 5, 6}

Random Variables
The variables which denote the possible outcomes of a random experiment are called
random variables. They are of two types:

1. Discrete Random Variables


2. Continuous Random Variables

Discrete random variables take only distinct values which are countable, whereas
continuous random variables can take any value within an interval.

Independent Event
When the occurrence of one event has no impact on the probability of another event,
the two events are said to be independent of each other. For example, if you flip a
coin and at the same time throw a die, the probability of getting a ‘head’ is
independent of the probability of getting a 6 on the die.

Mean
The mean of a random variable is the probability-weighted average of its possible
values. In simple terms, it is the long-run average of the outcomes when the random
experiment is repeated a large number of times. It is also called the expectation of
the random variable.

Expected Value
Expected value is the mean of a random variable: each possible value is weighted by
its probability and the results are added up. It is also called expectation,
mathematical expectation or first moment. For example, if we roll a die having six
faces, the expected value is the average of all the possible outcomes, i.e.
(1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.

Variance
The variance tells us how the values of the random variable are spread around the mean
value. It is the expected value of the squared deviation from the mean.
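
To make the expected value and variance concrete, here is a minimal R sketch for one roll of a fair die. The variable names are illustrative, and equal probabilities of 1/6 are assumed for each face.

```r
# Possible outcomes of a fair die and their (assumed equal) probabilities
outcomes <- 1:6
probs <- rep(1 / 6, 6)

# Expected value: probability-weighted average of the outcomes
expected_value <- sum(outcomes * probs)
print(expected_value)  # 3.5

# Variance: probability-weighted average of squared deviations from the mean
variance <- sum((outcomes - expected_value)^2 * probs)
print(variance)        # 2.916667
```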
List of Probability Topics
Basic probability topics are:

Addition Rule of Probability Binomial Probability Bayes Theorem

Compound Events Compound Probability Complementary Events

Conditional Probability Complementary Events Coin Toss Probability

Dependent Events Experimental Probability Geometric Probability

Independent Events Multiplication Rule of Probability Mutually Exclusive Events

Properties of Probability Probability Line Probability without Replacement

Random Variables Simple Event Sample Space

Tree Diagram Theoretical Probability Types of Events

Experimental Probability Axiomatic Probability

List of Statistical Topics


Basic Statistics topics are:

Box and Whisker Plots Comparing Two Means Comparing Two Proportions
Categorical Data Central Tendency Correlation
Data Handling Degree of freedom Empirical Rule
Frequency Distribution Table Five Number Summary Graphical Representation of Data
Histogram Mean Median
Mode Data Range Relative Frequency
Population and Sample Scatter Plots Standard Deviation
Ungrouped Data Variance Data Sets

Probability and Statistics Formulas


Probability Formulas: For two events A and B:

Probability Range: the probability of an event ranges from 0 to 1, i.e. 0 ≤ P(A) ≤ 1

Rule of Complementary Events: P(A’) + P(A) = 1

Rule of Addition: P(A∪B) = P(A) + P(B) – P(A∩B)

Mutually Exclusive Events: P(A∪B) = P(A) + P(B)

Independent Events: P(A∩B) = P(A)P(B)

Disjoint Events: P(A∩B) = 0

Conditional Probability: P(A|B) = P(A∩B)/P(B)

Bayes Formula: P(A|B) = P(B|A) P(A)/P(B)
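
The addition rule, conditional probability and Bayes formula can be checked by counting outcomes. The sketch below assumes one roll of a fair die with two illustrative events, A (an even number) and B (a number greater than 3); the helper p() is a hypothetical name introduced only for this example.

```r
# Sample space and two illustrative events for one roll of a fair die
S <- 1:6
A <- c(2, 4, 6)   # even number
B <- c(4, 5, 6)   # number greater than 3

# Hypothetical helper: probability of an event by counting outcomes
p <- function(event) length(event) / length(S)

# Rule of addition: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
p_union <- p(union(A, B))
p_addition <- p(A) + p(B) - p(intersect(A, B))

# Conditional probability: P(A|B) = P(A ∩ B) / P(B)
p_a_given_b <- p(intersect(A, B)) / p(B)

# Bayes formula: P(A|B) = P(B|A) P(A) / P(B)
p_b_given_a <- p(intersect(A, B)) / p(A)
p_bayes <- p_b_given_a * p(A) / p(B)

print(c(p_union, p_addition, p_a_given_b, p_bayes))
```

All four printed values equal 2/3, so the direct count agrees with each formula.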

Statistics Formulas : Some important formulas are listed below:

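Let x denote an item in the data and n the total number of items.

Mean = (Sum of all the terms)/(Total number of terms)

Mean: $\bar{x} = \frac{\sum x}{n}$

Median: $M = \left(\frac{n+1}{2}\right)^{\text{th}}$ term, if n is odd

Median: $M = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ term} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ term}}{2}$, if n is even

Mode: The most frequently occurring value

Standard Deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}}$

Variance: $\sigma^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}$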

Solved Examples
Here are some examples based on the concepts of statistics and probability to
understand better. Students can practice more questions based on these solved
examples to excel in the topic. Also, make use of the formulas given in this article in the
above section to solve problems based on them.
Example 1: Find the mean and mode of the following data: 2, 3, 5, 6, 10, 6, 12, 6, 3,
4.

Solution:

Total Count: 10

Sum of all the numbers: 2+3+5+6+10+6+12+6+3+4 = 57

Mean = (sum of all the numbers)/(Total number of items)

Mean = 57/10 = 5.7

The number 6 occurs 3 times, more often than any other value, therefore Mode = 6.

Example 2: A bucket contains 5 blue, 4 green and 5 red balls. Sudheer is asked to
pick 2 balls randomly from the bucket without replacement and then one more
ball is to be picked. What is the probability he picked 2 green balls and 1 blue
ball?

Solution: Total number of balls = 14

Probability of drawing

1 green ball = 4/14

another green ball = 3/13

1 blue ball = 5/12

Probability of picking 2 green balls and 1 blue ball = 4/14 * 3/13 * 5/12 = 5/182.
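
The arithmetic in this example can be checked in R. This is only a verification sketch; it assumes, as the solution does, that the balls are drawn in the order green, green, blue and that none is replaced.

```r
# Green, then green, then blue, drawing without replacement from 14 balls
p_green_green_blue <- (4 / 14) * (3 / 13) * (5 / 12)
print(p_green_green_blue)                      # 0.02747253

# Compare with the simplified fraction 5/182
print(all.equal(p_green_green_blue, 5 / 182))  # TRUE
```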

Example 3: What is the probability that Ram will choose a marble at random and
that it is not black if the bowl contains 3 red, 2 black and 5 green marbles.

Solution: Total number of marbles = 10

Red and Green marbles = 8

Find the number of marbles that are not black and divide by the total number of
marbles.
So P(not black) = (number of red or green marbles)/(total number of marbles)

= 8 /10

= 4/5

Example 4: Find the mean of the following data:

55, 36, 95, 73, 60, 42, 25, 78, 75, 62

Solution: Given,

55 36 95 73 60 42 25 78 75 62

Sum of observations = 55 + 36 + 95 + 73 + 60 + 42 + 25 + 78 + 75 + 62 = 601

Number of observations = 10

Mean = 601/10 = 60.1

Example 5: Find the median and mode of the following marks (out of 10) obtained
by 20 students:

4, 6, 5, 9, 3, 2, 7, 7, 6, 5, 4, 9, 10, 10, 3, 4, 7, 6, 9, 9

Solution: Given,

4, 6, 5, 9, 3, 2, 7, 7, 6, 5, 4, 9, 10, 10, 3, 4, 7, 6, 9, 9

Ascending order: 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 7, 9, 9, 9, 9, 10, 10

Number of observations = n = 20

Median = (10th + 11th observation)/2

= (6 + 6)/2

=6

Most frequent observations = 9

Hence, the mode is 9.
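
The same result can be reproduced in R. The sketch below uses median() for the median and a frequency table for the mode, since base R has no built-in function for the statistical mode; the variable names are illustrative.

```r
# Marks obtained by the 20 students
marks <- c(4, 6, 5, 9, 3, 2, 7, 7, 6, 5, 4, 9, 10, 10, 3, 4, 7, 6, 9, 9)

# Median: average of the 10th and 11th sorted observations
median_marks <- median(marks)
print(median_marks)  # 6

# Mode: the value(s) with the highest count in the frequency table
freq <- table(marks)
mode_marks <- names(freq)[freq == max(freq)]
print(mode_marks)    # "9"
```

Note that names() returns the mode as a character string; as.numeric() can convert it back to a number if needed.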


Descriptive Analysis in R Programming
Descriptive statistics in the R programming language means describing
our data with the help of representative summaries such as charts,
graphs and tables. In descriptive analysis we summarise the data and
present it in a meaningful way so that it can be easily understood.
It is often performed on small data sets, and the findings can
suggest future trends. The measures used to describe a data set are
measures of central tendency and measures of variability or
dispersion.
Process of Descriptive Statistics in R
• Measure of central tendency
• Measure of variability

Measure of central tendency


It represents the whole set of data by a single value that indicates
where the centre of the data lies. There are three main measures of
central tendency:
• Mean
• Mode
• Median
Measure of variability
In descriptive statistics, a measure of variability describes the
spread of the data, that is, how widely the values are distributed.
The most common variability measures are:
• Range
• Variance
• Standard deviation

Need for Descriptive Statistics in R

Descriptive analysis helps us to understand our data and is a very
important part of machine learning. Machine learning is all about
making predictions, whereas statistics is all about drawing
conclusions from data, which is a necessary initial step for machine
learning. Let’s do this descriptive analysis in R.
Descriptive Analysis in R
Descriptive analysis consists of describing the data simply, using
summary statistics and graphics. Here, we’ll describe how to compute
summary statistics using R.
Import your data into R:
Before doing any computation, we need to prepare our data and save
it in an external .txt or .csv file. It is good practice to save the
file in the current working directory. After that, import your data
into R as follows:
R

# R program to illustrate
# Descriptive Analysis

# Import the data using read.csv()
myData <- read.csv("CardioGoodFitness.csv",
                   stringsAsFactors = FALSE)

# Print the first 6 rows
print(head(myData))
Output:
  Product Age Gender Education MaritalStatus Usage Fitness Income Miles
1   TM195  18   Male        14        Single     3       4  29562   112
2   TM195  19   Male        15        Single     2       3  31836    75
3   TM195  19 Female        14     Partnered     4       3  30699    66
4   TM195  19   Male        12        Single     3       3  32973    85
5   TM195  20   Male        13     Partnered     4       2  35247    47
6   TM195  20 Female        14     Partnered     3       3  32973    66

R functions for computing descriptive analysis:


Average in R Programming
An average is a number expressing the central or typical value in
a set of data, in particular the mode, median, or (most
commonly) the mean, which is calculated by dividing the sum of
the values in the set by their number:

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

where:
• xi represents the data points.
• n represents the total number of data points.
Suppose there are 8 data points: 2, 4, 4, 4, 5, 5, 7, 9. The
average of these 8 data points is

$\bar{x} = \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5$
Computing Average in R Programming


To compute the average of values, R provides the built-in
function mean(). This function takes a numeric vector as an
argument and returns the average (mean) of that vector. The
basic syntax is:
mean(x, na.rm)
Parameters:
• x: numeric vector
• na.rm: logical value; if TRUE, NA values are removed before the
mean is computed (the default is FALSE)
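
The effect of the na.rm parameter can be seen in a short sketch. The vector below is made-up data containing one missing value, used only for illustration.

```r
# Illustrative vector with one missing value
scores <- c(2, 4, NA, 6)

# Without na.rm, a single NA makes the result NA
mean_with_na <- mean(scores)
print(mean_with_na)      # NA

# With na.rm = TRUE, the NA is dropped and the mean of 2, 4, 6 is returned
mean_without_na <- mean(scores, na.rm = TRUE)
print(mean_without_na)   # 4
```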

Let's discuss one example of computing the average in R programming.
R
# R program to get the average of a vector

# Take a numeric vector of elements
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculate the average using mean()
print(mean(values))
Output:
[1] 5

Example 2: Calculating the average using mean() for a vector of elements
R
# R program to get the average of a vector

# Take a numeric vector of elements
values <- c(2, 40, 2, 502, 177, 7, 9)

# Calculate the average using mean()
print(mean(values))
Output:
[1] 105.5714

Variance in R Programming Language


Variance is the average of the squared differences between each
value and the mean. The mathematical formula for variance is
as follows,

$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}$

where,
• xi is each data point
• µ is the mean
• N is the total number of elements or frequency of distribution.
Let’s consider the same dataset that we used for the average.
First, calculate the deviation of each data point from the mean (5)
and square each result; then average the squares,

Variance = (9+1+1+1+0+0+4+16)/8 = 32/8 = 4

Computing Variance in R Programming


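We can calculate the variance by using the var() function in R.
Note that var() computes the sample variance, which divides by
n − 1 rather than n, so its result for this dataset is
32/7 ≈ 4.571429 instead of 4. The basic syntax is:
var(x)
Where,
x: numeric vector
Let's discuss one example of computing variance in R programming.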
R
# R program to get the variance of a vector

# Take a numeric vector of elements
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculate the (sample) variance using var()
print(var(values))
Output:
[1] 4.571429

Standard Deviation in R Programming Language


Standard Deviation is the square root of variance. It is a measure
of the extent to which data varies from the mean. The
mathematical formula for calculating standard deviation is as
follows,

$\sigma = \sqrt{\text{Variance}}$

Standard Deviation for the above data,

$\sigma = \sqrt{4} = 2$

Computing Standard Deviation in R


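One can calculate the standard deviation by using the sd() function
in R. Like var(), sd() uses the sample formula with divisor n − 1,
so for this dataset it returns √(32/7) ≈ 2.13809 rather than 2. The
basic syntax is:
sd(x)
Parameters:
x: numeric vector
Let's discuss one example of computing standard deviation in R.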
R
# R program to get the
# standard deviation of a vector

# Take a numeric vector of elements
values <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Calculate the (sample) standard
# deviation using sd()
print(sd(values))
Output:
[1] 2.13809
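
The hand calculations above (variance 4, standard deviation 2) divide by N, while var() and sd() divide by n − 1. The following sketch shows how the two are related; the variable names are illustrative.

```r
values <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(values)

# Sample variance and standard deviation (divisor n - 1), as returned by R
sample_var <- var(values)   # 32 / 7
sample_sd <- sd(values)

# Population variance and standard deviation (divisor n), as in the hand calculation
population_var <- sum((values - mean(values))^2) / n
population_sd <- sqrt(population_var)

print(population_var)  # 4
print(population_sd)   # 2

# The two variances differ only by the factor (n - 1) / n
print(all.equal(population_var, sample_var * (n - 1) / n))  # TRUE
```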
Calculating All Three Metrics for a Dataset
Let’s calculate the mean, variance, and standard deviation for
the following dataset:
R
# Define the dataset
data <- c(12, 15, 18, 22, 30, 35)

# Calculate the mean


mean_value <- mean(data)
print(paste("Mean:", mean_value))

# Calculate the variance


variance_value <- var(data)
print(paste("Variance:", variance_value))

# Calculate the standard deviation


sd_value <- sd(data)
print(paste("Standard Deviation:", sd_value))
Output:
[1] "Mean: 22"

[1] "Variance: 79.6"

[1] "Standard Deviation: 8.92188320927819"

Mean, Median and Mode in R Programming


Statistical measures like mean, median, and mode are essential
for summarizing and understanding the central tendency of a
dataset. In R, these measures can be calculated easily using built-
in functions. This article will provide a comprehensive guide on
how to calculate mean, median, and mode in R Programming
Language.


Dataset used for Calculating the Mean, Median, and Mode in R Programming
Before doing any computation, we need to prepare our data and save
it in an external .txt or .csv file. It is good practice to save the
file in the current working directory. After that, import your data
into R as follows:
Dataset Link: CardioGoodFitness
R

# R program to import data into R

# Import the data using read.csv()
myData <- read.csv("CardioGoodFitness.csv",
                   stringsAsFactors = FALSE)

# Print the first 6 rows
print(head(myData))

Output:
  Product Age Gender Education MaritalStatus Usage Fitness Income Miles
1   TM195  18   Male        14        Single     3       4  29562   112
2   TM195  19   Male        15        Single     2       3  31836    75
3   TM195  19 Female        14     Partnered     4       3  30699    66
4   TM195  19   Male        12        Single     3       3  32973    85
5   TM195  20   Male        13     Partnered     4       2  35247    47
6   TM195  20 Female        14     Partnered     3       3  32973    66
Mean in R Programming Language
It is the sum of observations divided by the total number of
observations. It is also defined as average which is the sum
divided by count.
$\text{Mean}(\mu) = \frac{1}{N}\sum_{i=1}^{N} x_i$
R

# R program to illustrate
# Descriptive Analysis

# Import the data using read.csv()
myData <- read.csv("CardioGoodFitness.csv",
                   stringsAsFactors = FALSE)

# Compute the mean value
# (stored as mean_age so that the mean() function is not masked)
mean_age <- mean(myData$Age)
print(mean_age)
Output:
[1] 28.78889

Median in R Programming Language


It is the middle value of the data set. It splits the data into two
halves. If the number of elements in the data set is odd then the
center element is median and if it is even then the median would
be the average of two central elements.
$\text{Median} = \begin{cases} x_{\frac{N+1}{2}} & \text{if } N \text{ is odd} \\ \frac{x_{\frac{N}{2}} + x_{\frac{N}{2}+1}}{2} & \text{if } N \text{ is even} \end{cases}$
R

# R program to illustrate
# Descriptive Analysis

# Import the data using read.csv()
myData <- read.csv("CardioGoodFitness.csv",
                   stringsAsFactors = FALSE)

# Compute the median value
# (stored as median_age so that the median() function is not masked)
median_age <- median(myData$Age)
print(median_age)
Output:
[1] 26
Mode in R Programming Language
It is the value that has the highest frequency in the given data
set. The data set has no mode if all data points occur with the
same frequency, and it has more than one mode if two or more
values share the highest frequency. There is no inbuilt function
for finding the statistical mode in R (the built-in mode()
function returns an object's storage mode instead), so we can
create our own function or use the package called modeest.
Mode = the value that appears most frequently in the dataset
Creating a user-defined function for finding Mode
Let’s create a user-defined function that returns the mode of the
data passed to it. We will use the table() function, which counts
how many times each distinct value occurs, and then return the
value with the highest count for the Age column.
R

# Import the data using read.csv()
myData <- read.csv("CardioGoodFitness.csv",
                   stringsAsFactors = FALSE)

# Return the most frequent value of a numeric vector
# (named get_mode so that base R's mode() is not masked)
get_mode <- function(x) {
  freq <- table(x)
  as.numeric(names(freq)[which.max(freq)])
}

get_mode(myData$Age)
Output:
[1] 25
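
Since a data set can have more than one mode, a function that returns every value sharing the highest frequency may be useful. The sketch below is one possible approach; the function name all_modes and the small test vector are hypothetical and not part of the original tutorial.

```r
# Hypothetical helper: return every value that occurs with the highest frequency
all_modes <- function(x) {
  freq <- table(x)
  # names() gives the distinct values as character strings
  names(freq)[freq == max(freq)]
}

# Illustrative data in which 2 and 3 both occur twice
modes <- all_modes(c(1, 2, 2, 3, 3))
print(modes)  # "2" "3"
```

The result is a character vector because table() stores the distinct values as names; as.numeric() converts it back for numeric data.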

Covariance and Correlation in R Programming


Covariance and correlation are terms used in statistics to
measure the relationship between two random variables. Both
measure the linear dependency between a pair of random variables
or bivariate data, but each captures a different aspect of the
relationship: covariance indicates the direction of the linear
relationship, while correlation also indicates its strength on a
standardised scale from −1 to 1. In this section, we discuss the
cov(), cor() and cov2cor() functions in R, which apply the
covariance and correlation methods of statistics and probability
theory.
Covariance in R Programming Language
In R programming, covariance can be measured using
the cov() function. Covariance is a statistical term used to
measure the direction of the linear relationship between the data
vectors. Mathematically, the sample covariance computed by cov() is

$\text{cov}(x, y) = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{N-1}$

where,
x represents the x data vector
y represents the y data vector
$\bar{x}$ represents mean of x data vector
$\bar{y}$ represents mean of y data vector
N represents total observations
Covariance Syntax in R
Syntax: cov(x, y, method)
where,
• x and y represent the data vectors
• method defines the type of method to be used to compute
covariance. Default is “pearson”.
Example:
R

# Data vectors
x <- c(1, 3, 5, 10)

y <- c(2, 4, 6, 20)

# Print covariance using different methods


print(cov(x, y))
print(cov(x, y, method = "pearson"))
print(cov(x, y, method = "kendall"))
print(cov(x, y, method = "spearman"))
Output:
[1] 30.66667
[1] 30.66667
[1] 12
[1] 1.666667

Correlation in R Programming Language


The cor() function in R programming measures the correlation
coefficient. Correlation is a relationship term in statistics
that standardises the covariance to measure how strongly the
vectors are related. Mathematically, the Pearson correlation
coefficient is

$r = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum (x_i-\bar{x})^2 \sum (y_i-\bar{y})^2}}$

where,
x represents the x data vector
y represents the y data vector
$\bar{x}$ represents mean of x data vector
$\bar{y}$ represents mean of y data vector

Correlation in R
Syntax: cor(x, y, method)
where,
• x and y represent the data vectors
• method defines the type of method to be used to compute
correlation. Default is “pearson”.
Example:
R

# Data vectors
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Print correlation using different methods


print(cor(x, y))

print(cor(x, y, method = "pearson"))


print(cor(x, y, method = "kendall"))
print(cor(x, y, method = "spearman"))
Output:
[1] 0.9724702
[1] 0.9724702
[1] 1
[1] 1
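
The cov2cor() function mentioned earlier converts a covariance matrix into the corresponding correlation matrix. The sketch below reuses the same x and y vectors; building the covariance matrix with cbind() is a choice made for this example.

```r
# Data vectors (same as in the examples above)
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# 2 x 2 covariance matrix of the two vectors
cov_matrix <- cov(cbind(x, y))

# Scale the covariance matrix into a correlation matrix
cor_matrix <- cov2cor(cov_matrix)
print(cor_matrix)

# The off-diagonal entry agrees with cor(x, y)
print(all.equal(cor_matrix["x", "y"], cor(x, y)))  # TRUE
```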
Probability Distributions in R

• Many statistical tools and techniques used in data analysis
are based on probability.
• Probability measures how likely it is for an event to occur on
a scale from 0 (the event never occurs) to 1 (the event
always occurs).
• A probability distribution describes how a random variable is
distributed; it tells us which values a random variable is most
likely to take on and which values are less likely.
• R comes with built-in implementations of many probability
distributions.
• Each probability distribution in R is associated with four
functions which follow a naming convention:
• The d-prefix function calculates the probability density
function (PDF) of a continuous probability distribution, or the
probability mass function (PMF) of a discrete probability
distribution, at a specific value of the random variable.
• The p-prefix function calculates the cumulative distribution
function (CDF) of a probability distribution, which gives the
probability of observing a value less than or equal to a given
value of the random variable
• The q-prefix function calculates the quantile of a probability
distribution, which is the inverse of the CDF.
• The r-prefix function generates random numbers from a
probability distribution
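
As a quick illustration of this naming convention, the sketch below calls the four functions for the standard normal distribution (mean 0, standard deviation 1). The particular input values and the seed are arbitrary choices for the example.

```r
# d-prefix: density of the standard normal at 0
density_at_zero <- dnorm(0)
print(density_at_zero)   # 0.3989423

# p-prefix: probability of a value less than or equal to 1.96
cum_prob <- pnorm(1.96)
print(cum_prob)          # 0.9750021

# q-prefix: the quantile that has 97.5% of the distribution below it
quantile_975 <- qnorm(0.975)
print(quantile_975)      # 1.959964

# r-prefix: three random draws (set.seed makes the draws repeatable)
set.seed(42)
draws <- rnorm(3)
print(draws)
```

Because the q-prefix function is the inverse of the p-prefix function, pnorm(qnorm(0.975)) returns 0.975.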
Normal Distribution in R

In a random collection of data from independent sources, it is
often observed that the distribution of the data is approximately
normal. This means that, on plotting a graph with the value of the
variable on the horizontal axis and the count of the values on the
vertical axis, we get a bell-shaped curve. The center of the curve
represents the mean of the data set. In the graph, fifty percent of
the values lie to the left of the mean and the other fifty percent
lie to the right of the mean. This is referred to as the normal
distribution in statistics.

R has four built-in functions for working with the normal
distribution. They are described below.

dnorm(x, mean, sd)


pnorm(x, mean, sd)
qnorm(p, mean, sd)
rnorm(n, mean, sd)

Following is the description of the parameters used in the above
functions −

• x is a vector of numbers (quantiles).
• p is a vector of probabilities.
• n is the number of observations (sample size).
• mean is the mean of the distribution. Its default value is zero.
• sd is the standard deviation. Its default value is 1.

dnorm()

This function gives the height of the probability density function
at each point for a given mean and standard deviation.

# Create a sequence of numbers between -10 and 10 incrementing by 0.1.
x <- seq(-10, 10, by = .1)

# Choose the mean as 2.5 and standard deviation as 0.5.
y <- dnorm(x, mean = 2.5, sd = 0.5)

# Give the chart file a name.
png(file = "dnorm.png")

# Plot the graph.
plot(x, y)

# Save the file.
dev.off()

When we execute the above code, it saves a plot of the bell-shaped
density curve, centred at 2.5, to the file dnorm.png.



pnorm()

This function gives the probability that a normally distributed
random number is less than or equal to a given value. It is
also called the "Cumulative Distribution Function".

# Create a sequence of numbers between -10 and 10 incrementing by 0.2.
x <- seq(-10, 10, by = .2)

# Choose the mean as 2.5 and standard deviation as 2.
y <- pnorm(x, mean = 2.5, sd = 2)

# Give the chart file a name.
png(file = "pnorm.png")

# Plot the graph.
plot(x, y)

# Save the file.
dev.off()

When we execute the above code, it saves a plot of the S-shaped
cumulative curve to the file pnorm.png.
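
As a further use of pnorm(), the sketch below computes the share of a normal distribution that lies within one, two and three standard deviations of the mean (the empirical rule listed among the statistics topics). It uses the default mean of 0 and standard deviation of 1; the variable names are illustrative.

```r
# Probability of falling within k standard deviations of the mean
within_1sd <- pnorm(1) - pnorm(-1)
within_2sd <- pnorm(2) - pnorm(-2)
within_3sd <- pnorm(3) - pnorm(-3)

# Roughly 68%, 95% and 99.7%
print(round(c(within_1sd, within_2sd, within_3sd), 4))  # 0.6827 0.9545 0.9973
```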


qnorm()

This function takes a probability value and gives the number
whose cumulative probability matches that value. It is the inverse
of pnorm().

# Create a sequence of probability values incrementing by 0.02.
x <- seq(0, 1, by = 0.02)

# Choose the mean as 2 and standard deviation as 1.
y <- qnorm(x, mean = 2, sd = 1)

# Give the chart file a name.
png(file = "qnorm.png")

# Plot the graph.
plot(x, y)

# Save the file.
dev.off()

When we execute the above code, it saves a plot of the quantile
curve to the file qnorm.png.


rnorm()
This function is used to generate random numbers whose
distribution is normal. It takes the sample size as input and
generates that many random numbers. We draw a histogram to
show the distribution of the generated numbers.

# Create a sample of 50 numbers which are normally distributed.
y <- rnorm(50)

# Give the chart file a name.
png(file = "rnorm.png")

# Plot the histogram for this sample.
hist(y, main = "Normal Distribution")

# Save the file.
dev.off()

When we execute the above code, it saves a histogram of the sample
to the file rnorm.png. Because the numbers are random, the histogram
differs from run to run.



Binomial Distribution in R Programming

The binomial distribution model deals with finding the probability
of a given number of successes in a series of independent
experiments, each of which has only two possible outcomes. For
example, tossing a coin always gives a head or a tail. The
probability of getting exactly 3 heads when a coin is tossed 10
times is given by the binomial distribution.
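
For instance, the probability of exactly 3 heads in 10 tosses can be computed with dbinom(). The sketch assumes a fair coin (probability of a head 0.5) and compares the result with the binomial formula written out by hand.

```r
# Probability of exactly 3 heads in 10 tosses of a fair coin
p_three_heads <- dbinom(3, size = 10, prob = 0.5)
print(p_three_heads)  # 0.1171875

# The same value from the binomial formula C(10, 3) * 0.5^3 * 0.5^7
p_by_formula <- choose(10, 3) * 0.5^3 * 0.5^7
print(p_by_formula)   # 0.1171875
```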

R has four built-in functions for working with the binomial
distribution. They are described below.

dbinom(x, size, prob)


pbinom(x, size, prob)
qbinom(p, size, prob)
rbinom(n, size, prob)

Following is the description of the parameters used −

• x is a vector of numbers.
• p is a vector of probabilities.
• n is the number of observations.
• size is the number of trials.
• prob is the probability of success of each trial.

dbinom()

This function gives the probability mass function, that is, the
probability of each possible number of successes.

# Create a sequence of the 51 integers from 0 to 50.
x <- seq(0, 50, by = 1)

# Create the binomial distribution.
y <- dbinom(x, 50, 0.5)

# Give the chart file a name.
png(file = "dbinom.png")

# Plot the graph for this sample.
plot(x, y)

# Save the file.
dev.off()

When we execute the above code, it saves a plot of the
probabilities, which peak at x = 25, to the file dbinom.png.


pbinom()

This function gives the cumulative probability of an event, that
is, the probability of observing a given number of successes or
fewer. It is a single value representing the probability.

# Probability of getting 26 or fewer heads from 51 tosses of a coin.
x <- pbinom(26, 51, 0.5)

print(x)

When we execute the above code, it produces the following result −


[1] 0.610116
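
The cumulative probability returned by pbinom() is the sum of the individual probabilities given by dbinom(). The following sketch checks this for the same example; the variable names are illustrative.

```r
# Cumulative probability of 26 or fewer heads in 51 tosses
cumulative <- pbinom(26, size = 51, prob = 0.5)

# Sum of the probabilities of exactly 0, 1, ..., 26 heads
summed <- sum(dbinom(0:26, size = 51, prob = 0.5))

print(all.equal(cumulative, summed))  # TRUE
```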

qbinom()

This function takes a probability value and gives the smallest
number whose cumulative probability is at least that value.

# Smallest number of heads whose cumulative probability is at least 0.25
# when a coin is tossed 51 times.
x <- qbinom(0.25, 51, 1/2)

print(x)

When we execute the above code, it produces the following result −


[1] 23

rbinom()

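This function generates the required number of random values from
a binomial distribution with a given number of trials and a given
probability of success.

# Generate 8 random values, each the number of successes in 150 trials
# with success probability 0.4.
x <- rbinom(8, 150, .4)

print(x)
When we execute the above code, it produces a result such as the
following (the values are random, so they differ from run to run) −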

[1] 58 61 59 66 55 60 61 67
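
To make random draws repeatable, a seed can be set before calling rbinom(). The sketch below also compares the average of many draws with the theoretical mean size × prob = 60; the seed value and the number of draws are arbitrary choices for illustration.

```r
# Fix the random number generator so the draws can be reproduced
set.seed(123)

# 10000 draws, each the number of successes in 150 trials with probability 0.4
draws <- rbinom(10000, size = 150, prob = 0.4)

# The sample mean should be close to the theoretical mean 150 * 0.4 = 60
sample_mean <- mean(draws)
print(sample_mean)
```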
