Rfile D

file

Uploaded by

Muskan Chaudhary
What is R?

R is a language and environment for statistical computing and graphics. It is a GNU
project similar to the S language and environment, which was developed at Bell
Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.
R can be considered a different implementation of S. There are some important
differences, but much code written for S runs unaltered under R.

R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical
tests, time-series analysis, classification, clustering, …) and graphical techniques, and is
highly extensible. The S language is often the vehicle of choice for research in statistical
methodology, and R provides an Open-Source route to participation in that activity.

One of R’s strengths is the ease with which well-designed publication-quality plots can be
produced, including mathematical symbols and formulae where needed. Great care has been
taken over the defaults for the minor design choices in graphics, but the user retains full
control.

R is available as Free Software under the terms of the Free Software Foundation’s GNU
General Public License in source code form. It compiles and runs on a wide variety of UNIX
platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.

R is an integrated suite of software facilities for data manipulation, calculation, and graphical
display. It includes

• an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in particular matrices,
• a large, coherent, integrated collection of intermediate tools for data analysis,
• graphical facilities for data analysis and display either on-screen or on hardcopy, and
• a well-developed, simple, and effective programming language which includes
  conditionals, loops, user-defined recursive functions and input and output facilities.
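
A few of the facilities listed above can be sketched in a handful of lines. This is only an illustrative example: the matrix values and the names m, gram, factorial_rec and squares are made up for the demonstration and do not come from the original document.

```r
# A small numeric matrix (illustrative values only); R fills it column by column
m <- matrix(1:6, nrow = 2, ncol = 3)

# Arithmetic operators work element-wise on whole arrays
doubled <- m * 2

# Matrix product of m (2 x 3) with its transpose (3 x 2) gives a 2 x 2 matrix
gram <- m %*% t(m)

# A user-defined recursive function that uses a conditional
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop that collects the squares of 1 to 5
squares <- numeric(0)
for (i in 1:5) {
  squares <- c(squares, i^2)
}

print(doubled)
print(gram)
print(factorial_rec(5))
print(squares)
```
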

RStudio

RStudio is an integrated development environment for R, a programming language for
statistical computing and graphics. It is available in two formats: RStudio Desktop is a
regular desktop application, while RStudio Server runs on a remote server and allows
accessing RStudio using a web browser. The RStudio IDE is a product of Posit PBC
(formerly RStudio PBC, formerly RStudio Inc.).

The RStudio integrated development environment (IDE) is available under the GNU Affero
General Public License version 3. The AGPL v3 is an open-source license that guarantees
the freedom to share the code.

RStudio Desktop and RStudio Server are both available in free and fee-based (commercial)
editions. OS support depends on the format/edition of the IDE. Prepackaged distributions of
RStudio Desktop are available for Windows, macOS, and Linux. RStudio Server and Server
Pro run on Debian, Ubuntu, Red Hat Linux, CentOS, openSUSE and SLES.

The RStudio IDE is partly written in the C++ programming language and uses the Qt
framework for its graphical user interface. Most of the code is written in Java, and
JavaScript is also used.

Work on the RStudio IDE started around December 2010, and the first public beta
version (v0.92) was officially announced in February 2011. Version 1.0 was released on 1
November 2016. Version 1.1 was released on 9 October 2017.

The RStudio IDE provides a mechanism for executing R functions interactively from within
the IDE through the Addins menu. This enables packages to include Graphical User
Interfaces (GUIs) for increased accessibility. Popular R packages that use this feature
include:

• bookdown – a knitr extension to create books
• colourpicker – a graphical tool to pick colours for plots
• datasets.load – a graphical tool to search and load datasets
• googleAuthR – a tool to authenticate with Google APIs

Features of RStudio

RStudio offers numerous helpful features:

• A user-friendly interface.
• The ability to write and save reusable scripts.
• Easy access to all the imported data and created objects (like variables, functions,
  etc.).
• Exhaustive help on any object.
• Code autocompletion.
• The ability to create projects to organize and share your work with your collaborators
  more efficiently.
• Plot previewing.
• Easy switching between terminal and console.
• Operational history tracking.
• Plenty of articles from RStudio Support on how to use the IDE.
• Code development.
• Code editing.
• Code review.
• Debugging.
• Data modelling.
• Continuous integration.

Uses of RStudio

RStudio is widely used in various fields for statistical analysis, data visualization, and
programming in the R language. Here are some common uses of RStudio:

1. Data Analysis and Statistics: RStudio is extensively used for statistical analysis and
data exploration. R provides a rich set of statistical functions and packages for
descriptive statistics, hypothesis testing, regression analysis, and more. Analysts and
statisticians use RStudio to perform data manipulation, cleaning, and transformation
tasks.

2. Data Visualization: RStudio's integrated graphics capabilities allow users to create
high-quality visualizations, including scatter plots, bar charts, histograms, heatmaps,
and more. The ggplot2 package, a popular data visualization library in R, is often used
in conjunction with RStudio for creating complex and customized plots.

3. Reproducible Research: RStudio supports R Markdown, enabling the creation of
reproducible research documents that combine narrative text, R code, and
visualizations. This is valuable for creating reports and documents where others can
easily reproduce and verify the analysis.

4. Time Series Analysis: RStudio is suitable for time series analysis and forecasting.
Time series packages such as forecast and timeSeries are frequently used for
modeling and predicting trends in time-dependent data.

5. Finance and Economics: Professionals in finance and economics use RStudio for
analyzing financial data, conducting econometric modeling, and creating
visualizations to understand market trends and economic indicators.

6. Academic Research: Many researchers and academics use RStudio for statistical
analysis and data visualization in various fields such as social sciences, psychology,
economics, and more.
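
As a rough illustration of the data analysis and visualization uses listed above, the sketch below uses only base R and the built-in mtcars dataset. The dataset and the variables mpg and wt are chosen here purely for demonstration and are not part of the original document.

```r
# mtcars ships with base R (datasets package), so nothing needs to be downloaded
data(mtcars)

# Descriptive statistics for fuel economy (miles per gallon)
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Simple linear regression: mpg as a function of car weight
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]
print(slope)

# Scatter plot with the fitted regression line on the active graphics device
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
```

In RStudio the figure would appear in the Plots pane; when run as a script, R writes it to its default graphics device instead.
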

T Test

In statistics, the t-test is one of the most common tests used to determine whether the
means of two groups are equal. The classical (Student's) test assumes that both groups
are sampled from normal distributions with equal variance. The null hypothesis is that the
two means are the same, and the alternative is that they are not.

Welch Two Sample t-test

It is a statistical hypothesis test that investigates whether there is a significant difference
between the means of two independent groups that may have unequal variances. The test
compares the means of the two groups while accounting for the variability within each group.
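
For reference, the standard form of the Welch statistic and its approximate degrees of freedom can be written as follows. The notation is an assumption introduced here, not taken from the original: \bar{x}_i, s_i^2 and n_i denote the sample mean, sample variance and size of group i.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
{\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
```

Because the degrees of freedom \nu are estimated from the data, they are usually not a whole number, which is why R reports a fractional df for this test.
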

Step 1 – Collect the data

In our case, we sampled heights by gender, and we want to know whether the average height
of males is greater than the average height of females. To test this, we collected a simple
random sample of 12 individuals, of whom 2 are female and the rest are male, with their
respective heights in cm.
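
A dataset of this shape could be entered as a data frame along the following lines. The heights below are hypothetical placeholders, not the values from the original sample, so they will not reproduce the results reported later. Only the column names Height and Sex, the name my_data and the group sizes (2 females, 10 males) follow the text.

```r
# Hypothetical heights in cm; these are NOT the values from the original sample
my_data <- data.frame(
  Sex = factor(c(rep("female", 2), rep("male", 10))),
  Height = c(158, 162,
             172, 168, 180, 175, 183, 178, 170, 185, 177, 181)
)

# Check the group sizes: 2 females and 10 males
group_sizes <- table(my_data$Sex)
print(group_sizes)

# Mean height per group
group_means <- tapply(my_data$Height, my_data$Sex, mean)
print(group_means)
```
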

Step 2 – Perform and Interpret the Two Sample t-test

Next, we use the t.test() function to perform a two-sample t-test.

We use the formula method, t.test(Height ~ Sex, data = my_data), to obtain the t-test
results, where Height is a numeric vector and Sex is a two-level categorical column of the
my_data dataset.
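
A minimal, self-contained sketch of the call described above is given below. The data frame is filled with hypothetical heights (not the original sample), so the printed statistics will differ from those in the Result section. The one-sided variant at the end is an addition for illustration and was not part of the original analysis.

```r
# Hypothetical heights in cm; these are NOT the values from the original sample
my_data <- data.frame(
  Sex = factor(c(rep("female", 2), rep("male", 10))),
  Height = c(158, 162,
             172, 168, 180, 175, 183, 178, 170, 185, 177, 181)
)

# Formula method: Height (numeric) split by Sex (two-level factor).
# var.equal defaults to FALSE, so t.test() runs the Welch version,
# and the default alternative is two-sided.
result <- t.test(Height ~ Sex, data = my_data)
print(result)

# Individual pieces of the result can be extracted by name
t_stat  <- result$statistic   # t statistic
df      <- result$parameter   # Welch degrees of freedom
p_value <- result$p.value     # two-sided p-value
ci      <- result$conf.int    # 95% confidence interval for the difference

# One-sided variant: is the female mean less than the male mean?
# Factor levels are ordered alphabetically, so the difference is female - male.
one_sided <- t.test(Height ~ Sex, data = my_data, alternative = "less")
print(one_sided$p.value)
```

The one-sided form matches the question posed in Step 1 (are males taller on average?), whereas the default two-sided form only asks whether the means differ.
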
Result –

1. The code performs a Welch Two Sample t-test in R.
2. The test is conducted on the "Height" variable, which is compared between two
groups: "male" and "female".
3. The output of the test includes the t-statistic (t = -3.1649), the degrees of freedom (df
= 2.1972), and the p-value (p-value = 0.07702).
4. The alternative hypothesis is that the true difference in means between the two groups
is not equal to 0.
5. The 95% confidence interval for the difference in means is also provided (-65.446344
to 7.246344), as well as the sample means for each group (157.5 for female and 186.6
for male).
6. Overall, this test is used to determine whether there is a significant difference in mean
height between the two groups. Because the p-value (0.07702) is greater than 0.05 and
the confidence interval includes 0, the results do not show a statistically significant
difference at the 5% level.
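
The decision can be checked directly from the numbers reported in the output. The sketch below uses only the reported values. The significance level alpha = 0.05 is an assumption, since the original does not state one, and the variable names are illustrative.

```r
# Values reported in the test output
p_value     <- 0.07702
conf_int    <- c(-65.446344, 7.246344)
mean_female <- 157.5
mean_male   <- 186.6

# Assumed significance level (not stated in the original text)
alpha <- 0.05

# Reject the null hypothesis only if the p-value is below alpha
reject_null <- p_value < alpha

# A 95% interval that contains 0 is consistent with "no difference"
ci_contains_zero <- conf_int[1] < 0 && conf_int[2] > 0

# The observed difference (female - male) sits at the centre of the interval
observed_diff <- mean_female - mean_male
ci_midpoint   <- mean(conf_int)

print(reject_null)
print(ci_contains_zero)
print(c(observed_diff, ci_midpoint))
```

With only 2 females in the sample, the interval is very wide, so a difference of about 29 cm in the sample means is still not enough to reject the null hypothesis at this level.
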
