Rfile D
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical
tests, time-series analysis, classification, clustering, …) and graphical techniques, and is
highly extensible. The S language is often the vehicle of choice for research in statistical
methodology, and R provides an Open-Source route to participation in that activity.
One of R’s strengths is the ease with which well-designed publication-quality plots can be
produced, including mathematical symbols and formulae where needed. Great care has been
taken over the defaults for the minor design choices in graphics, but the user retains full
control.
R is available as Free Software under the terms of the Free Software Foundation’s GNU
General Public License in source code form. It compiles and runs on a wide variety of UNIX
platforms and similar systems (including FreeBSD and Linux), Windows and MacOS.
R is an integrated suite of software facilities for data manipulation, calculation, and graphical
display. Among other things, it includes graphical facilities for data analysis and display,
either on-screen or on hardcopy.
The RStudio integrated development environment (IDE) is available under the GNU Affero
General Public License version 3. The AGPL v3 is an open-source license that guarantees
the freedom to share the code.
RStudio Desktop and RStudio Server are both available in free and fee-based (commercial)
editions. OS support depends on the format/edition of the IDE. Prepackaged distributions of
RStudio Desktop are available for Windows, macOS, and Linux. RStudio Server and Server
Pro run on Debian, Ubuntu, Red Hat Linux, CentOS, openSUSE and SLES.
The RStudio IDE is partly written in the C++ programming language and uses the Qt
framework for its graphical user interface. Most of the code is written in Java; JavaScript
is also used.
Work on the RStudio IDE started around December 2010, and the first public beta
version (v0.92) was officially announced in February 2011. Version 1.0 was released on 1
November 2016. Version 1.1 was released on 9 October 2017.
The RStudio IDE provides a mechanism for executing R functions interactively from within
the IDE through the Addins menu. This enables packages to include Graphical User
Interfaces (GUIs) for increased accessibility. Popular R packages that use this feature
include:
Features of RStudio
A user-friendly interface.
Easy access to all the imported data and created objects (like variables, functions,
etc.).
Code autocompletion.
The ability to create projects to organize and share your work with your collaborators
more efficiently.
Plot previewing.
Code development.
Code editing.
Code review.
Debugging.
Data modelling.
Continuous integration.
Uses of RStudio
RStudio is widely used in various fields for statistical analysis, data visualization, and
programming in the R language. Here are some common uses of RStudio:
1. Data Analysis and Statistics: RStudio is extensively used for statistical analysis and
data exploration. R provides a rich set of statistical functions and packages for
descriptive statistics, hypothesis testing, regression analysis, and more. Analysts and
statisticians use RStudio to perform data manipulation, cleansing, and transformation
tasks.
4. Time Series Analysis: RStudio is suitable for time series analysis and forecasting.
Time series packages such as forecast and timeSeries are frequently used for
modeling and predicting trends in time-dependent data.
5. Finance and Economics: Professionals in finance and economics use RStudio for
analyzing financial data, conducting econometric modeling, and creating
visualizations to understand market trends and economic indicators.
6. Academic Research: Many researchers and academics use RStudio for statistical
analysis and data visualization in various fields such as social sciences, psychology,
economics, and more.
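As a rough illustration of the data analysis and time series uses listed above, the following
sketch runs descriptive statistics, a simple regression, and a short forecast. It is an
assumption-laden example rather than part of the original material: it uses the built-in
mtcars and AirPassengers datasets, and it swaps in base R's arima() so that nothing needs to
be installed, instead of the forecast or timeSeries packages named in the list.

```r
# Built-in example datasets; they are stand-ins, not data from this report.
data(mtcars)

# Descriptive statistics for fuel economy (miles per gallon).
print(summary(mtcars$mpg))
print(sd(mtcars$mpg))

# Simple linear regression of fuel economy on car weight.
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Time series: seasonal ARIMA on the log of monthly airline passenger counts,
# fitted with base R's arima() (not the forecast package).
log_air <- log(AirPassengers)
fit_ts <- arima(log_air, order = c(0, 1, 1),
                seasonal = list(order = c(0, 1, 1), period = 12))

# Forecast the next 12 months and transform back to the original scale.
forecast_12 <- exp(predict(fit_ts, n.ahead = 12)$pred)
print(forecast_12)
```

Typing the same lines into the RStudio console or running them from a script pane gives the
same results, since RStudio passes the code to R unchanged.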
T Test
In statistics, the t-test is one of the most common tests used to determine whether the
means of two groups are equal. The test assumes that both groups are sampled from normal
distributions with equal variance. The null hypothesis is that the two means are the same,
and the alternative is that they are not.
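For reference, the hypotheses and the classical equal-variance (pooled) test statistic can be
written as below. The symbols are assumptions introduced here for illustration and do not
appear in the original text: the sample means are \bar{x}_1 and \bar{x}_2, the sample
variances are s_1^2 and s_2^2, and the group sizes are n_1 and n_2.

```latex
H_0:\ \mu_1 = \mu_2 \qquad \text{versus} \qquad H_1:\ \mu_1 \neq \mu_2

t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
```

Under the null hypothesis and the stated assumptions, t follows a t distribution with
n_1 + n_2 - 2 degrees of freedom.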
In our case, we have collected a sample of heights by gender, and we want to know whether
the average height of males is greater than the average height of females. To test this,
we collected a simple random sample of 12 individuals, of whom 2 are female and the rest
are male, with their respective heights in cm.
Step 2 – Perform and Interpret the Two Sample t-test
Next, we will use the t.test() command to perform a two-sample t-test.
We use the formula method (a call of the form Height ~ Sex) to obtain the t-test results,
where Height is a numeric vector and Sex is a two-level categorical column of the my_data
dataset.
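A call of the kind described might look like the following sketch. The original data are not
reproduced in this document, so the heights below are made-up placeholder values. Only the
column names Height and Sex, the dataset name my_data, and the 2 female / 10 male split come
from the text. Note that t.test() uses Welch's unequal-variance test by default, so
var.equal = TRUE is set here as an assumption to match the equal-variance t-test described
earlier.

```r
# Hypothetical data: 12 individuals (2 female, 10 male). The heights in cm are
# invented for illustration and are NOT the values from the original sample.
my_data <- data.frame(
  Sex = factor(c("Female", "Female", rep("Male", 10))),
  Height = c(158, 163, 170, 175, 168, 181, 177, 172, 169, 180, 174, 176)
)

# Formula method: Height (numeric) is split by the two levels of Sex.
# var.equal = TRUE requests the classical pooled-variance test; the default
# (var.equal = FALSE) is Welch's test, which does not assume equal variances.
result <- t.test(Height ~ Sex, data = my_data, var.equal = TRUE)
print(result)

# One-sided version: factor levels are ordered alphabetically (Female, Male),
# so alternative = "less" tests whether the female mean is below the male mean.
one_sided <- t.test(Height ~ Sex, data = my_data, var.equal = TRUE,
                    alternative = "less")
print(one_sided$p.value)
```

The one-sided call corresponds to the question of whether males are taller on average, while
the two-sided call corresponds to the null and alternative hypotheses stated for the test.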
Result –