R Studio Project
ABSTRACT
R is a programming language and free software environment for statistical computing and
graphics supported by the R Foundation for Statistical Computing. The R language is widely
used among statisticians and data miners for developing statistical software and data
analysis. Polls, data mining surveys, and studies of scholarly literature databases show
substantial increases in popularity; as of February 2020, R ranks 13th in the TIOBE index, a
measure of popularity of programming languages.
INTRODUCTION
HISTORY
STATISTICAL FEATURES
R has Rd, its own LaTeX-like documentation format, which is used to supply comprehensive
documentation, both online in a number of formats and in hard copy.
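As an illustration, a minimal Rd help page for a hypothetical `add2` function might look like the following sketch (the function and its documentation are invented for this example):

```
\name{add2}
\alias{add2}
\title{Add Two Numbers}
\description{
  Returns the sum of two numeric values.
}
\usage{
add2(x, y)
}
\arguments{
  \item{x}{A numeric value.}
  \item{y}{A numeric value.}
}
\examples{
add2(1, 2)
}
```

Files in this format live in a package's `man/` directory, and R renders them into plain-text, HTML, and PDF help pages.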
PROGRAMMING FEATURES
Although used mainly by statisticians and other practitioners requiring an environment for
statistical computation and software development, R can also operate as a general matrix
calculation toolbox – with performance benchmarks comparable to GNU
Octave or MATLAB.
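A short sketch of R used as a matrix calculator, in the spirit of Octave or MATLAB (the matrix and vector here are made up for illustration):

```r
# Using R as a general matrix calculation toolbox
A <- matrix(c(2, 1, 1, 3), nrow = 2)  # column-major fill: rows are (2, 1) and (1, 3)
b <- c(3, 5)

x <- solve(A, b)        # solve the linear system A %*% x == b
print(x)                # 0.8 1.4

print(A %*% A)          # matrix multiplication
print(t(A))             # transpose
print(eigen(A)$values)  # eigenvalues of A
```

Operators such as `%*%`, and functions such as `solve()` and `eigen()`, are part of base R, so no additional packages are needed for this kind of work.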
PACKAGES
A core set of packages is included with the installation of R, with more than 15,000
additional packages (as of September 2018) available at the Comprehensive R Archive
Network (CRAN), Bioconductor, Omegahat, GitHub, and other repositories.
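The split between bundled and add-on packages can be seen from within R itself; the snippet below lists the base packages and shows the usual CRAN installation pattern (`data.table` is just one example of a real CRAN package):

```r
# Packages bundled with a base R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Additional packages come from CRAN, e.g.:
#   install.packages("data.table")   # download and install once
#   library(data.table)              # attach for the current session
```

`install.packages()` fetches from the configured CRAN mirror, so the commented lines need network access.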
The "Task Views" page (subject list) on the CRAN website lists a wide range of tasks (in
fields such as Finance, Genetics, High Performance Computing, Machine Learning, Medical
Imaging, Social Sciences and Spatial Statistics) to which R has been applied and for which
packages are available. R has also been identified by the FDA as suitable for interpreting data
from clinical research.
Other R package resources include Crantastic, a community site for rating and reviewing all
CRAN packages, and R-Forge, a central platform for the collaborative development of R
packages, R-related software, and projects. R-Forge also hosts many unpublished beta
packages, and development versions of CRAN packages. Microsoft maintains a daily
snapshot of CRAN that dates back to September 17, 2014.
The Bioconductor project provides R packages for the analysis of genomic data. This
includes object-oriented data-handling and analysis tools for data
from Affymetrix, cDNA microarray, and next-generation high-throughput
sequencing methods.
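Bioconductor packages are installed through the BiocManager helper, which itself comes from CRAN. The sketch below wraps the standard pattern in a function so it can be sourced without triggering a download (`limma` and `DESeq2` are real Bioconductor packages for microarray and RNA-seq analysis, respectively):

```r
# Standard Bioconductor installation pattern, wrapped so that sourcing
# this file does not start a download
setup_bioc <- function() {
  if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")          # helper lives on CRAN
  BiocManager::install(c("limma", "DESeq2")) # fetched from Bioconductor
}
# setup_bioc()  # run once; requires network access
```

Once installed, Bioconductor packages load with `library()` like any other R package.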
INTERFACES
Some of the more common editors with varying levels of support for R
include Emacs (Emacs Speaks Statistics), Vim (Nvim-R plugin), Neovim (Nvim-R
plugin), Kate, LyX, Notepad++, Visual Studio Code, WinEdt, and Tinn-R.
IMPLEMENTATIONS
The main R implementation is written in R, C, and Fortran, and there are several other
implementations aimed at improving speed or increasing extensibility. A closely related
implementation is pqR (pretty quick R) by Radford M. Neal with improved memory
management and support for automatic multithreading. Renjin and FastR
are Java implementations of R for use in a Java Virtual Machine. CXXR, rho, and Riposte are
implementations of R in C++. Renjin, Riposte, and pqR attempt to improve performance by
using multiple processor cores and some form of deferred evaluation. Most of these
alternative implementations are experimental and incomplete, with relatively few users,
compared to the main implementation maintained by the R Development Core Team.
COMMUNITIES
R has local communities worldwide for users to network, share ideas, and learn.
There is a growing number of R events bringing its users together, such as conferences (e.g.
useR!, WhyR?, conectaR, SatRdays), meetups, as well as R-Ladies groups that promote
gender diversity.
LITERATURE REVIEW
1. TEXT MINING SCIENTIFIC ARTICLES USING R STUDIO
The aim of this study is to develop a solution for text mining scientific articles using
the R language in the "Knowledge Extraction and Machine Learning" course.
Automatic summarization of papers is a challenging problem whose solution would
allow researchers to browse large article collections, quickly view highlights, and
drill down for details. The proposed solution is based on social network analysis, topic
models, and bipartite graph approaches. The method defines a bipartite graph between
documents and topics, built using the Latent Dirichlet Allocation (LDA) topic model.
Topics that occur in the same document are then connected to generate a network of
topics. The approach proves to be a promising technique for gaining insight into, and
summarizing, scientific article collections.
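The document-topic structure described above can be sketched in base R. The topic assignments below are hypothetical stand-ins for LDA output (a real pipeline would fit them with a package such as `topicmodels`); the incidence matrix plays the role of the bipartite graph, and its product gives the topic co-occurrence network:

```r
# Hypothetical LDA output: the topics present in each document
doc_topics <- list(doc1 = c(1, 2), doc2 = c(2), doc3 = c(1, 3))

# Bipartite incidence matrix: rows = documents, columns = topics
topics <- sort(unique(unlist(doc_topics)))
B <- t(sapply(doc_topics, function(t) as.integer(topics %in% t)))
colnames(B) <- paste0("topic", topics)
print(B)

# Topic network: topics are linked when they share a document
topic_net <- t(B) %*% B
diag(topic_net) <- 0   # drop self-links
print(topic_net)
```

Here `topic_net[i, j]` counts the documents in which topics i and j co-occur, which is the kind of topic network the study builds before summarization.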
I. Journal
A larger amount of data gives better output, but working with it can become a challenge
due to processing limitations. Nowadays, companies are starting to realize the importance
of using more data to support their strategic decisions. Case studies have shown that
"more data usually beats better algorithms", and with this insight companies began to
invest in processing larger data sets rather than in more expensive algorithms. Over the
last decade, big data analysis has seen exponential growth and will certainly continue to
see remarkable development, driven by the emergence of new interactive multimedia
applications and highly integrated systems, alongside the rapid growth in data services
and microelectronic devices. Up to now, most current mobile systems have been targeted
mainly at voice communication with low transmission rates.
Doi: 10.5281/zenodo.3266146
Publication Date: 2019
CONCLUSION
There are a number of reasons why RStudio is preferred; some of the most important are:
One of the biggest perks of working with R and RStudio is that both are available free of
charge. Whereas other, proprietary statistics packages are often stuck in the dark ages of
development (the 1990s, for example), and can be incredibly expensive to purchase, R is a
free alternative that allows users of all experience levels to contribute to its development.
As many scientific fields embrace the idea of reproducible analyses, proprietary point-and-
click systems actually serve as a hindrance to this process. If you need to re-run your
analysis using one of these systems, you’ll need to carefully copy-and-paste your results
into your text editor, potentially from beginning to end. As anyone who has done this sort
of copy-and-pasting knows, this approach is both prone to errors and incredibly tedious.
If, on the other hand, you use the workflows described in this book, your analyses will be
reproducible, thus eliminating the copy-and-paste dance. And, as you can probably guess,
it is much better to be able to update your code and data inputs and then re-run all of your
analysis with the push of a button than to have to worry about manually moving your
results from one program to another. Reproducibility also helps you as a programmer,
since your most frequent collaborator is likely to be yourself a few months or years down
the road. Instead of having to carefully write down all the steps you took to find the correct
drop-down menu option, your entire code is stored, and immediately reusable.
This approach also helps with collaboration since, as you will see later, you can share a
single R Markdown file containing all of your analysis, documentation, comments, and
code with others. This reduces the time needed to work with others and reduces the
likelihood of errors being made in following along with point-and-click analyses. The
mantra here is to Say No to Copy-And-Paste! both for your sanity and for the sake of
science.
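A minimal R Markdown file of the kind described above might look like the following sketch (the title and analysis are invented; `mtcars` is a data set built into R). Rendering it re-runs every code chunk, so the results always match the code:

````markdown
---
title: "Reproducible Analysis"
output: html_document
---

Mean miles-per-gallon in the built-in `mtcars` data set:

```{r}
mean(mtcars$mpg)
```
````

Sharing this one file gives collaborators the code, the documentation, and the means to regenerate every result with a single render.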
We all know that learning isn’t easy. Do you have trouble remembering how to follow a
list of more than 10 steps or so? Do you find yourself going back over and over again
because you can’t remember what step comes next in the process? This is extremely
common, especially if you haven’t done the procedure in a while. Learning by following a
procedure is easy in the short term, but can be extremely frustrating to recall in the
long term. Done well, programming rewards long-term thinking over short-term fixes.