Why R Is The Best Coding Language For Data Journalism

Uploaded by Ritesh Chandra

Off the Charts newsletter: Why R is the best coding language for data journalism

Dec 18th 2024

This article is adapted from an edition of our Off the Charts newsletter
originally published in October 2021. Off the Charts is a weekly,
subscriber-only guide to The Economist’s award-winning data journalism and
the ideas, processes and tools that go into creating it.
Sign up for Off the Charts.

As part of our mini-series on programming languages, James
Fransham makes the case for R being the best language for data
journalism and our team shares their tips for getting started with it.

Data journalism is a pursuit whose success relies on being able to crunch
numbers. Lots of them. For many years data journalism, a term that
began to grow in popularity around 2010, mostly relied on the power of
spreadsheets. But pivot tables, vlookups and other spreadsheet functions
only get you so far. With much more data now available, being able to
perform more powerful and flexible operations is a vital part of the data
journalist’s toolkit. Two programming languages, Python and R, vie for
supremacy. But which is best for our work? Dolly Setton favours Python.
Here is my case for R.

R is an open-source language. It is free to use and available to everyone.
The language was spun out of another one called S in the early 1990s by
two academics working at the University of Auckland in New Zealand. It
was developed to focus on statistical problems and so it naturally handles
data.

R has lots of in-built functions, but it’s the language’s packages that give
it real versatility. These nifty extensions are bundles of code, data and
documentation that can be imported into R with a single line of code.
And there are lots of them. The number of packages has increased from
2,700 in 2010 to more than 20,000 today. Want to know the weather in
Timbuktu yesterday? There are packages for that. Or Tesla’s share price?
There are packages for that. Want to perform some esoteric statistical
function? There are packages for those, too. R’s packages are organised
into task views, so if you’re taking on a new subject that’s a good place to
start.
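
As a rough illustration of how packages are brought in, the sketch below installs a package from CRAN once and then loads it. The package name ggplot2 and the mirror address are only examples chosen for this sketch; any CRAN package would follow the same pattern.

```r
# Sketch: install a CRAN package once, then load it in each new session.
# "ggplot2" is only an example; substitute the package you need.
pkg <- "ggplot2"

# Download and install only if the package is not already on this machine.
# The repos argument names a CRAN mirror (needs internet access).
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg, repos = "https://cloud.r-project.org")
}

# The single line used day to day: attach the package to the session.
library(ggplot2)

# Confirm which version was loaded.
print(packageVersion("ggplot2"))
```

Installation happens once per machine; the library() call is the line that is repeated at the top of each script.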

R Studio neatly organises your R code and its outputs

All versions of R and its packages are available on the Comprehensive R
Archive Network, more commonly known as CRAN, a network of servers
hosted by academic institutions around the world. The basic interface of
the R software is byzantine. But the R Studio application (built by a
for-profit organisation in America, although the basic version is also
open-source) provides a sleek “integrated development environment” (IDE)
that sits atop R. R Studio helps you learn R’s syntax quickly and has
good debugging tools, which make programming almost headache-free.

The popularity of R owes a lot to the pioneering work of a handful of
individuals. Perhaps the most celebrated is Hadley Wickham, a Kiwi
academic and an employee of the company that owns R Studio. Mr
Wickham has developed a number of packages, known as the
“Tidyverse”, that make handling data and number crunching in R more
intuitive for new users.
What really sets R apart is ggplot2. This visualisation package, which was
also developed by Mr Wickham, allows you to create different charts and
visualisations with just a few lines of code. The “gg” stands for “grammar
of graphics”: visualisations can be changed or augmented with additional
simple blocks of code, much like building with Lego. For the work we do,
ggplot allows us to iterate through visual ideas and quickly see what
might be most suitable.
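
The Lego analogy can be sketched in a few lines. This is a minimal example, not a chart from our own work: it uses mtcars, a small data set that ships with R, and the object names are invented for the illustration.

```r
library(ggplot2)

# First block: map data to axes and draw points.
# mtcars ships with R; wt is weight in thousands of pounds.
base_chart <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Add more blocks: a straight-line trend and readable labels.
with_trend <- base_chart +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    title = "Heavier cars travel fewer miles per gallon",
    x = "Weight (1,000 lbs)",
    y = "Miles per gallon"
  )

# One more block splits the same chart into small multiples by cylinder count.
by_cylinders <- with_trend + facet_wrap(~ cyl)

# Printing a ggplot object draws it.
print(by_cylinders)
```

Each intermediate object is a complete chart in its own right, which is what makes it quick to try one idea, print it, and then swap or add a block.
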
R makes coding fun and flexible. You can become competent in just a few
months with no previous programming experience. And there is a whole
community of like-minded programmers who can help you along the way,
such as those found on Stack Overflow, R-bloggers, R-weekly and R-
Ladies. So if you ever get stuck there are plenty of places to turn to.
Below our team shares some advice for those who want to get started
with using R.

Philipp Hauber, Data scientist


There is no shortage of good introductions to R. But if I had to learn how
to code from scratch I would go with this no-frills tutorial by Norm
Matloff at the University of California. It’s hands-on, requires no prior
experience and gives a good understanding of data analysis in R (and
programming in general). Even experienced R users can pick up a thing
or two.

Marie Segger, Data journalist

Writing long scripts can easily get messy. I occasionally hide some of the
mess by creating a section for it. These sections can be collapsed, so you
don’t have to look at all of the code you have written at once and can
instead skip over these lines. You can open them by clicking on the little
arrow on the line number. I only recently learned that you can give these
code chunks titles. A little overview of all the chapters helps you
seamlessly jump between the sections of your code.
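
For readers who have not seen these sections, here is a small sketch of what they look like. In R Studio, a comment line that ends in four or more dashes starts a titled, collapsible section. The section titles and the use of the built-in mtcars data are placeholders for this example.

```r
# Load data ----
# mtcars ships with R, so it stands in for reading a real file here.
cars <- mtcars

# Clean data ----
# Treat the number of cylinders as a category rather than a number.
cars$cyl <- factor(cars$cyl)

# Summarise ----
# Average miles per gallon for each cylinder group.
mean_mpg <- tapply(cars$mpg, cars$cyl, mean)
print(round(mean_mpg, 1))
```

The titles before the dashes are what appear in the overview of sections, so short, descriptive names make it easier to jump around a long script.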

Alex Selby-Boothroyd, Head of data journalism


The best advice I can give if you’ve never used R before is to not be
scared and to just give it a go. When I had to use it for the first time (late
on a Sunday for something due on Monday morning), I had a data frame
up and running and was tentatively tweaking my first charts in ggplot
within a couple of hours, thanks to Google and Stack Overflow. The
hardest part was actually installing it in the first place. Just make sure
you have admin access to your machine, or the phone number of a
friendly IT-support colleague.

Rosamund Pearce, Visual data journalist


A drawback of making charts in R is that none of the styling is done
automatically. Converting everything to the right colours, fonts and
dimensions can be tedious. I’ve got two strategies for making this less
painful. First, I try to get the chart as far along as I can in R itself, then I
use a theme, a design template in R, that I add to every chart to bring it a
little closer to The Economist’s style by removing the default grey
background and changing other basic design features. Second, I’ve
written a script to automate some of the styling once I take the chart into
Illustrator for the finishing touches. My script does menial tasks for me,
such as converting “1e+03” to “1,000” (R is fond of scientific notation) so I
can focus on more important style decisions.
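
To give a flavour of the first strategy, the sketch below defines a reusable theme and an axis-label formatter. It is not the theme or the Illustrator script described above: theme_house, comma_labels and the sales figures are all hypothetical, made up for this illustration.

```r
library(ggplot2)

# Hypothetical house theme: start from a built-in theme without the grey
# background, then override a few design features.
theme_house <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(
      panel.grid.minor = element_blank(),
      panel.grid.major.x = element_blank(),
      axis.title = element_blank(),
      plot.title = element_text(face = "bold")
    )
}

# Write numbers with thousands separators instead of scientific notation.
comma_labels <- function(x) {
  format(x, big.mark = ",", scientific = FALSE, trim = TRUE)
}

# Illustrative data only, invented for this example.
sales <- data.frame(
  year = 2019:2023,
  units = c(1000, 250000, 1200000, 900000, 1500000)
)

chart <- ggplot(sales, aes(x = year, y = units)) +
  geom_col(fill = "steelblue") +
  scale_y_continuous(labels = comma_labels) +
  labs(title = "Units sold (illustrative data)") +
  theme_house()

print(chart)
print(comma_labels(1e3))  # prints "1,000"
```

Because the theme is a function, adding it to every chart is one extra line, and handling the number format in R leaves one fewer menial task for the finishing stage.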
