Best Practice for R :: CHEAT SHEET
Software

• Write code in the RStudio IDE
• Use quarto for literate programming
• Use git to version-control your code and analysis
• Use GitHub to collaborate with other people

Packages

• Packages should be loaded in one place with successive calls to library()
• Use the tidyverse for normal wrangling, plotting etc
• Use tidymodels for modelling and machine learning
• Use {shiny}, {bslib} and {bs4Dash} for app development
• Use r-lib packages like {rlang}, {cli} & {glue} for low-level programming
• Use {renv} in long-term projects to track dependency packages

GitHub stars are a good proxy for a package's quality. Not sure whether to use a package? If it has >200 stars on GitHub it's probably good!

Databases

• Use {DBI} and {odbc} to connect to SQL
• Use helper functions to create connections

    connect_to_db <- function(db) {
      DBI::dbConnect(
        odbc::odbc(),
        Database = db
        # Hard-code common options here (as further arguments)
      )
    }

    # Connect using the helper
    con <- connect_to_db("DWH")

Learning More

• For common data science tasks, see R for Data Science (2e)
• For package development, see R Packages (2e)
• For advanced programming, see Advanced R (2e)
• For app development, see Mastering Shiny

Projects

PROJECT CREATION

• Create a new project in RStudio using File > New Project > New Directory
• Do put projects in a single, local folder like C:\Users\your-name\Documents
• Don't put projects in locations controlled by OneDrive / iCloud (these don't play well with Git)

PROJECT STRUCTURE

Most projects should be structured like this:

    my-project/
      .gitignore          # Tells git which files not to track
      .Rprofile           # R code to run on startup
      R/                  # Scripts in R/ should define functions for use elsewhere
        01-import.R
        02-tidy.R
      SQL/                # Use folders SQL/, data/ etc for other file types
        costs.sql
      run-all.R           # Use a top-level R script to run everything
      renv/               # Records of package versions; created using renv::init()
      renv.lock
      my-project.Rproj    # A .Rproj file makes this directory an RStudio project
      README.md           # Write the main facts about the project here

NB, usethis::use_description() + usethis::use_namespace() will turn this structure into a package!

Functions

• Write functions to reduce repetition or increase clarity
• Write many small functions that call each other
• Define functions in dedicated scripts with corresponding names

WRITING FUNCTIONS: WORKFLOW

1. Repetitive, complex code; purpose clarified by comments

    a <- complex operation on a
    b <- complex operation on b
    c <- complex operation on c
    d <- complex operation on d

2. Complex logic abstracted into functions

    operate_on <- function(x) {
      complex operation on x
    }

3. Repetition reduced; clearer code; less need for comments

    a <- operate_on(a)
    b <- operate_on(b)
    c <- operate_on(c)
    d <- operate_on(d)

NAMING CONVENTIONS

    ✗ Bad (noun-like)     ✓ Good (verb-like)
    totals_getter()       compute_totals()
    modeller_func()       fit_model()
    project_data()        import_datasets()

Getting Help

CREATE A REPREX

• A minimal, reproducible example should demonstrate the issue as simply as possible
• Copy your example code and run reprex::reprex() to embed errors/messages/outputs as comments
• Use your reprex in a question on Teams or Stack Overflow

    print("Hello " + "world!")
    #> Error in "Hello " + "world!": non-numeric argument to binary operator

This reprex minimally demonstrates an error when attempting to use + for Python-style string concatenation.

ETIQUETTE WHEN ASKING QUESTIONS

    Don't                            Do
    Post screenshots of your code    Use reprex::reprex() and paste your code as text
    Include big files                Use dput() or tibble::tribble() to include a data sample
    Ignore messages or warnings      Ensure your code only fails where you're expecting it to

Styling

For other styling guidance, refer to the Tidyverse style guide.

NAMING THINGS

• Use lower_snake_case for most objects (functions, variables etc)
• Title_Snake_Case may be used for column names
• Use only syntactic names where possible (include only numbers, letters, underscores and periods, and don't start with a number)

    # Good (lower_snake_case everywhere):
    add1 <- function(x) x + 1
    first_letters <- letters[1:3]
    iris_sample <- slice_sample(iris, n = 5)

    # Bad (non-syntactic, not lower_snake_case):
    `add 1` <- function(x) x + 1
    FirstLetters <- letters[1:3]
    iris.sample <- slice_sample(iris, n = 5)

WHITESPACE

• Add spaces after commas and around operators like |>, %>%, +, -, *, /, = and <-
• Indentation increases should always be by exactly 2 spaces
• Add linebreaks when lines get longer than 80 characters
• When there are many arguments in a call, give each argument its own line (including the first one!)

    # Good (lots of spaces, indents always by +2):
    df <- iris |>
      mutate(
        Sepal.Area = Sepal.Width * Sepal.Length,
        Petal.Area = Petal.Width * Petal.Length
      )

    # Bad (inconsistent spacing and indentation):
    df<-iris |>
         mutate(Sepal.Area=Sepal.Width*Sepal.Length,
      Petal.Area=Petal.Width*Petal.Length)

CC BY SA Jacob Scott • github.com/wurli • Updated: 2023-11
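EXAMPLE: SHARING A DATA SAMPLE WITH dput()

The etiquette advice to use dput() for a data sample, rather than attaching big files, can be sketched as follows. This is a minimal illustration using base R only; the built-in iris dataset and the names iris_sample and recreated are stand-ins chosen for the example, not part of the original sheet.

```r
# A small slice of a built-in dataset stands in for "your data" (assumption)
iris_sample <- head(iris, 3)

# dput() prints R code that rebuilds the object; paste that text into your
# question so others can recreate the data without any file attachments
dput(iris_sample)

# Round-trip check: deparse() gives the same kind of code as a character
# vector, so parsing and evaluating it should rebuild an equivalent data frame
recreated <- eval(parse(text = deparse(iris_sample)))
all.equal(recreated, iris_sample)
```

Keeping the sample to a few rows keeps the pasted output short, which fits the advice that a reprex should demonstrate the issue as simply as possible.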