Data-Import With Tidyverse

Uploaded by

gpguosiuke1

Data import with the tidyverse :: CHEATSHEET

One of the first steps of a project is to import outside data into R. Data is often stored in tabular formats, like csv files or spreadsheets.

The front page of this sheet shows how to import and save text files into R using readr.

The back page shows how to import spreadsheet data from Excel files using readxl or Google Sheets using googlesheets4.

OTHER TYPES OF DATA
Try one of the following packages to import other types of files:
• haven - SPSS, Stata, and SAS files
• DBI - databases
• jsonlite - json
• xml2 - XML
• httr - Web APIs
• rvest - HTML (Web Scraping)
• readr::read_lines() - text data

Read Tabular Data with readr

read_*(file, col_names = TRUE, col_types = NULL, col_select = NULL, id = NULL, locale, n_max = Inf, skip = 0, na = c("", "NA"), guess_max = min(1000, n_max), show_col_types = TRUE) See ?read_delim

read_delim("file.txt", delim = "|") Read files with any delimiter. If no delimiter is specified, it will automatically guess.
To make file.txt, run: write_file("A|B|C\n1|2|3\n4|5|NA", file = "file.txt")

read_csv("file.csv") Read a comma delimited file with period decimal marks.
To make file.csv, run: write_file("A,B,C\n1,2,3\n4,5,NA", file = "file.csv")

read_csv2("file2.csv") Read semicolon delimited files with comma decimal marks.
To make file2.csv, run: write_file("A;B;C\n1,5;2;3\n4,5;5;NA", file = "file2.csv")

read_tsv("file.tsv") Read a tab delimited file. Also read_table().
read_fwf("file.tsv", fwf_widths(c(2, 2, NA))) Read a fixed width file.
To make file.tsv, run: write_file("A\tB\tC\n1\t2\t3\n4\t5\tNA\n", file = "file.tsv")

USEFUL READ ARGUMENTS
No header
    read_csv("file.csv", col_names = FALSE)
Provide header
    read_csv("file.csv", col_names = c("x", "y", "z"))
Read multiple files into a single table
    read_csv(c("f1.csv", "f2.csv", "f3.csv"), id = "origin_file")
Skip lines
    read_csv("file.csv", skip = 1)
Read a subset of lines
    read_csv("file.csv", n_max = 1)
Read values as missing
    read_csv("file.csv", na = c("1"))
Specify decimal marks
    read_delim("file2.csv", locale = locale(decimal_mark = ","))

Save Data with readr

write_*(x, file, na = "NA", append, col_names, quote, escape, eol, num_threads, progress)

write_delim(x, file, delim = " ") Write files with any delimiter.
write_csv(x, file) Write a comma delimited file.
write_csv2(x, file) Write a semicolon delimited file.
write_tsv(x, file) Write a tab delimited file.

Column Specification with readr

Column specifications define what data type each column of a file will be imported as. By default readr will generate a column spec when a file is read and output a summary.

spec(x) Extract the full column specification for the given imported data frame.
    spec(x)
    # cols(
    #   age = col_integer(),
    #   edu = col_character(),
    #   earn = col_double()
    # )
In this output, age is an integer, edu is a character, and earn is a double (numeric).

COLUMN TYPES
Each column type has a function and corresponding string abbreviation.
• col_logical() - "l"
• col_integer() - "i"
• col_double() - "d"
• col_number() - "n"
• col_character() - "c"
• col_factor(levels, ordered = FALSE) - "f"
• col_datetime(format = "") - "T"
• col_date(format = "") - "D"
• col_time(format = "") - "t"
• col_skip() - "-", "_"
• col_guess() - "?"

USEFUL COLUMN ARGUMENTS
Hide col spec message
    read_*(file, show_col_types = FALSE)
Select columns to import
Use names, position, or selection helpers.
    read_*(file, col_select = c(age, earn))
Guess column types
To guess a column type, read_*() looks at the first 1000 rows of data. Increase with guess_max.
    read_*(file, guess_max = Inf)

DEFINE COLUMN SPECIFICATION
Set a default type
    read_csv(
      file,
      col_types = list(.default = col_double())
    )
Use column type or string abbreviation
    read_csv(
      file,
      col_types = list(x = col_double(), y = "l", z = "_")
    )
Use a single string of abbreviations
    # col types: skip, guess, integer, logical, character
    read_csv(
      file,
      col_types = "_?ilc"
    )
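
As a minimal sketch of how a readr column specification behaves in practice, the example below writes the sheet's small sample CSV to a temporary file and reads it back twice: once with an explicit cols() specification and once with a compact string of abbreviations. The temporary path and the variable names x and y are assumptions made for illustration only.

```r
library(readr)

# Assumption: a throwaway sample file with columns A, B, C (as in the sheet's examples)
path <- tempfile(fileext = ".csv")
write_file("A,B,C\n1,2,3\n4,5,NA", file = path)

# Explicit specification: A as integer, B as double, C skipped
x <- read_csv(
  path,
  col_types = cols(A = col_integer(), B = col_double(), C = col_skip())
)

# Compact string, one character per column in file order:
# "i" = integer, "_" = skip, "d" = double
y <- read_csv(path, col_types = "i_d")

# Inspect the specification that was actually used
spec(x)
```

Supplying col_types up front avoids type guessing altogether, which may be preferable when the first rows of a file are not representative of the rest.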

CC BY SA Posit Software, PBC • [email protected] • posit.co • readr.tidyverse.org • readxl.tidyverse.org • googlesheets4.tidyverse.org • HTML cheatsheets at pos.it/cheatsheets • readxl 1.4.3 • googlesheets4 1.1.1 • Updated: 2024-05
Import Spreadsheets

with readxl

READ EXCEL FILES
read_excel(path, sheet = NULL, range = NULL) Read a .xls or .xlsx file based on the file extension. See front page for more read arguments. Also read_xls() and read_xlsx().
    read_excel("excel_file.xlsx")

READ SHEETS
read_excel(path, sheet = NULL) Specify which sheet to read by position or name.
    read_excel(path, sheet = 1)
    read_excel(path, sheet = "s1")

excel_sheets(path) Get a vector of sheet names.
    excel_sheets("excel_file.xlsx")

To read multiple sheets:
1. Get a vector of sheet names from the file path.
2. Set the vector names to be the sheet names.
3. Use purrr::map() and purrr::list_rbind() to read multiple files into one data frame.

    library(readxl)
    library(purrr)
    path <- "your_file_path.xlsx"
    path |>
      excel_sheets() |>
      set_names() |>
      map(read_excel, path = path) |>
      list_rbind()

READXL COLUMN SPECIFICATION
Column specifications define what data type each column of a file will be imported as. Use the col_types argument of read_excel() to set the column specification.

Guess column types
To guess a column type, read_excel() looks at the first 1000 rows of data. Increase with the guess_max argument.
    read_excel(path, guess_max = Inf)
Set all columns to same type, e.g. character
    read_excel(path, col_types = "text")
Set each column individually
    read_excel(
      path,
      col_types = c("text", "guess", "guess", "numeric")
    )

COLUMN TYPES
    logical  numeric  text   date        list
    TRUE     2        hello  1947-01-08  hello
    FALSE    3.45     world  1956-10-21  1
• skip
• guess
• logical
• numeric
• text
• date
• list
Use list for columns that include multiple data types. See tidyr and purrr for list-column data.

with googlesheets4

READ SHEETS
read_sheet(ss, sheet = NULL, range = NULL) Read a sheet from a URL, a Sheet ID, or a dribble from the googledrive package. See front page for more read arguments. Same as range_read().

SHEETS METADATA
URLs are in the form: https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit#gid=SHEET_ID
gs4_get(ss) Get spreadsheet meta data.
gs4_find(...) Get data on all spreadsheet files.
sheet_properties(ss) Get a tibble of properties for each worksheet. Also sheet_names().

WRITE SHEETS
write_sheet(data, ss = NULL, sheet = NULL) Write a data frame into a new or existing Sheet.
gs4_create(name, ..., sheets = NULL) Create a new Sheet with a vector of names, a data frame, or a (named) list of data frames.
sheet_append(ss, data, sheet = 1) Add rows to the end of a worksheet.

GOOGLESHEETS4 COLUMN SPECIFICATION
Column specifications define what data type each column of a file will be imported as. Use the col_types argument of read_sheet()/range_read() to set the column specification.

Guess column types
To guess a column type, read_sheet()/range_read() looks at the first 1000 rows of data. Increase with guess_max.
    read_sheet(path, guess_max = Inf)
Set all columns to same type, e.g. character
    read_sheet(path, col_types = "c")
Set each column individually
    # col types: skip, guess, integer, logical, character
    read_sheet(ss, col_types = "_?ilc")

COLUMN TYPES
    l      n     c      D           L
    TRUE   2     hello  1947-01-08  hello
    FALSE  3.45  world  1956-10-21  1
• skip - "_" or "-"
• guess - "?"
• logical - "l"
• integer - "i"
• double - "d"
• numeric - "n"
• date - "D"
• datetime - "T"
• character - "c"
• list-column - "L"
• cell - "C" Returns list of raw cell data.
Use list for columns that include multiple data types. See tidyr and purrr for list-column data.
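
The readxl steps for reading multiple sheets can be tried without an Excel file of your own. The sketch below assumes the example workbook datasets.xlsx that ships with readxl (located with readxl_example()); the names sheets and combined, and the "sheet" column added by names_to, are illustrative choices rather than part of the cheatsheet.

```r
library(readxl)
library(purrr)

# Assumption: use the example workbook bundled with readxl instead of a user file
path <- readxl_example("datasets.xlsx")

# Steps 1 and 2: get the sheet names and use them as the vector's own names
# Step 3 (first half): read each sheet into a named list of tibbles
sheets <- path |>
  excel_sheets() |>
  set_names() |>
  map(\(sheet) read_excel(path, sheet = sheet))

# Step 3 (second half): stack the tibbles, recording the source sheet in a column.
# Columns missing from a given sheet are filled with NA.
combined <- list_rbind(sheets, names_to = "sheet")

names(sheets)
dim(combined)
```

Keeping the intermediate named list may be useful when the sheets do not share columns and are better analysed separately.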

OTHER USEFUL EXCEL PACKAGES
For functions to write data to Excel files, see:
• openxlsx
• writexl
For working with non-tabular Excel data, see:
• tidyxl

CELL SPECIFICATION FOR READXL AND GOOGLESHEETS4
Use the range argument of readxl::read_excel() or googlesheets4::read_sheet() to read a subset of cells from a sheet.
    read_excel(path, range = "Sheet1!B1:D2")
    read_sheet(ss, range = "B1:D2")
Also use the range argument with cell specification functions cell_limits(), cell_rows(), cell_cols(), and anchored().

FILE LEVEL OPERATIONS
googlesheets4 also offers ways to modify other aspects of Sheets (e.g. freeze rows, set column width, manage (work)sheets). Go to googlesheets4.tidyverse.org to read more.
For whole-file operations (e.g. renaming, sharing, placing within a folder), see the tidyverse package googledrive at googledrive.tidyverse.org.
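
To illustrate the range argument for reading a subset of cells, here is a small sketch using readxl only. It again assumes the datasets.xlsx example workbook bundled with readxl and its "mtcars" sheet; the variable names x and y are illustrative.

```r
library(readxl)

# Assumption: the example workbook bundled with readxl
path <- readxl_example("datasets.xlsx")

# Excel-style range including the sheet name:
# row 1 supplies the column names, rows 2 and 3 supply the data
x <- read_excel(path, range = "mtcars!B1:D3")

# Cell specification helper: keep all columns, but only spreadsheet rows 1 to 3
y <- read_excel(path, sheet = "mtcars", range = cell_rows(1:3))

names(x)
dim(y)
```

A range string is convenient when the cell block is known in advance, while helpers such as cell_rows() and cell_cols() leave the other dimension open.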
