Conversion Guide R Python Data Manipulation
003 Software Tools — Data Science Afshine Amidi & Shervine Amidi
Conversion Guide between R and Python: Data manipulation

Afshine Amidi and Shervine Amidi

August 21, 2020

R Data type    Python Data type    Description
character      object              String-related data
factor         object              String-related data that can be put in buckets, or ordered
numeric        float64             Numerical data
int            int64               Integer-valued numeric data
POSIXct        datetime64          Timestamps
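To see how the Python data types in the table show up in practice, here is a minimal sketch that builds a small data frame and inspects its column types. The column names and values are hypothetical and chosen only for illustration. Depending on the pandas version, the string column may be reported as object or as a dedicated string dtype, so the sketch does not rely on it.

```python
import pandas as pd

# Hypothetical example frame; column names and values are illustrative only.
df = pd.DataFrame({
    "name": ["a", "b", "c"],        # string-related data
    "score": [1.5, 2.0, 3.25],      # numerical data
    "count": [1, 2, 3],             # integer data
    "seen_at": pd.to_datetime(
        ["2020-08-21", "2020-08-22", "2020-08-23"]
    ),                              # timestamps
})

# One dtype per column, the counterpart of inspecting column classes in R.
dtypes = df.dtypes
print(dtypes)
```

The printed dtypes for the numeric, integer, and timestamp columns correspond to the float64, int64, and datetime64 rows of the table.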
File management – The table below summarizes the useful commands to make sure the working directory is correctly set:

Category    R Command                                 Python Command
Paths       setwd(path)                               os.chdir(path)
Paths       getwd()                                   os.getcwd()
Paths       file.path(path_1, ..., path_n)            os.path.join(path_1, ..., path_n)
            list.files(path, include.dirs = TRUE)     os.listdir(path)
            read.csv(path_to_csv_file)                pd.read_csv(path_to_csv_file)

Filtering – We can filter rows according to some conditions as follows:

R
df %>%
  filter(some_col some_operation some_value_or_list_or_col)

where some_operation is one of the following:

Category    R Command                     Python Command
Basic       == / !=                       == / !=
Basic       <, <=, >=, >                  <, <=, >=, >
Advanced    %in% c(val_1, ..., val_n)     .isin([val_1, ..., val_n])
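The file management and filtering commands above can be combined in a short Python sketch. It assumes a hypothetical file name (people.csv) and made-up columns, writes the file to a temporary directory so that it stays self-contained, and then applies the basic and advanced filtering operations as boolean masks.

```python
import os
import tempfile

import pandas as pd

cwd = os.getcwd()                                    # R: getwd()

# Hypothetical data and file name, used only for illustration.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "people.csv")   # R: file.path(...)
    pd.DataFrame({
        "name": ["Ana", "Bo", "Cy", "Di"],
        "age": [23, 35, 41, 29],
        "city": ["Paris", "Rome", "Oslo", "Rome"],
    }).to_csv(csv_path, index=False)

    files = os.listdir(tmp_dir)                      # R: list.files(path)
    df = pd.read_csv(csv_path)                       # R: read.csv(path_to_csv_file)

# Basic operations: a boolean mask keeps the rows where the condition holds.
in_rome = df[df["city"] == "Rome"]                   # R: filter(city == "Rome")
over_30 = df[df["age"] >= 30]                        # R: filter(age >= 30)

# Advanced operation: membership in a list of values.
oslo_or_paris = df[df["city"].isin(["Oslo", "Paris"])]  # R: filter(city %in% c(...))
```

In pandas the condition is written as a mask inside the square brackets, which plays the role of the filter() call in the R pipeline.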
Exploring the data – The table below summarizes the main functions used to get a complete overview of the data:

Category        R Command                            Python Command
Look at data    df %>% select(col_list)              df[col_list]
Look at data    df %>% head(n) / df %>% tail(n)      df.head(n) / df.tail(n)
Look at data    df %>% summary()                     df.describe()
Data types      df %>% str()                         df.dtypes / df.info()
Data types      df %>% NROW() / df %>% NCOL()        df.shape

Mathematical operations – The table below sums up the main mathematical operations that can be performed on columns:

Operation    R Command     Python Command
√x           sqrt(x)       np.sqrt(x)
⌊x⌋          floor(x)      np.floor(x)
⌈x⌉          ceiling(x)    np.ceil(x)
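The exploration functions and mathematical operations listed above can be tried on a small data frame. The sketch below uses a hypothetical frame with made-up column names. The NumPy functions are applied element-wise to a whole column.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame, used only for illustration.
df = pd.DataFrame({
    "x": [1.0, 2.25, 6.25, 9.0],
    "label": ["a", "b", "c", "d"],
})

# Exploring the data
first_rows = df[["x"]].head(2)    # select a list of columns, then the first n rows
summary = df.describe()           # summary statistics of the numeric columns
n_rows, n_cols = df.shape         # counterpart of NROW() / NCOL()

# Mathematical operations, applied element-wise to a column
df["sqrt_x"] = np.sqrt(df["x"])
df["floor_x"] = np.floor(df["x"])
df["ceil_x"] = np.ceil(df["x"])
```

Note that df.shape returns both dimensions at once as a (rows, columns) tuple, whereas R uses two separate calls.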
Data types – The main data types that can be contained in columns are summed up in the data type table at the start of this guide.

Data frame transformation

Common transformations – The common data frame transformations are summarized in the table below: