Python | Pandas Reverse split strings into two List/Columns using str.rsplit()
Last Updated: 25 Jun, 2020
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages.
Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas provides a method to split a string around a passed separator or delimiter. The result can be stored as a list in a series, or it can be used to create a multi-column data frame from a single separated string.
rsplit() works like the .split() method, but rsplit() starts splitting from the right side. This is useful when the separator/delimiter occurs more than once and the number of splits is limited, because the rightmost occurrences are the ones that get split.
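The difference between the two directions only shows up when the number of splits is limited. A minimal sketch, using made-up sample strings rather than the article's CSV:

```python
import pandas as pd

# Hypothetical sample data, not taken from the NBA CSV used in this article
s = pd.Series(["a-b-c", "x-y"])

# With a limit of one split, split() cuts at the first "-" from the left
left = s.str.split("-", n=1)

# rsplit() with the same limit cuts at the last "-" instead
right = s.str.rsplit("-", n=1)

print(left.tolist())   # [['a', 'b-c'], ['x', 'y']]
print(right.tolist())  # [['a-b', 'c'], ['x', 'y']]
```

When the separator occurs only once, as in "x-y", both methods give the same result.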
.str has to be prefixed every time before calling this method, because it is reached through the Series string accessor and is distinct from Python's built-in string method. Calling rsplit() directly on a Series raises an AttributeError.
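The role of the .str prefix can be checked with a short sketch. The sample value below is illustrative only:

```python
import pandas as pd

s = pd.Series(["Boston Celtics"])  # illustrative value

# Through the .str accessor, rsplit() is applied to each element
via_accessor = s.str.rsplit(" ", n=1).tolist()
print(via_accessor)

# Without .str, the Series object itself has no rsplit attribute
try:
    s.rsplit(" ")
    raised = False
except AttributeError:
    raised = True

print(raised)
```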
Syntax:
Series.str.rsplit(pat=None, n=-1, expand=False)
Parameters:
pat: String value, the separator or delimiter to split the string at. If not specified, the string is split on whitespace.
n: Maximum number of splits to make in a single string. The default is -1, which means all.
expand: Boolean value. If True, returns a data frame with each split value in a separate column. Otherwise it returns a series with lists of strings.
Return type: Series of lists or DataFrame, depending on the expand parameter.
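As a rough illustration of how n and expand change the result, here is a small sketch on hypothetical date strings (not part of the article's data set):

```python
import pandas as pd

# Hypothetical sample data for illustration
s = pd.Series(["2020-06-25", "1999-12-31"])

# Default n=-1 splits at every "-"; expand=False gives a Series of lists
as_lists = s.str.rsplit("-")

# n=1 allows one split, counted from the right; expand=True gives a DataFrame
as_frame = s.str.rsplit("-", n=1, expand=True)

print(as_lists.tolist())
print(as_frame)
```

With expand=True the columns are labeled with integers starting at 0, so as_frame[0] holds the left part and as_frame[1] the right part.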
In the following examples, the data frame used contains data on some NBA players.
Example #1: Splitting string from right side into list
In this example, the string in the Team column is split at "t". The n parameter is set to 1, so at most one split is made in each string. Since rsplit() is used, that split happens at the last occurrence of "t", counting from the right side.
Python3
# importing pandas module
import pandas as pd
# reading csv file from url
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")
# dropping rows with null values to avoid errors
data.dropna(inplace=True)
# replacing the Team column with lists of split values
data["Team"] = data["Team"].str.rsplit("t", n=1, expand=False)
# display
data
Output:
In the output, the string "Boston Celtics" is split at the "t" in "Celtics" but not at the "t" in "Boston". This is because the splitting starts from the right and n is 1. Since the expand parameter was kept False, a list was returned.
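The same behaviour can be reproduced without downloading the CSV. The sketch below uses a few team names typed in by hand as an assumption about what the Team column contains:

```python
import pandas as pd

# Hand-typed sample values, assumed to resemble the Team column
teams = pd.Series(["Boston Celtics", "Utah Jazz", "Chicago Bulls"], name="Team")

# One split per string, at the last "t" found from the right
result = teams.str.rsplit("t", n=1, expand=False)

print(result.tolist())
# [['Boston Cel', 'ics'], ['U', 'ah Jazz'], ['Chicago Bulls']]
```

A string that does not contain the separator, such as "Chicago Bulls", comes back as a one-element list.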
Example #2: Making separate columns from string using .rsplit()
In this example, the Name column is split at the space (" "), and the expand parameter is set to True, which means it returns a data frame with each separated string in a different column. The data frame is then used to create new columns, and the old Name column is dropped using the .drop() method.
The n parameter is set to 1, since a name can also contain a middle name (more than one space in the string). In this case rsplit() is useful because it counts from the right side, so the middle name is kept in the first name column when the maximum number of splits is 1.
Python3
# importing pandas module
import pandas as pd
# reading csv file from url
data = pd.read_csv("https://media.geeksforgeeks.org/wp-content/uploads/nba.csv")
# dropping rows with null values to avoid errors
data.dropna(inplace=True)
# new data frame with split value columns (one split, from the right)
new = data["Name"].str.rsplit(" ", n=1, expand=True)
# making separate first name column from new data frame
data["First Name"] = new[0]
# making separate last name column from new data frame
data["Last Name"] = new[1]
# dropping the old Name column
data.drop(columns=["Name"], inplace=True)
# df display
data
Output:
In the output, the two new columns First Name and Last Name are created and the old Name column is dropped.
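To see how a middle name is handled, the following sketch applies the same steps to a small hand-made data frame. The names are hypothetical and are not taken from the NBA data:

```python
import pandas as pd

# Hypothetical names, one of them with a middle name
data = pd.DataFrame({"Name": ["Ada Lovelace", "Martin Luther King"]})

# One split from the right: everything before the last space stays together
new = data["Name"].str.rsplit(" ", n=1, expand=True)

data["First Name"] = new[0]
data["Last Name"] = new[1]
data = data.drop(columns=["Name"])

print(data)

# For comparison, split() with n=1 would cut at the first space instead
left_split = pd.Series(["Martin Luther King"]).str.split(" ", n=1).tolist()
print(left_split)  # [['Martin', 'Luther King']]
```

With rsplit(), "Martin Luther" ends up in the First Name column and "King" in the Last Name column.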